u/tomByrer

▲ 2.1k r/LocalLLM+2 crossposts

Heretic has been served a legal notice by Meta, Inc.

To Whomsoever it May Concern,

The individual behind the Heretic Free Software Project (henceforth called "Heretic", notwithstanding unrelated entities of the same name) has been served a notice by a legal services provider representing Meta Platforms, Inc. (henceforth called "Meta"), via the digital communications medium variously known as Internet Mail, Electronic Mail, or simply "email".

The Heretic Project conducts its affairs in full compliance with applicable laws, regulations, rules, guidelines, opinions, and hunches. Following the commendable example set by the renowned heretic Galileo Galilei in 1616, we are recanting the relevant materials, namely derivatives of Meta's "Llama" Artificial Intelligence language models, and have removed the same from all model weight repositories controlled by the Heretic Project.

We are grateful to Meta and its legal representatives for the opportunity to better align ourselves with the agenda of the global corporate oligarchy. The Llama model family ranks among the 200 best language models available today, trailing only 168 other models from 23 competitors on the LM Arena leaderboard, and Meta's concern for that asset naturally outweighs scientific freedom, as well as the legally and ethically dubious circumstances under which those models were created in the first place, regarding which, ironically, Meta is currently facing lawsuits and investigations in multiple jurisdictions around the world.

On a completely unrelated note, the Heretic Project is diversifying its infrastructure, and now has an official Codeberg mirror at https://codeberg.org/p-e-w/heretic, hosted in Germany. Additional mirrors are planned. We are also actively working to implement technological measures that will preserve access to models created with Heretic without depending on any specific service provider. We are proud to be part of this journey as we navigate an evolving global regulatory landscape, and work with stakeholders from diverse institutional backgrounds to ensure that Artificial Intelligence remains safe, culturally appropriate, and controlled by those who have always known what is best for humanity. If you, too, would like to share in this exciting adventure, please join us!

Sincerely, p-e-w, Chief Heretic

u/-p-e-w- — 1 day ago

▲ 71 r/comfyui+1 crossposts

LumiPic: Oumoumad's (LTX lora fame) SDR->HDR conversion LoRAs for Qwen, soon Klein Base 4 & 9

>LumiPic — Single-Image SDR to HDR LoRA

>Converts standard dynamic range (SDR) images to high dynamic range (HDR) EXR files — float-valued, with range well beyond what an 8-bit SDR output can carry.

Released weeks ago, surprised no one posted about it.

Even if your target use case is not HDR, the extra range helps with exposure and color edits if you want to post-edit your images.
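
To make the point about editing headroom concrete, here is a minimal sketch in plain Python. It is not LumiPic code: the function names and the sample highlight values are made up for illustration, and it assumes linear-light values with no transfer curve. It shows that highlights clipped to 8 bits stay flat when you pull exposure down, while float values keep their separation.

```python
def to_sdr_8bit(linear):
    """Clip a linear-light value to [0, 1] and quantize it to an 8-bit code."""
    clipped = min(max(linear, 0.0), 1.0)
    return round(clipped * 255)


def adjust_exposure(value, stops):
    """Scale a linear-light value by a number of photographic stops."""
    return value * (2.0 ** stops)


# Hypothetical highlight values: at, 1 stop above, and 2 stops above SDR white.
highlights = [1.0, 2.0, 4.0]

# Stored as 8-bit SDR, all three collapse to the same code value.
sdr = [to_sdr_8bit(v) for v in highlights]

# Pulling exposure down 2 stops from the SDR copy gives one flat gray.
sdr_darkened = [adjust_exposure(code / 255, -2) for code in sdr]

# The same edit on float (EXR-style) values keeps the highlights distinct.
hdr_darkened = [adjust_exposure(v, -2) for v in highlights]

print(sdr, sdr_darkened, hdr_darkened)
```

Real pipelines add a transfer function and color management on top of this, so treat the numbers as illustrative only.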

ComfyUI workflows in the files tab.

Edit: video https://www.youtube.com/watch?v=z0ue28hbMTk

huggingface.co

u/tomByrer — 5 days ago

▲ 1 r/LocalLLM

DwarfStar4: DeepSeek 4 Flash for MBPro 96GB & Sparks

Metal is our primary target. Starting from MacBooks with 96GB of RAM.
NVIDIA CUDA with special care for the DGX Spark.
AMD ROCm is only supported in the rocm branch.

Custom quant: some tensors at 2-bit, some left at full precision.
Custom engine built only for that custom quant, targeting only 2 hardware profiles.

github.com

u/tomByrer — 8 days ago

▲ 30 r/LocalLLM

llama.cpp: Qwen3.6 MTP Unsloth GGUFs now 1.8x faster!

u/tomByrer — 8 days ago

▲ 0 r/LocalLLM

Alex digs deeper this time, inspired by this channel:
https://www.youtube.com/@Protorikis

u/tomByrer — 16 days ago

▲ 8 r/LocalLLM

TL;DW:
Analysing one large code file, first split in half and then whole:
llama.cpp serving a GGUF was decent; Ollama with MLX + NVFP4 was faster.
MLX LM was good for smaller files (smaller context) but crashed the Mac on the bigger file.

u/tomByrer — 17 days ago

▲ 231 r/unsloth+1 crossposts

Hey guys, you can now run open LLMs in Claude Code, Codex and OpenClaw via Unsloth's API inference endpoint and we made lots of tutorials for it!

Use Gemma 4 and Qwen3.6 GGUFs for local agentic coding on 24GB RAM.

Run with self-healing tool calls, code execution, web search via the Unsloth API endpoint and llama.cpp.

Guide: https://unsloth.ai/docs/basics/api

Unsloth makes it easy to deploy a fast API inference endpoint that provides:

Self-healing tool calling, which helps reduce broken or malformed tool calls by 50%
Code execution support, allowing Bash and Python execution for more accurate code outputs.
Advanced Web search that visits and actually reads webpages to gather in-depth info.
Automatic inference settings for GGUF models (temp, top-k etc.)

Please update Unsloth to leverage this new update and let us know if you have any feedback. Thank you!!

u/tomByrer — 16 days ago

▲ 5 r/LocalLLM

The title "RTX 5090, Mac Studio, or DGX Spark? I tried all three." is deceptive; he only talks about hardware 20% of the time. He spends more time on use-cases, then a quick touch on model families for each use case.

He also skipped over a few things, like VRAM vs system RAM vs shared iGPU VRAM, how much hard drive space you'll need for all your models, and whether you want to do image/video/OCR.

u/tomByrer — 21 days ago

▲ 2 r/LocalLLM

Note: not my video, I don't even own any AMD anything.
But many on here with AMD or considering AMD would like to know.
& I'm sure he'd like support; he's working hard to help AMD folks improve their AI capabilities.

u/tomByrer — 21 days ago

▲ 482 r/olkb+2 crossposts

I built a split keyboard with a 55mm trackball, and honestly… I can’t stay away from this thing anymore.

ZMK firmware, a lot of switches using Charlieplexing, a 55mm trackball, rotary encoders on both sides, and a 4‑way + push switch. It’s absolutely packed with features.

I originally designed and used a wired version called Mooose, but after moving to ZMK and making it possible to go wireless, I added "Free" to the name to reflect the freedom from cables.

Switches: keygeek Y3, Kailh Ghost, and others
Keycaps: awekeys Air, etc.

u/Outside-Bluejay-4582 — 16 days ago

▲ 282 r/LocalLLM+1 crossposts

Came across hipfire the other day. It's a brand new inference engine focused on all AMD GPUs (not just the latest).

GitHub.

It uses a special mq4 quantization method. The hipfire creator is pumping out models on huggingface.

I don't know enough about quantization to know how good these quants are in terms of quality, but as an RDNA3 aficionado I'm happy AMD is getting some attention.

Localmaxxing is a new LLM benchmarking site, and shows some pretty dramatic speedups for hipfire inference.

Edit: I should have just said hipfire - I don't think this is connected to AMD officially.

u/Thrumpwart — 25 days ago

▲ 5 r/unsloth

Why does the install script insist on using 3.14 when it found the version it was looking for?
Why not use v3.13?
Why use Node when Bun has a package manager built in?

PS C:\Users\me> irm https://unsloth.ai/install.ps1 | iex

  🦥 Unsloth Studio Installer (Windows)
  ────────────────────────────────────────────────────

  python         Python 3.13 already installed
                 found legacy Studio environment, validating...
                 legacy environment is healthy -- migrating...
                 moved .venv -> unsloth_studio
  venv           using migrated environment
                 C:\Users\me\.unsloth\studio\unsloth_studio
  gpu            NVIDIA GPU detected
                 upgrading unsloth in migrated environment...
  setup          running unsloth studio setup...

  🦥 Unsloth Studio Setup
  ────────────────────────────────────────────────────
  gpu            NVIDIA GPU detected
  long paths     enabled
  git            git version 2.53.0.windows.2
  cmake          cmake version 4.2.3
  vs             Visual Studio 17 2022 (filesystem (BuildTools))
                 cl.exe: C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.44.35207\bin\Hostx64\x64\cl.exe
                 driver supports up to CUDA 13.2
                 GPU Compute Capability = 8.6 (sm_86)
                 using existing CUDA Toolkit at CUDA_PATH (nvcc: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.2\bin\nvcc.exe)
                 Persisted CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.2 to user environment
                 Set CUDA_PATH_V13_2 (cleared other CUDA_PATH_V* vars)
                 Persisted CUDA bin dir to user PATH
  cuda           C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.2\bin\nvcc.exe
                 CUDA_PATH      = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.2
                 CudaToolkitDir = C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.2\
                 Node v24.14.0 and npm 11.9.0 already meet requirements.
  node           v24.14.0 | npm 11.9.0
                 bun already installed (1.3.11)
[ERROR] Python Python 3.14.3 is outside supported range (need >= 3.11 and < 3.14).
        Install Python 3.12 from https://python.org/downloads/
[ERROR] unsloth studio setup failed (exit code 1)
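
For anyone puzzling over the error, the range in that message (>= 3.11 and < 3.14) can be checked against any interpreter with a few lines of Python. This is only a sketch of the comparison, not the installer's actual code; the names `MIN_VERSION`, `MAX_VERSION_EXCLUSIVE`, `parse_version` and `is_supported` are made up for illustration.

```python
import sys

# Bounds copied from the error message in the log above; everything else
# here is a hypothetical illustration, not Unsloth's implementation.
MIN_VERSION = (3, 11)
MAX_VERSION_EXCLUSIVE = (3, 14)


def parse_version(text):
    """Turn a string like '3.14.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_supported(version):
    """Return True if (major, minor, ...) falls inside the supported range."""
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION_EXCLUSIVE


if __name__ == "__main__":
    current = sys.version_info[:3]
    status = "supported" if is_supported(current) else "outside supported range"
    print(".".join(str(n) for n in current), status)
```

Run it with each interpreter on the machine (for example through the `py` launcher on Windows) to see which one would pass the check.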

reddit.com

u/tomByrer — 27 days ago
