u/ComfyUser48

▲ 2 r/LocalLLM+1 crossposts

Qwen 3.6 27b MTP - getting //// in response

Not sure what I'm doing wrong. Running llama.cpp with these flags:
--spec-type mtp
--spec-draft-n-max 3

llama.cpp is built from the MTP PR branch:
RUN git clone https://github.com/ggml-org/llama.cpp.git . \
&& git fetch origin pull/22673/head:mtp-branch \
&& git checkout mtp-branch

I'm running it via Docker. Here's my Dockerfile:

# Use CUDA 12.8+ to support Blackwell (RTX 50-series)

FROM nvidia/cuda:12.8.0-devel-ubuntu22.04

# Set up environment for the linker to find CUDA stubs during build
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:${LD_LIBRARY_PATH}

# Install dependencies
RUN apt-get update && apt-get install -y \
pciutils \
libcurl4-openssl-dev \
curl \
git \
cmake \
build-essential \
&& rm -rf /var/lib/apt/lists/*

# Create a symlink so the linker finds libcuda.so.1 in the stubs folder
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1

WORKDIR /app

# Clone from the official organization and fetch the MTP PR branch
RUN git clone https://github.com/ggml-org/llama.cpp.git . \
&& git fetch origin pull/22673/head:mtp-branch \
&& git checkout mtp-branch

# Build with CUDA support targeting Blackwell architecture (sm_120)
RUN mkdir build && cd build \
&& cmake .. -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF \
-DCMAKE_CUDA_ARCHITECTURES="120" \
&& cmake --build . --config Release -j$(nproc)

# Clean up the stub symlink after build is complete
RUN rm /usr/local/cuda/lib64/stubs/libcuda.so.1

# Expose the server port
EXPOSE 8888

# Set the entrypoint to the compiled llama-server
ENTRYPOINT ["./build/bin/llama-server"]
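
For reference, a minimal sketch of how an image built from this Dockerfile might be launched with the flags listed at the top. The image tag `llama-mtp`, the host model directory, and the file name `model.gguf` are placeholders, not values from this setup. `--spec-type` and `--spec-draft-n-max` are taken from the PR branch as quoted above and may change before it is merged. It also assumes the NVIDIA Container Toolkit is installed so that `--gpus all` works.

```shell
#!/bin/sh
# Placeholder values (assumptions): adjust to your own setup.
# The image would be built first with: docker build -t llama-mtp .
IMAGE="llama-mtp"            # hypothetical image tag
MODEL_DIR="$HOME/models"     # hypothetical host directory holding the GGUF file
MODEL_FILE="model.gguf"      # placeholder model file name
PORT=8888                    # matches EXPOSE 8888 in the Dockerfile

# Everything after the image name is passed to llama-server (the ENTRYPOINT).
# --spec-type / --spec-draft-n-max are the PR-branch flags quoted in the post.
set -- docker run --rm --gpus all \
  -p "$PORT:$PORT" \
  -v "$MODEL_DIR:/models:ro" \
  "$IMAGE" \
  -m "/models/$MODEL_FILE" \
  --host 0.0.0.0 --port "$PORT" \
  --spec-type mtp --spec-draft-n-max 3

# Print the command by default; set DRY_RUN=0 to actually launch the container.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$@"
else
  "$@"
fi
```

llama-server listens on 127.0.0.1:8080 by default, so `--host 0.0.0.0` and `--port 8888` are passed explicitly here to make the server reachable through the published port.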

Any idea?

Thanks

reddit.com
u/ComfyUser48 — 10 days ago

Qwen 3.6 27b vs Codex GPT 5.5 / Claude Opus 4.7

My local llm discovered a bug that they both missed

And it turns out it's critical

GPT 5.5 and Claude both stood their ground and didn't give up until the end - they claimed to be right all along.

I told my Qwen to provide detailed proof of his arguments, brought the evidence to both of them, and only then came their admission.

Qwen 3.6 27b thinks a lot. That can be both a good and a bad thing. In this case, the long thinking actually discovered a bug that neither of the frontier models could find.

GPT 5.5 is FAST. Really fast. But in reality, as I found out, that comes with a big tradeoff.

GPT 5.5 admission

Claude Opus 4.7 admission

u/ComfyUser48 — 18 days ago

The motherboard is only a few months old. The RGB on the RAM stick doesn't even turn on.
It's not the RAM, because I put the second stick into that slot and it doesn't turn on either.

So right now I have the kit in the 2nd and 4th slots. My system only recognizes the stick in the 4th slot.

Cleared CMOS and it didn't help.

Is the motherboard cooked?

u/ComfyUser48 — 20 days ago