Qwen 3.6 27b MTP - getting //// in response
Not sure what I'm doing wrong. Running llama.cpp with these flags:
--spec-type mtp
--spec-draft-n-max 3
llama.cpp is built from the MTP PR branch:
RUN git clone https://github.com/ggml-org/llama.cpp.git . \
&& git fetch origin pull/22673/head:mtp-branch \
&& git checkout mtp-branch
I'm running it with Docker. Here's my Dockerfile:
# Use CUDA 12.8+ to support Blackwell (RTX 50-series)
FROM nvidia/cuda:12.8.0-devel-ubuntu22.04
# Set up environment for the linker to find CUDA stubs during build
ENV LD_LIBRARY_PATH=/usr/local/cuda/lib64/stubs:${LD_LIBRARY_PATH}
# Install dependencies
RUN apt-get update && apt-get install -y \
    pciutils \
    libcurl4-openssl-dev \
    curl \
    git \
    cmake \
    build-essential \
    && rm -rf /var/lib/apt/lists/*
# Create a symlink so the linker finds libcuda.so.1 in the stubs folder
RUN ln -s /usr/local/cuda/lib64/stubs/libcuda.so /usr/local/cuda/lib64/stubs/libcuda.so.1
WORKDIR /app
# Clone from the official organization and fetch the MTP PR branch
RUN git clone https://github.com/ggml-org/llama.cpp.git . \
    && git fetch origin pull/22673/head:mtp-branch \
    && git checkout mtp-branch
# Build with CUDA support targeting Blackwell architecture (sm_120)
RUN mkdir build && cd build \
    && cmake .. -DGGML_CUDA=ON -DBUILD_SHARED_LIBS=OFF -DCMAKE_CUDA_ARCHITECTURES="120" \
    && cmake --build . --config Release -j$(nproc)
# Clean up the stub symlink after build is complete
RUN rm /usr/local/cuda/lib64/stubs/libcuda.so.1
# Expose the server port
EXPOSE 8888
# Set the entrypoint to the compiled llama-server
ENTRYPOINT ["./build/bin/llama-server"]
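A minimal sketch of how an image built from this Dockerfile could be started with the two flags listed at the top. The image tag `llama-mtp`, the `models` directory, and the file name `model.gguf` are hypothetical placeholders, not values from my setup. `--host 0.0.0.0` and `--port 8888` are assumptions so that the server listens on the exposed port, and `--gpus all` assumes the NVIDIA Container Toolkit is installed on the host. The script only prints the command so it can be checked before running.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own image tag and model file.
IMAGE="llama-mtp"
MODEL_DIR="$PWD/models"
MODEL_FILE="model.gguf"

# Print the docker command instead of executing it.
# Everything after the image name is passed to the ENTRYPOINT (llama-server).
build_cmd() {
    echo "docker run --rm --gpus all -p 8888:8888 -v $MODEL_DIR:/models $IMAGE" \
         "-m /models/$MODEL_FILE --host 0.0.0.0 --port 8888" \
         "--spec-type mtp --spec-draft-n-max 3"
}

build_cmd
```

Running the printed command once with the two `--spec-*` flags and once without them should show whether the `////` output only appears when MTP is enabled.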
Any ideas?
Thanks