u/Dolboyob77

Scaller llm for intel big update to run 6 months old models…

So today, may 20th 2026 we finally received a long waited update on scaler llm for intel gpu!!!! FINALLY!!! I was so excited… until…. I read the supported models : Qwen2.5 and so on…. This is F……g joke….!!!!!! Please if soneoje can teach me how to compile and upgrade these things i am willing to work on it and give a decent update…. That is actually up to date!!!!

reddit.com
u/Dolboyob77 — 1 day ago
▲ 44 r/LocalLLM+1 crossposts

Where are the Intel devs????

I own 2 intel gpus both battlemage xe drivers with intel core cpu, i have been fed with the promise of a dream land being all intel it would make things so much faster and irrisistible…. What i came to understand is that everything is done for the nvidia community, maybe the devs at nvidia are more passionate or involved…. Llamacpp sycl works 70% of what the intel gpu can really achieve, and the only real reason to to buy intel gpu is because there was ipex vllm and now it is replace by intel scaler vllm… but obviously they make an update every 6 weeks or even more…. So we have gpus that are just sitting there half asleep…. Come on… our gpus were meant to run vllm!!!! But what is the point to run models that are 2-3 months old or more??? Each time im trying to launch a model on unraid os, the container crashes because the repo is too old…. If it goes on this way, i will resell everything wnd invest more for something that actually works… i was not asking to get the same tokens per second as nvidia because their bandiwth is faster…. But to get something that actually works would be rhe minimum, no?
Intel core 9 ultra 285h with 96g ram
Intel arc pro b70
Intel arc b580
If i use llamq cpp sycl with gguf models , yes it works but it is not optimized and i get way less than what the gpu is capable… so if there are Intel devs somewhere… can you please do something abiut it and update the intel scaler vllm ??? Thanks

reddit.com
u/Dolboyob77 — 3 days ago
▲ 8 r/LocalLLM+1 crossposts

[Bug] llama.cpp full-intel image breaks Q8_0 models on Intel Arc GPUs - reorder_qw_q8_0 SYCL out of memory error

hello I ran into a problem following the update of latest image :

  • Image: ghcr.io/ggml-org/llama.cpp:full-intel
  • Error: reorder_qw_q8_0 UR_RESULT_ERROR_OUT_OF_DEVICE_MEMORY
  • GPU: Intel Arc Pro B70 + Intel Arc B580
  • Works fine with Q6_K_XL, crashes with Q8_0
  • Working version: full-intel-b9144
u/Dolboyob77 — 5 days ago

Openwebui + comfyui

Hello, is someone succeeding in making these 2 work together? No matter what i am trying, unet loader, checkpoint… the workflow works when i type thenpromot in comfyui but as soon as i type same prompt in openwebui , i cannot manage to get an image and always get errors… i i port the fson worflow and specify prompt id and checkpoint if and model in openwebui but nothing works… is it because i use flux 1 dev fp16 ? Does it require smaller models to work ? Thanks for input !!

SOLUTION : I finally made it work using the help of qwen3.6-27b-q8 ))) so the problem is that ALL NODES ID must be filled in openwebui and also must add this command line to openwebui : ENABLE_RAG_LOCAL_WEB_FETCH=True , it was the fix for me )) now working perfectly !!!

reddit.com
u/Dolboyob77 — 7 days ago

Update for intel scaler vllm ?

Hello, i am currently using the intel scaler vllm 14.8b2 i think, the one for intel arc pro b70. But the core is an old model so i cannot use newer models like qwen3.6-27b-fp8. So when will we see an update to be able to use the latest models in safetensor? Thanks

reddit.com
u/Dolboyob77 — 8 days ago
▲ 10 r/LocalLLM+1 crossposts

Stop the " Thinking" in Openwebui

Hello, i have been going crazy trying to stop the qwen3.6-27b models from thinking in openwebui. I tried all sorts of post arguments like nothink, no-think, jinja….. nothing is working. Each question i type , even “hello" it thinks for 4 linutes and then sends me choices to select as an answer… this is just ridiculous. I have tried different models gguf in llama cpp with sycl ( i have intel arc pro b70) going from qwen3.6 q4 to q8 they all load fine without error but i cant get any proper answer…. Just thinking forever and answering my hello by a list of questions. Any help would be appreciated !!!!

reddit.com
u/Dolboyob77 — 11 days ago

Hello everyone, newbie here so don’t scream))) i have just installed openclaw on unraid. When opening the container at first they asked me for a gateway key. I wrote a password. Apparently this was a mistake becausz i cannot connect on the openclaw page, they ask for login and password and admin or root with the password i weote in container is not working obviously. Network is set on HOST . And the error on openclaw page is : origin not allowed (open the Control UI from the gateway host or allow it in gateway.controlUi.allowedOrigins)
So if anyone has a fix to this with simple words and step guide i would be very grateful !!! Thank you in advance )))

reddit.com
u/Dolboyob77 — 16 days ago

Hello i added a new GPU 5x16 on my EX pro dock ( gen5x8). The mini pc is advertised as 5x8 also. What a surprise when on my server it showed only GEN 4x8 speed, which is half the speed of GEN5. So i went to check my bios ( T205) and all the pcie slots show a GEN 4 max speed option….. waiting on an update with the right speeds as mentionned on their website.

reddit.com
u/Dolboyob77 — 18 days ago