u/nunodonato

Does vllm need a restart once in a while?

Out of the blue, I started getting replies from the agent that completely broke tool parsing. For example:

<read", "path": "/home/agent/.agents/skills/research/SKILL.md"}
{"path": "/home/agent/.agents/skills/research/SKILL.md"}
</read>

I checked all my code and made sure nothing I did today could have impacted this. No changes in version, nothing. But try after try, it just kept outputting this kind of garbage.

Out of desperation, I shut down vLLM and started it again. Lo and behold, it works like a charm again.

So now I'm really confused: are we supposed to restart vLLM once in a while? Could long-running sessions corrupt memory in a way that harms how it works?
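For anyone debugging something similar, here is a minimal sketch of a sanity check a harness could run on tool-call arguments before executing them. It only assumes that the arguments arrive as a string that should decode to a JSON object; the function name `parse_tool_arguments` is hypothetical and not part of vLLM or any agent framework.

```python
import json


def parse_tool_arguments(raw: str):
    """Return the decoded arguments if `raw` is a JSON object, else None.

    Hypothetical helper for illustration: it does not fix the output,
    it only lets the caller detect that the output is malformed.
    """
    try:
        value = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # A tool call's arguments are expected to be an object, not a list or scalar.
    return value if isinstance(value, dict) else None
```

A check like this would flag the first line of the broken output above while accepting the second, which makes it easier to log how often the problem occurs before and after a restart.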

reddit.com

u/nunodonato — 4 days ago

▲ 7 r/Vllm

Help me understand max_num_seqs

Hi all

I've been a bit confused about how best to tune max_num_seqs.

When vLLM starts and loads the model, it reports the maximum number of concurrent requests at full context (usually around 12).

If I exceed this number, they go into waiting, and I see that in the logs.

So what is max_num_seqs used for? Is there any reason why we would set this value to be lower than the max requests vllm can handle?

thanks
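A rough sketch of the arithmetic behind that startup number, under stated assumptions: the reported figure is taken to be the KV cache capacity in tokens divided by the maximum context length, and max_num_seqs is treated as a plain cap on how many sequences run at once. The token counts below are made up for illustration, and the sketch ignores KV cache pressure from requests of different lengths.

```python
def max_full_context_requests(kv_cache_tokens: int, max_model_len: int) -> float:
    """How many requests fit in the KV cache if each uses the full context."""
    return kv_cache_tokens / max_model_len


def split_running_waiting(num_requests: int, max_num_seqs: int) -> tuple[int, int]:
    """Simplified view: at most `max_num_seqs` run, the rest wait."""
    running = min(num_requests, max_num_seqs)
    return running, num_requests - running


# Hypothetical numbers chosen so the result matches "around 12".
print(max_full_context_requests(kv_cache_tokens=393_216, max_model_len=32_768))
print(split_running_waiting(num_requests=15, max_num_seqs=12))
```

Under these assumptions, requests that use far less than the full context leave room for more than 12 to fit in the cache at once, which is the case where the two limits stop being the same number.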


u/nunodonato — 8 days ago

▲ 7 r/braga

Ceiling fan installation

Good morning, everyone

Any recommendations for a shop/company that sells and installs ceiling fans?

(No, I don't want the ones from Leroy Merlin)

Thanks!


u/nunodonato — 1 month ago

▲ 4 r/kobo

TIL: you can also search for handwritten text in BASIC notebooks!

Well, that's it.

I didn't know exactly why there was a search feature in basic notebooks, since you can't add typed text. But I tried it and lo and behold, it actually finds handwritten text and shows me "screenshots" of where the words appear.

I'm quite impressed!

If anyone has more details, I'd like to understand whether this is "live" or whether it takes some time to index the words, maybe after a sync. I got the feeling that it doesn't find them immediately after writing.


u/nunodonato — 1 month ago

▲ 2 r/Supernote

How's the reading experience? (kobo)

Hi all

I've been using Kobo e-readers for many years, currently a Libra Colour.

But the Kobo's note-taking features are very basic and sometimes annoying. The Supernote ticks so many boxes for me that I'm planning to switch.

However, my usage is still 60% reading and 40% note-taking, so I want to be sure I'm not downgrading in one very important aspect of the device.

Any feedback on the reading experience on the Supernote? Especially from those of you who use the Kobo app (I'd like to keep my huge collection).

thanks in advance


u/nunodonato — 1 month ago

▲ 15 r/PiCodingAgent

Best practices to call Pi inside a container?

Hi everyone

I'm setting up a Docker container for Pi, and I want to start agents from another container, kind of "remotely".

I was wondering if anyone has done something like this, and what the best ways to accomplish it are. I'm focused solely on starting the agent session (passing the prompt); I don't need to read the output synchronously.

My first idea was to call docker exec with the full CLI params, but there are probably better ways to do this.

AI suggested setting up a small HTTP server that receives triggers via an endpoint and then runs the command locally.

Thoughts?

thanks in advance

EDIT: I discovered pi-web (https://github.com/jmfederico/pi-web) and I think this fits my needs
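For reference, the small HTTP trigger idea mentioned above could be sketched roughly like this, using only the Python standard library. Everything specific here is an assumption: the `/trigger` path, the JSON body with a `prompt` field, the port, and especially `AGENT_COMMAND`, which is an `echo` placeholder standing in for whatever the real agent CLI invocation is.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder (assumption): replace with the real agent CLI and its flags.
AGENT_COMMAND = ["echo"]


def build_command(prompt: str) -> list[str]:
    """Append the prompt as a single argument, so no shell quoting is needed."""
    return [*AGENT_COMMAND, prompt]


class TriggerHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/trigger":  # hypothetical endpoint name
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            prompt = json.loads(self.rfile.read(length) or b"{}")["prompt"]
        except (json.JSONDecodeError, KeyError, TypeError):
            self.send_error(400, "expected a JSON body with a 'prompt' field")
            return
        if not isinstance(prompt, str):
            self.send_error(400, "'prompt' must be a string")
            return
        # Fire and forget: start the agent and do not wait for its output.
        subprocess.Popen(
            build_command(prompt),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        self.send_response(202)  # accepted, still running in the background
        self.end_headers()


def serve(host: str = "0.0.0.0", port: int = 8080) -> None:
    """Block forever, handling trigger requests."""
    HTTPServer((host, port), TriggerHandler).serve_forever()
```

Calling `serve()` inside the agent container and POSTing `{"prompt": "..."}` to `/trigger` from the other container would start a session without waiting for it. This sketch has no authentication, so it would only be reasonable on a private Docker network.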

u/nunodonato — 2 months ago

▲ 7 r/opencode

/goal or any other way to ensure an agent reaches a good final result?

Hi folks

I'm testing OpenCode to power an agent to work in the background by receiving requests via an HTTP API.

Right now I'm getting good results, but sometimes it can get stuck (it thinks but doesn't do anything else afterwards). I understand this is related to a bug parsing Qwen3.6, but regardless of that, other problems might also come up in the future.

So I was wondering if there is already a way to prevent the agent from getting stuck and "dead". Hermes has a really cool feature that detects these cases and nudges it to move forward. I'm not yet familiar with all the ins and outs of OpenCode, so I thought I would ask first.

Thanks!


u/nunodonato — 2 months ago

▲ 12 r/MistralAI

Which Mistral model do you recommend for a local agent? (Hermes)

Basically, what's in the title.

I'm setting up an agent for my wife, and I'm looking for a cheap model that can perform well. Mistral sounds good, but I'm a bit confused by the model lineup: small 4, medium 3.5, large 3. It seems that the bigger (better) the model, the older it is. Would small be a good fit?

thanks in advance


u/nunodonato — 3 months ago

▲ 17 r/Vllm+1 crossposts

I'm running Qwen3.6-27B with vLLM at FP16. There are a few known issues with the chat template (I think), and I do get occasional stops in OpenCode or other harnesses.

But in OpenWebUI it's 100x worse. The model stops, sometimes gives me garbage words in a loop, and other times fails tool calls due to bad JSON. There's about a 50% chance that I actually manage to use it.

I don't get it: I'm using the default values and, yes, native tool calls. In vLLM I'm using the recommended params.

What else can I try?


u/nunodonato — 3 months ago
