u/Grand_Competition_99

Help me reduce the token count and get faster replies (I am getting 2 tokens/sec)

Hey, I noticed that openclaw sends a ridiculous number of tokens for every task, around 18k tokens per request.

I am using a free model through openrouter, so it does not perform as well as a flagship model that can handle that much context.

It loads the entire file about me every time, even if it doesn't need it.

Can you tell me ways to speed up the replies?

Sometimes I get only 2 tokens per second.

The output speed ranges from 2 to 16 tokens/sec, but the average is on the lower side.
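
For a rough sense of what those speeds mean in waiting time, here is a back-of-the-envelope sketch. The 300-token reply length is an assumed figure for illustration, not something from the post, and the calculation ignores the time the model spends reading the 18k-token prompt.

```python
def reply_seconds(reply_tokens: int, tokens_per_sec: float) -> float:
    """Time to generate a reply of the given length at a steady output speed."""
    return reply_tokens / tokens_per_sec


REPLY_TOKENS = 300  # assumed reply length, for illustration only

# The two ends of the speed range reported above (2 and 16 tokens/sec).
for speed in (2, 16):
    print(f"{speed} tokens/sec -> {reply_seconds(REPLY_TOKENS, speed):.1f} s")
```

At the low end of the range, even a short reply takes minutes, so the output speed matters as much as the prompt size.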

reddit.com
u/Grand_Competition_99 — 7 days ago
▲ 1 r/openrouter+1 crossposts

Decrease the token count, as the model replies slowly

Hi, I started using openclaw. I had multiple issues with the gateway, models and setup.

It is now working, but the main issue is the token count.

I am using gpt oss 120B and facing slow replies.

I am using the openrouter API and the model itself is free, so I know that it might be slow.
To get this straight: I know that every small task dumps all the files into the context. I just want to know how I can decrease the token count.

It sends nearly 18K tokens per input, and the output speed is sometimes 2-4 tokens/sec.

It has gone up to nearly 10 to 20 tokens/sec sometimes, but it is mostly slow.

How can I reduce it? Help, guys!!!
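
One general way to think about cutting the input is a token budget: estimate what each file costs and only include files that fit. The sketch below is not an openclaw feature or setting. The function names, the file names and the 4-characters-per-token rule of thumb are all assumptions for illustration, and a real tokenizer will give different counts.

```python
# Rough heuristic (an assumption): about 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Approximate token count; real tokenizers will differ."""
    return len(text) // CHARS_PER_TOKEN


def pick_context(files: dict[str, str], budget: int) -> list[str]:
    """Walk files in priority order and keep each one that still fits the budget."""
    kept, used = [], 0
    for name, text in files.items():
        cost = estimate_tokens(text)
        if used + cost <= budget:
            kept.append(name)
            used += cost
    return kept


# Hypothetical files: a large "about me" file next to two small ones.
files = {
    "task.md": "x" * 2000,       # ~500 tokens
    "about_me.md": "x" * 60000,  # ~15000 tokens
    "notes.md": "x" * 4000,      # ~1000 tokens
}
print(pick_context(files, budget=4000))
```

With a 4000-token budget the large file is skipped and the two small ones are kept, which is the kind of trimming that would bring an 18K-token input down.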

u/Grand_Competition_99 — 8 days ago
▲ 4 r/ollama+2 crossposts

Openclaw runs very slowly locally. Openclaw web is not feasible.

Hey, I tried to run openclaw locally.

I chose the ollama route since I am just a student, and paying for the API or running openclaw in the cloud would cost money.

I tried the ollama deepseek 1.5B model, which is small and fast and can run on my laptop.

I have an RTX 4050 with 6 GB of VRAM.

Running the model with ollama alone is fast, at speeds I can easily work with, but when I used that model through openclaw, each query was very slow. Other things in openclaw also felt very slow.

The slow things include the replies: even "hello" is answered very late (not a problem when I run the model on its own). The UI is slow too: when I go to different tabs like skills, channels, instances, sessions, cron jobs and others, they feel like they take 1 minute to load.
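
To check whether the slowness comes from the model or from openclaw, one option is to time the same prompt against Ollama directly. A minimal sketch, assuming Ollama is serving on its default local port and that the model tag is `deepseek-r1:1.5b` (the post does not give the exact tag). It relies on the `eval_count` and `eval_duration` timing fields that Ollama's generate endpoint returns for non-streaming requests.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address


def tokens_per_second(response: dict) -> float:
    """Generation speed from Ollama's timing fields (eval_duration is in nanoseconds)."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)


def ask_ollama(model: str, prompt: str) -> dict:
    """Send one non-streaming request straight to Ollama, bypassing openclaw."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)


# Usage (needs a running Ollama server; the model tag is an assumption):
# result = ask_ollama("deepseek-r1:1.5b", "hello")
# print(f"{tokens_per_second(result):.1f} tokens/sec")
# print("prompt tokens:", result.get("prompt_eval_count"))
```

If the direct request is fast while the same "hello" through openclaw is slow, the extra time is likely going into the much larger prompt openclaw sends rather than into the model itself.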

I want a way to run openclaw faster locally (I can't pay for the web version, as I just want to experiment).

I heard there is a lighter version of it, but I didn't understand what that means, so please recommend it if you know (idk what it is).

My model is small, but I can also run a 7B or 9B model in ollama. I can link the lmstudio API too, so running the model is not hard, but openclaw itself is slow.

Everyone teaching openclaw is either on good hardware (Mac, or high-end Windows or Linux) or using web hosting so their PC won't break. I just want to experiment on a small scale and will do so carefully.

Can anyone help me run it locally much better?

u/Grand_Competition_99 — 11 days ago