u/Arvy_0

Best local multimodal LLM for 8GB VRAM?

Hi everyone, I’m looking for recommendations for a good local multimodal model for my project: an AI-based assistant for visually impaired users that helps them operate an air conditioner remote control. The model needs strong multimodal understanding because it must read, recognize, and analyze the buttons, labels, symbols, and layout of different AC remotes from camera input. Right now I’m using Qwen 3.5 9B quantized to 4-bit with Unsloth, and the deployment target is an RTX 4060 with 8GB VRAM. The current model still struggles to interpret remote display states correctly, especially small indicators such as logos, icons, bars, mode symbols, and fan speed indicators. I’m trying to find the best balance between multimodal accuracy and VRAM efficiency for local inference. If anyone has experience with lightweight VLMs or local multimodal setups for assistive technology projects, I’d really appreciate your recommendations for models, quantization strategies, or inference frameworks.
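For context on the VRAM constraint, here is a rough back-of-the-envelope sketch of the arithmetic. It assumes exactly 9 billion parameters at a flat 4 bits per weight and treats the card as 8 GiB. It ignores quantization overhead (scales and zero points), the vision encoder, the KV cache, and activations, so real usage will be higher. The function names are hypothetical and only for illustration.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GiB.

    Assumption: every weight is stored at exactly `bits_per_weight` bits,
    with no extra storage for quantization scales or unquantized layers.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 1024**3


def headroom_gib(vram_gib: float, n_params: float, bits_per_weight: float) -> float:
    """VRAM left over after the weights, for KV cache, activations, etc."""
    return vram_gib - weight_memory_gib(n_params, bits_per_weight)


if __name__ == "__main__":
    weights = weight_memory_gib(9e9, 4)   # assumed 9B params, 4-bit
    spare = headroom_gib(8, 9e9, 4)       # assumed 8 GiB card
    print(f"weights: ~{weights:.2f} GiB, headroom: ~{spare:.2f} GiB")
```

Under those assumptions the weights take a bit over 4 GiB, which leaves a little under 4 GiB for everything else. High-resolution camera frames produce many image tokens, so that remaining space is what I expect to be the tight part.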

reddit.com
u/Arvy_0 — 10 days ago

Best free AI Agent provider?

Hi everyone, I’m looking for recommendations for the best free AI agent providers, and for which models work best for coding and general development workflows. So far I’ve mainly been using Cursor, and honestly it has given me the best overall experience for code generation, context handling, and productivity. I also tried Cline with DeepSeek models, but in my experience the coding quality and reasoning were a bit weaker than with Cursor. Recently I tested Codex as well, and it felt pretty decent overall. I’m curious what other people are using in 2026 for free or low-cost AI coding agents. Which providers and models do you think are currently the strongest for real-world coding tasks, debugging, planning, and autonomous agent workflows?

reddit.com
u/Arvy_0 — 10 days ago