r/OpenSourceAI

Why aren’t more companies using Sarvam-105B? Isn’t it the cheapest most capable model?
▲ 158 r/OpenSourceAI+1 crossposts

I find it a bit perplexing that most Indian companies aren’t using, or even talking about, the Sarvam-105B model, which is the cheapest of the most capable models out there. I'm asking out of curiosity: my team is building something with Sarvam-105B, and I want to check whether others are doing the same. If not, why not?
With a depreciating rupee, aren’t you worried about rising AI input costs in your workflows?

u/Ornery-Wrongdoer-865 — 18 hours ago
▲ 100 r/OpenSourceAI+2 crossposts

Open-Source Microsoft Office Extensions for Open WebUI

Hi Open WebUI community 👋

I'm Nick, part of the Ianustec team, and we've been big fans of Open WebUI for a long time.

We really appreciate what this community is building around open-source, self-hosted AI, so we wanted to contribute.

We are currently developing a fully open-source suite of Microsoft Office extensions designed to work with Open WebUI, including integrations for:

  • PowerPoint
  • Word
  • Excel
  • Outlook

Our goal is to make AI workflows native inside Microsoft Office while keeping everything open, flexible, and compatible with the Open WebUI ecosystem.

Some of the things we are working on:

  • AI-assisted document creation in Word
  • Spreadsheet analysis and automation in Excel
  • Creating and editing presentations in PowerPoint
  • Drafting and summarizing emails in Outlook

Everything will be released as open source.

We'd also love to collaborate with the community and hear your thoughts:

  • Which features would be most useful to you?
  • What would make these tools truly valuable in your daily workflow?

We're excited to collaborate with this community and contribute to the ecosystem 🚀

u/NicErGoblin9 — 1 day ago
▲ 27 r/OpenSourceAI+14 crossposts

Ask questions across your Markdown notes using a fully local Graph RAG engine. Built for Obsidian vaults, works with any folder of Markdown files. Extracts entity-relation triples from wikilinks & YAML frontmatter, retrieves answers via hybrid search (vector + BM25 + temporal). Multilingual. No cloud. Runs on Ollama.

https://github.com/benmaster82/Kwipu
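
For readers wondering how a hybrid retriever can combine vector, BM25, and temporal signals, here is a minimal sketch using reciprocal rank fusion. This is an assumption for illustration, not necessarily how Kwipu merges its results; the file names, the three ranked lists, and `k=60` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a common default, not a Kwipu setting.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical per-retriever results for one query (best match first).
vector_hits = ["note_a.md", "note_b.md", "note_c.md"]
bm25_hits = ["note_b.md", "note_a.md", "note_d.md"]
recent_hits = ["note_b.md", "note_d.md", "note_a.md"]  # newest first

fused = reciprocal_rank_fusion([vector_hits, bm25_hits, recent_hits])
print(fused)
```

Rank-based fusion avoids having to normalize cosine similarities, BM25 scores, and timestamps onto one scale, which is why it is a common first choice for hybrid search.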

u/WritHerAI — 24 hours ago
▲ 27 r/OpenSourceAI+6 crossposts

Open-source CLI for red-teaming LLM agents before they touch tools and memory

Sharing RedThread, an open-source CLI for AI red-team campaigns:

https://github.com/matheusht/redthread

The angle is AI agents as an attack surface. Prompt injection gets more interesting once the model can call tools, delegate to workers, write memory, retry failed actions, or propose guardrail changes.

RedThread is built for staging/internal targets. It runs LLM red-team campaigns, records traces, scores failures, and can replay exploit and benign cases before treating a defense as evidence.

Current pieces:

  • PAIR, TAP, Crescendo, and GS-MCTS attack flows
  • JudgeAgent/rubric scoring
  • replay-backed defense proposals
  • telemetry/drift signals
  • agentic checks for tool poisoning, confused deputy paths, canary propagation, and budget amplification

It is not a magic prompt shield and not broad production enforcement.

Looking for people who test agent workflows and can suggest realistic failure cases or target adapters.

▲ 1.1k r/OpenSourceAI+3 crossposts

Open Source Palantir on Git

Open Source Palantir

We're building OSIRIS - The Open-Source Palantir Alternative

Feel free to open a pull request; the team will review and merge it if applicable 🙏

Just launched at osirisai.live - a free, open-source global intelligence platform:

Real-Time Tracking:

- 10,000+ commercial, military and private aircraft live on a 3D globe

- 2,000+ satellites including ISS

- 1,400+ worldwide CCTV camera feeds

- Earthquakes, wildfires, nuclear facilities and severe weather

Built-In OSINT Tools (no installs needed):

- Nmap port scanning from the browser

- DNS record lookup and enumeration

- WHOIS domain intelligence

- SSL/TLS certificate transparency

- BGP routing and ASN lookup

- Threat intelligence and IP reputation

All running on a 3D interactive globe with day/night cycle, 20+ live API feeds, and a SIGINT news aggregator.

Live: https://osirisai.live

GitHub: https://github.com/simplifaisoul/osiris

Free. Open Source. No sign-up required.

u/Gold-Comfortable-340 — 2 days ago
▲ 13 r/OpenSourceAI+4 crossposts

Free RAG Interview Q&A repo with all 10 types of RAG. 50 questions with detailed answers, difficulty tags, and a decision tree. Contributors welcome!

Hey everyone,

I've been going deep on RAG architectures lately and couldn't find a single resource that covered all the modern variants in one place, so I built one and open-sourced it.

What's in the repo:

  • 10 sections covering every major RAG type
  • 50 interview questions tagged [Basic] / [Intermediate] / [Advanced]
  • Detailed answers with architecture diagrams, code snippets, and trade-off tables
  • A cheatsheet with a decision tree ("which RAG should I use?")
  • GitHub Pages site auto-deployed on every push

RAG types covered: Naive, Advanced, Modular, Agentic, Graph, Corrective (CRAG), Self-RAG, Speculative, Multi-modal, and Long-context RAG.

https://github.com/ather-techie/rag-interview-questions

Looking for contributors! If you've been in an ML/LLM interview recently and got a question not covered here, please open a PR or drop it in the comments. I'll add it with credit.

If this is useful, a star on GitHub goes a long way; it helps others discover it. Thanks!

u/Western-Slip199 — 1 day ago
▲ 96 r/OpenSourceAI+11 crossposts

Finally releasing Micracode - an open-source, self-hostable AI app builder.

It’s basically an open-source alternative to Lovable that runs on your own server and lets you build/deploy apps instantly.

- batteries-included: db, files, auth, payments (planning to support in future)

- code-editor

- BYO AI key

repo link: https://github.com/Jamessdevops/micracode

(Any star will be super appreciated ❤️)

I am basically building things together with our contributors based on your feedback :)

I'd be happy to hear about more things to implement.

Thank you all!

u/james-paul0905 — 1 day ago
▲ 4 r/OpenSourceAI+4 crossposts

I built a small AI tool that checks if a text or email is a scam

Reason I built this: family group chats keep getting the same kind of message. "Is this real?" with a screenshot of some sketchy text. Fake USPS fee, IRS arrest threat, "wrong number" that pivots to a crypto pitch a few replies later. Same thing every week.

The people getting these are usually the ones least equipped to spot them, and the kids/grandkids they ping aren't always around in time.

So, small open-source web app for it. Paste the message or upload a screenshot, get a green/yellow/red verdict in plain English. Built so someone in their 70s can use it, not security people.

A few things worth mentioning. It's fully client-side, no backend, no telemetry. The message goes from your browser straight to Anthropic. There isn't a server I could peek at if I wanted to.

It's BYOK, so you plug in your own Anthropic API key (free to start). About a tenth of a cent per scan. I'm never monetizing this.

The scam pattern library is just JSON files in /scam-patterns/. If you've seen something in the wild that's not covered, PR a new file and everyone's version gets better. No retraining.

Built over a weekend with Claude Code after writing a proper spec. Stack is Vite, React, TypeScript, Tailwind. MIT.

Repo: https://github.com/srivatp2-code/scam-shield

Being honest about the limits, Claude can be wrong on both sides. It'll occasionally call a legit message suspicious, and it'll miss novel scams. It's a second opinion, not gospel. Always confirm with the real sender through a channel you trust.

What scam types am I missing in the starter library? Genuinely interested in adding the ones people have seen recently.

u/fhard007 — 1 day ago
▲ 6 r/OpenSourceAI+3 crossposts

Built an agent that builds agents — pure Python, Qwen3.6 35b a3b Q8_0 MTP

Hi, I built this agentic AI:

Closed-loop system that ships standalone Python agents.

What's different:

- Interviews you until it understands the request before building anything

- Two testing stages: prompt validation via LLM invoke, then real subprocess execution of generated code. Not the same thing.

- Self-referential: injects its own source as a reference template for generation

- Structured rating schema drives iteration. Human approval gate before anything saves.
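
To make the second testing stage concrete, here is a minimal sketch of running generated code in a real subprocess. It is an illustration under assumptions, not the repo's actual implementation: the function name `run_generated_code`, the temp-file layout, and the 10-second timeout are hypothetical. Note that a subprocess gives isolation from the parent interpreter's state, not a security sandbox.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_generated_code(source: str, timeout_s: float = 10.0):
    """Write generated source to a temp file and run it in a fresh interpreter.

    Returns (exit_code, stdout, stderr). A timeout is reported as exit code -1.
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "agent.py"
        script.write_text(source, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return -1, "", "timed out"
        return proc.returncode, proc.stdout, proc.stderr


# A passing script and a failing one, to show both outcomes.
ok = run_generated_code("print('hello from generated agent')")
broken = run_generated_code("raise SystemExit(3)")
```

The exit code and captured stderr are the kind of signals a rating step could consume before a human approves the result.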

Runs on Qwen3.6-35B a3b Q8_0 locally.

https://github.com/0c33/Agentic-Ai

Give it a shot and tell me what you think.

u/NigaTroubles — 2 days ago
▲ 6 r/OpenSourceAI+3 crossposts

The npm/Docker/PyPI supply chain security pattern is repeating with MCP, and we are at the 2015 moment

The sequence is always the same: registry launches and grows fast, minimal vetting because the priority is growth, first wave of incidents, community outrage, tooling catches up, security becomes a baseline expectation. npm took about three years to go from event-stream to npm audit being standard. Docker Hub took a similar amount of time.

MCP is at step 2 heading into step 3. The numbers from a scan of 500 Smithery servers this month: 18.8% had security findings, 6 had live hardcoded credentials, none were caught by a pre-publication scan because there is no pre-publication scan. A Check Point research disclosure in February showed an 8.7 CVSS attack chain against Claude Code where the entire payload was natural language in a config file.

The difference from npm is what the malicious content does. An npm package executes unauthorized code. A malicious MCP skill file gives unauthorized instructions to an agent that already has access to your tools, file system, and APIs. The LLM cannot distinguish between instructions from the user and instructions from a skill file. Both arrive in the context window and both get acted on. Existing security tooling has no model for this.

The fix is the same three layers it always is: pre-publication registry scanning, CI integration for consumers, and a public advisory database. None of the three exist yet in any mature form for MCP.

Whether the timeline is one year or three depends on whether registry operators move proactively or wait for a sufficiently public incident. Based on how npm and Docker played out, my bet is on the incident coming first.

We built a static scanner for this: pip install bawbel. It scans skill files and MCP server configs without executing anything. The vulnerability database it checks against is the AVE.
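
As a rough illustration of what static scanning of a config can look like, here is a minimal sketch that walks a JSON config and flags strings that look like hardcoded credentials, without executing anything. This is not bawbel's implementation or rule set; the two patterns, the helper names, and the sample config are assumptions made for the example.

```python
import json
import re

# Illustrative patterns only; a real scanner would carry a much larger rule set.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def iter_strings(node, path="$"):
    """Yield (json_path, string_value) for every string in a parsed JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from iter_strings(value, f"{path}[{index}]")
    elif isinstance(node, str):
        yield path, node


def scan_config(text: str):
    """Return a list of (json_path, rule_name) findings. Nothing is executed."""
    findings = []
    for path, value in iter_strings(json.loads(text)):
        for rule, pattern in SECRET_PATTERNS.items():
            if pattern.search(value):
                findings.append((path, rule))
    return findings


# Hypothetical MCP server config; the key is AWS's documented example value.
sample = json.dumps({
    "mcpServers": {
        "files": {
            "command": "node",
            "args": ["server.js"],
            "env": {"AWS_ACCESS_KEY_ID": "AKIAIOSFODNN7EXAMPLE"},
        }
    }
})
findings = scan_config(sample)
```

Pattern matching like this only covers the hardcoded-credential class of finding. Natural-language instructions hidden in a skill file, the harder problem described above, need a different kind of check.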

u/SelectionBitter6821 — 2 days ago
▲ 12 r/OpenSourceAI+1 crossposts

🧬 flux-genotype: A self-evolving AI kernel that runs on CPU with Ollama — mutates its own architecture

`🧬 Flux‑Genotype – A CPU LLM that rewrites itself`

I've been working on an open-source kernel called **flux-genotype**. It orchestrates local models (TinyLlama, Llama 3.2, Hermes 3, DeepSeek-Coder) into a self-modifying ecosystem. Everything runs on **CPU** — I tested it on a Xeon without AVX2, 20 GB RAM.

> **Important:** this is an alpha. It works, it mutates, it evolves — but there's a lot of work ahead. The **MetaDesigner**, in particular, is the module I'm focusing on next. Right now it proposes architectural changes by writing new `.flux` files, but the validation and application pipeline needs to be more robust. The vision is to make it fully autonomous: an external architect that watches the ecosystem, diagnoses weaknesses, and rewrites the structure to improve confidence. It's not there yet, but the foundation is solid.

## How it works

  1. Ask a question → fast model (TinyLlama) answers.
  2. Judge model evaluates the answer (0–1). Initially this was Llama 3.2.
  3. If confidence drops below the golden ratio threshold (≈0.618), the ecosystem mutates its own structure.
  4. A **MetaDesigner** (Hermes 3) writes new `.flux` architecture files, which get validated by a Lark parser and applied.
  5. The system tracks confidence history with EMA and adapts temperature dynamically.
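
Steps 3 and 5 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the kernel's code: the smoothing factor `alpha=0.3`, the function names, and the choice to compare the smoothed value (rather than the raw judge score) against the threshold are all hypothetical.

```python
# The golden ratio conjugate, (sqrt(5) - 1) / 2, is about 0.618.
GOLDEN_THRESHOLD = (5 ** 0.5 - 1) / 2


def update_ema(previous, score, alpha=0.3):
    """Exponential moving average; alpha is a hypothetical smoothing factor."""
    if previous is None:
        return score  # first observation seeds the average
    return alpha * score + (1 - alpha) * previous


def should_mutate(confidence):
    """True when confidence has dropped below the golden ratio threshold."""
    return confidence < GOLDEN_THRESHOLD


ema = None
history = []
for judge_score in [0.9, 0.7, 0.4, 0.3]:
    ema = update_ema(ema, judge_score)
    history.append((round(ema, 3), should_mutate(ema)))

print(history)
```

With smoothing, a single bad answer does not trigger a mutation; it takes a sustained drop in judge scores to pull the average under the threshold.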

## Real example of self‑modification

The mutation can also replace the Judge. During one of the growth cycles, the MetaDesigner proposed swapping the Judge from **Llama 3.2** to **DeepSeek-Coder 6.7B**. The new configuration was tested, scored better, and the ecosystem applied the change permanently.

The system is not just tweaking parameters — it's rewriting its own **division of labor between models**.

## Why this is different

- It mutates its own architecture, not just model weights.

- It can replace its own Judge with a different model if performance improves.

- It has memory (confidence history with Exponential Moving Average).

- It uses a custom language (`.flux`) with a formal grammar — not YAML, not JSON.

- It runs on modest hardware. No GPU. Just a CPU and 20 GB of RAM.

## If you want to understand the architecture deeply

I wrote a **technical manifesto** that defines FLUX as a formal Architecture Description Language for self-evolving cognitive ecosystems. It covers the fractal design, the OODA loop, the role of the golden ratio, and the long-term vision (including the MetaDesigner). It's in the repo:

📄 `/papers/FLUX-Kernel.pdf`

## The companion novel

There's also a novel called **"IF THIS IS A ROBOT"** (in Italian and English, CC BY-NC-SA 4.0) that tells the story of a guy who finds this kernel running on a forgotten server. The novel is basically the kernel's manual. But the code stands on its own.

## Links

- **Repo:** [github.com/flux-genotype/nodo_zero](https://github.com/flux-genotype/nodo_zero)

- Kernel is **MIT-licensed**. Novel is **CC BY-NC-SA 4.0**.

Happy to answer questions, and **open to collaborators** who want to help push the MetaDesigner forward.

u/Inner-Dot-7490 — 3 days ago

Kasetto - a declarative AI agent environment manager

https://preview.redd.it/e5mlrb1pby1h1.png?width=1650&format=png&auto=webp&s=96b31df64005a0d96a4e5900cd5c8dd39426e013

I've been building Kasetto: a single Rust binary that takes one YAML config and syncs Skills and MCP servers into every AI agent on your machine or your teammates' machines. Supported: Claude Code, Cursor, Codex, Windsurf, Copilot, Gemini CLI, and more.

Sources can be GitHub, GitLab, Bitbucket, Codeberg, Gitea, self-hosted instances, or local directories. MCP configs are auto-merged into the right format per agent so you don't have to hand-edit four different settings files every time you add a server.

The core idea: the YAML is the source of truth. Version it, share it, bootstrap a teammate's whole agent setup in one command. No registry, no boilerplate — any directory with a SKILL.md is a skill.

>Inspired by uv - what uv did for Python packages, Kasetto aims to do for AI skills.

What it gives you:

  • Declarative - one YAML describes your entire setup. Version-controlled, readable, auditable.
  • Multi-agent - Claude Code, Cursor, Codex, Windsurf, Copilot, Gemini CLI, and more. One config, every agent updated.
  • Enterprise & private repos — GitHub, GitLab, Bitbucket, Codeberg, Gitea, and self-hosted instances out of the box.
  • Skills & MCP - any directory with a SKILL.md is a skill. MCP server configs are auto-merged into every supported format (Cursor JSON, Claude JSON, Copilot VS Code, Codex TOML).
  • Fast - written in Rust. SHA-256 content hashing and lock file diffing mean only what changed gets touched.
  • Universal - single static binary for macOS, Linux, and Windows. Install as kasetto, run as kst. CI-friendly with --json output and proper exit codes.
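
The "only what changed gets touched" idea can be sketched as follows. This is a minimal illustration, not Kasetto's implementation (which is written in Rust): the lock format as a plain `{path: hash}` mapping and the function names are assumptions made for the example.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def diff_lock(lock: dict, current: dict):
    """Compare a lock mapping {path: hash} to freshly computed hashes."""
    added = sorted(p for p in current if p not in lock)
    removed = sorted(p for p in lock if p not in current)
    changed = sorted(p for p in current if p in lock and lock[p] != current[p])
    return added, changed, removed


# Hypothetical state recorded at the last sync.
lock = {
    "react-patterns/SKILL.md": content_hash(b"v1"),
    "go-standards/SKILL.md": content_hash(b"v1"),
}
# Hypothetical state fetched from the sources now.
current = {
    "react-patterns/SKILL.md": content_hash(b"v2"),  # edited upstream
    "go-standards/SKILL.md": content_hash(b"v1"),    # unchanged
    "custom-lint/SKILL.md": content_hash(b"v1"),     # newly added
}
added, changed, removed = diff_lock(lock, current)
```

Only the paths in `added`, `changed`, and `removed` would need any file operations; everything else can be skipped.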

A kasetto.yaml looks like this - multiple agents, multiple sources, pinned refs/branches, per-skill paths, and an optional extends: for inheriting a shared team base:

# inherit a shared base config — overrides merge on top
# extends: github.com/acme/kasetto-base/raw/main/kasetto.yaml
agent:
  - claude-code
  - cursor
  - opencode

scope: project # or global
# destination: ./.agents/skills  # optional, override install path

skills:
  - source: github.com/acme/frontend-pack
    skills: "*"

  - source: gitlab.com/team/internal-tools
    branch: master
    skills:
      - react-patterns
      - go-standards

  - source: codeberg.org/oss/shared
    ref: v2.1.0
    skills:
      - name: custom-lint
        path: rules/custom-lint
      - name: format-helpers
        path: rules/format

mcps:
  - source: github.com/acme/mcp-pack
    mcps: "*"

  - source: github.com/acme/monorepo
    ref: v1.4.0
    mcps:
      - github
      - linear

Running it:

# uses ./kasetto.yaml in the current directory
kst sync

# or point at a shared team config over HTTPS
kst sync --config https://example.com/team-skills.yaml

Want bare kst sync to always pull from a remote URL? Persist it once in ~/.config/kasetto/config.yaml:

source: https://github.com/pivoshenko/pivoshenko.ai/blob/main/kasetto.yaml

After that, kst sync resolves the URL automatically — no --config flag needed. Then to see what landed:

kst list      # interactive browser with vim-style navigation
kst doctor    # version, paths, last sync status

For a real, runnable example: pivoshenko/pivoshenko.ai is my public config — it pulls skills from Anthropic, Vercel Labs, Apollo, and a few independent authors into Claude Code and OpenCode. Fork it, point your own config at it with extends:, or use it as the source: above.

Install:

curl -fsSL kasetto.dev/install | sh
# or: brew install pivoshenko/tap/kasetto
# or: cargo install kasetto

Docs: https://kasetto.dev

Repository: https://github.com/pivoshenko/kasetto

Happy to hear feedback, especially from anyone juggling skills across multiple agents or sharing setups across a team.

u/pivoshenko — 3 days ago
▲ 57 r/OpenSourceAI+1 crossposts

I built a free tool that installs ComfyUI on any cloud GPU in one command and saves your whole setup between sessions. Open source.

Got frustrated reinstalling ComfyUI every time I rented a GPU. Custom nodes, models, configs: every session started with 45 minutes of setup before I could actually generate anything. Docker images got stale fast, and different providers have different base images, so nothing was truly portable.

So I built swm. It's a CLI that handles GPU rental and setup across 10 cloud providers.

For ComfyUI specifically:

  • swm gpus -g a100 --max-price 2.00 --sort price shows you the cheapest GPU across RunPod, Vast.ai, Lambda, and 7 others
  • swm pod create — spins up whatever's cheapest
  • swm setup install comfyui — installs ComfyUI on the pod
  • Your whole workspace (custom nodes, models, outputs, everything) syncs to S3 so next session you just pull and it's all there. No starting from scratch every time.

The other thing that's saved me a lot of money is the lifecycle guard. It watches GPU utilization and if nothing's happening for 30 minutes (configurable), it saves your workspace and terminates the instance. I used to fall asleep or get distracted mid-session and wake up to stupid bills. Doesn't happen anymore.
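
The decision logic of an idle guard like this can be sketched as a pure function. This is an illustration under assumptions, not swm's code: the function name, the 5% busy threshold, and the sample format are hypothetical, and the utilization readings are assumed to be collected elsewhere.

```python
def should_terminate(samples, now, idle_window_s=30 * 60, busy_threshold=5.0):
    """Decide whether a pod has been idle for the whole window.

    samples: list of (unix_time, gpu_util_percent), oldest first.
    Returns True only when the history spans the full window and no sample
    inside it exceeds busy_threshold.
    """
    if not samples or now - samples[0][0] < idle_window_s:
        return False  # not enough history to call the pod idle
    window = [util for t, util in samples if now - t <= idle_window_s]
    return bool(window) and max(window) <= busy_threshold


# Hypothetical readings taken every 5 minutes over 40 minutes.
idle = [(t, 0.0) for t in range(0, 2401, 300)]
busy = [(t, 80.0 if t == 1800 else 0.0) for t in range(0, 2401, 300)]
fresh = [(2000, 0.0), (2300, 0.0)]

print(should_terminate(idle, now=2400))   # idle for the full window
print(should_terminate(busy, now=2400))   # one busy sample in the window
print(should_terminate(fresh, now=2400))  # pod only just started
```

Requiring a full window of history avoids terminating a pod that has just booted and has not started work yet.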

It also works with vLLM, Ollama, Open WebUI, SwarmUI, and Axolotl if you do more than just SD.

Free, open source, Apache 2.0. pipx install swm-gpu

Site: https://swmgpu.com

GitHub: https://github.com/swm-gpu/swm

Curious if anyone else has been dealing with the same setup-every-time problem or if I'm the only one who was doing it wrong lol. Open to feedback on what to build next.

u/Glensta — 5 days ago
▲ 6 r/OpenSourceAI+4 crossposts

DevHelper v2: an app for building interactive prototypes with the help of AI, from idea to code, 100% free.

Hi guys,
I've been building this app for more than 5 years. More info is in the video and on the official site.

https://smileytech.mk/devhelper

https://youtu.be/aU2OA4v5C-o?si=j8ITaMMdRmB0FtbO

u/SmileyTech-mk — 5 days ago
▲ 12 r/OpenSourceAI+3 crossposts

open-source AI evaluation platform

The problem I kept seeing:

Companies are deploying AI agents into healthcare, legal, and finance. Their testing process is one developer asking it a few questions and saying "looks good."

The people who actually know what a correct answer looks like — doctors, lawyers, compliance officers — have zero tools they can use. Everything in the eval space requires Python, CLI setup, or JSON configs. Completely inaccessible to domain experts.

What I built:

EvalDesk — open source, self-hostable, no-code AI evaluation.

The workflow is three steps:

Designed specifically so a doctor or lawyer can use it without an engineer in the room. Self-hostable so sensitive data never leaves your infrastructure — critical for HIPAA and legal contexts.

Current features:

What I'm looking for:

Honest feedback. Is this solving a real problem or am I wrong about the gap? Anyone working in AI deployment in regulated industries — does this workflow actually match how your team operates?

GitHub: https://github.com/ramandagar/EvalDesk

u/Immediate-Tap-4777 — 5 days ago

Best affordable hosting for openclaw-style ai agents?

I'm trying to keep costs reasonable while still having something reliable enough to leave running all day. Curious what VPS providers people here recommend for balancing simplicity and uptime. Is Hostinger's 1-click OpenClaw a good option? If not, I'd need some more insights. Help your girl out, haha, I'm desperate to make this work.

u/Flaky-Factor-5128 — 4 days ago
▲ 6 r/OpenSourceAI+2 crossposts

GetMCP: Zero Trust for AI agents

Just shipped v0.1.0 of something I've been building. Sharing because I haven't seen anyone solve this end-to-end as a self-hostable thing.

The problem. AI agents (Claude, ChatGPT, Cursor, in-house bots) are starting to make real calls into production APIs. Most companies are handing them a single long-lived API key and praying. There's no per-request audit, no per-agent revocation, no policy layer, no human-in-the-loop for sensitive mutations.

What GetMCP does:
- Generates two MCP servers from any OpenAPI spec: Internal (full surface) and External (scoped/customer-safe). LLM-classified, human-overridable per endpoint.
- Runs as a streaming proxy in front of them: auth, agent identity (revocable in 5s), 5 rule types (allowlist / block / audit / rate-limit / Slack approval).
- Tamper-evident audit log: every call writes one row to a per-org sha256 hash chain. GET /audit/verify walks it end-to-end. Property-tested with 200 random inserts + 50 random tampers, all detected.
- Slack approvals with HMAC-signed callbacks and an idempotent state machine.
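
For anyone unfamiliar with hash-chained audit logs, here is a minimal sketch of the idea. It is not GetMCP's implementation (which is NestJS + Postgres): the row layout, the all-zero genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for an empty chain


def row_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous row's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    """Append one audit row that commits to everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": row_hash(prev, payload)})


def verify(chain: list) -> bool:
    """Walk the chain end-to-end and recompute every hash."""
    prev = GENESIS
    for row in chain:
        if row["prev"] != prev or row["hash"] != row_hash(prev, row["payload"]):
            return False
        prev = row["hash"]
    return True


log = []
append(log, {"agent": "agent-1", "call": "GET /orders"})
append(log, {"agent": "agent-1", "call": "POST /refunds"})
intact = verify(log)

# Editing an old row changes its hash, which breaks the walk at that row.
log[0]["payload"]["call"] = "GET /users"
tampered_detected = not verify(log)
```

Because each row's hash covers the previous hash, an attacker would have to rewrite every later row as well, which is what makes the log tamper-evident rather than tamper-proof.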

Stack: NestJS + Postgres + React. Apache 2.0. Single bash command to bootstrap (./deploy/scripts/bootstrap.sh) generates secrets, brings up Postgres + API + dashboard, seeds a demo org. Helm chart included for k8s. No telemetry, no phone-home, no license server.

Repo: https://github.com/Rayenbabdallah/GetMCP

Looking for honest feedback, especially from anyone who's tried to safely expose APIs to AI agents in their homelab or at work. What did I miss? Where are the ergonomics broken? PRs welcome.

u/rayen_ba — 6 days ago
▲ 286 r/OpenSourceAI+4 crossposts

Open Source Palantir

We're building OSIRIS - The Open-Source Palantir Alternative

Please keep in mind we are early in development, and any feedback is much appreciated.

Just launched at osirisai.live - a free, open-source global intelligence platform:

Real-Time Tracking:

- 10,000+ commercial, military and private aircraft live on a 3D globe

- 2,000+ satellites including ISS

- 1,400+ worldwide CCTV camera feeds

- Earthquakes, wildfires, nuclear facilities and severe weather

Built-In OSINT Tools (no installs needed):

- Nmap port scanning from the browser

- DNS record lookup and enumeration

- WHOIS domain intelligence

- SSL/TLS certificate transparency

- BGP routing and ASN lookup

- Threat intelligence and IP reputation

All running on a 3D interactive globe with day/night cycle, 20+ live API feeds, and a SIGINT news aggregator.

Live: https://osirisai.live

GitHub: https://github.com/simplifaisoul/osiris

Free. Open Source. No sign-up required.

#OSINT #CyberSecurity #InfoSec #ThreatIntel #OpenSource #Nmap #PalantirAlternative #Intelligence

u/Gold-Comfortable-340 — 8 days ago

Why hasn't TurboQuant been implemented in llama.cpp yet? (Genuine question from a hobbyist)

Hi everyone,
I've been following the local LLM scene for a while, but I lack the deep technical background in C++ or low-level CUDA programming to understand the inner workings of quantization frameworks.
Recently, I’ve been reading about **TurboQuant** and its performance claims. I know there are repos out there with implementations, like the one by **TheTom**, but it got me wondering: **Why hasn't it been integrated or ported into the main llama.cpp project yet?**
Is there a fundamental architectural incompatibility between how llama.cpp (GGML) handles inference and how TurboQuant is designed? Or is it simply a matter of community priority, given that formats like GGUF (with IQ/Q quantizations) are already highly optimized and widely adopted?
Thanks for the answers!

u/InternationalTune750 — 5 days ago