LLMDB.org - A place to share benchmarks and configs
I basically vibe-coded this entire app over the last few days because I was frustrated with the current state of local LLM benchmarking.
Whenever I wanted to check how fast a model runs on a specific setup, I couldn't find a single unified database. The few platforms that do exist are either outdated, focus exclusively on Nvidia or enterprise cloud GPUs (like H100s/A100s), or don't accept community submissions. I couldn't find any app that lets people submit benchmarks for arbitrary hardware.
So I built LLMDB (LLM Benchmarks Database).
GitHub: https://github.com/secretdino/llmdb-org
It is completely open source (MIT license) and community-driven. The goal is to make it easy to compare, search, and submit runs across any setup - whether you're running on an Apple Silicon Mac, AMD ROCm, an old Nvidia GTX card, or a wild multi-GPU rig.
A few quick things about how it works:
- Any hardware entries allowed: No rigid dropdowns or pre-approved configurations. You can submit and document performance for any custom setup.
- Console log and settings auto-parsing: To make submissions easy, you can literally just copy and paste raw console output from llama.cpp or vLLM. The backend will parse out the model name, tokens per second, and engine parameters (like KV cache precision). You can also paste in your llama.cpp settings and it will parse those automatically too.
- Fully Open Source: The repository is open-source under the MIT license, so anyone can check out the codebase or contribute.
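To give a rough idea of what the log auto-parsing involves, here is a minimal sketch in Python. It is not LLMDB's actual parser: the function name `parse_timings`, the output keys, and the sample log are all made up for illustration, and the regex assumes llama.cpp-style timing summary lines ending in "tokens per second", which can differ between versions.

```python
import re

# Assumed line shape (modeled on llama.cpp-style timing summaries), e.g.
#   "... eval time = 1000.00 ms / 100 runs ( 10.00 ms per token, 100.00 tokens per second)"
# "prompt eval" is listed first so it is not mistaken for a plain "eval" line.
TIMING_RE = re.compile(
    r"(?P<label>prompt eval|eval) time\s*=.*?"
    r"(?P<tps>\d+(?:\.\d+)?)\s*tokens per second"
)


def parse_timings(log: str) -> dict:
    """Extract prompt-processing and generation speeds (tokens/s) from pasted log text."""
    result = {}
    for line in log.splitlines():
        match = TIMING_RE.search(line)
        if match is None:
            continue  # not a timing line (load time, sampling, banner text, ...)
        key = "prompt_tps" if match.group("label") == "prompt eval" else "generation_tps"
        result[key] = float(match.group("tps"))
    return result


# Hypothetical pasted log, invented for this example.
SAMPLE_LOG = """\
llama_print_timings:        load time =     500.00 ms
llama_print_timings: prompt eval time =      50.00 ms /    10 tokens (    5.00 ms per token,   200.00 tokens per second)
llama_print_timings:        eval time =    1000.00 ms /   100 runs   (   10.00 ms per token,   100.00 tokens per second)
"""

print(parse_timings(SAMPLE_LOG))
# {'prompt_tps': 200.0, 'generation_tps': 100.0}
```

Lines that don't match are simply skipped, so pasting a whole console dump with unrelated output around the timing lines still works in this sketch.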
Some disclaimers:
- Probably full of bugs: I used Gemini 3.5 Flash for most of the development. I didn't even spring for Gemini Pro or Claude Opus.
- It's as secure as asking Claude Opus to make it secure, so probably not very.
- I asked Gemini to find some seed data, so the starting data might be missing a lot of detail. It did provide links to the sources, so the seed data isn't hallucinated.
- Absolutely feel free to trash the code and my development skills (or lack thereof). I know how much effort I put into this (not much) and that I should do better.
- Will this be standard vibe-coded abandonware? Maybe, but that's why I open sourced it.
If you have a few minutes, check it out, play around with the search/filters, or upload some of your own local benchmark logs to help populate the database. I'd love to hear your feedback on it!