What is a good upgradable base system?

Right now I'm trying to figure out what a good base to start with is. I've got a budget of €5,000 and want to build an enterprise-type base system with one GPU.

The plan is to start with one GPU, see when I hit the limit, and then buy another one.

Any suggestions?

I'm open to buying from China because I pay 0% import tax, and the EU market is going crazy right now when it comes to VRAM prices.

reddit.com
u/No-Solution6262 — 21 days ago

Is AMD really that bad?

I’m thinking of buying 2× W7800.
I used to run stuff on NVIDIA, but buying their cards feels like using money as toilet paper.

u/No-Solution6262 — 22 days ago

Follow-up on: https://www.reddit.com/r/LocalLLM/s/vT4m7UWeMg

This is kind of a follow-up to my last post (got way more replies than expected, thanks for that btw).
I’m trying to build a local AI setup for a small manufacturing company and honestly I’m starting to think I might be focusing on the wrong thing with hardware.

Setup:
Small team (3 people)
We have:
~10,000 technical PDFs (manuals, standards, internal docs)
~60GB product + customer database
CAD-related stuff (STEP files, drawings, technical docs)
need to generate proper offers (so pricing + technical correctness matters)
marketing + product development support
fully local, no cloud, no APIs

I don’t really care that much about speed.
More like:
answers should be correct
consistent across multiple documents
grounded in actual data (not hallucinations)
usable for real offers / internal decisions
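
To make "grounded in actual data" a bit more concrete, here is a minimal sketch of retrieval that refuses to answer when nothing relevant is found. Everything in it is an assumption for illustration: the `Chunk` type, the source naming like `manual_a.pdf#p3`, and the crude word-overlap scoring, which a real setup would likely replace with embeddings or BM25.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Chunk:
    source: str  # hypothetical naming: file name plus page
    text: str


def tokens(s: str) -> set:
    """Lowercased alphanumeric words of a string."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def retrieve(question: str, chunks: List[Chunk], k: int = 3,
             min_overlap: int = 2) -> List[Chunk]:
    """Return up to k chunks sharing at least min_overlap words with the question."""
    q = tokens(question)
    scored = [(len(q & tokens(c.text)), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s >= min_overlap]
    scored.sort(key=lambda sc: sc[0], reverse=True)
    return [c for _, c in scored[:k]]


def build_prompt(question: str, chunks: List[Chunk]) -> Optional[str]:
    """Build a prompt with cited sources, or None if nothing relevant was found."""
    hits = retrieve(question, chunks)
    if not hits:
        return None  # better to say "not found" than to let the model guess
    context = "\n".join(f"[{c.source}] {c.text}" for c in hits)
    return (
        "Answer only from the sources below and cite them.\n"
        f"{context}\n\nQuestion: {question}"
    )


docs = [
    Chunk("manual_a.pdf#p3", "Max operating pressure of valve V200 is 16 bar"),
    Chunk("manual_b.pdf#p1", "Cleaning instructions for housing"),
]
prompt = build_prompt("What is the max pressure of valve V200?", docs)
missing = build_prompt("Warranty terms in Japan", docs)
```

The point of returning `None` is that the "I don't know" path is decided by code before the model is ever called, which tends to be easier to trust than asking the model to admit it has no source.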

After reading the replies in the last post I’m honestly not sure anymore if hardware is even the main issue here.
Feels like maybe:
RAG / retrieval design matters way more
data structure is probably the real pain point (PDFs + CAD stuff is messy)
pricing logic should probably not even be inside the LLM at all
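
As a rough sketch of the "pricing outside the LLM" idea: the model would only propose line items (SKU and quantity), and plain deterministic code would do the lookup and arithmetic. The `PRICE_LIST` contents, the SKU names, and the `price_offer` function are all made up for illustration; in practice the prices would come from the product database.

```python
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical price list; a real one would be loaded from the product database.
PRICE_LIST = {
    "VALVE-200": Decimal("149.90"),
    "SEAL-KIT": Decimal("12.50"),
}


def price_offer(items, discount_pct=Decimal("0")):
    """Price line items deterministically; reject anything not in the price list."""
    lines = []
    subtotal = Decimal("0")
    for item in items:
        sku = item["sku"]
        qty = int(item["qty"])
        if sku not in PRICE_LIST:
            raise KeyError(f"unknown SKU: {sku}")  # the LLM invented a product
        if qty <= 0:
            raise ValueError(f"invalid quantity for {sku}: {qty}")
        line_total = PRICE_LIST[sku] * qty
        lines.append({"sku": sku, "qty": qty, "line_total": line_total})
        subtotal += line_total
    factor = (Decimal("100") - discount_pct) / Decimal("100")
    total = (subtotal * factor).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {"lines": lines, "subtotal": subtotal, "total": total}


# Items as an LLM might extract them from a customer request (assumed format).
offer = price_offer(
    [{"sku": "VALVE-200", "qty": 2}, {"sku": "SEAL-KIT", "qty": 4}],
    discount_pct=Decimal("10"),
)
```

Using `Decimal` instead of floats avoids rounding surprises in money amounts, and an unknown SKU fails loudly instead of ending up in an offer.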

For people who actually built something like this:
At what point does hardware (VRAM, unified memory, multi-GPU etc.) actually become the limiting factor?
Or is it mostly just system design and data pipeline stuff and hardware is kinda secondary?
I’m trying not to overbuy hardware before I even understand what’s actually breaking first.
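
One way to get a first feel for when VRAM becomes the limit is a back-of-envelope estimate: memory for the weights plus memory for the KV cache at a given context length. The sketch below uses the usual approximation (weights = parameters × bits / 8, KV cache = 2 × layers × KV heads × head dim × context × bytes per value) and ignores activations and runtime overhead. The model shape in the example is hypothetical, not a specific real model.

```python
def weight_bytes(params_billion: float, bits_per_weight: int) -> int:
    """Approximate memory for model weights at a given quantization."""
    return int(params_billion * 1e9 * bits_per_weight / 8)


def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   context_tokens: int, bytes_per_value: int = 2) -> int:
    """Approximate KV cache size for one sequence (keys and values, hence the 2)."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value


# Hypothetical 32B-parameter model, 4-bit weights, 16-bit KV cache, 32k context.
weights = weight_bytes(32, 4)
kv = kv_cache_bytes(layers=64, kv_heads=8, head_dim=128, context_tokens=32768)
total_gib = (weights + kv) / 2**30

print(f"weights: {weights / 2**30:.1f} GiB, "
      f"KV cache: {kv / 2**30:.1f} GiB, total: {total_gib:.1f} GiB")
```

If the total for the model and context length actually needed stays well under the card's VRAM, hardware is probably not what breaks first.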

Would appreciate real world experience from people who actually ran local LLM / RAG systems in something more serious than a hobby setup.

u/No-Solution6262 — 25 days ago

DGX Spark (128GB Unified Memory) vs RTX 5090 – what matters more for real business AI: context or speed?

I’m setting up a fully local AI server for a small company (3 users) and I feel like most advice online ignores the real problem.
Use case:
10,000+ technical PDFs
60GB database (products, pricing, customers)
CAD-related documentation (engineering-heavy)
automatic offer generation (must be correct)
product development support
fully local (no API/cloud)

Key requirement:
Correctness > speed
1–5 min responses are fine if quality is higher and more reliable.

My dilemma:
DGX Spark / 128GB unified memory → larger models, more context, slower
RTX 5090 server (32GB VRAM) → faster, but limited context + more RAG splitting
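
To put rough numbers on "slower" vs "faster": single-user token generation is usually limited by memory bandwidth, so an upper bound is bandwidth divided by the bytes read per token (roughly the size of the model in memory). The bandwidth figures below are assumptions taken from public spec sheets (about 273 GB/s for DGX Spark, about 1792 GB/s for RTX 5090) and should be double-checked; real throughput will be lower, and the model has to fit in memory in the first place.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every token reads the whole model once."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gb_s / model_size_gb


# Assumed bandwidths (GB/s); verify against the vendor spec sheets.
SPARK_BW = 273.0
RTX5090_BW = 1792.0

model_gb = 16.0  # e.g. a hypothetical 32B model at 4-bit
spark_tps = max_tokens_per_second(SPARK_BW, model_gb)
rtx_tps = max_tokens_per_second(RTX5090_BW, model_gb)

print(f"DGX Spark: <= {spark_tps:.0f} tok/s, RTX 5090: <= {rtx_tps:.0f} tok/s")
```

The flip side is that a model too large for 32GB VRAM does not run on the 5090 at full speed at all, so the comparison only holds for models that fit on both.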

Honest question:
For real knowledge-heavy business AI:
Is more memory / larger context actually more important than raw GPU speed?
Or is everyone still overvaluing CUDA + fast inference?
Would appreciate real-world experience, not benchmark talk.

u/No-Solution6262 — 26 days ago