u/Eventual-Conguar7292

Can anyone create a fact benchmark for LLMs?

My idea is to score an AI on how many obscure facts it knows in a given domain, for example GPU specs.
When I ask small models like the 30B-parameter gemma-4 for retro GPU specs, they hallucinate the numbers.

Non-local large models can remember a lot more facts, but they still sometimes fail on obscure ones.
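
To make the idea concrete, here is a rough sketch of how the scoring could work, not a finished benchmark. Everything in it is an assumption for illustration: the `Fact` class, the `score` function, and the `fake_model` stand-in are hypothetical names, and the "ExampleCard 1000" entries are made-up placeholders, not real GPU specs. A real version would need a verified spec table and a call to an actual model in place of `fake_model`.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Fact:
    question: str
    answer: float            # ground-truth numeric value
    tolerance: float = 0.0   # allowed relative error (0.0 = exact match)


# Placeholder entries for a fictional card; replace with verified specs.
FACTS = [
    Fact("Reply with a number only. ExampleCard 1000 core clock in MHz?", 120.0),
    Fact("Reply with a number only. ExampleCard 1000 memory size in MB?", 32.0),
    Fact("Reply with a number only. ExampleCard 1000 memory bus width in bits?", 128.0),
]


def extract_number(text: str) -> Optional[float]:
    """Return the first number found in the reply, or None if there is none."""
    match = re.search(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def is_correct(fact: Fact, reply: str) -> bool:
    """A reply counts as correct if its first number is within tolerance."""
    value = extract_number(reply)
    if value is None:
        return False
    return abs(value - fact.answer) <= fact.tolerance * abs(fact.answer)


def score(ask: Callable[[str], str], facts: List[Fact]) -> float:
    """Fraction of facts the model answers correctly."""
    hits = sum(is_correct(f, ask(f.question)) for f in facts)
    return hits / len(facts)


def fake_model(question: str) -> str:
    """Stand-in for a real LLM call: one right answer, one wrong, one refusal."""
    canned = {"core clock": "120", "memory size": "64 MB"}
    for key, reply in canned.items():
        if key in question:
            return reply
    return "I don't know"


if __name__ == "__main__":
    print(f"{score(fake_model, FACTS):.2f}")  # prints 0.33
```

Because the checker only looks at the first number in the reply, the prompts ask for a number only; a model that repeats the card name first would be scored on the wrong number. Counting "I don't know" separately from wrong numbers would also be useful, since a refusal is arguably better than a hallucinated spec.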

u/Eventual-Conguar7292 — 2 days ago