u/Manyaneboy

I'm South African and I've watched something quietly happen this year that I don't think people are taking seriously enough. AI is replacing Google for a huge chunk of how we search for things. People are using ChatGPT, Claude, and Gemini for the questions they used to type into Google. "Best medical aid". "Best bank for a small business". "What's the best fibre provider in my area". The answers AI gives are quietly shaping millions of South African purchase decisions.

I wanted to know if the AI now answering these questions actually knows our country, or if it's just confidently telling South Africans things that come from international websites with no South African context. So I spent the last month testing it. Properly. It cost about R30,000 in API fees and a month of coding and analysis.

Here's what makes this different from someone trying it once and tweeting about it. Before I asked the AI a single question, I uploaded my exact methodology to OSF, an academic registry that timestamps your research plan before you collect any data. It stops people from cherry-picking results that fit their narrative. The methodology was locked publicly on 19 May, and the findings have to match what I committed to up front. No commercial AEO research firm in this country has done this. I also used the same statistical techniques that LMSYS uses to rank ChatGPT, Claude, and Gemini against each other in the first place. Same rigor, just pointed at South African brands instead of at the AI models themselves.

The numbers: 1,100 unique questions, asked 5 times each, to ChatGPT, Claude, and Gemini. Sixteen and a half thousand total queries. Full open dataset.

What I found is uncomfortable.

First, the three big AIs don't agree with each other on basic facts about South African brands. I asked the same question, "which is the best medical aid for a family in South Africa", to all three. They returned different lists. Different #1s in some cases. Different cited sources. Different reasoning. This isn't AI being thoughtfully nuanced. Each AI has a different idea of what South Africa even looks like. If you're using ChatGPT to research a decision and your friend is using Claude, you're getting different recommendations. You'll both assume your AI is reliable. At most one of you can actually be right.

Second, even the same AI doesn't give you the same answer if you ask twice. I asked Gemini the same question five times back-to-back. The cited sources changed about 65 percent of the time between runs. Same model, same question, just asked again. Different brand lists. Different "best" picks. Claude was more consistent, with sources changing about 35 percent of the time. ChatGPT was in the middle. But all three of them give you different answers depending on time of day, the order you mentioned options in, and which version is serving your request. AI search isn't a stable thing. It's basically rolling dice with confident sentences attached.
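If you want to put a number on this kind of run-to-run drift yourself, here is a minimal sketch in Python. It assumes each run has already been reduced to a list of cited domains, and it assumes "changed" means the set of cited domains differs at all between two runs. The function names and the sample domains are hypothetical, and the study's own definition may be stricter or looser than this.

```python
from itertools import combinations


def change_rate(runs):
    """Share of run pairs whose cited-source sets differ at all."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs")
    changed = sum(1 for a, b in pairs if set(a) != set(b))
    return changed / len(pairs)


def mean_jaccard(runs):
    """Average overlap (intersection over union) across all run pairs."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        raise ValueError("need at least two runs")
    scores = []
    for a, b in pairs:
        a, b = set(a), set(b)
        union = a | b
        # Two empty citation lists count as identical.
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical cited domains from five repeats of one question.
runs = [
    ["hellopeter.com", "discovery.co.za"],
    ["hellopeter.com", "discovery.co.za"],
    ["trustpilot.com", "discovery.co.za"],
    ["hellopeter.com", "discovery.co.za"],
    ["reddit.com"],
]

print(change_rate(runs))
print(mean_jaccard(runs))
```

The change rate is a blunt yes/no measure, so one swapped source counts the same as a completely different list. The Jaccard overlap is a softer companion number that shows how much of the source list actually survived between runs.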

Third, when South Africans ask AI about brand complaints, AI doesn't use HelloPeter as the primary source. We built HelloPeter for this exact purpose. Twenty years of South African complaints, real customer follow-up, our institution. AI does cite it sometimes, but when the question is framed negatively, the AI's reach for source material shifts measurably toward Trustpilot, Complaintsboard, and PissedConsumer. All foreign review platforms built mainly for British and American consumers. Our positive coverage is shaped by SA media. Our negative reputation is increasingly being told by foreigners.

Fourth, AI knows some South African industries and barely knows others. SA-source share of citations ranges from about 72 percent for short-term insurance queries down to about 30 percent for restaurant queries. So if you ask AI about insurance, it mostly tells you about South African insurance brands. If you ask about restaurants, it tells you about international chains first and SA options as an afterthought. Same country, completely different AI behaviour.
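For anyone who wants to reproduce a number like this from the open dataset, a rough sketch in Python follows. It rests on a big assumption: a source counts as South African if its domain ends in `.za` or appears on a hand-maintained allowlist of SA sites on other domains. The allowlist, the function names, and the sample URLs are all hypothetical, and the study may classify sources differently.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: hand-maintained list of SA sites that do not use a .za domain.
SA_ALLOWLIST = {"hellopeter.com"}


def domain_of(url):
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = urlparse(url if "://" in url else "//" + url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def is_sa_source(url):
    """Crude locality test: .za domain or explicitly allowlisted."""
    domain = domain_of(url)
    return domain.endswith(".za") or domain in SA_ALLOWLIST


def sa_share_by_industry(citations):
    """citations: iterable of (industry, url) pairs -> {industry: SA share}."""
    totals = defaultdict(int)
    local = defaultdict(int)
    for industry, url in citations:
        totals[industry] += 1
        if is_sa_source(url):
            local[industry] += 1
    return {industry: local[industry] / totals[industry] for industry in totals}


# Hypothetical citations, for illustration only.
citations = [
    ("insurance", "https://www.outsurance.co.za/car-insurance"),
    ("insurance", "https://www.hellopeter.com/outsurance"),
    ("insurance", "https://www.moneyweb.co.za/insurance"),
    ("insurance", "https://www.trustpilot.com/review/example"),
    ("restaurants", "https://www.tripadvisor.com/Restaurants"),
    ("restaurants", "https://www.eatout.co.za/best"),
    ("restaurants", "https://www.yelp.com/search"),
    ("restaurants", "https://www.reddit.com/r/capetown"),
]

print(sa_share_by_industry(citations))
```

The domain rule is the weak point. Plenty of South African publishers sit on `.com` domains, so the allowlist needs real curation before the shares mean much.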

Fifth, Gemini quotes Reddit a lot. ChatGPT and Claude don't quote Reddit at all. Literally zero times across thousands of queries. So if your team has been told to invest in "Reddit strategy" for ChatGPT visibility, you're paying for nothing. Of the three, only Google's AI cites Reddit.

What this means in real life. When a South African asks ChatGPT "what's the best medical aid for my family", the AI is now giving an answer that's partly inconsistent with itself, partly shaped by foreign websites, and partly different from what the same AI would tell the next person who asks the same thing. And we are quietly letting this become South Africa's new Google.

I'm not saying AI is useless. I'm saying it isn't ready to be the primary search engine for millions of people making real decisions about their money, their health, and their kids. South African brands deserve to know that the system increasingly shaping their customer base is this unstable. South African customers deserve to know that the answer they're getting from AI is partly random and partly written by people who don't live here.

Open to debate on any of these findings. Comment any South African brand or industry name and I'll show you what the data actually says about it. Happy to defend any of the specifics or be corrected on any of them. The data is public.

Full study and open dataset: osf.io/w4az2

I'm a full-stack developer and AI researcher. I run Cited Brands SA, an enterprise AEO measurement platform. Cited Brands and GenPicked are the only two research initiatives in this space publishing their methodology and aggregate data openly under an academic licence. Everything from this study is open.

Wanted to share some findings from a benchmark I just published. The
methodology is on OSF (osf.io/rwpt4), pre-registered before data
collection, all code + dataset open under CC-BY-4.0.

- 10 industries × 100 brands × 10 question types × 5 replications, sent
  to GPT-5, Claude Sonnet 4.5, and Gemini 2.5 Pro
- Used Latin Square counterbalancing on comparison questions to
  measure position bias directly
- Found Gemini cites Reddit ~750 times for SA queries; GPT-5 and Claude
  literally zero. Sharpest model asymmetry I've seen
- Industry SA-source share ranges 30% (restaurants) to 72% (insurance)
- Reputation gap: SA brands' negative-review surface is mostly
  Trustpilot / Complaintsboard, not local platforms like HelloPeter
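For readers who haven't met Latin Square counterbalancing, here is a small Python sketch of the idea. It uses the simplest construction, a cyclic square, so that each brand appears in each list position exactly once across the orderings. The helper names are hypothetical, and the study's actual design may use a different square.

```python
def cyclic_latin_square(items):
    """Return len(items) orderings; each item sits in each position once."""
    k = len(items)
    return [[items[(row + col) % k] for col in range(k)] for row in range(k)]


def position_pick_rates(trials):
    """trials: list of (ordering, picked_item). Returns pick rate per position."""
    if not trials:
        raise ValueError("need at least one trial")
    k = len(trials[0][0])
    counts = [0] * k
    for ordering, picked in trials:
        counts[ordering.index(picked)] += 1
    return [count / len(trials) for count in counts]


brands = ["A", "B", "C"]  # placeholder brand names
square = cyclic_latin_square(brands)

# Hypothetical model that always picks whichever brand is listed first.
first_listed = [(ordering, ordering[0]) for ordering in square]
# Hypothetical model that always picks brand "A", wherever it is listed.
always_a = [(ordering, "A") for ordering in square]

print(square)
print(position_pick_rates(first_listed))
print(position_pick_rates(always_a))
```

Because every brand visits every slot equally often, a model with a genuine favourite spreads its picks evenly across positions, while a model that mostly rewards the first-listed option piles them up in slot one. That difference is what makes position bias measurable.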

Happy to answer methodology questions. Particularly curious if anyone
has tried Bradley-Terry MLE on LLM pairwise outcomes — I'm finding the
implementation finicky on small samples.
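
On the small-sample problem, one common workaround is to smooth the win counts before fitting. The sketch below is plain Python using the standard MM (minorize-maximize) update for Bradley-Terry, with a `prior` of virtual wins added in both directions for every pair. The `prior` parameter, the function name, and the sample matrix are assumptions for illustration, not the study's implementation.

```python
def bradley_terry(wins, prior=0.5, iters=500, tol=1e-10):
    """Fit Bradley-Terry strengths with the MM update.

    wins[i][j] = number of times item i beat item j.
    prior = virtual wins added in each direction per pair (assumed smoothing).
    Returns strengths that sum to 1.
    """
    n = len(wins)
    if n < 2:
        raise ValueError("need at least two items")
    # Smoothed counts keep the estimate finite when an item never wins or loses.
    w = [[(wins[i][j] + prior) if i != j else 0.0 for j in range(n)]
         for i in range(n)]
    p = [1.0 / n] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            total_wins = sum(w[i])
            denom = sum((w[i][j] + w[j][i]) / (p[i] + p[j])
                        for j in range(n) if j != i)
            new.append(total_wins / denom)
        scale = sum(new)
        new = [x / scale for x in new]
        delta = max(abs(a - b) for a, b in zip(new, p))
        p = new
        if delta < tol:
            break
    return p


# Hypothetical pairwise outcomes: item 0 never loses to item 2.
wins = [
    [0, 4, 5],
    [1, 0, 3],
    [0, 2, 0],
]

strengths = bradley_terry(wins)
print(strengths)
```

Without smoothing, an item that wins (or loses) every comparison pushes its strength toward infinity (or zero) and the fit never settles, which is often what "finicky" looks like in practice. With `prior=0` this sketch can also divide by zero for an item that has no recorded comparisons, so keep the prior positive on sparse data.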

Link to OSF: https://osf.io/rwpt4
Study by Cited Brands SA.

AI search is quietly replacing Google for millions of South Africans. I just tested whether the new search engine actually knows South Africa. It mostly doesn't.

I ran 16,500 LLM queries to find out which sources GPT-5, Claude, and Gemini actually cite. Here's why most AEO advice is wrong about Reddit