u/DowntownThing4875

▲ 9 r/aeo+1 crossposts

Six months deep into AEO/GEO for a Series B SaaS and this is everything I've learned, tested, and still don't fully understand.

Been quietly working on answer engine optimization for a Series B agritech SaaS over the last several months. Started as an internal audit, turned into a full implementation. Sharing everything I've mapped so far: what worked, what's still unclear, and what I'm genuinely curious about from people further along than me.

This is long. Worth it if you're actively building in this space.

A - The starting point: why Google rank means nothing for AI visibility

The company I was auditing ranked well on Google across their core category keywords. Decent traffic, solid backlink profile, clean technical SEO. Completely invisible in ChatGPT, Perplexity, and Gemini when buyers searched the same queries.

The gap isn't an anomaly. It's structural. Google rewards topical authority and backlink equity. LLMs reward entity clarity and off-site corroboration. Different trust signals entirely. A brand can rank #1 on Google and still be absent from every AI-generated answer in their category.

B - The on-site extraction architecture

This is where most AEO advice starts and stops. It's necessary but not sufficient on its own.

What actually moves citation probability on-site:

Research consistently shows ~44% of LLM citations pull from the first 30% of a page's content. Front-loading matters more than comprehensive coverage. Lead every section with the answer, follow with the proof. Never the reverse.
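A rough way to sanity-check front-loading on a page is sketched below. It assumes the 30% window is measured in characters of extracted page text, which is an assumption and not something the research figure specifies. The function name `answer_is_front_loaded` and the sample strings are hypothetical.

```python
def answer_is_front_loaded(page_text, answer_phrase, window=0.30):
    """Return True if answer_phrase ends inside the first `window` share of the text.

    Assumption: the window is counted in characters, not tokens or sections.
    """
    cutoff = int(len(page_text) * window)
    start = page_text.lower().find(answer_phrase.lower())
    return start != -1 and start + len(answer_phrase) <= cutoff


# Placeholder copy: the answer first, then the proof.
lead = "X cuts routing setup to three days. "
filler = "Background and proof follow here. " * 10

print(answer_is_front_loaded(lead + filler, "three days"))
print(answer_is_front_loaded(filler + lead, "three days"))
```

Running it over the plain-text export of each key page gives a quick list of sections where the answer is buried below the proof.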

Headers should mirror the exact prompts your ICP types into AI tools, not keyword targets and not clever copy. If your buyer asks ChatGPT "how does X solve Y problem", your H2 should read exactly that way.

FAQ sections bolted onto the bottom of pages don't work. The FAQ logic needs to be embedded in the structure of the body content itself, with each section functioning as an implicit question and answer.

JSON-LD schema for FAQPage, HowTo, and Article types genuinely helps. Not because Google cares, but because it gives the LLM a pre-parsed semantic structure to extract from without ambiguity. Organization schema with consistent NAP (name, address, phone) data across every page builds entity clarity.
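For anyone who hasn't written JSON-LD by hand, here is a minimal sketch of generating FAQPage markup. The schema.org type and property names (`FAQPage`, `Question`, `acceptedAnswer`, `Answer`) are the real vocabulary; the helper names `faq_jsonld` and `as_script_tag` and the question/answer text are placeholders for illustration.

```python
import json


def faq_jsonld(pairs):
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data):
    """Wrap the JSON-LD in the script tag that is embedded in the page HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Placeholder Q&A, not real product copy.
example = faq_jsonld([("How does X solve Y problem?", "X solves Y by ...")])
print(as_script_tag(example))
```

The questions passed in should be the same ones used as section headers, so the markup and the visible content say the same thing.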

llms.txt is worth implementing. It's analogous to robots.txt but aimed specifically at LLM crawlers, and it lets you signal which content is highest priority for extraction. Still early, but adoption is growing fast enough that ignoring it is a mistake.
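A small sketch of what generating the file could look like, following the layout in the llms.txt proposal as I understand it: an H1 with the site name, a blockquote summary, then H2 sections containing markdown link lists. The helper `render_llms_txt`, the company name, and the example.com URLs are all hypothetical.

```python
def render_llms_txt(name, summary, sections):
    """Render llms.txt content from a dict of section title -> (label, url, note) tuples."""
    lines = [f"# {name}", "", f"> {summary}", ""]
    for title, links in sections.items():
        lines.append(f"## {title}")
        lines.append("")
        for label, url, note in links:
            lines.append(f"- [{label}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Placeholder site data for illustration only.
llms_txt = render_llms_txt(
    "Example Co",
    "Example Co makes X, which solves Y for Z teams.",
    {
        "Docs": [
            ("How X solves Y", "https://example.com/how-x-solves-y.md", "core explainer"),
            ("Pricing", "https://example.com/pricing.md", "plans and limits"),
        ],
    },
)
print(llms_txt)
```

The output would be served at the site root as /llms.txt, with the highest-priority pages listed first.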

robots.txt: make sure you're not accidentally blocking AI crawlers. If you've been aggressive with your crawl restrictions, GPTBot, ClaudeBot, and PerplexityBot all need to be explicitly allowed.
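A quick check for accidental blocking, using Python's standard-library `urllib.robotparser`. The three user-agent names are the ones mentioned above; the helper `blocked_ai_crawlers`, the sample robots.txt, and the example.com URL are placeholders.

```python
from urllib.robotparser import RobotFileParser

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot"]


def blocked_ai_crawlers(robots_txt, url="https://example.com/"):
    """Return the AI crawlers that this robots.txt text disallows for `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in AI_CRAWLERS if not parser.can_fetch(bot, url)]


# Sample file: one AI crawler blocked outright, everything else allowed.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(blocked_ai_crawlers(sample))
```

Pasting in the live robots.txt and checking a few important URLs (not just the homepage) catches path-level rules that a quick read of the file can miss.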

C - Off-site entity footprint

This is where most brands have the biggest gap and where the audit for the Series B company was most revealing.

LLMs triangulate brand legitimacy from independent third-party sources. If you only exist on your own domain, the model has nothing to corroborate you with and won't cite you confidently regardless of how well-structured your pages are.

What the citation environment mapping revealed: their competitors were being cited primarily through four source types: G2 and Capterra reviews, Reddit threads in niche subreddits, Crunchbase and Tracxn profiles, and long-form technical content on Medium. The company had minimal presence across all four.

The priority fix order varies by category. For B2B SaaS: review platforms first (G2 and Capterra; brands with active review profiles have meaningfully higher AI citation rates), structured entity data second (Crunchbase, Wikipedia if the notability threshold is met), community presence third, long-form editorial fourth.

For developer tools the order shifts: GitHub presence and Stack Overflow tags move to the top. LLMs answering developer queries pull heavily from these sources.

Review velocity is a real signal. A detailed user review that says "this tool solved our API routing issues in three days" functions as a direct citable answer to a future prompt from someone asking Perplexity for API routing solutions. The review isn't just social proof; it's your citation infrastructure.

D - Citation environment mapping

Before optimising anything, map what's actually getting cited in your category.

Run your target queries through ChatGPT and Perplexity. Don't search for your brand; search for your category. Screenshot every cited source. Categorise them by source type. This gap map tells you exactly what you need to build, and in what order, rather than optimising blindly.

The brands consistently winning citations in most B2B categories share a pattern: they have presence across at least 4-5 independent source types simultaneously. Single-source brands, even those with excellent on-site content, sometimes get bypassed.
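Once the cited URLs are collected, the categorising step can be sketched like this. The domain-to-type table is a hypothetical starting point built from the source types named in this post, and the URLs in the example are placeholders, not real audit data.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from domain to source type; extend it for your category.
SOURCE_TYPES = {
    "g2.com": "review platform",
    "capterra.com": "review platform",
    "reddit.com": "community",
    "crunchbase.com": "entity data",
    "tracxn.com": "entity data",
    "medium.com": "long-form editorial",
}


def source_type(url):
    """Classify a cited URL by its domain, matching subdomains as well."""
    host = (urlparse(url).hostname or "").lower()
    for domain, kind in SOURCE_TYPES.items():
        if host == domain or host.endswith("." + domain):
            return kind
    return "other"


def gap_map(cited_urls):
    """Count citations per source type for one set of category queries."""
    return Counter(source_type(url) for url in cited_urls)


# Placeholder citations copied out of AI answers.
cited = [
    "https://www.g2.com/products/some-tool/reviews",
    "https://www.capterra.com/p/000000/some-tool/",
    "https://old.reddit.com/r/SaaS/comments/abc123/some_thread/",
    "https://example.com/blog/some-post",
]
print(gap_map(cited))
```

Running the same tally for your own brand and for each competitor makes the missing source types obvious at a glance.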

E - What I've been experimenting with: 'CBaaS architecture'

One thing I've been testing is building what I'd call Citation Banks: single content nodes engineered to satisfy multiple related query intents simultaneously rather than one piece per keyword.

The logic: LLMs extract semantic tokens rather than reading pages linearly. A single well-architected piece can serve as the cited source for an entire family of related queries. Every successful retrieval strengthens that node's citation weight for adjacent queries. The compounding is real: a piece that answers 6 related intents accumulates citation weight 6x faster than 6 separate pieces each answering one.

Construction requirements are different from standard SEO content: query clustering before writing, to map the entire family of intents you're targeting; semantic bridges between sections that help the LLM understand the relationship between adjacent questions; and front-loading the most critical answers in the first 30% of the document.
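The query clustering step can be roughed out with nothing more than word overlap. This is a deliberately crude sketch: it groups queries greedily by Jaccard similarity against the first query in each cluster, and the stopword list, the 0.3 threshold, and the sample queries are all assumptions chosen for illustration. Embedding-based clustering would be the obvious upgrade.

```python
STOPWORDS = {"a", "an", "the", "for", "to", "of", "in", "best", "how", "what", "is"}


def tokens(query):
    """Lowercased content words of a query."""
    return {word for word in query.lower().split() if word not in STOPWORDS}


def jaccard(a, b):
    """Share of words two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_queries(queries, threshold=0.3):
    """Greedily assign each query to the first cluster whose seed query it overlaps with."""
    clusters = []
    for query in queries:
        for cluster in clusters:
            if jaccard(tokens(query), tokens(cluster[0])) >= threshold:
                cluster.append(query)
                break
        else:
            clusters.append([query])
    return clusters


# Placeholder queries.
families = cluster_queries([
    "best api routing tool",
    "api routing tool for startups",
    "crop yield forecasting software",
])
print(families)
```

Each resulting cluster is a candidate for one Citation Bank piece, with one section per query in the cluster.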

Still testing this properly but early signals from the Series B implementation are promising.

F - The founder content signal, something I wasn't expecting

One finding that surprised me: founder-authored content was getting cited significantly more than equivalent brand content across the same topics.

The mechanism makes sense in retrospect. LLMs are trained heavily on conversational, human-written content: Reddit threads, technical blogs, build-in-public posts. Authentic founder voice pattern-matches to this training data. Polished brand copy pattern-matches to advertising, which retrieval systems treat with skepticism.

A founder documenting a real product decision, including the wrong assumptions, the pivot, and the failed experiment, creates context-rich content that retrieval systems extract for "how-to" and "best practice" queries. The specificity and honesty are what make it retrievable, not the polish.

I've been calling this Thought Leadership as a Service internally: the systematic documentation of a founder's building journey, structured specifically for LLM extraction rather than human readers. It shifts thought leadership from a PR vanity metric to data infrastructure.

G - Entity velocity, what I still don't fully understand

This is where I have observations but incomplete theory and would genuinely value input from people further along.

Entity velocity, the rate at which new citation signals accumulate across independent sources, seems to matter. A brand that optimised six months ago and went quiet appears to lose citation ground over time even if their content hasn't changed. The retrieval system seems to interpret signal stagnation as entity decay.

What I can't fully quantify: the half-life of a citation signal. How quickly does a G2 review or a Reddit thread decay in citation weight? Is it uniform across source types or does it vary? Does a Wikipedia entry decay more slowly than a Reddit thread because of perceived permanence?

If anyone has data or even directional observations on this I'd genuinely find it useful.
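To make the half-life question concrete, here is the simplest model it could refer to: plain exponential decay, where a signal loses half its weight every `half_life_days`. This is purely a hypothetical framing. Nothing here is measured, and the half-life values in the example are placeholders, not estimates for any real platform.

```python
def decayed_weight(initial_weight, age_days, half_life_days):
    """Weight of a citation signal after age_days under exponential decay (assumed model)."""
    return initial_weight * 0.5 ** (age_days / half_life_days)


# Placeholder half-lives, only to show how source types could be compared.
hypothetical_half_lives = {"reddit thread": 90, "wikipedia entry": 360}
for source, half_life in hypothetical_half_lives.items():
    print(source, round(decayed_weight(1.0, 180, half_life), 3))
```

If someone has citation counts for the same source at several points in time, fitting `half_life_days` per source type would answer whether decay is uniform or not.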

What's still unclear to me

A few things I haven't been able to resolve cleanly:

How much does llms.txt adoption actually move citation probability right now versus six months from now as crawler support matures?

For brands in genuinely niche B2B categories with low Reddit presence, is community seeding the right answer or is there a faster path to off-site entity corroboration?

Does author entity weight on Medium (follower count, publication history, external links to author profile) affect citation probability of articles published there? Or is it purely content and domain level signals?

How do retrieval systems handle contradictory information across sources? If your G2 reviews describe your product differently from your homepage, does that create citation ambiguity or does one source type override the other?

Happy to share more

reddit.com
u/DowntownThing4875 — 8 days ago

We stopped publishing blog posts and started documenting our founder's thinking instead. AI citations tripled.

Sharing this because the results surprised us enough that it feels worth discussing openly.

Six months ago our content strategy looked like most SaaS companies'. Blog posts targeting category keywords, case studies, feature announcements. This gave us some decent Google traffic, but we were largely invisible to ChatGPT/Gemini when buyers searched our category.

We made one structural change: we stopped publishing brand blog posts entirely and started documenting our founder's actual thinking instead, including real product decisions, the reasoning behind pricing changes, honest post-mortems on features that failed, and framework breakdowns from internal strategy sessions.

Same publishing cadence, but completely different authorship and format.

Three months in, our inflow of AI traffic across our target queries nearly tripled.

The mechanism became obvious in retrospect: LLMs are pattern-matching engines trained heavily on conversational, human-written content. Authentic founder documentation (specific, honest, technically dense) pattern-matches to the high-signal human content these models were trained to extract and trust.

Every piece now led with the decision or conclusion, not a narrative buildup. The reasoning followed the thesis and never preceded it.

The compounding effect turned out to be real. Each piece of founder documentation accumulated citation weight over time (I was able to track this precisely using AI citation tools I had built). An honest POV published four months ago is still being cited in AI answers today because the retrieval system keeps finding it relevant to adjacent queries we never explicitly targeted.

Has anyone else seen a similar pattern after shifting from brand content to founder-authored documentation? Curious whether the category matters or whether this holds across SaaS verticals.

u/DowntownThing4875 — 10 days ago

The content strategy shift, writing for retrieval instead of rankings

Something has changed in how content performs and I don't think the strategy conversation has caught up yet.

For years, content marketing has optimised for one outcome: Google rankings. Traffic, backlinks, topical authority, keyword coverage: the entire discipline has been built around one distribution channel.

AI search has introduced a parallel distribution channel with completely different rules. When someone asks GPT/Gemini a question in your category, the answer they get isn't determined by your domain authority or your keyword rankings. It's determined by whether your content exists in the right format, in the right places, with enough third-party corroboration for the retrieval system to trust it as a source.

As an observer in the AI search domain, I see two big shifts.

1 = Structure is changing. Content written for LLM extraction leads with the answer, not the narrative. Every section header mirrors an exact question your audience is typing into AI tools. The core thesis appears in the first sentence of the section, followed by proof. Never the reverse. The first 30% of the document carries disproportionate citation weight; roughly 44% of LLM citations pull from that window according to current research.

2 = Authorship signals are changing. Brand content written by content teams is losing retrieval ground to founder-authored content. LLMs are trained on conversational, human-written datasets. Authentic founder voice (honest about failures, specific about decisions, transparent about the building process) pattern-matches to what these models were trained to trust. This is what's driving the shift toward Thought Leadership as a Service: the systematic documentation of a founder's journey, structured specifically for AI retrieval rather than human readers.

The brands compounding fastest in AI search right now are doing both. Architecturally sound content structured for extraction, authored by voices the retrieval layer recognises as human and trustworthy.

The brands ignoring this are producing content that performs fine on Google and gets completely bypassed when their buyers ask an AI tool for a recommendation.

Curious how many content teams here are actively adjusting strategy for AI retrieval or whether it's still mostly theoretical at this stage.

u/DowntownThing4875 — 10 days ago

Random observation from something I've been tracking lately.

Asked ChatGPT and Perplexity to recommend tools in a few different SaaS categories. The brands that consistently surfaced had one thing in common that wasn't obvious: strong presence on legacy review platforms. G2, Capterra, Tracxn, even some niche directories.

The brands with better products but thinner directory presence were invisible in AI answers despite ranking fine on Google.

Working theory: LLMs use third-party platforms as legitimacy signals. If your brand isn't being talked about on independent platforms, the model is probably treating you as unverified and moving on to suggest competitors.

Has anyone actually seen a correlation between their G2 or Capterra presence and showing up in AI-generated recommendations? Genuinely curious whether this is consistent or just what I'm seeing. Happy to share my findings further.

u/DowntownThing4875 — 16 days ago

Been noticing something counterintuitive while mapping citation environments for a few SaaS brands recently.

The brands showing up consistently in ChatGPT and Perplexity answers aren't necessarily the ones with top-notch backlink profiles or the best on-page SEO. They're the ones with a deep off-site entity presence: G2 reviews, Capterra listings, Crunchbase profiles, niche directory mentions.

The concept that's turning out to be most useful for me is trust triangulation. LLMs won't confidently cite a brand that exists only in its own isolated sphere. They prioritise independent corroboration, i.e. human consensus from platforms they already trust, before they'll surface a recommendation.

What's interesting is that the SEO industry largely wrote these directories off over the last five years in favour of high-DA editorial links. That might have been the right call for Google. For AI search it looks like the opposite.

Anyone else seeing this pattern? Curious whether the brands you're working with have strong directory presence or whether it's more about content structure. Happy to discuss findings further.

u/DowntownThing4875 — 16 days ago