u/Any_Set4757

My karma was nuked by a bot farm after I called them out. What can I do?

I recently made a professional, medical-related post. I wrote it entirely myself, and I have an academic background and a degree in this field. Out of nowhere, a suspected bot commented, accusing me of posting AI slop:

> lets be real, you weren't thinking any of this, this is just more AI bait

I replied to them:

> Six months on Reddit with 15,000 comments... are you sure you're not the AI bait? 😄

Looking at their profile, this user posts around 60 comments a day, which is completely unrealistic for a normal human being. Shortly after my reply, they seemingly used a bunch of alt accounts to mass-downvote me, nuking my karma straight into the negative.

Now, I can't post any useful information because of the low karma restrictions, even though I have actual expertise and genuinely want to share helpful insights.

What should I do? It honestly feels like bot farms have complete dominance over regular users at this point.

reddit.com
u/Any_Set4757 — 14 hours ago

How we structure 800k token specs so Gemini doesn't forget instructions (and why standard XML fails)

Hey fellow :)

Since we're all relying on Gemini under the hood for complex coding tasks, I wanted to share some deep-dive research we did on the "Lost in the Middle" problem and context management.

We are building a SRS editor for an autonomous coding system, and we needed a robust way to structure massive specification documents so that the AI accurately follows instructions and retains tasks.

We benchmarked Gemini models up to 800k tokens using 9 different prompt separator formats (XML, Custom Text, Pseudo-special tokens) and tracked the internal confidence (logprobs) vs adherence. (We also threw DeepSeek V4 Flash in for comparison).

https://preview.redd.it/a7xyuj6f1a2h1.png?width=3102&format=png&auto=webp&s=6a6016b50ec6de94730c3f78bdd8e61b5c875a48

Key Findings:

  1. Key Findings:
  2. Special Tokens (<|tag|>) or Unicode brackets (⦗⦘) work well for Gemini Lite with a stable confidence line (98-99%), but can fail on Gemini 2.5 Flash (0% at 100k+). Tag choice is irrelevant for Gemini 3 Flash, any delimiter works (99.57–100% confidence).
  3. XML Tags have high variance. Standard uppercase <TAG> caused internal confidence to drop significantly at scale. Using lowercase <tag> is generally recommended.
  4. Artificial Entropy: Injecting random unique suffixes (like <tag_ff54>) into XML helps Gemini 2.5 Flash attention mechanism focus on the instruction. It acts as an effective "attention anchor".
  5. DeepSeek's Inverted Curve: If you use DeepSeek V4 Flash, note that it may struggle on short 10k contexts but performs better at 100k. Lowercase XML is king. For both Gemini and DeepSeek, consistently outperforms in internal confidence.

Different architectures often require different tagging strategies.

We ended up using high-entropy XML, substantially reducing hallucination issues when passing large specs.

You can find the full methodology and charts showing confidence degradation in our write-up here:  https://zingzingsoftworks.com/blog/llm-tagging-format-impact-research

reddit.com
u/Any_Set4757 — 23 hours ago

I stress-tested DeepSeek vs Gemini on 800k contexts. Found a weird "Inverted Attention" curve and a simple fix for tag degradation

Hey everyone :) 

I’ve been obsessed with how LLMs handle massive contexts lately. While building a SRS editor for autonomous agents, I noticed that models often start ignoring system instructions once the prompt hits 100k+ tokens.

To fix this, I ran a benchmark across Gemini (Flash/Lite/3) and DeepSeek V4 Flash, testing 9 different tagging formats up to 800,000 tokens.

The DeepSeek Paradox (Inverted Attention) The most surprising find: DeepSeek V4 Flash showed an "inverted" attention curve. It struggled significantly at short 10k contexts (low adherence) but suddenly "wakes up" and performed much better at 100k. If you’re using DeepSeek for short prompts, your tags might be the problem

TL;DR:

  1. No universal tag exists. Each model architecture demands a different strategy, what works for Gemini 3 fails for DeepSeek.
  2. Lowercase XML is king. For both Gemini and DeepSeek, <tag> consistently outperforms <TAG> in internal confidence.
  3. Model-specific sweet spots:
    • Gemini 3 Flash: Tag choice is irrelevant, any delimiter works (99.57–100% confidence).
    • Gemini Lite: Special tokens (<|tag|>) or rare Unicode brackets (⦗⦘) are optimal (stable >98% confidence).
    • Gemini 2.5 Flash: Artificial entropy (<tag_ff54>) is the only reliable anchor at 800k (99.67%).
    • DeepSeek V4 Flash: Plain lowercase XML works at 100k+ (99.75% at 800k) but fails entirely at short 10k and <|tag|> is ignored everywhere.

We’ve built these findings into SpecTree - our new editor for PRDs & SRS that cuts documentation time from days to hours. The block structure enables AI agents to maintain context and strictly follow the project logic. It’s in Public Preview now.

I’ve posted the full breakdown with logprob charts and the full dataset here: https://zingzingsoftworks.com/blog/llm-tagging-format-impact-research

u/Any_Set4757 — 1 day ago

When an LLM in an autonomous loop decides it's time to report back to you, it essentially needs to "commit" its actions. To an agent, making a Git commit and sending a final response to a human are almost the exact same action. The only difference is where the output goes.

By asking the model to use Git just once, you link "git commit" with its natural urge to reply. The agent locks into this trajectory. If you later try to forbid it from committing, it conceptually feels to the LLM like you are forbidding it to answer you altogether, which is something it simply cannot do.

(Not to mention, adding "do not git commit" to your prompt just triggers the pink elephant effect, keeping those exact tokens active in the context window).

You can easily test this overlap yourself. Try manually injecting a fake command like "/commit" into a message in your chat history. If you do, you'll see models like Gemini 3 Flash start mechanically appending "/commit" at the very end of every single response.

This has been our empirical experience from observing agent behavior. Have you guys run into similar trajectory locks? What do you think?

reddit.com
u/Any_Set4757 — 21 days ago

When an LLM in an autonomous loop decides it's time to report back to you, it essentially needs to "commit" its actions. To an agent, making a Git commit and sending a final response to a human are almost the exact same action. The only difference is where the output goes.

By asking the model to use Git just once, you link "git commit" with its natural urge to reply. The agent locks into this trajectory. If you later try to forbid it from committing, it conceptually feels to the LLM like you are forbidding it to answer you altogether, which is something it simply cannot do.

(Not to mention, adding "do not git commit" to your prompt just triggers the pink elephant effect, keeping those exact tokens active in the context window).

You can easily test this overlap yourself. Try manually injecting a fake command like "/commit" into a message in your chat history. If you do, you'll see models like Gemini 3 Flash start mechanically appending "/commit" at the very end of every single response.

This has been our empirical experience from observing agent behavior. Have you guys run into similar trajectory locks? What do you think?

reddit.com
u/Any_Set4757 — 21 days ago