u/Old_Variation_5493

Why does Snowflake Cortex Code have sub-par performance?

tl;dr: Each session starts with ~25,000 tokens of system prompt overhead before the model reads your question, and 56% of that is skill descriptions for tools most users will never touch.

I tried out Snowflake's AI tool, Cortex Code CLI, which was created specifically to help with data engineering and Snowflake-related coding tasks. However, compared to a plain Claude Code session, its performance is sub-par.

I asked Cortex Code to write a Snowflake stored procedure that finds and recreates broken views (this is a common issue in our environment when the DDL of upstream objects changes). What I got back was broken SQL. It tried to create a stored procedure that executes ALTER VIEW sub_view COMPILE; which is a valid command on Oracle, but not on Snowflake.

The funny thing is that it has a dedicated /sql-author skill, a /sql-verify subagent designed to catch exactly these kinds of errors, and access to Snowflake's own documentation via cortex search docs. It used none of them before it started working. 

My first instinct was to work within the system. Cortex Code has a context rule mechanism:

cortex ctx rule add "Always check Snowflake documentation using cortex search docs before writing SQL"

It didn’t help. I quickly realized that context rules aren’t loaded by default when starting a new session. They depend on the model deciding to run cortex ctx rule list first, which is not a mandatory step.

So I added an instruction to always run cortex ctx rule list into ~/.claude/CLAUDE.md, the persistent instruction file that gets injected into every session. 

It was ignored. Not always, but often enough to be unreliable. I tweaked the wording and restructured my CLAUDE.md. The reliability improved, but the fundamental problem remained: my instructions were not always applied, so Cortex Code (CoCo) skipped the docs and produced broken SQL.

At one point I confronted Cortex Code directly about its failure. It replied:

>Context is ~800+ lines across multiple <system-reminder> blocks. Impact: Attention dilution; Mandatory action buried in nested file contents.

The model itself was telling me the context was too large. It even recommended: “Reduce context noise. Many system reminders repeat or overlap.” 

Cortex Code's source code is not available, so getting the actual system prompt was a bit tricky, but I succeeded:

The breakdown: skill descriptions 56%, tool schemas 29%, fake system reminder messages 15%.

I only typed 4 characters. The model received ~25,700 tokens of context.

More than half the context is consumed by skill descriptions: verbose paragraphs explaining 68 bundled skills, most of which any given user will never touch.
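
To put the percentages into absolute numbers, here is a rough back-of-the-envelope sketch. It only uses the figures quoted above (~25,700 tokens, the 56/29/15 split, 68 bundled skills); rounding to whole tokens and the even per-skill average are my simplifications, not measured values.

```python
# Figures quoted in the post; everything derived from them is approximate.
TOTAL_TOKENS = 25_700
SHARES_PERCENT = {
    "skill descriptions": 56,
    "tool schemas": 29,
    "fake system reminder messages": 15,
}
BUNDLED_SKILLS = 68


def token_budget(total, shares_percent):
    """Convert percentage shares of the context into approximate token counts."""
    return {name: round(total * pct / 100) for name, pct in shares_percent.items()}


budget = token_budget(TOTAL_TOKENS, SHARES_PERCENT)

# Assumption: skill descriptions are spread evenly, which is only an average.
avg_per_skill = budget["skill descriptions"] / BUNDLED_SKILLS

for name, tokens in budget.items():
    print(f"{name}: ~{tokens:,} tokens")
print(f"average per skill description: ~{avg_per_skill:.0f} tokens")
```

So skill descriptions alone account for roughly 14,400 tokens, or a bit over 200 tokens per bundled skill on average.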

My CLAUDE.md directive to "always check Snowflake docs before writing SQL" was competing with 17 system-reminder blocks, 32 tool definitions, and 60+ skill descriptions. The model's attention to any single instruction drops as the total volume increases. 
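
As a crude illustration of how outnumbered that one directive is, the sketch below just counts items using the numbers above. It treats "60+" as 60 (a lower bound) and counts blocks, not tokens, so it is a rough proxy and not a model of how attention actually works.

```python
# Counts quoted in the post; "60+" skill descriptions is taken as 60 (lower bound).
COMPETING_ITEMS = {
    "system-reminder blocks": 17,
    "tool definitions": 32,
    "skill descriptions": 60,
}


def instruction_share(competing, own=1):
    """Fraction of all context items that are the user's own instruction(s)."""
    total = sum(competing.values()) + own
    return own / total


share = instruction_share(COMPETING_ITEMS)
print(f"my directive is 1 of {sum(COMPETING_ITEMS.values()) + 1} items ({share:.1%})")
```

By item count alone, the directive is under 1% of what the model sees before my question.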

That's not a model quality problem. It's a context design problem. And it directly explains the hallucinated SQL syntax I kept running into.

Happy to discuss the technical findings. Criticism welcome, especially if you've seen different behavior.

u/Old_Variation_5493 — 3 days ago