r/SQLPerformanceTips

When query tuning starts too late: a dbForge use case for finding bottlenecks earlier

This keeps happening right before prod deploys.

Most of the time, query tuning doesn't start until something goes wrong. Timeouts, alerts, someone staring at execution plans at 2 AM.

The annoying part is the slow query was already there.

It just didn’t hurt yet.

Staging data is usually tiny compared to prod. Dev data is clean, row counts are low, cost estimates look cheap, QA passes.

Then the same query hits real production volume and the plan changes.

Seeks become scans. Nested loops suddenly touch millions of rows. The query that looked harmless is now the problem.

I’ve seen the same thing with waits too. During load testing, IO waits or lock waits start creeping up, but they get ignored because nothing is technically broken yet. Then release happens and everyone acts surprised.

Parameter sniffing is another fun one. SQL Server compiles the plan using the parameter values from the first execution and caches it, and if those values are atypical, the cached plan can be terrible for the normal workload.

Some teams try to catch this earlier by profiling while the app is still in staging. Usually the risky queries are the ones with too many reads, CPU spikes, or big gaps between estimated and actual row counts. We’ve used regular execution plans and profilers for this, including the query profiler in dbForge Studio when checking SQL Server stuff.
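For what it's worth, that triage can be roughed out in a few lines of Python once the per-query numbers are exported from whatever profiler you use. This is only a sketch: the `QueryStats` fields, the sample values, and the thresholds are all made up for illustration and are not a format that dbForge or SQL Server produces.

```python
from dataclasses import dataclass


@dataclass
class QueryStats:
    # Hypothetical shape for exported profiler data (assumption, not a real export format).
    name: str
    logical_reads: int
    cpu_ms: int
    estimated_rows: int
    actual_rows: int


def estimate_gap(q: QueryStats) -> float:
    """Ratio between actual and estimated rows, always >= 1."""
    est = max(q.estimated_rows, 1)  # avoid dividing by zero
    act = max(q.actual_rows, 1)
    return max(est, act) / min(est, act)


def risky_queries(stats, max_reads=100_000, max_cpu_ms=1_000, max_gap=10.0):
    """Return (name, reasons) for every query that trips a threshold."""
    flagged = []
    for q in stats:
        reasons = []
        if q.logical_reads > max_reads:
            reasons.append("reads")
        if q.cpu_ms > max_cpu_ms:
            reasons.append("cpu")
        if estimate_gap(q) > max_gap:
            reasons.append("estimate gap")
        if reasons:
            flagged.append((q.name, reasons))
    return flagged


# Invented sample numbers, just to exercise the function.
sample = [
    QueryStats("get_order", 120, 4, 1, 1),
    QueryStats("search_customers", 450_000, 2_300, 50, 48),
    QueryStats("monthly_report", 8_000, 300, 100, 250_000),
]

for name, reasons in risky_queries(sample):
    print(name, reasons)
```

The thresholds are the part worth arguing about. A query like `monthly_report` above looks cheap on reads and CPU in staging, and only the estimate gap hints that it will behave differently at prod volume.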

How do people here handle this? Do you look at query performance during staging, or mostly once something breaks in prod?

reddit.com
u/db_Forge — 2 days ago

SSMS vs DBeaver vs DataGrip?

I still use SSMS for SQL Server because, well, it’s SSMS. It’s not pretty, but it’s reliable and does what I need.

For multi-database work, I usually see people split between DBeaver and DataGrip. DBeaver is open-source and covers a lot for free, which is hard to argue with. DataGrip feels stronger when you write a lot of SQL and want the editor to stay out of your way.

So my current setup is basically SSMS + one “universal” database tool, depending on the project. Has anyone here actually moved fully to one tool and stayed loyal, or does everyone keep a slightly embarrassing toolbox too?

u/MatynDev — 4 days ago

Dev vs staging vs prod: where do database changes usually break?

Everyone knows the clean version. Dev is where you build. Staging is where you prepare the release and test what should have already been checked earlier. Prod is where real users and real data live. Nice theory. Databases love ruining it.

In dev, database changes usually look harmless. The table has 300 rows, the test data is clean, and the query runs fast enough that nobody thinks about indexes, locks, bad estimates, or weird NULLs. A migration script passes. A constraint works. A new column looks fine.

Then staging catches some things, but not always the right things. The schema might be close to prod, but the data usually is not. Row counts are smaller, old edge cases are missing, permissions are slightly different, jobs are disabled, and nobody has the same traffic patterns. So the deployment looks safe until prod gets involved.

That is where the fun starts. A query plan changes because the optimizer finally sees real data volume. A missing index becomes obvious. A migration fails because production has dirty values that dev never had. A NOT NULL constraint looks fine in staging, then hits ten years of “temporary” data in prod. A stored procedure depends on a column nobody remembered. One environment has a trigger, another does not.

The scary part is that none of this means the code was obviously broken. It usually means the database workflow had blind spots. 

For me, the weak spots are usually schema drift between environments, dirty production data, missing validation after migration, and assumptions that were true only in dev.
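To make the schema drift part concrete, here is a minimal sketch of a column-level compare. It uses Python's built-in `sqlite3` with two in-memory databases standing in for staging and prod, purely so it runs anywhere. The `orders` table and its columns are invented, and on SQL Server you would read the same information from the catalog views or use a schema compare tool instead.

```python
import sqlite3


def table_columns(conn, table):
    """Map column name -> (declared type, NOT NULL flag) for one table."""
    # The table name is interpolated directly, so only pass trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Row layout: (cid, name, type, notnull, dflt_value, pk)
    return {r[1]: (r[2], bool(r[3])) for r in rows}


def schema_drift(conn_a, conn_b, table):
    """Report columns that exist on one side only or differ in definition."""
    a, b = table_columns(conn_a, table), table_columns(conn_b, table)
    return {
        "only_in_a": sorted(a.keys() - b.keys()),
        "only_in_b": sorted(b.keys() - a.keys()),
        "changed": sorted(c for c in a.keys() & b.keys() if a[c] != b[c]),
    }


# Stand-ins for two environments (hypothetical schema).
staging = sqlite3.connect(":memory:")
prod = sqlite3.connect(":memory:")
staging.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT NOT NULL, note TEXT)"
)
prod.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, legacy_code TEXT)"
)

drift = schema_drift(staging, prod, "orders")
print(drift)
```

This only looks at columns. Triggers, indexes, and permissions drift too, and those are exactly the things that tend to differ quietly between environments.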

Before deployment, I’d rather know what changed, what data already exists, and what might behave differently under real row counts. Schema compare, data checks, migration dry runs, and reviewing execution plans are boring, but they beat finding out from users.
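A data check can be as simple as counting the rows that would block the change before running it. The sketch below does that for a planned NOT NULL constraint, again using Python's `sqlite3` and an in-memory table so it is runnable as-is. The `customers` table and `email` column are hypothetical, and the same `COUNT(*) ... IS NULL` query works on SQL Server against a prod copy.

```python
import sqlite3


def null_count(conn, table, column):
    """Count rows that would violate a NOT NULL constraint on `column`."""
    # Identifiers are interpolated directly, so only pass trusted names.
    sql = f"SELECT COUNT(*) FROM {table} WHERE {column} IS NULL"
    return conn.execute(sql).fetchone()[0]


# Hypothetical table with some "temporary" NULLs left in it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany(
    "INSERT INTO customers (email) VALUES (?)",
    [("a@example.com",), (None,), ("b@example.com",), (None,)],
)

blocking_rows = null_count(conn, "customers", "email")
safe_to_add_not_null = blocking_rows == 0
print(blocking_rows, safe_to_add_not_null)
```

If the count is not zero, the migration needs a backfill or cleanup step first, and it is much nicer to learn that from a dry run than from a failed deploy.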

u/individjournalist — 11 days ago

What’s one performance metric you wish more teams watched earlier?

Been thinking about this after a rough on-call week.

Everyone watches CPU and memory. What actually tipped us off that something was wrong was queue depth and lock waits. Both had been creeping up for days before anything obvious showed up in the dashboards.
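One cheap way to notice that kind of slow creep is to look at the trend of a metric over several days, not just its current value. Here is a minimal sketch using a least-squares slope over daily samples. The numbers and the threshold are invented, and a real setup would pull the values from your monitoring system.

```python
def slope(values):
    """Least-squares slope of values against their index (units per sample)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def is_creeping(values, min_slope):
    """True when the metric rises by at least `min_slope` per sample on average."""
    return slope(values) >= min_slope


# Hypothetical daily averages of lock wait time in milliseconds.
lock_wait_ms_per_day = [120, 135, 150, 170, 160, 210, 240]
flat = [120, 118, 122, 119, 121, 120, 120]

print(is_creeping(lock_wait_ms_per_day, min_slope=10))
print(is_creeping(flat, min_slope=10))
```

A fixed alert threshold would stay quiet on the first series until the last day or two, while the slope is already clearly positive by midweek.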

What signals have saved you from a bad production incident? Replication lag, lock waits, queue depth… what else do teams watch that doesn’t get enough attention?

u/Quanord — 10 days ago