Snowflake Credit Optimization at Scale

01 — Step oneDiagnose before you tune

Most cost-cutting efforts start in the wrong place — with the warehouses. Resist that. Start with attribution: which queries, in which warehouses, run by which roles, are burning the most credits? Until you can answer that, every change is a guess.

Snowflake's QUERY_HISTORY and WAREHOUSE_METERING_HISTORY views give you what you need. Join them on warehouse and time window, group by query pattern, and you have a credit-attribution table for the last year.

credit_attribution.sql

SELECT
  q.warehouse_name,
  q.role_name,
  REGEXP_REPLACE(q.query_text, '[0-9]+|''[^'']*''', '?') AS query_pattern,
  COUNT(*)                       AS exec_count,
  SUM(q.execution_time)/1000      AS total_seconds,
  SUM(q.credits_used_cloud_services) AS cloud_credits
FROM snowflake.account_usage.query_history q
WHERE q.start_time > DATEADD(day, -90, CURRENT_DATE())
GROUP BY 1, 2, 3
ORDER BY total_seconds DESC
LIMIT 50;

What you'll find, almost universally, is that 5–10 query patterns account for the majority of compute. That's where you focus.

02 — Step twoWorkload segmentation

The single biggest wins usually come from running different workload types on different warehouses, sized appropriately to each. Mixing ad-hoc analyst queries with overnight ETL on one warehouse means you're paying for the larger size 24/7 — even when only one workload needs it.

ETL_WH (medium)

Scheduled pipelines. Runs in batches. Predictable, contained. Auto-suspend at 60s. Sized for throughput, not interactivity.

BI_WH (small, multi-cluster)

Dashboard queries. High concurrency, low individual cost. Auto-suspend at 60s. Multi-cluster for spike absorption.

ADHOC_WH (x-small)

Analyst exploration. Bursty, unpredictable. Auto-suspend at 30s. Smaller size forces analysts to think about query shape.

LOAD_WH (large, suspend aggressively)

Bulk loads only. Used briefly, then suspended. Sized for time-to-completion, not concurrency. Strict suspend window.

Resize each warehouse based on its workload, not on the worst case across all workloads.

03 — Step threeThe top-10 query list

Take your attribution query above, sort by total compute time, and pick the top ten. For each, ask three questions:

Is it doing a full table scan? Check the query profile. If yes, look for clustering keys, partition pruning, or rewriting filters to use indexed columns.
Is it materializing intermediate sets larger than necessary? CTEs and subqueries that produce 100M rows for a final 1K-row result are silent budget killers.
Is it running more often than it needs to? Some dashboard queries run on every page load when caching every 15 minutes would suffice.

Optimizing the top ten queries usually moves the bill more than upgrading every warehouse to a smaller size.

04 — DetailAuto-suspend, the silent saver

Snowflake bills per-second after the first minute. A warehouse left running with no queries is pure burn. Default AUTO_SUSPEND values (often 600 seconds) are too generous for most workloads.

Fig. 02 · Every red block is money spent on a warehouse that isn't doing anything.

For ETL warehouses, set auto-suspend to 60 seconds. For BI warehouses serving cached dashboards, 60 seconds is fine — the cache rehydration cost is small. For ad-hoc warehouses, 30 seconds. The savings compound: every minute you trim off idle is a minute you don't pay for, every day, forever.

→ Caveat

Aggressive auto-suspend has one cost: cold-start latency. If your dashboards are sensitive to first-query latency, balance auto-suspend with multi-cluster scale-out instead of a single oversized warehouse left running.

05 — TakeawaysCost as a habit

Cost optimization isn't a one-time project. It's a habit you bake into the engineering cadence. Build a credit-attribution dashboard. Review it weekly. When a new pipeline ships, check whether it changed the shape of the top-10 list. When a customer onboards, anticipate the load.

Attribute first, optimize second. You can't fix what you can't see.
Segment workloads onto purpose-sized warehouses. Mixed workloads pay the worst-case rate.
The top-10 list is where the money is. Optimize those before touching anything else.
Auto-suspend is silent savings. Tighter is almost always better.

The goal isn't a one-time spike of savings — it's a credit profile that grows linearly with your business, not exponentially with your tech debt.

01 — Step oneDiagnose before you tune

02 — Step twoWorkload segmentation

ETL_WH (medium)

BI_WH (small, multi-cluster)

ADHOC_WH (x-small)

LOAD_WH (large, suspend aggressively)

03 — Step threeThe top-10 query list

04 — DetailAuto-suspend, the silent saver

05 — TakeawaysCost as a habit

Designing a Multi-Tenant Healthcare Data Warehouse in Snowflake

Building Idempotent ETL Pipelines in Snowflake + Python