01 — Pattern ASnapshot

You take a full picture of the source table at regular intervals — once an hour, once a day — and store each picture. The warehouse becomes a sequence of states over time, and you reason about change by diffing snapshots.

Strengths: Simple. No source-side instrumentation required. Works on any system you can SELECT from. Captures soft deletes naturally.

Weaknesses: Storage grows linearly with snapshot frequency. You lose the granularity of when a change happened — only that something differed between two snapshots. Bandwidth-heavy on large tables.

When it shines: Slowly-changing reference data, configuration tables, customer-managed lookups. Anything where "show me what it looked like last Tuesday" is a useful question.

02 — Pattern BChange Data Capture

The source emits an event for every row-level change — insert, update, delete — and your pipeline applies those events incrementally. The warehouse stays in sync with the source, often within seconds.

Strengths: Low bandwidth (only deltas move). Fine-grained timing — you know exactly when each change happened. Near-real-time freshness possible. Naturally compatible with SCD Type 2.

Weaknesses: Requires source-side support (binlog, write-ahead log, or application-level event emission). Operational complexity — out-of-order events, replays, and schema drift all need handling. One missed event is silent corruption.

When it shines: High-velocity transactional data, where freshness matters and full snapshots are too expensive. Audit-trail-grade history requirements.

CDC is the right answer until you realize you have to operate it. Then full-load looks tempting again.

03 — Pattern CFull-Load

Every run truncates the target and copies the source in its entirety. No deltas, no watermarks, no merge logic.

Strengths: The simplest possible model. Always self-consistent — there is no "drift" because every run is a fresh start. Perfect for small reference tables.

Weaknesses: Doesn't scale with table size. Bandwidth and compute proportional to total table size, not change rate. You lose all history unless you wrap it in snapshot-style captures yourself.

When it shines: Small tables (< few million rows). Reference data with no inherent history. Cases where "always reflects the source as of now" is the only requirement.

04 — DecisionHow to choose

Choose Snapshot when

Source has no CDC support · table is too large for full-load but slow-changing · history of state-at-a-time is genuinely useful · you want simplicity over freshness.

Choose CDC when

Freshness is a product requirement (minutes, not hours) · source supports it natively · you have engineering capacity to operate it · history of every change matters.

Choose Full-Load when

Table is small enough that full reload is cheap · history isn't needed · you value simplicity and self-consistency above all else.

Choose hybrid when

You have a mix — most warehouses do. Use full-load for reference data, snapshot for slowly-changing dimensions, CDC for high-velocity facts. Pick per table, not per project.

→ The trap

The most common mistake is choosing CDC for every table because "real-time is better." It's not better — it's faster. If nobody is reading the data more than once a day, CDC's cost (operational and engineering) buys you nothing the snapshot approach wouldn't.

05 — TakeawaysMatch the pattern to the data

None of these patterns is universally correct. The right answer for a single warehouse is usually all three, applied per table. The wrong answer is picking one and forcing every table into it.

  • Pattern follows the data, not the project. Different tables justify different patterns.
  • CDC is powerful but expensive to operate. Use it where freshness matters; not as a default.
  • Snapshots are underrated. They're simple, they capture soft deletes, and they give you point-in-time views.
  • Full-load is fine for small tables. Don't over-engineer reference data.

The discipline is in matching the pattern to the table — and being willing to revisit the choice when a table grows past its original assumptions.