01 — ProblemOne warehouse, many customers
When you serve a B2B SaaS product into healthcare, your warehouse is rarely just your warehouse. It's the foundation under every customer-facing dashboard, every renewal conversation, every operational alert. And it has to be all of those things — for many customers — without ever leaking data across boundaries.
The warehouse I inherited had grown organically. Each customer's analytics needs were met with bespoke views and one-off SQL. Cross-tenant benchmarking was impossible without manually joining customer-specific tables. Onboarding new customers required engineering work, not configuration. And the lurking risk — quietly visible to anyone who knew where to look — was that nothing structurally prevented one tenant's row from being returned to another.
In multi-tenant analytics, the worst-case bug is silent. A query returns extra rows. A dashboard shows a number that's wrong by 3%. Nobody notices until someone does — and at that point, trust is already gone.
02 — ArchitectureThe layered architecture
The redesign was anchored on three layers, each with explicit contracts:
- Staging — exact mirrors of source systems, no transformations, no joins. The job here is fidelity.
- Intermediate — business logic applied. Conformed dimensions built. SCD Type 2 history tracked. Tenant ID propagated. The job here is correctness.
- Marts — semantic exposure, denormalized for dashboard performance. Tenant-isolated views built on top of conformed facts. The job here is speed.
The temptation in any growing warehouse is to skip the middle layer — to let dashboards reach into staging when the business logic is "obvious." Resist this. Every shortcut that bypasses the intermediate layer is a place where business rules drift between consumers.
03 — TenancyThree patterns for tenancy
You have roughly three options for serving multiple tenants from one warehouse:
① Database per tenant
Hardest to leak across, but operationally expensive. Schema migrations multiply. Cross-tenant analytics require external joins. Reasonable when tenants need radically different schemas.
② Schema per tenant
Middle ground. Easier to manage than DB-per-tenant. Cross-tenant queries possible but ugly. Schema drift between tenants becomes a real risk.
③ Tenant column + row policies
One physical table for each fact, with tenant_id propagated and a row-access policy enforcing isolation. Cleanest for benchmarking, easiest to maintain. This is what I chose.
Hybrid
Shared dimensions in row-policy mode; tenant-specific marts in schema-per-tenant mode for customer-facing exposure. Pragmatic for most cases. Used selectively here for customer-export marts.
The decision hinges on what you optimize for. Going row-policy-first lets the warehouse benchmark, aggregate, and govern in one place — at the cost of getting the row-access policy exactly right, every time. There is no margin for error there.
04 — ConformedConformed dimensions across tenants
Some entities are tenant-private (a customer's own provider list). Others are shared and conformed across tenants (calendar, geography, ICD-10 codes, payer lookups). Mixing these without discipline is how drift creeps in.
The rule I settled on: shared dimensions are owned centrally and live in a single schema. Tenant-specific overrides go into a parallel tenant_overrides table that joins back via tenant ID. Reports always join through a view that applies overrides where present and falls back to the shared dimension otherwise.
-- Conformed dimension with tenant overrides CREATE OR REPLACE VIEW dim_provider AS SELECT COALESCE(o.provider_key, b.provider_key) AS provider_key, COALESCE(o.display_name, b.display_name) AS display_name, COALESCE(o.specialty, b.specialty) AS specialty, b.tenant_id, b.effective_from, b.effective_to, b.is_current FROM dim_provider_base b LEFT JOIN dim_provider_tenant_overrides o ON b.provider_key = o.provider_key AND b.tenant_id = o.tenant_id;
This pattern keeps the shared schema canonical while letting individual tenants rename, recategorize, or hide entries without forking the dimension.
05 — GovernanceIsolation enforced at the platform
Code is fallible. Reviews miss things. The only durable answer is to enforce isolation at a layer below the SQL — at the warehouse itself.
In Snowflake, this means:
- Row-access policies on every fact table, keyed off the calling role's tenant claim
- Tag-based masking on PHI columns so dev and staging environments never see real data
- Role hierarchy built top-down: tenant roles inherit from a base read-role with the row policy applied; admin roles bypass it explicitly and audibly
- Warehouse grants separated from schema grants so compute access can be revoked without restructuring data permissions
Write tests that try to violate the policy. Pick a query, run it as Tenant A, capture the count. Run it as Tenant B, capture the count. Run it as the platform role with no tenant — capture the union. The first two must never overlap. The third must equal their sum.
06 — TakeawaysWhat this bought us
This architecture became the foundation for 60+ multi-tenant customer dashboards and dozens of internal analytical use cases. Onboarding a new customer is now a config change, not a code change. Cross-tenant benchmarking — once a manual spreadsheet exercise — runs as a single query. Drift between consumers stopped, because there is one place where business logic lives.
If I had to compress what I learned into a few lines:
- Pick one tenancy pattern and apply it everywhere. Hybrid is fine, but only by exception, and only with documentation.
- Make tenant_id structural, not a convention. It belongs in primary keys, indexes, and access policies — not just SELECT lists.
- Conformed dimensions are organizational design, not just data design. Decide who owns them before you build them.
- Test isolation like a security boundary, because that's what it is.
The dashboards on top of this warehouse are the visible product. The architecture under them is the reason the dashboards are trusted. That's a trade-off I'll always take.