Back to Learnings
dbt
6 min
April 10, 2025

dbt Best Practices for Enterprise Data Teams

How to structure models, manage dependencies, and build maintainable transformations at scale.

Why Most dbt Projects Become Unmaintainable

After reviewing dozens of enterprise dbt projects, I see the same patterns causing problems:

  • A models/ folder with 200 unsorted SQL files
  • No clear naming conventions
  • Everything built in a single layer
  • Tests on less than 20% of models

Here's how to avoid this.

Project Structure: The Medallion Architecture

models/
├── staging/        # Raw source cleaning — one model per source table
│   ├── salesforce/
│   │   ├── stg_salesforce__accounts.sql
│   │   └── stg_salesforce__opportunities.sql
│   └── erp/
│       └── stg_erp__orders.sql
├── intermediate/   # Business logic, joining staging models
│   └── int_customer_orders.sql
└── marts/          # Final models consumed by BI tools
    ├── finance/
    │   └── fct_revenue.sql
    └── sales/
        └── dim_customers.sql

Rule: Staging models touch only one source. Marts models are the only thing your BI tool should query.

Naming Conventions

Consistent naming is not optional. Use these prefixes:

| Prefix | Layer | Example | |---|---|---| | stg_ | Staging | stg_salesforce__accounts | | int_ | Intermediate | int_customer_lifetime_value | | fct_ | Fact table | fct_orders | | dim_ | Dimension table | dim_customers |

The double underscore __ separates source from entity at the staging layer.

Testing Strategy

Every model needs at minimum:

models:
  - name: fct_orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
      - name: customer_id
        tests:
          - not_null
          - relationships:
              to: ref('dim_customers')
              field: customer_id

Target: 100% of fact tables and dimension tables have tests. Staging models should have unique + not_null on primary keys at minimum.

Documentation as Code

Write documentation in your schema.yml, not in a Confluence page nobody reads:

models:
  - name: fct_orders
    description: >
      One row per order. Includes all completed and cancelled orders.
      Does not include returns — use fct_returns for that.

The One Rule That Changes Everything

If you follow only one practice from this article, make it this: Never let staging models be referenced directly by BI tools. Always go through a mart. This gives you the freedom to change your underlying data model without breaking dashboards.

dbtAnalytics EngineeringSQLData Modeling