dbt - Documentation

BigQuery specific configurations

Using table partitioning and clustering

Partition clause

BigQuery supports the use of a partition by clause to easily partition a table by a column or expression. This option can help decrease latency and cost when querying large tables. Note that the expression provided as the partition_by field must be of type Date, and that partition pruning only works when comparing using scalar/literal values (so selecting partitions using a subquery won't improve performance).

{{ config(materialized='table', partition_by='DATE(created_at)') }}

select
  user_id,
  event_name,
  created_at
  
from {{ ref('events') }}

Because BigQuery only prunes partitions when filters are provided literally, it may be useful to use a statement block to generate filter clauses:


{% call statement('get_date', fetch_result=True) %}

  SELECT
    min(created_at) as min_date
    
  FROM {{ ref('bigquery_table') }}

{% endcall %}

{% set min_date = load_result('get_date').table.columns['min_date'].values()[0] %}

select
  user_id,
  event_name,
  created_at

from {{ ref('bigquery_table') }}
where created_at >= DATE('{{ min_date }}')

Clustering Clause

BigQuery tables can be clustered to colocate related data. At present, BigQuery is only able to cluster partitioned tables, so be sure to use the cluster_by config in conjunction with the partition_by config.

Clustering on a single column:

{{
  config(
    materialized = "table",
    partition_by = "date(created_at)",
    cluster_by = "order_id",
  )
}}

select * from ...

Clustering on a multiple columns:

{{
  config(
    materialized = "table",
    partition_by = "date(created_at)",
    cluster_by = ["customer_id", "order_id"],
  )
}}

select * from ...

Persisting model descriptions

New in dbt v0.14.0

This functionality is new in dbt v0.14.0. For upgrading instructions, check out the docs

The persist_docs config can be used to persist the dbt description supplied for a model to the resulting BigQuery table or view. The persist_docs config is currently only supported on BigQuery for views and tables.

The persist_docs config can be specified in the dbt_project.yml file, or in a specific model.


models:
  # enable docs persistence for all models
  persist_docs:
    relation: true

or:

{{
  config(persist_docs={"relation": true})
}}

select ...

When the persist_docs option is configured appropriately, you'll be able to see your model descriptions in the BigQuery UI:

This description is generated by dbt

This description is generated by dbt

Managing KMS Encryption

Customer managed encryption keys can be configured for BigQuery tables using the kms_key_name model configuration.

Using KMS Encryption

To specify the KMS key name for a model (or a group of models), use the kms_key_name model configuration. The following example sets the kms_key_name for all of the models in the encrypted/ directory of your dbt project.


name: my_project
version: 1.0.0

...

models:
  my_project:
    encrypted:
      kms_key_name: 'projects/PROJECT_ID/locations/global/keyRings/test/cryptoKeys/quickstart'

Labels and Tags

Specifying labels

dbt supports the specification of BigQuery labels for the tables and views that it creates. These labels can be specified using the labels model config.

The labels config can be provided in a model config, or in the dbt_project.yml file, as shown below.

Configuring labels in a model file

{{
  config(
    materialized = "table",
    labels = {'contains_pii': 'yes', 'contains_pie': 'no'}
  )
}}

select * from {{ ref('another_model') }}

Configuring labels in dbt_project.yml


models:
  my_project:
    snowplow:
      labels:
        domain: clickstream
    finance:
      labels:
        domain: finance
Viewing labels in the BigQuery console

Viewing labels in the BigQuery console

Specifying tags

BigQuery table and view tags can be created by supplying an empty string for the label value.

{{
  config(
    materialized = "table",
    labels = {'contains_pii': ''}
  )
}}

select * from {{ ref('another_model') }}

Updated 2 months ago

BigQuery specific configurations


Suggested Edits are limited on API Reference Pages

You can only suggest edits to Markdown body content, but not to the API spec.