GA4 BigQuery export stores data in nested and repeated structures (event_params, user_properties, items). This is great inside BigQuery, but most downstream tools (pandas, ML pipelines, many BI tools) expect flat tables. Repeated UNNEST logic in every query is tedious and increases error risk.
This repo shows a practical workflow to profile GA4 export data and build a reproducible flattening query using the public ga4_obfuscated_sample_ecommerce dataset.
Public BigQuery dataset: Google Merchandise Store : Obfuscated GA360 Data
BigQuery ID: bigquery-public-data.ga4_obfuscated_sample_ecommerce
Period: Nov 2020 to Jan 2021.
Scale: 92 daily tables, ~4.3M events.
ga4_bq_profiling_and_flattening.ipynb- end-to-end: connect to BigQuery, profile schema and nested fields, build and validate a flat events table
The sql/ folder contains the same profiling and flattening workflow as standalone queries for running directly in the BigQuery UI.
Run order:
01_profile_event_params.sql02_profile_user_properties.sql03_profile_items.sql04_profile_flat_fields.sql05_flatten.sql06_validate.sql(optional)
- Profile
event_params- key coverage, NULL rates, type inconsistencies - Profile
user_properties- presence and sparsity - Profile
items- presence and sparsity (ecommerce context) - Profile flat event fields - identify fields that are always NULL in this dataset
- Build flatten query - pivot selected
event_paramskeys into columns and keep stable fields - Validate - row count parity, date range, event-type distribution
- 28
event_paramskeys observed, 4 are always NULL, 24 retained in the flat table user_propertiesare extremely sparse (present as key-value pairs, not as stable columns)itemsare excluded from the final flat table to keep an event-level layout simple (can be expanded later)- 12 flat fields are always NULL in this dataset and are excluded
- 1 mixed-type key (
session_engaged) handled viaCOALESCEacross value types
Flat table: 4,295,584 rows x 60 columns, 1:1 with source events (event-level granularity).
- Google Cloud project with BigQuery access
- Python 3
- Packages:
google-cloud-bigquery,pandas - Colab or local environment (service account if needed)
- Open the notebook in Colab
- Set
PROJECT_IDandDEST_DATASET - Run all cells
Medium article (BigQuery UI walkthrough): Profiling and Flattening GA4 Data in BigQuery: A Step-by-Step Walkthrough
The processed and flattened dataset is available on Hugging Face: GA4 BigQuery Flattened Dataset