
Data Modelling in Lakehouse

Data Saint Consulting Inc
Oct 4, 2024



In the Medallion Architecture, data modeling occurs across all three layers – Bronze, Silver, and Gold – but with different purposes and levels of refinement at each stage. Here’s how data modeling typically applies to each layer:

Bronze Layer

The Bronze layer is primarily for raw data ingestion, so minimal modeling occurs here:

  1. Create tables or directories to store raw data in its original format.
  2. Add metadata columns such as the ingestion timestamp, source system, and source file name.
  3. Enable schema evolution to handle changes in source data structure[1].
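The Bronze steps above can be sketched in plain Python. This is a minimal illustration, not a production pipeline: the function names `to_bronze` and `evolve_schema` and the metadata column names (`_ingested_at`, `_source_system`, `_source_file`) are assumptions made for this example. On a real lakehouse platform, the table format would typically handle schema evolution itself.

```python
from datetime import datetime, timezone


def to_bronze(raw_rows, source_system, file_name):
    """Attach ingestion metadata to raw rows without altering the payload."""
    ingested_at = datetime.now(timezone.utc).isoformat()
    return [
        {
            **row,  # raw values are kept exactly as received
            "_ingested_at": ingested_at,      # hypothetical column name
            "_source_system": source_system,  # hypothetical column name
            "_source_file": file_name,        # hypothetical column name
        }
        for row in raw_rows
    ]


def evolve_schema(known_columns, rows):
    """Append columns that appear in new rows, keeping the existing order."""
    evolved = list(known_columns)
    for row in rows:
        for column in row:
            if column not in evolved:
                evolved.append(column)
    return evolved


# Usage: a later file introduces an "email" column the table has not seen yet.
batch = to_bronze([{"id": "1", "email": "a@example.com"}], "crm", "customers_002.csv")
columns = evolve_schema(["id", "_ingested_at", "_source_system", "_source_file"], batch)
```

Keeping the payload untouched and adding only metadata is what lets later layers trace each record back to its source file.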

Silver Layer

The Silver layer is where significant data modeling and transformation take place:

  1. Design normalized or semi-normalized data models, often adhering to 3rd normal form principles[5].
  2. Create tables for key business entities and relationships.
  3. Apply data quality rules, deduplication, and basic transformations[1].
  4. Implement source system-aligned data models to maintain traceability[1].
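As a rough sketch of the Silver steps above, the following plain-Python function applies one data quality rule, a basic transformation, and deduplication to Bronze rows for a single business entity. The `customer` entity, its columns, the validity rule, and the `to_silver` name are all assumptions chosen for illustration. A real implementation would normally run on the lakehouse's own engine.

```python
from datetime import datetime


def to_silver(bronze_rows):
    """Validate, standardize, and deduplicate Bronze customer rows."""
    latest = {}
    for row in bronze_rows:
        customer_id = (row.get("customer_id") or "").strip()
        email = (row.get("email") or "").strip().lower()  # basic transformation

        # Data quality rule (assumed): require an id and a plausible email.
        if not customer_id or "@" not in email:
            continue

        ingested_at = datetime.fromisoformat(row["_ingested_at"])
        current = latest.get(customer_id)

        # Deduplication: keep only the most recently ingested version per key.
        if current is None or ingested_at > current["_ingested_at"]:
            latest[customer_id] = {
                "customer_id": customer_id,
                "email": email,
                "_source_system": row["_source_system"],  # kept for traceability
                "_ingested_at": ingested_at,
            }
    return sorted(latest.values(), key=lambda r: r["customer_id"])


# Usage: two versions of customer 1 and one invalid row.
bronze = [
    {"customer_id": "1", "email": "Old@Example.com",
     "_source_system": "crm", "_ingested_at": "2024-10-01T08:00:00"},
    {"customer_id": "1", "email": "new@example.com",
     "_source_system": "crm", "_ingested_at": "2024-10-02T08:00:00"},
    {"customer_id": "2", "email": "not-an-email",
     "_source_system": "crm", "_ingested_at": "2024-10-02T08:00:00"},
]
silver = to_silver(bronze)
```

Carrying `_source_system` through to the Silver row is one simple way to keep the source-aligned traceability the list mentions.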

Gold Layer

The Gold layer focuses on optimizing data for specific use cases and consumption:


