Data Modelling in Lakehouse

Data Saint Consulting Inc

In the Medallion Architecture, data modeling occurs across all three layers – Bronze, Silver, and Gold – but with different purposes and levels of refinement at each stage. Here’s how data modeling typically applies to each layer:

Bronze Layer

The Bronze layer is primarily for raw data ingestion, so minimal modeling occurs here:

  1. Create tables or directories to store raw data in its original format.
  2. Add metadata columns like ingestion timestamp, source system, and file names.
  3. Enable schema evolution to handle changes in source data structure[1].
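As a rough illustration of the steps above, the sketch below wraps raw records with metadata columns while leaving the payload untouched. It uses plain Python rather than any particular lakehouse engine, and the function name `to_bronze_rows`, the underscore-prefixed column names, and the sample file name are all assumptions made for this example. Keeping the payload as raw text is only one simple way to tolerate source schema changes; table formats typically offer their own schema evolution settings.

```python
import json
from datetime import datetime, timezone


def to_bronze_rows(raw_lines, source_system, file_name, ingested_at=None):
    """Wrap raw records with ingestion metadata without altering the payload.

    Hypothetical helper for illustration; column names are assumptions.
    """
    # One timestamp per batch so all rows from the same load share it.
    ingested_at = ingested_at or datetime.now(timezone.utc).isoformat()
    return [
        {
            "raw_payload": line,              # original text, stored unchanged
            "_ingested_at": ingested_at,      # metadata: ingestion timestamp
            "_source_system": source_system,  # metadata: source system
            "_source_file": file_name,        # metadata: file name
        }
        for line in raw_lines
    ]


# Two records whose shapes differ: the second one carries a new "email" field.
raw_lines = [
    '{"id": 1, "name": "Ada"}',
    '{"id": 2, "name": "Grace", "email": "grace@example.com"}',
]
bronze = to_bronze_rows(raw_lines, "crm", "customers_2024_01_01.json")

# The payload is kept as raw text, so the extra field does not break ingestion;
# downstream layers can discover which fields have appeared so far.
observed_fields = sorted(
    {key for row in bronze for key in json.loads(row["raw_payload"])}
)
print(observed_fields)
```

Because nothing is parsed or dropped at this stage, the Bronze rows can always be reprocessed if a later layer's logic changes.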

Silver Layer

The Silver layer is where most of the data modeling and transformation take place:

  1. Design normalized or semi-normalized data models, often following third normal form (3NF) principles[5].
  2. Create tables for key business entities and relationships.
  3. Apply data quality rules, deduplication, and basic transformations[1].
  4. Implement source system-aligned data models to maintain traceability[1].
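The Silver steps above can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not a prescribed implementation: the function `build_silver`, the order and customer fields, and the "positive amount" quality rule are all hypothetical. It splits flat order rows into separate customer and order entities, keeps the latest version of each order, sets aside rows that fail the quality rule, and carries the source system column through for traceability.

```python
def build_silver(bronze_orders):
    """Deduplicate, validate, and split flat order rows into two entities.

    Hypothetical helper for illustration; field names are assumptions.
    """
    customers, orders, rejected = {}, {}, []
    # Process in ingestion order so the latest version of each key wins.
    for row in sorted(bronze_orders, key=lambda r: r["_ingested_at"]):
        # Data quality rule (illustrative): an order needs a positive amount.
        if row["amount"] is None or row["amount"] <= 0:
            rejected.append(row)
            continue
        # Customer attributes are stored once, keyed by customer_id.
        customers[row["customer_id"]] = {
            "customer_id": row["customer_id"],
            "customer_name": row["customer_name"].strip().title(),
            "_source_system": row["_source_system"],  # kept for traceability
        }
        # Orders reference the customer by key; duplicates collapse on order_id.
        orders[row["order_id"]] = {
            "order_id": row["order_id"],
            "customer_id": row["customer_id"],
            "amount": row["amount"],
            "_source_system": row["_source_system"],  # kept for traceability
        }
    return list(customers.values()), list(orders.values()), rejected


bronze_orders = [
    {"order_id": "o1", "customer_id": "c1", "customer_name": " ada lovelace",
     "amount": 10.0, "_source_system": "shop", "_ingested_at": "2024-01-01T00:00:00"},
    # Later version of the same order: replaces the row above.
    {"order_id": "o1", "customer_id": "c1", "customer_name": "ada lovelace",
     "amount": 12.5, "_source_system": "shop", "_ingested_at": "2024-01-02T00:00:00"},
    # Fails the quality rule and is set aside rather than loaded.
    {"order_id": "o2", "customer_id": "c1", "customer_name": "ada lovelace",
     "amount": -5.0, "_source_system": "shop", "_ingested_at": "2024-01-03T00:00:00"},
]

customers, orders, rejected = build_silver(bronze_orders)
print(len(customers), len(orders), len(rejected))
```

Storing customer attributes once and referencing them from orders by key is the normalization idea in miniature; a real Silver model would apply the same pattern per business entity.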

Gold Layer

The Gold layer focuses on optimizing data for specific use cases and consumption:
