Refactoring Flat Data into Relational Tables
メニューを表示するにはスワイプしてください
A wide flat sheet is a single table containing every piece of information about every event — customer details, product details, dates, and transaction values all live in one place. It is easy to read but cannot power a relational data model.
The problem with repetition: every order for Acme Corp repeats the customer name, region, and segment. If those details ever change, every row must be updated individually — which leads to inconsistency and bloated storage.
Step 1 — Identify and Extract Dimension Tables
Dimension tables describe the who, what, and when of each event. Start with them because their boundaries are usually the clearest. Each row must be unique — remove duplicates after pasting.
- Customers table: CustomerID (PK), CustomerName, Region, Segment;
- Products table: ProductID (PK), ProductName, Category;
- Dates table: Date (PK), Year, Month (number), Month name, Quarter.
Why must the dates table span the full year?
Time-intelligence calculations (year-to-date, rolling averages, period comparisons) require a continuous, gap-free calendar. If the table only contains order dates, months with no sales are missing entirely and calculations break.
Step 2 — Build the Fact Table
The fact table records what happened — one row per event. It stores measurable values and the foreign keys that link each event back to its dimension tables. It intentionally contains no descriptive text.
- Sales (fact table): OrderID (PK), OrderDate → Dates (FK), CustomerID → Customers (FK), ProductID → Products (FK), UnitPrice, Quantity, Total.
- PK — Primary Key (a unique identifier for each record in a table);
- FK — Foreign Key (a field that links to a Primary Key in another table).
Star Schema
The four tables form a star schema: the fact table sits at the centre and the dimension tables radiate outward, each connected by a primary-key / foreign-key pair.
- Color-code columns in the wide flat sheet by destination: customers (blue), products (orange), dates (green), fact values (yellow);
- Copy each color group to a new sheet. Convert the range to a named Excel table (Insert → Table);
- Select the primary key column and use Data → Remove Duplicates to ensure every row is unique.
- For the dates table, extend the date column to cover every day from 1 Jan to 31 Dec, then add Year, Month, Month Name, and Quarter columns using
=YEAR(),=MONTH(),=TEXT(B2, "MMMM"), and the quarter formula="Q"&INT((MONTH(B2)-1)/3+1). - Paste the dates sheet as values only (no formulas) into a clean sheet before converting to a table.
- Build the fact table from the remaining columns plus the three foreign keys (
OrderDate,CustomerID,ProductID).
1. What are the main benefits of refactoring a flat sheet into a fact table plus dimensions? (Select two)
2. When creating a Customers dimension from a flat table, which step is essential?
フィードバックありがとうございます!
AIに質問する
AIに質問する
何でも質問するか、提案された質問の1つを試してチャットを始めてください