Designing Fact and Dimension Tables
Stryg for at vise menuen
Most Excel workbooks store everything in one wide sheet: customer names, product categories, regions, and transaction totals all sitting on the same row, repeated thousands of times. This works for small problems. It breaks down when data grows, when a name changes, or when you want to combine it with other sources.
A relational model solves this by separating data into focused tables, each describing one subject. The key concept behind this separation is the distinction between fact tables and dimension tables.
Fact Tables
A fact table records events or transactions — things that happened.
Characteristics:
-
Many rows: one per event (one order, one payment, one visit);
-
Measures: numeric values you want to aggregate — Quantity, Total, Cost, Hours;
-
Foreign keys: ID columns that point to dimension tables — CustomerID, ProductID, OrderDate;
-
Little descriptive text: names, labels, and categories belong in dimensions.
Examples of fact tables:
- Sales order lines;
- Invoice payments;
- Website visits;
- Support tickets.
Dimension Tables
A dimension table describes the entities involved in facts such as who, what, where, and when.
Characteristics:
-
One row per unique entity: one per customer, one per product, one per date;
-
Descriptive attributes: names, categories, regions, segments — mostly text;
-
Primary key: a unique ID column that the fact table references.
Examples of dimension tables:
- Customers (CustomerID, CustomerName, Region, Segment);
- Products (ProductID, ProductName, Category, UnitPrice);
- Dates (Date, Year, Month, Quarter);
- Employees (EmployeeID, Name, Department, Role).
A Three-Question Decision Framework
For any column in a flat table, ask these questions in order:
Q1: Does this column measure something you want to add up or count?—Quantity, Total, Cost → fact table measure;
Q2: Is this column an ID that links to a description somewhere else?—CustomerID, ProductID, OrderDate → fact table foreign key;
Q3: Is this a descriptive label about a person, product, or time period?—CustomerName, Region, Category → dimension table attribute.
If a column falls into Q3, ask one more: what subject does it describe? All columns describing customers go in Customers. All columns describing products go in Products. Columns describing dates go in Dates.
1. In the Orders_Flat sheet, which of the following groups of columns most clearly belong together in a Products dimension?
2. Which statement best describes the grain of the Sales fact table in this chapter?
Tak for dine kommentarer!
Spørg AI
Spørg AI
Spørg om hvad som helst eller prøv et af de foreslåede spørgsmål for at starte vores chat