What is Normalization in Power BI?

Cody Schneider8 min read

If you've ever tried to build a report in Power BI with a single, massive Excel file, you've probably felt the pain of slow reports, confusing DAX formulas, and data that just won't cooperate. The secret to fixing this isn't a faster computer - it's a technique called normalization. This article will show you what normalization is, why it's a lifesaver for your reports, and how to do it step-by-step in Power BI’s Power Query Editor.

GraphedGraphed

Still Building Reports Manually?

Watch how growth teams are getting answers in seconds — not days.

Watch Graphed demo video

What is Normalization? And Why Should You Care?

In simple terms, normalization is the process of breaking down a large, clunky table into smaller, more organized, and related tables. Think of it like organizing a messy garage. Instead of having one giant pile of tools, sports equipment, and holiday decorations, you'd put everything into separate, labeled bins. Tools go in the toolbox, basketballs go in the sports bin, and lights go in the holiday box.

In the world of data, that "one giant pile" is often a single, wide spreadsheet where every piece of information about a sale - customer details, product information, order facts - is crammed into a single row. This is what we call a "flat" or "denormalized" table.

For example, a flat sales table might look like this:

One Big, Messy Table (Denormalized)

See all that repeated information? John Smith's details and the Laptop S-100's information appear every single time they are part of a transaction. Normalizing this would mean creating separate "bins" or tables:

  • A Customers table with unique customer information.
  • A Products table with unique product information.
  • A clean Sales table that just records the transaction, using IDs to refer to the customer and product.

This organized structure is the foundation of a clean and efficient Power BI report.

GraphedGraphed

Still Building Reports Manually?

Watch how growth teams are getting answers in seconds — not days.

Watch Graphed demo video

The Real-World Benefits of Normalization

Organizing your data isn't just for neatness. It has very practical advantages that will make your life in Power BI much easier.

  • Faster Reports: Power BI's engine (called the VertiPaq engine) is highly optimized for this normalized structure, known as a star schema. Smaller, leaner tables process much faster than one "mega-table," which means your visuals will load quicker and slicers will be more responsive.
  • Reduced Data Errors: In our messy table, what if someone enters a customer as "John Smith" in one row and "Jon Smith" in another? Or a product as "Laptop S-100" and "Laptop S100"? Your report will treat these as two different people or products. With a normalized structure, each customer and product exists in only one place. You fix a typo once, and it's corrected everywhere.
  • Easier Maintenance: Imagine a customer changes their email address. In the flat file, you'd have to find and update every single row where that customer appears. With a Customers table, you update it in one single spot. Done.
  • Simpler DAX Formulas: Writing DAX measures becomes far more intuitive when your model is properly structured. Calculating Total Sales or Average Order Quantity is straightforward when your model clearly separates your numbers (facts) from their context (dimensions).

The Building Blocks: Fact Tables and Dimension Tables

To properly normalize your data for Power BI, you need to understand two key types of tables: fact tables and dimension tables. This is the core concept of a star schema model, which is the gold standard for analytics tools like Power BI.

Fact Tables: The "What Happened"

A fact table contains the numbers and measurements from your business events or transactions. Think of it as the record of what happened.

  • It records things like sales amount, quantity sold, cost, clicks, or ad spend.
  • It’s typically made up of mostly numeric values and foreign keys (ID columns) that link to dimension tables.
  • Fact tables are often long and skinny, with many rows (since you have lots of transactions) but few columns.

Our Sales table, once normalized, will become our fact table, containing only quantifiable values and IDs. This makes it efficient for Power BI to perform aggregations like SUM, AVERAGE, and COUNT.

Example Fact Table: Sales

Dimension Tables: The "Who, What, Where, When"

Dimension tables provide the context for the numbers in your fact table. They answer the questions of who, what, where, and when related to the business event.

  • They describe your business entities: customers, products, employees, locations, and dates.
  • The columns are mostly descriptive text attributes like Customer Name, Product Category, City, or Shipping Address.
  • Dimension tables are usually short and wide, with fewer rows (one for each unique entity) but more descriptive columns.
  • Each dimension table has a primary key—a unique identifier for each row (like CustomerID or ProductID)—that links it to a fact table.

Example Dimension Table: Customers

When combined, your model looks like a star, with the fact table in the center and the dimension tables radiating outwards. This clean structure is exactly what Power BI wants.

GraphedGraphed

Still Building Reports Manually?

Watch how growth teams are getting answers in seconds — not days.

Watch Graphed demo video

How to Normalize Data in Power BI (A Step-by-Step Guide)

Enough theory. Let's walk through reorganizing our messy, flat sales file into a beautiful star schema right inside Power BI's Power Query Editor.

Step 1: Get Your Data into Power Query

First, load your messy Excel or CSV file into Power BI. From the Home tab, click Get Data and select your file source. Once the data preview appears, don't click "Load." Click Transform Data instead. This will take you into the Power Query Editor, which is where the magic happens.

Step 2: Create Your Dimension Tables

You’ll start with one query, which represents your original flat table. Our goal is to use this single query to create our separate dimension tables: Products and Customers.

Creating the Customers Dimension

  1. In the Queries pane on the left, right-click your main query (e.g., Sales_Data) and select Duplicate.
  2. Rename the new query to Customers. This query will become your clean list of unique customers.
  3. Select the columns that contain unique customer information (e.g., CustomerID, CustomerName, Email, City). You can do this by holding Ctrl and clicking on each column header.
  4. Right-click on one of the selected column headers and choose Remove Other Columns. You'll be left with only the customer-related columns.
  5. Now, to ensure each customer appears only once, click the CustomerID column to select it, then go to the Home tab and click Remove Rows > Remove Duplicates.

You now have a clean dimension table with one row for each unique customer.

Creating the Products Dimension

Repeat the exact same process to create your Products table:

  1. Duplicate the original Sales_Data query again.
  2. Rename it to Products.
  3. Select only the columns related to products (ProductID, ProductName, Category, Price).
  4. Right-click and select Remove Other Columns.
  5. Select the ProductID column and click Remove Rows > Remove Duplicates.

Step 3: Clean Up Your Fact Table

Now, go back to your original query (Sales_Data). This will become our central Sales fact table. Since all the descriptive information is now in our new dimension tables, we can safely remove it from here.

  1. Select your original query and rename it to Sales.
  2. Select all the descriptive columns you moved into the dimension tables—columns like CustomerName, Email, City, ProductName, and Category. Do not delete the ID columns (CustomerID, ProductID) or your transactional data (OrderID, Date, Quantity, SalesAmount).
  3. Right-click one of the selected columns and choose Remove Columns.

You're now left with a lean, efficient fact table that contains only numbers and the keys needed to connect to the dimension tables.

GraphedGraphed

Still Building Reports Manually?

Watch how growth teams are getting answers in seconds — not days.

Watch Graphed demo video

Step 4: Load and Create Relationships

With your fact and dimension tables created, go to the Home tab in Power Query and click Close & Apply. Power BI will now load these three neat tables into your data model.

The final step is to tell Power BI how they're related:

  1. Go to the Model view in Power BI (the third icon on the left sidebar).
  2. You should see your three tables: Sales, Customers, and Products.
  3. Click and drag the CustomerID field from your Customers table and drop it onto the CustomerID field in your Sales table. A line will appear, representing the relationship.
  4. Do the same for the Products table: click and drag ProductID from the Products table to the ProductID in the Sales table.

That's it! You've successfully normalized your flat file into a high-performance star schema. Your report will now be faster, your data more reliable, and your DAX formulas easier to write.

Final Thoughts

Normalization turns unwieldy spreadsheets into a disciplined data model that Power BI can work with efficiently. By splitting your data into central fact tables (the numbers) and descriptive dimension tables (the context), you create reports that are faster, more accurate, and much easier to manage over time. It's a fundamental skill that elevates your reporting from basic to professional.

Of course, all this data wrangling in Power Query, while powerful, can become time-consuming when you're pulling data not just from a spreadsheet, but from live sources like Google Analytics, Shopify, your CRM, and finance tools. For this, we built Graphed. It automates connecting to all your data sources so you can build real-time, interactive dashboards by simply telling us what you want to see. This approach allows you to skip much of the manual data modeling and get straight from questions to insights in seconds.

Related Articles