What is the Default Data Blending Join in Tableau?
When you're trying to combine data from different sources in Tableau, you'll quickly encounter "data blending." But what's actually happening behind the scenes when you drop fields from two separate spreadsheets onto your dashboard? This article will break down exactly how Tableau's default data blending join works, why it works that way, and how to avoid the common errors that can trip you up. We'll cover the difference between blending and joining, see how Tableau decides which data source is in charge, and walk through a practical example.
Data Blending vs. Data Joining: What's the Difference?
First, let's clear up a common point of confusion. Data blending and data joining are two different ways to combine data in Tableau, and they work in fundamentally different ways. Understanding this distinction is the first step to mastering either one.
Data Joining is what you do on the "Data Source" page. You combine tables at the row level before you start building your visualization. Think of it like creating one big, wide uber-table from two or more tables that share a common field (like a 'User ID' or 'Product SKU'). All the data is mashed together first, and then you start your analysis on this combined dataset.
Data Blending, on the other hand, happens on the worksheet level after the data has been loaded. Instead of combining everything at the row level, blending works with aggregated data. Here's the key difference:
- Tableau sends a separate query to each data source.
- It aggregates the results from each query independently, based on the dimensions in your view.
- Then, it combines those aggregated results into a single visualization.
Think of it as asking two different people for a summary and then combining their answers, rather than asking them to combine a giant spreadsheet for you. Blending is essential when you can’t perform a join, such as when your data lives in different databases (like Google Sheets and a SQL server) or when the data is at different levels of granularity (e.g., daily sales data and monthly sales targets).
The Default Blend: It's a Left Join
So, when Tableau blends those aggregated results, what kind of "join" is it using? Tableau's default and only data blending join is a left join.
In a left join, you have a "left" table and a "right" table. The join keeps all the records from the left table and only brings in matching records from the right table. If a record in the left table has no match in the right table, the fields from the right table will simply appear as null.
In the world of data blending, Tableau doesn't call them "left" and "right" tables, it calls them primary and secondary data sources. The primary source acts as the left table in the join, meaning every member of a dimension in your primary source will appear in your view, regardless of whether it has a match in the secondary source. This decision about which source is "primary" is the single most important factor in how your blended view will look.
How Tableau Determines Primary vs. Secondary Sources
This is where things can get a little tricky if you’re not paying attention. Tableau determines the primary and secondary sources based on your actions on the worksheet. It's incredibly simple but has major implications:
The first data source you drag a field from onto your view automatically becomes the primary data source.
That's it. Tableau will mark this primary source with a small blue checkmark next to its name. Any subsequent data sources you use in that same worksheet become secondary sources and are marked with an orange checkmark. This distinction is worksheet-specific, so you can have the same data source be primary on one sheet and secondary on another.
The blue checkmark means "this is my 'left' table." Its dimension members will define the headers in your view (e.g., all the product names, regions, or dates). The orange checkmark means "this is my 'right' table." Only data corresponding to the dimension members already established by the primary source will be brought in.
A Step-by-Step Example of Data Blending
Let's make this tangible with an example. Imagine we have two simple data sources:
- Coffee Sales (Excel): A file containing daily sales transactions. It has
Order ID,Product Name,Sales, andRegion. - Regional Managers (Google Sheets): A list of regions and the manager responsible for each one. It has
RegionandManager.
We want to build a simple view showing total sales for each manager. Since the data is in two different types of sources, we need to blend.
Step 1: Connect to Your Data Sources
First, connect to both the "Coffee Sales" Excel file and the "Regional Managers" Google Sheet in Tableau. You will see both sources listed in the Data pane.
Step 2: Choose Your Primary Source (Crucial!)
We want a list of managers and their sales. Since all our sales data lives in the "Coffee Sales" source, and we want to see all the sales even if a region doesn't have a manager assigned yet, we should start there.
Let's drag Region from the Coffee Sales data source onto the 'Rows' shelf. Instantly, you'll see a blue checkmark appear next to "Coffee Sales." You have just defined your primary source for this worksheet.
Step 3: Define the Relationship
Next, switch to the "Regional Managers" data source in the Data pane. You'll notice a small grayed-out "linking" icon - it looks like a paperclip or a piece of chain - next to the Region field. This means Tableau has automatically detected that both sources share a field with the same name.
If the chain icon is broken and red, it means you need to define the relationship manually. You can do this by going to Data > Edit Blend Relationships. By default, Tableau will automatically try to link on any fields that share the same name.
Step 4: Bring in Fields from the Secondary Source
Now, click on the Regional Managers source. An orange checkmark will have appeared, designating it as the secondary source. Drag Manager onto the 'Rows' shelf next to Region and drag Sales from the primary "Coffee Sales" source onto the 'Text' shelf in the Marks card.
What just happened behind the scenes?
- Because "Coffee Sales" was primary, Tableau first queried that data source to get a list of all regions and their aggregate sales (something like:
SELECT Region, SUM(Sales) FROM CoffeeSales GROUP BY Region). - Then, it queried the "Regional Managers" source to get a list of regions and their managers (like
SELECT Region, Manager FROM RegionalManagers). - Finally, it performed a LEFT JOIN on these aggregated results, with the aggregated Sales data as a "left" table. It matched the Manager from the secondary source where the
Regionfield matched. If a region in our sales data didn't have a manager listed in the Google Sheet, the Manager field would be null, but the Region and its sales figure would still appear in our view.
Common Data Blending Pitfalls
Because the process is so dependent on the order of operations, it's easy to make a few common mistakes.
1. Accidentally Swapping Primary and Secondary Sources
What if, in our example, we had dragged Manager from the "Regional Managers" sheet first? That source would have become primary. Our left join would then be driven by the list of managers. If any sales occurred in a region that did not have a manager assigned, those sales would be excluded from the view entirely! This is often the cause of "missing data" when blending.
How to Fix It: Always start building your view with the data source that contains the most complete set of dimension members you want to see. Your primary source should answer the question, "what defines a row in my view?"
2. The Dreaded Asterisk (*)
Sometimes when you bring a measure or dimension from a secondary source, you don’t see a value - you see an asterisk (*). This happens when there is a one-to-many relationship between your primary source dimension and your secondary source at that level of blending.
For example, what if our "Regional Managers" file accidentally had two managers listed for the "West" region? When Tableau joins the aggregated data, it finds two different managers for one "West" region value from the primary source. Since it can't choose between them, it displays a * to let you know there are multiple possible values.
How to Fix It: You need to fix the underlying data in your secondary source to ensure there's a unique match for your linking field. Or, you need to adjust the level or dimensions of detail on your worksheet so that the blend is one-to-one.
3. Filtering Confusion
A Quick Filter on your primary data source will behave as you expect - it filters the data before the aggregate results are blended.
However, when you apply a Quick Filter using a field from the secondary data source, it can be misleading. This filter applies after the data has been blended. It's essentially filtering a result set that has already been aggregated and joined. This can sometimes lead to unexpected outcomes compared to how filters work in non-blended sheets.
Final Thoughts
Mastering Tableau’s data blending means remembering two main things: the default join is a left join, and the primary data source - determined by the first field you drag into the view - acts as the left table. It’s an incredibly powerful feature for combining aggregated results from totally different sources without complex database work, but using it correctly requires a clear understanding of what’s happening in what order.
At our core, we built Graphed to smooth out these very reporting challenges. Figuring out data relationships across platforms like Shopify, Google Analytics, and Salesforce can be time-consuming. Instead, we let you connect your sources in seconds and then use simple, natural language to build your dashboards. You just describe what you want to see - "show me sales revenue by marketing campaign" - and the charts appear, pulling live, combined data without you having to manually define a single blend or join.
Related Articles
How to Enable Data Analysis in Excel
Enable Excel's hidden data analysis tools with our step-by-step guide. Uncover trends, make forecasts, and turn raw numbers into actionable insights today!
What SEO Tools Work with Google Analytics?
Discover which SEO tools integrate seamlessly with Google Analytics to provide a comprehensive view of your site's performance. Optimize your SEO strategy now!
Looker Studio vs Metabase: Which BI Tool Actually Fits Your Team?
Looker Studio and Metabase both help you turn raw data into dashboards, but they take completely different approaches. This guide breaks down where each tool fits, what they are good at, and which one matches your actual workflow.