What Can Be Configured in an Extract in Tableau?

Cody Schneider · 8 min read

Working with large datasets in Tableau can feel like trying to steer a ship in a storm - slow, heavy, and unresponsive. Tableau Extracts are your best tool for calming those waters and making your dashboards incredibly fast. This article walks you through exactly what you can configure in an extract to optimize performance, reduce data volume, and build much more efficient workbooks.

Graphed

Build AI Agents for Marketing

Build virtual employees that run your go to market. Connect your data sources, deploy autonomous agents, and grow your company.

Watch Graphed demo video

First, What Is a Tableau Extract?

In Tableau, you have two primary ways to connect to your data: Live Connection and Extract.

  • A Live Connection queries the source database directly every time you interact with your dashboard (e.g., change a filter, click on a chart). This gives you real-time data but can be slow if the database is overloaded or the queries are complex.
  • An Extract is a highly compressed snapshot or subset of your data that is stored locally in Tableau's high-performance database engine (as a .hyper file). Because the data is right there with your workbook, performance is typically much faster, and you can even work with it offline.

Think of it this way: a Live Connection is like watching a live stream of an event, which depends on a good internet connection. An extract is like downloading the video of that event - it's saved on your device and plays instantly whenever you want.

The real power comes from the fact that you don't have to extract all your data. You can be selective, and this is where configuration becomes critical.

Creating an Extract and Finding the Configuration Menu

To start configuring an extract, you first need to tell Tableau you want to use one. On the Data Source tab in Tableau Desktop, look in the upper-right corner. You'll see the Connection options: Live and Extract. Simply select the Extract radio button.

Once you select Extract, an "Edit" link appears next to it. Clicking it opens the Extract Data dialog box, which is where you configure exactly what data gets pulled into your .hyper file.

Let's break down each option available in this menu.

Free PDF · the crash course

AI Agents for Marketing Crash Course

Learn how to deploy AI marketing agents across your go-to-market — the best tools, prompts, and workflows to turn your data into autonomous execution without writing code.

Core Configuration Options for your Tableau Extract

The Extract Data dialog box (opened from the "Edit" link) gives you fine-grained control over your data snapshot. Mastering these settings is the key to balancing performance with data freshness.

1. Data Storage: Logical Tables vs. Physical Tables

This setting determines how Tableau stores the data from your joined tables. It's a fundamental choice that impacts the structure and sometimes the size of your extract.

  • Logical Tables (Default): This option stores one extract table for each logical table in your Tableau data model. Tables that you relate to each other on the logical layer stay separate, but physical tables that you join inside a single logical table are merged into it. If you joined an Orders table with a Customers table inside one logical table, the extract would hold one wide, denormalized table in which each row contains both the order information and the corresponding customer information. This is the default because it works with every other extract option, including filters, Top N sampling, and incremental refresh.
  • Physical Tables: This option stores one extract table for each physical table in the data source. Using the same example, Orders and Customers would be stored as two distinct tables, and the Hyper engine performs the join at query time. This can produce a noticeably smaller extract when a join repeats the same data across many rows. Tableau only allows it under certain conditions, for example when all joins are equality joins and the extract uses no extract filters, Top N sampling, or incremental refresh.

When to choose which? For most use cases, stick with the default "Logical Tables," since it is the only option compatible with the filtering and refresh settings covered below. Consider "Physical Tables" when joins make the extract much larger than expected, or if advised to do so to resolve a specific performance bottleneck.
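
To make the size trade-off concrete, here is a minimal Python sketch. It is not Tableau's storage code: the sample tables and the `cell_count` measure are hypothetical, and real .hyper files are compressed. It only compares how many values get stored when two tables are kept separate versus when their join is materialized as one wide table.

```python
# Hypothetical sample data; not taken from any real Tableau data source.
customers = [
    {"customer_id": 1, "name": "Ada", "segment": "Consumer", "region": "West"},
    {"customer_id": 2, "name": "Grace", "segment": "Corporate", "region": "East"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "sales": 25.0},
    {"order_id": 101, "customer_id": 1, "sales": 40.0},
    {"order_id": 102, "customer_id": 2, "sales": 15.0},
    {"order_id": 103, "customer_id": 1, "sales": 60.0},
]


def cell_count(rows):
    """Count stored values as a rough stand-in for uncompressed size."""
    return sum(len(row) for row in rows)


# Layout A: keep the two tables separate and join them at query time.
separate_cells = cell_count(orders) + cell_count(customers)

# Layout B: materialize the join as one wide table, so the customer
# columns are repeated on every order row.
customers_by_id = {c["customer_id"]: c for c in customers}
wide_table = [{**customers_by_id[o["customer_id"]], **o} for o in orders]
wide_cells = cell_count(wide_table)

print(separate_cells, wide_cells)  # 20 24
```

The gap widens as the number of orders per customer grows, which is why the choice of storage layout matters most for one-to-many joins.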

2. Filtering the Data

This is perhaps the most powerful and commonly used configuration. An Extract Filter lets you remove unwanted data before it's ever pulled from the source database. This dramatically reduces the extract size and improves refresh speeds.

Click the "Add..." button in the Filters section to create an extract filter. A filter dialog box appears, allowing you to filter on any dimension or measure.

Practical examples of useful extract filters include:

  • Date-Based Filters: Does your dashboard only analyze sales from the past 24 months? Add a filter on your Order Date field to exclude everything older. This alone can cut a huge historical dataset down to a manageable size.
  • Geographic Filters: If you're building a dashboard for the North American sales team, filter your data to only include "USA" and "Canada."
  • Status Filters: For a CRM analysis, you might filter out old, closed/lost opportunities that are no longer relevant to your current pipeline analysis.

By filtering at the extract level, you make the entire workbook faster because Tableau simply has less data to process for every single calculation, rendering, and interaction.
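
The effect of an extract filter can be sketched in a few lines of Python. This is only an illustration of the idea, not how Tableau evaluates filters: the rows, the field names, the fixed `today` value, and the rule that "24 months" starts on the first day of a month are all assumptions made for the example.

```python
from datetime import date


def months_back(today, months):
    """Return the first day of the month that lies `months` months before `today`."""
    total = today.year * 12 + (today.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def passes_extract_filter(row, today):
    """Keep only recent orders from the North American countries of interest."""
    cutoff = months_back(today, 24)
    return row["order_date"] >= cutoff and row["country"] in {"USA", "Canada"}


# Hypothetical source rows.
source_rows = [
    {"order_id": 1, "order_date": date(2024, 5, 1), "country": "USA"},
    {"order_id": 2, "order_date": date(2021, 3, 10), "country": "USA"},
    {"order_id": 3, "order_date": date(2023, 11, 20), "country": "Mexico"},
    {"order_id": 4, "order_date": date(2022, 6, 1), "country": "Canada"},
]

today = date(2024, 6, 15)  # fixed so the example is reproducible
extract_rows = [r for r in source_rows if passes_extract_filter(r, today)]
print([r["order_id"] for r in extract_rows])  # [1, 4]
```

Only the rows that pass the filter would ever be stored, so every later calculation works on the smaller set.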

3. Aggregating the Data

Aggregation allows you to "roll up" your data from its most granular form to a higher level. This is another incredibly effective way to reduce the amount of data in your extract without losing the business insight you need.

Check the box for "Aggregate data for visible dimensions." When you do this, Tableau summarizes your measures to the level of detail of your selected dimensions.

Roll Up Dates

The most common use of aggregation is with dates. You'll see an option to "Roll up dates to" a specific level like Year, Quarter, Month, or Day.

Example: Imagine your database records every single sales transaction with a precise timestamp, down to the second. Your sales performance dashboard, however, only ever analyzes trends at the daily or weekly level. Instead of storing every transactional row, you can aggregate the data to the "Day" level. Tableau will pre-calculate the total Sales, Profit, and Quantity for each day, and for each combination of the other visible dimensions (such as Region or Category), and store only those summarized rows in the extract.

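How much this helps depends on how many distinct combinations of day and visible dimensions remain, but for high-volume transactional data this single change can shrink an extract by several orders of magnitude and make dashboards load far faster.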

Warning: Aggregation is a permanent change for that extract. If you aggregate to the daily level, you can no longer analyze your data by the hour or minute. Make sure the level of aggregation fits all the analysis needs for any dashboards connected to that extract.
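
The roll-up itself is easy to picture in code. The following Python sketch is an illustration under assumptions, not Tableau's implementation: the transactions are made up, Region stands in for "the visible dimensions", and SUM is assumed as the aggregation for Sales.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical transactions: (timestamp, region, sales).
transactions = [
    (datetime(2024, 1, 5, 9, 15, 2), "West", 20.0),
    (datetime(2024, 1, 5, 17, 40, 51), "West", 30.0),
    (datetime(2024, 1, 5, 11, 0, 0), "East", 10.0),
    (datetime(2024, 1, 6, 8, 5, 30), "West", 5.0),
]


def roll_up_to_day(rows):
    """Sum sales per (day, region); the time of day is discarded."""
    totals = defaultdict(float)
    for timestamp, region, sales in rows:
        totals[(timestamp.date(), region)] += sales
    return dict(totals)


daily = roll_up_to_day(transactions)
print(len(transactions), "rows rolled up into", len(daily))
print(daily[(date(2024, 1, 5), "West")])  # 50.0
```

Once the rows are summarized this way, the individual timestamps no longer exist in the result, which is exactly why hourly analysis becomes impossible on an extract aggregated to the day.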

4. Set the Number of Rows

This section allows you to decide if you want all your data (after passing through the filters and aggregation) or just a sample.

  • All rows: This is the default. It will import every row that meets your criteria.
  • Top N rows: This option creates an extract with the first specified number of rows from your data source. This is invaluable when you're in the early stages of developing a dashboard with a massive dataset. You can create a small sample extract (e.g., Top 10,000 rows) to quickly build your charts and layouts without waiting for a multi-gigabyte extract to be created. Once you're done designing, you can switch it back to "All rows" and perform the full extract.
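
As a rough analogy for why a Top N extract is quick to build, the Python sketch below stops reading as soon as it has enough rows. The `source_rows` generator is a hypothetical stand-in for a very large table, and nothing here reflects how Tableau actually queries a database.

```python
from itertools import islice


def source_rows():
    """Hypothetical stand-in for a very large source table."""
    for i in range(1_000_000):
        yield {"row_id": i, "sales": i % 100}


def top_n(rows, n):
    """Take the first n rows and stop, without reading the rest."""
    return list(islice(rows, n))


sample = top_n(source_rows(), 10_000)
print(len(sample))  # 10000
```

Keep in mind that "first N rows" is not a random sample, so totals and distributions in the development extract can look quite different from the full data.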

5. Incremental vs. Full Refreshes

Once you publish your extract, you'll need to keep it up-to-date. This setting controls how the refresh happens.

  • Full Refresh (Default): Every time the extract is refreshed, Tableau deletes all the old data and re-imports the entire dataset from scratch based on your configuration. It's simple and reliable but can be time-consuming for large extracts.
  • Incremental Refresh: This option only appends new rows to the extract. To use it, you specify a column in your table that Tableau can use to identify what's new (e.g., an ever-increasing OrderID or a creation timestamp). Tableau checks the maximum value of that column from the last refresh and only queries for rows with a greater value. Rows that were changed or deleted in the source after they were extracted are not corrected until the next full refresh, so schedule an occasional full refresh alongside the incremental ones.

Choosing an incremental refresh is one of the best ways to speed up your recurring data updates on Tableau Server or Tableau Cloud. A refresh that takes an hour as a full refresh might only take a minute as an incremental refresh.
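
The "maximum value" logic behind an incremental refresh can be sketched as follows. This is a simplified Python illustration, assuming in-memory lists and a hypothetical `order_id` key column; it is not Tableau's refresh code.

```python
def incremental_refresh(extract, source, key="order_id"):
    """Append only source rows whose key exceeds the extract's current maximum."""
    high_water_mark = max((row[key] for row in extract), default=None)
    new_rows = [
        row for row in source
        if high_water_mark is None or row[key] > high_water_mark
    ]
    extract.extend(new_rows)
    return len(new_rows)


# Hypothetical state: the extract already holds orders 1 and 2.
extract = [
    {"order_id": 1, "sales": 10.0},
    {"order_id": 2, "sales": 20.0},
]
# Since the last refresh, order 2 was edited and orders 3 and 4 were added.
source = [
    {"order_id": 1, "sales": 10.0},
    {"order_id": 2, "sales": 99.0},
    {"order_id": 3, "sales": 30.0},
    {"order_id": 4, "sales": 40.0},
]

added = incremental_refresh(extract, source)
print(added)                # 2
print(extract[1]["sales"])  # 20.0 (the edit to order 2 was not picked up)
```

The second printed value shows the trade-off of an append-only refresh: it is fast because it skips existing rows, and for the same reason it misses changes made to them.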

Final Thoughts

Configuring a Tableau Extract is a critical skill for anyone moving from a beginner to an intermediate developer. By thoughtfully using filters, aggregation, and incremental refreshes, you take control of your workbook's performance, ensuring your end-users get a fast, responsive, and valuable dashboard experience. It turns data preparation from a passive step into an active optimization strategy.

Of course, this process of connecting data, tweaking performance settings, and visualizing insights is often the most time-consuming part of analytics. We built Graphed to simplify this entire workflow. Instead of manually configuring extracts and wrestling with complex dashboards, you can connect your business applications - like Google Analytics, Salesforce, or Shopify - and simply ask for what you want in plain English. Your data is automatically synced and kept up-to-date, allowing you to ask questions and get real-time dashboards in seconds, not hours.
