How to Do Data Quality Assessment in Excel

Cody Schneider

Checking your data's health in Excel shouldn't feel like wrestling with a spreadsheet. If you want trustworthy reports and dashboards, you first need to assess your data for errors, and it's easier than you think. This guide will walk you through a practical, step-by-step process for performing a data quality assessment directly within Excel, using tools you already have.

What is Data Quality and Why Does It Matter?

Data quality isn't just an abstract concept; it determines whether you can trust your numbers. High-quality data is accurate, complete, and consistent, while low-quality data is a leading cause of misinformed decisions, wasted marketing spend, and inaccurate sales forecasts.

Imagine you export your customer list to send a special offer. If your data quality is poor, you might have:

  • Customers with no email address listed (incompleteness).

  • Duplicate entries for the same person, causing them to get the email twice (uniqueness issue).

  • Mismatched state and zip codes, ruining any location-based targeting (inconsistency).

  • Email addresses misspelled or missing the "@" symbol (validity issue).

Poor data quality creates real problems that impact everything from your marketing campaigns to your financial reporting. A quick assessment in Excel is your first line of defense against these kinds of errors.

Key Dimensions of Data Quality

Before jumping into Excel, it's helpful to know what you're looking for. Data quality is an umbrella term that breaks down into several key dimensions. Here are the most common ones you'll assess in your spreadsheets:

  • Completeness: Are there any gaps? This checks for empty cells or missing records where data should exist.

  • Uniqueness: Are there any duplicates? This ensures each record is unique and there are no redundant entries.

  • Consistency: Is the data uniform? This checks if data is represented the same way across your entire dataset (e.g., "CA" vs. "California").

  • Validity: Does the data follow the rules? This confirms that data is in the correct format (e.g., a valid email format, recognized date format).

  • Accuracy: Is the data correct? This measures how well the data reflects the real world (e.g., does the customer's listed address actually exist?). Accuracy is often the hardest to check without an external source of truth.

Your Pre-Assessment Checklist in Excel

Never work on your original file. Before you start investigating, take a minute to set yourself up for success and protect your master dataset.

  1. Create a Working Copy: Always, always duplicate your data file first. Perform all your quality checks on this copy. This protects your original data from accidental changes or deletions.

  2. Understand the Context: Get a feel for the data. What does each column represent? Where did the data come from? What is its purpose? Understanding the context helps you define what "good" data should look like. A price column should contain numbers, a state column should contain state abbreviations, and so on.

  3. Convert to a Table: Select any cell in your data range and press Ctrl + T (or Cmd + T on Mac) to convert your data into an Excel Table. Tables offer several advantages, including easy sorting and filtering, and formulas that automatically fill down for the entire column.
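To see what the Table conversion buys you, here is a small sketch using structured references. It assumes you have renamed the table to Contacts (a hypothetical name; Excel's default is something like Table1, and you can change it on the Table Design tab) and that it has a column called Email (also an assumption). The first formula counts the data rows in the table, and the second counts the blank cells in the Email column:

```excel
=ROWS(Contacts)
=COUNTBLANK(Contacts[Email])
```

Because these references use names instead of cell addresses, they should keep pointing at the whole column as rows are added to the table.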

How to Assess Your Data Quality in Excel: A Step-by-Step Guide

Now that you're prepared, it's time to roll up your sleeves and check your data against the quality dimensions using simple Excel tools and formulas.

1. Checking for Completeness (Finding Blanks)

Missing data is one of the most common issues. An empty cell in a critical column like 'Email' or 'Order Value' can render that entire row of data useless.

Method 1: Using COUNTBLANK

The COUNTBLANK function does exactly what it says: it counts the number of empty cells in a specified range. This is great for getting a quick high-level view of how many values are missing in a key column.

Let's say your contacts are in cells A2 through A1001. You would use this formula:

=COUNTBLANK(A2:A1001)

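This will give you the total count of empty cells. To see this as a percentage, you could use:

=COUNTBLANK(A2:A1001)/ROWS(A2:A1001)

Format the result cell as a percentage to read it as the share of missing values.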

Method 2: Using Filters

For a visual check of exactly where the blanks are, filters are your best bet.

  1. Make sure your data is in an Excel Table, so filters are automatically enabled.

  2. Click the filter dropdown arrow on the header of the column you want to check.

  3. In the filter menu, scroll to the bottom. If there are empty cells, you will see an option labeled "(Blanks)".

  4. Uncheck "(Select All)" and then check only the "(Blanks)" box.

Excel will hide all rows with data, showing you only the records where that specific field is missing. This is a fast and easy way to isolate incomplete records.
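Two further checks can round out the completeness picture. The formulas below are a sketch under assumed layouts: the first assumes each record spans columns A through E and goes in a helper column on row 2, and the second assumes the column you are checking sits in A2:A1001. The first flags any row that has at least one empty field:

```excel
=IF(COUNTBLANK(A2:E2)>0,"Incomplete","Complete")
```

The second counts cells that are either empty or contain nothing but spaces, which COUNTBLANK does not treat as blank:

```excel
=SUMPRODUCT(--(LEN(TRIM(A2:A1001))=0))
```

If the second number is higher than your COUNTBLANK result, some cells that look empty are actually holding stray spaces.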

2. Checking for Uniqueness (Finding Duplicates)

Duplicate records can inflate your customer counts, skew your sales numbers, and lead to frustrating customer experiences. Fortunately, Excel has powerful built-in tools for this.

Method 1: Conditional Formatting

This is the best method for highlighting duplicates without deleting them.

  1. Select the column where you want to find duplicates (e.g., the 'Email' column).

  2. Go to the Home tab on the ribbon.

  3. Click Conditional Formatting > Highlight Cells Rules > Duplicate Values...

  4. In the pop-up box, ensure "Duplicate" is selected, and choose a formatting style (the default light red fill with dark red text works well). Click OK.

Excel will instantly highlight all cells that have a duplicate entry within your selected range. You can then filter by color to review all the duplicate records together.

Method 2: Using the COUNTIF Formula

If you prefer to flag duplicates in a separate helper column, COUNTIF is perfect. This can be useful for creating a report on your data quality.

Assuming your email addresses are in column C, starting at C2, you could add this formula into cell D2:

=IF(COUNTIF(C:C,C2)>1,"Duplicate","Unique")

Drag this formula down for all your rows (in an Excel Table, it fills down automatically). For each row, it counts how many times that row's email address appears in the entire C column. If the count is greater than 1, it flags the row as "Duplicate"; otherwise it returns "Unique".
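The COUNTIF check described above flags every copy of a repeated value, including the first one. If you would rather keep the first occurrence and flag only the repeats, one possible variation uses a range that grows as the formula is filled down. This sketch keeps the same assumption that emails start in C2:

```excel
=IF(COUNTIF($C$2:C2,C2)>1,"Repeat","")
```

Sometimes a record only counts as a duplicate when several fields match together. Assuming, purely for illustration, that first names are in column A and last names are in column B, COUNTIFS can test both at once:

```excel
=IF(COUNTIFS(A:A,A2,B:B,B2)>1,"Duplicate","")
```

Both COUNTIF and COUNTIFS ignore case, so "Ann@example.com" and "ann@example.com" are treated as the same value.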

3. Checking for Consistency (Finding Formatting Errors)

Inconsistent data is a silent killer of accurate reporting. When "USA", "U.S.A.", and "United States" are all used in the same 'Country' column, any attempt to group by country will split your data into three separate buckets.

Method 1: Use Filters to Review Unique Values

This is the quickest way to spot inconsistencies in categorical data.

  1. Click the filter dropdown on the header of the column you want to inspect (e.g., 'Country').

  2. The filter list shows every unique value present in that column.

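  3. Carefully scan this list for variations. You'll easily spot issues like spelling variants ("Calif." vs. "CA") or abbreviation inconsistencies ("St." vs. "Street"). Keep in mind that the filter list generally ignores case, so "ca" and "CA" may show up as a single entry; a case-sensitive function such as EXACT is a safer way to catch capitalization differences.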
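If you are on Microsoft 365 or Excel 2021 or later, the same review can be done with formulas, which also gives you a count for each variant. This is a sketch that assumes the 'Country' values are in B2:B1001 and that you enter the formulas in empty cells outside the table, since spilled formulas do not work inside an Excel Table. In E2, list the distinct values in sorted order:

```excel
=SORT(UNIQUE(B2:B1001))
```

In F2, count how often each of those values appears (E2# refers to the whole spilled list):

```excel
=COUNTIF(B2:B1001,E2#)
```

A value with only a handful of rows next to a near-identical value with thousands is usually a variant worth standardizing. Like the filter list, UNIQUE and COUNTIF are not case-sensitive.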

Method 2: Using Text Functions Like TRIM and LEN

Sometimes the inconsistency is invisible, caused by extra spaces before or after a text entry (" Boston" vs. "Boston"). The TRIM function cleans these up, while the LEN function (length) can help you find them.

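To check for hidden spaces, create a new column and use this formula (here assuming your text is in column A, starting at A2):

=LEN(A2)

Then create another column with this formula:

=LEN(TRIM(A2))

If the numbers in these two columns don't match for any given row, the entry contains unwanted spaces that need to be cleaned: leading or trailing spaces, or repeated spaces between words, since TRIM also collapses those into a single space.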
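One caveat with this check: TRIM only removes the ordinary space character. Data pasted from web pages often contains non-breaking spaces, which look identical but survive TRIM. A possible cleanup formula, again assuming the text is in A2, swaps them for regular spaces first:

```excel
=TRIM(SUBSTITUTE(A2,UNICHAR(160)," "))
```

For capitalization, a case-sensitive comparison can confirm that a value is already in the form you expect. This sketch assumes A2 holds a state abbreviation that should be upper-case and returns FALSE for entries like "ca":

```excel
=EXACT(A2,UPPER(A2))
```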

4. Checking for Validity (Finding Incorrect Data Types)

Data is valid if it conforms to a set of rules. For example, a 'Price' column should contain numeric values, and an 'Order Date' column should contain valid dates.

A common issue in Excel is numbers being stored as text. Functions like SUM and AVERAGE skip text values in a range, so your totals come out wrong without any error message. You can usually spot these entries because they align to the left of the cell by default, instead of the right.

Method 1: Using ISNUMBER and ISTEXT Functions

To systematically check an entire column, you can use helper formulas.

  • To check if a column that should be numeric actually contains numbers, use: =ISNUMBER(A2). This will return TRUE if it's a number and FALSE if it isn't.

  • Similarly, =ISTEXT(A2) will return TRUE for text values.
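The helper formulas above work one cell at a time. For a column-level summary, here is a sketch that assumes the values that should be numeric are in A2:A1001. It counts the non-empty cells that are not numbers:

```excel
=COUNTA(A2:A1001)-COUNT(A2:A1001)
```

To tell numbers stored as text apart from genuinely non-numeric entries, a helper column could test whether the text can be converted:

```excel
=IF(AND(ISTEXT(A2),ISNUMBER(VALUE(A2))),"Number stored as text","")
```

The same pattern extends to simple format rules. Assuming emails are in column C, this flags entries with no "@" symbol. It is only a rough screen, not a full email validation:

```excel
=IF(ISNUMBER(FIND("@",C2)),"","Missing @")
```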

Method 2: Data Validation

While Data Validation is primarily a tool for preventing bad data from being entered, you can also use it to check existing data using its "Circle Invalid Data" feature.

  1. Select the column you want to check (e.g., a column of dates).

  2. Go to the Data tab and click Data Validation.

  3. On the Settings tab, set the criteria. For example, under "Allow," select "Date," and specify a condition like "greater than" 1/1/2000.

  4. Click OK. Then, click the Data Validation dropdown again and select "Circle Invalid Data".

Excel will draw a red circle around any cells in your selection that do not meet the criteria you just defined, making it easy to spot entries that are not valid dates or that fall outside the range you set.
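If you want a result you can filter or count, rather than circles on the sheet, a helper column can apply a similar rule. This sketch assumes the dates are in column B, starting at B2, and adds an upper bound of today's date, which is an assumption and not part of the rule above:

```excel
=IF(AND(ISNUMBER(B2),B2>DATE(2000,1,1),B2<=TODAY()),"Valid","Check")
```

Excel stores real dates as numbers, so the ISNUMBER test is what catches dates that were typed or imported as text.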

Final Thoughts

Performing a regular data quality assessment in Excel is a fundamental skill for anyone who relies on data to make decisions. By using tools like Conditional Formatting, Filters, and a few key formulas like COUNTBLANK and COUNTIF, you can systematically diagnose problems with completeness, uniqueness, consistency, and validity to build a foundation of trustworthy data.

Running these checks manually in Excel is a great starting point, but it can quickly become repetitive and time-consuming, especially when you're pulling data from multiple sources like Google Ads, Shopify, and Salesforce every week. At our company, we built Graphed to automate precisely this kind of tedious work. We connect to all your marketing and sales platforms, handle the data linking and quality checks in the background, and allow you to build real-time, accurate dashboards just by describing what you want to see - all without ever exporting another CSV file.