How to Use Remove Duplicates in Power Query in Excel

Excel 2016, Excel 2019, Excel 365

Learn how to efficiently remove duplicate rows from your datasets using Power Query's Remove Duplicates feature. This tutorial covers accessing the tool, selecting columns to analyze, and applying deduplication to clean messy data quickly without formulas.

Why This Matters

Removing duplicates is essential for data integrity and accurate analysis. Power Query records deduplication as a query step, so it reruns whenever the data is refreshed instead of being repeated by hand on each new dataset.

Prerequisites

  • Excel 2016 or later (Excel 365 recommended)
  • Basic understanding of loading data into Power Query
  • Access to the Power Query Editor interface

Step-by-Step Instructions

1. Load your data into Power Query

Select any cell in your data range and go to Data > Get & Transform Data > From Table/Range. If the range is not already a table, Excel asks you to confirm creating one, then opens the Power Query Editor with your dataset.

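2. Access the Remove Duplicates feature

In the Power Query Editor, go to the Home tab and find Remove Rows > Remove Duplicates in the ribbon. The command acts on whichever columns are selected in the preview, so make the selection described in the next step before you click it.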

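3. Select columns for duplicate detection

Power Query does not open a column-picker dialog; Remove Duplicates compares only the columns that are selected when you click it. Click a column header and press Ctrl+A to select every column and remove only rows that are identical across all columns, or Ctrl+click individual headers to match on those columns alone. Power Query generally keeps the first occurrence of each duplicate.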
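The effect of the column selection can be sketched outside Excel. The Python snippet below only illustrates the matching logic and is not Power Query's implementation; the `remove_duplicates` helper and the sample rows are hypothetical. It keeps the first row seen for each combination of values in the chosen columns.

```python
def remove_duplicates(rows, key_columns=None):
    """Keep the first row for each distinct combination of key column values.

    rows: list of dicts that share the same keys.
    key_columns: columns to compare; None means compare every column.
    """
    seen = set()
    result = []
    for row in rows:
        columns = key_columns if key_columns is not None else sorted(row)
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data
rows = [
    {"Name": "Ana", "Email": "ana@example.com", "City": "Lisbon"},
    {"Name": "Ana", "Email": "ana@example.com", "City": "Porto"},
    {"Name": "Ana", "Email": "ana@example.com", "City": "Lisbon"},
    {"Name": "Ben", "Email": "ben@example.com", "City": "Lisbon"},
]

all_columns = remove_duplicates(rows)            # only the exact repeat is dropped
email_only = remove_duplicates(rows, ["Email"])  # one row per email remains

print(len(all_columns))  # 3
print(len(email_only))   # 2
```

Comparing fewer columns removes more rows: matching on Email alone also drops the Porto row, which differs from the first row only in City.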

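4. Review the duplicate removal results

Power Query applies the change immediately and adds a Removed Duplicates step to the Applied Steps pane; the preview shows the remaining rows rather than a count of removed rows. Click the previous step to compare the data before and after, and delete the Removed Duplicates step if the result is wrong.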

5. Close and load the cleaned data

Click Close & Load to return the deduplicated dataset to a new worksheet, or Close & Load To to choose an existing worksheet or another destination.

Alternative Methods

Using the Data tab (legacy method)

In Excel's main ribbon, select Data > Data Tools > Remove Duplicates. This works in older Excel versions without Power Query, but it deletes rows directly from the worksheet.

Filter and delete manually

For smaller datasets, use Advanced Filter with Unique records only checked to show or copy the distinct rows, then delete the duplicate rows manually.

Utilize formulas with conditional formatting

Use COUNTIF in a helper column to count how often each value occurs, optionally highlight counts above 1 with conditional formatting, then filter on the helper column and delete the flagged rows.
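As a rough illustration of the helper-column idea, the Python sketch below mimics two common COUNTIF patterns. The formulas in the comments assume the values sit in column A starting at row 2, which is a layout chosen for this example and not something the tutorial specifies; the sample emails are hypothetical. Note that COUNTIF in Excel ignores letter case, which this sketch does not reproduce.

```python
from collections import Counter

# Hypothetical column of values to check
emails = [
    "ana@example.com",
    "ben@example.com",
    "ana@example.com",
    "cy@example.com",
    "ana@example.com",
]

# Like =COUNTIF(A:A, A2): total occurrences of each value in the column
totals = Counter(emails)
total_counts = [totals[e] for e in emails]

# Like =COUNTIF($A$2:A2, A2): occurrences so far, including the current row
running = Counter()
running_counts = []
for e in emails:
    running[e] += 1
    running_counts.append(running[e])

# A running count above 1 marks a later repeat of an earlier row
repeat_flags = [n > 1 for n in running_counts]

print(total_counts)    # [3, 1, 3, 1, 3]
print(running_counts)  # [1, 1, 2, 1, 3]
print(repeat_flags)    # [False, False, True, False, True]
```

The total count flags every copy of a duplicated value, including the first. The running count flags only the later copies, so filtering on it and deleting the flagged rows leaves one row per value.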

Tips & Tricks

  • Always create a backup of your data before removing duplicates to avoid accidental loss.
  • Sort your data before deduplication to review which rows will be removed.
  • Use column-specific deduplication if you only want to match on certain fields (e.g., email addresses).

Pro Tips

  • Combine Remove Duplicates with other Power Query transformations (sort, filter) in a single query for maximum efficiency.
  • Keep the original data in a separate table and use Power Query to create a deduplicated version for reporting.
  • Document which columns you used for duplicate detection to maintain data governance standards.

Troubleshooting

Remove Duplicates button is greyed out

Ensure you have selected data in the Power Query Editor and that at least one column is highlighted; the feature requires an active selection.

Duplicate rows still appear after removal

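Check for leading or trailing whitespace, differences in letter case, or mismatched data types. Power Query performs exact matches, so 'John' and 'john' are treated as different values. Normalize the column first with Transform > Format > Trim and Transform > Format > lowercase, then reapply Remove Duplicates.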
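The difference between exact matching and matching on cleaned values can be sketched in Python. This is an illustration of the idea only, not how Power Query works internally; the `normalize` and `remove_duplicates_normalized` helpers and the sample names are hypothetical.

```python
def normalize(value):
    """Trim surrounding whitespace and lowercase text values."""
    return value.strip().lower() if isinstance(value, str) else value


def remove_duplicates_normalized(values):
    """Keep the first occurrence of each value, comparing normalized forms."""
    seen = set()
    kept = []
    for value in values:
        key = normalize(value)
        if key not in seen:
            seen.add(key)
            kept.append(value)
    return kept


names = ["John", "john", " John ", "Jane"]

exact = list(dict.fromkeys(names))             # exact match: all four differ
cleaned = remove_duplicates_normalized(names)  # match after trim + lowercase

print(exact)    # ['John', 'john', ' John ', 'Jane']
print(cleaned)  # ['John', 'Jane']
```

Exact matching keeps all four values, while comparing the trimmed, lowercased forms collapses the three spellings of John into one.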

Too many rows were deleted unexpectedly

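Check which columns were selected when you applied Remove Duplicates. Comparing too few columns makes more rows count as duplicates, so rows that differ only in an unselected column are removed. Delete the Removed Duplicates step, select every column that defines a unique record, and reapply.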

Frequently Asked Questions

Does Remove Duplicates in Power Query preserve the original data?
Yes. Remove Duplicates changes only the query result; the original Excel table remains unchanged unless you load the result over it. Keep the original data on a separate sheet if you need both versions.
Can I undo Remove Duplicates after closing the query?
Not with Ctrl+Z, but the step is not permanent. As long as the query still exists, you can reopen it in the Power Query Editor, delete the Removed Duplicates step from Applied Steps, and load the data again.
What happens to data in hidden columns when using Remove Duplicates?
Power Query has no hidden columns in the worksheet sense. Remove Duplicates compares only the columns selected when the step is applied, and columns removed in an earlier step are no longer available for comparison, so apply Remove Duplicates before removing any column you need for matching.
Can I remove duplicates based on multiple criteria?
Yes, Ctrl+click multiple column headers before clicking Remove Duplicates; Power Query will consider a row a duplicate only if all selected columns match another row exactly.
