How To Remove Duplicates In Google Sheets
Level: intermediate · ~16 min read · Intent: informational
Audience: data analysts, finance teams, operations teams
Prerequisites
- intermediate spreadsheet literacy
- comfort with formulas or pivot concepts
Key takeaways
- Removing duplicates in Google Sheets is a data-quality task, not just a cleanup trick, because duplicate rows can distort counts, totals, dashboards, and spreadsheet-based reporting.
- The best duplicate-removal method depends on the workflow: use the built-in remove duplicates tool for direct cleanup, and use formulas like UNIQUE when the original data should remain untouched.
Removing duplicates in Google Sheets is one of the most useful cleanup tasks in spreadsheet work because duplicate values and repeated rows can quietly damage analysis. A sheet may still look normal while repeated data distorts totals, inflates counts, confuses lookups, and makes dashboards unreliable.
In real spreadsheet workflows, duplicates often appear because of:
- copied imports
- repeated form submissions
- export issues
- merge mistakes
- manual data entry
- combining multiple lists into one sheet
- bad source-system hygiene
But not every repeated row is wrong.
Sometimes a duplicate is a genuine data issue. Sometimes it is a valid repeated transaction, event, or record that should stay in the sheet.
This guide explains how to remove duplicates in Google Sheets, when to use the built-in tool, when to use formulas instead, how to avoid deleting the wrong data, and how duplicate cleanup fits into real reporting workflows.
Overview
A duplicate in Google Sheets usually means one of two things:
- the same value appears more than once in a column
- the same row, or the same combination of selected fields, appears more than once in a table
Examples include:
- the same invoice number appearing twice
- the same customer ID repeated accidentally
- the same product row imported more than once
- the same region listed repeatedly when only one distinct list is needed
- duplicate form responses
- repeated codes in a reference table
The key question is not just: “Is this repeated?”
The key question is: “Should this repeated value remain, or should it be removed?”
That distinction matters because deleting real business records by mistake can be just as damaging as leaving true duplicates in place.
Why duplicates are a problem
Duplicates are a reporting problem because they affect the numbers people rely on.
For example, duplicates can:
- inflate revenue
- overcount tasks
- double-count customers
- distort dashboard metrics
- break lookup tables
- create unreliable summaries
- make shared spreadsheets harder to trust
This is especially risky in:
- finance tracking
- operations reports
- KPI dashboards
- reference tables that are supposed to hold unique values
- customer lists
- inventory spreadsheets
- team planning sheets
That is why duplicate removal is not just cosmetic. It is part of keeping spreadsheet data trustworthy.
The most common types of duplicates
Not all duplicates behave the same way.
Exact duplicate rows
These are rows where every selected field is the same.
Example:
| Invoice ID | Customer | Amount |
|---|---|---|
| 1001 | Acme Ltd | 2500 |
| 1001 | Acme Ltd | 2500 |
This is the easiest type of duplicate to identify and remove.
Duplicate keys in one column
Sometimes the issue is not the whole row. It is one field that should only appear once.
Examples:
- employee ID
- customer code
- invoice number
- product SKU
The rest of the row may vary, but the repeated key is still a problem depending on the business rule.
Duplicate-looking rows that are not true duplicates
A row may look repeated but still be valid.
Examples:
- same invoice number, different line item
- same customer, different order date
- same product, different warehouse
- same employee, different month
This is why removing duplicates without defining the actual duplicate logic is dangerous.
Hidden-data duplicates
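Some rows look identical to a person but are different to Google Sheets because of:
- leading or trailing spaces
- text-number mismatches, such as the text “1001” versus the number 1001
- inconsistent capitalization
- imported hidden characters, such as non-breaking spaces
- formatting differences
How much each difference matters depends on the method. UNIQUE compares text case-sensitively, so “Acme” and “acme” come back as two values, while COUNTIF ignores case. Google's help page for the built-in tool says that cells differing only in letter case or formatting are treated as duplicates, so capitalization and formatting mostly affect formula-based approaches, while stray spaces and text-number mismatches can affect all of them.
This is why duplicate removal sometimes seems to miss rows that “look” duplicated.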
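The differences listed above can be neutralized before values are compared. The Python sketch below is one illustrative way to build a comparison key; it is not how Google Sheets works internally. The function name `normalize_cell`, the sample values, and the cleanup rules (trim, collapse whitespace, ignore case, treat numeric text as a number) are assumptions to adapt to the sheet's own business rules.

```python
import re
import unicodedata

# Plain integers or decimals stored as text, such as "1001" or "-3.5".
_NUMERIC = re.compile(r"[+-]?\d+(\.\d+)?")


def normalize_cell(value):
    """Return a comparison key for one cell value (assumed cleanup rules)."""
    if isinstance(value, (int, float)):
        return float(value)
    # NFKC folds compatibility characters, e.g. a non-breaking space becomes a space.
    text = unicodedata.normalize("NFKC", str(value))
    # Trim both ends and collapse internal runs of whitespace.
    text = " ".join(text.split())
    if _NUMERIC.fullmatch(text):
        return float(text)  # numeric text and real numbers share one key
    return text.casefold()  # ignore letter case


# Hypothetical values that look alike to a person but differ as raw data.
raw_values = ["Acme Ltd", "Acme Ltd ", "acme ltd", "1001", 1001,
              "Blue\u00a0Peak", "Blue Peak"]

distinct_raw = len(set(raw_values))
distinct_clean = len({normalize_cell(v) for v in raw_values})
print(distinct_raw, distinct_clean)
```

Whether case should be ignored, or numeric text converted, is a business decision; dropping a rule from `normalize_cell` makes the comparison stricter.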
The built-in remove duplicates tool
Google Sheets includes a built-in remove duplicates tool, found under Data > Data cleanup > Remove duplicates.
This is often the fastest method when:
- the source range should be cleaned directly
- the duplicate logic is clear
- the cleanup is a one-time step
- you want repeated rows removed from the actual sheet
The tool works by:
- checking the selected range
- comparing the selected columns
- keeping the first matching row
- removing later repeated rows
This is one of the easiest ways to clean imported or repeated data.
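The keep-first behavior described above can be sketched in a few lines of Python. This only illustrates the logic and is not Google Sheets' implementation; `remove_duplicates`, the dictionary row layout, and the third sample invoice are hypothetical.

```python
def remove_duplicates(rows, key_columns):
    """Keep the first row for each key and drop later rows with the same key."""
    seen = set()
    kept = []
    for row in rows:
        # The key is built only from the columns chosen for comparison.
        key = tuple(row[column] for column in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


invoices = [
    {"Invoice ID": 1001, "Customer": "Acme Ltd", "Amount": 2500},
    {"Invoice ID": 1001, "Customer": "Acme Ltd", "Amount": 2500},
    {"Invoice ID": 1002, "Customer": "Blue Peak", "Amount": 900},  # hypothetical row
]

cleaned = remove_duplicates(invoices, ["Invoice ID", "Customer", "Amount"])
print(len(invoices), "rows before,", len(cleaned), "rows after")
```

Because the first occurrence wins, the order of the rows decides which copy survives, so sorting before cleanup changes the result when the copies are not fully identical.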
When the built-in tool is the right choice
The built-in tool is usually the better choice when:
- the source data itself should become deduplicated
- the repeated rows are clearly errors
- the table is not meant to preserve a history of repeated entries
- the cleanup is direct and intentional
- you want a quick manual cleanup step
This is common in:
- imported CSV files
- duplicate customer lists
- repeated product tables
- bad merge outputs
- reference tables that should contain unique values only
Why column selection matters so much
One of the most important parts of removing duplicates is choosing which columns define a duplicate.
That matters because:
- removing duplicates across every column only removes exact duplicate rows
- removing duplicates based on one key field removes repeated occurrences of that key
- removing duplicates based on a combination of fields removes repeated combinations
This is a business decision, not just a technical one.
For example:
- if invoice ID should be unique, use that field
- if product and warehouse together define uniqueness, use both fields
- if the whole row must match, use all columns
The quality of the deduplication depends on this choice.
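To see how much the column choice changes the outcome, the Python sketch below counts how many rows would survive under three different duplicate definitions. The stock data, column names, and the helper `rows_after_dedup` are hypothetical and only illustrate the reasoning.

```python
def rows_after_dedup(rows, key_columns):
    """Count how many rows remain if one row is kept per distinct key."""
    keys = {tuple(row[column] for column in key_columns) for row in rows}
    return len(keys)


# Hypothetical stock table: the same product can sit in several warehouses.
stock = [
    {"Product": "P-10", "Warehouse": "East", "Qty": 40},
    {"Product": "P-10", "Warehouse": "West", "Qty": 15},
    {"Product": "P-10", "Warehouse": "East", "Qty": 40},  # exact repeat of row 1
    {"Product": "P-20", "Warehouse": "East", "Qty": 7},
    {"Product": "P-20", "Warehouse": "East", "Qty": 9},   # same key, different Qty
]

by_all_columns = rows_after_dedup(stock, ["Product", "Warehouse", "Qty"])
by_product_and_warehouse = rows_after_dedup(stock, ["Product", "Warehouse"])
by_product_only = rows_after_dedup(stock, ["Product"])

print(by_all_columns, by_product_and_warehouse, by_product_only)
```

In this sample, comparing all columns leaves the conflicting P-20 rows in place, while comparing on product alone would also discard the valid West warehouse row.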
A practical duplicate example
Suppose you have:
| Customer ID | Customer Name | Region |
|---|---|---|
| C100 | Acme Ltd | North |
| C101 | Blue Peak | South |
| C100 | Acme Ltd | North |
If the business rule says each customer ID should appear only once, the third row is a duplicate and can be removed.
But if the sheet tracks customer activity over time, with a date column that is not shown here, the repeat might be a valid record.
That is why the duplicate rule should be clear before cleanup begins.
Using UNIQUE instead of deleting rows
Sometimes you do not want to remove duplicates from the original sheet. You just want a clean deduplicated output somewhere else.
This is where the UNIQUE function is often better.
Example:
=UNIQUE(A2:A100)
This returns a distinct list from the selected range.
Or for multiple columns:
=UNIQUE(A2:C100)
This returns unique rows from the range without changing the source data.
That makes UNIQUE very useful when:
- the raw data should remain untouched
- the result should update dynamically
- the deduplicated output is part of a report or dashboard
- you want a clean list for dropdowns or summaries
Remove duplicates versus UNIQUE
The choice comes down to whether the source data itself should change.
Built-in remove duplicates tool
Use it when:
- you want to change the source data
- duplicate rows should be deleted directly
- the cleanup is intentional and final
UNIQUE
Use it when:
- the original table should remain unchanged
- you need a dynamic distinct list
- the output belongs in another tab or summary area
- the deduplicated result should update automatically
Both methods are useful. They just solve different spreadsheet problems.
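The difference between the two approaches is the same as the difference between editing a list in place and building a new one. The Python sketch below illustrates that contrast using the customer rows from the earlier example; the function names are hypothetical, and the code models the idea rather than the behavior of Google Sheets itself.

```python
def unique_rows(rows):
    """Return a new list of distinct rows; the input list is left untouched."""
    # dict.fromkeys keeps the first occurrence of each row, in original order.
    return list(dict.fromkeys(rows))


def remove_duplicates_in_place(rows):
    """Rewrite the given list so that only first occurrences remain."""
    rows[:] = list(dict.fromkeys(rows))


source = [
    ("C100", "Acme Ltd", "North"),
    ("C101", "Blue Peak", "South"),
    ("C100", "Acme Ltd", "North"),
]

# Formula-style: a separate result, source unchanged.
distinct = unique_rows(source)

# Tool-style: the data itself is modified (done here on a copy for comparison).
working_copy = list(source)
remove_duplicates_in_place(working_copy)

print(len(source), len(distinct), len(working_copy))
```

The formula-style result can always be recomputed from the source, whereas the in-place result can only be recovered from a backup or from undo history.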
Identifying duplicates before removing them
In many cases, the safest workflow is to identify duplicates first before removing them.
This gives you a chance to review:
- whether the duplicate is real
- whether the source has formatting issues
- whether only certain columns matter
- whether some repeated rows should stay
Useful review methods include:
- sorting by the suspected key
- using COUNTIF to count repeated values
- using conditional formatting to highlight duplicates
- using UNIQUE in a comparison area
- filtering the suspect rows manually
This is especially important in finance, operations, and any shared business sheet where mistakes have consequences.
Using COUNTIF to spot duplicates
A practical duplicate-check pattern is:
=COUNTIF(A:A,A2)
This counts how many times the value in A2 appears in column A.
If the result is greater than 1, the value appears more than once.
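The same helper-column idea can be sketched in Python for readers who want to see the logic outside a sheet. The invoice IDs and the function name `count_occurrences` are hypothetical. Note one simplification: this sketch matches values exactly, whereas COUNTIF ignores letter case.

```python
from collections import Counter


def count_occurrences(values):
    """For each value, return how many times it appears in the whole column."""
    counts = Counter(values)
    return [counts[value] for value in values]


invoice_ids = ["INV-1001", "INV-1002", "INV-1001", "INV-1003"]

# One count per row, like filling the COUNTIF formula down a helper column.
helper_column = count_occurrences(invoice_ids)

# Rows whose count is greater than 1 are the ones to review.
flagged = [value for value, n in zip(invoice_ids, helper_column) if n > 1]

print(helper_column)  # [2, 1, 2, 1]
print(flagged)        # ['INV-1001', 'INV-1001']
```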
This is useful for:
- invoice IDs
- product codes
- employee numbers
- customer IDs
- tracking duplicate keys before deletion
It is a very useful review step before running full duplicate removal.
Using conditional formatting to highlight duplicates
Another helpful approach is to visually highlight repeated values before removing them.
This helps users:
- inspect the pattern
- confirm whether the duplicates are true errors
- understand how widespread the issue is
- avoid removing valid repeated records accidentally
This is especially useful in shared sheets or before a cleanup step that changes source data.
Common business use cases
Finance
Finance teams remove duplicates from:
- invoice lists
- vendor lists
- payment logs
- account tables
- imported ledger extracts
This is especially important because duplicated financial records can distort totals.
Operations
Operations teams remove duplicates from:
- issue logs
- request queues
- site lists
- resource trackers
- repeated manual-entry rows
Analytics
Analysts remove duplicates from:
- imported datasets
- customer tables
- campaign exports
- product reference tables
- report-support lists
These are everyday spreadsheet cleanup tasks.
Common duplicate-removal mistakes
Removing duplicates without defining the business key
This is the biggest mistake.
You need to know what actually counts as a duplicate before deleting anything.
Using all columns when only one field matters
Sometimes the key should be:
- invoice number
- customer ID
- SKU
- employee number
If you compare all columns instead, true key duplicates may survive because one non-key field differs.
Using only one column when the duplicate should be multi-field
Sometimes uniqueness depends on a combination such as:
- product + warehouse
- employee + month
- invoice + line item
Choosing too few columns can remove valid rows.
Ignoring hidden spaces and type mismatches
Values that look duplicated may not match perfectly because of:
- spaces
- text-number inconsistencies
- imported formatting issues
This can make cleanup seem incomplete.
Removing valid repeated activity
Some repeated rows are real records, not mistakes.
For example:
- repeated orders from the same customer
- repeated shipments of the same product
- repeated monthly entries for the same employee
Those should not be removed just because they look repeated.
Step-by-step workflow
If you want to remove duplicates safely in Google Sheets, the following process works well.
Step 1: Define what a duplicate means
Ask: Which field or fields must match for the row to count as a duplicate?
Step 2: Decide whether to delete the source rows or create a deduplicated output
If the source should change, use the built-in tool. If the source should remain, use UNIQUE.
Step 3: Review the source data
Check for:
- spaces
- formatting inconsistencies
- type mismatches
- repeated keys
- columns that may define uniqueness together
Step 4: Back up the sheet if the cleanup matters
Make a copy of the tab or file first, especially for operational or finance files.
Step 5: Apply the chosen method
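Use one of:
- the remove duplicates tool, to clean the source range in place
- UNIQUE, to build a separate deduplicated output
- COUNTIF or conditional formatting, if the data still needs review before anything is deleted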
Step 6: Validate the result
Check:
- row counts
- totals
- key metrics
- whether important records disappeared unexpectedly
This step catches valid records that were removed by mistake while the change is still easy to reverse.
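The validation checks above amount to comparing a before and after snapshot. The Python sketch below shows one way to express that comparison; `validate_cleanup`, the column names, and the sample rows are hypothetical, and in a sheet the same check would typically be done by comparing row counts and totals between the backup and the cleaned range.

```python
def validate_cleanup(before, after, amount_column):
    """Summarize what a cleanup changed so the result can be reviewed."""
    removed = list(before)
    for row in after:
        # Each kept row cancels one matching original row;
        # whatever is left over is what the cleanup deleted.
        removed.remove(row)
    return {
        "rows_before": len(before),
        "rows_after": len(after),
        "total_before": sum(row[amount_column] for row in before),
        "total_after": sum(row[amount_column] for row in after),
        "removed_rows": removed,
    }


before = [
    {"Invoice ID": 1001, "Amount": 2500},
    {"Invoice ID": 1001, "Amount": 2500},
    {"Invoice ID": 1002, "Amount": 900},
]
after = [before[0], before[2]]  # result of a cleanup that dropped the repeat

report = validate_cleanup(before, after, "Amount")
print(report)
```

Reviewing `removed_rows` line by line is the most direct way to confirm that nothing valid disappeared; a drop in the total that is larger than the known duplicates explain is a warning sign.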
Practical examples
Remove duplicate rows directly
Use the built-in remove duplicates tool on the full range, with every column selected for comparison.
Best for:
- imports
- repeated records
- direct cleanup
Remove duplicates by one key
Select the whole table so that entire rows are removed, but choose only one key field as the comparison column, such as:
- customer ID
- invoice number
- employee number
Best for:
- tables where that key should be unique
Create a distinct output list
=UNIQUE(A2:A100)
Best for:
- categories
- vendors
- departments
- customers
- regions
Create unique rows without changing source data
=UNIQUE(A2:C100)
Best for:
- report inputs
- dashboard lists
- comparison tables
- clean helper outputs
Spot duplicate keys
=COUNTIF(A:A,A2)
Best for:
- review
- audit
- flagging repeated identifiers before cleanup
When removing duplicates is the better choice
Remove duplicates directly when:
- repeated rows are clearly errors
- the source data itself should be cleaned
- the table should only keep one version of each record
- the business logic supports deletion
This is especially common in reference-table cleanup and imported list cleanup.
When a formula approach is better
Use formulas like UNIQUE when:
- the raw data should remain available
- the deduplicated result belongs in a summary area
- the output should update dynamically
- you are not ready to permanently delete anything yet
This is especially helpful in report-building workflows.
FAQ
What is the fastest way to remove duplicates in Google Sheets?
The fastest way is usually the built-in remove duplicates tool, which lets you select a range, choose the relevant columns, and remove repeated rows directly.
What is the difference between removing duplicates and using UNIQUE in Google Sheets?
Removing duplicates changes the original data directly, while UNIQUE creates a separate deduplicated result without changing the source range.
Why do duplicate-looking values still stay in Google Sheets?
Values that look duplicated may still differ because of hidden spaces, text-number mismatches, inconsistent formatting, or small source-data differences that Google Sheets treats as separate values.
Should I always remove duplicates from the source data?
No. Some repeated rows are valid records, so it is important to decide whether the duplicates are true errors or just repeated transactions, entries, or events that should remain in the dataset.
Final thoughts
Removing duplicates in Google Sheets is one of the most valuable cleanup tasks in spreadsheet work, but it only works well when the duplicate logic is clear.
The goal is not just to make repeated values disappear. The goal is to improve data quality without deleting valid information.
That means you should always define what a duplicate really is in the context of the sheet, decide whether the source should be changed or only summarized differently, and review the outcome carefully before trusting the result.
Once you approach duplicate cleanup that way, Google Sheets becomes much more reliable for reporting, dashboards, and shared spreadsheet analysis.