PostgreSQL ANALYZE Command Explained
Level: intermediate · ~13 min read · Intent: informational
Audience: backend developers, database engineers, technical teams
Prerequisites
- basic familiarity with PostgreSQL
Key takeaways
- ANALYZE does not speed up queries directly. It improves the planner's decisions by collecting table and column statistics that help PostgreSQL estimate row counts and choose better execution plans.
- In most systems, autoanalyze handles routine statistics refreshes, but manual ANALYZE still matters after major data changes, bulk loads, unusual churn, and for partitioned parent tables whose stats may not be refreshed automatically.
FAQ
- What does PostgreSQL ANALYZE actually do?
- ANALYZE samples table data and updates planner statistics so PostgreSQL can estimate row counts and choose better query plans.
- Does ANALYZE lock the table?
- ANALYZE takes a read lock that allows normal SELECT, INSERT, UPDATE, and DELETE activity to continue, so it is much less disruptive than heavier maintenance commands.
- When should I run ANALYZE manually?
- Manual ANALYZE is especially useful after bulk loads, major deletes or updates, large data distribution changes, and on partitioned parent tables if planner quality depends on parent-level statistics.
- What is the difference between VACUUM and ANALYZE?
- VACUUM reclaims dead-tuple space for reuse and supports table health, while ANALYZE refreshes statistics used by the planner. They solve different problems, though both are part of routine maintenance.
The PostgreSQL ANALYZE command is one of the most important maintenance tools that many developers only partly understand.
That is understandable because it is easy to confuse it with:
- VACUUM
- query optimization itself
- or EXPLAIN ANALYZE
Those are all different things.
ANALYZE does not directly make a slow query fast in the same way an index can.
What it does is give the PostgreSQL planner better information so it can choose a better plan in the first place.
That is a huge deal.
A lot of bad PostgreSQL performance is not caused by the database “failing.” It is caused by the planner making a reasonable decision based on outdated or incomplete statistics.
This guide explains:
- what ANALYZE does
- what statistics it updates
- how autoanalyze works
- when manual ANALYZE helps
- and how to use it without treating it like magic
What ANALYZE Does
At a practical level, ANALYZE samples table data and updates statistics that PostgreSQL stores for the planner.
The planner uses those statistics to estimate things like:
- how many rows a filter will return
- how selective a condition is
- how data values are distributed
- whether a join is likely to be small or large
- whether an index scan is better than a sequential scan
- and how expensive one plan shape is relative to another
That means ANALYZE is fundamentally about better estimates.
And better estimates usually lead to:
- better scan choices
- better join choices
- better sort and aggregation decisions
- and fewer “why is PostgreSQL doing that?” query plans
ANALYZE Does Not Execute the Query
This is one of the easiest points of confusion because of EXPLAIN ANALYZE.
ANALYZE
Collects table statistics for the planner.
EXPLAIN ANALYZE
Runs a query and shows the actual execution plan with runtime results.
They are completely different commands with different jobs.
If a developer says:
- “I ran analyze on the query,”
they often really mean:
- “I ran explain analyze.”
That distinction matters.
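The two commands look similar but do unrelated work. A minimal illustration, assuming a table named orders exists:

```sql
-- refreshes planner statistics for the table; does not execute any query
analyze orders;

-- executes the query and reports the actual plan, row counts, and timing
explain analyze select * from orders where status = 'active';
```

The first command touches only catalog statistics; the second one really runs the query, which is why you should be careful using it with writes.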
What Statistics ANALYZE Updates
PostgreSQL stores planner statistics in system catalogs such as pg_statistic.
The exact details are internal and approximate, but the planner uses them to estimate:
- number of distinct values
- fraction of nulls
- common values
- histogram-style distribution information
- and, when configured, some extended statistics across columns
This is why ANALYZE matters so much for queries with:
- selective filters
- skewed data distributions
- joins between uneven tables
- correlated columns
- and changing workloads after bulk data movement
A simple example
Suppose you have a table with a status column where:
- 99.5% of rows are archived
- 0.5% are active
That distribution matters a lot.
If PostgreSQL does not know that active is rare, it may choose a much worse plan than it would if the statistics were fresh.
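You can see the skew ANALYZE captured by querying the pg_stats view (the table and column names here are just the example above):

```sql
-- most_common_vals / most_common_freqs show the value skew ANALYZE observed
select most_common_vals, most_common_freqs
from pg_stats
where tablename = 'orders' and attname = 'status';
```

If 'active' appears with a tiny frequency in that output, the planner knows a filter on it is highly selective.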
Why Stale Statistics Cause Slow Queries
PostgreSQL does not inspect the full table contents every time it plans a query. That would be too expensive.
Instead, it relies on maintained statistics.
If those statistics are stale, the planner may misjudge:
- how many rows match the WHERE clause
- whether a nested loop or hash join is smarter
- whether an index is worthwhile
- whether sorting a result set will be cheap or expensive
Common symptoms of stale stats
- the query plan looks irrational
- estimated rows and actual rows are very far apart in EXPLAIN ANALYZE
- a query got slower after a bulk load or major delete
- an index stopped being chosen after data distribution changed
- a join suddenly behaves much worse after a large data refresh
This is why ANALYZE is not optional maintenance trivia.
It is part of how PostgreSQL stays smart.
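One quick way to check whether stale statistics are plausible is to look at when a table was last analyzed, using the standard statistics views:

```sql
-- last manual and automatic analyze times, plus rows modified since then
select relname, last_analyze, last_autoanalyze, n_mod_since_analyze
from pg_stat_user_tables
where relname = 'orders';
```

A large n_mod_since_analyze relative to the table size is a strong hint that the planner is working from an outdated picture.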
Basic ANALYZE Syntax
Analyze the whole current database
analyze;
This processes every table and materialized view in the current database that the current user has permission to analyze.
Analyze one table
analyze orders;
Analyze specific columns in a table
analyze orders (status, created_at);
This is useful when:
- you changed specific columns heavily
- or want to target important planner inputs without analyzing everything
Verbose output
analyze verbose orders;
This is helpful when you want more feedback about what PostgreSQL is doing.
When to Run ANALYZE Manually
In many systems, autovacuum and autoanalyze handle routine maintenance well enough that developers do not need to think about manual ANALYZE often.
But manual ANALYZE still matters in some very common cases.
1. After bulk loads
If you load a large amount of new data:
- from ETL
- a migration
- a backfill
- COPY
- or a restore process
run ANALYZE afterward.
Otherwise the planner may still be reasoning from statistics that describe the old table shape.
2. After major deletes or updates
If a large portion of a table changes:
- row counts shift
- value distributions shift
- and selectivity assumptions can become stale
A manual ANALYZE can help the planner adapt sooner rather than waiting for routine maintenance timing.
3. After big data distribution changes
Sometimes the row count does not change much, but the distribution does.
Examples:
- a status column shifts heavily toward one value
- a date range becomes much more concentrated
- a tenant with huge volume lands in the system
- a large archive move changes the table profile
That is exactly the kind of shift that can make planner decisions worse if stats lag behind reality.
4. On partitioned parent tables
This is a subtle but important edge case.
The PostgreSQL documentation notes that the autovacuum daemon does not issue ANALYZE for partitioned parent tables. If planner quality depends on the parent's statistics, you may need to run manual ANALYZE on those parent tables periodically.
This surprises a lot of people because they assume routine autoanalyze covers the whole hierarchy automatically.
If you are using partitioned tables and the planner reasons at the parent level, this matters.
ANALYZE vs VACUUM
These commands are often mentioned together, but they solve different problems.
ANALYZE
- updates planner statistics
- helps the optimizer choose better query plans
VACUUM
- cleans up dead tuple space for reuse
- supports table health and MVCC maintenance
- helps visibility map behavior
- helps prevent table and transaction-age problems
Why people combine them
Historically and practically, you will often see maintenance routines like:
- VACUUM ANALYZE
- or autovacuum/autoanalyze discussed together
That is because both contribute to healthy long-term PostgreSQL performance.
But they are not interchangeable.
A bad query plan caused by stale statistics may be fixed by ANALYZE even if table cleanup is not the main issue.
A bloated churn-heavy table may need vacuum attention even if the planner stats are fine.
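When a table needs both kinds of attention, the combined form handles cleanup and statistics refresh in one pass:

```sql
-- reclaims dead-tuple space and refreshes planner stats in one command
vacuum (analyze, verbose) orders;
```

This is often convenient after large churn events, but the two effects remain separate: the vacuum part addresses bloat, the analyze part addresses estimates.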
How Autoanalyze Fits In
In normal PostgreSQL operation, autoanalyze is usually triggered by the autovacuum system.
That means PostgreSQL tracks table activity and runs maintenance automatically when thresholds are crossed.
For many systems, this is enough.
Why that is good
It means you do not have to:
- schedule ANALYZE manually for every table
- or treat planner statistics as a handcrafted ritual
Why it is still not the whole story
Defaults are general-purpose. Some tables are not.
Very hot or unusual tables may need:
- more aggressive per-table tuning
- manual intervention after unusual bulk operations
- or explicit monitoring if planner behavior starts drifting
A good posture is:
- trust autoanalyze as the normal baseline
- but understand when your workload needs help
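Per-table tuning is done with storage parameters. A sketch for a hot table (the numbers are illustrative, not recommendations):

```sql
-- analyze after ~1% of rows change instead of the default 10%
alter table orders set (
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold    = 1000
);
```

Lowering the scale factor makes autoanalyze fire more often for that one table without changing the global defaults.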
What ANALYZE Does Not Do
This is worth stating clearly because many performance conversations get muddled here.
ANALYZE does not:
- rebuild indexes
- shrink the table
- reclaim disk space directly
- rewrite data
- optimize SQL syntax by itself
- or guarantee that every bad plan becomes good
It improves the planner’s information. That is powerful, but it is still only one part of query performance.
If the query is bad, or the indexes are wrong, or the app is issuing 300 queries instead of 3, ANALYZE will not fix that alone.
How to Tell If ANALYZE Helped
The best way is to compare plans and execution after refreshing stats.
Good workflow
- identify the slow query
- run EXPLAIN ANALYZE
- note:
  - estimated rows
  - actual rows
  - plan shape
- run ANALYZE on the relevant table(s)
- rerun EXPLAIN ANALYZE
- compare plan changes
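That workflow in command form, assuming a slow query against orders:

```sql
-- 1. capture the plan and note estimated vs actual rows
explain analyze select * from orders where status = 'active';

-- 2. refresh statistics for the table involved
analyze orders;

-- 3. rerun and compare: better estimates often change the plan shape
explain analyze select * from orders where status = 'active';
```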
What improvement often looks like
- row estimates move closer to reality
- the planner chooses an index scan instead of a seq scan
- a join strategy changes to something more sensible
- a bad nested loop disappears
- total execution time drops because PostgreSQL stopped doing the wrong kind of work
If nothing changes, that is useful information too. It may mean the issue is:
- query shape
- indexing
- or something else entirely
Column-Specific ANALYZE
You do not always need to analyze everything.
PostgreSQL supports analyzing only selected columns of a table.
That can be useful when:
- one or two columns drive the critical query decisions
- a specific filtered column changed heavily
- you want to limit effort while refreshing key planner inputs
Example:
analyze orders (status, account_id, created_at);
This is not something you need for every case, but it is a useful tool when you know which columns matter most to the planner.
Statistics Targets and Accuracy
PostgreSQL’s planner statistics are approximate by design. That is normal.
The docs also note that increasing the statistics target usually increases accuracy, but also increases the time and work required to compute those statistics.
That means if a column is especially important and especially tricky, you may want to raise its statistics target.
When this matters
This is often useful when:
- value distribution is skewed
- selectivity is very important
- one column heavily affects plan quality
- default stats are too coarse for your workload
Example
alter table orders alter column status set statistics 500;
analyze orders;
This is not something you should raise everywhere blindly. But for a few important columns, it can make the planner much smarter.
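To see what a column's target currently is, you can query the catalog directly. Note that attstattarget is -1 (or, in newer PostgreSQL versions, NULL) when the column falls back to the default_statistics_target setting:

```sql
-- per-column statistics targets for user columns of the table
select attname, attstattarget
from pg_attribute
where attrelid = 'orders'::regclass and attnum > 0;
```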
ANALYZE and Extended Statistics
Single-column statistics are often enough. Sometimes they are not.
If query quality depends on relationships between columns, extended statistics can help.
Examples:
- country and region are highly correlated
- tenant_id and status have non-independent behavior
- multiple columns together drive selectivity in ways one-column stats cannot express well
PostgreSQL notes that ANALYZE computes extended statistics using the same sample of rows it uses for ordinary stats.
That means ANALYZE is also part of making those multi-column planning features useful.
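Creating extended statistics is a separate DDL step; ANALYZE then populates them. A sketch using the correlated columns mentioned above (the statistics object name is arbitrary):

```sql
-- declare that these columns are related; ANALYZE fills in the numbers
create statistics orders_tenant_status (dependencies)
    on tenant_id, status from orders;

analyze orders;
```

Without that ANALYZE, the statistics object exists but carries no data for the planner to use.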
Does ANALYZE Block Normal Work?
Not in the way heavier maintenance commands do.
ANALYZE takes a read lock that still allows ordinary:
- SELECT
- INSERT
- UPDATE
- and DELETE
to continue.
That makes it much safer to run than commands like:
- VACUUM FULL
- or big table rewrites
It is still real work and should still be used thoughtfully, but it is generally normal production maintenance, not a major lock event.
Progress Reporting
PostgreSQL supports progress reporting for ANALYZE through the pg_stat_progress_analyze view.
That matters when:
- analyzing large relations
- running maintenance interactively
- or trying to understand whether a command is still moving
If you are working with large tables, this is useful operationally because it makes ANALYZE feel less like a black box.
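From another session, you can watch a running ANALYZE through the progress view:

```sql
-- one row per backend currently running ANALYZE
select pid, relid::regclass, phase,
       sample_blks_scanned, sample_blks_total
from pg_stat_progress_analyze;
```

Comparing sample_blks_scanned to sample_blks_total gives a rough sense of how far the sampling phase has progressed.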
Common Mistakes With ANALYZE
Mistake 1: Thinking it is the same as EXPLAIN ANALYZE
These are completely different tools.
Mistake 2: Assuming autovacuum always covers every case perfectly
It covers many cases well, but not all unusual workloads or partitioned parent-table needs.
Mistake 3: Running ANALYZE and expecting it to fix bad schema design
It improves planner choices. It does not replace good indexing or query design.
Mistake 4: Ignoring huge estimate gaps in EXPLAIN ANALYZE
If estimates are badly wrong, stale stats or insufficient stats may be part of the reason.
Mistake 5: Raising statistics targets everywhere
This increases work and is usually not necessary. Use higher targets deliberately for important columns.
Mistake 6: Forgetting ANALYZE after large data movement
Bulk loads and large churn events are some of the best reasons to run it manually.
Practical Examples
Example 1: After a large import
copy events from '/path/to/events.csv' csv header;
analyze events;
This is a classic pattern. Bulk load first, then refresh stats so the planner sees the new table distribution.
Example 2: After a large backfill
update orders
set status = 'archived'
where created_at < now() - interval '2 years';
analyze orders (status, created_at);
If those columns drive important queries, this helps PostgreSQL adapt sooner.
Example 3: Refresh parent-table stats in a partitioned design
analyze orders_parent;
If the parent table’s stats matter for planning, this may still be necessary even when leaf partitions are changing normally.
Conclusion
The PostgreSQL ANALYZE command is really about helping the planner make better decisions.
It works by refreshing statistics that PostgreSQL uses to estimate:
- row counts
- data distribution
- selectivity
- and relative plan cost
That means ANALYZE matters most when:
- the data changed a lot
- the distribution changed a lot
- the planner seems badly wrong
- or your workload depends on statistics that routine maintenance has not refreshed yet
For many systems, autoanalyze does most of the work.
But understanding manual ANALYZE still matters because it helps you diagnose one of the most common hidden causes of bad PostgreSQL query plans:
the planner is only as good as the statistics it has.