How to Monitor PostgreSQL Performance
Level: intermediate · ~14 min read · Intent: informational
Audience: backend developers, database engineers, technical teams
Prerequisites
- basic familiarity with PostgreSQL
Key takeaways
- The best PostgreSQL monitoring setup combines query-level visibility, live session visibility, lock and wait awareness, I/O signals, autovacuum health, and application-side request metrics.
- You do not monitor PostgreSQL well by watching CPU alone. The most useful workflow combines pg_stat_statements, pg_stat_activity, logs, I/O views, and system-level tools so you can separate slow SQL from locking, connection pressure, and infrastructure problems.
Monitoring PostgreSQL performance is not about staring at one dashboard and waiting for a single number to turn red.
That is usually how teams miss the real problem.
A PostgreSQL system can look bad for very different reasons:
- one expensive query
- too many small queries
- lock contention
- slow disk I/O
- bloated tables
- stale statistics
- long-running transactions
- connection pool pressure
- or an application deployment that quietly changed query behavior
If you monitor only CPU, memory, or average response time, you will often see symptoms without seeing causes.
Good PostgreSQL monitoring is about separating those causes clearly enough that the fix becomes obvious.
What Good PostgreSQL Monitoring Actually Means
A useful monitoring setup should help you answer six questions quickly:
- What queries are expensive over time?
- What is running right now?
- Are sessions waiting or blocked?
- Is the system limited by CPU, memory, or disk I/O?
- Is routine maintenance keeping up?
- Is the application making the database work harder than it should?
If your monitoring cannot answer those questions, it is probably too shallow.
The Best Monitoring Stack for PostgreSQL
A strong practical PostgreSQL monitoring stack usually has these layers:
Database-native visibility
- pg_stat_statements
- pg_stat_activity
- cumulative statistics views
- lock and wait visibility
- progress views
- replication and WAL views where relevant
Logging
- slow-query logging
- lock wait logging where useful
- autovacuum logs when troubleshooting
- optionally auto_explain for slow plans
System metrics
- CPU
- RAM
- disk usage
- disk latency
- IOPS
- load averages
- network where relevant
Application metrics
- request latency
- query count per request
- pool saturation
- slow endpoints
- background job timing
- deploy correlation
That mix is much more powerful than watching any one layer in isolation.
1. Start With pg_stat_statements
If you only enable one PostgreSQL performance extension, this is usually the one to start with.
pg_stat_statements gives you aggregated query statistics over time. That makes it ideal for answering:
- which statements consume the most total time
- which statements are called most often
- which statements have the worst average latency
- and which query shapes quietly dominate the workload
Why this matters
A query does not need to be individually dramatic to be expensive.
Sometimes the real problem is a 6 ms query called hundreds of thousands of times rather than one obviously slow 2-second report query.
That is why pg_stat_statements should be part of the default monitoring story.
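pg_stat_statements ships with PostgreSQL but is not active by default. A typical setup, assuming you can edit postgresql.conf and restart the server, looks like:

```sql
-- In postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_stat_statements'

-- Then, in each database you want to inspect:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```

After that, the view starts accumulating statistics for new statements; stats can be reset with pg_stat_statements_reset() when you want a clean measurement window.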
What to watch
Track:
- calls
- total_exec_time
- mean_exec_time
- rows returned
- planning time where available
- query shapes that spike after application changes
Example query
select
query,
calls,
total_exec_time,
mean_exec_time,
rows
from pg_stat_statements
order by total_exec_time desc
limit 20;
Good habit
Monitor both:
- highest total execution time
- and highest average execution time
Those are not always the same problem.
2. Watch pg_stat_activity for Live Visibility
pg_stat_statements tells you what has been expensive over time.
pg_stat_activity tells you what is happening right now.
This is one of the most useful views for real-time diagnosis.
It helps answer:
- what queries are currently active
- how long they have been running
- whether sessions are waiting
- whether transactions are stuck open
- what application is issuing the query
Example query
select
pid,
usename,
application_name,
state,
wait_event_type,
wait_event,
now() - query_start as runtime,
query
from pg_stat_activity
where state <> 'idle'
order by query_start asc;
What to watch
Look for:
- long-running active queries
- repeated copies of the same statement
- sessions stuck in idle in transaction
- waits on locks or I/O
- missing or unhelpful application_name values
Practical tip
Set application_name from your app services.
It makes production troubleshooting much easier because you can quickly see:
- which API
- worker
- background job
- or migration process
is responsible for the load.
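How you set application_name depends on your driver. Two common options are an explicit SET after connecting or a connection-string parameter; the service name below is purely illustrative:

```sql
-- Per session, right after connecting:
SET application_name = 'billing-worker';

-- Or via a libpq-style connection string:
-- postgresql://app@db-host/appdb?application_name=billing-worker
```

Either way, the value then shows up in pg_stat_activity and in log_line_prefix output.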
3. Monitor Locks and Waits, Not Only Query Time
A query can be slow because it is expensive. It can also be slow because it is waiting.
That distinction matters.
If the real issue is lock contention, adding memory or indexes may not help at all.
Common symptoms of wait-driven slowness
- requests hang unpredictably
- CPU is not maxed out
- sessions pile up behind one blocker
- latency spikes happen in bursts, not gradually
- one migration or batch job seems to trigger unrelated slowness
What to watch
Use:
- pg_stat_activity
- lock views
- blocked-process queries
- log-based lock wait visibility if enabled
Useful investigation question
When something is slow, ask:
- is the query working, or is it waiting?
That one question often cuts the debugging time dramatically.
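One way to answer that question is to ask PostgreSQL directly which sessions are blocked and who is blocking them, using pg_blocking_pids() (available since PostgreSQL 9.6). A sketch:

```sql
select
  blocked.pid                 as blocked_pid,
  blocked.query               as blocked_query,
  blocking.pid                as blocking_pid,
  blocking.query              as blocking_query,
  now() - blocked.query_start as blocked_for
from pg_stat_activity blocked
join pg_stat_activity blocking
  on blocking.pid = any(pg_blocking_pids(blocked.pid))
where cardinality(pg_blocking_pids(blocked.pid)) > 0;
```

An empty result means nothing is currently blocked at the lock level, which points the investigation toward query cost or I/O instead.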
4. Monitor Slow Queries in Logs
Query stats views are powerful, but logs still matter.
They help you correlate:
- time window
- application release
- user action
- or incident period
with actual slow statements.
Useful settings to consider
A common starting point is:
- log_min_duration_statement
- possibly log_lock_waits
- a thoughtful log_line_prefix
- structured log ingestion if you have centralized logging
Why logs help
They are especially useful when:
- the issue is intermittent
- you need time-based evidence
- you want to correlate slow statements with deploys
- you are debugging something that happened earlier
Practical caution
Set thresholds deliberately. Too much logging can create noise and overhead. Too little logging leaves you blind during incidents.
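A hedged starting point for postgresql.conf might look like this; the 500 ms threshold is an arbitrary example and should be tuned to your workload:

```ini
# Log any statement running longer than 500 ms
log_min_duration_statement = 500

# Log when a session waits longer than deadlock_timeout for a lock
log_lock_waits = on

# Include timestamp, pid, user, database, and application_name
log_line_prefix = '%m [%p] %u@%d app=%a '
```

All three settings can be changed with a configuration reload; no restart is required.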
5. Use auto_explain When Slow Plans Are Hard to Reproduce
Some slow queries do not show up cleanly in manual testing.
They only happen:
- under real concurrency
- with production data shape
- or inside complex application paths
That is where auto_explain can help.
It allows PostgreSQL to log execution plans for slow statements automatically.
Why this matters
It is especially helpful when:
- the application is large
- a slow statement is hard to isolate by hand
- you need the plan without manually rerunning the exact workload
- you want plan evidence attached to slow-query events
When to use it
Use it carefully and intentionally, especially in production. The goal is useful evidence, not a flood of plan noise.
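auto_explain is loaded as a library rather than installed as an extension. A cautious configuration, with illustrative thresholds, might be:

```ini
# In postgresql.conf (or load per-session via session_preload_libraries)
shared_preload_libraries = 'auto_explain'

# Only log plans for statements slower than 1 second
auto_explain.log_min_duration = '1s'

# Include actual row counts and timing; this adds overhead, enable deliberately
auto_explain.log_analyze = on
auto_explain.log_buffers = on
```

Raising log_min_duration is the main lever for keeping plan output rare enough to stay readable.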
6. Monitor I/O, Not Just Query Counts
Some PostgreSQL problems are really I/O problems in disguise.
A system can look like it has “slow queries” when the deeper issue is:
- disk latency
- random read pressure
- poor cache hit behavior
- temporary file spill
- WAL pressure
- checkpoint behavior
- or large scans fighting for storage bandwidth
PostgreSQL’s monitoring docs now include pg_stat_io, which exposes cluster-wide I/O statistics by backend type, object, and context. That gives you more direct visibility into relation and WAL I/O behavior than older monitoring setups had. (postgresql.org)
What to watch
- read and write pressure
- temp file creation
- relation versus WAL activity
- I/O timing if enabled
- OS disk latency and utilization
- checkpoint and background write behavior
Important PostgreSQL setting
track_io_timing is off by default because it can add overhead, but when enabled it lets PostgreSQL expose database I/O timing in places like pg_stat_database, pg_stat_io, EXPLAIN with BUFFERS, and pg_stat_statements. (postgresql.org)
That makes it extremely useful during real investigations.
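With track_io_timing enabled, a quick look at where I/O is going (assuming PostgreSQL 16 or later, where pg_stat_io is available) could be:

```sql
-- Enable timing collection (small overhead; superuser, no restart needed)
ALTER SYSTEM SET track_io_timing = on;
SELECT pg_reload_conf();

-- Which backend types, objects, and contexts do the most I/O?
select backend_type, object, context,
       reads, read_time, writes, write_time
from pg_stat_io
where coalesce(reads, 0) + coalesce(writes, 0) > 0
order by coalesce(reads, 0) + coalesce(writes, 0) desc
limit 10;
```

The read_time and write_time columns stay at zero unless track_io_timing is on, which is why the two steps belong together.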
Also watch the operating system
The PostgreSQL monitoring chapter explicitly says not to neglect standard Unix tools like ps, top, iostat, and vmstat. (postgresql.org)
That matters because not every database symptom is visible from SQL views alone.
7. Track Table and Index Health
A database can degrade over time even if no single query changed.
That often happens because of:
- table growth
- index growth
- dead rows
- bloat
- changing scan patterns
- stale usage assumptions
What to monitor
- table size growth
- index size growth
- table scan frequency
- index scan frequency
- dead tuple accumulation
- autovacuum frequency and lag
- sequential scan trends on big tables
This helps you catch cases where a query is getting slower not because the SQL changed, but because the table underneath it is now much larger or less healthy.
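Most of those signals live in pg_stat_user_tables. A rough health sweep, sorted by dead tuple count, might look like:

```sql
select relname,
       pg_size_pretty(pg_total_relation_size(relid)) as total_size,
       n_live_tup,
       n_dead_tup,
       seq_scan,
       idx_scan,
       last_autovacuum,
       last_autoanalyze
from pg_stat_user_tables
order by n_dead_tup desc
limit 20;
```

Tables that combine large size, high seq_scan counts, and stale last_autovacuum timestamps are usually the first places to look.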
8. Monitor Autovacuum and Analyze Health
PostgreSQL performance depends on maintenance keeping up with the workload.
If autovacuum or analyze behavior falls behind, you can see:
- more dead rows
- worse planner choices
- larger tables
- weaker visibility information
- more inconsistent latency over time
What to watch
- hot write-heavy tables
- dead tuple accumulation
- autovacuum frequency
- long-running transactions that delay cleanup
- autoanalyze lag after major data changes
Why this matters
A performance issue is sometimes not “this query is bad.” It is:
- “this table has become harder to query because maintenance is lagging.”
That distinction changes the fix completely.
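Long-running transactions are a common reason maintenance lags, because vacuum cannot remove rows that an old transaction might still need to see. A quick check for old open transactions:

```sql
select pid,
       usename,
       state,
       now() - xact_start as transaction_age,
       query
from pg_stat_activity
where xact_start is not null
order by xact_start asc
limit 10;
```

A session that has held a transaction open for hours, especially one in the idle in transaction state, is worth investigating even if its query looks harmless.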
9. Watch Connection Pressure and Pool Saturation
A lot of teams monitor queries carefully but barely monitor connections.
That is a mistake.
PostgreSQL is not designed for unlimited connection growth. A busy app can become unstable because of:
- too many client sessions
- bursty connection creation
- pool exhaustion
- idle but expensive sessions
- serverless fan-out patterns
- or background workers that never release connections properly
What to monitor
- total connections
- active versus idle sessions
- idle-in-transaction sessions
- connection pool size and wait time
- connection acquisition latency in the application
- database-side session growth after deploys
Good rule
Monitor connection behavior as a first-class performance concern, not a side note.
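A simple session-state breakdown from pg_stat_activity, compared against max_connections, gives a fast read on connection pressure:

```sql
select state, count(*) as sessions
from pg_stat_activity
group by state
order by sessions desc;

-- Compare against the configured ceiling:
show max_connections;
```

If the total creeps toward max_connections, or the idle in transaction count grows, the problem is usually in the application or pooler rather than in any single query.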
10. Monitor Replication and WAL If You Use Them
If your system uses read replicas or streaming replication, performance monitoring must include replication health.
Otherwise you may miss problems like:
- growing lag
- slow replay
- WAL backlog
- replication conflicts
- or standby performance that looks fine until traffic depends on it
What to watch
- replication lag
- WAL generation rate
- replica apply delay
- replication slot behavior if used
- network issues between primary and standby
For teams with replicas, replication health is not separate from performance. It is part of the user experience.
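On the primary, pg_stat_replication exposes per-standby lag: replay_lag reports a time-based view (PostgreSQL 10+), while pg_wal_lsn_diff gives a byte-based one.

```sql
select application_name,
       client_addr,
       state,
       replay_lag,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) as replay_lag_bytes
from pg_stat_replication;
```

A standby that disappears from this view entirely is at least as important a signal as one that shows growing lag.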
11. Monitor From the Application Side Too
This is one of the most important and most ignored parts of PostgreSQL monitoring.
The database may be healthy and still look bad because the application is:
- issuing too many queries
- doing N+1 lookups
- leaving transactions open
- creating bursty traffic patterns
- retrying aggressively
- or shipping a slow handler after a deploy
Application metrics that matter
- request latency
- query count per request
- DB time per endpoint
- transaction duration
- pool wait time
- top slow handlers
- background job duration and batch size
- deploy markers in dashboards
A lot of “database monitoring” becomes much more useful once you connect it to:
- endpoint names
- worker names
- release times
- and request patterns
12. Use Alerting for Sustained Problems, Not Every Spike
A noisy alerting setup makes PostgreSQL monitoring worse, not better.
Good alerting should focus on:
- sustained elevated query time
- lock waits that persist
- connection saturation
- disk or WAL pressure
- replication lag
- autovacuum failure or serious lag
- unusually high query volume shifts
- abnormal temp file growth or I/O timing changes
Bad alerting example
- every single slow query
- every CPU spike
- every short-lived replication blip
- every checkpoint event
Better alerting example
- “mean query time above threshold for 10 minutes”
- “active sessions waiting on locks above threshold”
- “pool wait time sustained above threshold”
- “replication lag sustained beyond acceptable window”
Monitoring should help you intervene intelligently, not constantly.
13. A Good PostgreSQL Monitoring Dashboard Should Answer These Questions
A useful dashboard should help you answer, quickly:
Query layer
- Which queries are most expensive?
- Which queries are called most often?
- What changed after the last release?
Live activity layer
- What is running now?
- Is anything blocked?
- Are any sessions stuck in transaction?
Resource layer
- Is the database CPU-bound?
- Is it memory-pressured?
- Is the storage layer becoming the limit?
Maintenance layer
- Is autovacuum keeping up?
- Are stats fresh enough?
- Are table and index sizes growing normally?
Application layer
- Which service or endpoint is driving load?
- Are request query counts changing?
- Is the pool saturated?
If your dashboard cannot answer those questions, it is probably showing too many pretty metrics and not enough decision-making information.
14. A Practical Monitoring Baseline for Most Teams
If you want a simple starting point, use this baseline:
Must-have
- pg_stat_statements
- pg_stat_activity
- basic lock visibility
- slow-query logging
- connection count and pool monitoring
- CPU, memory, disk, and I/O metrics
- request latency and DB time from the app
Very useful
- pg_stat_io
- autovacuum and dead tuple monitoring
- replication metrics if relevant
- auto_explain for harder cases
- release markers on dashboards
Nice to have
- query plan sampling
- per-table growth dashboards
- temp file tracking
- WAL and checkpoint trend visibility
- per-service application_name breakdowns
This is enough to monitor PostgreSQL well without building a giant observability project on day one.
Common PostgreSQL Monitoring Mistakes
Mistake 1: Watching only CPU
This tells you far less than people think.
Mistake 2: Looking only at current activity
You also need history and aggregation.
Mistake 3: Looking only at history
You also need live session visibility.
Mistake 4: Not tracking waits and locks
A blocked query is not the same as an expensive query.
Mistake 5: Ignoring the application layer
Many database problems start in handler behavior, not in PostgreSQL configuration.
Mistake 6: Not enabling useful extensions
Without tools like pg_stat_statements, you lose one of the best visibility layers.
Mistake 7: Over-alerting
A noisy monitoring setup makes signal harder to see.
FAQ
What is the best way to monitor PostgreSQL performance?
For most systems, the best baseline is pg_stat_statements for historical query cost, pg_stat_activity for live query and wait visibility, slow-query logging, lock monitoring, autovacuum visibility, and basic system metrics like CPU, memory, disk, and I/O.
Should I use pg_stat_statements or logs?
Usually both. pg_stat_statements is better for aggregated query cost over time, while logs are better for seeing slow statements in specific time windows and correlating them with incidents.
What should I alert on in PostgreSQL?
Good alert candidates include sustained slow query time, replication lag where relevant, lock waits, connection saturation, autovacuum problems, abnormal I/O latency, disk pressure, and unusual query volume spikes.
Can I monitor PostgreSQL only from the database side?
Not well. Good PostgreSQL monitoring also needs operating system and application context, because some database symptoms are actually caused by CPU pressure, storage latency, noisy query patterns, or poor application behavior.
Conclusion
Monitoring PostgreSQL performance well is really about building enough visibility to classify problems correctly.
That means knowing:
- what queries are expensive over time
- what is running right now
- what is blocked or waiting
- what the storage layer is doing
- whether maintenance is healthy
- and how the application is shaping database load
That is why the best PostgreSQL monitoring setup is not one metric or one view.
It is a practical combination of:
- pg_stat_statements
- pg_stat_activity
- logs
- I/O visibility
- lock awareness
- maintenance signals
- OS metrics
- and application metrics
Once those pieces are in place, PostgreSQL performance work gets much easier because the system stops feeling mysterious and starts feeling observable.