Why is My PostgreSQL `SELECT DISTINCT` So Slow, and How Can I Fix It?-Mysql Tutorial-php.cn

Why is My PostgreSQL `SELECT DISTINCT` So Slow, and How Can I Fix It?

Linda Hamilton

Release： 2025-01-07 18:39:40

Original

981 people have browsed it

Why is My PostgreSQL `SELECT DISTINCT` So Slow, and How Can I Fix It?

PostgreSQL SELECT DISTINCT Performance Bottleneck: A Case Study

This article examines performance issues encountered with a SELECT DISTINCT query on a PostgreSQL table containing approximately two million records.

Context

The tickers table stores data from Coinbase Pro's "ticker" channel. A composite primary key includes the product_id column.

Performance Problem

The query SELECT DISTINCT product_id FROM tickers was anticipated to perform well due to the index on product_id. However, execution consistently took 500-600 milliseconds.

Query Plan Investigation

EXPLAIN ANALYZE showed the query planner defaulted to a sequential scan, ignoring the product_id index. Forcing index use didn't resolve the performance problem.

Index Optimization Attempts

Creating a dedicated index on product_id yielded only minor improvements, with sequential scans still preferred by the planner unless explicitly overridden.

Effective Solution: Index Skip Scan Emulation

The solution implemented emulates index skip scans using recursive queries with lateral joins. This approach significantly improved performance, reducing execution time to 0.75 milliseconds even with a dataset of 2.25 million rows.

Summary

PostgreSQL's current lack of native index skip scan capabilities is addressed by this emulation technique. This method effectively utilizes existing indexes and avoids the performance penalty of sequential scans for SELECT DISTINCT queries on large tables.

The above is the detailed content of Why is My PostgreSQL `SELECT DISTINCT` So Slow, and How Can I Fix It?. For more information, please follow other related articles on the PHP Chinese website!