
Why is SELECT DISTINCT Slow on a Table with a Composite Primary Key in PostgreSQL, and How Can It Be Optimized?

Patricia Arquette
Release: 2025-01-07 18:27:40

Why SELECT DISTINCT is slow on PostgreSQL tables with composite primary keys, and how to optimize it


In PostgreSQL, the speed of a SELECT DISTINCT query depends on the table structure and the data distribution. Although the product_id column in the tickers table is part of a composite primary key and is therefore covered by the primary key index, the query SELECT DISTINCT product_id FROM tickers performs a sequential scan by default.

Reasons for slow performance

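The main reason SELECT DISTINCT product_id is slow is that the table holds many duplicate values of product_id. PostgreSQL cannot jump from one distinct value to the next, so it has to read every row (or every index entry) and then discard the duplicates. The cost therefore grows with the total number of rows rather than with the number of distinct product_id values.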
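
To make the cost concrete, here is a minimal Python sketch of deduplication by reading every row. It is a conceptual model only, not PostgreSQL's executor. The sample product names, the second key column `seq`, and the function `distinct_by_full_scan` are assumptions made up for the illustration.

```python
def distinct_by_full_scan(rows):
    """Return the distinct product_ids and the number of rows that were read."""
    seen = set()
    rows_read = 0
    for product_id, _seq in rows:
        rows_read += 1          # every row is visited, duplicate or not
        seen.add(product_id)    # duplicates are discarded only after being read
    return sorted(seen), rows_read


# Hypothetical data: 3 products with 1000 rows each (product_id, seq).
rows = [(pid, seq) for pid in ("BTC-USD", "ETH-USD", "LTC-USD") for seq in range(1000)]

ids, rows_read = distinct_by_full_scan(rows)
print(ids, rows_read)
```

In this model the work is proportional to the 3000 rows, even though only 3 values come back.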

Solution: simulate index skip scan

Because PostgreSQL does not yet natively use an index skip scan (also called a loose index scan) for DISTINCT queries, you can simulate one with a recursive CTE (common table expression). The CTE starts at the smallest product_id and then repeatedly looks up the next larger value, so each distinct product_id costs one index lookup instead of a pass over all of its duplicates.

Improved solution

<code class="language-sql">WITH RECURSIVE cte AS (
   (   -- parentheses required
   SELECT product_id
   FROM   tickers
   ORDER  BY 1
   LIMIT  1
   )
   UNION ALL
   SELECT l.*
   FROM   cte c
   CROSS  JOIN LATERAL (
      SELECT product_id
      FROM   tickers t
      WHERE  t.product_id > c.product_id  -- lateral reference
      ORDER  BY 1
      LIMIT  1
      ) l
   )
TABLE  cte;</code>

This query uses a LATERAL join to step through product_id in sorted order: each iteration fetches the smallest product_id greater than the previous one, and the recursion stops when no larger value exists.
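
The same stepping logic can be sketched in Python, with a sorted list standing in for the B-tree index and a binary search standing in for each index lookup. This is an illustration under those assumptions, not how PostgreSQL executes the CTE. The sample data and the name `distinct_by_skip_scan` are hypothetical.

```python
from bisect import bisect_right


def distinct_by_skip_scan(sorted_ids):
    """Collect distinct values from a sorted list using one probe per distinct value."""
    result = []
    probes = 0
    pos = 0  # smallest entry, like the first SELECT ... ORDER BY 1 LIMIT 1
    while pos < len(sorted_ids):
        current = sorted_ids[pos]
        result.append(current)
        # Jump past all duplicates of `current` to the first larger entry,
        # like WHERE t.product_id > c.product_id ORDER BY 1 LIMIT 1.
        pos = bisect_right(sorted_ids, current, lo=pos)
        probes += 1
    return result, probes


# Hypothetical index contents: 3 products with 1000 entries each, in sorted order.
sorted_ids = sorted(pid for pid in ("BTC-USD", "ETH-USD", "LTC-USD") for _ in range(1000))

ids, probes = distinct_by_skip_scan(sorted_ids)
print(ids, probes)
```

The loop ends when the search lands past the last entry, which corresponds to the lateral subquery returning no row. The number of probes tracks the number of distinct values rather than the number of rows.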

Conclusion

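The execution time of a SELECT DISTINCT product_id query can be significantly improved by simulating an index skip scan with the recursive CTE method, thereby reducing the time required to retrieve the unique product_id values from the tickers table. The gain is largest when the table has few distinct product_id values relative to its total number of rows.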


source:php.cn