How Can I Efficiently Count Previous Rows Within a Time Range in PostgreSQL?-Mysql Tutorial-php.cn

How Can I Efficiently Count Previous Rows Within a Time Range in PostgreSQL?

Susan Sarandon

Release： 2024-12-23 08:13:10

Original

250 people have browsed it

How Can I Efficiently Count Previous Rows Within a Time Range in PostgreSQL?

Count Previous Rows Within Range

Many real-world applications require determining the count of preceding records within a specific time range. This article explores different approaches to achieving this in PostgreSQL, especially when dealing with large datasets and dynamic time ranges.

Window Function Approach (Postgres 11 )

In Postgres 11 and later, the RANGE framing option of window functions enables a straightforward solution:

SELECT id, ts,
       COUNT(*) OVER (ORDER BY ts RANGE '1 hour' PRECEDING EXCLUDE CURRENT ROW) AS ct
FROM test
ORDER BY ts;

Copy after login

CTE, Array Aggregation, and Counting (Postgres 10 and Older)

Despite its performance limitations, Roman's CTE-based solution remains an option:

SELECT id, ts,
       (SELECT COUNT(*)::int - 1
        FROM   unnest(dates) x
        WHERE  x >= sub.ts - interval '1h') AS ct
FROM (
   SELECT id, ts, array_agg(ts) OVER (ORDER BY ts) AS dates
   FROM   test
   ) sub;

Copy after login

Correlated Subquery Approach

The correlated subquery method offers superior performance:

SELECT id, ts,
       (SELECT COUNT(*)
        FROM   test t1
        WHERE  t1.ts >= t.ts - interval '1h'
        AND    t1.ts < t.ts) AS ct
FROM   test t
ORDER BY ts;

Copy after login

PL/pgSQL Function with Cursor (Postgres 9.1 and Newer)

For optimal performance, especially in scenarios with dynamic time ranges, a PL/pgSQL function combined with a cursor can be employed:

CREATE FUNCTION running_window_ct(_intv interval = '1 hour')
RETURNS TABLE (id bigint, ts timestamp, ct int)
LANGUAGE plpgsql AS
$func$
.....
$func$;

Copy after login

SELECT * FROM running_window_ct();

Copy after login

Benchmark Results

Benchmarking these approaches on a dataset of 100,000 rows demonstrates the superiority of the PL/pgSQL function for scalability and performance:

100 rows:
ROM: 27.656 ms
ARR: 7.834 ms
COR: 5.488 ms
FNC: 1.115 ms

1000 rows:
ROM: 2116.029 ms
ARR: 189.679 ms
COR: 65.802 ms
FNC: 8.466 ms

100000 rows:
ROM: DNF
ARR: DNF
COR: 6760 ms
FNC: 828 ms

Copy after login

The above is the detailed content of How Can I Efficiently Count Previous Rows Within a Time Range in PostgreSQL?. For more information, please follow other related articles on the PHP Chinese website!