How to Efficiently Count Preceding Rows Within a Time Range in PostgreSQL?-Mysql Tutorial-php.cn

How to Efficiently Count Preceding Rows Within a Time Range in PostgreSQL?

Linda Hamilton

Release： 2024-12-27 07:27:12

Original

492 people have browsed it

How to Efficiently Count Preceding Rows Within a Time Range in PostgreSQL?

Counting Previous Rows within a Range

Problem Statement:

Determine the total count of preceding records within a defined time range for each row in a table.

Specific Scenario:

Query:

SELECT id, date
     , count(*) OVER (HAVING previous_rows.date >= (date - '1 hour'::interval))  -- ?
FROM test;

Copy after login

Table:

CREATE TABLE test (
  id  bigint
, ts  timestamp
);

Copy after login

Postgres 11 or Newer:

Postgres 11 introduced improved window function framing options, allowing the use of RANGE mode with PRECEDING and FOLLOWING to select rows within a specified offset.

SELECT id, ts
     , count(*) OVER (ORDER BY ts RANGE '1 hour' PRECEDING EXCLUDE CURRENT ROW)
FROM   test
ORDER  BY ts;

Copy after login

Postgres 10 or Older:

ROM (Roman's query):

SELECT id, ts
     , (SELECT count(*)::int - 1
        FROM   unnest(dates) x
        WHERE  x >= sub.ts - interval '1h') AS ct
FROM (
   SELECT id, ts
        , array_agg(ts) OVER(ORDER BY ts) AS dates
   FROM   test
   ) sub;

Copy after login

ARR (count array elements):

SELECT id, ts
     , (SELECT count(*)
        FROM   test t1
        WHERE  t1.ts >= t.ts - interval '1h'
        AND    t1.ts < t.ts) AS ct
FROM   test t
ORDER  BY ts;

Copy after login

COR (correlated subquery):

CREATE OR REPLACE FUNCTION running_window_ct(_intv interval = '1 hour')
  RETURNS TABLE (id bigint, ts timestamp, ct int)
  LANGUAGE plpgsql AS
$func$
DECLARE
   cur   CURSOR FOR
         SELECT t.ts + _intv AS ts1
              , row_number() OVER (ORDER BY t.ts ROWS UNBOUNDED PRECEDING) AS rn
         FROM   test t
         ORDER  BY t.ts;
   rec   record;
   rn    int;
BEGIN
   OPEN cur;
   FETCH cur INTO rec;
   ct := -1;  -- init

   FOR id, ts, rn IN
      SELECT t.id, t.ts
           , row_number() OVER (ORDER BY t.ts ROWS UNBOUNDED PRECEDING)
      FROM   test t ORDER BY t.ts
   LOOP
      IF rec.ts1 >= ts THEN
         ct := ct + 1;
      ELSE
         LOOP
            FETCH cur INTO rec;
            EXIT WHEN rec.ts1 >= ts;
         END LOOP;
         ct := rn - rec.rn;
      END IF;

      RETURN NEXT;
   END LOOP;
END
$func$;

Copy after login

Call the function:

SELECT * FROM running_window_ct();

Copy after login

Benchmark Results:

A benchmark using a table with varying row counts showed that the FNC function is the clear victor in terms of performance and scalability.

The above is the detailed content of How to Efficiently Count Preceding Rows Within a Time Range in PostgreSQL?. For more information, please follow other related articles on the PHP Chinese website!