Query Optimization for Multiple Function Calls on Array Arguments
In Postgres, it can be inefficient to call a function multiple times on an array argument, especially when the function returns multiple columns. Let's explore a query optimization strategy to address this challenge.
Consider a function foo that processes an array of rows with a given parameter and returns a set of rows plus a new column:
CREATE OR REPLACE FUNCTION foo(data data[], parameter int) RETURNS SETOF enhanceddata AS ...
Initially, the function works on a single set of data using:
SELECT * FROM foo( (SELECT ARRAY_AGG(data) FROM datatable GROUP BY dataid WHERE dataid = something), 1)
However, the goal is to make it work for multiple groups of data without specifying a dataid parameter.
One approach involves using a subquery to aggregate the data into an array and then pass it to the foo function:
SELECT dataid, (foo(ARRAY_AGG(data)),1).* FROM dataset WHERE dataid = something -- only testing on 1 GROUP BY dataid
While this may seem logical, it suffers from the issue of calling foo multiple times, one for each data row.
Using Lateral Join for Optimization
To optimize this query, the PostgreSQl LATERAL JOIN can be employed. This powerful technique creates a row-based Cartesian product between the results of a subquery and the rows of another table. In this case, the subquery aggregates the data into an array, and the lateral join executes foo once for each row in the aggregated array.
Using PostgreSQL 9.3 or later, the following query optimizes the multiple function calls:
SELECT sub.dataid, f.* FROM ( SELECT dataid, array_agg(data) AS arr FROM dataset WHERE dataid = something GROUP BY 1 ) sub LEFT JOIN LATERAL foo(sub.arr) f ON true;
The LEFT JOIN LATERAL ensures that all rows from the left table (subquery) are preserved, even if no rows are returned by foo. If the function can return no rows, this is the preferred syntax.
For functions that always return a result, the simplified syntax can be used:
CROSS JOIN LATERAL foo(sub.arr)
or its equivalent shorthand:
, foo(sub.arr)
As highlighted in the Postgres manual, using LATERAL JOIN optimizes queries with set-returning functions like foo. This technique efficiently processes multiple groups of data without incurring unnecessary function evaluations.
The above is the detailed content of How Can PostgreSQL's LATERAL JOIN Optimize Multiple Function Calls on Array Arguments?. For more information, please follow other related articles on the PHP Chinese website!