在PostgreSQL中使用子查询分组和聚合连续数值
处理包含数值型数据的表时,通常需要根据数值的顺序对这些值进行分组和聚合。在PostgreSQL 9.0及以上版本中,可以使用子查询的组合来实现此目的。
识别非连续值
第一步是识别目标数值字段中的非连续值。这可以通过一个子查询来完成,该子查询为每一行分配一个组计数,当出现非连续值时,计数器将重置。
<code class="language-sql">SELECT company, profession, year, CASE WHEN row_number() OVER (PARTITION BY company, profession ORDER BY year) = 1 OR year - lag(year, 1, year) OVER (PARTITION BY company, profession ORDER BY year) > 1 THEN 1 ELSE 0 END AS group_cnt FROM qualification;</code>
定义组ID
接下来,我们为每个连续的数值序列创建组ID。这涉及到另一个子查询,它对来自前一个子查询的组计数进行求和。
<code class="language-sql">SELECT company, profession, year, SUM(group_cnt) OVER (ORDER BY company, profession, year) AS group_nr FROM ( SELECT company, profession, year, CASE WHEN row_number() OVER (PARTITION BY company, profession ORDER BY year) = 1 OR year - lag(year, 1, year) OVER (PARTITION BY company, profession ORDER BY year) > 1 THEN 1 ELSE 0 END AS group_cnt FROM qualification ) t1;</code>
最终聚合
最后,我们使用组ID对每个组的数值进行聚合。
<code class="language-sql">SELECT company, profession, array_agg(year) AS years FROM ( SELECT company, profession, year, SUM(group_cnt) OVER (ORDER BY company, profession, year) AS group_nr FROM ( SELECT company, profession, year, CASE WHEN row_number() OVER (PARTITION BY company, profession ORDER BY year) = 1 OR year - lag(year, 1, year) OVER (PARTITION BY company, profession ORDER BY year) > 1 THEN 1 ELSE 0 END AS group_cnt FROM qualification ) t1 ) t2 GROUP BY company, profession, group_nr ORDER BY company, profession, group_nr;</code>
此查询将返回一个表,其中包含唯一的公司和职业组合,年份值根据其数值顺序分组到数组中。 该方法有效地将连续的数值序列分组,便于后续分析和处理。
以上是如何使用子查询对 PostgreSQL 中的连续数值进行分组和聚合?的详细内容。更多信息请关注PHP中文网其他相关文章!