Why Do Windowed Aggregate Functions Cause Such High Logical Reads in SQL Server?
When an execution plan uses common subexpression spools, the reported logical reads are heavily inflated for larger tables. Experimenting and inspecting the execution plan suggests that the following formula holds:
Worktable logical reads = 1 + NumberOfRows * 2 + NumberOfGroups * 4
However, the underlying reason for this formula remains unclear. This article aims to unravel the mystery behind this logical read calculation.
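To make the formula concrete, here is a minimal Python sketch that simply evaluates it, assuming it reads 1 + NumberOfRows * 2 + NumberOfGroups * 4. The function name `worktable_logical_reads` and the sample row and group counts are illustrative assumptions, not SQL Server output.

```python
def worktable_logical_reads(number_of_rows: int, number_of_groups: int) -> int:
    """Evaluate the empirical formula: 1 + NumberOfRows * 2 + NumberOfGroups * 4.

    This only reproduces the arithmetic; it does not query SQL Server.
    """
    if number_of_rows < 0 or number_of_groups < 0:
        raise ValueError("row and group counts must be non-negative")
    return 1 + number_of_rows * 2 + number_of_groups * 4


# Hypothetical inputs: 3 rows in a single partition.
print(worktable_logical_reads(3, 1))  # 11

# Hypothetical inputs: a larger table with many partitions.
print(worktable_logical_reads(1_000_000, 50_000))  # 2200001
```

The row term dominates for large tables, which is why the reported number grows so quickly with table size.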
Understanding Windowed Aggregate Function Execution
The segment iterator at the start of the plan adds a flag to rows to mark the beginning of each new partition. The primary segment spool then fetches rows one at a time from the segment iterator and inserts them into a worktable in tempdb. When it encounters the new-group flag, the spool returns a row to the upper input of the nested loops operator.
This triggers the stream aggregate over the worktable rows, which computes the average. The computed average is then joined back to the worktable rows, and the worktable is truncated in preparation for the next group. The segment spool emits a dummy row so that the final group is processed.
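The flow described above can be sketched as a small Python model. This is a conceptual illustration only, not SQL Server's implementation: the `(customer_id, order_value)` row shape, the function names, and the in-memory list standing in for the tempdb worktable are all assumptions made for the example.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, float]  # (customer_id, order_value): hypothetical columns


def segment(rows: Iterable[Row]) -> Iterator[Tuple[bool, Row]]:
    """Flag the first row of each partition (input must be sorted by key)."""
    previous_key = object()  # sentinel that never equals a real key
    for row in rows:
        yield (row[0] != previous_key, row)
        previous_key = row[0]


def windowed_average(rows: Iterable[Row]) -> List[Tuple[int, float, float]]:
    """Model of AVG(order_value) per customer, buffered one group at a time."""
    worktable: List[Row] = []  # stands in for the tempdb worktable
    output: List[Tuple[int, float, float]] = []

    def flush() -> None:
        # Aggregate the buffered group, join the average back to its rows,
        # then empty the buffer ("truncate") ready for the next group.
        if not worktable:
            return
        average = sum(value for _, value in worktable) / len(worktable)
        output.extend((key, value, average) for key, value in worktable)
        worktable.clear()

    for is_new_group, row in segment(rows):
        if is_new_group:
            flush()  # new-group flag: process the previous group first
        worktable.append(row)
    flush()  # plays the role of the dummy row that processes the final group
    return output


orders = [(1, 10.0), (1, 20.0), (2, 30.0)]
for result_row in windowed_average(orders):
    print(result_row)
```

Note that every row is written to the buffer once and then read again for both the aggregate and the join, which gives a rough intuition for why the read count scales with the number of rows.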
Logical Read Calculation for Worktables
As far as we understand, the worktable is a heap (otherwise the plan would show it as an index spool). In the example provided, contrary to expectations, only 11 logical reads are required. An explanation for this difference is as follows:
This brings the total to 4 x 3 = 12 logical reads, excluding the insertion of the fourth row, which triggers a logical read only in the original scenario.
Conclusion
The key to understanding this formula is that logical reads are counted differently for worktables than for regular tables. For worktables, each row read is counted as one logical read, whereas for regular tables each hashed page is counted. The two figures are therefore in different units.
The formula aligns with the observed execution: the two secondary spools are read twice (2 * COUNT(*)), while the primary spool emits (COUNT(DISTINCT CustomerID) + 1) rows, as explained in the blog entry mentioned in the additional info. The plus one is due to the extra row emitted to indicate the end of the final group.