Avoiding Multiplied SUM Results in MySQL Joins
Combining joins with SUM() in MySQL often produces inflated totals. When one parent row is joined to several child tables, each child row is repeated once for every matching row in the other child tables (a per-key Cartesian product), so SUM() counts the same value several times. The fix is to pre-aggregate each child table in a subquery before joining.
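To see the inflation concretely, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL, so that it runs without a server. The table and column names echo the query later in this article, but the schema is simplified and the rows are invented purely for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption:
# the plain SUM/JOIN/GROUP BY shown here behaves the same in both).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bhds_teachers (ds_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE bhds_mileage  (ds_id INTEGER, drive_time INTEGER);
    CREATE TABLE bhds_timecard (ds_id INTEGER, tm_hours INTEGER);

    -- Hypothetical sample data: 2 mileage rows and 3 timecard rows.
    INSERT INTO bhds_teachers VALUES (5, 'Smith');
    INSERT INTO bhds_mileage  VALUES (5, 30), (5, 45);
    INSERT INTO bhds_timecard VALUES (5, 8), (5, 7), (5, 6);
""")

# Naive approach: join both child tables first, then sum.
# The join yields 2 x 3 = 6 rows, so every value is counted more than once.
naive = conn.execute("""
    SELECT SUM(m.drive_time), SUM(t.tm_hours)
    FROM bhds_teachers AS i
    LEFT JOIN bhds_mileage  AS m ON m.ds_id = i.ds_id
    LEFT JOIN bhds_timecard AS t ON t.ds_id = i.ds_id
    GROUP BY i.ds_id
""").fetchone()

print(naive)  # (225, 42), although the true totals are 75 and 21
```

Each drive_time value appears three times (once per timecard row) and each tm_hours value appears twice (once per mileage row), which is exactly the multiplication described above.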
Effective Strategy: Subqueries for Accurate Aggregation
To prevent inflated SUMs, perform the aggregations before joining the tables. Each subquery groups and sums one table on its own, so it returns at most one row per join key, and joining those aggregated results can no longer duplicate rows.
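The pre-aggregation idea can be sketched in the same way. The example below again assumes Python's sqlite3 module as a stand-in for MySQL and uses invented sample rows. It leaves out the date filter and week grouping of the full query to keep the focus on the subqueries.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bhds_teachers (ds_id INTEGER PRIMARY KEY, last_name TEXT);
    CREATE TABLE bhds_mileage  (ds_id INTEGER, drive_time INTEGER);
    CREATE TABLE bhds_timecard (ds_id INTEGER, tm_hours INTEGER);

    -- Hypothetical sample data: true totals are 75 minutes and 21 hours.
    INSERT INTO bhds_teachers VALUES (5, 'Smith');
    INSERT INTO bhds_mileage  VALUES (5, 30), (5, 45);
    INSERT INTO bhds_timecard VALUES (5, 8), (5, 7), (5, 6);
""")

# Each derived table collapses its source to one row per ds_id,
# so the outer joins match at most one row and nothing is duplicated.
fixed = conn.execute("""
    SELECT i.last_name, m.minutes, t.total
    FROM bhds_teachers AS i
    LEFT JOIN (
        SELECT ds_id, SUM(drive_time) AS minutes
        FROM bhds_mileage
        GROUP BY ds_id
    ) AS m ON m.ds_id = i.ds_id
    LEFT JOIN (
        SELECT ds_id, SUM(tm_hours) AS total
        FROM bhds_timecard
        GROUP BY ds_id
    ) AS t ON t.ds_id = i.ds_id
""").fetchall()

print(fixed)  # [('Smith', 75, 21)]
```

Because the sums are computed before any join takes place, the number of rows in one child table has no effect on the total computed for the other.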
Query Example and Explanation:
This query demonstrates how to correctly join tables while using SUM aggregations:
SELECT
    i.last_name,
    i.first_name,
    DATE_FORMAT(LEAST(m.mil_date, t.tm_date), '%m/%d/%y') AS dates,
    t.total,
    m.minutes
FROM bhds_teachers AS i
LEFT JOIN (
    SELECT
        ds_id,
        YEARWEEK(mil_date) AS week,
        MIN(mil_date) AS mil_date,
        SUM(drive_time) AS minutes
    FROM bhds_mileage
    WHERE mil_date BETWEEN '2016-04-11' AND '2016-04-30'
      AND ds_id = 5
    GROUP BY ds_id, week
) AS m ON m.ds_id = i.ds_id
LEFT JOIN (
    SELECT
        ds_id,
        YEARWEEK(tm_date) AS week,
        MIN(tm_date) AS tm_date,
        SUM(tm_hours) AS total
    FROM bhds_timecard
    WHERE tm_date BETWEEN '2016-04-11' AND '2016-04-30'
      AND ds_id = 5
    GROUP BY ds_id, week
) AS t ON t.ds_id = i.ds_id AND t.week = m.week;
The subqueries (m and t) independently calculate the sums of drive_time and tm_hours, respectively, grouping by ds_id and week. These aggregated results are then joined with bhds_teachers (aliased as i) on ds_id and on matching week numbers. LEAST() selects the earlier of the two weekly start dates returned by the subqueries. Note that LEAST() returns NULL if either argument is NULL, so a week that has mileage but no timecard rows yields a NULL date. This method avoids the multiplication problem inherent in joining before summing.