How to Avoid Multiplied SUM Values When Joining Tables with Aggregate Functions in MySQL
When a MySQL query joins several tables and applies aggregate functions such as SUM, the totals can come out unexpectedly large. The join produces one row for every combination of matching rows from the joined tables, so each value is added more than once and the sums are inflated.
Consider an example in which two separate queries initially retrieve SUM values from different tables. Query 1 calculates the total driving time for a specific employee during a specific date range, while Query 2 calculates the total working time for the same employee during the same period.
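The two original queries are not reproduced in this article. Judging from the table and column names used in the combined query further down, they may have looked roughly like the following sketch; the exact column lists and filters are assumptions.

<code class="language-sql">-- Query 1 (assumed shape): total driving time for employee 5 in the date range
SELECT SUM(drive_time) AS minutes
FROM bhds_mileage
WHERE ds_id = 5
  AND mil_date BETWEEN '2016-04-11' AND '2016-04-30';

-- Query 2 (assumed shape): total working time for the same employee and period
SELECT SUM(tm_hours) AS total
FROM bhds_timecard
WHERE ds_id = 5
  AND tm_date BETWEEN '2016-04-11' AND '2016-04-30';</code>

Run separately, each query reads only its own table, so neither sum can be affected by rows in the other table.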
Combining these queries into a single join query produces incorrect sums, because the JOIN pairs every matching row of bhds_timecard with every matching row of bhds_mileage, which is a Cartesian product of the two row sets for that employee. As a result, the driving-time sum is multiplied by the number of matching timecard rows, and the work-time sum is multiplied by the number of matching mileage rows.
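A naive combined query of this kind might look like the sketch below. It is reconstructed from the names used elsewhere in this article and is an assumption, not the original query.

<code class="language-sql">-- Naive version (assumed): both detail tables are joined directly, then aggregated.
-- Every timecard row is paired with every mileage row of the same employee,
-- so each value is summed once per matching row in the other table.
SELECT i.last_name, i.first_name,
       SUM(t.tm_hours)   AS total,    -- inflated by the number of mileage rows
       SUM(m.drive_time) AS minutes   -- inflated by the number of timecard rows
FROM bhds_teachers AS i
LEFT JOIN bhds_timecard AS t
       ON t.ds_id = i.ds_id
      AND t.tm_date BETWEEN '2016-04-11' AND '2016-04-30'
LEFT JOIN bhds_mileage AS m
       ON m.ds_id = i.ds_id
      AND m.mil_date BETWEEN '2016-04-11' AND '2016-04-30'
WHERE i.ds_id = 5
GROUP BY i.ds_id, i.last_name, i.first_name;</code>

The aggregation happens only after the rows have already been multiplied, which is why SUM cannot return the correct totals here.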
To solve this problem, calculate each SUM in a subquery before joining the tables. Each subquery collapses its table to one row per group, so the join no longer multiplies detail rows and the sums stay accurate.
The following is a query improved using subqueries:
<code class="language-sql">SELECT last_name, first_name,
       DATE_FORMAT(LEAST(mil_date, tm_date), '%m/%d/%y') AS dates,
       total, minutes
FROM bhds_teachers AS i
-- m: driving time, aggregated to one row per employee and week
LEFT JOIN (
    SELECT ds_id, YEARWEEK(mil_date) AS week, MIN(mil_date) AS mil_date, SUM(drive_time) AS minutes
    FROM bhds_mileage
    WHERE mil_date BETWEEN '2016-04-11' AND '2016-04-30'
      AND bhds_mileage.ds_id = 5
    GROUP BY ds_id, week
) AS m ON m.ds_id = i.ds_id
-- t: working time, aggregated to one row per employee and week
LEFT JOIN (
    SELECT ds_id, YEARWEEK(tm_date) AS week, MIN(tm_date) AS tm_date, SUM(tm_hours) AS total
    FROM bhds_timecard
    WHERE tm_date BETWEEN '2016-04-11' AND '2016-04-30'
      AND bhds_timecard.ds_id = 5
    GROUP BY ds_id, week
) AS t ON t.ds_id = i.ds_id AND t.week = m.week;</code>
In this query:
Subquery m calculates the total driving time grouped by employee ID and week.
Subquery t calculates the total hours worked grouped by employee ID and week.
The main query joins the bhds_teachers table with subqueries m and t.
As a result, the query retrieves the last name, first name, date, total work time, and total driving time for the employee and the specified period without the problem of multiplying SUM values.
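To see the difference on a small scale, the following sketch compares a direct join with a pre-aggregated join. The tables demo_timecard and demo_mileage and their rows are hypothetical and exist only for this illustration.

<code class="language-sql">-- Hypothetical toy tables, used only to illustrate the effect.
CREATE TEMPORARY TABLE demo_timecard (ds_id INT, tm_date DATE, tm_hours DECIMAL(5,2));
CREATE TEMPORARY TABLE demo_mileage  (ds_id INT, mil_date DATE, drive_time INT);

INSERT INTO demo_timecard VALUES (5, '2016-04-11', 8.00), (5, '2016-04-12', 6.00);
INSERT INTO demo_mileage  VALUES (5, '2016-04-11', 30), (5, '2016-04-12', 20), (5, '2016-04-13', 10);

-- Direct join: 2 x 3 = 6 rows, so both sums are inflated.
SELECT SUM(t.tm_hours) AS total, SUM(m.drive_time) AS minutes
FROM demo_timecard AS t
JOIN demo_mileage  AS m ON m.ds_id = t.ds_id;
-- total = 42.00 (14 x 3), minutes = 120 (60 x 2)

-- Pre-aggregated join: each subquery returns one row per employee.
SELECT t.total, m.minutes
FROM (SELECT ds_id, SUM(tm_hours) AS total FROM demo_timecard GROUP BY ds_id) AS t
JOIN (SELECT ds_id, SUM(drive_time) AS minutes FROM demo_mileage GROUP BY ds_id) AS m
  ON m.ds_id = t.ds_id;
-- total = 14.00, minutes = 60</code>

The second form joins one aggregated row to one aggregated row, so the totals match what the two standalone SUM queries would return.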