In MySQL, subqueries using EXISTS and IN can achieve similar results, but with contrasting performance implications. This article delves into the differences between these techniques and explores their impact on query execution times.
The provided example demonstrates two subqueries that yield identical outcomes using different approaches:
Method 1 (IN)
SELECT * FROM tracker WHERE reservation_id IN ( SELECT reservation_id FROM tracker GROUP BY reservation_id HAVING ( method = 1 AND type = 0 AND Count(*) > 1 ) OR ( method = 1 AND type = 1 AND Count(*) > 1 ) OR ( method = 2 AND type = 2 AND Count(*) > 0 ) OR ( method = 3 AND type = 0 AND Count(*) > 0 ) OR ( method = 3 AND type = 1 AND Count(*) > 1 ) OR ( method = 3 AND type = 3 AND Count(*) > 0 ) )
Method 2 (EXISTS)
SELECT * FROM `tracker` t WHERE EXISTS ( SELECT reservation_id FROM `tracker` t3 WHERE t3.reservation_id = t.reservation_id GROUP BY reservation_id HAVING ( METHOD = 1 AND TYPE = 0 AND COUNT(*) > 1 ) OR ( METHOD = 1 AND TYPE = 1 AND COUNT(*) > 1 ) OR ( METHOD = 2 AND TYPE = 2 AND COUNT(*) > 0 ) OR ( METHOD = 3 AND TYPE = 0 AND COUNT(*) > 0 ) OR ( METHOD = 3 AND TYPE = 1 AND COUNT(*) > 1 ) OR ( METHOD = 3 AND TYPE = 3 AND COUNT(*) > 0 ) )
It is observed that Method 1 (IN) requires approximately 10 seconds for execution, while Method 2 (EXISTS) completes in under a second. This substantial performance disparity warrants exploration.
A closer examination reveals that these subqueries utilize different techniques to identify rows in the tracker table that satisfy specific conditions.
IN Subquery (Method 1): This subquery returns a set of reservation_ids that meet the specified criteria. The outer query then checks if the reservation_id for each row in tracker exists in this set, and returns rows that match.
EXISTS Subquery (Method 2): EXISTS determines whether any rows in the tracker table meet the specified conditions for a given reservation_id. The outer query then evaluates this condition and returns rows for which EXISTS returns true.
The performance difference between these approaches stems from their underlying logic and efficiency:
In general, EXISTS is the preferred approach when dealing with large subquery results and it is not sensitive to NULL values in the subquery. IN, on the other hand, can be more efficient when the subquery results are relatively small and NULL values are not a concern.
To further optimize performance, it is recommended to consider the following:
The above is the detailed content of EXISTS vs. IN in MySQL Subqueries: Which is More Performant?. For more information, please follow other related articles on the PHP Chinese website!