MySQL Performance Conundrum: A Slow WHERE ... IN (subquery)
A MySQL query that used a subquery to retrieve duplicate records showed a surprising performance disparity.
The initial query, which isolated the duplicated values by grouping on a specific field and keeping only groups with more than one row, executed quickly. However, a second query that retrieved all rows whose value appeared in that duplicate set, using the WHERE ... IN (subquery) construct with the first query as the subquery, became prohibitively slow.
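The two queries described above might look roughly like the following sketch. The table name some_table and the column name relevant_field are placeholders chosen for illustration; the original schema is not given.

```sql
-- Placeholder names (assumptions): some_table, relevant_field.

-- Fast: list each value that occurs more than once.
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1;

-- Slow in the scenario described: fetch every row whose value
-- appears in the duplicate set returned by the query above.
SELECT *
FROM some_table
WHERE relevant_field IN (
    SELECT relevant_field
    FROM some_table
    GROUP BY relevant_field
    HAVING COUNT(*) > 1
);
```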
Even though the relevant field was indexed, the second query took minutes to complete. On the suspicion that this was a database limitation, the subquery was turned into a view and the parent query was changed to reference the view instead. The result was near-instantaneous execution.
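The view workaround could be sketched as below. The view name duplicate_values, like some_table and relevant_field, is a hypothetical name used only for this example.

```sql
-- Placeholder names (assumptions): some_table, relevant_field, duplicate_values.

-- Wrap the fast duplicate-finding query in a view.
CREATE VIEW duplicate_values AS
SELECT relevant_field
FROM some_table
GROUP BY relevant_field
HAVING COUNT(*) > 1;

-- The parent query now reads the duplicate set from the view.
SELECT *
FROM some_table
WHERE relevant_field IN (
    SELECT relevant_field
    FROM duplicate_values
);
```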
Unveiling the Culprit: Correlated Query Woes
Investigation showed that the slowdown came from the subquery being treated as a correlated subquery. A correlated subquery depends on a value from the outer query, so it is executed once for each row of the outer query rather than once overall. Although the subquery as written does not name any outer column, MySQL (older versions in particular) can evaluate an IN (subquery) as a dependent subquery, re-running the grouping for every row of the outer table.
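One way to check whether a query is affected is to inspect its plan with EXPLAIN. In the sketch below, which again uses the placeholder names some_table and relevant_field, a select_type of DEPENDENT SUBQUERY on the inner SELECT would suggest per-row execution. The exact plan depends on the MySQL version and the data, so no particular output is assumed here.

```sql
-- Placeholder names (assumptions): some_table, relevant_field.
-- Look at the select_type column of the result: DEPENDENT SUBQUERY
-- on the inner SELECT indicates it is evaluated per outer row.
EXPLAIN
SELECT *
FROM some_table
WHERE relevant_field IN (
    SELECT relevant_field
    FROM some_table
    GROUP BY relevant_field
    HAVING COUNT(*) > 1
);
```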
Resolving the Bottleneck: Isolating the Subquery
To remove the performance penalty, the correlated subquery was turned into a non-correlated one by wrapping it in an outer SELECT: the original subquery becomes an aliased derived table in the FROM clause, and the IN list selects all columns from it. The inner query is then evaluated only once, which significantly improves performance.
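A minimal sketch of that rewrite follows, assuming the same placeholder names some_table and relevant_field; the alias dup_values is likewise made up for the example.

```sql
-- Placeholder names (assumptions): some_table, relevant_field, dup_values.
SELECT *
FROM some_table
WHERE relevant_field IN (
    -- Selecting from an aliased derived table lets the grouping
    -- query run once instead of once per outer row.
    SELECT *
    FROM (
        SELECT relevant_field
        FROM some_table
        GROUP BY relevant_field
        HAVING COUNT(*) > 1
    ) AS dup_values
);
```

Compared with the view approach, this keeps everything in a single statement and does not leave a schema object behind.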
The modified parent query, now referencing the non-correlated subquery result, executed with the desired efficiency, resolving the performance issue.