SQL's IN
Condition: Performance Issues Compared to =
In specific situations, SQL's IN
condition can significantly underperform compared to the =
operator. This was a notable issue in older MySQL versions, though later versions (MySQL 5.6 and beyond) have addressed this.
The performance difference stems from optimization challenges. A subquery within an IN
clause might be incorrectly identified as dependent, rather than independent. This leads to the subquery's repeated execution for each row in the main query, dramatically impacting performance.
Let's look at an example:
<code class="language-sql">SELECT * FROM question_law_version WHERE question_law_id IN ( SELECT MAX(foo_id) FROM bar );</code>
An EXPLAIN
plan for this query might reveal:
<code>1 'PRIMARY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where' 2 'DEPENDENT SUBQUERY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where' 3 'DEPENDENT SUBQUERY' 'question_law' 'ALL' '' '' '' '' 10040 'Using where'</code>
Notice the "DEPENDENT SUBQUERY" designation. The inner query is repeatedly executed. However, rewriting the query using =
(assuming the subquery returns a single value):
<code class="language-sql">SELECT * FROM question_law_version WHERE question_law_id = ( SELECT MAX(foo_id) FROM bar );</code>
...often results in a more efficient plan:
<code>1 'PRIMARY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where' 2 'SUBQUERY' 'question_law_version' 'ALL' '' '' '' '' 10148 'Using where' 3 'SUBQUERY' 'question_law' 'ALL' '' '' '' '' 10040 'Using where'</code>
The key difference is the "SUBQUERY" classification, indicating a single execution. This performance discrepancy arises from MySQL's (in older versions) misinterpretation of the IN
clause's subquery dependency. The database's inefficient handling of this leads to the performance slowdown.
The above is the detailed content of Why is SQL's `IN` Condition Sometimes Slower Than the `=` Operator?. For more information, please follow other related articles on the PHP Chinese website!