Home > Backend Development > PHP Tutorial > The difference between SQL from multiple tables and inner join

The difference between SQL from multiple tables and inner join

WBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWBOYWB
Release: 2016-08-04 09:20:17
Original
2463 people have browsed it

This is how to write inner join

<code>SELECT
    B.dict_data_name,
    SUM(A.pv) AS pv
FROM
    shw_mo_health_news A
INNER JOIN bas_dict_data B ON A.third_name_dictid = B.item_id
WHERE
    A.class_level = 3
AND B.class_id = 1012
AND A.collect_date >= '2016-04-01'
AND A.collect_date <= '2016-05-31'
GROUP BY
    A.third_name_dictid
ORDER BY
    pv DESC;</code>
Copy after login
Copy after login

The actual query time for users is about 0.6S

View with explain:
The difference between SQL from multiple tables and inner join


This is how to write from in multiple tables

<code>SELECT
    B.dict_data_name,
    A.PV
FROM
    (
        SELECT
            hn.third_name_dictid,
            SUM(hn.pv) AS PV
        FROM
            shw_mo_health_news hn
        WHERE
            hn.class_level = 3
        AND hn.collect_date >= '2016-04-01'
        AND hn.collect_date <= '2016-05-31'
        GROUP BY
            hn.third_name_dictid
    ) A,
    (
        SELECT
            dd.item_id,
            dd.dict_data_name
        FROM
            bas_dict_data dd
        WHERE
            dd.class_id = 1012
    ) B
WHERE
    A.third_name_dictid = B.item_id
ORDER BY
    PV DESC</code>
Copy after login
Copy after login

The actual time is about 0.03s

Explain View
The difference between SQL from multiple tables and inner join


Why is there such a big difference in the efficiency of these two query methods?
Many people on the Internet say that the efficiency of these two writing methods is almost the same; but my one is 0.6 and the other is 0.03, the difference between the two is quite big. What is the cause of this? Is it because there is a problem with the SQl statement I wrote or is it for other reasons?

Reply content:

This is how to write inner join

<code>SELECT
    B.dict_data_name,
    SUM(A.pv) AS pv
FROM
    shw_mo_health_news A
INNER JOIN bas_dict_data B ON A.third_name_dictid = B.item_id
WHERE
    A.class_level = 3
AND B.class_id = 1012
AND A.collect_date >= '2016-04-01'
AND A.collect_date <= '2016-05-31'
GROUP BY
    A.third_name_dictid
ORDER BY
    pv DESC;</code>
Copy after login
Copy after login

The actual query time for users is about 0.6S

View with explain:
The difference between SQL from multiple tables and inner join


This is how to write from in multiple tables

<code>SELECT
    B.dict_data_name,
    A.PV
FROM
    (
        SELECT
            hn.third_name_dictid,
            SUM(hn.pv) AS PV
        FROM
            shw_mo_health_news hn
        WHERE
            hn.class_level = 3
        AND hn.collect_date >= '2016-04-01'
        AND hn.collect_date <= '2016-05-31'
        GROUP BY
            hn.third_name_dictid
    ) A,
    (
        SELECT
            dd.item_id,
            dd.dict_data_name
        FROM
            bas_dict_data dd
        WHERE
            dd.class_id = 1012
    ) B
WHERE
    A.third_name_dictid = B.item_id
ORDER BY
    PV DESC</code>
Copy after login
Copy after login

The actual time is about 0.03s

Explain View
The difference between SQL from multiple tables and inner join


Why is there such a big difference in the efficiency of these two query methods?
Many people on the Internet say that the efficiency of these two writing methods is almost the same; but my one is 0.6 and the other is 0.03, the difference between the two is quite big. What is the cause of this? Is it because there is a problem with the SQl statement I wrote or is it for other reasons?

It should be connection loss. I think it’s because the conditions you used don’t have indexes. If the fields are properly indexed, because MySQL will automatically optimize the SQL statement, the final query statement will be the same and the performance will be the same. When there is no index, the so-called optimization will not exist. At this time, the final query statement is basically Equivalent to the sql you submitted.
You can try it

<code>SELECT
    B.dict_data_name,
    SUM(A.pv) AS pv
FROM
    shw_mo_health_news A
INNER JOIN bas_dict_data B ON A.class_level = 3 and A.collect_date >= '2016-04-01'
AND A.collect_date <= '2016-05-31' and B.class_id=1012 and  A.third_name_dictid = B.item_id

GROUP BY
    A.third_name_dictid
ORDER BY
    pv DESC;</code>
Copy after login

I think the performance will be significantly different.

Same, it belongs to different specifications of SQL-89 and SQL-92. See https://en.wikipedia.org/wiki...


I found a related question and answer, and one of the answers is exactly your question https://community.microstrate...

The second SQL here will have extra overhead (temporary table) due to subquery.
Why the second SQL is better than the first SQL? Nothing can be seen from the execution plan. It feels like it is just an exception and does not mean anything

Theoretically, there is no essential difference between subqueries and JOIN, and they should be equivalent after reasonable optimization by the query analyzer. However, it is precisely due to various flaws in the query analyzer that sometimes some versions of the database support subqueries better, and some support JOIN better. For MySQL, most versions of subqueries I have seen are equivalent to JOIN, but be careful when the subquery is located in WHERE, such as:

<code>-- 查询1
SELECT * FROM table_a
WHERE A IN (
    SELECT A FROM table_b
    WHERE B = 'x'
)
-- 查询2
SELECT table_a.* FROM table_a A
    INNER JOIN table_b B ON a.A = b.A
WHERE B.B = 'x'</code>
Copy after login

This is a typical MySQL query analyzer failure scenario. If the A field of table_a has an index, theoretically query 1 and query 2 should be equivalent, but in fact the performance of query 2 in MySQL 5.x version is significantly better than query 1. This was discussed on Stackoverflow and is a known issue that has been around for nearly 10 years. You must upgrade to 6.0.x to get the fix. Therefore, for the question of which one is better, the most reliable way is to see the results of explain yourself before drawing a conclusion.
For your two queries, the execution plans are actually different. It is obvious that the second one consumes more time. The time is just because the amount of data obtained by the extra two steps is not too large. Furthermore, these two queries are actually not equivalent and cannot be compared.

Related labels:
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template