I have been burned by this many times. In SQL code written during earlier development, many queries, and even some updates, used OR in the WHERE condition. The examples below illustrate the drawbacks of OR and how to improve such queries.
select f_crm_id from d_dbname1.t_tbname1 where f_xxx_id = 926067 and (f_mobile ='1234567891' or f_phone ='1234567891' ) limit 1
The query makes it clear that either of the two fields f_mobile and f_phone may store the phone number. The usual instinct is to cover both with OR in a single SQL statement, but on a table with a lot of data this is a disaster:
t_tbname1 has the indexes idx_id_mobile(f_xxx_id,f_mobile), idx_phone(f_phone), idx_id_email(f_id,f_email), yet the explain result shows the idx_id_email index being used. With luck, the optimizer may pick idx_id_mobile (on f_xxx_id) instead.
The reason is that MySQL normally uses only one index per table for a query like this. If idx_id_mobile is used and a row happens to match on f_mobile, then thanks to limit 1 the result comes back quickly. But if no row matches on f_mobile, f_phone can only be checked row by row among the rows that satisfy the f_id condition, scanning 120,000 rows. OR does not behave like AND. Some developers even believe that adding a composite index on (f_xxx_id, f_mobile, f_phone) would solve everything, which is enough to drive one mad.
Optimizing this SQL is straightforward (note that f_mobile and f_phone must each be covered by a suitable index). Method 1:
(select f_crm_id from d_dbname1.t_tbname1 where f_xxx_id = 926067 and f_mobile ='1234567891' limit 1 ) UNION ALL (select f_crm_id from d_dbname1.t_tbname1 where f_xxx_id = 926067 and f_phone ='1234567891' limit 1 )
Each of the two independent SELECTs can use its own index, and each has its own limit. If both return a row, just pick either one.
There is another optimization. If this kind of query is particularly frequent (and there is no cache), split it into separate SQL statements executed one after the other. For example, if most numbers are stored in f_mobile, execute sql1 first and stop as soon as it returns a result; only when it returns nothing, execute sql2. This reduces the query load on the database and lets the application code do more of the work. Method 2 pseudocode:
sql1 = select f_crm_id from d_dbname1.t_tbname1 where f_xxx_id = 926067 and f_mobile ='1234567891' limit 1;
result = sql1.execute();
if no result:
    sql2 = select f_crm_id from d_dbname1.t_tbname1 where f_xxx_id = 926067 and f_phone ='1234567891' limit 1;
    result = sql2.execute();
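The fallback above can be sketched in Python. This is only an illustration under stated assumptions: the standard-library sqlite3 module stands in for MySQL so that the example runs without a server, and the names find_crm_id and make_demo_db, along with the demo rows, are hypothetical and not taken from the original code. With a MySQL driver the placeholder style would typically be %s rather than ?.

```python
import sqlite3


def find_crm_id(conn, xxx_id, number):
    """Try f_mobile first; query f_phone only when f_mobile found nothing."""
    queries = (
        "SELECT f_crm_id FROM t_tbname1 "
        "WHERE f_xxx_id = ? AND f_mobile = ? LIMIT 1",
        "SELECT f_crm_id FROM t_tbname1 "
        "WHERE f_xxx_id = ? AND f_phone = ? LIMIT 1",
    )
    for sql in queries:
        row = conn.execute(sql, (xxx_id, number)).fetchone()
        if row is not None:
            return row[0]  # stop at the first query that matches
    return None


def make_demo_db():
    """Hypothetical in-memory table, only so the sketch can be run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t_tbname1 ("
        "f_crm_id INTEGER, f_xxx_id INTEGER, f_mobile TEXT, f_phone TEXT)"
    )
    conn.execute("CREATE INDEX idx_id_mobile ON t_tbname1 (f_xxx_id, f_mobile)")
    conn.execute("CREATE INDEX idx_phone ON t_tbname1 (f_phone)")
    conn.executemany(
        "INSERT INTO t_tbname1 VALUES (?, ?, ?, ?)",
        [
            (1, 926067, "1234567891", None),
            (2, 926067, None, "5550000000"),
        ],
    )
    return conn
```

Putting the more likely column first keeps the common case at a single indexed lookup; the second query is only paid for when the first one misses.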
A more complex scenario is one where the query does not simply return a single record, that is, where the limit is greater than 1:
select a.f_crm_id from d_dbname1.t_tbname1 as a where (a.f_create_time > from_unixtime('1464397527') or a.f_modify_time > from_unixtime('1464397527') ) limit 0,200
In this case both Method 1 and Method 2 need to be adapted, because the same row can satisfy both the f_create_time condition and the f_modify_time condition, so duplicate rows would be returned.
Method 1 needs to be modified:
(select a.f_crm_id from d_dbname1.t_tbname1 as a where a.f_create_time > from_unixtime('1464397527') limit 0,200 ) UNION ALL (select a.f_crm_id from d_dbname1.t_tbname1 as a where a.f_modify_time > from_unixtime('1464397527') and a.f_create_time <= from_unixtime('1464397527') limit 0,200)
Some will ask: doesn't changing UNION ALL to UNION remove the duplicates? It does, but UNION has to deduplicate the combined result set, so if the query is frequent or the limit is relatively large the database still comes under pressure. A trade-off is required.
This situation is better suited to Method 2, including cases where order by may be needed together with limit. The adapted pseudocode:
sql1 = (select a.f_crm_id from d_dbname1.t_tbname1 as a where a.f_create_time > from_unixtime('1464397527') limit 0,200 );
sql1.execute();
sql1_count = sql1.result.count
if sql1_count < 200 :
    sql2 = (select a.f_crm_id from d_dbname1.t_tbname1 as a where a.f_modify_time > from_unixtime('1464397527') and a.f_create_time <= from_unixtime('1464397527') limit 0, (200 - sql1_count) );
    sql2.execute();
final_result = paste(sql1,sql2);  # sql2 contributes nothing if it was not executed
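The two-step version with a shared limit might look like the following in Python. It is a sketch under assumptions, not the original implementation: sqlite3 stands in for MySQL, the timestamps are compared as plain integers instead of going through from_unixtime, and the function name changed_since and the page_size parameter are hypothetical.

```python
import sqlite3

PAGE_SIZE = 200  # same limit as in the queries above


def changed_since(conn, ts, page_size=PAGE_SIZE):
    """Return up to page_size ids created or modified after ts, no duplicates."""
    # Step 1: rows created after ts.
    created = conn.execute(
        "SELECT f_crm_id FROM t_tbname1 WHERE f_create_time > ? LIMIT ?",
        (ts, page_size),
    ).fetchall()

    remaining = page_size - len(created)
    if remaining <= 0:
        # The page is already full, so the second query is skipped entirely.
        return [row[0] for row in created]

    # Step 2: rows modified after ts but created at or before ts. The extra
    # f_create_time condition excludes everything step 1 could have returned.
    modified = conn.execute(
        "SELECT f_crm_id FROM t_tbname1 "
        "WHERE f_modify_time > ? AND f_create_time <= ? LIMIT ?",
        (ts, ts, remaining),
    ).fetchall()
    return [row[0] for row in created + modified]
```

Because the second query only asks for the rows still missing from the page, the combined result never exceeds the limit, and the mutually exclusive conditions make a deduplicating UNION unnecessary.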
OR conditions are hard to optimize on the database side. Where the logic can be optimized in application code, do it there rather than dragging the database down. OR should only be considered when the condition does not need an index (and the amount of data to compare is small).
An OR across values of the same field can be rewritten as IN, for example f_id=1 or f_id=100 -> f_id in (1,100). For the efficiency comparison, see the article "The efficiency issues of or and in in mysql".
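When the list of values comes from application code, the IN form is also easier to build than a chain of ORs. A minimal Python sketch follows, again with sqlite3 as a stand-in for MySQL; the function name select_by_ids is hypothetical. Placeholders are generated per value so that the values are bound as parameters rather than concatenated into the SQL text.

```python
import sqlite3


def select_by_ids(conn, ids):
    """Fetch rows whose f_id is one of ids, using IN instead of chained ORs."""
    ids = list(ids)
    if not ids:
        return []  # "IN ()" is not valid SQL, so skip the query entirely
    placeholders = ", ".join("?" for _ in ids)
    sql = f"SELECT f_id FROM t_tbname1 WHERE f_id IN ({placeholders})"
    return [row[0] for row in conn.execute(sql, ids)]
```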
All of the optimization scenarios above assume the InnoDB storage engine; MyISAM behaves differently. See the article "mysql or conditions can use indexes to avoid full table scans".