Interviewer: Have you ever operated Linux?
Me: Yes
Interviewer: What command should I use to check the memory usage?
Me: free or top
Interviewer: Then tell me what information the free command shows.
Me: As shown in the figure below, it shows the usage of memory and cache:
total: total memory
used: used memory
free: free memory
buff/cache: memory used by buffers and cache
available: available memory
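To make the columns concrete: on Linux, free derives its numbers from /proc/meminfo. The sketch below parses a meminfo-style excerpt and computes the columns discussed above. The sample values are made up for illustration, and the used column is left out on purpose because its exact formula differs between versions of free.

```python
# Hypothetical /proc/meminfo excerpt; the numbers are made up for illustration.
SAMPLE_MEMINFO = """\
MemTotal:        3880000 kB
MemFree:          250000 kB
MemAvailable:    1150000 kB
Buffers:           90000 kB
Cached:          1100000 kB
SReclaimable:      60000 kB
"""


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines into a dict of integers (in KiB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            fields[key.strip()] = int(rest.split()[0])
    return fields


def free_columns(fields):
    """Derive free-style columns from parsed meminfo fields."""
    # buff/cache is the sum of kernel buffers, page cache, and reclaimable slab.
    buff_cache = fields["Buffers"] + fields["Cached"] + fields["SReclaimable"]
    return {
        "total": fields["MemTotal"],
        "free": fields["MemFree"],
        "buff/cache": buff_cache,
        "available": fields["MemAvailable"],
    }


columns = free_columns(parse_meminfo(SAMPLE_MEMINFO))
for name, kib in columns.items():
    print(f"{name:>10}: {kib / 1024:8.0f} MiB")
```

To try it on a real machine, replace SAMPLE_MEMINFO with the contents of /proc/meminfo.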
Interviewer: Then do you know how to clear the used cache (buff/cache)?
Me: Em... I don't know.
Interviewer: Running sync; echo 3 > /proc/sys/vm/drop_caches clears buff/cache. Can you tell me whether I can run this command in production?
Me: (A giveaway question, overjoyed) The benefits are huge. After clearing the cache, we will have more available memory. Just like the little rocket of xx Guardian on the PC, a lot of memory is released with one click.
Interviewer: Em..., go back and wait for the notification.
Interviewer: Let's change the topic and talk about your understanding of join.
Me: Okay. (If I answer wrong again, it's over. Seize the opportunity.)
inner join
left join
right join
full join
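The difference between the four join methods can be sketched in a few lines of plain Python. This is only an illustration of the semantics, not how a database executes a join; the sample tables and the join helper are hypothetical, and None stands in for SQL NULL.

```python
# Hypothetical sample tables: (id, value) rows joined on id.
t_a = [(1, "a1"), (2, "a2"), (3, "a3")]
t_b = [(2, "b2"), (3, "b3"), (4, "b4")]


def join(left, right, how="inner"):
    """Join two lists of (key, value) rows on key; None marks a missing side."""
    rows = []
    matched_right = set()
    for lk, lv in left:
        found = False
        for i, (rk, rv) in enumerate(right):
            if lk == rk:
                rows.append((lk, lv, rv))
                matched_right.add(i)
                found = True
        # left and full joins keep left rows that found no partner
        if not found and how in ("left", "full"):
            rows.append((lk, lv, None))
    # right and full joins keep right rows that found no partner
    if how in ("right", "full"):
        for i, (rk, rv) in enumerate(right):
            if i not in matched_right:
                rows.append((rk, None, rv))
    return rows


for how in ("inner", "left", "right", "full"):
    print(how, join(t_a, t_b, how))
```

Only ids 2 and 3 exist in both tables, so the inner join returns two rows, while the left, right, and full joins add the unmatched rows from one or both sides.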
Picture source (join diagram): https://www.cnblogs.com/reaptomorrow-flydream/p/8145610.html
Interviewer: If you need to use join statements during project development, how do you optimize them to improve performance?
Me: It splits into two cases: the data size is small, or the data size is large.
Interviewer: And then?
Me: For case 1, the data size is small, so everything can simply be loaded into memory. For case 2, the data size is large:
You can speed up the join statement by adding indexes.
You can use redundant information to reduce the number of joins.
Reduce the number of table joins as much as possible; a single SQL statement should join tables no more than 5 times.
Interviewer: So it can be summarized as: join statements are relatively expensive, right?
Me: Yes.
Interviewer: Why?
Me: There must be a comparison process when executing a join statement.
Interviewer: Yes.
Me: Comparing the rows of two tables one by one is slow, so we can read the data of the two tables into a block of memory in turn. Taking MySQL's InnoDB engine as an example, we can find the relevant memory area with the statement show variables like '%buffer%'. As shown in the figure, the size of join_buffer_size affects the execution performance of our join statements.
Interviewer: What else?
Me: Any project will eventually go live, it will inevitably generate data, and the scale of that data will not be small.
Interviewer: Yes, that is how it goes.
Me: Most of the data in the database is eventually saved to the hard disk and stored as files.
Take MySQL's InnoDB engine as an example. InnoDB uses the page as its basic I/O unit, and each page is 16KB by default. InnoDB creates an .ibd file for each table to store its data.
Me: This means that we need to read as many files as there are tables to join. Although indexes can be used, it is still inevitable that the disk head moves frequently.
Interviewer: In other words, frequent head movement hurts performance, right?
Me: Yes. Don't current open source frameworks such as HBase and Kafka like to say that they greatly improve performance through sequential reads and writes?
Interviewer: That's right. Then do you think Linux has optimized for this? A hint: run the free command again and take a look.
Me: Strange, why does the cache occupy more than 1.2G?
Image source: https://www.linuxatemyram.com/
Interviewer: Have you ever thought about what is stored in buff/cache? Why does buff/cache occupy so much memory while available still shows 1.1G? Why can the memory occupied by buff/cache be cleared with two commands, while used can only be released by ending processes?
Me: If the memory occupied by buff/cache can be released so casually, it must be unimportant, and clearing it will not affect the operation of the system.
Interviewer: Not entirely true.
Me: Is that so? That reminds me of a sentence in "CSAPP" (In-depth Understanding of Computer Systems): the essence of the memory hierarchy is that each layer of storage serves as a cache for the layer below it. In layman's terms, Linux treats memory as a cache for the hard disk. Related information: http://tldp.org/LDP/sag/html/buffer-cache.html
Interviewer: Now you know how to answer the giveaway question, right?
Me: I...
Interviewer: I'll give you one more chance. If you were asked to implement the join algorithm, what would you do?
Me: Without an index, a nested loop does the job. With an index, you can use the index to improve performance.
Interviewer: Back to join_buffer. What do you think is stored in join_buffer?
Me: During the scan, the database selects one table and puts the data that needs to be returned and compared with the other tables into join_buffer.
Interviewer: And how is it handled when there is an index?
Me: That is relatively simple. Just read the index trees of the two tables directly and compare them.
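As a rough illustration of the indexed case, the sketch below replaces the scan of the inner table with a binary search over a sorted list of keys. The sorted list is only a stand-in for a B+ tree index, and the tables and the index_join helper are hypothetical; this is not how InnoDB implements it.

```python
from bisect import bisect_left

# Hypothetical tables: rows are (id, payload) tuples.
outer_table = [(3, "o3"), (1, "o1"), (7, "o7")]
inner_table = [(1, "i1"), (2, "i2"), (3, "i3"), (5, "i5")]

# A sorted list of (key, row position) pairs plays the role of the index.
inner_index = sorted((row[0], pos) for pos, row in enumerate(inner_table))
index_keys = [key for key, _ in inner_index]


def index_join(outer, inner, index, keys):
    """Inner join: for each outer row, find matches through binary search."""
    result = []
    for okey, oval in outer:
        i = bisect_left(keys, okey)
        # Walk over all index entries that carry the same key.
        while i < len(keys) and keys[i] == okey:
            _, pos = index[i]
            result.append((okey, oval, inner[pos][1]))
            i += 1
    return result


print(index_join(outer_table, inner_table, inner_index, index_keys))
```

Each outer row costs one logarithmic lookup instead of a full pass over the inner table, which is why the indexed case is the easy one.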
Let me introduce the non-index approaches here.
Nested Loop Join
A nested loop reads only one row of a table at a time. That is, if the outerTable has 100,000 rows and the innerTable has 100 rows, 10,000,000 reads are needed (assuming the files of these two tables are not cached in memory by the operating system; we call them cold data tables). Of course, no database engine currently uses this algorithm (too slow).
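The read count above is simply the product of the two table sizes. A minimal sketch that counts the reads, using scaled-down hypothetical tables of integer keys so it runs quickly:

```python
def nested_loop_join(outer, inner):
    """Naive join on equal keys; returns (matches, number of inner row reads)."""
    matches = []
    reads = 0
    for okey in outer:
        for ikey in inner:  # the inner table is rescanned for every outer row
            reads += 1
            if okey == ikey:
                matches.append(okey)
    return matches, reads


# Scaled-down hypothetical tables (1,000 x 100 rows).
outer_table = list(range(1000))
inner_table = list(range(100))

matches, reads = nested_loop_join(outer_table, inner_table)
print(len(matches), reads)

# The sizes from the text follow the same product rule.
print(100_000 * 100)
```

With cold tables every one of those reads may have to go to disk, which is what makes the plain nested loop impractical.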
Block Nested Loop
"Block" means that a chunk of data is fetched into memory each time to reduce I/O overhead.
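A minimal sketch of the idea, assuming the buffer holds a fixed number of outer rows (buffer_rows is a hypothetical stand-in for what join_buffer_size allows): the inner table is scanned once per buffered block instead of once per outer row. The number of comparisons stays the same; what drops is the number of passes over the inner table.

```python
def block_nested_loop_join(outer, inner, buffer_rows):
    """Join on equal keys, buffering `buffer_rows` outer rows per inner scan.

    Returns (matches, number of full scans of the inner table).
    """
    matches = []
    inner_scans = 0
    for start in range(0, len(outer), buffer_rows):
        # Stand-in for the join buffer: a chunk of outer rows held in memory.
        block = outer[start:start + buffer_rows]
        inner_scans += 1
        for ikey in inner:  # one pass over the inner table per block
            for okey in block:
                if okey == ikey:
                    matches.append(okey)
    return matches, inner_scans


# Hypothetical tables of integer keys.
outer_table = list(range(1000))
inner_table = list(range(100))

matches, scans = block_nested_loop_join(outer_table, inner_table, buffer_rows=250)
print(len(matches), scans)

# With a one-row buffer this degenerates into the plain nested loop.
_, scans_unbuffered = block_nested_loop_join(outer_table, inner_table, buffer_rows=1)
print(scans_unbuffered)
```

A larger buffer means fewer blocks and therefore fewer scans of the inner table, which is the intuition behind increasing join_buffer_size.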
MySQL InnoDB uses this algorithm when no index can be used. Consider the following two tables, t_a and t_b. When a join cannot be performed through an index, InnoDB automatically uses the Block nested loop algorithm.
When I was in school, my database teacher's favorite topic was database normal forms. It wasn't until I started working that I learned everything should be driven by performance. Use redundancy where you can. Where you cannot, use join. If join really hurts performance, try increasing your join_buffer_size, or switch to a solid state drive.
"In-depth understanding of computer systems", Chapter 6: Memory Hierarchy
sync; echo 3 > /proc/sys/vm/drop_caches
This command clears the buff/cache. Can you tell me whether I can run this command in production?
Let's talk about SQL Join
Review
Join in SQL combines the specified tables according to certain conditions and returns the data to the client. The join methods include inner join, left join, right join, and full join.
Buffer
show variables like '%buffer%'
A major premise
Taste it carefully
After thinking for a few minutes
Join Algorithm
Summary
Reference materials
"Experiments and fun with the Linux disk cache": the author uses several examples to illustrate the impact of the disk cache on program performance
"Linux ate my ram": explanation of the fields in the free output
"How to clear the buffer/pagecache (disk cache) under Linux": explains the command from the giveaway question at the beginning of the article
How MySQL runs: Understand MySQL from the root
"Block nested loop": the official MariaDB documentation explains the implementation of the Block-Nested-Loop algorithm
The above is the detailed content of "Why do code specifications require SQL statements not to have too many joins?"