Efficiently Querying a Random Sample from a MySQL Database
Initial Approach and Limitations:
The straightforward way to draw a random sample, SELECT * FROM table ORDER BY RAND() LIMIT 10000, performs poorly on large tables. MySQL has to generate a random value for every row and then sort the entire table by that value before the LIMIT can be applied, which makes the query impractical for tables with hundreds of thousands of rows or more.
Optimized Sampling Technique:
An efficient alternative is to utilize the following query:
SELECT * FROM table WHERE rand() <= .3
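This query employs the following principles: RAND() is evaluated once for each row and returns a value between 0 and 1. A row is included only when its value is at most .3, so each row has roughly a 30% chance of being selected, assuming RAND() produces uniformly distributed values.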
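The per-row filter can be mimicked outside the database to see how it behaves. The following is a minimal Python sketch, not MySQL itself: the function name bernoulli_sample, the stand-in rows, and the seed are assumptions made for illustration.

```python
import random

def bernoulli_sample(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`.

    Mirrors the idea behind WHERE RAND() <= fraction: one random
    draw per row, one pass over the data, no sorting.
    """
    rng = random.Random(seed)
    return [row for row in rows if rng.random() <= fraction]

# Hypothetical stand-in for table rows; a real table would be read from MySQL.
rows = range(100_000)
sample = bernoulli_sample(rows, 0.3, seed=42)

# The kept share is close to 0.3 but will generally not be exactly 0.3.
print(len(sample) / len(rows))
```

The sample size varies from run to run around 30% of the table, which is the main trade-off compared with a fixed LIMIT.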
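Advantages of this Approach:
The filter needs only a single pass over the table and no sort, so its cost grows linearly with the number of rows. The size of the sample is controlled by changing the threshold.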
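When a specific sample size is wanted, such as the 10000 rows in the original query, the threshold can be derived from the table's row count. The sketch below is one possible way to do that, under stated assumptions: the helper sampling_threshold, the 1.2 oversampling factor, and the final ORDER BY RAND() LIMIT step applied to the already reduced row set are illustrative choices, not part of the query above.

```python
def sampling_threshold(total_rows, target_rows, oversample=1.2):
    """Return a fraction for WHERE RAND() <= fraction.

    The fraction is chosen so that roughly target_rows * oversample
    rows pass the filter. The oversample factor is an assumption that
    leaves headroom, since the filtered count varies between runs.
    """
    if total_rows <= 0:
        raise ValueError("total_rows must be positive")
    return min(1.0, target_rows * oversample / total_rows)

# Hypothetical numbers: a table of one million rows, 10000 rows wanted.
fraction = sampling_threshold(1_000_000, 10_000)

# Only the filtered subset is sorted here, not the whole table.
query = (
    f"SELECT * FROM table WHERE RAND() <= {fraction:.6f} "
    "ORDER BY RAND() LIMIT 10000"
)
print(query)
```

If the filter happens to return fewer rows than the target, the LIMIT simply yields fewer rows, so a larger oversampling factor may be preferable for small tables.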