Random Row Selection with Weighted Probability
Consider a table with fields such as id, content, and weight. Your goal is to randomly select a single row from this table while factoring in the weights.
For instance, given three rows with the following data:
id, content, weight 1, "some content", 60 2, "other content", 40 3, "something", 100
The probability distribution is as follows:
Weighted Reservoir Sampling
The most straightforward approach to achieve this is through weighted reservoir sampling:
SELECT id, -LOG(RAND()) / weight AS priority FROM your_table ORDER BY priority LIMIT 1;
This method guarantees a reasonable selection of M elements from a collection of N elements, where the probability of selecting each element is proportional to its weight. It remains effective even when only a single element is desired.
The underlying principle of weighted reservoir sampling is described in detail in the provided article. Notably, it involves selecting the smallest values of -LOG(RAND()) / weight rather than the largest values of POW(RAND(), 1/weight) as depicted in the article. This yields equivalent results.
The above is the detailed content of How Can I Randomly Select a Row from a Weighted Table with SQL?. For more information, please follow other related articles on the PHP Chinese website!