For MySQL, I need a report of rows that deviate more than 1 standard deviation from the mean (expressed as a percentage)
P粉002572690
P粉002572690 2023-09-13 22:06:57
0
1
633

I'm trying to write a MySQL query to count rows that are significantly different from the most common row for each product and report the percentage difference, where below 100% the price is lower than the average price for that product, and above 100% The price is higher. Prices less than 1 standard deviation from the mean are ignored.

Sample data:

_rowid _Timestamp Code fk_product_id fk_po_id cost
5952 2021-01-10 10:19:01 00805 1367 543 0.850
9403 2022-05-23 14:54:34 00805 1367 2942 0.850
41595 2022-11-23 11:20:26 00805 1367 3391 1.350
39635 2022-01-18 12:49:32 Water1 344 3153 0.140
40134 2022-04-06 22:39:34 Water1 344 2747 0.190
41676 2022-12-09 16:28:28 Water1 344 3398 0.140
39634 2022-01-18 12:49:31 gr309203 344400 1024 3154 0.770
35634 2021-03-03 15:23:23 gr309203 344400 1024 3203 0.790
41264 2022-11-16 11:41:44 gr309203 344400 1024 3357 0.970
SELECT code, fk_product_id, cost, cost/
  (SELECT avg(cost) FROM po_line aa WHERE aa.code = code) AS percent 
FROM po_line 
WHERE (SELECT STDDEV(cost) FROM po_line ss WHERE ss.code = code)>1;

This will not return any rows, but three rows (one for each product) should appear in the report.

The expected result should be:

Code fk_product_id cost percentage
00805 1367 1.350 133
Water1 344 0.190 121
gr309203 344400 1024 0.970 115

P粉002572690
P粉002572690

reply all(1)
P粉141911244

This query shows how to use window functions to calculate the number of standard deviations and the percentage of cost relative to the average cost for each given code.

SELECT code, cost, avg, std, 
  ABS(cost-avg)/std AS num_std, 
  cost*100/avg AS pct
FROM (
  SELECT code, cost,
    AVG(cost) OVER (PARTITION BY code) AS avg,
    STDDEV(cost) OVER (PARTITION BY code) AS std
  FROM po_line
) AS p;

result:

+-----------------+-------+-----------+---------------------+--------------------+-------------+
| code            | cost  | avg       | std                 | num_std            | pct         |
+-----------------+-------+-----------+---------------------+--------------------+-------------+
| 00805           | 0.850 | 1.0166667 | 0.23570226039551592 | 0.7071069226079035 |  83.6065546 |
| 00805           | 0.850 | 1.0166667 | 0.23570226039551592 | 0.7071069226079035 |  83.6065546 |
| 00805           | 1.350 | 1.0166667 | 0.23570226039551592 | 1.4142134209517383 | 132.7868809 |
| gr309203 344400 | 0.770 | 0.8433333 |  0.0899382504215469 | 0.8153738777025533 |  91.3043514 |
| gr309203 344400 | 0.790 | 0.8433333 |  0.0899382504215469 | 0.5929990827042229 |  93.6758930 |
| gr309203 344400 | 0.970 | 0.8433333 |  0.0899382504215469 | 1.4083740722807512 | 115.0197674 |
| Water1          | 0.140 | 0.1566667 | 0.02357022603955158 | 0.7071081954001099 |  89.3616831 |
| Water1          | 0.190 | 0.1566667 | 0.02357022603955158 | 1.4142121481595331 | 121.2765699 |
| Water1          | 0.140 | 0.1566667 | 0.02357022603955158 | 0.7071081954001099 |  89.3616831 |
+-----------------+-------+-----------+---------------------+--------------------+-------------+

(Please note that window functions require MySQL 8.0).

This query only shows you how the calculation is done. To get the results you want:

SELECT code, fk_product_id, cost, ROUND(cost*100/avg) AS pct
FROM (
  SELECT code, cost, fk_product_id,
    AVG(cost) OVER (PARTITION BY code) AS avg,
    STDDEV(cost) OVER (PARTITION BY code) AS std
  FROM po_line
) AS p
WHERE ABS(cost-avg)/std > 1;

+-----------------+---------------+-------+------+
| code            | fk_product_id | cost  | pct  |
+-----------------+---------------+-------+------+
| 00805           |          1367 | 1.350 |  133 |
| gr309203 344400 |          1024 | 0.970 |  115 |
| Water1          |           344 | 0.190 |  121 |
+-----------------+---------------+-------+------+
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template