SQL: Need aggregated pivot view with multiple fields with boolean values
P粉969253139
P粉969253139 2023-09-07 19:47:09
0
1
614

given: I have a source table that contains different attributes for footwear in multiple boolean fields that indicate whether the footwear for that attribute is available. 1=Available, 0=Not available. Sample data is as follows -

Prod_id s_7 s_8 s_9 s_10 c_white c_black c_blue c_brown c_other t_shoes t_Sandals t_slippers ...
001 0 1 0 1 1 1 0 1 0 1 0 0 ...
002 1 1 0 0 1 0 1 0 0 0 0 1 ...
003 0 1 1 0 0 1 1 0 1 1 0 0 ...
004 0 0 1 1 0 0 1 1 0 0 1 1 ...
005 1 0 1 0 1 1 1 0 0 0 0 1 ...
006 0 1 1 1 0 1 0 1 1 1 0 0 ...
007 0 0 1 1 1 1 0 0 0 1 0 0 ...
008 0 1 1 0 0 1 0 1 1 0 0 1 ...
009 1 1 1 0 0 0 1 0 1 0 1 0 ...
... ... ... ... ... ... ... ... ... ... ... ... ... ...

There are also some attribute columns, such as o_casual, o_formal, o_ethnic, m_canvas, m_leather, m_silicon, p_plain, p_textured, p_funky, etc., and all attribute columns have binary values ​​of their respective prod_id. There are over 50k prod_ids.

ask: I need a rotated distribution of the availability of each pair of attributes, and the cell value should indicate the count of prod_id for a given pair at the intersection. For example, a prod_id with attributes "s_8" and "c_black" has a count of 4 (where s_8=1 and c_black=1). It should appear in the view below, which can be useful for further analysis.

Attributes s_7 s_8 s_9 s_10 c_white c_black c_blue c_brown c_other t_shoes t_Sandals t_slippers ...
s_7 3 2 2 0 2 1 3 0 1 0 1 2 ...
s_8 2 6 4 2 2 4 3 3 4 3 1 2 ...
s_9 2 4 7 3 2 5 4 3 4 3 2 3 ...
s_10 0 ... ... ... ... ... ... ... ... ... ... ... ...
c_white 2 ... ...
c_black 1 ... ...
c_blue 3 ... ...
c_Brown 0 ... ...
c_other 1 ... ...
t_shoes 0 ... ...
t_Sandals 1 ... ...
t_slippers 2 ... ...
... ... ... ... ... ... ... ... ... ... ... ... ... ...

Intuitively, the output table is mirrored across the diagonal, and the diagonal cells will contain the total number of prod_ids for that particular attribute.

I am new to SQL. Please help me find a logic through which I can achieve the desired pivot view. Note that the data is large, so consider scalable solutions. If the question is unclear and you need some explanation to clarify, please let me know.

P粉969253139
P粉969253139

reply all(1)
P粉022723606

You can use a query like -

SELECT
    's_7' `attribute`,
    SUM(`s_7` = 1) `s_7`,
    SUM(`s_8` = 1) `s_8`,
    SUM(`s_9` = 1) `s_9`,
    SUM(`s_10` = 1) `s_10`,
    SUM(`c_white` = 1) `c_white`,
    SUM(`c_black` = 1) `c_black`,
    SUM(`c_blue` = 1) `c_blue`,
    SUM(`c_brown` = 1) `c_brown`,
    SUM(`c_other` = 1) `c_other`,
    SUM(`t_shoes` = 1) `t_shoes`,
    SUM(`t_sandals` = 1) `t_sandals`,
    SUM(`t_slippers` = 1) `t_slippers`
FROM t1 WHERE `s_7` = 1
UNION ALL
SELECT
    's_8' `attribute`,
    SUM(`s_7` = 1) `s_7`,
    SUM(`s_8` = 1) `s_8`,
    SUM(`s_9` = 1) `s_9`,
    SUM(`s_10` = 1) `s_10`,
    SUM(`c_white` = 1) `c_white`,
    SUM(`c_black` = 1) `c_black`,
    SUM(`c_blue` = 1) `c_blue`,
    SUM(`c_brown` = 1) `c_brown`,
    SUM(`c_other` = 1) `c_other`,
    SUM(`t_shoes` = 1) `t_shoes`,
    SUM(`t_sandals` = 1) `t_sandals`,
    SUM(`t_slippers` = 1) `t_slippers`
FROM t1 WHERE `s_8` = 1
UNION ALL
SELECT
    's_9' `attribute`,
...

Easily built in the language of your choice. Here is a simple example using PHP -

<?php

$attributes = ['s_7', 's_8', 's_9', 's_10', 'c_white', 'c_black', 'c_blue', 'c_brown', 'c_other', 't_shoes', 't_sandals', 't_slippers'];
$sql = null;
foreach ($attributes as $attribute) {
    if ($sql) {
        $sql .= ' UNION ALL ';
    }
    $sql .= "SELECT '$attribute' `attribute`";
    foreach ($attributes as $attr) {
        $sql .= ", SUM(`$attr` = 1) `$attr`";
    }
    $sql .= " FROM t1 WHERE `$attribute` = 1";
}
 echo $sql;
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template