Dynamic Unpivoting of a Table
Problem
Table: Products
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| product_id | int |
| store_name1 | int |
| store_name2 | int |
| : | int |
| : | int |
| : | int |
| store_namen | int |
+-------------+---------+
product_id is the primary key for this table.
Each row in this table indicates the product's price in n different stores.
If the product is not available in a store, the price will be null in that store's column.
The names of the stores may change from one testcase to another. There will be at least 1 store and at most 30 stores.
Important note: This problem targets those who have a good experience with SQL. If you are a beginner, we recommend that you skip it for now.
Implement the procedure UnpivotProducts to reorganize the Products table so that each row has the id of one product, the name of a store where it is sold, and its price in that store. If a product is not available in a store, do not include a row with that product_id and store combination in the result table. There should be three columns: product_id, store, and
price.
The procedure should return the table after reorganizing it.
Return the result table in any order.
The query result format is in the following example.
Examples
Example 1:
Input:
Products table:
+------------+----------+--------+------+------+
| product_id | LC_Store | Nozama | Shop | Souq |
+------------+----------+--------+------+------+
| 1 | 100 | null | 110 | null |
| 2 | null | 200 | null | 190 |
| 3 | null | null | 1000 | 1900 |
+------------+----------+--------+------+------+
Output:
+------------+----------+-------+
| product_id | store | price |
+------------+----------+-------+
| 1 | LC_Store | 100 |
| 1 | Shop | 110 |
| 2 | Nozama | 200 |
| 2 | Souq | 190 |
| 3 | Shop | 1000 |
| 3 | Souq | 1900 |
+------------+----------+-------+
Explanation:
Product 1 is sold in LC_Store and Shop with prices of 100 and 110 respectively.
Product 2 is sold in Nozama and Souq with prices of 200 and 190.
Product 3 is sold in Shop and Souq with prices of 1000 and 1900.
Solution
Method 1 – Dynamic SQL Unpivot
Intuition
To unpivot a table with an unknown set of store columns, we need to dynamically generate SQL that selects each store column as a row, skipping nulls. In pandas, we can use melt and drop nulls.
Approach
- Get the list of store columns (all columns except
product_id). - For SQL, dynamically generate a UNION ALL query for each store column, selecting only non-null prices.
- For pandas, use
meltto unpivot, then drop rows with null prices.
Code
MySQL
SET SESSION group_concat_max_len = 1000000;
SET @sql = NULL;
SELECT GROUP_CONCAT(
CONCAT('SELECT product_id, ''', COLUMN_NAME, ''' AS store, ', COLUMN_NAME, ' AS price FROM Products WHERE ', COLUMN_NAME, ' IS NOT NULL')
SEPARATOR ' UNION ALL '
) INTO @sql
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'Products' AND COLUMN_NAME != 'product_id';
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
PostgreSQL
DO $$
DECLARE
col_list text;
dyn_sql text;
BEGIN
SELECT string_agg(
format('SELECT product_id, %L AS store, %I AS price FROM Products WHERE %I IS NOT NULL', column_name, column_name, column_name),
' UNION ALL '
) INTO col_list
FROM information_schema.columns
WHERE table_name = 'products' AND column_name != 'product_id';
dyn_sql := col_list;
EXECUTE dyn_sql;
END $$;
Python (pandas)
import pandas as pd
def unpivot_products(products: pd.DataFrame) -> pd.DataFrame:
df = products.melt(id_vars=['product_id'], var_name='store', value_name='price')
return df.dropna(subset=['price'])
Complexity
- ⏰ Time complexity:
O(n * s), where n is the number of products and s is the number of stores. - 🧺 Space complexity:
O(n * s), for the output table.