Queries Quality and Percentage
Problem
Table: Queries
+-------------+---------+ | Column Name | Type | +-------------+---------+ | query_name | varchar | | result | varchar | | position | int | | rating | int | +-------------+---------+ This table may have duplicate rows. This table contains information collected from some queries on a database. The position column has a value from 1 to 500. The rating column has a value from 1 to 5. Query with rating less than 3 is a poor query.
We define query quality as:
The average of the ratio between query rating and its position.
We also define poor query percentage as:
The percentage of all queries with rating less than 3.
Write a solution to find each query_name, the quality and
poor_query_percentage.
Both quality and poor_query_percentage should be rounded to 2 decimal places.
Return the result table in any order.
The result format is in the following example.
Examples
Example 1
Input:
Queries table:
+------------+-------------------+----------+--------+
| query_name | result | position | rating |
+------------+-------------------+----------+--------+
| Dog | Golden Retriever | 1 | 5 |
| Dog | German Shepherd | 2 | 5 |
| Dog | Mule | 200 | 1 |
| Cat | Shirazi | 5 | 2 |
| Cat | Siamese | 3 | 3 |
| Cat | Sphynx | 7 | 4 |
+------------+-------------------+----------+--------+
Output:
+------------+---------+-----------------------+
| query_name | quality | poor_query_percentage |
+------------+---------+-----------------------+
| Dog | 2.50 | 33.33 |
| Cat | 0.66 | 33.33 |
+------------+---------+-----------------------+
Explanation:
Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog queries poor_ query_percentage is (1 / 3) * 100 = 33.33
Cat queries quality equals ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat queries poor_ query_percentage is (1 / 3) * 100 = 33.33
## Solution
### Method 1 – Aggregation and Conditional Counting
#### Intuition
To compute the required metrics for each query name, we need to aggregate the data by `query_name`, calculate the average of `rating/position`, and count the percentage of poor queries (rating < 3).
#### Approach
1. Group the table by `query_name`.
2. For each group, calculate:
- The average of `rating/position` for quality.
- The percentage of rows with `rating < 3` for poor query percentage.
3. Round both results to two decimal places.
#### Code
{{< code_tabs >}}
##### MySQL
```sql
SELECT
query_name,
ROUND(AVG(rating / position), 2) AS quality,
ROUND(SUM(rating < 3) / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
PostgreSQL
SELECT
query_name,
ROUND(AVG(rating::decimal / position), 2) AS quality,
ROUND(SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END)::decimal / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
Python (pandas)
import pandas as pd
def queries_quality_and_percentage(queries: pd.DataFrame) -> pd.DataFrame:
def agg_fn(df):
quality = round((df['rating'] / df['position']).mean(), 2)
poor_pct = round((df['rating'] < 3).mean() * 100, 2)
return pd.Series({'quality': quality, 'poor_query_percentage': poor_pct})
return queries.groupby('query_name').apply(agg_fn).reset_index()
Complexity
- ⏰ Time complexity:
O(n), where n is the number of rows in the table, since each row is processed once per group. - 🧺 Space complexity:
O(k), where k is the number of unique query names (groups).