Problem

Table: Queries

+————-+———+ | Column Name | Type | +————-+———+ | query_name | varchar | | result | varchar | | position | int | | rating | int | +————-+———+ This table may have duplicate rows. This table contains information collected from some queries on a database. The position column has a value from 1 to 500. The rating column has a value from 1 to 5. Query with rating less than 3 is a poor query.

We define query quality as:

The average of the ratio between query rating and its position.

We also define poor query percentage as:

The percentage of all queries with rating less than 3.

Write a solution to find each query_name, the quality and poor_query_percentage.

Both quality and poor_query_percentage should be rounded to 2 decimal places.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
Input: 
Queries table:
+------------+-------------------+----------+--------+
| query_name | result            | position | rating |
+------------+-------------------+----------+--------+
| Dog        | Golden Retriever  | 1        | 5      |
| Dog        | German Shepherd   | 2        | 5      |
| Dog        | Mule              | 200      | 1      |
| Cat        | Shirazi           | 5        | 2      |
| Cat        | Siamese           | 3        | 3      |
| Cat        | Sphynx            | 7        | 4      |
+------------+-------------------+----------+--------+
Output: 
+------------+---------+-----------------------+
| query_name | quality | poor_query_percentage |
+------------+---------+-----------------------+
| Dog        | 2.50    | 33.33                 |
| Cat        | 0.66    | 33.33                 |
+------------+---------+-----------------------+
Explanation: 
Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog queries poor_ query_percentage is (1 / 3) * 100 = 33.33

Cat queries quality equals ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat queries poor_ query_percentage is (1 / 3) * 100 = 33.33

## Solution

### Method 1  Aggregation and Conditional Counting

#### Intuition

To compute the required metrics for each query name, we need to aggregate the data by `query_name`, calculate the average of `rating/position`, and count the percentage of poor queries (rating < 3).

#### Approach

1. Group the table by `query_name`.
2. For each group, calculate:
    - The average of `rating/position` for quality.
    - The percentage of rows with `rating < 3` for poor query percentage.
3. Round both results to two decimal places.

#### Code

1
2
3
4
5
6
SELECT
  query_name,
  ROUND(AVG(rating / position), 2) AS quality,
  ROUND(SUM(rating < 3) / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
1
2
3
4
5
6
SELECT
  query_name,
  ROUND(AVG(rating::decimal / position), 2) AS quality,
  ROUND(SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END)::decimal / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
1
2
3
4
5
6
7
import pandas as pd
def queries_quality_and_percentage(queries: pd.DataFrame) -> pd.DataFrame:
    def agg_fn(df):
        quality = round((df['rating'] / df['position']).mean(), 2)
        poor_pct = round((df['rating'] < 3).mean() * 100, 2)
        return pd.Series({'quality': quality, 'poor_query_percentage': poor_pct})
    return queries.groupby('query_name').apply(agg_fn).reset_index()
#### Complexity - Time complexity: `O(n)`, where n is the number of rows in the table, since each row is processed once per group. - 🧺 Space complexity: `O(k)`, where k is the number of unique query names (groups).