Queries Quality and Percentage

## Problem

Table: Queries

We define query quality as:

The average of the ratio between query rating and its position.

We also define poor query percentage as:

The percentage of all queries with rating less than 3.

Write a solution to find each query_name, the quality and poor_query_percentage.

Both quality and poor_query_percentage should be rounded to 2 decimal places.

Return the result table in any order.

The result format is in the following example.

## Examples

### Example 1

Input: 
Queries table:
+------------+-------------------+----------+--------+
| query_name | result            | position | rating |
+------------+-------------------+----------+--------+
| Dog        | Golden Retriever  | 1        | 5      |
| Dog        | German Shepherd   | 2        | 5      |
| Dog        | Mule              | 200      | 1      |
| Cat        | Shirazi           | 5        | 2      |
| Cat        | Siamese           | 3        | 3      |
| Cat        | Sphynx            | 7        | 4      |
+------------+-------------------+----------+--------+
Output: 
+------------+---------+-----------------------+
| query_name | quality | poor_query_percentage |
+------------+---------+-----------------------+
| Dog        | 2.50    | 33.33                 |
| Cat        | 0.66    | 33.33                 |
+------------+---------+-----------------------+
Explanation: 
Dog queries quality is ((5 / 1) + (5 / 2) + (1 / 200)) / 3 = 2.50
Dog queries poor_query_percentage is (1 / 3) * 100 = 33.33

Cat queries quality is ((2 / 5) + (3 / 3) + (4 / 7)) / 3 = 0.66
Cat queries poor_query_percentage is (1 / 3) * 100 = 33.33
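
The arithmetic in the explanation can be checked with a few lines of plain Python. This is only a sanity check on the example rows; the helper names `quality` and `poor_query_percentage` are illustrative and not part of the problem statement.

```python
# Each tuple is (rating, position), copied from the example input.
dog = [(5, 1), (5, 2), (1, 200)]
cat = [(2, 5), (3, 3), (4, 7)]


def quality(rows):
    # Mean of rating / position over the rows of one query_name.
    return round(sum(r / p for r, p in rows) / len(rows), 2)


def poor_query_percentage(rows):
    # Share of rows with rating < 3, expressed as a percentage.
    return round(sum(1 for r, _ in rows if r < 3) / len(rows) * 100, 2)


print(quality(dog), poor_query_percentage(dog))  # 2.5 33.33
print(quality(cat), poor_query_percentage(cat))  # 0.66 33.33
```

Python prints `2.5` rather than `2.50` because floats carry no trailing zeros; the value is the same.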

## Solution

### Method 1 – Aggregation and Conditional Counting

#### Intuition

Both metrics are per-group aggregates: group the rows by `query_name`, average `rating / position` to get the quality, and compute the share of rows with `rating < 3` to get the poor query percentage.

#### Approach

1. Group the table by `query_name`.
2. For each group, calculate:
    - The average of `rating/position` for quality.
    - The percentage of rows with `rating < 3` for poor query percentage.
3. Round both results to two decimal places.
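
The steps above can be sketched without a database as a single pass that keeps running totals per group. This is a minimal illustration, assuming rows arrive as `(query_name, position, rating)` tuples; the function name `summarize` is hypothetical.

```python
from collections import defaultdict


def summarize(rows):
    """rows: iterable of (query_name, position, rating) tuples."""
    # Per group: [sum of rating / position, number of poor rows, total rows]
    acc = defaultdict(lambda: [0.0, 0, 0])
    for name, position, rating in rows:
        group = acc[name]
        group[0] += rating / position
        group[1] += rating < 3  # a bool adds as 0 or 1
        group[2] += 1
    # Turn the totals into (quality, poor_query_percentage), rounded to 2 places.
    return {
        name: (round(ratio_sum / n, 2), round(poor / n * 100, 2))
        for name, (ratio_sum, poor, n) in acc.items()
    }


sample = [
    ("Dog", 1, 5), ("Dog", 2, 5), ("Dog", 200, 1),
    ("Cat", 5, 2), ("Cat", 3, 3), ("Cat", 7, 4),
]
print(summarize(sample))
```

Keeping only three numbers per group is what lets the aggregation finish in one scan of the rows.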

#### Code

{{< code_tabs >}}
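##### MySQL
```sql
SELECT
  query_name,
  -- `/` performs decimal division in MySQL, so no cast is needed
  ROUND(AVG(rating / position), 2) AS quality,
  -- `rating < 3` evaluates to 1 or 0, so SUM counts the poor rows
  ROUND(SUM(rating < 3) / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
```

##### PostgreSQL
```sql
SELECT
  query_name,
  -- Cast to decimal first; integer / integer would truncate in PostgreSQL
  ROUND(AVG(rating::decimal / position), 2) AS quality,
  ROUND(SUM(CASE WHEN rating < 3 THEN 1 ELSE 0 END)::decimal / COUNT(*) * 100, 2) AS poor_query_percentage
FROM Queries
GROUP BY query_name;
```

##### Python (pandas)
```python
import pandas as pd


def queries_quality_and_percentage(queries: pd.DataFrame) -> pd.DataFrame:
    def agg_fn(df: pd.DataFrame) -> pd.Series:
        # Mean of the per-row rating / position ratio
        quality = round((df['rating'] / df['position']).mean(), 2)
        # The mean of a boolean mask is the fraction of poor rows
        poor_pct = round((df['rating'] < 3).mean() * 100, 2)
        return pd.Series({'quality': quality, 'poor_query_percentage': poor_pct})

    return queries.groupby('query_name').apply(agg_fn).reset_index()
```
{{< /code_tabs >}}

#### Complexity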

- ⏰ Time complexity: `O(n)`, where `n` is the number of rows in the table, since each row is read once during the grouped aggregation.
- 🧺 Space complexity: `O(k)`, where `k` is the number of unique query names (groups).