Problem

Table: Listens

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| song_id     | int     |
| day         | date    |
+-------------+---------+
This table may contain duplicates (In other words, there is no primary key for this table in SQL).
Each row of this table indicates that the user user_id listened to the song song_id on the day day.

Table: Friendship

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user1_id      | int     |
| user2_id      | int     |
+---------------+---------+
In SQL,(user1_id, user2_id) is the primary key for this table.
Each row of this table indicates that the users user1_id and user2_id are friends.
Note that user1_id < user2_id.

Recommend friends to Leetcodify users. We recommend user x to user y if:

  • Users x and y are not friends, and
  • Users x and y listened to the same three or more different songs on the same day.

Note that friend recommendations are unidirectional , meaning if user x and user y should be recommended to each other, the result table should have both user x recommended to user y and user y recommended to user x. Also, note that the result table should not contain duplicates (i.e., user y should not be recommended to user x multiple times.).

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
Input: 
Listens table:
+---------+---------+------------+
| user_id | song_id | day        |
+---------+---------+------------+
| 1       | 10      | 2021-03-15 |
| 1       | 11      | 2021-03-15 |
| 1       | 12      | 2021-03-15 |
| 2       | 10      | 2021-03-15 |
| 2       | 11      | 2021-03-15 |
| 2       | 12      | 2021-03-15 |
| 3       | 10      | 2021-03-15 |
| 3       | 11      | 2021-03-15 |
| 3       | 12      | 2021-03-15 |
| 4       | 10      | 2021-03-15 |
| 4       | 11      | 2021-03-15 |
| 4       | 13      | 2021-03-15 |
| 5       | 10      | 2021-03-16 |
| 5       | 11      | 2021-03-16 |
| 5       | 12      | 2021-03-16 |
+---------+---------+------------+
Friendship table:
+----------+----------+
| user1_id | user2_id |
+----------+----------+
| 1        | 2        |
+----------+----------+
Output: 
+---------+----------------+
| user_id | recommended_id |
+---------+----------------+
| 1       | 3              |
| 2       | 3              |
| 3       | 1              |
| 3       | 2              |
+---------+----------------+
Explanation: 
Users 1 and 2 listened to songs 10, 11, and 12 on the same day, but they are already friends.
Users 1 and 3 listened to songs 10, 11, and 12 on the same day. Since they are not friends, we recommend them to each other.
Users 1 and 4 did not listen to the same three songs.
Users 1 and 5 listened to songs 10, 11, and 12, but on different days.
Similarly, we can see that users 2 and 3 listened to songs 10, 11, and 12 on the same day and are not friends, so we recommend them to each other.

Solution

Method 1 – Self Join, Grouping, and Anti-Join

Intuition

We want to recommend users to each other if they are not already friends and have listened to at least three of the same songs on the same day. We can use a self-join on the Listens table to find such user pairs, group by user pairs and day, and count the number of common songs. Then, we filter out pairs who are already friends.

Approach

  1. Self-join the Listens table on day and song_id to find all user pairs who listened to the same song on the same day.
  2. Only consider pairs where user_id < other_user_id to avoid duplicates.
  3. Group by user pairs and day, and count the number of common songs.
  4. Filter to pairs with at least 3 common songs.
  5. Exclude pairs who are already friends (using anti-join with Friendship table).
  6. Output both (user_id, recommended_id) and (recommended_id, user_id) for each valid pair.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
WITH common_songs AS (
  SELECT a.user_id AS user1, b.user_id AS user2, a.day
  FROM Listens a
  JOIN Listens b ON a.day = b.day AND a.song_id = b.song_id AND a.user_id < b.user_id
  GROUP BY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
  SELECT user1, user2, day, COUNT(*) AS cnt
  FROM common_songs
  GROUP BY user1, user2, day
  HAVING cnt >= 3
),
not_friends AS (
  SELECT user1, user2 FROM user_pairs
  LEFT JOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
  WHERE f.user1_id IS NULL
)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNION
SELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
WITH common_songs AS (
  SELECT a.user_id AS user1, b.user_id AS user2, a.day
  FROM Listens a
  JOIN Listens b ON a.day = b.day AND a.song_id = b.song_id AND a.user_id < b.user_id
  GROUP BY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
  SELECT user1, user2, day, COUNT(*) AS cnt
  FROM common_songs
  GROUP BY user1, user2, day
  HAVING COUNT(*) >= 3
),
not_friends AS (
  SELECT user1, user2 FROM user_pairs
  LEFT JOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
  WHERE f.user1_id IS NULL
)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNION
SELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
def recommend_friends(listens_df, friendship_df):
    import pandas as pd
    # Merge on day and song_id, filter user pairs
    merged = listens_df.merge(listens_df, on=['day', 'song_id'])
    merged = merged[merged['user_id_x'] < merged['user_id_y']]
    # Count common songs per user pair per day
    grouped = merged.groupby(['user_id_x', 'user_id_y', 'day']).size().reset_index(name='cnt')
    filtered = grouped[grouped['cnt'] >= 3]
    # Remove pairs who are already friends
    friends = set(tuple(x) for x in friendship_df[['user1_id', 'user2_id']].values)
    recs = []
    for _, row in filtered.iterrows():
        u, v = row['user_id_x'], row['user_id_y']
        if (u, v) not in friends:
            recs.append((u, v))
            recs.append((v, u))
    recs_df = pd.DataFrame(recs, columns=['user_id', 'recommended_id']).drop_duplicates()
    return recs_df

Complexity

  • ⏰ Time complexity: O(n^2) for the self-join, where n is the number of listens. Grouping and filtering are also O(n^2) in the worst case.
  • 🧺 Space complexity: O(n^2) for storing all user pairs and intermediate results.