Leetcodify Friends Recommendations
Problem
Table: Listens
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| user_id | int |
| song_id | int |
| day | date |
+-------------+---------+
This table may contain duplicates (In other words, there is no primary key for this table in SQL).
Each row of this table indicates that the user user_id listened to the song song_id on the day day.
Table: Friendship
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user1_id | int |
| user2_id | int |
+---------------+---------+
In SQL,(user1_id, user2_id) is the primary key for this table.
Each row of this table indicates that the users user1_id and user2_id are friends.
Note that user1_id < user2_id.
Recommend friends to Leetcodify users. We recommend user x to user y if:
- Users
xandyare not friends, and - Users
xandylistened to the same three or more different songs on the same day.
Note that friend recommendations are unidirectional , meaning if user x
and user y should be recommended to each other, the result table should have both user x recommended to user y and user y recommended to user x.
Also, note that the result table should not contain duplicates (i.e., user y
should not be recommended to user x multiple times.).
Return the result table in any order.
The result format is in the following example.
Examples
Example 1:
Input:
Listens table:
+---------+---------+------------+
| user_id | song_id | day |
+---------+---------+------------+
| 1 | 10 | 2021-03-15 |
| 1 | 11 | 2021-03-15 |
| 1 | 12 | 2021-03-15 |
| 2 | 10 | 2021-03-15 |
| 2 | 11 | 2021-03-15 |
| 2 | 12 | 2021-03-15 |
| 3 | 10 | 2021-03-15 |
| 3 | 11 | 2021-03-15 |
| 3 | 12 | 2021-03-15 |
| 4 | 10 | 2021-03-15 |
| 4 | 11 | 2021-03-15 |
| 4 | 13 | 2021-03-15 |
| 5 | 10 | 2021-03-16 |
| 5 | 11 | 2021-03-16 |
| 5 | 12 | 2021-03-16 |
+---------+---------+------------+
Friendship table:
+----------+----------+
| user1_id | user2_id |
+----------+----------+
| 1 | 2 |
+----------+----------+
Output:
+---------+----------------+
| user_id | recommended_id |
+---------+----------------+
| 1 | 3 |
| 2 | 3 |
| 3 | 1 |
| 3 | 2 |
+---------+----------------+
Explanation:
Users 1 and 2 listened to songs 10, 11, and 12 on the same day, but they are already friends.
Users 1 and 3 listened to songs 10, 11, and 12 on the same day. Since they are not friends, we recommend them to each other.
Users 1 and 4 did not listen to the same three songs.
Users 1 and 5 listened to songs 10, 11, and 12, but on different days.
Similarly, we can see that users 2 and 3 listened to songs 10, 11, and 12 on the same day and are not friends, so we recommend them to each other.
Solution
Method 1 – Self Join, Grouping, and Anti-Join
Intuition
We want to recommend users to each other if they are not already friends and have listened to at least three of the same songs on the same day. We can use a self-join on the Listens table to find such user pairs, group by user pairs and day, and count the number of common songs. Then, we filter out pairs who are already friends.
Approach
- Self-join the Listens table on day and song_id to find all user pairs who listened to the same song on the same day.
- Only consider pairs where user_id < other_user_id to avoid duplicates.
- Group by user pairs and day, and count the number of common songs.
- Filter to pairs with at least 3 common songs.
- Exclude pairs who are already friends (using anti-join with Friendship table).
- Output both (user_id, recommended_id) and (recommended_id, user_id) for each valid pair.
Code
MySQL
WITH common_songs AS (
SELECT a.user_id AS user1, b.user_id AS user2, a.day
FROM Listens a
JOIN Listens b ON a.day = b.day AND a.song_id = b.song_id AND a.user_id < b.user_id
GROUP BY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
SELECT user1, user2, day, COUNT(*) AS cnt
FROM common_songs
GROUP BY user1, user2, day
HAVING cnt >= 3
),
not_friends AS (
SELECT user1, user2 FROM user_pairs
LEFT JOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
WHERE f.user1_id IS NULL
)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNION
SELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
PostgreSQL
WITH common_songs AS (
SELECT a.user_id AS user1, b.user_id AS user2, a.day
FROM Listens a
JOIN Listens b ON a.day = b.day AND a.song_id = b.song_id AND a.user_id < b.user_id
GROUP BY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
SELECT user1, user2, day, COUNT(*) AS cnt
FROM common_songs
GROUP BY user1, user2, day
HAVING COUNT(*) >= 3
),
not_friends AS (
SELECT user1, user2 FROM user_pairs
LEFT JOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
WHERE f.user1_id IS NULL
)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNION
SELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
Python (Pandas)
def recommend_friends(listens_df, friendship_df):
import pandas as pd
# Merge on day and song_id, filter user pairs
merged = listens_df.merge(listens_df, on=['day', 'song_id'])
merged = merged[merged['user_id_x'] < merged['user_id_y']]
# Count common songs per user pair per day
grouped = merged.groupby(['user_id_x', 'user_id_y', 'day']).size().reset_index(name='cnt')
filtered = grouped[grouped['cnt'] >= 3]
# Remove pairs who are already friends
friends = set(tuple(x) for x in friendship_df[['user1_id', 'user2_id']].values)
recs = []
for _, row in filtered.iterrows():
u, v = row['user_id_x'], row['user_id_y']
if (u, v) not in friends:
recs.append((u, v))
recs.append((v, u))
recs_df = pd.DataFrame(recs, columns=['user_id', 'recommended_id']).drop_duplicates()
return recs_df
Complexity
- ⏰ Time complexity:
O(n^2)for the self-join, where n is the number of listens. Grouping and filtering are also O(n^2) in the worst case. - 🧺 Space complexity:
O(n^2)for storing all user pairs and intermediate results.