+-------------+---------+
| Column Name | Type |
+-------------+---------+
| user_id | int |
| song_id | int |
| day | date |
+-------------+---------+
This table may contain duplicates (In other words, there is no primary key for this table in SQL).
Each row of this table indicates that the user user_id listened to the song song_id on the day day.
Table: Friendship
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user1_id | int |
| user2_id | int |
+---------------+---------+
In SQL,(user1_id, user2_id) is the primary key for this table.
Each row of this table indicates that the users user1_id and user2_id are friends.
Note that user1_id < user2_id.
Recommend friends to Leetcodify users. We recommend user x to user y if:
Users x and y are not friends, and
Users x and y listened to the same three or more different songs on the same day.
Note that friend recommendations are unidirectional , meaning if user x
and user y should be recommended to each other, the result table should have both user x recommended to user y and user y recommended to user x.
Also, note that the result table should not contain duplicates (i.e., user y
should not be recommended to user x multiple times.).
Input:
Listens table:+---------+---------+------------+| user_id | song_id | day |+---------+---------+------------+|1|10|2021-03-15||1|11|2021-03-15||1|12|2021-03-15||2|10|2021-03-15||2|11|2021-03-15||2|12|2021-03-15||3|10|2021-03-15||3|11|2021-03-15||3|12|2021-03-15||4|10|2021-03-15||4|11|2021-03-15||4|13|2021-03-15||5|10|2021-03-16||5|11|2021-03-16||5|12|2021-03-16|+---------+---------+------------+Friendship table:+----------+----------+| user1_id | user2_id |+----------+----------+|1|2|+----------+----------+Output:
+---------+----------------+| user_id | recommended_id |+---------+----------------+|1|3||2|3||3|1||3|2|+---------+----------------+Explanation:
Users 1 and 2 listened to songs 10,11, and 12 on the same day, but they are already friends.Users 1 and 3 listened to songs 10,11, and 12 on the same day. Since they are not friends, we recommend them to each other.Users 1 and 4 did not listen to the same three songs.Users 1 and 5 listened to songs 10,11, and 12, but on different days.Similarly, we can see that users 2 and 3 listened to songs 10,11, and 12 on the same day and are not friends, so we recommend them to each other.
We want to recommend users to each other if they are not already friends and have listened to at least three of the same songs on the same day. We can use a self-join on the Listens table to find such user pairs, group by user pairs and day, and count the number of common songs. Then, we filter out pairs who are already friends.
WITH common_songs AS (
SELECT a.user_id AS user1, b.user_id AS user2, a.dayFROM Listens a
JOIN Listens b ON a.day= b.dayAND a.song_id = b.song_id AND a.user_id < b.user_id
GROUPBY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
SELECT user1, user2, day, COUNT(*) AS cnt
FROM common_songs
GROUPBY user1, user2, dayHAVING cnt >=3),
not_friends AS (
SELECT user1, user2 FROM user_pairs
LEFTJOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
WHERE f.user1_id ISNULL)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNIONSELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
WITH common_songs AS (
SELECT a.user_id AS user1, b.user_id AS user2, a.dayFROM Listens a
JOIN Listens b ON a.day= b.dayAND a.song_id = b.song_id AND a.user_id < b.user_id
GROUPBY a.user_id, b.user_id, a.day, a.song_id
),
user_pairs AS (
SELECT user1, user2, day, COUNT(*) AS cnt
FROM common_songs
GROUPBY user1, user2, dayHAVINGCOUNT(*) >=3),
not_friends AS (
SELECT user1, user2 FROM user_pairs
LEFTJOIN Friendship f ON user1 = f.user1_id AND user2 = f.user2_id
WHERE f.user1_id ISNULL)
SELECT user1 AS user_id, user2 AS recommended_id FROM not_friends
UNIONSELECT user2 AS user_id, user1 AS recommended_id FROM not_friends;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
defrecommend_friends(listens_df, friendship_df):
import pandas as pd
# Merge on day and song_id, filter user pairs merged = listens_df.merge(listens_df, on=['day', 'song_id'])
merged = merged[merged['user_id_x'] < merged['user_id_y']]
# Count common songs per user pair per day grouped = merged.groupby(['user_id_x', 'user_id_y', 'day']).size().reset_index(name='cnt')
filtered = grouped[grouped['cnt'] >=3]
# Remove pairs who are already friends friends = set(tuple(x) for x in friendship_df[['user1_id', 'user2_id']].values)
recs = []
for _, row in filtered.iterrows():
u, v = row['user_id_x'], row['user_id_y']
if (u, v) notin friends:
recs.append((u, v))
recs.append((v, u))
recs_df = pd.DataFrame(recs, columns=['user_id', 'recommended_id']).drop_duplicates()
return recs_df