Problem

Table: Friendship

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user1_id      | int     |
| user2_id      | int     |
+---------------+---------+
(user1_id, user2_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates that there is a friendship relation between user1_id and user2_id.

Table: Likes

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| user_id     | int     |
| page_id     | int     |
+-------------+---------+
(user_id, page_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table indicates that user_id likes page_id.

Write a solution to recommend pages to the user with user_id = 1 using the pages that your friends liked. It should not recommend pages you already liked.

Return result table in any order without duplicates.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
Input: 
Friendship table:
+----------+----------+
| user1_id | user2_id |
+----------+----------+
| 1        | 2        |
| 1        | 3        |
| 1        | 4        |
| 2        | 3        |
| 2        | 4        |
| 2        | 5        |
| 6        | 1        |
+----------+----------+
Likes table:
+---------+---------+
| user_id | page_id |
+---------+---------+
| 1       | 88      |
| 2       | 23      |
| 3       | 24      |
| 4       | 56      |
| 5       | 11      |
| 6       | 33      |
| 2       | 77      |
| 3       | 77      |
| 6       | 88      |
+---------+---------+
Output: 
+------------------+
| recommended_page |
+------------------+
| 23               |
| 24               |
| 56               |
| 33               |
| 77               |
+------------------+
Explanation: 
User one is friend with users 2, 3, 4 and 6.
Suggested pages are 23 from user 2, 24 from user 3, 56 from user 3 and 33 from user 6.
Page 77 is suggested from both user 2 and user 3.
Page 88 is not suggested because user 1 already likes it.

Solution

Method 1 – SQL Join and Set Difference

Intuition

The main idea is to recommend to user 1 all pages liked by their friends, except those already liked by user 1. We use joins to find friends and their liked pages, and then filter out pages already liked by user 1.

Approach

  1. Find all friends of user 1 (friendship is bidirectional).
  2. Get all pages liked by these friends.
  3. Exclude pages already liked by user 1.
  4. Return the unique recommended pages.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT DISTINCT l.page_id AS recommended_page
FROM (
  SELECT user2_id AS friend_id FROM Friendship WHERE user1_id = 1
  UNION
  SELECT user1_id AS friend_id FROM Friendship WHERE user2_id = 1
) f
JOIN Likes l ON f.friend_id = l.user_id
WHERE l.page_id NOT IN (
  SELECT page_id FROM Likes WHERE user_id = 1
);
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
SELECT DISTINCT l.page_id AS recommended_page
FROM (
  SELECT user2_id AS friend_id FROM Friendship WHERE user1_id = 1
  UNION
  SELECT user1_id AS friend_id FROM Friendship WHERE user2_id = 1
) f
JOIN Likes l ON f.friend_id = l.user_id
WHERE l.page_id NOT IN (
  SELECT page_id FROM Likes WHERE user_id = 1
);
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
class Solution:
    def page_recommendations(self, friendship: 'pd.DataFrame', likes: 'pd.DataFrame') -> 'pd.DataFrame':
        # Find all friends of user 1
        f1 = friendship[friendship['user1_id'] == 1]['user2_id']
        f2 = friendship[friendship['user2_id'] == 1]['user1_id']
        friends = pd.concat([f1, f2]).unique()
        # Pages liked by friends
        friend_likes = likes[likes['user_id'].isin(friends)]
        # Pages liked by user 1
        user1_likes = set(likes[likes['user_id'] == 1]['page_id'])
        # Recommend pages not already liked by user 1
        rec = friend_likes[~friend_likes['page_id'].isin(user1_likes)]['page_id'].drop_duplicates()
        return pd.DataFrame({'recommended_page': rec})

Complexity

  • ⏰ Time complexity: O(F + L) where F is the number of friendships and L is the number of likes. Each join and filter is linear in the number of rows.
  • 🧺 Space complexity: O(F + L + R) where R is the number of recommended pages. We need space for the friends list, likes, and the result.