Problem

Table: UserActivity

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| username      | varchar |
| activity      | varchar |
| startDate     | Date    |
| endDate       | Date    |
+---------------+---------+
This table may contain duplicates rows.
This table contains information about the activity performed by each user in a period of time.
A person with username performed an activity from startDate to endDate.

Write a solution to show the second most recent activity of each user.

If the user only has one activity, return that one. A user cannot perform more than one activity at the same time.

Return the result table in any order.

The result format is in the following example.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
Input: 
UserActivity table:
+------------+--------------+-------------+-------------+
| username   | activity     | startDate   | endDate     |
+------------+--------------+-------------+-------------+
| Alice      | Travel       | 2020-02-12  | 2020-02-20  |
| Alice      | Dancing      | 2020-02-21  | 2020-02-23  |
| Alice      | Travel       | 2020-02-24  | 2020-02-28  |
| Bob        | Travel       | 2020-02-11  | 2020-02-18  |
+------------+--------------+-------------+-------------+
Output: 
+------------+--------------+-------------+-------------+
| username   | activity     | startDate   | endDate     |
+------------+--------------+-------------+-------------+
| Alice      | Dancing      | 2020-02-21  | 2020-02-23  |
| Bob        | Travel       | 2020-02-11  | 2020-02-18  |
+------------+--------------+-------------+-------------+
Explanation: 
The most recent activity of Alice is Travel from 2020-02-24 to 2020-02-28, before that she was dancing from 2020-02-21 to 2020-02-23.
Bob only has one record, we just take that one.

Solution

Method 1 – Window Functions and Sorting

Intuition

We want to find, for each user, the second most recent activity by endDate. If a user has only one activity, we return that one. This is a classic “nth row per group” problem, which can be solved efficiently using window functions in SQL or sorting/grouping in pandas.

Approach

  1. For each user, sort their activities by endDate in descending order (most recent first).
  2. Assign a row number to each activity per user.
  3. Select the activity with row number 2 (second most recent). If a user has only one activity, select that one.
  4. Return the required columns.

Code

1
2
3
4
5
6
7
8
SELECT username, activity, startDate, endDate
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY endDate DESC) AS rn,
           COUNT(*) OVER (PARTITION BY username) AS cnt
    FROM UserActivity
) t
WHERE rn = 2 OR (cnt = 1 AND rn = 1);
1
2
3
4
5
6
7
8
SELECT username, activity, startDate, endDate
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY endDate DESC) AS rn,
           COUNT(*) OVER (PARTITION BY username) AS cnt
    FROM UserActivity
) t
WHERE rn = 2 OR (cnt = 1 AND rn = 1);
1
2
3
4
5
6
7
8
class Solution:
    def get_second_most_recent_activity(self, df):
        # df: pandas.DataFrame with columns ['username', 'activity', 'startDate', 'endDate']
        df = df.sort_values(['username', 'endDate'], ascending=[True, False])
        df['rn'] = df.groupby('username').cumcount() + 1
        cnt = df.groupby('username')['activity'].transform('count')
        res = df[(df['rn'] == 2) | ((cnt == 1) & (df['rn'] == 1))]
        return res[['username', 'activity', 'startDate', 'endDate']]

Complexity

  • ⏰ Time complexity: O(n log n) — Sorting by user and endDate dominates.
  • 🧺 Space complexity: O(n) — For window function or groupby operations.