+---------------+---------+
| Column Name | Type |
+---------------+---------+
| username | varchar |
| activity | varchar |
| startDate | Date |
| endDate | Date |
+---------------+---------+
This table may contain duplicates rows.
This table contains information about the activity performed by each user in a period of time.
A person with username performed an activity from startDate to endDate.
Write a solution to show the second most recent activity of each user.
If the user only has one activity, return that one. A user cannot perform more than one activity at the same time.
Input:
UserActivity table:+------------+--------------+-------------+-------------+| username | activity | startDate | endDate |+------------+--------------+-------------+-------------+| Alice | Travel |2020-02-12|2020-02-20|| Alice | Dancing |2020-02-21|2020-02-23|| Alice | Travel |2020-02-24|2020-02-28|| Bob | Travel |2020-02-11|2020-02-18|+------------+--------------+-------------+-------------+Output:
+------------+--------------+-------------+-------------+| username | activity | startDate | endDate |+------------+--------------+-------------+-------------+| Alice | Dancing |2020-02-21|2020-02-23|| Bob | Travel |2020-02-11|2020-02-18|+------------+--------------+-------------+-------------+Explanation:
The most recent activity of Alice is Travel from 2020-02-24 to 2020-02-28, before that she was dancing from 2020-02-21 to 2020-02-23.Bob only has one record, we just take that one.
We want to find, for each user, the second most recent activity by endDate. If a user has only one activity, we return that one. This is a classic “nth row per group” problem, which can be solved efficiently using window functions in SQL or sorting/grouping in pandas.
SELECT username, activity, startDate, endDate
FROM (
SELECT*,
ROW_NUMBER() OVER (PARTITION BY username ORDERBY endDate DESC) AS rn,
COUNT(*) OVER (PARTITION BY username) AS cnt
FROM UserActivity
) t
WHERE rn =2OR (cnt =1AND rn =1);
1
2
3
4
5
6
7
8
SELECT username, activity, startDate, endDate
FROM (
SELECT*,
ROW_NUMBER() OVER (PARTITION BY username ORDERBY endDate DESC) AS rn,
COUNT(*) OVER (PARTITION BY username) AS cnt
FROM UserActivity
) t
WHERE rn =2OR (cnt =1AND rn =1);