+---------------+---------+
|Column Name |Type|+---------------+---------+
| id | int || name | varchar |+---------------+---------+
id is the primarykey (columnwithuniquevalues) for this table.
This tablecontains the account id and the user name ofeach account.
Table: Logins
1
2
3
4
5
6
7
8
+---------------+---------+
|Column Name |Type|+---------------+---------+
| id | int || login_date | date |+---------------+---------+
This table may contain duplicate rows.
This tablecontains the account id of the user who logged inand the login date. A user may log in multiple times in the day.
Active users are those who logged in to their accounts for five or more consecutive days.
Write a solution to find the id and the name of active users.
Input:
Accounts table:
+----+----------+
| id | name |+----+----------+
|1| Winston ||7| Jonathan |+----+----------+
Logins table:
+----+------------+
| id | login_date |+----+------------+
|7|2020-05-30||1|2020-05-30||7|2020-05-31||7|2020-06-01||7|2020-06-02||7|2020-06-02||7|2020-06-03||1|2020-06-07||7|2020-06-10|+----+------------+
Output:
+----+----------+
| id | name |+----+----------+
|7| Jonathan |+----+----------+
Explanation:
User Winston with id =1 logged in2 times onlyin2 different days, so, Winston isnot an active user.
User Jonathan with id =7 logged in7 times in6 different days, five of them were consecutive days, so, Jonathan is an active user.
The key idea is to identify streaks of consecutive login dates for each user. By assigning a row number to each login date per user and comparing it to the login date, we can group consecutive days together. If any group has at least 5 consecutive days, that user is considered active.
Remove duplicate login dates for each user, as multiple logins on the same day don’t count as extra.
For each user, sort their login dates and assign a row number based on the order.
Calculate the difference between the login date (as days since ‘1970-01-01’) and the row number. This difference will be the same for consecutive days, forming a group.
Group by user and this difference, and count the number of days in each group.
Select users who have any group with at least 5 consecutive days.
Join with the Accounts table to get the user names.
WITH DistinctLogins AS (
SELECTDISTINCT id, login_date
FROM Logins
),
LoginStreaks AS (
SELECT id,
login_date,
ROW_NUMBER() OVER (PARTITION BY id ORDERBY login_date) AS rn
FROM DistinctLogins
),
StreakGroups AS (
SELECT id,
DATE_SUB(login_date, INTERVAL rn DAY) AS grp
FROM LoginStreaks
)
SELECT a.id, a.name
FROM Accounts a
JOIN (
SELECT id
FROM (
SELECT id, COUNT(*) AS streak
FROM StreakGroups
GROUPBY id, grp
HAVING streak >=5 ) t
) active
ON a.id = active.id
ORDERBY a.id;