New Users Daily Count
Medium · Updated: Oct 13, 2025
Problem
Table: Traffic
+---------------+---------+
| Column Name | Type |
+---------------+---------+
| user_id | int |
| activity | enum |
| activity_date | date |
+---------------+---------+
This table may have duplicate rows.
The activity column is an ENUM (category) type of ('login', 'logout', 'jobs', 'groups', 'homepage').
Write a solution to report, for every date within at most 90 days from today, the number of users who logged in for the first time on that date.
Assume today is 2019-06-30.
Return the result table in any order.
The result format is in the following example.
Examples
Example 1:
Input:
Traffic table:
+---------+----------+---------------+
| user_id | activity | activity_date |
+---------+----------+---------------+
| 1 | login | 2019-05-01 |
| 1 | homepage | 2019-05-01 |
| 1 | logout | 2019-05-01 |
| 2 | login | 2019-06-21 |
| 2 | logout | 2019-06-21 |
| 3 | login | 2019-01-01 |
| 3 | jobs | 2019-01-01 |
| 3 | logout | 2019-01-01 |
| 4 | login | 2019-06-21 |
| 4 | groups | 2019-06-21 |
| 4 | logout | 2019-06-21 |
| 5 | login | 2019-03-01 |
| 5 | logout | 2019-03-01 |
| 5 | login | 2019-06-21 |
| 5 | logout | 2019-06-21 |
+---------+----------+---------------+
Output:
+------------+-------------+
| login_date | user_count |
+------------+-------------+
| 2019-05-01 | 1 |
| 2019-06-21 | 2 |
+------------+-------------+
Explanation:
Note that we only care about dates with a non-zero user count.
The user with id 5 first logged in on 2019-03-01, so they are not counted toward 2019-06-21.
Solution
Method 1 - First Login Aggregation with 90-Day Window
Intuition
For each user we only care about their earliest 'login' date. Once we compute that first-login date per user, we count how many users have that date within the last 90 days from 2019-06-30.
Approach
- Filter `Traffic` for `activity = 'login'`.
- For each `user_id`, compute `MIN(activity_date)` as their `first_login`.
- Restrict `first_login` to the inclusive window from `DATE_SUB('2019-06-30', INTERVAL 89 DAY)` to `'2019-06-30'` (90 days including both endpoints).
- Group by `first_login` and count users.
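The window arithmetic deserves a second look: "within at most 90 days from today", counted inclusively, means subtracting 89 days, not 90. A quick check with Python's standard `datetime` (an illustration, not part of the solution itself):

```python
from datetime import date, timedelta

today = date(2019, 6, 30)
start = today - timedelta(days=89)  # inclusive lower bound of the window

# Counting both endpoints, the window spans exactly 90 calendar dates
num_dates = (today - start).days + 1
print(start, num_dates)  # 2019-04-02 90
```

Subtracting 90 days instead would give a 91-date window and could admit one extra day of first logins.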
Code
MySQL
SELECT first_login AS login_date, COUNT(*) AS user_count
FROM (
SELECT user_id, MIN(activity_date) AS first_login
FROM Traffic
WHERE activity = 'login'
GROUP BY user_id
) t
WHERE first_login BETWEEN DATE_SUB('2019-06-30', INTERVAL 89 DAY) AND '2019-06-30'
GROUP BY first_login;
PostgreSQL
-- PostgreSQL equivalent (use INTERVAL arithmetic)
SELECT first_login AS login_date, COUNT(*) AS user_count
FROM (
SELECT user_id, MIN(activity_date) AS first_login
FROM Traffic
WHERE activity = 'login'
GROUP BY user_id
) t
WHERE first_login BETWEEN (DATE '2019-06-30' - INTERVAL '89 days') AND DATE '2019-06-30'
GROUP BY first_login;
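As a sanity check (a sketch, not from the original post), an equivalent query can be run against the Example 1 data using Python's built-in `sqlite3`; SQLite has no `DATE_SUB`, so its `date()` modifier syntax stands in for it:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE Traffic (user_id INT, activity TEXT, activity_date TEXT)")
rows = [
    (1, 'login', '2019-05-01'), (1, 'homepage', '2019-05-01'), (1, 'logout', '2019-05-01'),
    (2, 'login', '2019-06-21'), (2, 'logout', '2019-06-21'),
    (3, 'login', '2019-01-01'), (3, 'jobs', '2019-01-01'), (3, 'logout', '2019-01-01'),
    (4, 'login', '2019-06-21'), (4, 'groups', '2019-06-21'), (4, 'logout', '2019-06-21'),
    (5, 'login', '2019-03-01'), (5, 'logout', '2019-03-01'),
    (5, 'login', '2019-06-21'), (5, 'logout', '2019-06-21'),
]
conn.executemany("INSERT INTO Traffic VALUES (?, ?, ?)", rows)

# Same query shape; date('2019-06-30', '-89 days') replaces DATE_SUB
query = """
SELECT first_login AS login_date, COUNT(*) AS user_count
FROM (
    SELECT user_id, MIN(activity_date) AS first_login
    FROM Traffic
    WHERE activity = 'login'
    GROUP BY user_id
) t
WHERE first_login BETWEEN date('2019-06-30', '-89 days') AND '2019-06-30'
GROUP BY first_login
ORDER BY login_date
"""
result = conn.execute(query).fetchall()
print(result)  # [('2019-05-01', 1), ('2019-06-21', 2)]
```

`MIN` on the text dates works here because ISO-8601 strings sort lexicographically in date order.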
Python (using pandas)
import pandas as pd
def first_login_counts(traffic: pd.DataFrame) -> pd.DataFrame:
    """Return a DataFrame with columns ['login_date', 'user_count'] for first logins in the 90-day window.

    Assumes traffic['activity_date'] is a datetime-like column or an ISO-format string.
    """
    # Ensure activity_date is datetime
    df = traffic.copy()
    df['activity_date'] = pd.to_datetime(df['activity_date'])
    # Filter to login activities and compute first login per user
    logins = df[df['activity'] == 'login']
    first_login = logins.groupby('user_id', as_index=False)['activity_date'].min()
    first_login.rename(columns={'activity_date': 'first_login'}, inplace=True)
    # Define the inclusive 90-day window ending today
    end = pd.to_datetime('2019-06-30')
    start = end - pd.Timedelta(days=89)
    # Keep first logins inside the window and count them per date
    mask = (first_login['first_login'] >= start) & (first_login['first_login'] <= end)
    window = first_login.loc[mask].copy()
    window['login_date'] = window['first_login'].dt.date
    result = (
        window.groupby('login_date')
        .size()
        .reset_index(name='user_count')
        .sort_values('login_date')
        .reset_index(drop=True)
    )
    return result
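To check the logic against Example 1, here is a condensed version of the same pipeline run on the sample data (a verification sketch that mirrors the function above rather than importing it):

```python
import pandas as pd

# Example 1 rows from the problem statement
traffic = pd.DataFrame(
    [(1, 'login', '2019-05-01'), (1, 'homepage', '2019-05-01'), (1, 'logout', '2019-05-01'),
     (2, 'login', '2019-06-21'), (2, 'logout', '2019-06-21'),
     (3, 'login', '2019-01-01'), (3, 'jobs', '2019-01-01'), (3, 'logout', '2019-01-01'),
     (4, 'login', '2019-06-21'), (4, 'groups', '2019-06-21'), (4, 'logout', '2019-06-21'),
     (5, 'login', '2019-03-01'), (5, 'logout', '2019-03-01'),
     (5, 'login', '2019-06-21'), (5, 'logout', '2019-06-21')],
    columns=['user_id', 'activity', 'activity_date'],
)

# First login per user, restricted to the inclusive 90-day window
df = traffic.copy()
df['activity_date'] = pd.to_datetime(df['activity_date'])
first = df[df['activity'] == 'login'].groupby('user_id')['activity_date'].min()
end = pd.to_datetime('2019-06-30')
in_window = first[(first >= end - pd.Timedelta(days=89)) & (first <= end)]
counts = in_window.value_counts().sort_index()
print(counts)  # 2019-05-01 -> 1, 2019-06-21 -> 2
```

User 3 (2019-01-01) and user 5 (first login 2019-03-01) fall outside the window, matching the expected output.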
Complexity
- ⏰ Time complexity: `O(N)` – We scan `Traffic` once to filter and compute per-user minimums, then aggregate over the distinct users.
- 🧺 Space complexity: `O(U)` – We store one row per distinct user (their first login) during aggregation.