Number of Accounts That Did Not Stream
MediumUpdated: Oct 13, 2025
Practice on:
Problem
Table: Subscriptions
+-------------+------+
| Column Name | Type |
+-------------+------+
| account_id | int |
| start_date | date |
| end_date | date |
+-------------+------+
account_id is the primary key column for this table.
Each row of this table indicates the start and end dates of an account's subscription.
Note that always start_date < end_date.
Table: Streams
+-------------+------+
| Column Name | Type |
+-------------+------+
| session_id | int |
| account_id | int |
| stream_date | date |
+-------------+------+
session_id is the primary key column for this table.
account_id is a foreign key from the Subscriptions table.
Each row of this table contains information about the account and the date associated with a stream session.
Write an SQL query to report the number of accounts that bought a subscription in 2021 but did not have any stream session.
The query result format is in the following example.
Examples
Example 1:
Input:
Subscriptions table:
+------------+------------+------------+
| account_id | start_date | end_date |
+------------+------------+------------+
| 9 | 2020-02-18 | 2021-10-30 |
| 3 | 2021-09-21 | 2021-11-13 |
| 11 | 2020-02-28 | 2020-08-18 |
| 13 | 2021-04-20 | 2021-09-22 |
| 4 | 2020-10-26 | 2021-05-08 |
| 5 | 2020-09-11 | 2021-01-17 |
+------------+------------+------------+
Streams table:
+------------+------------+-------------+
| session_id | account_id | stream_date |
+------------+------------+-------------+
| 14 | 9 | 2020-05-16 |
| 16 | 3 | 2021-10-27 |
| 18 | 11 | 2020-04-29 |
| 17 | 13 | 2021-08-08 |
| 19 | 4 | 2020-12-31 |
| 13 | 5 | 2021-01-05 |
+------------+------------+-------------+
Output:
+----------------+
| accounts_count |
+----------------+
| 2 |
+----------------+
Explanation: Users 4 and 9 did not stream in 2021.
User 11 did not subscribe in 2021.
Solution
Method 1 - Active-in-2021 Anti-Join (NOT EXISTS / LEFT JOIN)
Intuition
We want accounts whose subscription period overlaps year 2021 (i.e., start_date <= '2021-12-31' and end_date >= '2021-01-01') but that have no streaming records in 2021. This is an anti-join: pick subscriptions active in 2021 and exclude those with any Streams in the 2021 range.
Approach
- Filter
Subscriptionsto accounts whose subscription overlaps 2021. - For each such account check whether there exists any
Streamsrow withstream_datein 2021. - Count accounts for which no such stream exists. This can be written with
NOT EXISTS(clear and index-friendly) or with aLEFT JOINto a filteredStreamstable and checking forNULL.
Code
MySQL / PostgreSQL (NOT EXISTS, preferred)
SELECT COUNT(*) AS accounts_count
FROM Subscriptions s
WHERE s.start_date <= '2021-12-31'
AND s.end_date >= '2021-01-01'
AND NOT EXISTS (
SELECT 1
FROM Streams t
WHERE t.account_id = s.account_id
AND t.stream_date BETWEEN '2021-01-01' AND '2021-12-31'
);
MySQL / PostgreSQL (LEFT JOIN alternative)
SELECT COUNT(*) AS accounts_count
FROM Subscriptions s
LEFT JOIN Streams t
ON s.account_id = t.account_id
AND t.stream_date BETWEEN '2021-01-01' AND '2021-12-31'
WHERE s.start_date <= '2021-12-31'
AND s.end_date >= '2021-01-01'
AND t.account_id IS NULL;
Python (using pandas)
import pandas as pd
def count_accounts_no_streams_2021(subscriptions: pd.DataFrame, streams: pd.DataFrame) -> int:
"""Return the number of accounts with subscriptions overlapping 2021 but no streams in 2021.
Assumes date columns are datetime-like or ISO strings.
"""
subs = subscriptions.copy()
strm = streams.copy()
subs['start_date'] = pd.to_datetime(subs['start_date'])
subs['end_date'] = pd.to_datetime(subs['end_date'])
strm['stream_date'] = pd.to_datetime(strm['stream_date'])
# Subscriptions that are active at any point in 2021
mask_subs_2021 = (subs['start_date'] <= pd.to_datetime('2021-12-31')) & (
subs['end_date'] >= pd.to_datetime('2021-01-01'))
subs_2021 = subs.loc[mask_subs_2021]
# Accounts that streamed in 2021
strm_2021 = strm[(strm['stream_date'] >= '2021-01-01') & (strm['stream_date'] <= '2021-12-31')]
streamed_accounts = set(strm_2021['account_id'].unique())
# Count accounts in subs_2021 that are not in streamed_accounts
result = subs_2021[~subs_2021['account_id'].isin(streamed_accounts)]
return int(result['account_id'].nunique())
Complexity
- ⏰ Time complexity:
O(N + M)– We scanSubscriptionsandStreamsonce (filters and existence checks) whereN= number of subscriptions andM= number of stream rows. - 🧺 Space complexity:
O(1)– The SQL solutions use constant extra space for aggregation; in-memory pandas may use extra space proportional to filtered rows.