+-------------+------+
|Column Name |Type|+-------------+------+
| account_id | int || start_date | date || end_date | date |+-------------+------+
account_id is the primarykeycolumnfor this table.
Eachrowof this table indicates the startandend dates of an account's subscription.
Note that always start_date < end_date.
Table: Streams
1
2
3
4
5
6
7
8
9
10
+-------------+------+
|Column Name |Type|+-------------+------+
| session_id | int || account_id | int || stream_date | date |+-------------+------+
session_id is the primarykeycolumnfor this table.
account_id is a foreignkeyfrom the Subscriptions table.
Eachrowof this tablecontains information about the account and the date associated with a stream session.
Write an SQL query to report the number of accounts that bought a subscription in 2021 but did not have any stream session.
The query result format is in the following example.
We want accounts whose subscription period overlaps year 2021 (i.e., start_date <= '2021-12-31' and end_date >= '2021-01-01') but that have no streaming records in 2021. This is an anti-join: pick subscriptions active in 2021 and exclude those with any Streams in the 2021 range.
Filter Subscriptions to accounts whose subscription overlaps 2021.
For each such account check whether there exists any Streams row with stream_date in 2021.
Count accounts for which no such stream exists. This can be written with NOT EXISTS (clear and index-friendly) or with a LEFT JOIN to a filtered Streams table and checking for NULL.
SELECTCOUNT(*) AS accounts_count
FROM Subscriptions s
WHERE s.start_date <='2021-12-31'AND s.end_date >='2021-01-01'ANDNOTEXISTS (
SELECT1FROM Streams t
WHERE t.account_id = s.account_id
AND t.stream_date BETWEEN'2021-01-01'AND'2021-12-31' );
1
2
3
4
5
6
7
8
SELECTCOUNT(*) AS accounts_count
FROM Subscriptions s
LEFTJOIN Streams t
ON s.account_id = t.account_id
AND t.stream_date BETWEEN'2021-01-01'AND'2021-12-31'WHERE s.start_date <='2021-12-31'AND s.end_date >='2021-01-01'AND t.account_id ISNULL;
import pandas as pd
defcount_accounts_no_streams_2021(subscriptions: pd.DataFrame, streams: pd.DataFrame) -> int:
"""Return the number of accounts with subscriptions overlapping 2021 but no streams in 2021.
Assumes date columns are datetime-like or ISO strings.
""" subs = subscriptions.copy()
strm = streams.copy()
subs['start_date'] = pd.to_datetime(subs['start_date'])
subs['end_date'] = pd.to_datetime(subs['end_date'])
strm['stream_date'] = pd.to_datetime(strm['stream_date'])
# Subscriptions that are active at any point in 2021 mask_subs_2021 = (subs['start_date'] <= pd.to_datetime('2021-12-31')) & (
subs['end_date'] >= pd.to_datetime('2021-01-01'))
subs_2021 = subs.loc[mask_subs_2021]
# Accounts that streamed in 2021 strm_2021 = strm[(strm['stream_date'] >='2021-01-01') & (strm['stream_date'] <='2021-12-31')]
streamed_accounts = set(strm_2021['account_id'].unique())
# Count accounts in subs_2021 that are not in streamed_accounts result = subs_2021[~subs_2021['account_id'].isin(streamed_accounts)]
return int(result['account_id'].nunique())
⏰ Time complexity:O(N + M) – We scan Subscriptions and Streams once (filters and existence checks) where N = number of subscriptions and M = number of stream rows.
🧺 Space complexity:O(1) – The SQL solutions use constant extra space for aggregation; in-memory pandas may use extra space proportional to filtered rows.