Problem
+--------------+---------+
| Column Name | Type |
+--------------+---------+
| player_id | int |
| device_id | int |
| event_date | date |
| games_played | int |
+--------------+---------+
(player_id, event_date) is the primary key of this table. This table shows the activity of players of some game. Each row is a record of a player who logged in and played a number of games (possibly 0) before logging out on some day using some device.
We define the install date of a player to be the first login day of that player.
We also define day 1 retention of some date X
to be the number of players whose install date is X
and they logged back in on the day right after X
, divided by the number of players whose install date is X
, rounded to 2 decimal places.
Write an SQL query that reports for each install date, the number of players that installed the game on that day and the day 1 retention.
Solution
Method 1 - Using join
Code
SQL
SELECT
a.event_date,
ROUND((COUNT(DISTINCT CASE WHEN a.event_date = b.install_date THEN a.player_id ELSE NULL END)/
COUNT(DISTINCT CASE WHEN DATEDIFF(a.event_date, b.install_date) = 1 THEN a.player_id ELSE NULL END)),2)
FROM
(SELECT a.player_id, a.event_date, b.install_date
FROM activity a
LEFT JOIN (SELECT player_id, MIN(event_date) install_date
FROM activity
GROUP BY player_id) b ON b.player_id = a.player_id) C
WHERE a.event_date = b.install_date OR DATEDIFF(a.event_date, b.install_date) = 1
GROUP BY a.event_date