+-------------+---------+
| Column Name | Type |
+-------------+---------+
| user_id | int |
| tweet_id | int |
| tweet_date | date |
| tweet | varchar |
+-------------+---------+
tweet_id is the primary key (column with unique values) for this table.
Each row of this table contains user_id, tweet_id, tweet_date and tweet.
Write a solution to find the top3 trending hashtags in
February2024. Each tweet only contains one hashtag.
Return the result table orderd by count of hashtag, hashtag indescendingorder.
Input:
Tweets table:+---------+----------+----------------------------------------------+------------+| user_id | tweet_id | tweet | tweet_date |+---------+----------+----------------------------------------------+------------+|135|13| Enjoying a great start to the day!#HappyDay |2024-02-01||136|14| Another #HappyDay with good vibes!|2024-02-03||137|15| Productivity peaks!#WorkLife |2024-02-04||138|16| Exploring new tech frontiers.#TechLife |2024-02-04||139|17| Gratitude for today's moments.#HappyDay |2024-02-05||140|18| Innovation drives us.#TechLife |2024-02-07||141|19| Connecting with nature's serenity.#Nature |2024-02-09|+---------+----------+----------------------------------------------+------------+Output:
+-----------+--------------+| hashtag | hashtag_count|+-----------+--------------+|#HappyDay |3||#TechLife |2||#WorkLife |1|+-----------+--------------+Explanation:
***#HappyDay:** Appeared in tweet IDs 13,14, and 17,with a total count of 3 mentions.***#TechLife:** Appeared in tweet IDs 16 and 18,with a total count of 2 mentions.***#WorkLife:** Appeared in tweet ID 15,with a total count of 1 mention.**Note:** Output table is sorted in descending order by hashtag_count and
hashtag respectively.
SELECTSUBSTRING(tweet, LOCATE('#', tweet)) AS hashtag,
COUNT(*) AS hashtag_count
FROM Tweets
WHERE tweet_date BETWEEN'2024-02-01'AND'2024-02-29'GROUPBY hashtag
ORDERBY hashtag_count DESC, hashtag DESCLIMIT3;
1
2
3
4
5
6
7
8
SELECTSUBSTRING(tweet FROM'#\\w+') AS hashtag,
COUNT(*) AS hashtag_count
FROM Tweets
WHERE tweet_date BETWEEN'2024-02-01'AND'2024-02-29'GROUPBY hashtag
ORDERBY hashtag_count DESC, hashtag DESCLIMIT3;
1
2
3
4
5
6
7
8
classSolution:
deffind_trending_hashtags(self, Tweets: 'pd.DataFrame') ->'pd.DataFrame':
mask = (Tweets['tweet_date'] >='2024-02-01') & (Tweets['tweet_date'] <='2024-02-29')
feb = Tweets[mask].copy()
feb['hashtag'] = feb['tweet'].str.extract(r'(#\\w+)')
ans = feb.groupby('hashtag').size().reset_index(name='hashtag_count')
ans = ans.sort_values(['hashtag_count', 'hashtag'], ascending=[False, False]).head(3)
return ans[['hashtag', 'hashtag_count']]