Invalid Tweets II
EasyUpdated: Sep 29, 2025
Practice on:
Problem
Table: Tweets
+----------------+---------+
| Column Name | Type |
+----------------+---------+
| tweet_id | int |
| content | varchar |
+----------------+---------+
tweet_id is the primary key (column with unique values) for this table.
This table contains all the tweets in a social media app.
Write a solution to find invalid tweets. A tweet is considered invalid if it meets any of the following criteria:
- It exceeds
140characters in length. - It has more than
3mentions. - It includes more than
3hashtags.
Return the result table ordered by tweet_id inascending order.
The result format is in the following example.
Examples
Example 1:
Input:
Tweets table:
+----------+-----------------------------------------------------------------------------------+
| tweet_id | content |
+----------+-----------------------------------------------------------------------------------+
| 1 | Traveling, exploring, and living my best life @JaneSmith @SaraJohnson @LisaTaylor |
| | @MikeBrown #Foodie #Fitness #Learning |
| 2 | Just had the best dinner with friends! #Foodie #Friends #Fun |
| 4 | Working hard on my new project #Work #Goals #Productivity #Fun |
+----------+-----------------------------------------------------------------------------------+
Output:
+----------+
| tweet_id |
+----------+
| 1 |
| 4 |
+----------+
Explanation:
* tweet_id 1 contains 4 mentions.
* tweet_id 4 contains 4 hashtags.
Output table is ordered by tweet_id in ascending order.
Solution
Method 1 – String Functions and Aggregation
Intuition
We need to check three conditions for each tweet: length, number of mentions, and number of hashtags. We can use string functions to count mentions and hashtags, and filter accordingly.
Approach
- For each tweet, check if the length of
contentis greater than 140. - Count the number of mentions (occurrences of '@') in
content. - Count the number of hashtags (occurrences of '#') in
content. - Select tweets where any of the above conditions is true.
- Return the result ordered by
tweet_idascending.
Code
MySQL
SELECT tweet_id
FROM Tweets
WHERE CHAR_LENGTH(content) > 140
OR LENGTH(content) - LENGTH(REPLACE(content, '@', '')) > 3
OR LENGTH(content) - LENGTH(REPLACE(content, '#', '')) > 3
ORDER BY tweet_id ASC;
PostgreSQL
SELECT tweet_id
FROM Tweets
WHERE LENGTH(content) > 140
OR LENGTH(content) - LENGTH(REPLACE(content, '@', '')) > 3
OR LENGTH(content) - LENGTH(REPLACE(content, '#', '')) > 3
ORDER BY tweet_id ASC;
Python (Pandas)
def invalid_tweets(df: 'pd.DataFrame') -> 'pd.DataFrame':
cond1 = df['content'].str.len() > 140
cond2 = df['content'].str.count('@') > 3
cond3 = df['content'].str.count('#') > 3
return df.loc[cond1 | cond2 | cond3, ['tweet_id']].sort_values('tweet_id')
Complexity
- ⏰ Time complexity:
O(n * m)— n is the number of tweets, m is the average length of content (for counting characters). - 🧺 Space complexity:
O(n)— For storing the result set.