Find Users With Valid E-Mails Problem
Easy
Updated: Sep 26, 2024
Problem
Table: Users
+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| user_id       | int     |
| name          | varchar |
| mail          | varchar |
+---------------+---------+
user_id is the primary key (column with unique values) for this table.
This table contains information about the users signed up on a website. Some e-mails are invalid.
Write a solution to find the users who have valid emails.
A valid e-mail has a prefix name and a domain where:
- The prefix name is a string that may contain letters (upper or lower case), digits, underscore '_', period '.', and/or dash '-'. The prefix name must start with a letter.
- The domain is '@leetcode.com'.
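The two rules can also be read as an explicit character-by-character check. The following is only a minimal Python sketch of that reading, not part of the required solution; the names `DOMAIN`, `ALLOWED`, and `is_valid_mail` are hypothetical and chosen for illustration.

```python
import string

DOMAIN = "@leetcode.com"  # rule 2: the exact, case-sensitive domain
ALLOWED = set(string.ascii_letters + string.digits + "_.-")  # rule 1: allowed prefix characters


def is_valid_mail(mail: str) -> bool:
    """Return True if `mail` satisfies both rules listed above."""
    if not mail.endswith(DOMAIN):
        return False
    prefix = mail[: -len(DOMAIN)]
    # The prefix must be non-empty and must start with a letter.
    if not prefix or prefix[0] not in string.ascii_letters:
        return False
    # Every prefix character must come from the allowed set.
    return all(ch in ALLOWED for ch in prefix)
```

Under this reading, a prefix such as `bella-` is accepted because only the first character is restricted to letters, while a prefix that starts with a period or contains `#` is rejected.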
Return the result table in any order.
The result format is in the following example.
Examples
Example 1:
Input: Users table:
+---------+-----------+-------------------------+
| user_id | name      | mail                    |
+---------+-----------+-------------------------+
| 1       | Winston   | winston@leetcode.com    |
| 2       | Jonathan  | jonathanisgreat         |
| 3       | Annabelle | bella-@leetcode.com     |
| 4       | Sally     | sally.come@leetcode.com |
| 5       | Marwan    | quarz#2020@leetcode.com |
| 6       | David     | david69@gmail.com       |
| 7       | Shapiro   | .shapiro@leetcode.com   |
+---------+-----------+-------------------------+
Output:
+---------+-----------+-------------------------+
| user_id | name      | mail                    |
+---------+-----------+-------------------------+
| 1       | Winston   | winston@leetcode.com    |
| 3       | Annabelle | bella-@leetcode.com     |
| 4       | Sally     | sally.come@leetcode.com |
+---------+-----------+-------------------------+
Explanation: The mail of user 2 does not have a domain. The mail of user 5 has the # sign which is not allowed. The mail of user 6 does not have the leetcode domain. The mail of user 7 starts with a period.
Solution
Method 1 - Using Regexp
Code
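A detailed explanation of the regular expression used in the solution:
'^[A-Za-z][A-Za-z0-9_.-]*@leetcode\\.com$'
- ^ anchors the match at the beginning of the string and $ anchors it at the end, so nothing may precede the prefix or follow the domain.
- [] denotes a character set. A dash between two characters inside a set denotes a range, so [A-Za-z] matches any upper or lower case letter. This first set enforces that the prefix starts with a letter.
- After the second character set there is a quantifier: * means zero or more characters from the preceding set (a + would mean at least one).
- Inside a character set, the underscore and the period are literal characters. The dash is also literal when it is the last character of the set; anywhere else it would have to be escaped as \- so that it is not read as a range.
- Outside a character set, an unescaped . matches any character, so the dot in the domain must be escaped. It is written as \\. because MySQL processes the string literal first and turns \\ into a single backslash before the pattern reaches the regex engine.
- Everything else, like @leetcode, is matched exactly.
SQL
    select *
    from Users
    -- 'c' requests case-sensitive matching; without it, regexp_like
    -- follows the collation, which may be case-insensitive
    where regexp_like(mail, '^[A-Za-z][A-Za-z0-9_.-]*@leetcode\\.com$', 'c')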
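To see why the end anchor and the escaped dot matter, here is a small sketch using Python's `re` module. It assumes that `^`, `$`, `.` and `\.` behave the same way for these inputs as they do in MySQL's regex engine. The names `LOOSE` and `STRICT` and the sample addresses are made up for illustration.

```python
import re

# No end anchor, and the dot in the domain is left unescaped.
LOOSE = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]*@leetcode.com")
# Anchored at both ends, with the domain dot escaped.
STRICT = re.compile(r"^[A-Za-z][A-Za-z0-9_.-]*@leetcode\.com$")

samples = [
    "winston@leetcode.com",           # valid
    "winston@leetcode.com.evil.org",  # extra text after the domain
    "winston@leetcodeXcom",           # 'X' where the dot should be
]

# Map each sample to (accepted by LOOSE, accepted by STRICT).
results = {
    mail: (bool(LOOSE.search(mail)), bool(STRICT.search(mail)))
    for mail in samples
}

for mail, (loose_ok, strict_ok) in results.items():
    print(f"{mail:32} loose={loose_ok} strict={strict_ok}")
```

The loose pattern accepts all three samples, while the strict pattern accepts only the first.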
Pandas
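    import pandas as pd

    def valid_emails(users: pd.DataFrame) -> pd.DataFrame:
        # str.match anchors only at the start of each string, so the
        # trailing $ is needed to reject anything after the domain.
        return users[
            users['mail'].str.match(r'^[a-zA-Z][a-zA-Z\d_.-]*@leetcode\.com$')
        ]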
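The pandas filter relies on `Series.str.match`, which, like Python's `re.match`, only requires the pattern to match at the start of the string. The stdlib-only sketch below illustrates how that differs from whole-string matching. The helper names are hypothetical, and whether the trailing-newline case matters depends on how clean the `mail` column is.

```python
import re

PATTERN = r"[a-zA-Z][a-zA-Z\d_.-]*@leetcode\.com"


def starts_with_match(mail: str) -> bool:
    # re.match anchors at the start only; trailing text is ignored.
    return re.match(PATTERN, mail) is not None


def dollar_match(mail: str) -> bool:
    # Adding $ rejects trailing text, but $ still matches just
    # before a single newline at the very end of the string.
    return re.match(PATTERN + r"$", mail) is not None


def whole_string_match(mail: str) -> bool:
    # re.fullmatch requires the entire string to match the pattern.
    return re.fullmatch(PATTERN, mail) is not None


print(starts_with_match("sally.come@leetcode.comm"))   # True
print(dollar_match("sally.come@leetcode.comm"))        # False
print(dollar_match("winston@leetcode.com\n"))          # True
print(whole_string_match("winston@leetcode.com\n"))    # False
```

If stray trailing newlines are a concern, `Series.str.fullmatch` (available in recent pandas versions) is the stricter counterpart of `str.match`.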
Complexity
- Time: O(b)
- Space: O(1)