Find Invalid IP Addresses
HardUpdated: Sep 29, 2025
Practice on:
Problem
Table: logs
+-------------+---------+
| Column Name | Type |
+-------------+---------+
| log_id | int |
| ip | varchar |
| status_code | int |
+-------------+---------+
log_id is the unique key for this table.
Each row contains server access log information including IP address and HTTP status code.
Write a solution to find invalid IP addresses. An IPv4 address is invalid if it meets any of these conditions:
- Contains numbers greater than
255in any octet - Has leading zeros in any octet (like
01.02.03.04) - Has less or more than
4octets
Return the result table ordered by invalid_count, ip indescending order respectively.
The result format is in the following example.
Examples
Example 1:
Input:
logs table:
+--------+---------------+-------------+
| log_id | ip | status_code |
+--------+---------------+-------------+
| 1 | 192.168.1.1 | 200 |
| 2 | 256.1.2.3 | 404 |
| 3 | 192.168.001.1 | 200 |
| 4 | 192.168.1.1 | 200 |
| 5 | 192.168.1 | 500 |
| 6 | 256.1.2.3 | 404 |
| 7 | 192.168.001.1 | 200 |
+--------+---------------+-------------+
Output:
+---------------+--------------+
| ip | invalid_count|
+---------------+--------------+
| 256.1.2.3 | 2 |
| 192.168.001.1 | 2 |
| 192.168.1 | 1 |
+---------------+--------------+
Explanation:
* 256.1.2.3 is invalid because 256 > 255
* 192.168.001.1 is invalid because of leading zeros
* 192.168.1 is invalid because it has only 3 octets
The output table is ordered by invalid_count, ip in descending order
respectively.
Example 2:
Solution
Method 1 – Pattern Matching and Aggregation
Intuition
To find invalid IPv4 addresses, we need to check for three conditions: any octet > 255, any octet with leading zeros, or not exactly 4 octets. We can use SQL string and pattern functions or pandas string methods to check these conditions for each IP.
Approach
- Split the IP into octets using string functions.
- Check:
- If the number of octets is not 4.
- If any octet has leading zeros (length > 1 and starts with '0').
- If any octet is not a number or is > 255.
- Count the number of invalid occurrences for each IP.
- Return the IP and its invalid count, ordered by invalid_count and ip descending.
Code
MySQL
SELECT ip, COUNT(*) AS invalid_count
FROM logs
WHERE
-- Not 4 octets
LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3
OR
-- Any octet has leading zeros or > 255
(
SUBSTRING_INDEX(ip, '.', 1) REGEXP '^0[0-9]+' OR
SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) REGEXP '^0[0-9]+' OR
SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) REGEXP '^0[0-9]+' OR
SUBSTRING_INDEX(ip, '.', -1) REGEXP '^0[0-9]+'
OR
CAST(SUBSTRING_INDEX(ip, '.', 1) AS UNSIGNED) > 255
OR CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 2), '.', -1) AS UNSIGNED) > 255
OR CAST(SUBSTRING_INDEX(SUBSTRING_INDEX(ip, '.', 3), '.', -1) AS UNSIGNED) > 255
OR CAST(SUBSTRING_INDEX(ip, '.', -1) AS UNSIGNED) > 255
)
GROUP BY ip
ORDER BY invalid_count DESC, ip DESC;
PostgreSQL
SELECT ip, COUNT(*) AS invalid_count
FROM logs
WHERE
LENGTH(ip) - LENGTH(REPLACE(ip, '.', '')) != 3
OR
(
split_part(ip, '.', 1) ~ '^0[0-9]+'
OR split_part(ip, '.', 2) ~ '^0[0-9]+'
OR split_part(ip, '.', 3) ~ '^0[0-9]+'
OR split_part(ip, '.', 4) ~ '^0[0-9]+'
OR split_part(ip, '.', 1)::int > 255
OR split_part(ip, '.', 2)::int > 255
OR split_part(ip, '.', 3)::int > 255
OR split_part(ip, '.', 4)::int > 255
)
GROUP BY ip
ORDER BY invalid_count DESC, ip DESC;
Python (pandas)
import pandas as pd
import re
def find_invalid_ips(logs: pd.DataFrame) -> pd.DataFrame:
def is_invalid(ip: str) -> bool:
parts = ip.split('.')
if len(parts) != 4:
return True
for p in parts:
if not p.isdigit():
return True
if len(p) > 1 and p[0] == '0':
return True
if int(p) > 255:
return True
return False
mask = logs['ip'].apply(is_invalid)
res = logs[mask].groupby('ip').size().reset_index(name='invalid_count')
res = res.sort_values(['invalid_count', 'ip'], ascending=[False, False])
return res
Complexity
- ⏰ Time complexity:
O(n), where n is the number of log rows. Each IP is checked once. - 🧺 Space complexity:
O(n), for storing the result and intermediate groupings