Problem

Table: user_content

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| content_id  | int     |
| content_text| varchar |
+-------------+---------+
content_id is the unique key for this table.
Each row contains a unique ID and the corresponding text content.

Write a solution to transform the text in the content_text column by applying the following rules:

  • Convert the first letter of each word to uppercase
  • Keep all other letters in lowercase
  • Preserve all existing spaces

Note : There will be no special character in content_text.

Return the result table that includes both the originalcontent_text and the modified text where each word starts with a capital letter.

The result format is in the following example.

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
Input:
user_content table:
+------------+-----------------------------------+
| content_id | content_text                      |
+------------+-----------------------------------+
| 1          | hello world of SQL                |
| 2          | the QUICK brown fox               |
| 3          | data science AND machine learning |
| 4          | TOP rated programming BOOKS       |
+------------+-----------------------------------+
Output:
+------------+-----------------------------------+-----------------------------------+
| content_id | original_text                     | converted_text                    |
+------------+-----------------------------------+-----------------------------------+
| 1          | hello world of SQL                | Hello World Of Sql                |
| 2          | the QUICK brown fox               | The Quick Brown Fox               |
| 3          | data science AND machine learning | Data Science And Machine Learning |
| 4          | TOP rated programming BOOKS       | Top Rated Programming Books       |
+------------+-----------------------------------+-----------------------------------+
Explanation:
* For content_id = 1: 
* Each word's first letter is capitalized: Hello World Of Sql
* For content_id = 2: 
* Original mixed-case text is transformed to title case: The Quick Brown Fox
* For content_id = 3: 
* The word AND is converted to "And": "Data Science And Machine Learning"
* For content_id = 4: 
* Handles word TOP rated correctly: Top Rated
* Converts BOOKS from all caps to title case: Books

Solution

Method 1 – String Manipulation and Word Split

Intuition

We want to capitalize the first letter of each word and lowercase the rest. This is a classic string manipulation problem, and can be solved using SQL string functions or pandas string methods.

Approach

  1. Split the content_text into words.
  2. For each word, capitalize the first letter and lowercase the rest.
  3. Join the words back with spaces.
  4. Return the transformed text for each content_id.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
SELECT content_id,
       TRIM(
         REPLACE(
           CONCAT(' ',
             REPLACE(
               LOWER(content_text),
               ' ', '  '
             ),
           ' '),
           '  ', ' '
         )
       ) AS content_text
FROM (
  SELECT content_id,
         CONCAT_WS(' ',
           UPPER(SUBSTRING(SUBSTRING_INDEX(SUBSTRING_INDEX(content_text, ' ', n.n), ' ', -1), 1, 1)),
           LOWER(SUBSTRING(SUBSTRING_INDEX(SUBSTRING_INDEX(content_text, ' ', n.n), ' ', -1), 2))
         ) AS content_text
  FROM user_content
  JOIN (
    SELECT a.N + b.N * 10 + 1 n
    FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
    CROSS JOIN (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
          UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
  ) n
  WHERE n.n <= 1 + LENGTH(content_text) - LENGTH(REPLACE(content_text, ' ', ''))
) t;
1
2
3
SELECT content_id,
       INITCAP(LOWER(content_text)) AS content_text
FROM user_content;
1
2
3
4
5
class Solution:
    def capitalize_first_letter(self, user_content: 'pd.DataFrame') -> 'pd.DataFrame':
        df = user_content.copy()
        df['content_text'] = df['content_text'].str.lower().str.split().apply(lambda ws: ' '.join(w.capitalize() for w in ws))
        return df

Complexity

  • ⏰ Time complexity: O(N*L), where N is the number of rows and L is the average length of content_text, since we process each word in each row.
  • 🧺 Space complexity: O(N*L), for storing the transformed text.