Problem

Table: user_content

+-------------+---------+
| Column Name | Type    |
+-------------+---------+
| content_id  | int     |
| content_text| varchar |
+-------------+---------+
content_id is the unique key for this table.
Each row contains a unique ID and the corresponding text content.

Write a solution to transform the text in the content_text column by applying the following rules:

  • Convert the first letter of each word to uppercase and the remaining letters to lowercase
  • Special handling for words containing special characters:
    • For words connected with a hyphen -, both parts should be capitalized (e.g. , top-rated -> Top-Rated)
  • All other formatting and spacing should remain unchanged

Return the result table that includes both the originalcontent_text and the modified text following the above rules.

The result format is in the following example.

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
Input:
user_content table:
+------------+---------------------------------+
| content_id | content_text                    |
+------------+---------------------------------+
| 1          | hello world of SQL              |
| 2          | the QUICK-brown fox             |
| 3          | modern-day DATA science         |
| 4          | web-based FRONT-end development |
+------------+---------------------------------+
Output:
+------------+---------------------------------+---------------------------------+
| content_id | original_text                   | converted_text                  |
+------------+---------------------------------+---------------------------------+
| 1          | hello world of SQL              | Hello World Of Sql              |
| 2          | the QUICK-brown fox             | The Quick-Brown Fox             |
| 3          | modern-day DATA science         | Modern-Day Data Science         |
| 4          | web-based FRONT-end development | Web-Based Front-End Development |
+------------+---------------------------------+---------------------------------+
Explanation:
* For content_id = 1: 
* Each word's first letter is capitalized: "Hello World Of Sql"
* For content_id = 2: 
* Contains the hyphenated word "QUICK-brown" which becomes "Quick-Brown"
* Other words follow normal capitalization rules
* For content_id = 3: 
* Hyphenated word "modern-day" becomes "Modern-Day"
* "DATA" is converted to "Data"
* For content_id = 4: 
* Contains two hyphenated words: "web-based" -> "Web-Based"
* And "FRONT-end" -> "Front-End"

Example 2:

Solution

Method 1 – String Manipulation with Hyphen Handling

Intuition

We want to capitalize the first letter of each word and lowercase the rest, but for hyphenated words, both parts should be capitalized. This requires splitting on spaces and hyphens, capitalizing each part, and then reconstructing the string while preserving original spacing and hyphens.

Approach

  1. Split the content_text into words, preserving spaces.
  2. For each word, if it contains a hyphen ‘-’, split it into subwords, capitalize each subword, and join with ‘-’.
  3. Otherwise, capitalize the first letter and lowercase the rest.
  4. Join the words back, preserving original spaces and hyphens.
  5. Return the transformed text for each content_id, along with the original text.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
SELECT content_id, content_text AS original_text,
  TRIM(
    REPLACE(
      REPLACE(
        INITCAP(REPLACE(content_text, '-', ' - ')),
        ' - ', '-'
      ),
      '  ', ' '
    )
  ) AS converted_text
FROM user_content;
1
2
3
SELECT content_id, content_text AS original_text,
  REGEXP_REPLACE(INITCAP(REPLACE(content_text, '-', ' - ')), '\s-\s', '-', 'g') AS converted_text
FROM user_content;
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
class Solution:
    def capitalize_hyphen_words(self, user_content: 'pd.DataFrame') -> 'pd.DataFrame':
        def convert(text):
            def cap_word(w):
                return '-'.join(s.capitalize() for s in w.split('-'))
            return ' '.join(cap_word(w) for w in text.split(' '))
        df = user_content.copy()
        df['original_text'] = df['content_text']
        df['converted_text'] = df['content_text'].apply(convert)
        return df[['content_id', 'original_text', 'converted_text']]

Complexity

  • ⏰ Time complexity: O(N*L), where N is the number of rows and L is the average length of content_text, since we process each word in each row.
  • 🧺 Space complexity: O(N*L), for storing the transformed text.