Problem
Given a string, Write a program to remove duplicate characters from the string.
Examples
Example 1:
Input: s = kodeknight
Output: kodenight
Example 2:
Input: s = k5kc
Output: k5c
Solution
There are multiple approach to solve it.
Method 1 : Brute Force
Code
C
void removeDuplicates(char *str)
{
if (!str) {
return;
}
int len = strlen(str);
if (len < 2){
return;
}
int tail = 1;
for (int i = 1; i < len; ++i) {
int j;
for (j = 0; j < tail; ++j)
if (str[i] == str[j]) {
break;
}
if (j == tail) {
str[tail] = str[i];
++tail;
}
}
str[tail] = '\0';
}
Complexity
- ⏰ Time complexity:
O(n^2)
- 🧺 Space complexity:
O(1)
Method 2 : Using Sorting
These are the steps we can follow:
- Convert string to char array (if not already)
- Sort the elements.
k5kc
becomes5ckk
- Now in a loop, remove duplicates by comparing the current character with previous character.
5ck
Code
Java
public static String removeDuplicates(String s) {
char[] chars = s.toCharArray();
Arrays.sort(chars); // O(nlogn)
String sbString = new String();
for (int i = 1; i < chars.length; i++) {
if (chars[i] != chars[i - 1])
sbString += chars[i];
}
return sbString;
}
Complexity
- ⏰ Time complexity:
O(nlogn)
- 🧺 Space complexity:
O(1)
Characters are not in order.
Method 3 - Using Hashtable
- Convert the string to character array.
- Store all the characters in the Set.
- Set will store only one instance of each character
- Iterate through Set and print the characters.
- This approach will change the order of characters.
Code
Java
Two Pass
public static String removeDuplicates(String s) {
HashSet <Character> set = new HashSet<>();
char[] chars = s.toCharArray();
for (int i = 0; i < chars.length; i++) {
set.add(chars[i]);
}
Iterator <Character> iterator = set.iterator();
String sbString = new String();
while (iterator.hasNext()){
sbString += iterator.next() + "";
}
return sbString;
}
One Pass
Single pass algorithm without changing order:
public static String removeDuplicates(String s) {
HashSet < Character > set = new HashSet < > ();
char[] chars = s.toCharArray();
StringBuilder sb = new StringBuilder();
for (int i = 0; i < chars.length; i++) {
char c = chars[i];
if (set.contains(c)) {
continue;
}
set.add(chars[i]);
sb.append(c);
}
return sb.toString();
}
Complexity
- ⏰ Time complexity:
O(n)
- 🧺 Space complexity:
O(n)
Method 4 - LinkedHashset
Here are the algorithm:
- Convert the string to character array.
- Store all the characters in the Linked Hash Set.
- Set will store only one instance of each character
- Iterate through Set and print the characters.
- This approach will retain the order of characters.
Code
Java
public static String removeDuplicates(String s) {
LinkedHashSet <Character> set = new LinkedHashSet<>();
char[] chars = s.toCharArray();
for (int i = 0; i < chars.length; i++) {
set.add(chars[i]);
}
Iterator <Character> iterator = set.iterator();
String sbString = new String();
while (iterator.hasNext())
sbString += iterator.next() + "";
return sbString;
}
Complexity
- ⏰ Time complexity:
O(n^2)
- 🧺 Space complexity:
O(1)
Method 5 - Using array as Hashtable
Similar to approach with hashtable, but if chars are between a-z, or ascii, just use array Here is the procedure
- Convert the string to character array.
- Create the boolean array of 256,( for all 0 – 255 ASCII characters)
- Iterate through character array from step one and set the Boolean array index ( index = ascii value of the character). Ignore if that index is already true.
- Create a separate array based on the index set as true in Boolean array.
- This approach will retain the order of characters.
Code
C
void removeDuplicates(char * str) {
if (!str)
return;
int len = strlen(str);
if (len < 2)
return;
bool hit[256];
for (int i = 0; i < 256; ++i)
hit[i] = false;
hit[str[0]] = true;
int tail = 1;
for (int i = 1; i < len; ++i) {
if (!hit[str[i]]) {
str[tail] = str[i];
++tail;
hit[str[i]] = true;
}
}
str[tail] = '\0';
}
Java
public static String removeDuplicates(String s) {
char[] chrArr = s.toCharArray();
boolean[] asciiChrSet = new boolean[256];
StringBuilder stb = new StringBuilder();
for (int i = 0; i < chrArr.length; i++) {
if (asciiChrSet[chrArr[i]]) {
continue;
}
asciiChrSet[chrArr[i]] = true;
stb.append(chrArr[i]);
}
return stb.toString();
}
Complexity
- ⏰ Time complexity:
O(n)
- 🧺 Space complexity:
O(n)
Method 6 - Special Case of Chars Being a…z
Putting the constraint of space and ordering Suppose the problem becomes:
Write code to remove the duplicate characters in a string without using any additional buffer. NOTE: One or two additional variables are fine. An extra copy of the array is not.
Then what solution do we have? Method 2 cannot be used as we want to keep the ordering same. Method 3 cannot be used, as it uses hashtable, which uses extra spaces. Well, nothing is stopping us from using method 1, but it is brute force, and can we do better?
The answer is yes, if the interviewer confirms, that your string is a constrained by say it can have only small case integers. Since one or two additional variables are fine but no buffer is allowed, you can simulate the behaviour of a hashmap by using an integer to store bits instead. This does not work for normal cases. It will only work from a..z case sensitive. No spaces supported. This would work miracles in a 256bit system to process the entire ASCII range. But it does run in O(N). I still +1 this one.
Code
C
This simple solution runs at O(n), which is faster than yours. Also, it isn’t conceptually complicated and in-place :
public static void removeDuplicates(char[] str) {
int map = 0;
for (int i = 0; i < str.length; i++) {
if ((map & (1 << (str[i] - 'a'))) > 0) // duplicate detected
str[i] = 0;
else // add unique char as a bit '1' to the map
map |= 1 << (str[i] - 'a');
}
}
The drawback is that the duplicates (which are replaced with 0’s) will not be placed at the end of the str[] array. However, this can easily be fixed by looping through the array one last time. Also, an integer has the capacity for only regular letters.
Complexity
- ⏰ Time complexity:
O(n)
- 🧺 Space complexity:
O(1)