Problem
Given a string formula
representing a chemical formula, return the count of each atom.
The atomic element always starts with an uppercase character, then zero or more lowercase letters, representing the name.
One or more digits representing that element’s count may follow if the count is greater than 1
. If the count is 1
, no digits will follow.
- For example,
"H2O"
and"H2O2"
are possible, but"H1O2"
is impossible.
Two formulas are concatenated together to produce another formula.
- For example,
"H2O2He3Mg4"
is also a formula.
A formula placed in parentheses, and a count (optionally added) is also a formula.
- For example,
"(H2O2)"
and"(H2O2)3"
are formulas.
Return the count of all elements as a string in the following form: the first name (in sorted order), followed by its count (if that count is more than 1
), followed by the second name (in sorted order), followed by its count (if that count is more than 1
), and so on.
The test cases are generated so that all the values in the output fit in a 32-bit integer.
Examples
Example 1:
Input: formula = "H2O"
Output: "H2O"
Explanation: The count of elements are {'H': 2, 'O': 1}.
Example 2:
Input: formula = "Mg(OH)2"
Output: "H2MgO2"
Explanation: The count of elements are {'H': 2, 'Mg': 1, 'O': 2}.
Example 3:
Input: formula = "K4(ON(SO3)2)2"
Output: "K4N2O14S4"
Explanation: The count of elements are {'K': 4, 'N': 2, 'O': 14, 'S': 4}.
Solution
Method 1 - Using Stack of frequency maps
We can use stack of frequency maps of element name and count it occurs in the formula. For eg. for H2O, frequency map is : {H:2, O:1}
First frequency map in the stack will be for the case, without brackets.
Now, while parsing, we have 4 kinds of tokens to take care of - 1) element name, 2) element number or multiplicity, 3) open, and 4) close bracket.
So, we iterate on the string and handle the cases:
Handling Parentheses:
- Use a stack to handle nested parentheses. Each time an opening parenthesis
(
is encountered, push a newfrequency map
onto the stack. - When a closing parenthesis
)
is encountered, pop the top counter and multiply its elements by the number following the parenthesis. Element Parsing: - Element Names: Characters following an uppercase letter that are lowercase are part of the atom name.
- Multiplicity: Digits following the atom name or a parenthesis indicate the count.
Once, we do that, we just use convert frequency map in stack to treemap, as elements need to in lexicographically ascending order.
Code
Java
class Solution {
public String countOfAtoms(String formula) {
Deque < Map<String, Integer>> stack = new ArrayDeque<>();
stack.push(new HashMap<>());
int n = formula.length();
for (int i = 0; i < n;) {
char ch = formula.charAt(i);
if (ch == '(') {
stack.push(new HashMap<>());
i++;
} else if (ch == ')') {
Map<String, Integer> top = stack.pop();
i++;
int iStart = i, multiplicity = 1;
while (i < n && Character.isDigit(formula.charAt(i))) {
i++;
}
if (i > iStart) {
multiplicity = Integer.parseInt(formula.substring(iStart, i));
}
for (String name: top.keySet()) {
int count = top.get(name);
stack.peek().put(name, stack.peek().getOrDefault(name, 0) + count * multiplicity);
}
} else {
int iStart = i++;
while (i < n && Character.isLowerCase(formula.charAt(i))) {
i++;
}
String name = formula.substring(iStart, i);
iStart = i;
while (i < n && Character.isDigit(formula.charAt(i))) {
i++;
}
int multiplicity = i > iStart ? Integer.parseInt(formula.substring(iStart, i)) : 1;
stack.peek().put(name, stack.peek().getOrDefault(name, 0) + multiplicity);
}
}
Map<String, Integer> count = stack.pop();
TreeMap<String, Integer> sortedCount = new TreeMap<>(count);
StringBuilder ans = new StringBuilder();
for (String name: sortedCount.keySet()) {
ans.append(name);
int multiplicity = sortedCount.get(name);
if (multiplicity > 1) {
ans.append(multiplicity);
}
}
return ans.toString();
}
}
Complexity
- ⏰ Time complexity:
O(n)
, wheren
is length of formula string.- Parsing the Formula: Each character of the formula is processed in a single pass, leading to a time complexity of O(n).
- Stack Operations: Pushing and popping from the stack occurs at most once per character. Stack operations themselves (push and pop) have constant time O(1) complexity.
- Manipulating Counters: Merging counters when encountering a closing parenthesis involves combining counts of elements. In the worst case, this could take linear time relative to the number of unique elements within a formula.
- 🧺 Space complexity:
O(n)
Dry Run
Lets take formula = “K4(ON(SO3)2)2”
Initial setup:
stack = [{}], n = 13, i = 0
Iterations
i = 0
We found K
, now look ahead for digits, and multiplicity is 4. stack = [{K: 4}]
i = 2
We encounter (
⇨ push new counter to stack ⇨ stack = [{K: 4}, {}]
i = 3
We encounter O
, but no digits following. Hence stack = [{K: 4}, {O:1}]
.
i = 4
We encounter N
, but no digits following. Hence stack = [{K: 4}, {O:1, N:1}]
.
i = 5
We encounter (
, push new counter to stack ⇨ stack = [{K: 4}, {O:1, N:1}, {}]
i = 6
We encounter S
, but no digits following. Hence stack = [{K: 4}, {O:1, N:1}, {S:1}]
i = 7
We encounter O
, now look ahead for digits, and multiplicity is 3. Hence stack = [{K: 4}, {O:1, N:1}, {S:1,O:3}]
i = 9
We encounter )
⇨ we expect digit after it, and we get digit 2. Popout the top the frequency map or the counter, and multiply by 2.
stack = [{K: 4}, {O:1, N:1}]
multiplying top counter { 'S': 1, 'O': 3 }
by 2 gives { 'S': 2, 'O': 6 }
.
We merge it within the stack, hence merge { 'S': 2, 'O': 6 }
with { 'O': 1, 'N': 1}
⇨ to { 'O': 7, 'N': 1, 'S': 2 }
stack = [{'K': 4}, {'O': 7, 'N': 1, 'S': 2}]
i = 11
We encounter )
⇨ we expect digit after it, and we get digit 2. Popout the top the frequency map or the counter, and multiply by 2.
top counter = { 'O': 7, 'N': 1, 'S': 2 }
by 2 gives { 'O': 14, 'N': 2, 'S': 4 }
, and,
Merge { 'O': 14, 'N': 2, 'S': 4 }
with { 'K': 4 }
.
⇨ stack = [{ 'K': 4, 'O': 14, 'N': 2, 'S': 4 }]
i = 13
We break out of loop
Final result
- Result:
"K4N2O14S4"