Comment on page
In this exercise, you're going to decompress a compressed string.
Your input is a compressed string of the format
number[string]and the decompressed output form should be the
numbertimes. For example:
- Number can have more than one digit. For example,
10[a]is allowed, and just means aaaaaaaaaa
- One repetition can occur inside another. For example,
- Characters allowed as input include digits, small English letters and brackets
- Digits are only to represent amount of repetitions.
- Letters are just letters.
- Brackets are only part of syntax of writing repeated substring.
- Input is always valid, so no need to check its validity.
The Idea: The first thing that needs to be done is splitting the input strings into tokens we can recognize. This will simplify things a lot later. The tokens we are interested in are brackets, numbers, and characters. Next we step into an algorithm that will match brackets. We can simulate this using a stack. On the instance we find a closing ], the top of the stack will reveal the opening brace. We can use the top of the index to extract two details: the string, and the multiplier for the string, of which will appear to the right and left of the opening delimiter respectfully.
Then come the most crucial part of the algorithm: 1) If we find a previous instance in the stack that is + 1 from its current length, we do this:
tree_level[len(s)] += multiplier * (my_str + tree_level[len(s) + 1]), otherwise we do this:
tree_level[len(s)] += (multiplier * my_str). The first identifies when we have a nested case:
a[2b[3c]], for example. so that
cccfrom stack 2 becomes
bbcccin stack 1. The point is that
acould have access to
bbccby looking at stack 0 + 1, which becomes
abbccc. Eventually the root node is reached, and we can return 0th index of
In the second case, if there isnt a child (+1 stack index), then we append to the string on the same level. This would be equivalent to the non-nested case. For example,
3[abc]3[d], where both
abcpoint to the same stack level (0).
def extract_components(tokens, open_i):
my_str = tokens[open_i + 1]
multiplier = tokens[open_i - 1]
return int(multiplier), my_str
if not compr:
# delimit by '[' or ']' or by numbers, then by default the characters will be remain and be delimited as well
# then filter out spaces to ensure proper / expected indexing in our algorithm
tokens = re.split('(\[|\]|[0-9]+)', compr)
tokens = [token for token in tokens if token != '']
tree_level = collections.defaultdict(str)
s = 
for i, token in enumerate(tokens):
if token == '[':
elif token == ']':
multiplier, my_str = extract_components(tokens, s.pop())
if my_str.isnumeric(): # checks for instance of empty string
my_str = ''
if len(s) + 1 in tree_level:
tree_level[len(s)] += multiplier * (my_str + tree_level[len(s) + 1])
tree_level[len(s)] += (multiplier * my_str)
t = '2[abc]2[aaa3[ok2[zz]3[qq2[d]]]]'
print(uncompress_str(t)) # 'abcabcaaaokzzzzqqddqqddqqddokzzzzqqddqqddqqddokzzzzqqddqqddqqddaaaokzzzzqqddqqddqqddokzzzzqqddqqddqqddokzzzzqqddqqddqqdd'
t = '2[abc]3[b]0[nope]2[aaa3[ok]]'
print(uncompress_str(t)) # 'abcabcbbbaaaokokokaaaokokok'
t = '2[3[a]1[b]]'
print(uncompress_str(t)) # 'aaabaaab'
t = '3[b2[c10[d]]]'
print(uncompress_str(t)) # 'bcddddddddddcddddddddddbcddddddddddcddddddddddbcddddddddddcdddddddddd'
t = '3[abc4[ab]]'
print(uncompress_str(t)) # 'abcabababababcabababababcabababab'
t = '3[abc]4[ab]1[c]'
print(uncompress_str(t)) # 'abcabcabcababababc'
t = '10[a]'
print(uncompress_str(t)) # 'aaaaaaaaaa'
t = '1[b2[c3[d2[abcd]]]]'
print(uncompress_str(t)) # 'bcdabcdabcddabcdabcddabcdabcdcdabcdabcddabcdabcddabcdabcd'