Below you'll find a list of 26 practice problems for writing regular expressions, roughly arranged in increasing order of difficulty.
Use the asterisk (*) for the Kleene star and the pipe (|) for "or". Whitespace is not ignored. Happy learning!
Design a regular expression for the language consisting of all binary strings...
except the empty string
Solution:
(0|1)(0|1)*. Let's start with (0|1)*, which generates all binary strings. We want strings generated by our regex to be non-empty; in other words, we want there to be at least one character in our strings. Since we're working with binary strings, this character must be either a 0 or a 1. Thus, we can force this one-character requirement by prepending (0|1).
with any number of 0s followed by any number of 1s
Solution:
0*1*. Recall that the Kleene star generates zero or more of the preceding character. Accordingly, 0* will generate any number of 0s and 1* will generate any number of 1s. It follows that 0*1* will generate any number of 0s followed by any number of 1s.
where the second to last character is a 1
Solution:
(0|1)*1(0|1). Let's start by generating all binary strings, (0|1)*. If we wanted the last character to be a 1, we could simply append a 1 to the end of our regex: (0|1)*1. However, the question asks for the second to last character to be a 1. Thus, we must force there to be exactly one more character after the 1. Appending that requirement to our regex, we get (0|1)*1(0|1).
where every run of 0s is at least of length 3
Solution:
(0000*|1)*. 0s must occur in groups of three or more. 000 forces a group of three 0s, and 0* allows any number of additional 0s. Putting these together, 0000* ensures that each run of 0s is at least of length 3. Now we can simply replace the 0 in the expression for all binary strings, (0|1)*, with 0000* to arrive at our final answer.
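One way to gain confidence in this answer is to compare it against a direct check on every short binary string. The sketch below uses Python's re module, whose *, | and grouping behave like the notation used here. The helper names runs_of_zeros_ok and mismatches are our own, and testing up to a fixed length is evidence rather than a proof.

```python
import re
from itertools import groupby, product

PATTERN = re.compile(r"(0000*|1)*")

def runs_of_zeros_ok(s):
    # Every maximal run of 0s must have length >= 3.
    return all(len(list(run)) >= 3 for ch, run in groupby(s) if ch == "0")

def mismatches(max_len=10):
    """Return the strings on which the regex and the direct check disagree."""
    bad = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if bool(PATTERN.fullmatch(s)) != runs_of_zeros_ok(s):
                bad.append(s)
    return bad

# An empty list means the two agree on every string up to the chosen length.
print(mismatches())
```

Note the use of fullmatch rather than search: we want the whole string to be generated by the regex, not just some substring of it.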
containing 0110
Solution:
(0|1)*0110(0|1)*. Let's start off with 0110, since our strings must contain that substring. Then, we can simply sandwich 0110 between regexes that allow any binary string on either end of our substring. This gives us (0|1)*0110(0|1)*.
containing 0110 or 1001
Solution:
(0|1)*(0110|1001)(0|1)*. Let's start with (0110|1001), since our strings must contain one of those two substrings. We don't care what's on either end of our substring, so we can pad our substring with the regex for all binary strings to get (0|1)*(0110|1001)(0|1)*.
containing 0110 and 1001
Solution:
(0|1)*(0110(0|1)*1001|1001(0|1)*0110|100110|011001)(0|1)*. Similar to the two questions above, we can pad the expression for the substring we want with the regex for all binary strings to get something like this: (0|1)* substring (0|1)*. The tricky part is writing the expression for this substring. We could have 0110 immediately followed by 1001, represented by 01101001. However, since we only care about the presence of 0110 and 1001, we could also have any number of characters between them, so something like 0110(0|1)*1001 would be better. Of course, we must account for the alternate case where 1001 comes before 0110, so we can "or" that in as well: 0110(0|1)*1001|1001(0|1)*0110. Nonetheless, there's yet another edge case: when 0110 and 1001 overlap. Consider the string 011001; the first four characters are 0110 and the last four characters are 1001. This string should be accepted by our regex, as should 100110. We can "or" these in as well: 0110(0|1)*1001|1001(0|1)*0110|100110|011001. Finally, we can pad this expression for the substring with (0|1)* on both sides to arrive at our final answer.
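To see why the two overlapping cases matter, here is a small Python sketch that compares the final regex, and a weaker variant without 100110 and 011001, against a plain substring check. The variant NO_OVERLAP and the helper names are hypothetical, introduced only for this illustration.

```python
import re
from itertools import product

FULL = re.compile(r"(0|1)*(0110(0|1)*1001|1001(0|1)*0110|100110|011001)(0|1)*")
# Hypothetical weaker variant that forgets the two overlapping cases.
NO_OVERLAP = re.compile(r"(0|1)*(0110(0|1)*1001|1001(0|1)*0110)(0|1)*")

def contains_both(s):
    return "0110" in s and "1001" in s

def disagreements(pattern, max_len):
    """List strings where `pattern` and the substring check give different answers."""
    found = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if bool(pattern.fullmatch(s)) != contains_both(s):
                found.append(s)
    return found

print(disagreements(FULL, 10))       # []
print(disagreements(NO_OVERLAP, 6))  # ['011001', '100110']
```

The only strings of length six or less that contain both substrings are the two overlapping ones, which is exactly what the weaker variant misses.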
of even length
Solutions:
(00|01|10|11)*. Strings of even length must be built up using building blocks that are two characters long. Since our alphabet only consists of 0 and 1, these building blocks must be either 00, 01, 10, or 11. We can "or" these together to get (00|01|10|11)*. Note that this is very similar to how we originally constructed the regex for all binary strings, (0|1)*, just with different building blocks.
((0|1)(0|1))*. An equivalent way to describe strings of even length is strings made up of zero or more blocks of length two. The "zero or more" phrasing is a hint to use the Kleene star, so our final answer should look something like ()*, with the insides of the parentheses specifying blocks of length two. Well, blocks of length two simply have two characters: (0|1)(0|1). Thus, we get ((0|1)(0|1))*.
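When a question has more than one solution, it can be reassuring to check that the solutions accept exactly the same strings. A minimal Python sketch, with helper names of our own choosing, that tests both regexes against the length condition on all strings up to a small bound:

```python
import re
from itertools import product

BLOCKS = re.compile(r"(00|01|10|11)*")
PAIRS = re.compile(r"((0|1)(0|1))*")

def all_binary_strings(max_len):
    """Yield every binary string of length 0 through max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def both_match_exactly_even_lengths(max_len=10):
    # Both regexes should accept a string precisely when its length is even.
    return all(
        bool(BLOCKS.fullmatch(s)) == bool(PAIRS.fullmatch(s)) == (len(s) % 2 == 0)
        for s in all_binary_strings(max_len)
    )

print(both_match_exactly_even_lengths())
```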
not ending in 11
Solution:
(0|1)*(00|01|10)|0|1|. There are only four binary strings of length two: 00, 01, 10, and 11. Since the string cannot end with 11, it must end with one of the other three, (0|1)*(00|01|10). This regex is almost correct! We forgot about binary strings that are less than two characters long. Specifically, the empty string, 0, and 1. All of these are valid (since they don't end in 11), so we must "or" them in before arriving at our final answer.
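The effect of the short-string cases is easy to observe directly. This Python sketch runs a handful of strings through the final regex and through the intermediate one from the walkthrough; the names FULL and TWO_OR_MORE are just labels for this example. Python's re module accepts an empty alternative, so the trailing | can be written as is.

```python
import re

FULL = re.compile(r"(0|1)*(00|01|10)|0|1|")
# The intermediate regex from the walkthrough, before the short strings are added.
TWO_OR_MORE = re.compile(r"(0|1)*(00|01|10)")

for s in ["", "0", "1", "10", "011", "0110"]:
    # Columns: the string, accepted by FULL?, accepted by TWO_OR_MORE?
    print(repr(s), bool(FULL.fullmatch(s)), bool(TWO_OR_MORE.fullmatch(s)))
```

The two regexes only differ on the empty string, 0, and 1, which the intermediate regex rejects because it requires at least two characters.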
containing at least three 1s
Solution:
(0|1)*1(0|1)*1(0|1)*1(0|1)*. We need three 1s: 111. We can also have any combination of 0s and 1s interspersed between these 1s, so we add (0|1)* before and after all three of our 1s to get our final answer.
containing at most two 1s
Solution:
0*(1|)0*(1|)0*. To allow for a 1, but not require it, we can "or" 1 with the empty string: 1|. We need two of these, but we also need to allow any number of 0s between them and surrounding them, so we get 0*(1|)0*(1|)0*.
beginning and ending with 1
Solution:
1(0|1)*1|1. We can begin by padding the expression for all binary strings with a 1 on both sides: 1(0|1)*1. However, the shortest string this matches is of length two. The string "1" is also valid, since its first and last character (i.e., its only character) is a 1. Thus, we must "or" that edge case into our final answer.
with an even number of 1s
Solution:
0*(10*10*)*. The simplest case is a string with no 1s, which can be represented by 0*. Next, we must allow for groups of two 1s to be added to our string, perhaps something like 0*(11)*. However, we must also allow for 0s to be interspersed among them, so 0*(0*10*10*)* would be better. It turns out the first 0* inside the parentheses is unnecessary; either this is the first group of 1s, so the 0* at the beginning of the regex can add 0s, or this is not the first group of 1s, so the 0* at the end of the previous group of 1s can add 0s. Eliminating that redundancy, we arrive at our final answer.
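The claim that the inner leading 0* is redundant can be checked empirically. The Python sketch below compares the final answer and the longer version that keeps the extra 0* against a simple parity count of 1s. The constant and function names are ours, and the bound of length 10 is an arbitrary choice.

```python
import re
from itertools import product

FINAL = re.compile(r"0*(10*10*)*")
WITH_EXTRA_STAR = re.compile(r"0*(0*10*10*)*")
# The first attempt, which only allows the two 1s of a pair to be adjacent.
ADJACENT_ONLY = re.compile(r"0*(11)*")

def both_track_parity(max_len=10):
    """True if both regexes accept exactly the strings with an even number of 1s."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            even = s.count("1") % 2 == 0
            if bool(FINAL.fullmatch(s)) != even:
                return False
            if bool(WITH_EXTRA_STAR.fullmatch(s)) != even:
                return False
    return True

print(both_track_parity())
print(bool(ADJACENT_ONLY.fullmatch("101")))  # two 1s, but separated by a 0
```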
beginning with a run of 1s with length equivalent to 2 mod 3, followed by an even number of 0s
Solution:
11(111)*(00)*. 11 is the simplest group of 1s with length equivalent to 2 mod 3. We can add groups of three 1s to this string while maintaining the desired property: 11(111)*. Next, we need an even number of 0s, i.e. any number of groups of two 0s. We can achieve this with (00)*. Putting these together, we arrive at 11(111)*(00)*.
that start with 0 and have odd length, or start with 1 and have even length
Solutions:
0(00|01|10|11)*|1(0|1)(00|01|10|11)*. Recall from an earlier question that strings with even length can be represented by (00|01|10|11)*. It follows that strings of odd length can be matched by adding one character to even length strings: (0|1)(00|01|10|11)*. Now, for a binary string to begin with 0 and have odd length, there must be a string of even length after the 0: 0(00|01|10|11)*. In contrast, for a binary string to begin with 1 and have even length, there must be a string of odd length after the 1: 1(0|1)(00|01|10|11)*. We can "or" these two cases together to reach our final answer.
0((0|1)(0|1))*|1(0|1)((0|1)(0|1))*. See the above explanation, but replace (00|01|10|11)* with ((0|1)(0|1))* when attempting to match strings of even length.
starting and ending with the same character, excluding the empty string
Solution:
(1(0|1)*1)|(0(0|1)*0)|0|1. We can pad the expression for all binary strings with 1s on both sides: 1(0|1)*1. This allows any binary string to be matched, as long as it starts and ends with a 1. We can do the same padding with 0s, and then "or" these two cases together: 1(0|1)*1|0(0|1)*0. However, the shortest length string this regex matches is of length 2. Both strings of length 1, "0" and "1", should be matched, so we must "or" them in too: 1(0|1)*1|0(0|1)*0|0|1.
that when treated as a non-negative binary numeral are divisible by 2
Solution:
(0|1)*0. The tricky part of this question is convincing yourself that any binary number ending in 0 is divisible by 2. Once you've done that, just append a 0 to the expression for all binary strings: (0|1)*0.
with alternating 0s and 1s (i.e., not containing 00 or 11)
Solutions:
(1|)(01)*(0|). Alternating 0s and 1s essentially means that we have unlimited repetitions of 01: (01)*. However, it is also possible to precede the first 01 with a 1 without violating the rule. For instance, 0101 is valid, as is 10101. This can be allowed by prepending (1|) to our regex: (1|)(01)*. Similarly, it is possible to append a 0 to the last 01 without violating the rule: (1|)(01)*(0|).
(0|)(10)*(1|). Same logic as the solution above, but we allow unlimited repetitions of 10 instead. Accordingly, we allow 0 to be prepended and 1 to be appended to our string.
where all occurrences of 000 come after any 1s (i.e., the last 1 must precede the first 000)
Solution:
(1|01|001)*0*. The first half of this string must not have 000, and the second half must not have any 1s. The second half is easier to tackle; if there are no 1s, there can only be 0s, so 0* suffices. The first half is trickier. We could try allowing any number of length-three strings except 000: (001|010|011|100|101|110|111)*. This won't work; for instance, we could generate 100 and then 011, and in the middle we'd have an occurrence of 000. More generally, applying the Kleene star to an option ending in a 0 is inadvisable since it allows sneaky occurrences of 000. Considering all options of length up to three and eliminating those that end in 0, we are left with 1, 01, 11, 001, 011, 101, and 111. We could just "or" these together, add the regex for the second half, and get a valid solution: (1|01|11|001|011|101|111)*0*. However, we can make our answer significantly simpler. 11 and 111 can be generated from 1, so we can get rid of those. Similarly, 101 and 011 can be built up from 01 and 1, so we can eliminate those options as well. This leaves us with (1|01|001)*0*.
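Both the failure of the length-three attempt and the correctness of the final answer can be probed with a short Python sketch. Here the condition from the question is written as a hypothetical helper, last_one_precedes_first_000, and the length-three attempt is spelled out as an explicit alternation of the seven blocks other than 000.

```python
import re
from itertools import product

FINAL = re.compile(r"(1|01|001)*0*")
# The length-three attempt: any block of three characters other than 000.
BLOCKS_OF_THREE = re.compile(r"(001|010|011|100|101|110|111)*")

def last_one_precedes_first_000(s):
    # Either 000 never occurs, or nothing from its first occurrence onward is a 1.
    i = s.find("000")
    return i == -1 or "1" not in s[i:]

def final_agrees(max_len=10):
    """True if FINAL accepts exactly the strings satisfying the condition."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if bool(FINAL.fullmatch(s)) != last_one_precedes_first_000(s):
                return False
    return True

print(final_agrees())
# 100 followed by 011 is accepted by the block attempt, yet 000 is followed by 1s.
print(bool(BLOCKS_OF_THREE.fullmatch("100011")), last_one_precedes_first_000("100011"))
```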
that don't contain 00
Solution:
(01|1)*(0|). Excluding 00 necessitates that all 0s either be followed by a 1 or be at the end of the string. For the first condition, we make a slight change to our regex for all binary strings: (01|1)*. For the second condition, we must allow (but not force) a lone 0 to occur at the end of the string, which can be accomplished by "or"-ing a 0 with the empty string. Putting this together, we get (01|1)*(0|).
that don't contain 000
Solutions:
(|0|00)(1(|0|00))*. We can allow at most two 0s before a 1 is required. At most two 0s can be matched by (|0|00). We can allow for, but not force, 1s as such: (|0|00)1*. However, once a 1 appears, we can reset our counter and allow a 0 or 00. Thus, we can replace the 1* with (1(|0|00))*.
(1|01|001)*(|0|00). Recall from the problem regarding "where all occurrences of 000 come after any 1s" that (1|01|001)* prevents any occurrences of 000. If you do not understand why, please read the solution for that question. Now, we must adapt this regex for the task at hand. Specifically, this regex only matches the empty string and strings that end in 1, but it is possible for a string that excludes 000 to end in a 0 or 00. Those cases can be handled by appending (|0|00) to the earlier regex. Note that the empty string is one of the options since it's OK for a string to end with a 1.
that don't contain 110
Solution:
(0|10)*1*. The main idea here is that once a 11 appears, we can no longer have a 0 for the rest of the string. To prevent 11, we must force a 0 to follow each 1: (0|10)*. Of course, at the end of the string we can have a long run of 1s as long as there are no 0s after it. In other words, once the 11 appears, we're left with only 1s: 1*.
that don't contain 101
Solution:
0*(1|000*)*0*. The main idea here is that 1s must either have nothing in between them, or have at least two 0s in between them. This can be matched by (1|000*)*. While this regex sufficiently excludes 101, it also rejects strings with a lone 0 before the first 1 or after the last 1, such as 01 and 10. Since those 0s are not in between the 1s, they have no impact on the presence of 101 and should be allowed. Accordingly, we can adapt our regex to get 0*(1|000*)*0*.
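A quick experiment makes the role of the outer 0* padding concrete. In this Python sketch, UNPADDED is our label for the intermediate regex (1|000*)*, and the brute-force bound of length 10 is an arbitrary choice.

```python
import re
from itertools import product

FINAL = re.compile(r"0*(1|000*)*0*")
UNPADDED = re.compile(r"(1|000*)*")

def wrongly_rejected(pattern, max_len):
    """Strings without 101 that `pattern` nevertheless fails to match."""
    missed = []
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if "101" not in s and not pattern.fullmatch(s):
                missed.append(s)
    return missed

def accepts_101(pattern, max_len):
    """True if `pattern` matches any string that contains 101."""
    return any(
        "101" in "".join(bits) and pattern.fullmatch("".join(bits))
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )

print(wrongly_rejected(UNPADDED, 2))  # ['0', '01', '10']
print(wrongly_rejected(FINAL, 10), accepts_101(FINAL, 10))
```

The unpadded regex never lets 101 through, but it is too strict about single 0s at either end; the padded version fixes that without admitting any new occurrence of 101.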
that don't contain 11110
Solution:
(0|10|110|1110)*1*. The main idea here is that all 0s must be preceded by three or fewer ones. This can be represented by (0|10|110|1110). However, this regex fails to allow a run of 1s at the end, so we must append 1* to arrive at our final answer. This solution was inspired by this StackExchange thread.
that when treated as a non-negative binary numeral are divisible by 4 with no redundant leading 0s
Solution:
1(0|1)*00|0. In general, any binary number that ends in 00 is divisible by 4. Since we must exclude leading 0s, we can force a 1 at the beginning. These two requirements give us 1(0|1)*00. However, the lowest binary number this regex produces is 100, which translates to 4 in decimal. This leaves out the edge case of 0, which we must "or" in before we can arrive at our final answer.
that when treated as a non-negative binary numeral are divisible by 3 (for simplicity, you can ignore the case of the empty string)
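Since this question is about numbers rather than arbitrary strings, a natural check is to loop over integers instead. The Python sketch below converts each n to its binary form without leading 0s using format(n, "b") and compares the regex against n % 4 == 0. The function name and the limit of 1024 are our own choices.

```python
import re

PATTERN = re.compile(r"1(0|1)*00|0")

def agrees_with_arithmetic(limit=1024):
    """True if, for every n below limit, the regex matches exactly when 4 divides n."""
    return all(
        bool(PATTERN.fullmatch(format(n, "b"))) == (n % 4 == 0)
        for n in range(limit)
    )

print(agrees_with_arithmetic())
# Numerals with redundant leading 0s should be rejected even when divisible by 4.
print(bool(PATTERN.fullmatch("0100")), bool(PATTERN.fullmatch("00")))
```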
Solution:
(1(01*0)*1|0)*. Your first instinct might be to find a mathematical rule for binary numbers being divisible by 3 and then write a regex for it. While this is a good idea, it won't work, since you cannot (to the best of my knowledge) write a regex for the two methods that people usually use to check binary divisibility by 3. Instead, the best approach seems to be to make an equivalent DFA for the problem, and then translate it to a regular expression. Here's one such DFA, and an accompanying translation into a regex. This is a really tricky question. If you find a better solution, please email me and let me know!
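As a rough illustration of the DFA idea, here is a Python sketch of a three-state machine whose state is the remainder mod 3 of the number read so far. We are assuming this common remainder-tracking construction; it may not be laid out the same way as the DFA referred to above. The sketch then checks that the regex accepts exactly the strings on which the machine ends in state 0.

```python
import re
from itertools import product

PATTERN = re.compile(r"(1(01*0)*1|0)*")

def remainder_mod_3(s):
    """Simulate the DFA: the state is the value of the bits read so far, mod 3."""
    state = 0
    for bit in s:
        # Appending a bit doubles the current value and adds the new bit.
        state = (2 * state + int(bit)) % 3
    return state

def regex_matches_dfa(max_len=10):
    # Start at length 1, since the question lets us ignore the empty string.
    for n in range(1, max_len + 1):
        for bits in product("01", repeat=n):
            s = "".join(bits)
            if bool(PATTERN.fullmatch(s)) != (remainder_mod_3(s) == 0):
                return False
    return True

print(regex_matches_dfa())
print(remainder_mod_3("110"), int("110", 2) % 3)  # both are 0, since 6 = 2 * 3
```

Reading the regex against the machine: 0 loops on state 0, the outer 1s move from state 0 to state 1 and back, and (01*0)* covers every excursion from state 1 into state 2 and back.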