CS403/SE307/CS355 - Theory of Computation
Department of Computer Science
National University of Ireland, Maynooth
T Naughton, NUIM
Back to CS355 home
Lab sheet 3 - Regular expressions
In this lab, you will be introduced to regular expressions. Once again, you will be formed into teams of two persons. The teams will be chosen randomly. This lab will be graded, by picking a question typical of the hardest problems on this sheet.
Regular expressions
A regular expression is a method of generating a regular language. It is like a template. All words that match the template are in the language. Let us assume that the alphabet we are dealing with is = {0, 1}. The shortest regular expressions over this alphabet would be:
There are also three ways of building up more complicated regular expressions:
A simple regular expression that is formed by combining regular expression 1, 2, and 6 is (0 1). This generates the language {0, 1} because regular expression 0 generates the language {0} and regular expression 1 generates the language {1} and
means take the union of the two languages. Similarly, combining 1, 2, and 7 gives us the regular expression (0
1) which generates the single-word language {01}. We can often remove the
from our regular expressions without loss of meaning, so the previous regular expression could be simply written as (01). The more complicated regular expression (((011)
(01))
(10)) generates the language {01, 10, 011}.
Infinite languages are specified with the star operation. The regular expression (1*) generates all finite length words consisting of zero or more 1s, i.e. {e, 1, 11, 111, 1111, ...}. The regular expression (*) and ((0
1)*) are equivalent and generate the language of all binary words (including e). The regular expression (1(
*)) generates the language of all binary words beginning with a 1.
The order of precedence is: * followed by followed by
. If we agree on this order of evaluation then the parentheses can be removed in many cases. For example, (1(
*)) can be simplified to 1
*, and (((011)
(01))
(10)) can be simplified to 011
01
10.
Some examples of regular expressions follow. You should make sure to understand each one. Assume that = {0, 1}.
*1 = {w : w
{0, 1}*, w ends in a 1}
1* = {w : w
{0, 1}*, w consists of a 0 or a 1 followed by zero or more 1s}
0*10* = {w : w {0, 1}*, w contains exactly one 1}
0* 0*10* = 0*(1
e)0* = {w : w
{0, 1}*, w contains at most one 1}
*1
* = {w : w
{0, 1}*, w contains at least one 1}
*001
* = {w : w
{0, 1}*, w contains the substring 001}
()* = {w : w
{0, 1}*, |w| is a multiple of three (includes length zero)}
*
=
* = {w : w
{0, 1}*, w
e}
01
= {01}
(0 e)(1
e) = {e, 0, 1, 01}
(0 e)1* = {w : w
{0, 1}*, w contains zero or more 1s or contains a 0 followed by zero or more 1s}
Problem 3.1 Below are several regular expressions and their corresponding languages. Explain in English why each regular expression generates the language it does. Assume that = {0, 1}.
Problem 3.2 For each of the following regular languages, devise a regular expression to generate it. You can assume that = {0, 1}.
Machine 3.3 Construct a FA that accepts the language defined by the regular expression (1 0)0*.
Machine 3.4 Let = {0, 1}. Construct a FA that accepts the language defined by the regular expression
e
.
Machine 3.5 Let = {0, 1}. Construct a FA that accepts the language defined by the regular expression
(0
11
111)*
*.
Problem 3.6 For each of the following regular languages, devise a regular expression to generate it. You can assume that = {0, 1}.
Suggested homework
Machine 3.H Let = {a, b}. Construct a FA that accepts the language defined by the regular expression e
(a
*a)
(b
*b).