Regular Expressions in Automata Theory
A regular expression is basically a shorthand way of showing how a regular language is built from the base set of regular languages.
The symbols are identical which are used to construct the languages, and any given expression that has a language closely associated with it.
A Regular Expression can be recursively defined as follows −
- ε is a Regular Expression indicates the language containing an empty string. (L (ε) = {ε})
- φ is a Regular Expression denoting an empty language. (L (φ) = { })
- x is a Regular Expression where L = {x}
- If X is a Regular Expression denoting the language L(X) and Y is a Regular Expression denoting the language L(Y), then
- X + Y is a Regular Expression corresponding to the language L(X) ∪ L(Y) where L(X+Y) = L(X) ∪ L(Y).
- X . Y is a Regular Expression corresponding to the language L(X) . L(Y) where L(X.Y) = L(X) . L(Y)
- R* is a Regular Expression corresponding to the language L(R*)where L(R*) = (L(R))*
- If we apply any of the rules several times from 1 to 5, they are Regular Expressions.
Examples of Regular Expressions
Example 1
If the regular expression is as follows −
$$\mathrm{a \:+\: b \: \cdot \: a*}$$
It can be written in fully parenthesized form as follows −
$$\mathrm{(a \:+\: (b \: \cdot \: (a*)))}$$
Regular Expressions Vs Languages
The symbols of the regular expressions are distinct from those of the languages. These symbols are given below −
Operators in Regular Expression
There are two binary operations on regular expressions (+ and ·) and one unary operator (*)
These are closely associated with the union, product and closure operations on the corresponding languages.
Example 2
The regular expression a + bc* is basically shorthand for the regular language {a} ∪ ({b} · ({c}*)).
Find the language of the given regular expression. It is explained below −
$$\mathrm{L(a \:+\: bc*) \\ \:\:\:\:\:=\: L(a)\: \cup\: L(bc*)\: \\ \:\:\:\:\:=\: L(a)\: \cup\: (L(b)\: \cdot \:L(c*)) \\ \:\:\:\:\:=\: L(a)\: \cup\: (L(b)\: \cdot \:L(c)*) \\ \:\:\:\:\:=\: \{a\}\: \cup\: (\{b\}\: \cdot \:\{c\}*) \\ \:\:\:\:\:=\: \{a\}\: \cup\: (\{b\}\: \cdot \:\{\wedge,\: c,\: c2,\: \dotso ,\: cn,\: \dotso , \}) \\ \:\:\:\:\:=\: \{a\}\: \cup\: \{b,\: bc,\: bc2,\: . . . ,\: bcn,\: . . . \} \\ \:\:\:\:\:=\: \{a,\: b,\: bc,\: bc2,\: \dotso ,\: bcn,\: \dotso \}}$$
Properties of Regular Expressions in TOC
All the properties held for any regular expressions R, E, F and can be verified by using properties of languages and sets.
Additive (+) Properties
The additive properties of regular expressions are as follows −
R + E = E + R
R + ∅ = ∅ + R = R
R + R = R
(R + E) + F = R + (E + F)
Product (.) Properties
The product properties of regular expressions are as follows −
R∅ = ∅R = ∅
R∧ = ∧R = R
(RE)F = R(EF)
Distributive Properties
The distributive properties of regular expressions are as follows −
R(E + F) = RE + RF
(R + E)F = RF + EF
Closure Properties
The closure properties of regular expressions are as follows −
∅* = ∧ * = ∧
R* = R*R* = (R*)* = R + R*
R* = ∧ + RR* = (∧ + R)R*
RR* = R*R
R(ER)* = (RE)*R
(R + E)* = (R*E*)* = (R* + E*)* = R*(ER*)*
All the properties can be verified by using the properties of languages and sets.
Example 1
Show that,
(∅ + a + b)* = a*(ba*)*
Using the properties above −
(∅ + a + b)* = (a + b)* (+ property)
= a*(ba*)* (closure property)
Example 2
Show that,
∧ + ab + abab(ab)* = (ab)*
Using the properties above −
∧ + ab + abab(ab)* = ∧ + ab(∧ + ab(ab)*)
= ∧ + ab((ab)*) (using R* = ∧ + RR*)
= ∧ + ab(ab)*= (ab)* (using R* = ∧ + RR* again)
Regular Expressions and Regular Set
| Regular Expressions | Regular Set |
|---|---|
| (0 + 10*) | L = { 0, 1, 10, 100, 1000, 10000, } |
| (0*10*) | L = {1, 01, 10, 010, 0010, } |
| (0 + ε)(1 + ε) | L = {ε, 0, 1, 01} |
| (a+b)* | Set of strings of as and bs of any length including the null string. So L = { ε, a, b, aa , ab , bb , ba, aaa.} |
| (a+b)*abb | Set of strings of as and bs ending with the string abb. So L = {abb, aabb, babb, aaabb, ababb, ..} |
| (11)* | Set consisting of even number of 1s including empty string, So L= {ε, 11, 1111, 111111, .} |
| (aa)*(bb)*b | Set of strings consisting of even number of as followed by odd number of bs , so L = {b, aab, aabbb, aabbbbb, aaaab, aaaabbb, ..} |
| (aa + ab + ba + bb)* | String of as and bs of even length can be obtained by concatenating any combination of the strings aa, ab, ba and bb including null, so L = {aa, ab, ba, bb, aaab, aaba, ..} |