1.4 REGULAR EXPRESSION NOTATION/FINITE AUTOMATA DEFINITIONS
String
A string is a finite sequence of symbols. We use a letter, such as w, to denote a string. If w is the string, then the length of string is denoted as | w |, and it is a count of number of symbols of w. For example, if w = xyz, | w | = 3. If | w | = 0, then the string is called an "empty" string, and we use ∈ to denote the empty string.
Prefix
A string's prefix is the string formed by taking any number of leading symbols of string. For example, if w = abc, then ∈, a, ab, and abc are the prefixes of w. Any prefix of a string other than the string itself is called a "proper" prefix of the string.
Suffix
A string's suffix is formed by taking any number of trailing symbols of a string. For example, if w = abc, then ∈, c, bc, and abc are the suffixes of the w. Similar to prefixes, any suffix of a string other than the string itself is called a "proper" suffix of the string.
Concatenation
If w1 and w2 are two strings, then the concatenation of w1 and w2 is denoted as w1.w2—simply, a string obtained by writing w1 followed by w2 without any space in between (i.e., a juxtaposition of w1 and w2). For example, if w1 = xyz, and w2 = abc, then w1.w2 = xyzabc. If w is a string, then w.∈ = w, and ∈.w = w. Therefore, we conclude that ∈ (empty string) is a concatenation identity.
Alphabet
An alphabet is a finite set of symbols denoted by the symbol Σ.
Language
A language is a set of strings formed by using the symbols belonging to some previously chosen alphabet. For example, if Σ = { 0, 1 }, then one of the languages that can be defined over this Σ will be L = { ∈, 0, 00, 000, 1, 11, 111, … }.
Set
A set is a collection of objects. It is denoted by the following methods:
-
We can enumerate the members by placing them within curly brackets ({ }). For example, the set A is defined by: A = { 0, 1, 2 }.
-
We can use a predetermined notation in which the set is denoted as: A = { x | P (x) }. This means that A is a set of all those elements x for which the predicate P (x) is true. For example, a set of all integers divisible by three will be denoted as: A = { x | x is an integer and x mod 3 = 0}.
Set Operations
-
Union: If A and B are the two sets, then the union of A and B is denoted as: A ∪ B = { x | x in A or x is in B }.
-
Intersection: If A and B are the two sets, then the intersection of A and B is denoted as: A ∩ B = { x | x is in A and x is in B }.
-
Set difference: If A and B are the two sets, then the difference of A and B is denoted as: A − B = { x | x is in A but not in B }.
-
Cartesian product: If A and B are the two sets, then the Cartesian product of A and B is denoted as: A � B = { (a, b) | a is in A and b is in B }.
-
Power set: If A is the set, then the power set of A is denoted as : 2A = P | P is a subset of A } (i.e., the set contains of all possible subsets of A.) For example:
-
Concatenation: If A and B are the two sets, then the concatenation of A and B is denoted as: AB = { ab | a is in A and b is in B }. For example, if A = { 0, 1 } and B = { 1, 2 }, then AB = { 01, 02, 11, 12 }.
-
Closure: If A is a set, then closure of A is denoted as: A* = A0 ∪ A1 ∪ A2 ∪ … ∪A∞, where Ai is the ith power of set A, defined as Ai = A.A.A …i times.
-
(i.e., the set of all possible combination of members of A of length 0)
-
(i.e., the set of all possible combination of members of A of length 1)
-
(i.e., the set of all possible combinations of members of A of length 2)
Therefore A* is the set of all possible combinations of the members of A. For example, if Σ = { 0,1), then Σ* will be the set of all possible combinations of zeros and ones, which is one of the languages defined over Σ.
|