Regular grammar is a context-free grammar in which every production is restricted to one of the following forms:
-
A → aB, or
-
A → w, where A and B are the nonterminals, a is a terminal symbol, and w is in T *.
The ∈-productions are permitted as a special case when L(G) contains ∈. This grammar is called "regular grammar," because if the format of every production in CFG is restricted to A → aB or A → a, then the grammar can specify only regular sets. Hence, a finite automata exists that accepts L(G), if G is regular grammar. Given a regular grammar G, a finite automata accepting L(G) can be obtained as follows:
-
The number of states of the automata will be equal to the number of nonterminals of the grammar plus one; that is, there will be a state corresponding to every nonterminal of the grammar. And one more state will be there, which will be the final state of the automata. The state corresponding to the start symbol of the grammar will be the initial state of the automata. If L(G) contains ∈, then make the start state also the final state.
-
The transitions in the automata can be obtained as follows:
EXAMPLE 1
Consider the regular grammar shown below and the transition diagram of the automata, shown in Figure 1, that accepts the language generated by the grammar.
This is a non-deterministic automata. Its deterministic equivalent can be obtained as follows:
The transition diagram of the automata is shown in Figure 2.
Consider the following grammar:
The transition diagram of the finite automata that accepts the language generated by the above grammar is shown in Figure 3.
This is a non-deterministic automata. Its deterministic equivalent can be obtained as follows, and the transition diagram is shown in Figure 4.
Given a finite automata M, a regular grammar G that generates L(M) can be obtained as follows:
-
Associate suitable variables like A, B, C, etc, with the states of the automata. The labels of the states can also be used as variable names.
-
Obtain the productions of the grammar as follows. If δ (A, a) = B, then add a production A → aB to the list of productions of the grammar. If B is a final state, then add either A → a or B → ∈, to the grammar's list of productions.
-
The variable associated with the initial state of the automata is the start symbol of the grammar.
For example consider the automata shown in Figure 5.
The regular grammar that generates the language accepted by the automata shown in Figure 5 will have the following productions:
or
where A is the start symbol. Both the grammars are same, but the first one contains ∈-productions, whereas the second is ∈-free.
EXAMPLE 2
Find out whether the following grammar generates the same language.
G1:
G2:
Since the grammars G1 and G2 are the regular grammars, L(G1) = L(G2) if the minimal state automata accepting L(G1), and the minimal state automata accepting L(G2) are identical. The transition diagram of the automata accepting L(G1) is shown in Figure 6.
The automata is deterministic. Hence, to minimize, it we proceed as follows. Since state D is an unreachable state, eliminate it first. So, after eliminating state D, we get the transition diagram shown in Figure 7.
We then identify the nondistinguishable states of the automata shown in Figure 7, as follows. Initially, we have two groups:
Since
state B is distinguishable from rest of the members of Group I. Hence, we divide Group I into two groups—one containing A, and other containing E and C, as shown below:
Since
partitioning of Group II is not possible, because the transitions from all the members of Group II only go to Group II. Similarly:
Partitioning of Group II is not possible, because the transitions from all the members of Group II only go to Group I. And since:
partitioning of Group III is not possible, because the transitions from all the members of Group III only go to Group I. Similarly:
Partitioning of Group III is not possible, because the transitions from all the members of Group III only go to Group III. Hence, states E and C are nondistinguishable states. States B and F are also nondistinguishable states. Therefore, if we merge E and C to form a state E1, and we merge B and F to form B1, we get the automata shown in Figure 8.
Since no dead states exist in the automata shown in Figure 8, it is a minimal state automata that accepts L(G1). The transition diagram of the non-deterministic automata that accepts L(G2) is shown in Figure 8.
Its equivalent deterministic automata is as follows, and the transition diagram is shown in Figure 9.
This automata does not contain unreachable, nondistinguishable states or dead states. Hence, it is a minimal state automata accepting L(G2), and since it is identical to the minimal state automata accepting L(G1), L(G2) = L(G2); and therefore, G1 and G2 generate the same language.
Obtaining a Regular Expression from the Regular Grammar
Given a regular grammar G, a regular expression that specifies L(G) can be directly obtained as follows:
-
Replace the "→" symbols in the grammar's productions with "=" symbols to get a set of equations.
-
Solve the set of equations obtained above to obtain the value of the variable S, where S is the start symbol of the grammar. The result is the regular expression specifying L(G).
For example consider the following regular grammar:
-
Replacing the "→" symbol in the productions of the grammar with the "=" symbol, we get the following set of equations:
From equation (III) we get:
because equation (III) is of the form A = aA | b, where a and b are the expressions that do not contain variable A, and the solution of this is A = a*b. Similarly, from equation (II) we get:
Substituting the values of A in (I) gives:
Hence, the required regular expression is: