If A → α_{1}  α_{2}  …  α_{n} are the Aproductions in the grammar, then a topdown parser can decide if a nonterminal A is to be expanded or not. And if it is to be expanded, the parser decides which Aproduction should be used. It looks at the next input symbol and finds out which of the α_{i} derivatives to a string that start with the terminal symbol comes next in the input. If none of the α_{i} derives to a string starting with a terminal symbol, the parser reports the failure; otherwise, it carries out the derivation of A using a production A → α_{i}, where α_{i} derives to a string whose first terminal symbol is the symbol coming next in the input. Therefore, we conclude that if the set of firstterminal symbols of the strings derivable from α_{i} is computed for each α_{i}, and this set is made available to the parser, then the parser can predict the right choice for the expansion of nonterminal A. This information can be easily computed using the productions of the grammar. We define a function FIRST(α), where α is in (V ∪ T)*, as follows:
FIRST(α) = Set of those terminals with which the strings derivable from α start
If α = XYZ, then FIRST(α) is computed as follows:
FIRST(α) = FIRST(XYZ) = { X } if X is terminal.
Otherwise,
FIRST(α) = FIRST(XYZ) = FIRST(X) if X does not derive to an empty string; that is, if
FIRST(X) does not contain ∈.
If FIRST(X) contains ∈, then
FIRST(α) = FIRST(XYZ) = FIRST(X) − { ∈ } ∪ FIRST(YZ)
FIRST(YZ) is computed in an identical manner:
FIRST(YZ) = { Y } if Y is terminal.
Otherwise,
FIRST(YZ) = FIRST(Y) if Y does not derive to an empty string (i.e., if FIRST(Y) does not contain ∈). If FIRST(Y) contains ∈, then
FIRST(YZ) = FIRST(Y) − { ∈ } ∪ FIRST(Z)
For example, consider the grammar:
FIRST(S) = FIRST(ACB) ∪ FIRST(CbB) ∪
FIRST(A) = FIRST(da) ∪ FIRST(BC)
FIRST(B) = FIRST(g) ∪ FIRST(∈ )
FIRST(C) = FIRST(h) ∪ FIRST(∈)
Therefore:
FIRST(BC) = FIRST(B) − { ∈ } ∪ FIRST(C)
Substituting in (II) we get:
FIRST(A)={ d } ∪ { g, h, ∈ }
FIRST(ACB) =FIRST(A) − { ∈ } ∪ FIRST(CB)
FIRST(CB) =FIRST(C) − { ∈ } ∪ FIRST(B)
Therefore, substituting in (III) we get:
FIRST(ACB)={ d, g, h, ∈ } ∪ { g, h, ∈ }
Similarly,
FIRST(CbB) =FIRST(C) − { ∈ } ∪ FIRST(bB)
Similarly,
FIRST(Ba) =FIRST(B) − { ∈ } ∪ FIRST(a)
Therefore, substituting in (I), we get:
FIRST(S)={ d, g, h, ∈ } ∪ { b, h, ∈ } ∪ { a, g, ∈ }
EXAMPLE 1
Consider the following grammar:
FIRST(aAb)= { a }
FIRST(cd)= { c }, and
FIRST(ef)= { e }
Hence, while deriving S, the parser looks at the next input symbol. And if it happens to be the terminal a, then the parser derives S using S → aAb. Otherwise, the parser reports an error. Similarly, when expanding A, the parser looks at the next input symbol; if it happens to be the terminal c, then the parser derives A using A → cd. If the next terminal input symbol happens to be e, then the parser derives A using A → ef. Otherwise, an error is reported.
Therefore, we conclude that if the righthand FIRST for the production S → aAb is computed, we can decide when the parser should do the derivation using the production S → aAb. Similarly, if the righthand FIRST for the productions A → cd and A → ef are computed, then we can decide when derivation is to be done using A → cd and A → ef, respectively. These decisions can be encoded in the form of table, as shown in Table 1, and can be made available to the parser for the correct selection of productions for derivations during parsing.
Table 1: Production Selections for Parsing Derivations
� 
a

b

c

d

e

f

$

S

S → aAb
 �  �  �  �  �  � 
A
 �  � 
A → cd
 � 
A → ef
 �  � 
The number of rows of the table are equal to the number of nonterminals, whereas the number of columns are equal to the number of terminals, including the end marker. The parser uses of the nonterminal to be derived as the row index of the table, and the next input symbol is used as the column index when the parser decides which production will be derived. Here, the production S → aAb is added in the table at [S, a] because FIRST(aAb) contains a terminal a. Hence, S must be derived using S → aAb if and only if the terminal symbol coming next in the input is a. Similarly, the production A → cd is added at [A, c], because FIRST(cd) contain c. Hence, A must be derived using A → cd if and only if the terminal symbol coming next in the input is c. Finally, A must be derived using A → ef if and only if the terminal symbol coming next in the input is e. Hence, the production A → ef is added at [A, e]. Therefore, we conclude that the table can be constructed as follows:
for every production A → α do
for every a in FIRST(α) do
TABLE[A, a] = A → α
Using the above method, every production of the grammar gets added into the table at the proper place when the grammar is ∈free. But when the grammar is not ∈free, ∈productions will not get added to the table. If there is an ∈production A → ∈ in the grammar, then deciding when A is to be derived to ∈ is not possible using the production's righthand FIRST. Some additional information is required to decide where the production A → ∈ is to be added to the table.
 Tip� 
The derivation by A → ∈ is a right choice when the parser is on the verge of expanding the nonterminal A and the next input symbol happens to be a terminal, which can occur immediately following A in any string occurring on the right side of the production. This will lead to the expansion of A to ∈, and the next leaf in the parse tree will be considered, which is labeled by the symbol immediately following A and, therefore, may match the next input symbol.

Therefore, we conclude that the production A → ∈ is to be added in the table at [A, b] for every b that immediately follows A in any of the production grammar's righthand strings. To compute the set of all such terminals, we make use of the function FOLLOW(A), where A is a nonterminal, as defined below:
FOLLOW(A) = Set of terminals that immediately follow A in any string occurring on the right side of productions of the grammar
For example, if A → αBβ is a production, then FOLLOW(B) can be computed using A → αBβ, as shown below:
FOLLOW(B) = FIRST(β) if FIRST(β) does not contain ∈.
Therefore, we conclude that when the grammar is not ∈free, then the table can be constructed as follows:

Compute FIRST and FOLLOW for every nonterminal of the grammar.

For every production A → α, do:
{
for every non∈ member a in FIRST(α) do
TABLE[A, a] = A → α
If FIRST(α) contain ∈ then
For every b in FOLLOW(A) do
TABLE[A, b] = A → α
}
Therefore, we conclude that if the table is constructed using the above algorithm, a topdown parser can be constructed that will be a nonbacktracking, or ‘predictive’ parser.
1 Implementation of a TableDriven Predictive Parser
A tabledriven parser can be implemented using an input buffer, a stack, and a parsing table. The input buffer is used to hold the string to be parsed. The string is followed by a "$" symbol that is used as a rightend maker to indicate the end of the input string. The stack is used to hold the sequence of grammar symbols. A "$" indicates bottom of the stack. Initially, the stack has the start symbol of a grammar above the $. It is a twodimensional array TABLE[A, a], where A is a nonterminal and a is a terminal, or $ symbol. The parser is controlled by a program that behaves as follows:

The program considers X, the symbol on the top of the stack, and the next input symbol a.

If X = a = $, then parser announces the successful completion of the parsing and halts.

If X = a ≠ $, then the parser pops the X off the stack and advances the input pointer to the next input symbol.

If X is a nonterminal, then the program consults the parsing table entry TABLE[x, a]. If TABLE[x, a] = x → UVW, then the parser replaces X on the top of the stack by UVW in such a manner that U will come on the top. If TABLE[x, a] = error, then the parser calls the errorrecovery routine.
For example consider the following grammar:
FIRST(S) = FIRST(aABb) = { a }
FIRST(A) = FIRST(c) ∪ FIRST(∈) = { c, ∈ }
FIRST(B) = FIRST(d) ∪ FIRST(∈) = { d, ∈ }
Since the rightend marker $ is used to mark the bottom of the stack, $ will initially be immediately below S (the start symbol) on the stack; and hence, $ will be in the FOLLOW(S). Therefore:
Using S → aABb, we get:
Therefore, the parsing table is as shown in Table 2.
Table 2: Production Selections for Parsing Derivations
� 
a

b

c

d

$

S

S → aABb
 �  �  �  � 
A
 � 
A → ∈

A → c

A → ∈
 � 
B
 � 
B → ∈
 � 
B → d
 � 
Consider an input string acdb. The various steps in the parsing of this string, in terms of the contents of the stack and unspent input, are shown in Table 3.
Table 3: Steps Involved in Parsing the String acdb
Stack Contents

Unspent Input

Moves


$S

acdb$

Derivation using S → aABb

$bBAa

acdb$

Popping a off the stack and advancing one position in the input

$bBA

cdb$

Derivation using A → c

$bBc

cdb$

Popping c off the stack and advancing one position in the input

$bB

db$

Derivation using B → d

$bd

db$

Popping d off the stack and advancing one position in the input

$b

b$

Popping b off the stack and advancing one position in the input

$

$

Announce successful completion of the parsing

Similarly, for the input string ab, the various steps in the parsing of the string, in terms of the contents of the stack and unspent input, are shown in Table 4.
Table 4: Production Selections for String ab Parsing Derivations
Stack Contents

Unspent Input

Moves


$S

ab$

Derivation using S → aABb

$bBAa

ab$

Popping a off the stack and advancing one position in the input

$bBA

b$

Derivation using A → ∈

$bB

b$

Derivation using B → ∈

$b

b$

Popping b off the stack and advancing one position in the input

$

$

Announce successful completion of the parsing

For a string adb, the various steps in the parsing of the string, in terms of the contents of the stack and unspent input, are shown in Table 5.
Table 5: Production Selections for Parsing Derivations for the String adb
Stack Contents

Unspent Input

Moves


$S

adb$

Derivation using S → aABb

$bBAa

adb$

Popping a off the stack and advancing one position in the input

$bBA

ab$

Calling an errorhandling routine

The heart of the tabledriven parser is the parsing tablethe parser looks at the parsing table to decide which alternative is a right choice for the expansion of a nonterminal during the parsing of the input string. Hence, constructing a tabledriven predictive parser can be considered as equivalent to constructing the parsing table.
A parsing table for any grammar can be obtained by the application of the above algorithm; but for some grammars, some of the entries in the parsing table may end up being multiple defined entries. Whereas for certain grammars, all of the entries in the parsing table are singly defined entries. If the parsing table contains multiple entries, then the parser is still nondeterministic. The parser will be a deterministic recognizer if and only if there are no multiple entries in the parsing table. All such grammars (i.e., those grammars that, after applying the algorithm above, contain no multiple entries in the parsing table) constitute a subset of CFGs called "LL(1)" grammars. Therefore, a given grammar is LL(1) if its parsing table, constructed by algorithm above, contains no multiple entries. If the table contains multiple entries, then the grammar is not LL(1).
In the acronym LL(1), the first L stands for the lefttoright scan of the input, the second L stands for the leftmost derivation, and the (1) indicates that the next input symbol is used to decide the next parsing process (i.e., length of the lookahead is "1").
In the LL(1) parsing system, parsing is done by scanning the input from left to right, and an attempt is made to derive the input string in a leftmost order. The next input symbol is used to decide what is to be done next in the parsing process. The predictive parser discussed above, therefore, is a LL(1) parser, because it also scans the input from left to right and attempts to obtain the leftmost derivation of it; and it also makes use of the next input symbol to decide what is to be done next. And if the parsing table used by the predictive parser does not contain multiple entries, then the parser acts as a recognizer of only the members of L(G); hence, the grammar is LL(1).
Therefore, LL(1) is the grammar for which an LL(1) parser can be constructed, which acts as a deterministic recognizer of L(G). If a grammar is LL(1), then a deterministic topdown tabledriven recognizer can be constructed to recognize L(G). A parsing table constructed for a given grammar G will have multiple entries if the grammar contains multiple productions that derive the same nonterminalthat is, the grammar contains the productions A → α  β, and both α and β derive to a string that starts with the same terminal symbol. Therefore, one of the basic requirements for a grammar to be considered LL(1) is when the grammar contains multiple productions that derive the same nonterminal, such as:
for every pair of productions A → α  β
FIRST(α) ∩ FIRST(β) = φ (i.e., FIRST(α) and FIRST(β) should be disjoint sets for every pair of productions A → α  β)
For a grammar to be LL(1), the satisfaction of the condition above is necessary as well sufficient if the grammar is ∈free. When the grammar is not ∈free, then the satisfaction of the above condition is necessary but not sufficient, because either FIRST(α) or FIRST(β) might contain ∈, but not both. The above condition will still be satisfied; but if FIRST(β) contains ∈, then production A → β will be added in the table on all terminals in FOLLOW(A). Hence, it also required that FIRST(α) and FOLLOW(A) contain no common symbols. Therefore, an additional condition must be satisfied in order for a grammar to be LL(1). When the grammar is not ∈free: for every pair of productions A → α  β
if FIRST(β) contains ∈, and FIRST(α) does not contain ∈, then
FIRST(α) ∩ FOLLOW(A) = φ
Therefore, for a grammar to be LL(1), the following conditions must be satisfied:
For every pair of productions
{
(1) FIRST(α) ∩ FIRST(β) = φ
and
if FIRST(β) contains ∈, and FIRST(α) does not contain ∈
then
(1) FIRST(α) ∩ FOLLOW(A) = φ
}
2 Examples
EXAMPLE 2
Test whether the grammar is LL(1) or not, and construct a predictive parsing table for it.
Since the grammar contains a pair of productions S → AaAb  BbBa, for the grammar to be LL(1), it is required that:
Hence, the grammar is LL(1).
To construct a parsing table, the FIRST and FOLLOW sets are computed, as shown below:

Using S → AaAb, we get:

Using S → BbBa, we get
Table 6: Production Selections for Example 2 Parsing Derivations
� 
a

b

$

S

S → AaAb

S → BbBa
 � 
A

A → ∈

A → ∈
 � 
B

B → ∈

B → ∈
 � 
EXAMPLE 3
Consider the following grammar, and test whether the grammar is LL(1) or not.
For a pair of productions S → 1AB  ∈:
because FOLLOW(S) = { $ } (i.e., it contains only the end marker. Similarly, for a pair of productions A → 1AC  0C:
Hence, the grammar is LL(1). Now, show that no leftrecursive grammar can be LL(1).
One of the basic requirements for a grammar to be LL(1) is: for every pair of productions A → α  β in the grammar's set of productions, FIRST(α) and FIRST(β) should be disjointed.
If a grammar is leftrecursive, then the set of productions will contain at least one pair of the form A → Aα  β; and hence, FIRST(Aα) and FIRST(β) will not be disjointed sets, because everything in the FIRST(β) will also be in the FIRST(Aα). It thereby violates the condition for LL(1) grammar. Hence, a grammar containing a pair of productions A → Aα  β (i.e., a leftrecursive grammar) cannot be LL(1).
Now, let X be a nullable nonterminal that derives to at least two terminal strings. Show that in LL(1) grammar, no production rule can have two consecutive occurrences of X on the right side of the production.
Since X is a nullable X ∈, X is also deriving to at least to two terminal stringsXw_{1} and Xw_{2}where w_{1} and w_{2} are the strings of terminals. Therefore, for a grammar using X to be LL(1), it is required that:
FIRST(w_{1}) ∩ FIRST(w_{2}) = φ
FIRST (w_{1}) ∩ FOLLOW(X) and FIRST(w_{2}) ∩ FOLLOW(X) = φ
If this grammar contains a production rule A → αXXβa production whose right side has two consecutive occurrences of Xthen everything in FIRST(X) will also be in the FOLLOW(X); and since FIRST(X) contains FIRST(w_{1}) as well as FIRST(w_{2}), the second condition will therefore not be satisfied. Hence, a grammar containing a production of the form A → αXXβ will never be LL(1), thereby proving that in LL(1) grammar, no production rule can have two consecutive occurrences of X on the right side of the production.
EXAMPLE 4
Construct a predictive parsing table for the following grammar where S is a start symbol and # is the end marker.
Here, # is taken as one of the grammar symbols. And therefore, the initial configuration of the parser will be (S, w#), where the first member of the pair is the contents of the stack and the second member is the contents of input buffer.
Therefore, by substituting in (I), we get:

Using S → S# we get:

Using S → qABC we get:
Substituting in (II) we get:

Using A → bbD we get:
Therefore, the parsing table is derived as shown in Table 7.
Table 7: Production Selections for Example 4 Parsing Derivations
� 
q

a

b

c

#

S

S → S#
 �  �  �  � 
S

S → qabc
 �  �  �  � 
A
 � 
A → a

A → bbD
 �  � 
B
 � 
B → a

B → ∈
 � 
B → ∈

C
 �  � 
C → b
 � 
C → ∈

D
 � 
D → ∈

D → ∈

D → c

D → ∈

EXAMPLE 5
Construct predictive parsing table for the following grammar:
Since the grammar is ∈free, FOLLOW sets are not required to be computed in order to enter the productions into the parsing table. Therefore the parsing table is as shown in Table 8.
Table 8: Production Selections for Example 5 Parsing Derivations
� 
a

b

f

g

d

S

S → A
 �  �  �  � 
A

A → aS
 �  � 
A → d
 � 
B
 � 
B → bBC

B → f
 �  � 
C
 �  �  � 
C → g
 � 
EXAMPLE 6
Construct a predictive parsing table for the following grammar, where S is a start symbol.

Using S → iEtSS_{1}:

Using S_{1} → eS:
Therefore, the parsing table is as shown in Table 9.
Table 9: Production Selections for Example 6 Parsing Derivations
� 
i

a

b

e

T

$

S

S → iEtSS_{1}

S → a
 �  �  �  � 
S_{1}
 �  �  � 
S1 → eS
 � 
S_{1} → ∈

S_{1}
 �  �  � 
S1 → ∈
 �  � 
E
 �  � 
E → b
 �  �  � 
EXAMPLE 7
Construct an LL(1) parsing table for the following grammar:
Computation of FIRST and FOLLOW:
Therefore by substituting in (I) we get:

Using the production S → aBDh we get:

Using the production B → cC, we get:

Using the production C → bC, we get:

Using the production D → EF, we get:
Therefore, the parsing table is as shown in Table 10.
Table 10: Production Selections for Example 7 Parsing Derivations
� 
a

b

c

g

f

h

$

S

S→aBDh
 �  �  �  �  �  � 
B
 �  � 
B → cC
 �  �  �  � 
C
 � 
C → bC
 � 
C → ∈

C → ∈

C → ∈
 � 
D
 �  �  � 
D → EF

D → EF

D → EF
 � 
E
 �  �  � 
E → g

E → ∈

E → ∈
 � 
F
 �  �  �  � 
F → f

F → ∈
 � 