<- previous index next ->
Use the "simplification" steps to get to a Chomsky Normal Form.
Use the algorithm on P140, Figure 6.8
Given a CFG grammar G in Chomsky Normal Form and a string x of length n
Group the productions of G into two sets
{ A | A -> a } target is a terminal
{ A | A -> BC } target is exactly two variables
V is a two dimensional matrix. Each element of the matrix is a set.
The set may be empty, denoted phi, or the set may contain one or
more variables from the grammar G. V can be n by n yet only part is used.
x[i] represents the i th character of the input string x
Parse x using G's productions
for i in 1 .. n
V[i,1] = { A | A -> x[i] }
for j in 2..n
for i in 1 .. n-j+1
{
V[i,j] = phi
for k in 1 .. j-1
V[i,j] = V[i,j] union { A | A -> BC where B in V[i,k]
and C in V[i+k,j-k]}
}
if S in V[1,n] then x is in CFL defined by G.
In order to build a derivation tree, a parse tree, you need to extend
the CYK algorithm to record
(variable, production number, from a index, from B index, from C index)
in V[i,j]. V[i,j] is now a set of five tuples.
Then find one of the (S, production number, from a, from B, from C)
entries in V[1,n] and build the derivation tree starting at the root.
Notes: The parse is ambiguous if there is more than one (S,...) in V[1,n]
Multiple levels of the tree may be built while working back V[*,k] to
V[*,k-1] and there may be more than one choice at any level if the
parse is ambiguous.
Example: given a string x = baaba
given grammar productions
A -> a S -> AB
B -> b S -> BC
C -> a A -> BA
B -> CC
C -> AB
V[i,j] i
1(b) 2(a) 3(a) 4(b) 5(a)
1 B A,C A,C B A,C
2 S,A B S,C S,A
j
3 phi B B
4 phi S,A,C
5 S,A,C
^
|_ accept
Derivation tree S
B C
b A B
a C C
A B a
a b
<- previous index next ->