Parsing¶

Bottom-Up Parsing¶

  • Starting with the input string, we build the parse tree on top of it, from the leaves up to the root
    • Performs a rightmost derivation in reverse
  • Can handle left recursion
  • At each step we need to find the handle
    • The handle is the portion of the current mix of terminals and non-terminals that can be reduced to a non-terminal
    • Reducing a handle must lead to a valid derivation
    • For the string id + id * id, the first id is the handle because we can reduce it to F, giving F + id * id
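
As a worked illustration, assuming the expression grammar listed under Parse Tables in these notes ($E \to E + T \mid T$, $T \to T * F \mid F$, $F \to id$), the rightmost derivation of id + id * id can be written out as follows. A bottom-up parser discovers these steps in reverse order, so its first reduction is the last line (the leftmost id reduced to F).

$$
\begin{aligned}
E &\Rightarrow E + T && (E \to E + T) \\
  &\Rightarrow E + T * F && (T \to T * F) \\
  &\Rightarrow E + T * id && (F \to id) \\
  &\Rightarrow E + F * id && (T \to F) \\
  &\Rightarrow E + id * id && (F \to id) \\
  &\Rightarrow T + id * id && (E \to T) \\
  &\Rightarrow F + id * id && (T \to F) \\
  &\Rightarrow id + id * id && (F \to id)
\end{aligned}
$$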

Sentential Form and Phrases¶

  • A sentential form is any line of a derivation, that is, a string of terminals and non-terminals derivable from the start symbol
  • A phrase is any sequence of adjacent symbols in a sentential form that will eventually be reduced to a single non-terminal
    • All the leaves below one interior node of the parse tree
  • A simple phrase is a phrase that can be reduced to one symbol in one step, that is, a subtree with a depth of 1
  • A handle is the leftmost simple phrase of a right sentential form
    • We are pruning the parse tree as we go up it
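
The definitions above can be sketched in code. The following is a minimal illustration, not part of the original notes: the tuple-based tree encoding and the function names `frontier`, `phrases`, and `handle` are assumptions made for this example, and the tree is the parse tree of the right sentential form F + id * id under the expression grammar used later in these notes.

```python
# An interior node is (non_terminal, [children]); a leaf is a plain string.
# Leaves may be terminals or non-terminals, since this is a sentential form.
tree = ("E", [
    ("E", [("T", ["F"])]),
    "+",
    ("T", [("T", [("F", ["id"])]), "*", ("F", ["id"])]),
])


def frontier(node):
    """Return the leaves below a node, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    leaves = []
    for child in children:
        leaves.extend(frontier(child))
    return leaves


def phrases(node, start=0):
    """List (start, non_terminal, symbols, is_simple) for every interior node."""
    if isinstance(node, str):
        return []
    lhs, children = node
    # A phrase is simple when its subtree has depth 1 (all children are leaves).
    simple = all(isinstance(child, str) for child in children)
    found = [(start, lhs, frontier(node), simple)]
    offset = start
    for child in children:
        found += phrases(child, offset)
        offset += len(frontier(child))
    return found


def handle(tree):
    """Return the leftmost simple phrase."""
    simple = [p for p in phrases(tree) if p[3]]
    return min(simple, key=lambda p: p[0])


print(frontier(tree))  # the sentential form itself
print(handle(tree))    # the phrase to reduce next
```

Here the handle is the leading F, reduced by $T \to F$, which matches the next step of the reverse rightmost derivation (F + id * id becomes T + id * id).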

Phrases and Handles Practice¶

  • Draw the parse tree and find the phrases and handles for the following right sentential forms given the grammar
  • S $\to$ AbB | bAc
  • A $\to$ Ab | aBB
  • B $\to$ Ac | cBb | c

  • Exercises

    • aAcccbbc (as class)
    • AbcaBccb

Shift-Reduce¶

  • The general algorithm used for bottom-up parsing
  • Uses the LR parsing strategy
    • Scans the input from left to right
    • Produces a rightmost derivation in reverse
  • Implemented using a parsing table and a stack
  • Shift pushes the next token onto the stack, while reduce uses a rule of the grammar to replace the handle on top of the stack with the rule's left-hand side
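
To make the two operations concrete, here is a small sketch that only replays a given sequence of shift and reduce actions; it does not decide which action to take (that is the job of the parsing table). It assumes the expression grammar from the Parse Tables section, with rule 5 read as $F \to (\, E \,)$, and the names `RULES`, `shift`, `reduce`, and `replay` are hypothetical helpers introduced for illustration.

```python
# Rules of the expression grammar: number -> (left-hand side, right-hand side)
RULES = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}


def shift(stack, tokens):
    """Move the next input token onto the stack."""
    stack.append(tokens.pop(0))


def reduce(stack, rule_number):
    """Replace a right-hand side on top of the stack with its left-hand side."""
    lhs, rhs = RULES[rule_number]
    if stack[-len(rhs):] != rhs:
        raise ValueError(f"rule {rule_number} does not match the top of the stack")
    del stack[-len(rhs):]
    stack.append(lhs)


def replay(tokens, actions):
    """Apply a list of actions: the string "shift" or a rule number to reduce by."""
    stack, tokens = [], list(tokens)
    for action in actions:
        if action == "shift":
            shift(stack, tokens)
        else:
            reduce(stack, action)
    return stack, tokens


# Actions for parsing id + id, chosen by hand for this example.
stack, rest = replay(["id", "+", "id"],
                     ["shift", 6, 4, 2, "shift", "shift", 6, 4, 1])
print(stack, rest)
```

The parse succeeds when the input is exhausted and the stack holds only the start symbol E.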

Parse Tables¶

For the grammar:

  1. $E \to E \, + \, T$
  2. $E \to T$
  3. $T \to T \, * \, F$
  4. $T \to F$
  5. $F \to (\, E \,)$
  6. $F \to id$

The parsing table is:

Shift-Reduce Algorithm¶

  • Initialize the stack with state 0
  • While not accept or error
    • Given state and next symbol, find appropriate action in table
    • If action is shift, push that symbol and the new state onto the stack
    • If action is reduce:
      • Pop the handle (one symbol and one state for each symbol on the right-hand side of the rule indicated in the table)
      • Using the state now on top of the stack and the rule's left-hand non-terminal, look up the new state in the goto table
      • Push the non-terminal and the state from goto onto the stack
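
The steps above can be sketched as a short table-driven parser. This is a minimal illustration under stated assumptions: the `ACTION` and `GOTO` entries below are the commonly published SLR(1) table for this expression grammar (with rule 5 read as $F \to (\, E \,)$), not necessarily the table used in the lecture, so state numbers may differ. The names `RULES`, `ACTION`, `GOTO`, and `parse` are chosen for this example.

```python
# Rule number -> (left-hand side, number of symbols on the right-hand side)
RULES = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
         4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

# "sN" = shift and go to state N, "rN" = reduce by rule N, "acc" = accept.
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}

GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}


def parse(tokens):
    """Return the rule numbers in the order they are reduced, or raise SyntaxError."""
    stack = [0]                       # state, symbol, state, symbol, ...
    tokens = list(tokens) + ["$"]     # "$" marks the end of the input
    pos = 0
    reductions = []
    while True:
        state = stack[-1]
        action = ACTION[state].get(tokens[pos])
        if action is None:            # empty table entry means error
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {state}")
        if action == "acc":
            return reductions
        number = int(action[1:])
        if action[0] == "s":
            # Shift: push the symbol and the new state.
            stack += [tokens[pos], number]
            pos += 1
        else:
            # Reduce: pop one symbol and one state per right-hand-side symbol,
            # then push the left-hand side and the state given by goto.
            lhs, rhs_len = RULES[number]
            del stack[-2 * rhs_len:]
            stack += [lhs, GOTO[stack[-1]][lhs]]
            reductions.append(number)


print(parse(["id", "*", "id", "+", "id"]))
```

Read backwards, the returned list of rule numbers is a rightmost derivation of the input, which is the sense in which the parser performs that derivation in reverse.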

The Stack¶

  • In shift-reduce parsing, the stack is always of the form

$S_n$, $SYMBOL$, $S_m$, $SYMBOL$, $S_o$, ..... $S_x$¶

  • Where $S_n$ is a state and SYMBOL is a terminal or non-terminal from the grammar
  • This is sometimes written as

$n$, $SYMBOL$, $m$, $SYMBOL$, $o$, ..... $x$¶

Shift-Reduce Algorithm Practice¶

Parse

Stack | Input | Action
0 | id * id + id $ |

The parsing table is:

Shift-Reduce Practice¶

Show the parse including the stack for id * ( id + id)

Grammar:

  1. $E \to E \, + \, T$
  2. $E \to T$
  3. $T \to T \, * \, F$
  4. $T \to F$
  5. $F \to (\, E \,) $
  6. $F \to id$

The parsing table is:

Parse:

Stack | Input | Action

Yacc¶

  • Yacc stands for Yet Another Compiler-Compiler; it generates the tables needed to perform shift-reduce parsing
  • The original Yacc was proprietary, but there is a free, largely compatible version known as Bison
  • Takes a modified BNF grammar of the form

    LHS:
          RHS
        | RHS
    
  • The setup is a bit more involved than for lex and flex
