How do I get the lookahead symbol when constructing the LR(1) NFA for a parser?

Posted by greenoldman on Programmers
Published on 2012-11-29T18:20:12Z Indexed on 2012/11/29 23:18 UTC
Filed under: parser

I am reading an explanation (the awesome "Parsing Techniques" by D. Grune and C.J.H. Jacobs; p. 292 in the 2nd edition) of how to construct an LR(1) parser, and I am at the stage of building the initial NFA. What I don't understand is how to compute the lookahead symbol.

Here is the example from the book, the grammar:

S -> E
E -> E - T
E -> T
T -> ( E )
T -> n

n is a terminal. The transitions that look "weird" to me are in this sequence:

1)   S -> . E        eof
2)   E -> . E - T    eof
3)   E -> . E - T    -
4)   E -> E . - T    -
5)   E -> E - . T    -

(Note: In the above table, the state numbers are in front and the lookahead symbol is at the end.)
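
For reference, the lookaheads in states (1) to (5) can be reproduced with the usual textbook rules for LR(1) items. The sketch below is a minimal Python illustration, not the book's own code: the names GRAMMAR, first_sets, first_of, epsilon_successors and advance are made up here, and it assumes the grammar has no empty productions (true for this grammar), so FIRST of a symbol string depends only on its first symbol. An epsilon move from [A -> α . B β, a] leads to [B -> . γ, b] for each b in FIRST(β a); moving the dot over a symbol leaves the lookahead unchanged.

```python
# Grammar from the question as (left-hand side, right-hand side) pairs.
GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "-", "T")),
    ("E", ("T",)),
    ("T", ("(", "E", ")")),
    ("T", ("n",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets by fixed-point iteration (assumes no empty productions)."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


FIRST = first_sets()


def first_of(symbols):
    """FIRST of a non-empty symbol string; only its head matters here."""
    head = symbols[0]
    return FIRST[head] if head in NONTERMINALS else {head}


def epsilon_successors(item):
    """Items reached by an epsilon move: [B -> . gamma, b], b in FIRST(beta a)."""
    lhs, rhs, dot, lookahead = item
    if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
        return set()
    rest = rhs[dot + 1:] + (lookahead,)  # beta followed by the lookahead a
    return {
        (l, r, 0, b)
        for l, r in GRAMMAR
        if l == rhs[dot]
        for b in first_of(rest)
    }


def advance(item):
    """Move the dot over one symbol; the lookahead is carried along unchanged."""
    lhs, rhs, dot, lookahead = item
    return (lhs, rhs, dot + 1, lookahead)


# Items are (lhs, rhs, dot position, lookahead).
item1 = ("S", ("E",), 0, "eof")            # 1) S -> . E      eof
item2 = ("E", ("E", "-", "T"), 0, "eof")   # 2) E -> . E - T  eof
item3 = ("E", ("E", "-", "T"), 0, "-")     # 3) E -> . E - T  -
item4 = advance(item3)                     # 4) E -> E . - T  -
item5 = advance(item4)                     # 5) E -> E - . T  -
```

Under these rules, state (3) gets the lookahead - because - is the first symbol of "- T eof" (the rest of item (2) followed by its own lookahead), and the lookahead stays - in (4) and (5) because moving the dot does not touch it.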

What puzzles me is that the transition from (4) to (5) means reading the - token, right? So how is it that - is still the lookahead symbol and, more importantly, why is eof no longer a lookahead symbol? After all, in an input such as n - n eof there is only one - symbol.

My naive thinking tells me (5) should be written as:

5)   E -> E - . T    - eof

Another thing: n is a terminal. Why is it never used as a lookahead symbol? We expect to see - or (, which is fine, but does the absence of n mean we are sure it won't appear in the input?
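
One way to check which terminals ever occur as a lookahead is to enumerate every item reachable in the NFA. The following is a hedged sketch under the same textbook rules (epsilon moves use FIRST of the remainder plus the current lookahead; dot moves keep the lookahead); reachable_items and the other names are illustrative, and the FIRST computation again assumes no empty productions.

```python
from collections import deque

GRAMMAR = [
    ("S", ("E",)),
    ("E", ("E", "-", "T")),
    ("E", ("T",)),
    ("T", ("(", "E", ")")),
    ("T", ("n",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets():
    """FIRST sets by fixed-point iteration (assumes no empty productions)."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def reachable_items(start):
    """Breadth-first search over LR(1) items (lhs, rhs, dot, lookahead)."""
    first = first_sets()
    seen = {start}
    queue = deque([start])
    while queue:
        lhs, rhs, dot, lookahead = queue.popleft()
        if dot == len(rhs):
            continue  # dot at the end: no outgoing transitions
        # Transition on the symbol after the dot keeps the lookahead.
        successors = [(lhs, rhs, dot + 1, lookahead)]
        if rhs[dot] in NONTERMINALS:
            # Epsilon moves: new lookaheads come from FIRST(rest + lookahead).
            head = (rhs[dot + 1:] + (lookahead,))[0]
            new_lookaheads = first[head] if head in NONTERMINALS else {head}
            for l, r in GRAMMAR:
                if l == rhs[dot]:
                    successors.extend((l, r, 0, b) for b in new_lookaheads)
        for item in successors:
            if item not in seen:
                seen.add(item)
                queue.append(item)
    return seen


items = reachable_items(("S", ("E",), 0, "eof"))
lookaheads = {item[3] for item in items}
print(sorted(lookaheads))  # prints [')', '-', 'eof']
```

In this sketch the lookahead of an item is a terminal that may follow the complete right-hand side, not a terminal expected next in the input. In this grammar nothing derived from E or T can be followed directly by n or (, which is why neither shows up in the result.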

Update: after more reading I am only more confused ;-) What really is a lookahead? I ask because I see a state such as this one (p. 292, 2nd column, 2nd row):

E -> E . - T      eof

The lookahead says eof but the incoming input says -. Isn't that a contradiction? And it is not only in this book.
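
A small sketch of how such an item is usually read may help frame the question. It reflects the common textbook interpretation of LR(1) items rather than the exact wording on p. 292, and the function name action is hypothetical: while the dot is inside the rule, the next input token is compared with the symbol after the dot; the lookahead is consulted only once the dot has reached the end of the rule.

```python
def action(item, next_token):
    """Return "shift", "reduce" or None for one LR(1) item and the next token.

    item is (lhs, rhs, dot position, lookahead).
    """
    lhs, rhs, dot, lookahead = item
    if dot < len(rhs):
        # Dot inside the rule: only the symbol after the dot matters.
        # (A nonterminal after the dot never equals a token, so this gives None.)
        return "shift" if rhs[dot] == next_token else None
    # Dot at the end: the rule may be reduced only if the lookahead matches.
    return "reduce" if next_token == lookahead else None


in_progress = ("E", ("E", "-", "T"), 1, "eof")  # E -> E . - T    eof
finished = ("E", ("E", "-", "T"), 3, "eof")     # E -> E - T .    eof

print(action(in_progress, "-"))   # prints shift
print(action(finished, "eof"))    # prints reduce
print(action(finished, "-"))      # prints None
```

Read this way, the item E -> E . - T with lookahead eof says that - is expected next, and that eof is expected only after the whole E - T has been recognized.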
