Parsing is the process where we take a particular sequence of words
and check to see if they correspond to the language defined by some
Basically what we have to do is to show how we can get:
|from||the start symbol of the grammar|
|to||the sequence of words that supposedly belong to the language|
|by||using the production rules|
We can do this by constructing the appropriate parse tree for the sequence of words, which is simply a graphical way of listing the production rules that we have used.
Considering the following simple grammar for arithmetic expressions:
Here there are just two non-terminals, and (where is the start symbol); the terminal
symbols are: +, -, *, , (, ), num.
Suppose the lexical analysis phase has given us the sequence of tokens which correspond to:
(num + num) num. We can parse this as follows:
We say that derives (num + num) num, and write:
We can represent the above derivation graphically by means of a parse tree.
The root of the tree is the start symbol , and its leaves are the terminal
symbols in the sentence which has been derived. Each internal node is
labelled with a non-terminal, and the children are the symbols obtained
by applying one of the production rules.
There are two ways of constructing a parse tree whose root is the start symbol and whose leaves are the sentence:
Both of these methods will be discussed in detail later.