LR parser
An LR parser reads the characters of a text from beginning to end, i.e. left to right (LR), without ever backtracking. During this process, it groups the characters together to tokens using a lexer function. The tokens themselves are then grouped together to subtrees, which are stored on a stack. For every new token it decides what to do with the stack by consulting the parse table.
Example for LR-parsing a simple text
Parser starts out at beginning of string. Stack is empty.
Text: x * y + z Pos: ^ Stack:
Parser sees first character and recognizes it as a variable. Pushes [var] on the stack.
Text: x * y + z Pos: ^ Stack: [var]
Same with the next 2 tokens.
Text: x * y + z Pos: ^ Stack: [var][*][var]
Parser table for
+
specifies a reduction. This means the last 3 tokens get popped off the stack and get grouped together as a new parent subtree on the stack. Then the+
and the variable z can be pushed on.Text: x * y + z Pos: ^ Stack: [(product var * var)][+][var]
Reaching the end of the text means another reduction. This creates the root node of the final syntax tree.
Text: x * y + z Pos: ^ Stack: [(sum (product var * var) + var)]
Problems with LR parsing
Take the Javascript code x = (y);
. Here x is assigned to y. On the other hand,
take x = (y) => z;
. Here x is assigned to a function that takes y as a
parameter.
Without looking ahead or guessing and then checking, an LR parser cannot know which one it's gonna be. One way to solve this problem is to use generalized LR parsing - or GLR parsing.