This page introduces my implementations of two classic parser-generation tools: lex and yacc.

ylex

ylex is my implementation of the well-known lexical analyzer generator lex. If you're not familiar with lex, you can find a brief introduction here (lex/flex).

How it works

Differences

ylex implements most features of lex; however, there are some differences:

How to use

Currently, ylex runs only on Windows. It consists of an executable file, ylex.exe, and two template files, LEX_H_TEMPLATE and LEX_CPP_TEMPLATE, which serve as templates for the generated lexical analyzer.

To run ylex, type "ylex filename" in a command window. Two files are generated: _lex.h and _lex.cpp. An example is included at the bottom of this page.


yjacc

yjacc, or Yet Just Another Compiler Compiler (sorry, all good names are already taken), is my implementation of the well-known parser generator yacc. If you're not familiar with yacc, you can find a brief introduction here (yacc/bison).

How it works

yjacc generates a parser from a specification of a language's structure. It usually works together with ylex, because a parser takes a sequence of tokens as input rather than individual characters, and ylex feeds those tokens to yjacc through the function yylex(). yjacc uses the LALR(1) parsing technique, so it generates an LALR(1) table-driven parser. Below is a brief summary of how it works:
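To illustrate the token hand-off through yylex(), here is a minimal, hand-written sketch. It is not the code that ylex or yjacc generates: the token codes, the set_input() helper, the int-typed yylval and the parse_sum() loop are all hypothetical stand-ins, and the loop replaces the LALR(1) table-driven parser with a trivial consumer that accepts NUMBER ('+' NUMBER)*. The only point is that the parser side never sees characters, only the token codes that yylex() returns.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical token codes; a generated project would take these from _tokens.h.
enum Token { END = 0, NUMBER = 258, PLUS = 259 };

// Hypothetical lexer state; a generated lexer manages its own input buffer.
static std::string g_input;
static std::size_t g_pos = 0;
static int yylval = 0;  // semantic value of the most recent NUMBER (assumed to be int here)

// Hypothetical helper: point the lexer at a new input string.
void set_input(const std::string& text) {
    g_input = text;
    g_pos = 0;
}

// Lexer side: returns the next token code, or END (0) when the input is exhausted.
int yylex() {
    assert(g_pos <= g_input.size());
    // Skip whitespace between tokens.
    while (g_pos < g_input.size() &&
           std::isspace(static_cast<unsigned char>(g_input[g_pos]))) {
        ++g_pos;
    }
    if (g_pos >= g_input.size()) return END;

    unsigned char c = static_cast<unsigned char>(g_input[g_pos]);
    if (std::isdigit(c)) {
        int value = 0;
        while (g_pos < g_input.size() &&
               std::isdigit(static_cast<unsigned char>(g_input[g_pos]))) {
            value = value * 10 + (g_input[g_pos] - '0');
            ++g_pos;
        }
        yylval = value;
        return NUMBER;
    }
    ++g_pos;
    // Any other character is returned as its own code (a common yacc convention).
    return (c == '+') ? static_cast<int>(PLUS) : static_cast<int>(c);
}

// Parser side (stand-in for the table-driven parser): pulls tokens one at a time
// from yylex() and sums them. Returns false on a syntax error.
bool parse_sum(const std::string& text, int& result) {
    set_input(text);
    int tok = yylex();
    if (tok != NUMBER) return false;
    result = yylval;
    tok = yylex();
    while (tok == PLUS) {
        tok = yylex();
        if (tok != NUMBER) return false;
        result += yylval;
        tok = yylex();
    }
    return tok == END;
}
```

Calling parse_sum("1 + 2 + 39", r) succeeds and leaves the sum in r, while an input such as "1 +" is rejected because yylex() returns END where a NUMBER is expected.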

Differences

Again, here are the differences that you should know:

How to use

Currently, yjacc runs only on Windows. It consists of an executable file, yjacc.exe; two template files, PARSER_H_TEMPLATE and PARSER_CPP_TEMPLATE, which serve as templates for the generated parser; and Limits.h, which defines important macros for the parser.

To run yjacc, type "yjacc filename" in a command window. Three or four files are generated: _parser.h, _parser.cpp, _tokens.h (which declares all the terminal symbols) and, if necessary, _yaccmain.cpp. An example is included in the Example section at the bottom of this page.


Source Code

The source code for ylex and yjacc is available on my GitHub page. Another project, lygen, is used to generate the ylex and yjacc programs that parse .l and .y files; you will see what I mean when you read the code. The code was written three years ago, and I don't actually recommend using it, but you may find it helpful if you are going to implement such a tool yourself.


Example

This example is taken from the book Lex & yacc by John R. Levine et al. It defines the lexical and grammar rules for a simple calculator that can declare variables and perform arithmetic operations on variables and numbers.

Input

lex file, grammar file

Output

After running ylex ch3-04.l and yjacc ch3-04.y, several files are generated:

_lex.h, _lex.cpp, _tokens.h, _parser.h, _parser.cpp, _yaccmain.cpp

Execution

Compile all of the above files, together with YYSTYPE.h and Limits.h, with a C++ compiler. Here is a snapshot of the execution result.


Last updated 8/17/2014