yacc

NAME

yacc - an LALR(1) parser generator

SYNOPSIS

yacc [-dlrtv] [-b file_prefix] [-p symbol_prefix] filename

DESCRIPTION

Yacc(1) reads the grammar specification in the file filename and generates an LALR(1) parser for it. The parser consists of a set of LALR(1) parsing tables and a driver routine written in the C programming language. Yacc(1) normally writes the parse tables and the driver routine to the file y.tab.c.

The following options are available:

-b file_prefix
Use file_prefix as the prefix prepended to the output file names instead of the character y. For example, y.tab.c becomes file_prefix.tab.c.
-d
Write the header file y.tab.h.
-l
Do not insert #line directives in the generated code. The #line directives let the C compiler relate errors in the generated code to the user's original code. Any #line directives specified by the user will be retained. By default, the directives are inserted.
-p symbol_prefix
Use symbol_prefix as the prefix prepended to yacc(1)-generated symbols. The default prefix is the string yy.
-r
Produce separate files for code and tables. The code file is named y.code.c, and the tables file is named y.tab.c.
-t
Change the preprocessor directives generated by yacc(1) so that debugging statements will be incorporated in the compiled code.
-v
Write a human-readable description of the generated parser to the file y.output.

FORMAT OF THE INPUT FILE

The yacc(1) input file consists of three sections, separated by a line with just %% in it:

definitions
%%
rules
%%
user code

The definitions section contains declarations; it can also contain C comments (delimited by /* and */) and a literal block of C code, which is copied to the beginning of the generated file. This literal block usually contains declarations and #include lines. The following keywords can be used in the definitions section:

%{...%}
Delimits a literal block of C code. The literal block usually contains declarations of variables and functions used by the code in the rules section.
%left operator

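Declares an operator as left-associative. Operators must be declared in increasing order of precedence: operators named on the same line share one precedence level, and each later %left, %right, or %nonassoc line has higher precedence than the lines before it.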
%nonassoc operator

Declares an operator non-associative. Operators must be declared in increasing order of precedence.
%right operator

Declares an operator right-associative. Operators must be declared in increasing order of precedence.
%start rulename

Declares the start rule, the one the parser tries to match against the whole input. By default, this is the first rule in the rules section; this declaration explicitly names a different rule.
%token name ...

Defines one or more symbolic tokens, that is, terminal symbols (symbols that the parser will not attempt to reduce). Tokens are represented internally by integer values; you can assign a value directly to a token in the %token directive, but this is not recommended. Other tokens are individual characters in single quotes or names introduced by %left, %right, and %nonassoc.
%type <type> name,name,...

Declares the value type of one or more non-terminal symbols. The type is written in angle brackets and must name a field declared in the %union.
%union

Identifies all possible C types a symbol value can have; this union is declared as type YYSTYPE in the generated source file. The format is:
%union {
	 ...field declarations
}
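
As an illustration of what these declarations become in C, the sketch below shows roughly what the generated code could contain for a hypothetical grammar that declares %union { int ival; const char *sval; } and %token NUMBER NAME. The token codes 257 and 258 and the helpers make_number() and make_name() are assumptions made for this example; yacc(1) chooses the real codes, and the real yylval is defined in the generated parser.

```c
/* Assumed token codes; yacc(1) assigns the real values. */
#define NUMBER 257
#define NAME   258

/* The %union fields become the members of YYSTYPE. */
typedef union {
    int ival;
    const char *sval;
} YYSTYPE;

/* Normally defined in the generated parser; defined here so the
 * sketch stands alone. */
YYSTYPE yylval;

/* Hypothetical helpers: a lexer stores the token's value in the
 * matching union member, then returns the token code. */
int make_number(int v)
{
    yylval.ival = v;
    return NUMBER;
}

int make_name(const char *s)
{
    yylval.sval = s;
    return NAME;
}
```

The member written by the lexer should be the one that the grammar associates with the token or non-terminal, since a union holds only one of its members at a time.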

The rules section is the heart of the grammar. Each rule starts with a non-terminal symbol and a colon, followed by a list of symbols, literal tokens, and actions. The list can be empty. For example, this rule states that a time is an hour, a minute, and a second value, joined by colons:

time: hour ':' minute ':' second ;
(This assumes we've already defined hour, minute, and second.) The semicolon at the end is traditional, but optional. If consecutive rules have the same left-hand side, rules after the first can start with a vertical bar, rather than the name and a colon. This is somewhat easier to read but the semicolon must be omitted before the vertical bar.

An action in a rule is a C compound statement to be executed whenever the parser reaches that point in the grammar. In the action, the name $$ stands for the value of the symbol on the left-hand side, and a dollar sign followed by a number n stands for the value of the nth symbol on the right-hand side. Adding an action to the example above:

time: hour ':' minute ':' second
	 { printf("time is %d:%d:%d\n", $1, $3, $5); }
	 ;

If no action is given for a rule, the default action is:

{ $$ = $1; }
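
One way to picture $$ and $n is as slots on a stack of values that the parser keeps next to its state stack. The following is a simplified model in C, not the code that yacc(1) generates; the names vstack, vtop, dollar(), and finish_reduce(), and the action that computes a number of seconds, are all hypothetical.

```c
typedef int YYSTYPE;        /* default value type when no %union is given */

YYSTYPE vstack[64];         /* hypothetical value stack */
int vtop = -1;              /* index of the topmost value */

void push_value(YYSTYPE v)
{
    vstack[++vtop] = v;
}

/* $n inside a rule whose right-hand side has len symbols. */
YYSTYPE dollar(int n, int len)
{
    return vstack[vtop - len + n];
}

/* Finish a reduction: pop the len right-hand-side values, push $$. */
YYSTYPE finish_reduce(int len, YYSTYPE result)
{
    vtop -= len;
    push_value(result);
    return result;
}

/* time: hour ':' minute ':' second
 *           { $$ = $1 * 3600 + $3 * 60 + $5; }      (hypothetical action) */
YYSTYPE reduce_time(void)
{
    const int len = 5;
    YYSTYPE result = dollar(1, len) * 3600 + dollar(3, len) * 60 + dollar(5, len);
    return finish_reduce(len, result);
}

/* A rule with no action: the default { $$ = $1; } applies. */
YYSTYPE reduce_default(int len)
{
    return finish_reduce(len, dollar(1, len));
}
```

In this model the ':' tokens also occupy value slots ($2 and $4), which is why the second value is $5 rather than $3.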

The user code section (also called the user subroutines section) contains routines called from the actions. It is copied directly into the C file.
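
By convention, this section is also where the lexical analyzer yylex() is placed; the generated parser calls it each time it needs the next token. The following is a minimal sketch of such a routine for the time example, assuming a token named NUMBER with code 257, the default int value type, and input taken from a string through the hypothetical helper set_input() rather than from standard input.

```c
#include <ctype.h>

#define NUMBER 257              /* assumed code; yacc(1) assigns the real one */

typedef int YYSTYPE;            /* default value type when no %union is given */
YYSTYPE yylval;                 /* value of the token just returned */

const char *lex_input = "";     /* hypothetical input buffer */

void set_input(const char *s)
{
    lex_input = s;
}

int yylex(void)
{
    /* Skip blanks between tokens. */
    while (*lex_input == ' ' || *lex_input == '\t')
        lex_input++;

    /* A return value of 0 tells the parser that the input has ended. */
    if (*lex_input == '\0')
        return 0;

    /* A run of digits is one NUMBER token; its value goes in yylval. */
    if (isdigit((unsigned char)*lex_input)) {
        int v = 0;
        while (isdigit((unsigned char)*lex_input))
            v = v * 10 + (*lex_input++ - '0');
        yylval = v;
        return NUMBER;
    }

    /* Any other character is returned as itself, matching a literal
     * token such as ':' in the grammar. */
    return (unsigned char)*lex_input++;
}
```

With the input "12:34", successive calls return NUMBER (with yylval set to 12), ':', NUMBER (with yylval set to 34), and then 0.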

ENVIRONMENT VARIABLES

Yacc(1) makes use of the following environment variable:
TMPDIR
If set, the contents of this variable will be used as the name of the directory where the temporary files are created.

FILES

The yacc(1) utility makes use of the following files:
y.code.c
Code file created by yacc(1) when the -r option is given.
y.tab.c
Parser file created by yacc(1). When the -r option is given, it contains only the tables.
y.tab.h
Header file created by yacc(1) when the -d option is given.
y.output
A human-readable version of the output, created with the -v option.
/tmp/yacc.aXXXXXX
/tmp/yacc.tXXXXXX
/tmp/yacc.uXXXXXX
Temporary files.

DIAGNOSTICS

If there are rules that are never reduced, the number of such rules is reported on standard error. If there are any LALR(1) conflicts, the number of conflicts is reported on standard error.