% Copyright 2012-2024, Alexander Shibakov
% Copyright 2002-2014 Free Software Foundation, Inc.
% This file is part of SPLinT
%
% SPLinT is free software: you can redistribute it and/or modify
% it under the terms of the GNU General Public License as published by
% the Free Software Foundation, either version 3 of the License, or
% (at your option) any later version.
%
% SPLinT is distributed in the hope that it will be useful,
% but WITHOUT ANY WARRANTY; without even the implied warranty of
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
% GNU General Public License for more details.
%
% You should have received a copy of the GNU General Public License
% along with SPLinT. If not, see .
\ifbootstrapmode % this is a bootstrap run to extract the states
\message{bootstrapping \jobname.tex ...}%
\input limbo.sty
\def\optimization{5}
\input yy.sty
\modebootstrap
\fi
@**The \eatone{bison}\bison\ parser stack.
{%
\newdimen\halfhsize
\newdimen\preskip
\halfhsize=\hsize
\divide\halfhsize by2
\def\mypar{%
\parshape 6
0pt \hsize
0pt \hsize
0pt \hsize
0pt \hsize
0pt \hsize
\halfhsize \halfhsize
}%
The input language for \bison\ loosely follows the {\sc BNF} notation, with
a few enhancements, such as the syntax for {\em actions}, to implement
the syntax-directed translation@^syntax-directed translation@>, as
well as various declarations for tokens, nonterminals, etc.
On the one hand, the language is relatively easy to handle and is
nearly whitespace agnostic; on the other, a primitive parser is
required for some basic setup even at a very early stage, so the
design must be carefully thought out. This {\em
bootstrapping\/}@^bootstrapping@> step is discussed in more detail
further down.
The path chosen here is by no means optimal. What it lacks in
efficiency, though, it may amply gain in practicality, as we reuse the
original grammar used by \bison\ to produce the parser(s) for both
pretty printing and bootstrapping. Some minor subtleties arising from
this approach are explained in later sections.
As was described in the
\ifbootstrapmode\else\locallink{parser.stacks} discussion of parser
stacks \endlink\fi@^parser stack@> above, to pretty print a variety
of grammar fragments, one may employ
a {\em parser stack\/} derived from the original grammar. The most
common unit of a \bison\ grammar is a set of
productions. It is thus natural to begin our discussion of the parsers
in the \bison\ stack with the parser responsible for processing
individual rules.
One should note that the productions below are not directly concerned with the
typesetting of the grammar. Instead, this task is delegated to the
macros in \.{yyunion.sty} and its companions. The first pass of the
parser merely constructs an `executable abstract syntax tree' (or
\EAST\footnote{One may argue that \EAST\ is still merely a syntactic
construct requiring a proper macro framework for its execution and
should be called a `weak executable syntax tree' or \WEST. This
acronym extravaganza is heading south so we shall stop here.}) which
can serve very diverse purposes: from collecting token declarations in
the bootstrapping pass to typesetting the grammar rules. This allows for a
great deal of flexibility in where and when the parsing results are
used. A clear divide between the parsing step and the typesetting step
provides for better debugging facilities, as well as more reliable
macro design.
It would be impossible to completely avoid the question of the visual
presentation of the \bison\ input, however. It has already been
pointed out that the syntax adopted by \bison\ is nearly insensitive
to whitespace. This makes {\em writing\/} \bison\ grammars easier. On
the other hand, {\em presenting\/} a grammar is best done using a
variety of typographic devices that take advantage of the meaningful
positioning of text on the page: skips, indents, etc. Therefore, the
macros for \bison\ pretty printing trade a number of \bison\ syntax
elements (such as \.{\yl}, \.{;}, action braces, etc.) for the careful
placement of each fragment of the input on the page. The syntax tree
generated by the parsers in the \bison\ stack is not fully {\em faithful\/} in
that it does not preserve every syntactic element from the original input.
Thus, e.g.\ optional semicolons (\prodstyle{semi.opt}) never find their way into
the tree and their original position is lost\footnote{The opposite is true about
the {\em whitespace\/} the parser sees (or {\em stash\/} as it is called
in this document): all of it is carefully packaged into streams, as was described
\locallink{parser.streams}earlier\endlink.}.
Let's take a short break for a broad overview of the input file.
The basic structure is that of an ordinary \bison\ file that produces
plain \Cee\ output. The \Cee\ actions, however, are programmed to output \TeX.
The \bison\ sections (separated by \.{\%\%}, which is pretty printed
as \prodstyle{\%\%} below) appear between the successive dotted lines.
A number of sections are empty, since the generated \Cee\ is rather trivial.
}%
%\checktabletrue
@(bg.yy@>=
@G Switch to generic mode.
%{@> @ @=%}
@> @ @=
%union {@> @ @=}
%{@> @ @=%}
@> @ @=
%%
@> @ @=
@> @ @=
@> @ @=
%%
@g
@*1 Bootstrapping.
%\checktablefalse
The bootstrap\namedspot{bootstrapping}@^bootstrapping@> parser is
defined next. The purpose of the bootstrapping parser is to
collect a minimal amount of information to `spool up' the `production'
parsers. To understand its inner workings and the reasons behind it,
consider what happens following a declaration such as \.{\%token TOKEN "token"}
(or, as it would be typeset by the macros in this package
`\prodstyle{\%token} \.{TOKEN} \.{token}'; see the index entries for
more details)%
\idxinline{TOKEN}\idxinline{token}.
The two names for the same token are treated very differently. \.{TOKEN} becomes
an |enum| constant in the \Cee\ parser generated by \bison. Even when
that parser becomes part of the `driver' program that outputs the \TeX\
version of the parser tables, there is no easy way to output the {\it
names\/} of the appropriate |enum| constants. The other name
(\.{"token"}) becomes an entry in the |yytname| array. These names
can be output by either the `driver' or \TeX\ itself after the
\.{\\yytname} table has been input. The scanner, on the other hand,
will use the first version (\.{TOKEN}). Therefore, it is important to
establish an equivalence between the two versions of the name. In the
`real' parser, the token values are output in a special header
file. Hence, one has to either parse the header file to establish the
equivalences or find some other means to find out the numerical values
of the tokens.
One approach is to parse the file containing the {\it declarations\/}
and extract the equivalences between the names from it. This is
precisely the function of the bootstrap parser. Since the lexer is reused, some
token values need to be known in advance (and the rest either ignored
or replaced by some `made up' values). These tokens are `hard coded'
into the parser file generated by \bison\ and output using a special
function. The switch `|@[#define@]@; BISON_BOOTSTRAP_MODE|' tells the
`driver' program to output the hard coded token values.
@q Bizarre looking way of typing #define is due to the awkward way@>
@q \CWEB\ treats switching in and out of $-mode in inline \Cee@>
Note that the equivalence of the two versions of token names would
have to be established every time a `string version' of a token is
declared in the \bison\ file and the `macro name version' of the token
is used by the corresponding scanner. To establish this equivalence,
however, the bootstrapping parser below is not always necessary (see
the \.{xxpression} example, specifically, the file \.{xxpression.w} in
the \.{examples} directory for an example of using a different parser
for this purpose). The reason it is necessary here is that a parser
for an appropriate subset of the \bison\ syntax is not yet available
(indeed, {\it any\/} functional parser for a \bison\ syntax subset
would have to use the same scanner (unless you want to write a custom
scanner for it), which would need to know how to output tokens, for
which it would need a parser for a subset of \bison\ syntax $\ldots$
it is a genuine `chicken and egg' problem). Hence the need for
`bootstrap'. Once a functional parser for a large enough subset of the
\bison\ input grammar is operational, {\it it\/} can be used to pair
up the token names. The bootstrap parser is not strictly minimal in that
it is also capable of parsing the \prodstyle{\%nterm} declarations.
This ability is not utilized by the parsers in \splint, however (nor
is the accompanying bootstrap lexer designed to output the
\prodstyle{\%nterm} tokens), and was added for scenarios other
than bootstrapping.
The second, perhaps even more important function of the bootstrap process
is to collect information about the scanner's states. The mechanism
is slightly different from that for token definition gathering.
While the token equivalences are collected purely in
`\TeX\ mode', the bootstrap mode parser collects all the state names into a
special \Cee\ header file. The reason is simple: unlike the token
values, the numerical values of the scanner states are not passed to
the `driver' program in any data structure (such as the |yytname| array) and are instead defined as
ordinary (\Cee) macros. The header file supplies the information the `driver' file
needs to output the state values for use by the lexer.
Naturally, to accomplish their task, the lexer and the parser employed in
state gathering need the state and token information, as well. Fortunately,
the parser is a subset of the \flex\ input parser that does not define
any `string' names for its tokens. Similarly, the lexer collects all the necessary
tokens in the \flexsnstyle{INITIAL} state\footnote{An additional subtlety is
the necessity to gracefully handle (and, in some cases, cause) the multiple
possible {\em failures\/} for which the lexer redefines \inlineTeXx{/yyBEGIN}\
to fail immediately when attempting to switch states. Note that the bootstrap
mode parser looks at sections other than those where the declarations reside
and must fail quickly and quietly in such cases.}.
To reiterate a point made in the middle of this section, the bootstrapping
process described here is necessary to `spool up' the \bison\ and \flex\ input parsers.
A simpler procedure may be followed while designing other custom parsers where
the programmer uses, say, the full \bison\ parser to collect information about
the token equivalences (whether such information is needed to make the parser operational
or just to facilitate the typesetting of the token names). By adding custom
`bootstrapping' macros to the ones defined in \.{yyunion.sty}, a number of different
preprocessing tasks can be accomplished.
@(bb.yy@>=
@G Switch to generic mode.
%{
@> @ @=
@> @/#define BISON_BOOTSTRAP_MODE @=
%}
@> @ @=
%union {@> @ @=}
%{@> @ @=%}
@> @ @=
%%
@> @ @=
@> @ @=
%%
@g
@*1 Prologue and full parsers.
The prologue parser is responsible for parsing various grammar
declarations as well as parser options.
@(bd.yy@>=
@G Switch to generic mode.
%{@> @ @=%}
@> @ @=
%union {@> @ @=}
%{@> @ @=%}
@> @ @=
%%
@> @ @=
@> @ @=
@> @ @=
%%
@g
@ The full \bison\ input parser is used when a complete \bison\ file is
expected. It is also capable of parsing a `skeleton' of such a file,
similar to the one that follows this paragraph. As a stopgap measure,
the skeleton of a \flex\ scanner is also parsed by this parser, as they have
an almost identical structure. This is not a perfect arrangement, however, since
it precludes one from putting the constructs that this parser does not
recognize into the outline. To give an example, one cannot put \flex\ specific
options into such a `skeleton'.
@(bf.yy@>=
@G Switch to generic mode.
%{@> @ @=%}
@> @ @=
%union {@> @ @=}
%{@> @ @=%}
@> @ @=
%%
@> @ @=
@> @ @=
@> @ @=
@> @ @=
%%
@g
@ \namedspot{bison.options}The first two options below are essential
for the parser operation
as each of them makes \bison\ produce additional tables (arrays) used
in the operation (or bootstrapping) of \bison\ parsers. The
start symbol can be set implicitly by listing the appropriate
production first. Modern \bison\ also allows specifying the kind of
parsing algorithm to be used (provided the supplied grammar is in the
appropriate class): {\sc LALR}($n$), {\sc LR}($n$), {\sc GLR}, etc.
The default is to use the {\sc LALR}($1$) algorithm (with the
corresponding assumption about the grammar) which can also be set
explicitly by putting\gtextidx{\bison\ options example}{bison options example}{\bisonidxdomain}%
\medskip
\beginprod
\%define lr.type lalr
\endprod
\medskip
\noindent
in with the rest of the options.
Using other types of grammars will wreak havoc
on the parsing algorithm hardcoded into \splint\ (see \.{yyparse.sty})
as well as on the production of \.{\\stashed} and \.{\\format} streams.
@=
@G
%token-table
%debug
%start input
@g
@*1 Token declarations. Most of the original comments present in
the grammar file used by \bison\ itself have been preserved and appear in
{\it italics\/} at the beginning of the appropriate section.
To facilitate the {\it bootstrapping\/} of the parser (see above), some
declarations have been separated into their own sections. Also, a
number of new rules have been introduced to create a hierarchy of
`subparsers' that parse subsets of the grammar. We begin by listing
most of the tokens used by the grammar. Only the string versions are
kept in the |yytname| array, which, in part, is the reason for a
special bootstrapping parser as explained earlier.
\iffalse
\checktrailingstashtrue % see what is left at the end
\checktabletrue % display the table
\fi
@=
@G
%token GRAM_EOF 0 "end of file"
%token STRING "string"
%token PERCENT_TOKEN "%token"
%token PERCENT_NTERM "%nterm"
%token PERCENT_TYPE "%type"
%token PERCENT_DESTRUCTOR "%destructor"
%token PERCENT_PRINTER "%printer"
%token PERCENT_LEFT "%left"
%token PERCENT_RIGHT "%right"
%token PERCENT_NONASSOC "%nonassoc"
%token PERCENT_PRECEDENCE "%precedence"
%token PERCENT_PREC "%prec"
%token PERCENT_DPREC "%dprec"
%token PERCENT_MERGE "%merge"
@g
@@;
@ We continue with the list of tokens below, following the layout of
the original parser.
\iffalse
\checktrailingstashfalse
\checktablefalse
\fi
@=
@G
%token
PERCENT_CODE "%code"
PERCENT_DEFAULT_PREC "%default-prec"
PERCENT_DEFINE "%define"
PERCENT_DEFINES "%defines"
PERCENT_ERROR_VERBOSE "%error-verbose"
PERCENT_EXPECT "%expect"
PERCENT_EXPECT_RR "%expect-rr"
PERCENT_FLAG "%"
PERCENT_FILE_PREFIX "%file-prefix"
PERCENT_GLR_PARSER "%glr-parser"
PERCENT_INITIAL_ACTION "%initial-action"
PERCENT_LANGUAGE "%language"
PERCENT_NAME_PREFIX "%name-prefix"
PERCENT_NO_DEFAULT_PREC "%no-default-prec"
PERCENT_NO_LINES "%no-lines"
PERCENT_NONDETERMINISTIC_PARSER
"%nondeterministic-parser"
PERCENT_OUTPUT "%output"
PERCENT_REQUIRE "%require"
PERCENT_SKELETON "%skeleton"
PERCENT_START "%start"
PERCENT_TOKEN_TABLE "%token-table"
PERCENT_VERBOSE "%verbose"
PERCENT_YACC "%yacc"
;
%token BRACED_CODE "{...}"
%token BRACED_PREDICATE "%?{...}"
%token BRACKETED_ID "[identifier]"
%token CHAR "char"
%token EPILOGUE "epilogue"
%token EQUAL "="
%token ID "identifier"
%token ID_COLON "identifier:"
%token PERCENT_PERCENT "%%"
%token PIPE "|"
%token PROLOGUE "%{...%}"
%token SEMICOLON ";"
%token TAG ""
%token TAG_ANY "<*>"
%token TAG_NONE "<>"
%token INT "integer"
%token PERCENT_PARAM "%param";
@g
@*1 Grammar productions.
We are ready to describe the top levels of the parse tree. The first
`sub parser' we consider is a `full' parser, that is, the parser that
expects a full grammar file, complete with the prologue, declarations,
etc. This parser can be used to extract information from the grammar
that is otherwise absent from the executable code generated by
\bison. This includes, for example, the `name' part of
\.{\$}\.{[}{\rm name}\.{]}.
This parser is therefore used to generate the `symbolic
switch' to provide support for symbolic term names similar to
the `genuine' \bison's \.{\$}\.{[}$\ldots$\.{]} syntax.
The action of the parser in this case is simply to separate the
accumulated `parse tree' from the auxiliary information carried by the
parser on the stack.
\saveparseoutputfalse
\checktablefalse
\tracenamesfalse
@=
@G
@t}\vb{\inline}{@>
input:
prologue_declarations
"%%" grammar epilogue.opt {@> @ @=}
;
@g
@ @=
@[TeX_( "/finishlist{/expandafter/yyfirstoftwo/the/yy(3)}" );@]@; /* complete the list */
@[TeX_( "/table/expandafter{/romannumeral0" );@]@;
@[TeX_( " /executelistat{/expandafter/yyfirstoftwo/the/yy(3)}{0}}" );@]@;
@ Another subgrammar deals with the syntax of isolated \bison\ rules. This is
the most commonly used `subparser' since a rules cluster is the most
natural `unit' to include in a \CWEB\ file.
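As a minimal sketch of such a `unit' (the names \.{list} and \.{item}
are made up for this illustration and do not occur in the \bison\
grammar), a rules cluster accepted by this subparser may be as short as
\medskip
\beginprod
list:
    item
  | list item
;
\endprod
\medskip
\noindent
with neither declarations nor \prodstyle{\%\%} separators around it.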
@=
@G
@t}\vb{\inline}{@>
input:
grammar epilogue.opt {@> @ @=}
;
@g
@ @=
@[TeX_( "/finishlist{/expandafter/yyfirstoftwo/the/yy(1)}" );@]@; /* complete the list */
@[TeX_( "/table/expandafter{/romannumeral0" );@]@;
@[TeX_( " /executelistat{/expandafter/yyfirstoftwo/the/yy(1)}{0}}" );@]@;
@ The bootstrap parser has a very narrow set of goals: it is concerned
with \prodstyle{\%token} declarations only in
order to supply the token information to the lexer (since, as noted
above, such information is not kept in the |yytname| array).
The parser can also parse \prodstyle{\%nterm} declarations but the
bootstrap lexer ignores the \prodstyle{\%nterm} token, since the
\bison\ grammar does not use one.
It also extends the syntax of a \prodstyle{grammar_declaration} by allowing a
declaration with or without a semicolon at the end (the latter is only
allowed in the prologue). This works since the token declarations have
been carefully separated from the rest of the grammar in different
\CWEB\ sections. The range of tokens output by the bootstrap
lexer is limited, hence most of the other rules are ignored.
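By way of illustration only (the token names \.{FIRST} and \.{SECOND}
are hypothetical), an input of the kind targeted by this parser might
read
\medskip
\beginprod
\%token FIRST "first";
\%token SECOND "second"
\endprod
\medskip
\noindent
where the first declaration is terminated by a semicolon and the second
is not; both forms are covered by \prodstyle{semi.opt}.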
@=
@G
@t}\vb{\inline}{@>
input:
grammar_declarations {@> TeX_( "/table=/yy(1)" ); @=}
;
@t}\vb{\resetf}{@>
grammar_declarations:
symbol_declaration semi.opt {@> @ @=}
| grammar_declarations
symbol_declaration semi.opt {@> TeX_( "/yy0{/the/yy(1)/the/yy(2)}" ); @=}
;
@t}\vb{\inline\flatten}{@>
semi.opt: {} | ";" {};
@g
@ The following is perhaps the most common action performed by the
parser. It is done automatically by the parser code but this feature
is undocumented so we supply an explicit action in each case.
@=
@[TeX_( "/yy0{/the/yy(1)}" );@]@;
@ Next comes a subgrammar for processing prologue declarations. Finer
differentiation is possible but the `subparsers' described here work
pretty well and impose a mild style on the grammar writer. Note that
these rules are not part of the official \bison\ input grammar and are
added to make the typesetting of `file outlines' (e.g.~|@(bb.yy@>|
above) possible.
@=
@G
@t}\vb{\inline}{@>
input:
prologue_declarations epilogue.opt {@> @ @=}
| prologue_declarations
"%%" "%%" EPILOGUE {@> @ @=}
| prologue_declarations
"%%" "%%" {@> @ @=}
;
@g
@ @=
@[TeX_( "/finishlist{/expandafter/yyfirstoftwo/the/yy(1)}" );@]@; /* complete the list */
@[TeX_( "/table/expandafter{/romannumeral0" );@]@;
@[TeX_( " /executelistat{/expandafter/yyfirstoftwo/the/yy(1)}{0}}" );@]@;
@ {\it Declarations: before the first \prodstyle{\%\%}}. We are now
ready to deal with the specifics of the declarations themselves.
@=
@G
prologue_declarations:
{@> @ @=}
| prologue_declarations
prologue_declaration {@> @ @=}
;
@g
@ @=
@[TeX_( "/initlist{/prologuedeclarationsprefix prologue_declarations}" );@]@;
@[TeX_( "/yy0{{/prologuedeclarationsprefix prologue_declarations}{/nx/empty}}" );@]@;
@[TeX_( "/edef/prologuedeclarationsprefix{./prologuedeclarationsprefix}" );@]@;
@ @=
@@;
@ Here is a list of most kinds of declarations that can appear in the
prologue. The scanner returns the `stream pointers' for all the
keywords so the declaration `structures' pass on those pointers to the
grammar list. The original syntax has been left intact even though for
the purposes of this parser some of the inline rules are unnecessary.
\eraselocalformattrue
@=
@G
prologue_declaration:
grammar_declaration {@> @ @=}
| "%{...%}" {@> TeX_( "/yy0{/nx/prologuecode/the/yy(1)}" ); @=}
| "%" {@> TeX_( "/yy0{/nx/optionflag/the/yy(1)}" ); @=}
| "%define" variable value {@> TeX_( "/yy0{/nx/vardef{/the/yy(2)}{/the/yy(3)}/the/yy(1)}" ); @=}
| "%defines" {@> TeX_( "/yy0{/nx/optionflag{defines}{}/the/yy(1)}" ); @=}
| "%defines" STRING {@> @t}\vb{\stashed{\Xmark prologue.decls:\Xmark}}{@> @=
@> @[TeX_( "/toksa{defines}" );@]@+@ @=}
| "%error-verbose" {@> TeX_( "/yy0{/nx/optionflag{error verbose}{}/the/yy(1)}" ); @=}
| "%expect" INT {@> @t}\vb{\stashed{\Xmark prologue.decls(g):\Xmark}}{@> @=
@> @[TeX_( "/toksa{expect}" );@]@+@ @=}
| "%expect-rr" INT {@> @t}\vb{\stashed{\Xmark prologue.decls(g):\Xmark}}{@> @=
@> @[TeX_( "/toksa{expect-rr}" );@]@+@ @=}
| "%file-prefix" STRING {@> @[TeX_( "/toksa{file prefix}" );@]@+@ @=}
| "%glr-parser" {@> TeX_( "/yy0{/nx/optionflag{glr parser}{}/the/yy(1)}" ); @=}
| "%initial-action" "{...}" {@> TeX_( "/yy0{/nx/initaction/the/yy(2)}" ); @=}
| "%language" STRING {@> @[TeX_( "/toksa{language}" );@]@+@ @=}
| "%name-prefix" STRING {@> @[TeX_( "/toksa{name prefix}" );@]@+@ @=}
| "%no-lines" {@> TeX_( "/yy0{/nx/optionflag{no lines}{}/the/yy(1)}" ); @=}
| "%nondeterministic-parser" {@> TeX_( "/yy0{/nx/optionflag{nondet. parser}{}/the/yy(1)}" ); @=}
| "%output" STRING {@> @t}\vb{\stashed{\Xmark prologue.decls:\Xmark}}{@> @=
@> @[TeX_( "/toksa{output}" );@]@+@ @=}
@t}\vb{\flatten}{@>
| "%param" {@> @t}\vb{\stashed{\rm (we simply return pointers below)}}{@> @=}
params {@> TeX_( "/yy0{/nx/paramdef{/the/yy(3)}/the/yy(1)}" ); @=}
@t}\vb{\fold}{@>
| "%require" STRING {@> @t}\vb{\stashed{\Xmark prologue.decls:\Xmark}}{@> @=
@> @[TeX_( "/toksa{require}" );@]@+@ @=}
| "%skeleton" STRING {@> @[TeX_( "/toksa{skeleton}" );@]@+@ @=}
| "%token-table" {@> TeX_( "/yy0{/nx/optionflag{token table}{}/the/yy(1)}" ); @=}
| "%verbose" {@> TeX_( "/yy0{/nx/optionflag{verbose}{}/the/yy(1)}" ); @=}
| "%yacc" {@> TeX_( "/yy0{/nx/optionflag{yacc}{}/the/yy(1)}" ); @=}
| ";" {@> TeX_( "/yy0{/nx/empty}" ); @=}
;
params:
params "{...}" {@> TeX_( "/yy0{/the/yy(1)/nx/braceit/the/yy(2)}" ); @=}
| "{...}" {@> TeX_( "/yy0{/nx/braceit/the/yy(1)}" ); @=}
;
@g
@ This is a typical parser action: encapsulate the `type' of the
construct just parsed and attach some auxiliary info, in this case the
stream pointers.
\eraselocalformatfalse
\smallskip
\rulereferencex{\showlastactionfalse}{\nx\inline\nx\flatten}{prologue.decls}
\smallskip
\noindent The productions above are typical examples.
@=
@[TeX_( "/yy0{/nx/oneparametricoption{/the/toksa}{/nx/stringify/the/yy(2)}/the/yy(1)}" );@]@;
@ A variation on the theme above where the parameter is not a \prodstyle{STRING}.
\smallskip
\rulereferencex{\showlastactionfalse}{\nx\inline\nx\flatten}{prologue.decls(g)}
\smallskip
\noindent A sample of the rules to which the code below applies is given above.
@=
@[TeX_( "/yy0{/nx/oneparametricoption{/the/toksa}{/the/yy(2)}/the/yy(1)}" );@]@;
@ {\it Grammar declarations}. These declarations can appear in both
the prologue and the rules sections. Their treatment is very similar to
the prologue-only options.
@=
@G
grammar_declaration:
precedence_declaration {@> @ @=}
| symbol_declaration {@> @ @=}
| "%start" symbol {@> @t}\vb{\stashed{\Xmark prologue.decls(g):\Xmark}}{@> @=
@> @[TeX_( "/toksa{start}" );@]@+@ @=}
| code_props_type "{...}" generic_symlist {@> @ @=}
| "%default-prec" {@> TeX_( "/yy0{/nx/optionflag{default prec.}{}/the/yy(1)}" ); @=}
| "%no-default-prec" {@> TeX_( "/yy0{/nx/optionflag{no default prec.}{}/the/yy(1)}" ); @=}
| "%code" "{...}" {@> TeX_( "/yy0{/nx/codeassoc{code}{}/the/yy(2)/the/yy(1)}" ); @=}
| "%code" ID "{...}" {@> TeX_( "/yy0{/nx/codeassoc{code}{/nx/idit/the/yy(2)}/the/yy(3)/the/yy(1)}" ); @=}
;
code_props_type:
"%destructor" {@> TeX_( "/yy0{{destructor}/the/yy(1)}" ); @=}
| "%printer" {@> TeX_( "/yy0{{printer}/the/yy(1)}" ); @=}
;
@g
@ @=
@[TeX_( "/getfirst{/yy(1)}/to/toksa" );@]@; /* name of the property */
@[TeX_( "/getfirst{/yy(2)}/to/toksb" );@]@; /* contents of the braced code */
@[TeX_( "/getsecond{/yy(2)}/to/toksc" );@]@; /* braced code format pointer */
@[TeX_( "/getthird{/yy(2)}/to/toksd" );@]@; /* braced code stash pointer */
@[TeX_( "/getsecond{/yy(1)}/to/tokse" );@]@; /* code format pointer */
@[TeX_( "/getthird{/yy(1)}/to/toksf" );@]@; /* code stash pointer */
@[TeX_( "/yy0{/nx/codepropstype{/the/toksa}{/the/toksb}{/the/yy(3)}{/the/toksc}{/the/toksd}{/the/tokse}{/the/toksf}}" );@]@;
@ @=
@G
%token PERCENT_UNION "%union";
@g
@ @=
@G
@t}\vb{\inline\flatten}{@>
union_name:
{@> TeX_( "/yy0{}" ); @=}
| ID {@> @ @=}
;
grammar_declaration:
"%union" union_name "{...}" {@> @ @=}
;
symbol_declaration:
"%type" TAG symbols.1 {@> @ @=}
;
@t}\vb{\resetf\flatten}{@>
precedence_declaration:
precedence_declarator tag.opt symbols.prec {@> @ @=}
;
precedence_declarator:
"%left" {@> TeX_( "/yy0{/nx/preckind{left}/the/yy(1)}" ); @=}
| "%right" {@> TeX_( "/yy0{/nx/preckind{right}/the/yy(1)}" ); @=}
| "%nonassoc" {@> TeX_( "/yy0{/nx/preckind{nonassoc}/the/yy(1)}" ); @=}
| "%precedence" {@> TeX_( "/yy0{/nx/preckind{precedence}/the/yy(1)}" ); @=}
;
@t}\vb{\inline}{@>
tag.opt:
{@> TeX_( "/yy0{}" ); @=}
| TAG {@> @ @=}
;
@t}\vb{\insertraw{\beginfoldedsections}}{@>
@g
@ @=
@[TeX_( "/yy0{/nx/codeassoc{union}{/the/yy(2)}/the/yy(3)/the/yy(1)}" );@]@;
@ @=
@[TeX_( "/yy0{/nx/typedecls{/nx/tagit/the/yy(2)}{/the/yy(3)}/the/yy(1)}" );@]@;
@t}\endfoldedsections{@>
@ @=
@[TeX_( "/getthird{/yy(1)}/to/toksa" );@]@; /* format pointer */
@[TeX_( "/getfourth{/yy(1)}/to/toksb" );@]@; /* stash pointer */
@[TeX_( "/getsecond{/yy(1)}/to/toksc" );@]@; /* kind of precedence */
@[TeX_( "/yy0{/nx/precdecls{/the/toksc}{/the/yy(2)}{/the/yy(3)}{/the/toksa}{/the/toksb}}" );@]@;
@ The bootstrap grammar forms the smallest subset of the full grammar.
@=
@@;
@ @=
@[TeX_( "/yy0{/nx/tagit/the/yy(1)}" );@]@;
@ These are the two most important rules for the bootstrap parser. The reasons for
the~\prodstyle{\%token} declarations to be collected during the bootstrap pass are
outlined in the \locallink{bootstrapping}section on bootstrapping\endlink.
The~\prodstyle{\%nterm} declarations are not strictly necessary for
bootstrapping the parsers included in \splint\ but they are added for
the cases when the bootstrap mode is used for purposes other than
bootstrapping \splint.
@=
@G
@t}\vb{\flatten}{@>
symbol_declaration:
"%nterm" {} symbol_defs.1 {@> TeX_( "/yy0{/nx/ntermdecls{/the/yy(3)}/the/yy(1)}" ); @=}
@t}\vb{\fold\flatten}{@>
| "%token" {} symbol_defs.1 {@> TeX_( "/yy0{/nx/tokendecls{/the/yy(3)}/the/yy(1)}" ); @=}
;
@g
@ {\it Just like \prodstyle{symbols.1} but accept \prodstyle{INT} for
the sake of \POSIX}. Perhaps the only point worth mentioning here is
the inserted separator (%
\texrefx{/hspace}{other}%
\.{\{}$p_0$\.{\}\{}$p_1$\.{\}},
typeset as
|TeXa("/hspace"); TeXao(@t\TeXlit"\{\hbox{$p_0$}\}\{\hbox{$p_1$}\}\hbox{$\!$}"@>);|).
@q A string "..." is a syntactic unit in \CWEB\ so it is impossible@>
@q to insert \TeX\ material in the middle of the string directly@>
Like any other separator, it takes
two parameters, the stream pointers $p_0$ and~$p_1$. In this case, however, both pointers are null
since there seems to be no other meaningful assignment. If any
formatting or stash information is needed, it can be extracted by the
symbols themselves.
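To see the two forms of \prodstyle{symbol.prec} side by side, consider
the following sketch (the token names and the number are assumptions
made for this example and are not part of the \bison\ grammar):
\medskip
\beginprod
\%left PLUS MINUS
\%right POWER 300
\endprod
\medskip
\noindent
In the first declaration \.{PLUS} and \.{MINUS} are joined by the
separator and each has an empty number field, while in the second
\.{POWER} carries the number \.{300}.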
@=
@G
symbols.prec:
symbol.prec {@> @ @=}
| symbols.prec symbol.prec {@> TeX_( "/yy0{/the/yy(1)/nx/hspace{0}{0}/the/yy(2)}" ); @=}
;
symbol.prec:
symbol {@> TeX_( "/yy0{/nx/symbolprec{/the/yy(1)}{}}" ); @=}
| symbol INT {@> TeX_( "/yy0{/nx/symbolprec{/the/yy(1)}{/the/yy(2)}}" ); @=}
;
@g
@ {\it One or more symbols to be \prodstyle{\%type}'d}.
@=
@G
%type symbols.1 symbol;
symbols.1:
symbol {@> @ @=}
| symbols.1 symbol {@> TeX_( "/yy0{/the/yy(1)/nx/hspace{0}{0}/the$[symbol]}" ); @=}
;
generic_symlist:
generic_symlist_item {@> @ @=}
| generic_symlist generic_symlist_item {@> TeX_( "/yy0{/the/yy(1)/nx/hspace{0}{0}/the/yy(2)}" ); @=}
;
@t}\vb{\flatten\inline}{@>
generic_symlist_item:
symbol {@> @ @=}
| tag {@> @ @=}
;
tag:
TAG {@> @ @=}
| "<*>" {@> @ @=}
| "<>" {@> @ @=}
;
@g
@ {\it One token definition}.
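The four alternatives that begin with \prodstyle{id} correspond, in
order, to declarations such as the ones in the following sketch (the
names and the numeric values are hypothetical and serve only as an
illustration):
\medskip
\beginprod
\%token ALPHA
\%token BETA 300
\%token GAMMA "gamma"
\%token DELTA 301 "delta"
\endprod
\medskip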
@=
@G
symbol_def:
TAG {@> @ @=}
@t}\vb{\flatten}{@>
| id {@> TeX_( "/yy0{/nx/onesymbol{/the/yy(1)}{}{}}" ); @=}
| id INT {@> TeX_( "/yy0{/nx/onesymbol{/the/yy(1)}{/the/yy(2)}{}}" ); @=}
| id string_as_id {@> TeX_( "/yy0{/nx/onesymbol{/the/yy(1)}{}{/the/yy(2)}}" ); @=}
| id INT string_as_id {@> TeX_( "/yy0{/nx/onesymbol{/the/yy(1)}{/the/yy(2)}{/the/yy(3)}}" ); @=}
;
@g
@ {\it One or more symbol definitions}.
@=
@G
symbol_defs.1:
symbol_def {@> @ @=}
| symbol_defs.1 symbol_def {@> @ @=}
;
@g
@ @=
@[TeX_( "/getsecond{/yy(2)}/to/toksa" );@]@; /* the identifier */
@[TeX_( "/getfourth{/toksa}/to/toksb" );@]@; /* the format pointer */
@[TeX_( "/getfifth{/toksa}/to/toksc" );@]@; /* the stash pointer */
@[TeX_( "/yy0{/the/yy(1)/nx/hspace{/the/toksb}{/the/toksc}/the/yy(2)}" );@]@;
@ {\it The grammar section: between the two
\prodstyle{\%\%}'s}. Finally, the following few short sections define
the syntax of \bison's rules.
@=
@G
grammar:
rules_or_grammar_declaration {@> @ @=}
| grammar rules_or_grammar_declaration {@> @ @=}
;
@g
@*2 Rules syntax. {\it As a \bison\ extension, one can use the grammar declarations in the
body of the grammar}. What follows is the syntax of the right hand
side of a grammar rule. The type declarations for various non-terminals are used exclusively
by the postprocessor whenever the `native' \bison\ term references are used (see elsewhere
for details).
@=
@G
%type rhs id_colon named_ref.opt rhses.1 "|";
rules_or_grammar_declaration:
rules {@> @