2. Implementation


.PP
.Xp
is written in the
.UX
systems programming language C.
The structure of
.XP
is very similar to that of the Pascal translator
.PI .
In fact
.XP
was, until recently, a conditional compilation of
.PI .
.Xp
uses the same parser and scanner as
.PI ;
these are described in [1].
.PP
The major changes to the compiler in writing
.XP
were the rewriting of the
semantic routines of the compiler and the addition of three
``clusters''
of routines;
one for profiling,
one for the production of the restructured program listing,
and one for the processing of comments.
These ``clusters''
are written with local data bases, are largely
independent of the other parts of the program, and consist
of a large number of small, related routines.
By structuring the program in this way, functional
independence is obtained.
Clusters are organized separately
from the rest of the program, and accessed through a high-level and abstract
interface.
The concrete implementation of a cluster is not fixed by these
external specifications, and could be changed without the need to
rewrite any other
part of the program provided that the routines then implemented
a suitable interpretation of the same abstraction.
This organization provides a powerful and convenient decomposition
of the profiling process.
.PP
The print cluster routines are kept in the file
.I pp.c.
Their basic function is to provide
an interface to a high-level
printing automaton, with primitives such as 
``print keyword''
and
``print number.''
Information about the macro- and micro-structure of the output is passed to
the routines implicitly by calling the appropriate entry points.
Thus, after a statement is processed, the routines are informed that an
``item'' is finished so that an intelligent decision can be made about
where to place the statement.
Similarly, when processing a list of items such as a constant type
declaration, the print cluster is given information as to where
the ``left bracket,'' i.e. `(' occurs, where each ``separator,'' i.e. `,'
occurs, and, when the final `)' occurs, this is noted as a ``right bracket.''
With this information, an implementation of the printing cluster can be
provided which sensibly divides long structured constructs even though it has
no knowledge of the high-level language structure of its input.
.PP
The profile cluster, which is kept in the file
.I pmon.c,
has a number of related functions.
Initially, it deals with the problem of extracting the count information
from the profile data file.
If the execution of the interpreter terminated abnormally, then the data must
be extracted from the
.I core
process image file.
There is an image of the profile data file in the
.I core
image.
To extract the data from
.I core,
.I pxp
need only determine the offset of this image in the file.
.PP
After extracting the data from the file, the profile cluster provides
operations to return successive counters from the count buffer.
In addition it
builds the data structure for storing the information used to print
the optional table summarizing
.B procedure
and
.B function
invocation counts.
Other functions provided include
saving and restoring counts before
and after structured statements,
determining when embedded
.B goto
statements will have caused extra counts to be generated,
generating pseudo-counters for the
.B else
parts of
.B if
statements,
and controlling when the counts are to be printed in the listing
by keeping track of whether each count has been printed.
All manipulation of counts and counters is handled by these routines,
and focuses on a single counter structure called
.PX
which contains the current count.
.PP
In section 2.1 the profile data format and
the process of getting the data from the profile
data file or the
.PX
process
.I core
image are discussed.
Section 2.2 gives the data and control structures of the profiling
and printing clusters, and section 2.3 describes the tree walk
at the function level, and at illustrative parts of the statement
and expression levels.
The implementation description concludes in section 2.4 with
a note on the handling of multi-file programs.

2.1. External data structures


.PP
The following sections deal with the recovery of the profile data.
The simplest case, that of normal termination, is dealt with in section 2.1.1.
The remaining sections deal with the handling of
.I core
process image files.
These are likely to be of interest to
system programmers only.

2.1.1. Format of the profile data file


.PP
The
.I pmon.out
data file,
is normally written by the Pascal system at the end of profiling
execution.
The record structure of the header at the beginning of the file is as
follows:
.LS
.ta 8n 16n 40n
\*bstruct\fP {
	\*bint\fP	magic;	/* 0426 */
	\*bint\fP	magic2;	/* 0 */
	\*blong\fP	proftime;	/* Time profiled */
	\*bint\fP	cnt;	/* # statement cntrs */
	\*bint\fP	pfcnt;	/* # procedures+functions + 1 */
} header;
.LE
Each counter is a long integer so the rest of the file is essentially
an array:
.LS
\*blong\fP counts[header\*b.\fPcnt];
.LE
.PP
As an error check,
.XP
first insures that the first word is 0426
and the second is a 0.
If this is true, and the size of the remainder of the file is the size of
the
.I counts
array as implied by the number
.I cnt
in the header, then
profiling can proceed.
If not, then the specified profile data file has a bad format, and
.XP
reports this fact.

2.1.2. Core data files


.PP
.Xp
has the capability to
extract the profile information from the
.I core
data file dumped by an
abnormal
.PX
termination.
This is greatly simplified by the fact that there is, in the core image,
an exact replica of the
.I pmon.out
file which would have been written at normal termination.
In order to extract the profile data, it is only necessary
to locate this buffer.
.PP
The first 1024 bytes of the core image file is a copy of
the system's per-user data for the process, including the registers
as they were at the time of the fault.
The remainder of the file represents the actual contents of the user's
core area when the core image was written.
If the text segment of the process was write-protected and shared,
it will not be present; otherwise the entire address space will have
been dumped.

2.1.3. Shared text programs


.PP
It is possible in
.UX
for the text image of a program to be shared by all of the persons
who invoke it.
The instructions will be write protected, and each process which
executes this
.I pure
text will have only a data area, sharing one
common text.
Thus, when such a process is swapped out, only its data area need
be swapped.
The text space can always be abandoned and read in from external
storage when needed, as it cannot have changed.
.PP
When a core image is dumped,
the text space is not dumped if the process was running as shared text.
This is the situation with the installed version of
.PX .
The disadvantage of shared text is that breakpoint debugging
is impossible in this case.
For this reason
.X
is most often debugged
as non-shared text.
It is desirable, therefore, to have
.XP
be able
to extract data correctly in either case.
.PP
The important fact here, which allows a number of simplifications
to the extraction method, is that
the non-stack data space of the process
is dumped immediately after the system area.
If the text space was not ``shared,'' then the first word dumped will
have been location 0 in the process address space, otherwise it will
have been the first data word, as the text of the process appears
first in the process space.

2.1.4. Extraction methods


.PP
The structure needed to navigate in the
.I core
image is the
.I pcx
information structure\u\s-2\*(dg\s0\d, which has the following format:
.FS
\*(dg
.I Px
was previously
.I pcx
hence the name of this structure
.FE
.LS
\*bstruct\fR {
	\*bchar\fR	*buf;
	\*bint\fR	cnt;
} pcx;
.LE
The
.I buf
pointer here points to the image of
.I pmon.out
in core, and is a memory address.
The
.I cnt
gives the size of this image,
and is not currently used by
.XP
as the size is determined from information in the image itself.
If the
.I buf
pointer is 0, i.e. a 
.I NIL
pointer, then
.PX
was not
gathering profile data in the execution which abnormally terminated.
.PP
To locate this structure, the runtime start up routine for the interpreter
was modified from the standard C start up to be as given in the following
figure.
.so pcrt0.s
This runtime start up is structured so that it is possible to determine,
from the first three words after the system area, whether
the process was executed as shared text, and the offset of the
.PX
structure in the file.
If the second word here is a 0, then this can only be a non-pure
core image; if it is a 1 then this can only be a pure core image.
If it is neither then this is not a
.PX
core image.
.PP
If the core image is pure, the first word dumped of the process
was not at location 0 in the process data space.
In this case, however, the runtime start up is set up so the
first word contains its address in memory.
Thus it is possible to can convert the third core image word, which is the index
into memory, into an index into the core image by subtracting
this offset from it.
In this way,
.XP
can find the
.I pcx
structure in the 
.I core
image.
.PP
As noted above, if the
.I buf
pointer is
NIL,
then
.PX
was not gathering profile data and an appropriate diagnostic will
be printed.
Otherwise, the
.I buf
pointer is offset by subtracting the first word
after the system area in the
.I core
image if the process was pure,
.XP
seeks to the
implied location and a routine common to the
.I core
and
.I pmon.out
profile data gathering routines 
.I pmread
is called.
The only notable difference is that after calling
.I pmread
when processing a
.I pmon.out
file,
.XP
checks that all the data on the file is
exhausted.
This is not done when reading a
.I core
data file.
.PP
If any of this seeking or reading terminates abnormally
a
.I "bad format"
diagnostic is given.

2.2. Internal data structures

2.2.1. External data copies


.PP
The following data items are kept internally after being read from
the
.I pmon.out
or 
.I core
data file.
They are available only by using the routines in the cluster in the file
.I pmon.c .
.PP
The integer 
.I zcnt
gives the total number of counters, the integer 
.I zpfcnt
the total number of procedures and functions.
The pointer
.I zbuf
is (effectively) a dynamic array of long integers
representing the profile counts.
In addition, the external variable
.I ptvec
contains the internal
representation of the time the profile was made.

2.2.2. Profile cluster data structures


.PP
There are two major profile cluster structures.
The first structure is the primitive unit for storing a piece
of count information and is given by:
.LS
\*bstruct\fP pxcnt {
	\*blong\fP	ntimes;
	\*bint\fP	counter;
	\*bint\fP	gos;
	\*bint\fP	printed;
};
.LE
Here
.I ntimes
is the actual count data.
The variable
.I counter
gives a unique identifying tag to this structure,
.I gos
is the number of gotos which had been encountered when this counter
was created, and
.I printed
is actually a Boolean value indicating
whether the counter is considered to have been printed for
the purposes of the profile.
.I Counter
may actually be an index into the array of
values read in from
.I pmon.out ,
or it may be a negative number indicating
that it is a counter which was generated by calculations, e.g. for
an
.B else
part of an
.B if
statement.
Other uses of this structure will be described below.
.PP
The other major structure which records information
for each
.B procedure
or 
.B function
for the summary table. Its format is:
.LS
\*bstruct\fP pftab {
	\*blong\fP	pfcnt;
	\*bint\fP	pfline;
	\*bchar\fP	*pfname;
	\*bint\fP	pflev;
} *zpf;
.LE
The field
.I pfcnt
gives the invocation count for each routine,
.I pfline
is the approximate source program line number for the routine,
.I pfname
points to the character string name of the routine, and
.I pflev
gives the nesting level of the routine so it is possible to print
the table with nesting indicating structure.
The variable
.I zpf
is actually a dynamic array holding these counters.

2.2.3. The profile cluster primitives


.PP
The following are the primitive operations of the profile cluster
kept in the source file
\fIpmon.c\fP.
They deal in particular with one special counter structure
.I px
which holds the current count information as processing progresses.
.IP
.RS
.HP "\*blong\fP nowcnt()"
.br
Returns a \*blong\fP integer which is the current count.
.RE
.IP
.RS
.HP nowcntr()
.br
Returns an integer uniquely identifying the current counter.
.IP "\*blong\fP cntof(pxc)"
.HP "\*bstruct\fP pxcnt *pxc;"
.br
Returns the \*blong\fP integer
.I count
associated with the structure addressed by
\fIpxc\fP.
.IP "setcnt(l)"
.HP "\*blong\fP l;"
.br
Makes the current counter have count
\fIl\fP.
A new, unique counter is generated for this purpose.
This is used, e.g., to assign counts to
.B else
parts of
.B if
statements.
.IP "savecnt(pxc)"
.HP "\*bstruct\fP pxcnt *pxc;"
.br
Saves the information associated with the current count
.I px
in the given structure.
.IP "rescnt(pxc)"
.HP "\*bstruct\fP pxcnt *pxc;"
.br
The inverse of
.I savecnt
setting the fields of
.I px
from the given structure.
.I Rescnt
also returns non-zero exactly when there were embedded gotos.
.IP "getcnt()"
.br
Takes the next counter from the
.I pmon.out
or
.I core
data and places it and associated information into
\fIpx\fP.
The fact that there are less counters in the file than
required by the supplied program is diagnosed as a
``Counter mismatch'' here.
.IP "cnttab(s, no)"
.HP "\*bchar\fP *s;"
.HP "\*bint\fP no;"
.br
Records the current count as being that of a
.B procedure
or
.B function
with number
\fIno\fP.
The number of
procedures and functions are also counted
here so they can be checked for consistency at the end of processing.
.RE

2.2.4. Profile-printing cluster interface


.PP
The following routines, which are part of the
.I pmon.c
cluster, interface to the printing cluster, and are used
to control the annotation of the program listing with
profile data.
.IP
.RS
.HP "unprint()"
.br
Causes the current counter to become logically ``unprinted''
so that it will be printed again.
Used, e.g., after restoring a count saved before a loop
to force it to be printed again.
.RE
.IP
.RS
.HP "baroff()"
.br
Turns the printing of the bar `|' in the profile off.
.IP "baron()"
.br
The inverse of
\fIbaroff\fP.
.IP "shudpcnt()"
.br
Returns an integer indicating the type of annotation of the
current line which is desired.
A return value of 0 indicates that only the bar `|' is to
be printed, a value of 1 that the current count and the
bar `|' are to be printed, and a value of -1 indicates that
only white space is to be printed.
A flag set by the routines
.I baroff
and 
.I baron
is inspected here.
If the bar `|' to be printed is printed, it is noted that the
current counter was printed so the count will not be printed again.
.RE

2.2.5. The printing cluster


.PP
The file
.I pp.c
contains the cluster which performs the nitty-gritty business
of preparing the output to be printed.
It was the author's intention, when he wrote the program, to
pass sufficient information to this cluster to allow it to do
the job of breaking up long statements and of placing multiple
statements on each line.
This has not been implemented yet.
The description of how the routines of this cluster
work together to produce the profile is deferred to a later, higher level
view.
.PP
This cluster currently has a very minimal set of data structures.
These structures would expand greatly if the statement folding and 
multiple statement placement were implemented.
As it stands
the input is interpreted only in a very simpleminded way.
For this purpose it is only noted whether anything is being printed
at all, indicated by the flag
\fInopflg\fP,
which if non-zero indicates that no printing is to occur, and some information
about the way in which the listing is to be indented.
This is contained in an array giving the indenting in spaces for
each of three levels, i.e.:
.LS
\*b#define\fP DECL    0
\*b#define\fP STAT    1
\*b#define\fP PRFN    2

\*bint\fP pplev[3];
.LE
.PP
These levels have rough interpretations as follows.
The white space generated at the very left by the indenting
of nested
.B procedure
and
.B function
declarations is assigned to
\fIpplev[PRFN]\fP,
the white space generated in declaration parts by structured declarations
such as records to DECL, and the white space generated by indenting
of structured statements to STAT.
.PP
The white space is dispersed on the line, separating the left margin,
labels, the profile information, and the program text graphically
given below.
.DS C
line#  PRFN  label:  STAT  999.---|  DECL  text
.DE
Thus by indenting in the DECL part deeper text nesting can be shown
without the bar wandering to the right, and when indenting in the
STAT part the bar is moved over.
Similarly, the option to indent nested procedures and functions
is trivially handled by indenting in PRFN.

2.2.6. Printing cluster primitives


.PP
There are two kinds of routines in the printing cluster.
One kind deals with printing the various kinds of tokens in
Pascal programs, e.g. keywords, strings, etc.
The other set of routines deals with the specifics of printing
the profile, i.e. turning printing on and off, indenting, and
the nasty details like the placement of labels.
.IP
.RS
.HP "ppkw(s)"
.HP "\*bchar\fP *s;"
.br
Prints the character string representing the keyword.
Underlining and overstriking of keywords is also handled.
This routine facilitate the suppression of blank lines at the beginning
of partial profiles,
setting a flag the first time it prints a keyword.
Since any solid output always begins with a keyword,
.I ppnl
refuses to print a newline until a keyword is printed.
.IP "ppid(s)"
.HP "\*bchar\fP *s;"
.br
Prints the identifier
.I s
or `{identifier}' if
.I s
is null because of a syntax error.
.IP "ppop(s)"
.HP "\*bchar\fP *s;"
.br
Prints operators.
.IP "ppnumb(s)"
.HP "\*bchar\fP *s;"
.br
Prints numbers.
.IP "ppstr(s)"
.HP "\*bchar\fP *s;"
.br
Prints strings, dealing with doubling of the string metacharacter.
.IP "pplab(s)"
.HP "\*bchar\fP *s;"
.br
Prints a label.
.IP "ppbra(s)"
.HP "ppket(s)"
.HP "ppsep(s)"
.HP "\*bchar\fP *s;"
.br
These routines
are used to indicate the recursive structure of the input
at the microscopic level.
Thus, when printing out the argument list to a
.B procedure
or
.B function,
the opening `(' would be printed with
\fIppbra\fP,
each separating `,' with
\fIppsep\fP,
and the closing `)' with
\fIppket\fP.
This would, conceivably, allow this cluster to break the output sensibly
to conform with the nature of the output medium without having
to deal with the passed-through data at a more difficult macroscopic
level.
.IP "ppspac()"
.HP "pptab()"
.HP "ppitem()"
.HP "ppnl()"
.br 
These routines are all used to put separation into the output stream.
In the initial design, the difference between the routines
.I ppitem
and
.I ppnl
was that
.I ppnl
would always force a new-line, while
.I ppitem
separated major units such as statements, but didn't require the
following statement to start on a new line, leaving the possibility
that it be placed on the same line.
In the current implementation, however, both
.I ppitem
and
.I ppnl
force the output to be on a new line.
Note that forcing the output to a new line does not force the
leading white space to be printed!
.RE
.PP
The utility routines which don't deal directly with
the printing of the actual program text follow.
.IP
.RS
.HP "setprint()"
.br
Is called at the beginning of each
.B procedure
or
.B function
body and examines the global environment and option flags
to decide whether that routine should be printed in the profile.
.RE
.IP
.RS
.HP "printon()"
.br
Turns the printing on.
If profiling is notoccuring, a summary table is desired
then this actually turns the printing off!
If neither profiling or a summary table is being producd,
then this call has no effect.
.IP "printoff()"
.br
Turns the printing of the profile off.
.IP "indent()"
.br
If printing, ask
.I linopr
to print the line number.
If producing a profile (rather than just a pretty-print)
indent over
.I
pplev[PRFN] + pplev[STAT]
.R
spaces, and then, depending on the result of a call to
.I shudpcnt
print either a count, some dashes and a bar, just a bar,
or spaces.
Finally, indent
.I pplev[DECL]
more spaces and return.
.IP "dashes(c)"
.HP "\*bchar\fP c;"
.br
Spaces over an amount determined by the indenting level using
the character given.
.IP "indent1(in)"
.HP "\*bint\fP in;"
.br
Actually advance the indent by
.I in
spaces.
If pretty-printing
it is safe to optimize the output by producing tabs,
otherwise spaces must be used.
.IP "linopr()"
.br
Prints the line number if required.
.IP "indentlab()"
.br
Indents for a label print,
.I pplev[PRFN]
spaces using
\fIindent1\fP.
.IP "ppgoin(lv)"
.HP "ppgoout(lv)"
.br
These routines each take one of PRFN, STAT or DECL and increase
or decrease the indentation at that level respectively.
.RE

2.3. Producing the profile


.PP
It should be obvious from the discussion above, that little
difference can be discerned at the top level between producing
a profile and prettyprinting.
When prettyprinting, a large
number of routine calls return without doing any real work.
.PP
The production of the profile is discussed at each of four levels.
The first level is the main routine and option processing.
This is discussed only to give an outline of the work done here.
.PP
The second level is the level of procedures and functions and
involves such considerations as the 
.B z
command line option with list of
.B procedure
and
.B function
names, forward declarations, and the recording, saving and
restoring of count information.
.PP
The third level is that of individual statements, and the final
level is that of expressions.  These levels are illustrated
with actual code to make the discussion more concrete.

2.3.1. Main routine and option processing


.PP
The main routine sets up the default options such as setting
the indenting to 4 spaces and turning on nested
.B procedure
and
.B function
indents.
It then examines the arguments and,
importantly, sets the variable
.I profile
if profiling and
.I table
if producing a table of procedure and function counts.
.PP
If a list of
.B procedure
and
.B function
names is given it is saved as referenced by the variables
.I pflist
and
.I pflstc
for examination by the routine
.I inpflist
which is called by routines at
.B procedure
and
.B function
entry and exit.
.PP
After processing all the options, the main routine makes a call
to the
.I pmon.c
cluster to get the profile data if appropriate.
It sets up the input and does some special processing for processing
of
.I include
files as described further in section 2.4 below.
It finally calls the parser which completes the processing of the profile.

2.3.2. Procedure and function level processing


.PP
The
.B procedure
and 
.B function
level processing routines are contained in the file
\fIfdec.c\fP.
The routines here and their functions are:
.IP
.RS
.HP "funchdr(r)"
.HP "\*bint\fP *r;"
.br
Called with a tree node representing the initial declaration
of a function or procedure, e.g.:
.LS
\*bfunction\fP sine(a: angle): \*breal\fP;
.LE
this routine first determines if the routine is in the list
of procedures and functions given on the command line.
If it is, then the
.B z
option value is saved on a stack,
and then turned on.
It then gets the counter associated with the procedure header
and calls a routine in the print cluster to determine whether this
routine should be printed or not.
.IP
It then saves the count information for this routine at its level
in a global array of
.I pxcnt
structures called
\fIpfcnts\fP.
This counter will be restored later when the body of the routine
is encountered.
The printing of the header is also processed here, but this is
similar to other processing to be described later.
This printed header is indented if nested definitions are being indented,
to be unindented after completing the printing of the header.
.RE
.IP
.RS
.HP "funcfwd(fp)"
.HP "\*bchar\fP *fp;"
.br
This routine prints the keyword
.B forward
indented an appropriate amount.
It returns its argument.
.IP "funcbody(fp)"
.HP "\*bchar\fP *fp;"
.br
This routine, called when the actual, resolving declaration of
a
.B procedure
or
.B function
is encountered, indents if the indent nested procedures option
is set, and increments the structured statement level, returning
its argument.
.IP "funcend(fp, blk)"
.HP "\*bchar\fP *fp;"
.HP "\*bint\fP *blk;"
.br
This routine sets up for all of the actual work in printing the body
of procedures and functions.
It restores the count saved by
.I funchdr
from the
.I pfcnts
array,
.I unprints
the count to force it to come out again if there
were any nested procedures or functions, (if the last block number in the
variable
.I lastbn
is not the current block number)
and then prints the body of the procedure or function.
.IP
To print the body, it
\fIindents\fP,
prints the keyword
.B begin,
and if there were nested sections prints
the name of this routine in a comment.
It then goes in a level in DECL (without shifting the bar over!)
and prints the statement list given by the parameter
\fIblk\fP.
.RE
.PP
This is an appropriate place to note the following important fact:
When a routine is called to put out an item at the statement level,
the
.I "output cursor"
is usually at the end of the previous line, and if the routine
wants to print on a clean line, then it calls
.I ppnl
before it begins.
If it is willing to print on the same line then it can call
\fIppitem\fP.
It also turns out that this structure is critical for the processing
of comments which is described in section 3.
A delay in printing
the new line character allows the comment processing to function
correctly.
.PP
Thus, after the routine
.I statlist
processes the parameter
.I blk
the rotuines
\fIppnl\fP,
an
\fIindent\fP,
are called and the keyword
.B end
is printed followed by the routine name in a comment and a final
.I ;
or
.I \&.
character.
Finally unwind from any indenting that may have been done due
to the indent nested sections option occurs, and the
.B z
option stack is popped if this routine was in the command line
.B procedure
and
.B function
list.

2.3.3. Statement processing


.PP
The statement level processing is done by the routines in the file
.I stat.c
and for case statements by the code in the file
\fIcase.c\fP.
As noted above, the cursor for each statement is generally left
on the previous line so that a statement will ask for the cursor
to be advanced to the next line if so desired.
This is also necessary to make the placement of
.B begin
and
.B end
pairs work as desired.
The basic loop for processing a group of statements is as follows:
.LS
statlist(sl)
        \*bregister\fP \*bint\fP *sl;
{
        \*bif\fP (sl != NIL)
                \*bfor\fP (;;) {
                        statement(sl[1]);
                        sl = sl[2];
                        \*bif\fP (sl == NIL)
                                \*bbreak\fP;
                        ppsep(";");
                }
        \*belse\fP
                statement(NIL);
}
.LE
This is quite simple.
A pointer to a tree structure is received, treated as an array
of integers.
The first word of each such node is a magic number giving the
kind of node, the second word points to a statement and the third
word links to the next such node, or NIL if this is the last node
in the statement list.  This is more fully described in [1],
the nodes here being of type T_LISTPP.
.PP
To illustrate the processing for statements in general,
a subset of the code from the
.I stat.c
file comprising that for
.B if
statements, assignment statements and
.B "begin-end"
blocks within
.B if
statements follows.
This is illustrative of the work here in general.
.so stat.c
.PP
.I Statement
receives as argument a pointer to a tree node.
For the purposes of this discussion, assume that this node is either of type
T_IF, T_IFEL, or T_ASGN.
The node type T_IFEL was added to the parser because
of the problematic case of empty
.B else
clauses.
When these empty clauses are present, it is impossible
to present the data for both
.B if
statements without
.B else
and with
.B else
parts in the one T_IF structure.
An
.B if
statement without an
.B else
part looks the same as an
.B if
statement with an empty
.B else
part in this case.
This is a problem because
.I pxp
does not realize that the discarding of such empty
.B else
parts can affect the matching of outer
.B else
clauses with
.B if
parts and alter the meaning of the program!
.PP
Now, if the argument to
.I statement
is a NIL pointer, \fInull\fR is printed,
a call on a built-in procedure that does nothing.
The fact that a
.I ppitem
rather than a
.I ppnl
is done here indicates that it does not matter if this is on the same line
with something else.
If the argument pointer is not NIL a switch is done based
on which type of statement is involved.
.PP
For assignments, as for the \fInull\fR above,
the statement can appear on the same line with something else.
Thus
.I ppitem
is called.
To print an assignment first an
.I lvalue
(essentially a variable)
and then an
.I rvalue
(an expression)
must be printed, separated by a `:='.
This is as simple as knowing which fields to pass to
procedures
.I lvalue
and
.I rvalue .
.PP
Note that
.I rvalue
here takes two arguments.
The second argument is a flag indicating whether the
expression is to be surrounded by parentheses if
it is not-atomic.
Since no ambiguity can possibly result, no parenthesis are required here.

2.3.4. If-then-else's


.PP
It is required that an
.I if
statement appear on a separate line by calling
.I ppnl .
The routine
.I ifop
begins by printing:
.LS
\*bif\fP  \*bthen\fP_
.LE
where the `_' will be used to represent the invisible output cursor.
The expression here also need not be parenthesized.
.I Ifop
then saves the count before the statement by calling
\fIsavecnt\fP,
and gets the count for the
.I then
part of the statement by calling
\fIgetcnt\fP.
The
.B then
part can now be handled.
.PP
If the
.B then
part is a
.B begin\-end
block, it can be started on this line.
Thus, in this case, the routine
.I ppstbl1
can be called with the
.I then
part as argument.
It is also passed STAT, indicating that the indent it will do
is to be reflected in the position of the bars on the left.
(For
.B with
statements among others, the bars do not move over
the text is indented.)
Now
.I ppstbl1
prints out the keyword 
.I begin
and does the indent discussed above, but does not put out a newline.
That is up to the routine
.I statement
which will be called by
\fIstatlist\fP.
The
.B end
is not put out yet because the counter for the
.B else
part belongs on the line with
it if there is an
.B else
part and that count has not been set up yet.
.PP
If the statement in the
.B then
part is not a
block then the processing here is much simpler;
it is only necessary to indent a level in the STAT part as a block,
call 
.I statement
and then unindent the level.  Thus a typical
position after this part is completed for a
.B then
part which is a block would be:
.LS
\*bif\fP  \*bthen\fP \*bbegin\fP
        stat1;
        stat2_
.LE
with the cursor convention as before.
.PP
If this
.B if
statement does not have an
.B else
part the rest of the processing is simple.
The count saved before the statement is restored, doing a
.I getcnt
if there were one or more
.B goto
statements in the body of the
statement to get the count after the statement.
If the
.B then
part was a block,
.I ppstbl2
is called to put the keyword
.B end
indented on a new line with the restored count.
This finishes the processing in this case.
.PP
If the statement has an
.B else
part then the count for this part must first be calculated.
This is done by taking the
.I cntof
the saved structure from before the
.B if
statement and subtracting the current count which is that of the
.B then
part.
This becomes the new counter generated for the
.B else
part.
Processing of the
.B then
part is then finished either by printing a
.B end
with
.I ppstbl2
as above, followed by a space,
or by forcing a new line in the output and calling
.I indent.
The keyword
.B else
is printed in the output and followed with a space.
.I Unprint
is called so that the count for an
.B else
part prints not only with the
.B else
but also on the statement in the
.B else
part.
This significantly improves the readability of bushy if-then-elses.
.PP
The special case of an empty statement in the
.B else
part is caught and a null statement is put out in this case.
Otherwise a check is made to see if the
.B else
part is a block, and if so, handled in a manner identical to the processing
for the 
.B then
part.
Nested
.B if
statements are then caught, and are recursive call is made without
doing any indentation.
This accomplishes the goal of not ``wandering across the page''
in if-then-else statements.
.PP
Noting that normal statements are treated as before,
the
.B else
part has been successfully completed except for some cleanup.
This cleanup involves restoration of the count and printing of the
keyword
.B end.
Note that the pre
.B if
statement count must be unprinted whenever an
.B else
occurs.
This is because of the way
.B else
parts are printed, lined up with the
.B if
parts and obstructing their counts.

2.3.5. Expression processing


.PP
The final part of the profiling process to be discussed
is the printing of expressions.  This is quite simple,
actually, with the only interesting code being that which
determines what parenthesization to do.  This happens as follows.
.PP
Whenever a binary operator is encountered each of its operands
is processed in turn.
The following cases are from the
.B switch
in the routine
.I rvalue
and indicates the code for processing binary operators.
.LS
\*bcase\fP T_AND:
\*bcase\fP T_OR:
\*bcase\fP T_MULT:
\*bcase\fP T_ADD:
\*bcase\fP T_SUB:
\*bcase\fP T_DIVD:
\*bcase\fP T_MOD:
\*bcase\fP T_DIV:
\*bcase\fP T_EQ:
\*bcase\fP T_NE:
\*bcase\fP T_GE:
\*bcase\fP T_LE:
\*bcase\fP T_GT:
\*bcase\fP T_LT:
\*bcase\fP T_IN:
        al = r[2];
        rvalue(al, prec(al) < prec(r) || opt('f'));
        ppspac();
        \*bif\fP (alph(opname[0]))
                ppkw(opname);
        \*belse\fP
                ppop(opname);
        ppspac();
        al = r[3];
        rvalue(al, prec(al) <= prec(r) || opt('f'));
        \*bbreak\fP;
.LE
.PP
The routine
.I prec
returns an integer representing the precedence of the
operator given.
It is defined by:
.LS
prec(r)
        \*bregister\fP \*bint\fP *r;
{
        \*bif\fP (r == NIL)
                \*breturn\fP;
        \*bswitch\fP (r[0]) {
                \*bcase\fP T_NOT:
                        \*breturn\fP (3);
                \*bcase\fP T_MULT:
                \*bcase\fP T_DIVD:
                \*bcase\fP T_DIV:
                \*bcase\fP T_MOD:
                \*bcase\fP T_AND:
                        \*breturn\fP (2);
                \*bcase\fP T_ADD:
                \*bcase\fP T_SUB:
                \*bcase\fP T_OR:
                        \*breturn\fP (1);
                \*bdefault\fP:
                        \*breturn\fP (0);
        }
}
.LE
Thus, with a binary operator,
parentheses are needed around the left operand if it is non-atomic
and its operator has lower
.I prec
than the current operator.
Parentheses are needed around the right operand if it is non-atomic
and its operator has fewer or the same
.I prec
as the current operator.
This equality condition reflects Pascal associativity.

2.4. Multiple file processing


.PP
.I Pxp
has the capability of handling
.I include
files.
Just as programs to
.PI
may be split into many pieces, they may be so split and then
processed by
\fIpxp\fP.
In addition a capability for pretty printing of one piece of
a program has been included in
\fIpxp\fP.
If
.I pxp
is pretty printing and the file being pretty printed
has a name ending in 
.I \&.i
then
.I pxp
will place the line
.LS
\*bprogram\fP x(output);
.LE
into the source stream before the first line of the file
and the line
.LS
\*bbegin\fP \*bend\fP.
.LE
after the last line.
In this way, if the contents of the
.I include
corresponds to a global declaration part and/or
a group of procedures and functions, the pretty print
can proceed without modifications to the source text.
This is the only case in which the pretty printer will
take anything other than a complete Pascal program.
.I Pxp
suppresses printing back out the inserted and appended lines
in this case.
.PP
.Xp
does not normally process
.I include
directives but rather prints them back out as includes.
An option to eliminate the includes is also available.
.bp