\input texinfo @c -*-texinfo-*- @c %**start of header @setfilename avram.info @settitle avram - a virtual machine code interpreter @finalout @setchapternewpage odd @c %**end of header @set VERSION 0.13.0 @ifinfo This file documents the @code{avram} command which is a virtual machine code interpreter Copyright (C) 2000, 2003, 2006-2010, 2012 Dennis Furey Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. @ignore Permission is granted to process this file through TeX and print the results, provided the printed document carries copying permission notice identical to this one except for the removal of this paragraph (this paragraph not being relevant to the printed manual). @end ignore Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation. @end ifinfo @titlepage @title Avram @subtitle a virtual machine code interpreter @subtitle for avram Version @value{VERSION} @author by Dennis Furey @page @vskip 0pt plus 1filll Copyright @copyright{} 2000, 2003, 2006-2010, 2012 Dennis Furey Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. 
Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the Free Software Foundation. @end titlepage @shortcontents @contents @c All the nodes can be updated using the EMACS command @c texinfo-every-node-update, which is normally bound to C-c C-u C-e. @node Top, Preface, (dir), (dir) @ifinfo This file documents @code{avram} version @value{VERSION}, which is a virtual machine code interpreter. @end ifinfo @c All the menus can be updated with the EMACS command @c texinfo-all-menus-update, which is normally bound to C-c C-u C-a. @menu * Preface:: project aims and scope * User Manual:: command line options and usage * Virtual Machine Specification:: a guide for compiler writers * Library Reference:: how to reuse or enhance @code{avram} * Character Table:: representations for ASCII characters * Reference Implementations:: constructive computability proofs * Changes:: recent updates to the manual * External Libraries:: specifications and calling conventions * Copying:: license terms * Function Index:: for the shared library API * Concept Index:: @end menu @node Preface, User Manual, Top, Top @unnumbered Preface @code{avram} is a virtual machine code interpreter. It reads an input file containing a user-supplied application expressed in virtual machine code, and executes it on the host machine. The name is a quasi-acronym for ``Applicative ViRtuAl Machine''. 
Notable features are @cindex functional programming @cindex environment @cindex Unix @itemize @bullet @item strong support for functional programming operations (e.g., list processing) @item interfaces to selected functions from mathematical libraries, such as @itemize @bullet @item @code{gsl} (numerical integration, differentiation, and series acceleration) @url{http://www.gnu.org/software/gsl/} @item @code{mpfr} (arbitrary precision arithmetic) @url{http://www.mpfr.org} @item @code{minpack} (non-linear optimization) @url{http://www.netlib.org/minpack} @item @code{lapack} (linear algebra) @url{http://www.netlib.org/lapack} @item @code{fftw} (fast fourier transforms) @url{http://www.fftw.org} @item @code{Rmath} (statistical and transcendental functions) @url{http://www.r-project.org} @item @code{ufsparse} (sparse matrices) @url{http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/SuiteSparse/} @item @code{glpk} (linear programming by the simplex method) @url{http://www.gnu.org/software/glpk} @item @code{lpsolve} (mixed integer linear programming) @url{http://sourceforge.net/projects/lpsolve/} @item @code{kinsol} (constrained non-linear optimization) @url{http://www.llnl.gov/CASC/sundials/} @end itemize @item interoperability of virtual code applications with other console applications or shells through the @code{expect} library @item a simple high-level interface to files, environment variables and command line parameters @item support for various styles of stateless or persistent stream processors (a.k.a. Unix filters) @end itemize The reason for writing @code{avram} was that I wanted to do some work using a functional programming language, didn't like any functional programming languages that already existed, and felt that it would be less trouble to write a virtual machine emulator than the back end of a compiler. 
As of version 0.1.0, the first public release of @code{avram} as such in 2000, most of the code base had been in heavy use by me for about four years, running very reliably. At this writing, some six years later, it has seen even more use with rarely any reliability issues, in some cases attacking large combinatorial problems for weeks or months at a time. These problems have involved both long-running continuous execution and batches of thousands of shorter jobs. Although the virtual machine is biased toward functional programming, it is officially language agnostic, so @code{avram} may be useful to anyone involved in the development of compilers for other programming, scripting, or special purpose languages. The crucial advantage of using it in your own project is that rather than troubling over address modes, register allocation, and other hassles inherent in generating native code, your compiler can just dump a fairly high level intermediate code representation of the source text to a file, and let the virtual machine emulator deal with the details. The tradeoff for using a presumably higher level interpreted language is that the performance is unlikely to be competitive with native code, but this issue is mitigated in the case of numerical applications whose heavy lifting is done by the external libraries mentioned above. Portability is an added bonus. The virtual code is binary compatible across all platforms. Versions of @code{avram} from 0.1.0 onward are packaged using GNU autotools and should build on any platform supporting them. In particular, the package is known to have built successfully on MacOS, FreeBSD, Solaris (thanks to the compile farm at Sourceforge.net), Digital Unix, and Debian GNU/Linux for i386 and Alpha platforms, although it has not been extensively tested on all of them. Earlier versions were compiled and run successfully on Irix and even Windows-NT (with @command{gcc}).
This document is divided into three main parts, with possibly three different audiences, but they all depend on a basic familiarity with @cindex Unix Unix or GNU/Linux systems. @table @asis @item @ref{User Manual} essentially reproduces the information found in the manpage that is distributed with @code{avram} with a few extra examples and longer explanations. Properly deployed, @code{avram} should be almost entirely hidden from end users by wrapper scripts, so the ``users'' to whom this part is relevant would be those involved in preparing these scripts (a matter of choosing the right command line options). Depending on the extent to which this task is automated by a compiler, that may include the compiler writer or the developers of applications. @item @ref{Virtual Machine Specification} documents much of what one would need to know in order to write a compiler that generates code executable by @code{avram}. That includes the complete virtual machine code semantics and file formats. It would also be possible to implement a compatible replacement for @code{avram} from scratch based on the information in this chapter, in case anyone has anything against C, my coding style, or the GPL. (A few patches to make it @command{lint} cleanly or a new implementation in good pedagogical Java without pointers would both be instructive exercises. ;-)) @cindex pointers @cindex Java @item @ref{Library Reference} includes documentation on the application program interface and recommended entry points for the C library distributed with @code{avram}. This information would be of use to those wishing to develop applications incorporating similar features, or to reuse the code for unrelated purposes. It might also be useful to anyone wishing to develop C or C++ applications that read or write data files in the format used by @code{avram}. 
@end table @node User Manual, Virtual Machine Specification, Preface, Top @chapter User Manual This chapter provides the basic information on how to use @code{avram} to execute virtual machine code applications. @code{avram} is invoked by typing a command at a shell prompt in one of these three forms. @display @kbd{avram} [@emph{general options}] @kbd{avram} [@emph{filter mode options}] @var{codefile}[@kbd{.avm}] @kbd{avram} [@emph{parameter mode options}] @var{codefile}[@kbd{.avm}] [@emph{parameters}] @end display @noindent In the second case, @code{avram} reads from standard input, and may of course appear as part of commands such as @cindex standard input @display @kbd{avram} [@emph{filter mode options}] @var{codefile}[@kbd{.avm}] < @var{inputfile} @var{anothercommand} | @kbd{avram} [@emph{filter mode options}] @var{codefile}[@kbd{.avm}] @end display @noindent When @code{avram} is invoked with the name of an input file (with a default extension @kbd{.avm}), it reads virtual machine code from the file and executes it on the host machine. @cindex functional programming The virtual code format used by @code{avram} is designed to support the features of functional or applicative programming languages. Although this chapter documents only the usage of @code{avram} and not the internals, it will be helpful to keep in mind that the virtual machine code expresses a mathematical function rather than a program in the conventional sense. As such, it performs no action directly, but may be applied in a choice of ways by the user of @code{avram} according to the precise operation required. The following sections provide information in greater detail about usage and diagnostics. 
@menu * General Options:: getting help and version information * Modes of Operation:: stream processing or file oriented * Filter Mode Options:: how to run a stream processor * Parameter Mode Options:: how to have an application use files * Command Line Syntax:: application-independent conventions * Diagnostics:: explanation of error messages * Security:: running untrusted applications * Example Script:: how to unburden the end users * Files:: miscellaneous files used * Environment:: environment variables * Bugs:: hall of shame @end menu @node General Options, Modes of Operation, User Manual, User Manual @section General Options Regardless of what other command line parameters are given, @code{avram} accepts the following parameters: @cindex help @cindex emulation @cindex @code{help} command line option @cindex @code{emulation} command line option @table @code @item -h, --help Show a summary of options and exit. @item -V, -v, --version Show the version of the program and a short copyleft message and exit. @item --emulation=@var{version} Be backward compatible with an older version of @code{avram}. This option should include a valid version number, for example @kbd{@value{VERSION}}, which is the version of @code{avram} to be emulated. It can make virtual code applications future proof, assuming that future versions of @code{avram} correctly support backward compatibility. It may be used in conjunction with any other option in any mode of operation. @cindex web page @cindex home page @cindex url This copy of the user manual has not been updated since version @value{VERSION} of @code{avram}, so it is unable to document incompatibilities with later versions. The latest version of the manual may be found at @url{http://www.lsbu.ac.uk/~fureyd/avram}. @item -e, --external-libraries @cindex @code{external-libraries} Show a list of libraries with which @code{avram} has been linked and whose functions therefore could be called from virtual machine programs.
This growing list currently includes selected functions from @code{fftw}, @code{glpk}, @code{gsl}, @code{kinsol}, @code{lapack}, @code{minpack}, @code{mpfr}, @code{lpsolve}, @code{Rmath} and @code{ufsparse} (see @ref{Preface}) which are documented further in @ref{External Libraries}. @item -j, --jail @cindex @code{jail} This option disables execution of shell commands by virtual code applications, which is normally possible by default even for nominally non-interactive applications (see @ref{Parameter Mode Options}). A virtual code application attempting to spawn a shell (using the @code{interact} combinator) when this option is selected will encounter an exception rather than successful completion of the operation. This option is provided as a security feature for running untrusted code (see @ref{Security}), and is incompatible with @option{-i}, @option{-t}, and @option{-s}. @item -f, --force-text-input @cindex @code{force-text-input} command line option Normally @code{avram} will try to guess by looking at a file whether it is an ordinary text file or one that has been written in the virtual code file format, and choose a different internal representation accordingly. An application may require one representation or the other. This option tells @code{avram} to treat all input files other than the virtual code file (named in the first command line parameter) as text files regardless of whether or not it would be possible to interpret them otherwise. This option may be used in combination with any other option. @end table @node Modes of Operation, Filter Mode Options, General Options, User Manual @section Modes of Operation Apart from the capability to print brief help messages and exit, there are two main modes of operation, depending on which options are specified on the command line before the virtual code file name.
@cindex modes @cindex modes For the purpose of choosing the mode of operation, the virtual code filename is taken to be the first command line argument not beginning with a dash. Other conventions relevant to application specific parameters are detailed in @ref{Command Line Syntax}. @menu * Filter Mode:: * Parameter Mode:: @end menu @node Filter Mode, Parameter Mode, Modes of Operation, Modes of Operation @subsection Filter Mode @cindex filter mode @cindex modes In filter mode, the argument to the function given by the virtual code is taken from standard input, and the result is written to standard output, except for error messages resulting from a failure to evaluate the function, which are written to standard error. @xref{Diagnostics}. Filter mode is indicated whenever these three conditions are all met. @itemize @bullet @item Either at least one of the filter mode options appears on the command line preceding the first filename parameter, or there are no options at all. @xref{Filter Mode Options}. @item Exactly one filename parameter appears on the command line, which is the name of the virtual machine code file. @item Either the filename comes last on the command line, or the @option{--unparameterized} option precedes it, causing everything following it to be ignored. @end itemize @noindent Examples: @table @kbd @item avram mynewapp < inputfilename @cindex standard input In this example, filter mode is recognized by default because there are no options or input files on the command line to indicate otherwise. (The input file redirected into standard input is not treated by the shell as a command line argument.) @item cat somefile | avram -r coolprog > outputfile In this example, the @option{-r} option gives it away, being one of the filter mode options, in addition to the fact that there are no input file parameters or application-specific options. 
@item avram -u devilmaycare.avm --bogusoption ignoredparameter In this case, filter mode is forced by the @option{-u} option despite indications to the contrary. @end table @node Parameter Mode, , Filter Mode, Modes of Operation @subsection Parameter Mode @cindex parameter mode In parameter mode, the argument to the function given by the virtual code is a data structure containing environment variables and command line parameters including files, application specific options, and possibly standard input. The result obtained by evaluating the function is either a data structure representing a set of files to be written, which may include standard output, or a sequence of shell commands to be executed, or a combination of both. Parameter mode is indicated whenever either of these conditions is met. @itemize @bullet @item Any of the parameter mode options appears on the command line preceding the first filename parameter. @ifinfo @* @end ifinfo @xref{Parameter Mode Options}. @item At least one additional filename parameter or option follows the first filename parameter, and the option @option{--unparameterized} does not precede it. @end itemize @noindent Examples: @table @kbd @item avram --map-to-each-file prettyprinter.avm *.c *.h --extra-pretty In this example, parameter mode is indicated both by the parameter mode option @option{--map-to-each-file} and by the presence of input file names and the @option{--extra-pretty} option. The latter is specific to the hypothetical @code{prettyprinter.avm} virtual code application, as indicated by its position on the command line, and is therefore passed to it by @code{avram}. @item cat ~/specfile | avram reportgenerator -v - /var/log/syslog In this example, a hypothetical parameter mode application @code{reportgenerator} is able to read @file{~/specfile} from standard input because of the @code{-} used as a parameter. 
@item avram --parameterized grepenv In this example, a hypothetical application that searches shell variables is invoked in parameter mode even with no input files or application specific options, because of the @option{--parameterized} option. Parameter mode invocation is required by the application to give it access to the environment. @item avram grepenv --search-targets=PATH,MANPATH This example shows an application specific option with both a keyword and a parameter list. They suffice to indicate parameter mode without an explicit @option{--parameterized} option. @end table @node Filter Mode Options, Parameter Mode Options, Modes of Operation, User Manual @section Filter Mode Options The options available in filter mode are listed below. Except as otherwise noted, all options are mutually exclusive. Ordinarily a given application will require certain fixed settings of these options and will not work properly if they are set inappropriately. @table @code @item -r, --raw-output @cindex @code{raw-output} command line option Normally the result obtained by evaluating the function in the virtual code file must be a list of character strings, which is written as such to standard output. However, if this option is selected, the form of the result is unconstrained, and it will be written in a data file format that is not human readable but can be used by other applications. This option is incompatible with any other options except @option{-u}. @item -c, --choice-of-output @cindex @code{choice-of-output} command line option When this option is used, the evaluation of the function given by the virtual machine code will be expected to yield a data structure from which @code{avram} will ascertain whether standard output should be written in text or raw data format. This option should be used only if the application is aware of it. It is incompatible with any other options except @option{-u}.
@item -l, --line-map @cindex @code{line-map} command line option Normally the entire contents of standard input up to @code{EOF} are loaded into memory and used as the argument to the function in the virtual code file. However, this option causes standard input to be read a line at a time, with the function applied individually to each line, and its result in each case written immediately to standard output. A given application either requires this option or does not, and will not work properly in the alternative. This option implies @option{--force-text-input} and is incompatible with any other option except @option{-u}. @item -b, --byte-transducer @cindex @code{byte-transducer} command line option This option causes standard input to be read one character at a time, evaluating the function given by the virtual code file each time. The function is used as a state transition function that takes a state and input to a next state and output. The output is written concurrently with the input operations. A given application will not work properly with an inappropriate setting of this option. This option implies @option{--force-text-input} and is incompatible with any other option except @option{-u}. @item -u, --unparameterized @cindex @code{unparameterized} command line option Normally @code{avram} guesses whether to use filter mode or parameter mode depending on whether there are any parameters. Selecting this option forces it to operate in filter mode regardless. Any parameters that may appear on the command line after the virtual code file name are ignored. This option may be used in conjunction with any other filter mode option. @end table @node Parameter Mode Options, Command Line Syntax, Filter Mode Options, User Manual @section Parameter Mode Options @cindex parameter mode The parameter mode options are listed below. 
Except as otherwise noted, any combination of parameter mode options may be selected together, and except as noted, the settings of these options can be varied without breaking the application. @table @code @item -q, --quiet @cindex @code{quiet} command line option @code{avram} normally informs the user when writing an output file with a short message to standard output. This option suppresses such messages. This option is compatible with any application and any other parameter mode option except @option{-a}. @item -a, --ask-to-overwrite @cindex @code{ask-to-overwrite} command line option Selecting this option will cause @code{avram} to ask permission interactively before overwriting an existing file, and to refrain from overwriting it without permission, in which case the contents that were to be written will be lost. This option overrides @option{-q} and is compatible with any other parameter mode option or application. @item -.EXT @cindex @code{EXT} command line option @cindex default file extensions @cindex extensions @cindex file name extensions @cindex file name suffixes An option beginning with a dash followed by a period specifies a default extension for input file names. If @code{avram} doesn't find a file named on the command line, and the filename doesn't already contain a period, @code{avram} will try to find a file having a similar name but with the default extension appended. The default extension given by this option takes precedence over the hard coded default extensions of @kbd{.fun} and @kbd{.avm}. At most one default extension can be supplied. This option is compatible with any other parameter mode option and compatible with any application. 
@item -d, --default-to-stdin @cindex @code{default-to-stdin} command line option @cindex standard input If no filename parameter appears on the command line (other than the name of the virtual code file), this option directs @code{avram} to read the contents of standard input as if it were specified as a command line parameter. (Standard input can also be specified explicitly as a dash. See @ref{Command Line Syntax}.) This option is compatible with any application and any other parameter mode option except @option{-m}. @item -m, --map-to-each-file @cindex @code{map-to-each-file} command line option Normally @code{avram} loads the entire contents of all files named on the command line into memory so as to evaluate the virtual machine code application on all of them together. This option can be used to save memory in the case of applications that operate on multiple files independently. It causes @code{avram} to load only one file at a time and to perform the relevant evaluation and output before loading the next one. Application specific options and standard input (if specified) are read only once and reused. This option is incompatible with @option{-d}, and not necessarily compatible with all applications, although some may work both with and without it. @item -i, --interactive @cindex @code{interactive} command line option @cindex interactive applications This option is used in the case of applications that interact with other programs through shell commands. An application that is meant to be invoked in this way requires this option and will not work without it, nor will applications that are not of this type work with it. This option is implied by @option{-t} and @option{-s}, and is compatible with any other parameter mode option. @item -s, --step @cindex @code{step} command line option This option is used in the case of applications that interact with other programs through shell commands, similarly to @option{-i}, and can substitute for it (see above). 
The option has the additional effect of causing shell commands issued by @code{avram} on behalf of the application to be written with their results to standard output, and of causing @code{avram} to pause after displaying each shell command until a key is pressed. This capability may be useful for debugging or auditing purposes but does not otherwise alter the effects of the application. This option is compatible with any other parameter mode option. @item -t, --trace @cindex @code{trace} command line option This option is used in the case of applications that interact with other programs through shell commands, but only by way of the @code{interact} combinator, for which it provides developers a means of low level debugging, particularly deadlock detection. When this option is selected, a verbose trace of all characters exchanged between the functional transducer and the external application is written to standard output, along with some additional control flow diagnostics. This option is compatible with any other parameter mode option. @item -p, --parameterized @cindex @code{parameterized} command line option Normally @code{avram} tries to guess whether to operate in filter mode or parameter mode based on the options used and the parameters. If there are no parameters and no options, it will default to filter mode, and try to read standard input. However, if this option is selected, it will use parameter mode (and therefore not try to read standard input unless required). @end table @node Command Line Syntax, Diagnostics, Parameter Mode Options, User Manual @section Command Line Syntax @cindex command line The command line parameters that follow the virtual code file name when @code{avram} is used in parameter mode (@ref{Parameter Mode}) are dependent on the specific application.
However, all supported applications are constrained for implementation reasons to observe certain uniform conventions regarding their command line parameters, which are documented here to avoid needless duplication. @cindex shell @cindex file parameters @cindex input files The shell divides the command line into ``arguments'' separated by white space. Arguments containing white space or special characters used by the shell must be quoted or protected as usual. File names with wild cards in them are expanded by the shell before @code{avram} sees them. @code{avram} then extracts from the sequence of arguments a sequence of filenames and a sequence of options. Each option consists of a keyword and an optional parameter list. Filenames, keywords, and parameter lists are distinguished according to the following criteria. @enumerate @item An argument is treated as a keyword iff it meets these three conditions. @enumerate a @item It starts with a dash. @item It doesn't contain an equals sign. @item It doesn't consist solely of a dash. @end enumerate @item An argument is treated as a parameter list iff it meets these four conditions. @enumerate a @item It doesn't begin with a dash. @item It either begins with an equals sign or doesn't contain one. @item It immediately follows an argument beginning with a dash, not containing an equals sign and not consisting solely of a dash. @item At least one of the following is true. @enumerate @item It doesn't contain a period, tilde, or path separator. @cindex path separators @item It contains a comma. @item It can be interpreted as a C formatted floating point number. @end enumerate @end enumerate @item An argument is treated as an input file name iff it meets these four conditions. @enumerate a @item It doesn't begin with a dash. @item It doesn't contain an equals sign. @item It doesn't contain a comma. @item At least one of the following is true. @enumerate @item It contains a period, tilde, or path separator.
@item It doesn't immediately follow an argument beginning with a dash, not consisting solely of a dash, and not containing an equals sign. @end enumerate @end enumerate @item If an argument contains an equals sign but doesn't begin with one, the part on the left of the first equals sign is treated as a keyword and the part on the right is treated as a parameter list. @item An argument consisting solely of a dash is taken to represent the standard input file. @item An argument not fitting any of the above classifications is an error. @end enumerate These conventions are needed for @code{avram} to detect input file names in a general, position independent way, so that it can preload the files on behalf of the application. Many standard Unix utilities follow these @cindex Unix conventions to a large extent, the exceptions being those that employ non-filename arguments without distinguishing syntax, and use positional or other ad hoc methods of command line interpretation. A drop-in replacement for such an application could nevertheless be implemented using @code{avram} with an appropriate wrapper script, similar to the approach recommended in @ref{Example Script}, but with suitable keywords inserted prior to the ambiguous arguments. @node Diagnostics, Security, Command Line Syntax, User Manual @section Diagnostics @cindex diagnostics @cindex error messages @cindex run time errors A means exists for virtual code applications to have run time error messages written to standard error on their behalf by @code{avram}. Any error messages not documented here originate with an application and should be documented by it. Most error messages originating from @code{avram} are prefaced by the application name (i.e., the name of the file containing the virtual machine code), but will be prefaced by @code{avram:} if the error is caused by a problem loading this file itself.
Error messages originating from virtual code applications are the responsibility of their respective authors and might not be prefaced by the application name. The run time errors not specifically raised by the application can be classified as internal errors, i/o errors, overflow errors, file format errors, application programming errors, and configuration related errors. Some error messages include a code number. The number identifies the specific point in the source code where the condition was detected, for the benefit of the person maintaining it. @menu * Internal Errors:: * i/o Errors:: * Overflow Errors:: * File Format Errors:: * Application Programming Errors:: * Configuration Related Errors:: * Other Diagnostics and Warnings:: @end menu @node Internal Errors, i/o Errors, Diagnostics, Diagnostics @subsection Internal Errors @cindex internal errors Internal errors should never occur unless the @code{avram} source code has been carelessly modified, except as noted in @ref{Bugs}. There are two kinds. @table @code @item @var{application-name}: virtual machine internal error (code @var{nn}) Most internal errors would be reported by a message of this form if they were to occur. It indicates that some required invariant was not maintained. In such cases, the program terminates immediately, and any results already produced are suspect. @item @var{application-name}: @var{nn} unreclaimed @var{struct-names} A message of this form could be printed at the end of an otherwise successful run. @code{avram} maintains a count of the number of units allocated for various data structures, and checks that they are all reclaimed eventually as a safeguard against memory leaks. This message indicates that some memory remains unaccounted for. 
@end table @cindex bug reports @cindex email @cindex author If a repeatable internal error is discovered, please email a bug report and a small representative test case to @email{ursala-users@@freelists.org} or file an issue on the Avram github page. Include the version number of @code{avram}, which you can get by running @kbd{avram --version}. @node i/o Errors, Overflow Errors, Internal Errors, Diagnostics @subsection i/o Errors @cindex i/o errors These error messages are prefaced with the name of the application. A further explanation as to the @cindex @code{strerror} reason, obtained from the standard @code{strerror()} utility, is appended to the messages below if possible. @table @code @item @var{application-name}: can't read @var{filename} @cindex file names @cindex @code{can't read} @cindex environment @cindex @code{AVMINPUTS} A file was not able to be opened for reading, typically because it was not found or because the user does not have permission. The file name is displayed with special characters expanded but without any default extensions or search paths that may have been tried. If you think a file exists and should have been found, there may be a problem with your @env{AVMINPUTS} environment variable (@ref{Environment}). @item @var{application-name}: can't write @var{filename} @cindex @code{can't write} A file was not able to be opened for writing. @item @var{application-name}: can't write to @var{filename} A file was successfully opened for writing but became impossible to write thereafter. @item @var{application-name}: can't spawn @var{command} @cindex shell @cindex expect @cindex libexpect @cindex @code{can't spawn} @cindex @code{exp_popen} @cindex spawning processes An attempt to execute a shell command on behalf of an interactive application failed during the @code{exp_popen()} call to the @code{libexpect} library. 
@item @var{application-name}: can't close @var{filename} @cindex @code{can't close} A call to the standard C procedure @code{fclose()} failed due to unforeseen circumstances. The error is non-fatal but the file should be checked for missing data. @end table @node Overflow Errors, File Format Errors, i/o Errors, Diagnostics @subsection Overflow Errors These errors are reported by the application name prefacing one of the following messages, except as noted below. @cindex overflow @cindex counter overflow @cindex memory overflow @table @code @item @var{application-name}: counter overflow (code @var{nn}) An overflow occurred in an unsigned long integer being used as a reference counter or something similar. This situation is very unlikely. @item @var{application-name}: memory overflow (code @var{nn}) There wasn't enough memory to build an internal data structure. The most likely cause is an attempt to operate on input files that are too large. Standard remedies apply. @end table The memory overflow or counter overflow messages can also be reported without the application name preface or a code number. In these cases, they arise in the course of evaluating the function given by the application, rather than by loading the input files. A counter overflow in this case is possible if the application attempts to compute the size of a very large, shared structure using native integer arithmetic. @cindex @command{ulimit} Memory overflows are possible due to insufficient memory for a valid purpose, but may also occur due to a non-terminating recursion in the virtual machine code. To prevent thrashing or other bad effects from runaway code, the @command{ulimit} shell command is your friend. 
@node File Format Errors, Application Programming Errors, Overflow Errors, Diagnostics @subsection File Format Errors Certain application crashes result from an application not adhering to the required conventions about data and file formats, or from the application being invoked in the wrong mode (@ref{Modes of Operation}). These are the following. @table @code @item @var{application-name}: invalid text format (code @var{nn}) @cindex @code{invalid text format} An application that was expected to return a string of characters to be written to a text file returned data that did not correspond to any valid character representation. @item @var{application-name}: null character in prompt An interactive application (invoked rightly or wrongly with @option{-i}, @option{-t}, or @option{-s}) is required to exchange strings of non-null characters internally with @code{avram}, but used a null. @item @var{application-name}: invalid file name (code @var{nn}) The data structure representing a file obtained from an application has a name consisting of something other than character strings. This error could be the result of a filter mode application (@ref{Filter Mode}) being invoked in parameter mode @ifinfo @* @end ifinfo (@ref{Parameter Mode}). @item @var{application-name}: null character in file name @cindex @code{null character in file name} Similar to the above errors. @item @var{application-name}: bad character in file name @cindex @code{bad character in file name} Slashes, backslashes, and unprintable characters other than spaces are also prohibited in file names. @item @var{application-name}: invalid output preamble format @cindex @code{invalid output preamble format} According to the format used by @code{avram} for data files, a data file may contain an optional text portion, known as the preamble. This error occurs when a data file obtained from an application cannot be written because the preamble is something other than a list of character strings.
@item @var{application-name}: invalid file specification @cindex @code{invalid file specification} This error occurs in situations where the data structure for a file obtained by evaluating the application is too broken to permit any more specific diagnosis. @item avram: invalid raw file format in @var{application-name} @cindex @code{invalid raw file format} The file containing the virtual machine code was not able to be loaded, because the code was not in a recognizable format. Either the file has become corrupted, the compiler that generated it has a bug in it, or the wrong file was used as a virtual code file. @end table @node Application Programming Errors, Configuration Related Errors, File Format Errors, Diagnostics @subsection Application Programming Errors A further class of application crashes results from miscellaneous bugs in the application. These require the application to be debugged and have no user level explanation or workaround, but are listed here for reference. These messages are not normally prefaced by the application name when reported unless the application elects to do so, except for the @code{invalid profile identifier} message. 
@itemize @bullet @item @code{invalid recursion} @cindex @code{invalid recursion} @item @code{invalid comparison} @cindex @code{invalid comparison} @item @code{invalid deconstruction} @cindex @code{invalid deconstruction} @item @code{invalid transpose} @cindex @code{invalid transpose} @item @code{invalid membership} @cindex @code{invalid membership} @item @code{invalid distribution} @cindex @code{invalid distribution} @item @code{invalid concatenation} @cindex @code{invalid concatenation} @item @code{invalid assignment} @cindex @code{invalid assignment} @item @code{unrecognized combinator (code @var{nn})} @cindex @code{unrecognized combinator} @item @code{@var{application-name}: invalid profile identifier} @cindex @code{invalid profile identifier} @item @code{unsupported hook} @cindex @code{unsupported hook} @end itemize @node Configuration Related Errors, Other Diagnostics and Warnings, Application Programming Errors, Diagnostics @subsection Configuration Related Errors The source code distribution of @code{avram} incorporates a flexible configuration script allowing it to be installed on a variety of platforms. Not all platforms allow support for all features. It is also anticipated that new features may be added to @code{avram} from time to time. Some problems may therefore occur due to features not being supported at your site for either of these reasons. The following error messages are relevant to these situations. @table @code @item unsupported hook @cindex @code{unsupported hook} If it's not simply due to an application programming error (@ref{Application Programming Errors}) this message may be the result of trying to use an application that requires a newer version of @code{avram} than the one installed, even though applications should avoid this problem by checking the version number at run time. If this is the reason, the solution would be to install the latest version. 
@item @var{application-name}: I need avram linked with @var{foo}, @var{bar} and @var{baz}. @cindex @code{I need avram linked with} A message of this form indicates that a new installation may be needed. At this writing (11/11/1), @code{avram} may report this message with respect to @code{libexpect5.32}, @code{tcl8.3}, and @code{libutil} if any of the @option{-i}, @option{-t}, or @option{-s} options is used on a system where not all of these libraries were detected when @code{avram} was installed from a source distribution. (See @ref{Parameter Mode Options}.) Because @code{avram} is useful even without interactive applications, these libraries are not considered absolute prerequisites by the configuration script. @item avram: can't emulate version @var{version} @cindex @code{can't emulate version} @cindex @code{emulation} command line option @cindex versions @cindex backward compatibility The @option{--emulation=@var{version}} option won't work if the requested version is newer than the installed version, or if it is not a valid version number (@ref{General Options}). When that happens, this message is printed instead and @code{avram} terminates. @item avram: multiple version specifications @cindex @code{multiple version specifications} The @option{--emulation=@var{version}} option can be used at most once on a command line. This message is printed if it is used more than once. If you only typed it once and got this message, check your aliases and wrapper scripts before reporting a bug. @item avram: unrecognized option: @var{option-name} @cindex @code{unrecognized option} This message may mean that a command line option has been misspelled, or may be another sign of an obsolete version of @code{avram}. It will be followed by a usage summary similar to that of the @option{--help} option. (See @ref{General Options}.)
@item @var{application-name}: warning: search paths not supported @cindex search paths @cindex paths @cindex environment @cindex @code{AVMINPUTS} @cindex @code{search paths not supported} @cindex @file{argz.h} If the @file{argz.h} header file was not detected during configuration, @code{avram} will not be able to support search paths in the @env{AVMINPUTS} environment variable (@ref{Environment}). This message is a warning that the environment variable is being ignored. If the warning is followed by an i/o error @ifinfo @* @end ifinfo (@ref{i/o Errors}), the latter may be due to a file being in a path that was not searched for this reason. A workaround is to specify the full path names of all input files outside the current working directory. If you don't need search paths, you can get rid of this message by undefining @env{AVMINPUTS}. @end table @node Other Diagnostics and Warnings, , Configuration Related Errors, Diagnostics @subsection Other Diagnostics and Warnings @table @asis @item @code{avram: multiple -.EXT options; all but last ignored} @cindex @code{multiple -.EXT options} @cindex extensions @cindex file extensions @cindex file name extensions @cindex file name suffixes This message is written when more than one default extension is given as a command line parameter. At most one default extension is allowed. If more than one is given, only the last one is used. The error is non-fatal and @code{avram} will try to continue. If you need more than one default extension, consider using the hard coded default extensions of @file{.fun} and @file{.avm}, or hacking the shell script in which the @code{avram} command line appears. @item @code{@var{application-name}: empty operator} This message probably means that the virtual code file is corrupt or invalid.
@item usage summary @cindex @code{help} command line option For any errors in usage not covered by other diagnostics, such as incompatible combinations of options, @code{avram} prints a message to standard error giving a brief summary of options, similar to the output from @kbd{avram --help}. (See @ref{General Options}.) @end table @node Security, Example Script, Diagnostics, User Manual @section Security @cindex security A few obvious security considerations are relevant to running untrusted virtual code applications. These points are only as reliable as the assumption that the @code{avram} executable has not been modified to the contrary. @itemize @bullet @cindex filter mode @item The applications with the best protection from malicious code are those that run in filter mode, because they have no access to any information not presented to them in standard input, nor the ability to affect anything other than the contents of standard output (provided that the @code{--jail} command line option is used). The worst they can do is use up a lot of memory, which can be prevented with the @command{ulimit} command. Unfortunately, not all applications are usable in this mode. @item Parameter mode applications that do not involve the @option{-i}, @cindex parameter mode @cindex standard input @option{-t} or @option{-s} options are almost as safe (also assuming @code{--jail}). They have (read-only) access to environment variables, and to the files that are indicated explicitly on the command line. If standard input is one of the files (as indicated by the use of @code{-} as a parameter), the virtual code application may infer the current date and time. However, a parameter mode application may write any file that the user has permission to write. The @option{--ask-to-overwrite} option should be used for better security, or at least the @option{--quiet} option should not be used. The virtual code can neither override nor detect the use of these options. 
@item
Interactive parameter mode applications (those that use either the
@cindex interactive applications
@option{-i}, @option{-t} or @option{-s} options) are the least secure because they can execute arbitrary shell commands on behalf of the user. This statement also applies to filter mode and parameter mode applications where the @option{--jail} option is not used. Use of @option{--step} is preferable to @option{-i} for making an audit trail of all commands executed, but the application could probably subvert it. The @option{--step} option may be slightly better because it can allow the user to inspect each command and interrupt it if appropriate. However, in most cases a command will not be displayed until it is already executed. Commands executed by non-interactive applications normally will display no output to that effect. A @command{chroot} environment may be the only secure way of running untrusted interactive applications.
@end itemize

@node Example Script, Files, Security, User Manual
@section Example Script
@cindex script
@cindex shell script

It is recommended that the application developer (or the compiler) package virtual machine code applications as shell scripts with the @code{avram} command line embedded in them. This style relieves the user of the need to remember the appropriate virtual machine options for invoking the application, which are always the same for a given application, or even to be aware of the virtual machine at all.

@cindex @code{cat}
@cindex @code{default-to-stdin} command line option
Here is a script that performs a similar operation to the standard
@cindex Unix
Unix @command{cat} utility.

@example
#!/bin/sh
#\
exec avram --force-text-input --default-to-stdin "$0" "$@@"
sKYQNTP\
@end example

@noindent
That is, it copies the contents of a file whose name is given on the command line to standard output, or copies standard input to standard output if no file name is given.
This script can be marked executable @cindex executable files (with @command{chmod}) and run by any user @cindex @code{chmod} @cindex paths with the directory of the @code{avram} executable in his or her @code{PATH} environment variable, even if @code{avram} had to be installed in a non-standard directory such as @cindex non-standard installation @file{~/bin}. The idea for this script is blatantly lifted from the @command{wish} @cindex @code{wish} manpage. The first line of the script invokes a shell to process what follows. The shell treats the second line as a comment and ignores it. Based on the third line, the shell invokes @code{avram} with the indicated options, the script itself as the next argument, and whatever command line parameters were initially supplied by the user as the remaining arguments. The rest of the script after that line is never processed by the shell. When @code{avram} attempts to load the shell script as a virtual machine code file, which happens as a result of it being executed by the shell, it treats the first line as a comment and ignores it. It also treats the second line as a comment, but takes heed of the trailing backslash, which is interpreted as a comment continuation character. It therefore also treats the third line as a comment and ignores it. Starting with the fourth line, it reads the virtual code, which is in a binary data format encoded with printable characters, and evaluates it. @node Files, Environment, Example Script, User Manual @section Files @table @code @item ./profile.txt @cindex @file{profile.txt} This file is written automatically by @code{avram} on behalf of applications that include profile annotations. It lists the number of invocations for each annotated part of the application, the total amount of time spent on it (in relative units), the average amount of time for each invocation, and the percentage of time relative to the remainder of the application. 
The exact format is undocumented and subject to change. @end table @node Environment, Bugs, Files, User Manual @section Environment @cindex environment @cindex @code{AVMINPUTS} @cindex paths An environment variable @env{AVMINPUTS} can be made to store a list of directories (using the @command{set} or @command{export} commands) that @code{avram} will search for input files. The directories should be separated by colons, similarly to the @env{PATH} environment variable. @cindex search paths The search paths in @env{AVMINPUTS} apply only to the names of input files given on the command line (@ref{Command Line Syntax}) when @code{avram} is invoked in parameter mode (@ref{Parameter Mode}). They do not apply to the name of the virtual code file, which is always assumed to be either absolute or relative to the current working directory (this assumption being preferable in the case of a script like that of @ref{Example Script}). @cindex shell script Starting in the first directory in the list of @env{AVMINPUTS}, @code{avram} searches for a file exactly as its name appears on the command line (subject to the expansion of special characters by the shell). If it is not found and the name does not contain a period, but a command line option of @option{-.EXT} has been used, @code{avram} will then search for a file with that name combined with the extension @code{.EXT}. If @option{-.EXT} has not been used or if no @cindex @code{EXT} command line option matching file is found with it, @code{avram} tries the extensions of @kbd{.avm} and @kbd{.fun} in that order, provided the given file name contained no periods. If no match is found for any of those cases, @code{avram} proceeds to search the next directory in the list obtained from @env{AVMINPUTS}, and so on. It stops searching when the first match is found. For subsequent input files, the search begins again at the first directory. 
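The search order described above can be illustrated with a short Python sketch. It is not taken from the @code{avram} sources. The function name @code{find_input_file}, its parameters, and the convention of passing the @option{-.EXT} extension with its leading period are all assumptions made for this illustration, and ``contains a period'' is interpreted as applying to the name exactly as given.

```python
import os

def find_input_file(name, directories, ext=None, exists=os.path.exists):
    """Hypothetical sketch of the documented search order.

    name        -- file name as it appears on the command line
    directories -- search directories in order of priority
    ext         -- extension from a -.EXT option, with its period, or None
    exists      -- predicate used to test for a file (replaceable in tests)
    """
    has_period = "." in name
    for directory in directories:
        # First the name exactly as given.
        candidates = [name]
        if not has_period:
            # Then the -.EXT extension, if one was supplied.
            if ext is not None:
                candidates.append(name + ext)
            # Then the hard coded defaults, in this order.
            candidates.extend([name + ".avm", name + ".fun"])
        for candidate in candidates:
            path = os.path.join(directory, candidate)
            if exists(path):
                # Stop at the first match.
                return path
    # No match in any directory.
    return None
```

Every candidate name is tried in one directory before the next directory is considered, so a file found through a default extension in an earlier directory takes precedence over an exact match in a later one.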
If @env{AVMINPUTS} is defined, the current working directory is not searched for input files unless it is listed. If it is empty or not @cindex search paths defined, a default list of search paths is used, currently @example .:/usr/local/lib/avm:/usr/lib/avm:/lib/avm:/opt/avm:/opt/lib/avm\ :/usr/local/share/avm:/usr/share/avm:/share/avm:/opt/avm:/opt/share/avm @end example @noindent These paths are defined in @code{avram.c} and can be changed by recompiling it. @node Bugs, , Environment, User Manual @section Bugs @cindex internal errors @cindex bugs @cindex exceptions There are no known bugs outstanding, except for any that may be inherent in the external library functions. However, @code{avram} has been used most extensively on GNU/Linux systems, and the prospect of portability issues with new or lesser used features on other systems can't be excluded. Though not observed in practice, it's theoretically possible to blow the stack by passing enough functions as arguments to library functions that pass more functions to library functions (e.g., by using nested calls to the gsl integration functions meant for a single variable to evaluate a very high dimensional multiple integral). In all other cases only dynamic heap storage or a constant amount of stack space is used. In particular, this issue is @emph{not} relevant to virtual code applications that don't use external libraries, or that don't pass functions to them as arguments. @code{avram} is designed to recover gracefully from memory overflows by always checking for @code{NULL} results from @code{malloc()} or otherwise trapping functions that allocate memory. In the event of an overflow, it conveys an appropriate error message to the virtual code application to be handled by the usual exception handling mechanisms. However, there is currently no way for a virtual code application to detect in advance whether sufficient memory is available, nor for it to resume normal operation once an exception occurs. 
Furthermore, it has been observed on some systems including Irix and 2.4 series Linux kernels that the @code{avram} process is killed automatically for attempting to allocate too much memory rather than given the chance to recover. Please send bug reports to @email{ursala-users@@freelists.org} or file an issue on the Avram github page. @node Virtual Machine Specification, Library Reference, User Manual, Top @chapter Virtual Machine Specification This chapter contains a description of the virtual machine implemented by @code{avram}, from the point of view of a person wishing to write a compiler that generates code for it. Before reading this chapter, readers should at least skim @ref{User Manual} in order to see the big picture. Topics covered in this chapter include data representations, virtual code semantics, and file formats. A toy programming language is introduced for illustrative purposes. The sections in this chapter might not make sense if read out of order the first time through. @ifinfo The last section, @ref{Virtual Code Semantics}, contains many equations that may be difficult to read in the info or html renderings. The printed version is recommended for anyone who really wants to comprehend this material. @end ifinfo @menu * Raw Material:: * Concrete Syntax:: * File Format:: * Representation of Numeric and Textual Data:: * Filter Mode Interface:: * Parameter Mode Interface:: * Virtual Code Semantics:: @end menu @node Raw Material, Concrete Syntax, Virtual Machine Specification, Virtual Machine Specification @section Raw Material The purpose of this section is to instill some basic concepts about the way information is stored or communicated by the virtual machine, which may be necessary for an understanding of subsequent sections. The virtual machine represents both programs and data as members of a semantic domain that is straightforward to describe. 
Lisp users and functional programmers may recognize familiar concepts of atoms and @cindex lists lists in this description. However, these terms are avoided for the moment, in order to keep this presentation self contained and to prevent knowledgeable readers from inferring any unintended meanings. As a rule, it is preferable to avoid overspecifying any theoretical artifact. In this spirit, the set of entities with which the virtual machine is concerned can be defined purely in terms of the properties we need it to have. @table @emph @item A distinguished element A particular element of the set is designated, arbitrarily or otherwise, as a distinguished element. Given any element of the set, it is always possible to decide whether or not it is the distinguished element. The set is non-empty and such an element exists. @item A binary operator A map from pairs of elements of the set to elements of the set exists and meets these conditions. @itemize @bullet @item It associates a @emph{unique} element of the set with any given ordered pair of elements from the set. @item It does not associate the distinguished element with any pair of elements. @end itemize @end table For the sake of concreteness, an additional constraint is needed: @emph{the set has no proper subset satisfying the above conditions}. Any number of constructions remain within these criteria, but there is no need to restrict them further, because they are all equivalent for our purposes. To see that these properties provide all the structure we need for general purpose computation, we may suppose some given set @code{S} and an operator @code{cons} having them are fixed, and infer the following points. @itemize @bullet @item @code{S} contains at least one element, the distinguished element. Call it @code{nil}. @cindex @code{nil} @item The pair @code{(nil,nil)} is a pair of elements of @code{S}, so there must be an element of @code{S} that @code{cons} associates with it. 
We can denote this element @code{cons(nil,nil)}. @cindex @code{cons} @item As no pair of elements is associated with the distinguished element, @code{cons(nil,nil)} must differ from @code{nil}, so @code{S} contains at least two distinct elements. @item The pair @code{(nil,cons(nil,nil))} therefore differs from @code{(nil,nil)}, but because it is yet another pair of elements from @code{S}, there must be an element associated with it by the operator. We can denote this element as @code{cons(nil,cons(nil,nil))}. @item Inasmuch as the operator associates every pair of elements with a @emph{unique} element, @code{cons(nil,cons(nil,nil))} must differ from the element associated with any other pair of elements, so it must differ from @code{cons(nil,nil)}, and we conclude that @code{nil}, @code{cons(nil,nil)} and @code{cons(nil,cons(nil,nil))} constitute three distinct elements of the set @code{S}. @item By defining @code{cons(cons(nil,nil),nil)} and @code{cons(cons(nil,nil),cons(nil,nil))} analogously and following a similar line of reasoning, one may establish the existence of two more distinct elements of @code{S}. @end itemize It is not difficult to see that an argument in more general terms could show that the inclusion of infinitely many elements in @code{S} is mandated by the properties of the @code{cons} operator. Furthermore, every element of @code{S} other than @code{nil} owes its inclusion to being associated with some other pair of elements by @code{cons}, because if it were not, its exclusion would permit a proper subset of @code{S} to meet all of the above conditions. We can conclude that @code{S} contains exactly @code{nil} and the countable infinitude of elements of the form @code{cons(x,y)}, where @code{x} and @code{y} are either @code{nil} or something of the form @code{cons(@dots{})} themselves. Some specific examples of sets and operators that have the required properties are as follows. 
@itemize @bullet @item the set of natural numbers, with @code{0} as the distinguished element, and the @code{cons} operator defined by @code{cons(@var{x},@var{y}) = ((@var{x}+@var{y})(@var{x}+@var{y}+1))/2 + @var{y} + 1} @item a set of balanced strings of parentheses, with @code{()} as the distinguished element, and @code{cons} defined as string concatenation followed by enclosure in parentheses @item a set of ordered binary trees, with the empty tree as the distinguished element, and the @code{cons} operator as that which takes an ordered pair of trees to the tree having them as its descendents @item a set containing only its own Cartesian product and an arbitrary but fixed element @code{nil}, with @code{cons} being the identity function @end itemize Each of these models may suggest a different implementation, some of which are more practical than others. The remainder of this document is phrased somewhat imprecisely in terms of a combination of the latter two. The nature of the set in question is not considered further, and elements of the set are described as ``trees'' or ``lists''. The @cindex trees @cindex lists distinguished element is denoted by @code{nil} and the operator by @code{cons}. Where no ambiguity results, @code{cons(x,y)} may be written simply as @code{(x,y)}. These terms should not be seen as constraints on the implementation. @node Concrete Syntax, File Format, Raw Material, Virtual Machine Specification @section Concrete Syntax The previous section has developed a basic vocabulary for statements such as ``the virtual machine code for the identity function is @cindex identity function @code{(nil,(nil,nil))}'', which are elaborated extensively in the subsequent sections on code and data formats. However, a description in this style would be inadequate without an explanation of how such an entity as @code{(nil,(nil,nil))} is communicated to @code{avram} in a virtual machine code file. 
The purpose of this section is to fill the gap by explaining exactly how any given tree would be transformed to its concrete representation. The syntax is based on a conversion of the trees to bit strings, @cindex bit strings followed by grouping the bits into blocks of six, which are then encoded by printable characters. Although anyone is free to modify @code{avram}, it is recommended that the concrete syntax described here be maintained for the sake of portability of virtual machine code applications. Building a tree by reading the data from a file requires a more difficult algorithm than the one presented in this section, and is not considered because it's not strictly necessary for a compiler. Procedures for both reading and writing are available to C and C++ users as part of the @code{avram} library, and are also easily invoked on the virtual code level. @menu * Bit String Encoding:: * Blocking:: @end menu @node Bit String Encoding, Blocking, Concrete Syntax, Concrete Syntax @subsection Bit String Encoding The conversion from trees to bit strings might have been done in several @cindex trees ways, perhaps the most obvious being based on a preorder traversal with each vertex printed as it is traversed. By this method, the entire encoding of the left descendent would precede that of the right in the bit string. This alternative is therefore rejected because it imposes unnecessary serialization on communication. It is preferable for the encodings of both descendents of a tree to be interleaved to allow concurrent transmission. Although there is presently no distributed implementation of the virtual machine and hence @cindex distributed implementation none that takes advantage of this possibility, it is better to plan ahead than to be faced with backward compatibility problems later. The preferred algorithm for encoding a tree as a bit string employs a queue. The queue contains trees and allows them to be processed in a @cindex queues first-in first-out order. 
Intuitively, the algorithm works by traversing
@cindex printing algorithm
the tree in level order. To print a tree @code{T} as a string of @code{1}s and @code{0}s, it performs the following steps.

@display
Initialize the queue to contain only @code{T}
while the queue is not empty do
   if the front element of the queue is @code{nil} then
      print @code{0}
   else if the front element of the queue is of the form @code{cons(x,y)} then
      print @code{1}
      append @code{x} to the back of the queue
      append @code{y} to the back of the queue
   end if
   remove the front element of the queue
end while
@end display

This algorithm presupposes that any given tree
@cindex deconstruction
@code{cons(x,y)} can be ``deconstructed'' to obtain @code{x} and @code{y}. The computability of such an operation is assured in theory by the uniqueness property of the @code{cons} operator, regardless of the representation chosen. If the trees are implemented with pointers in the obvious way, their deconstruction is a trivial constant time operation.

As an example, running the following tree through the above algorithm results in the bit string @code{111111101011110010001001100010100010100100100}.

@example
cons(
   cons(
      cons(nil,cons(nil,cons(nil,nil))),
      cons(nil,cons(nil,nil))),
   cons(
      cons(
         cons(nil,cons(nil,cons(nil,cons(nil,nil)))),
         cons(nil,nil)),
      cons(
         cons(
            cons(nil,cons(nil,cons(cons(nil,cons(nil,nil)),nil))),
            cons(nil,nil)),
         nil)))
@end example

@node Blocking, , Bit String Encoding, Concrete Syntax
@subsection Blocking

After the bit string is obtained as described above, it is grouped into blocks of six.
Continuing with the example, the string @example 111111101011110010001001100010100010100100100 @end example @noindent would be grouped as @example 111111 101011 110010 001001 100010 100010 100100 100 @end example @noindent Because the number of bits isn't a multiple of six, the last group has to be padded with zeros, to give @example 111111 101011 110010 001001 100010 100010 100100 100000 @end example @noindent Each of these six bit substrings is then treated as a binary number, with the most significant bit on the left. The numbers expressed in decimal are @example 63 43 50 9 34 34 36 32 @end example @noindent @cindex character codes The character codes for the characters to be written are obtained by adding sixty to each of these numbers, so as to ensure that they will be printable characters. The resulting character codes are @example 123 103 110 69 94 94 96 92 @end example @noindent which implies that the tree in the example could be written to a file as @code{@{gnE^^`\}. @node File Format, Representation of Numeric and Textual Data, Concrete Syntax, Virtual Machine Specification @section File Format @cindex file format A virtual code file consists of an optional text preamble, followed by the concrete representation for a tree. The latter uses the syntax described in the previous section. The purpose of this section is to specify the remaining details of the file format. The format for virtual code files may also be used for other purposes by virtual code applications, as it is automatically detected and parsed by @code{avram} when used in an input file, and can be automatically written to output files at the discretion of the application. Other than virtual code files, input files not conforming to this format are not an error as far as @code{avram} is concerned, because they are @cindex text files assumed to be text files. Applications can detect in virtual code the assumption that is made and report an error if appropriate. 
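As a reminder of what the data section of such a file holds, the blocking step of @ref{Blocking} can be sketched in Python as follows. The function name @code{bits_to_text} is hypothetical, and the sketch assumes that the bit string has already been obtained as described in @ref{Bit String Encoding}.

```python
def bits_to_text(bits):
    """Group a bit string into blocks of six and map each block to a character."""
    padding = (-len(bits)) % 6            # zeros needed to fill the last block
    padded = bits + "0" * padding
    blocks = [padded[i:i + 6] for i in range(0, len(padded), 6)]
    # each block is a binary number with the most significant bit on the left;
    # adding sixty gives the code of a printable character
    return "".join(chr(int(block, 2) + 60) for block in blocks)


print(bits_to_text("111111101011110010001001100010100010100100100"))
```

Applied to the bit string of the running example, the sketch prints @code{@{gnE^^`\}, in agreement with the character codes worked out by hand.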
Although the data file format includes no checksums or other explicit @cindex checksums methods of error detection, the concrete syntax itself provides a good measure of protection against undetected errors. The probability is vanishingly small that a random alteration to any valid encoding leaves it intact, because every bit in the sequence either mandates or prohibits the occurrence of two more bits somewhere after it. Errors in different parts of the file would have to be consistent with one another to go unnoticed. @menu * Preamble Section:: * Data Section:: @end menu @node Preamble Section, Data Section, File Format, File Format @subsection Preamble Section @cindex preamble @itemize @bullet @item A file may contain at most one preamble. @item The preamble, if any, is a consecutive sequence of lines beginning with the first line in the file. @item The first line of the preamble must begin with a hash (@code{#}) character. @item Subsequent lines of the preamble must either begin with a hash, or immediately follow a line that ends with a backslash (@code{\}) character (or both). @end itemize @node Data Section, , Preamble Section, File Format @subsection Data Section @itemize @bullet @item The data or virtual code section of the file begins on the first line of the file that isn't part of the preamble. @item The data section may not contain any hashes, white space, or other extraneous characters other than line breaks. @item If line breaks are ignored, the data section contains a sequence of characters expressing a single tree in the concrete syntax described in @ref{Concrete Syntax}. 
@end itemize @node Representation of Numeric and Textual Data, Filter Mode Interface, File Format, Virtual Machine Specification @section Representation of Numeric and Textual Data As noted already, virtual code applications are specified by functions operating on elements of a set having the properties described in @ref{Raw Material}, which are convenient to envision as ordered binary trees or @cindex trees @cindex @code{nil} pairs of @code{nil}. However, virtual code applications normally deal with numeric or textual data, for example when they refer to the contents of a text file. It is therefore necessary for the application and the virtual machine emulator to agree on a way of describing textual or numeric data with these trees. The purpose of this section is to explain the basic data structures used in the exchange of information between @code{avram} and a virtual code application. For example, an explanation is needed for statements like ``an application invoked with the @option{--baz} option is expected to return a pair @code{(@var{foo},@var{bar})}, where @code{@var{foo}} is a @cindex strings @cindex character strings @cindex lists list of character strings @dots{}'', that are made subsequently in this document. Such statements should be understood as referring to the trees representing the pairs, lists, character strings, etc., according to the conventions explained below. @table @emph @item Characters @cindex character codes An arbitrarily chosen set of 256 trees is used to represent the character set. They are listed in @ref{Character Table}. For example, the letter @code{A} is represented by @code{(nil,(((nil,(nil,(nil,nil))),nil),(nil,nil)))}. That means that when an application wants the letter @code{A} written to a text file, it returns something with this tree in it. @item Booleans @cindex booleans The value of @code{false} is represented by @code{nil}, and the value of @code{true} is represented by @code{(nil,nil)}. 
@item Pairs @cindex pairs Given any two items of data @var{x1} and @var{x2}, having the respective representations @var{r1} and @var{r2}, the pair @code{(@var{x1},@var{x2})} has the representation @code{cons(@var{r1},@var{r2})}. @item Lists @cindex lists A list of the items @var{x1}, @var{x2} @dots{} @var{xn} with respective representations @var{r1} through @var{rn} is represented by the tree @code{cons(@var{r1},cons(@var{r2}@dots{}cons(@var{rn},nil)@dots{}))}. In other words, lists are represented as pairs whose left sides are the heads and whose right sides are the tails. The empty list is identified with @code{nil}. Lists of arbitrary finite length can be accommodated. @item Naturals @cindex naturals A number of the form @code{@var{b0} + 2@var{b1} + 4@var{b2} + @dots{} + 2^n @var{bn}}, where each @code{@var{b}i} is @code{0} or @code{1}, is represented by a tree of the form @code{cons(@var{t0},cons(@var{t1}@dots{}cons(@var{tn},nil)@dots{}))} where each @code{@var{t}i} is @code{nil} if the corresponding @code{@var{b}i} is @code{0}, and @code{(nil,nil)} otherwise. Note that the numbers @code{@var{b}i} are exactly the bits written in the binary expansion of the number, with @code{@var{b0}} being the least significant bit. @item Strings @cindex strings @cindex character strings are represented as lists of characters. @end table @cindex types @code{avram} imposes no more of a ``type discipline'' than necessary to a workable interface between it and an application. This selection of types and constructors should not be seen as constraining what a compiler writer may wish to have in a source language. 
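
The conventions for booleans, lists, and naturals can be illustrated with a short Python sketch. It is not part of @code{avram}. The names are hypothetical, @code{nil} is modelled as @code{None}, and pairs are modelled as two-element tuples. The treatment of zero as @code{nil} is an assumption that is not stated explicitly above.

```python
NIL = None          # assumption: nil is modelled as None
TRUE = (NIL, NIL)   # true is (nil,nil); false is nil


def list_to_tree(items):
    """Build cons(r1,cons(r2...cons(rn,nil)...)) from represented items."""
    tree = NIL
    for item in reversed(items):
        tree = (item, tree)          # head on the left, tail on the right
    return tree


def natural_to_tree(number):
    """Represent a natural as a list of bits, least significant bit first."""
    bits = []
    while number > 0:
        bits.append(TRUE if number % 2 else NIL)
        number //= 2
    return list_to_tree(bits)        # assumption: zero becomes nil


def tree_to_natural(tree):
    """Recover the number from a tree built by natural_to_tree."""
    number, weight = 0, 1
    while tree is not NIL:
        bit, tree = tree
        if bit is not NIL:
            number += weight
        weight *= 2
    return number


print(natural_to_tree(5) == (TRUE, (NIL, (TRUE, NIL))))  # prints True
```

The number five is 1 + 0*2 + 1*4, so its tree is a list of three items in which the first and last are @code{(nil,nil)} and the middle one is @code{nil}.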
@node Filter Mode Interface, Parameter Mode Interface, Representation of Numeric and Textual Data, Virtual Machine Specification @section Filter Mode Interface @cindex filter mode @cindex parameter mode @cindex modes From the point of view of the application developer or compiler writer, there are parameter mode applications, which are discussed in @ref{Parameter Mode Interface}, and filter mode applications, which are discussed in this section. Of the latter, there are mainly three kinds: those that read one character at a time, those that read a line at a time, and those that read the whole standard input file at @cindex standard input once. Each of them is invoked with different options and expected to follow different calling conventions. This section summarizes these conventions. @menu * Loading All of Standard Input at Once:: * Line Maps:: * Byte Transducers:: @end menu @node Loading All of Standard Input at Once, Line Maps, Filter Mode Interface, Filter Mode Interface @subsection Loading All of Standard Input at Once Unless @option{--line-map} or @option{--byte-transducer} is used as a @cindex @code{line-map} command line option @cindex @code{byte-transducer} command line option @cindex standard input command line option when the application is invoked, the contents of standard input are loaded entirely into memory by @code{avram} before evaluation of the virtual code begins. This interface is obviously not @cindex infinite streams appropriate for infinite streams. The virtual code application in this mode of operation is treated as a single function taking the entire contents of standard input as its argument, and returning the entire contents of standard output as its result. Hence, this interface is one of the simplest available. 
@menu * Standard Input Representation:: * Standard Output Representation:: @end menu @node Standard Input Representation, Standard Output Representation, Loading All of Standard Input at Once, Loading All of Standard Input at Once @subsubsection Standard Input Representation @cindex standard input The representation for the standard input file used as the argument to the function depends both on the file format and on the command line options specified when the application is invoked. The @cindex @code{unparameterized} command line option @cindex @code{raw-output} command line option @option{--unparameterized} and @option{--raw-output} options make no difference to the input representation, and the @option{--line-map} and @option{--byte-transducer} options are not relevant to this mode of operation. That leaves four possible combined settings of the @cindex @code{choice-of-output} command line option @cindex @code{force-text-input} command line option @option{--choice-of-output} and @option{--force-text-input} options. If standard input conforms to the data file format specification in @ref{File Format}, the following effects are possible. @itemize @bullet @item If neither @option{--choice-of-output} nor @option{--force-text-input} is used, the argument to the function will be given directly by the tree encoded in the data section of the file. The preamble of the file will be ignored. @item If the @option{--choice-of-output} option is used and the @option{--force-text-input} option is not used, the argument to the function will be a pair @code{(@var{preamble},@var{contents})}, where @var{preamble} is a list of character strings taken from the preamble of the file (with leading hashes stripped), and @var{contents} is the tree represented in the data section of the file. @item If the @option{--choice-of-output} option is not used and the @option{--force-text-input} option is used, the argument to the function will be the whole file as a list of character strings.
I.e., both the preamble and the data sections are included, hashes are not stripped from the preamble, and the data section is not converted to the tree it represents. @item If the @option{--choice-of-output} option is used and the @option{--force-text-input} option is also used, the argument to the function will be a pair @code{(nil,@var{contents})}, where the contents are the list of character strings as in the previous case. @end itemize @cindex file format If standard input does not conform to the data file format specification in @ref{File Format}, then it is assumed to be a text file. The @option{--force-text-input} option makes no difference, and there are only two possible effects, depending on whether @option{--choice-of-output} is used. They correspond to the latter two cases above, where @option{--force-text-input} is used. @cindex preamble @cindex text files The idea of the @option{--choice-of-output} option is that it is used only for applications that are smart enough to be aware of the @code{(@var{preamble},@var{contents})} convention. A non-empty preamble implies a data file whose contents could be of any type, but an empty preamble implies a text file whose contents can only be a list of character strings. (In the case of a data file with no preamble, the list of the empty string is used for the preamble to distinguish it from a text file.) Dumb applications that never want to deal with anything but text files should be invoked with @option{--force-text-input}. Otherwise, they have to be prepared for either text or data as arguments. The use of both options at once is unproductive as far as the input format is concerned, but may be justified when the output is to be a data file and the input is text only.
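
The cases above can be summarized in a small Python sketch. It only models the decision and is not how @code{avram} is implemented. The function name and its parameters are hypothetical, Python lists stand in for the list representation with the empty list playing the role of @code{nil}, and the caller is assumed to have already determined whether the file conforms to the data file format.

```python
def stdin_argument(lines, data=None, choice_of_output=False,
                   force_text_input=False):
    """Return the argument passed to the application.

    lines: the whole file as a list of strings.
    data:  None for a text file, otherwise a (preamble, tree) pair taken
           from a file in the data file format, with hashes stripped.
    """
    if data is not None and not force_text_input:
        preamble, tree = data
        if not choice_of_output:
            return tree                  # the preamble is ignored
        # a data file with no preamble gets the list of the empty string
        return (preamble or [""], tree)
    # a text file, or a data file forced to be read as text
    if choice_of_output:
        return ([], lines)               # nil preamble marks a text file
    return lines


print(stdin_argument(["hello", "world"], choice_of_output=True))
```

For a text file read with @option{--choice-of-output}, the sketch returns a pair whose left side is empty and whose right side is the list of lines.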
@node Standard Output Representation, , Standard Input Representation, Loading All of Standard Input at Once @subsubsection Standard Output Representation @cindex standard output @cindex @code{raw-output} command line option @cindex @code{choice-of-output} command line option As in the case of standard input, the representation for standard output that the function is expected to return depends on the command line options with which the application is invoked. The only relevant options are @option{--raw-output} and @option{--choice-of-output}, which are mutually exclusive. @itemize @bullet @item If neither option is selected, the result returned by the function must be a list of character strings. @item If @option{--raw-output} is used, the result returned by the function is unconstrained, and it will be written as a data file with no preamble, following the format specified in @ref{File Format}. @item If @option{--choice-of-output} is used, the result returned by the function must be a pair @code{(@var{preamble},@var{contents})}. @end itemize @cindex preamble In the last case, the preamble determines how the file will be written. If it is meant to be a text file, the preamble should be @code{nil}, and the contents should be a list of character strings. If it is meant to be a data file, the preamble should be a non-empty list of character strings, and the format of the contents is unconstrained. To express a data file with no preamble, the preamble should be the list containing the empty string, rather than being empty. In the result returned by the function, the preamble lines should not include leading hash characters, because they are automatically added to the output to enforce consistency with the data file format. However, they should include trailing backslashes as continuation characters where appropriate. The hashes that are automatically added will be automatically stripped by @code{avram} on behalf of whatever application uses the file. 
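
The way the result is interpreted can be sketched in Python as follows. The sketch is an illustration under stated assumptions rather than a description of the implementation. The function name, its parameters, and the tagged tuples it returns are all hypothetical, and Python lists stand in for the list representation.

```python
def describe_output(result, raw_output=False, choice_of_output=False):
    """Classify a result as ("text", lines) or ("data", preamble, tree)."""
    if raw_output and choice_of_output:
        raise ValueError("the two options are mutually exclusive")
    if raw_output:
        return ("data", [], result)          # data file with no preamble
    if choice_of_output:
        preamble, contents = result
        if not preamble:                     # nil preamble: a text file
            return ("text", contents)
        if preamble == [""]:                 # list of the empty string:
            return ("data", [], contents)    # data file with no preamble
        return ("data", preamble, contents)  # hashes are added on writing
    return ("text", result)                  # default: a list of strings


print(describe_output((["made by demo"], "TREE"), choice_of_output=True))
```

Note that the preamble lines in the returned description carry no leading hashes, in keeping with the rule that they are added automatically when the file is written.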
@cindex character strings @cindex printing algorithm Any file can be written as a list of character strings, even ``text'' files that are full of unprintable characters, and even ``text'' files that happen to conform to the format used for data files. However, if the application intends to write a data file in the standard format used by other virtual code applications, it can do so more quickly and easily by having the virtual machine do the formatting automatically with the @option{--choice-of-output} option than by implementing the algorithm in @ref{Concrete Syntax}, from scratch in virtual code. @node Line Maps, Byte Transducers, Loading All of Standard Input at Once, Filter Mode Interface @subsection Line Maps @cindex @code{line-map} command line option Virtual code applications invoked with the @option{--line-map} option (with or without the @option{--unparameterized} option) adhere to a very simple interface. @itemize @bullet @item The argument to the function is a character string, and the result must also be a character string. @item The function is applied to each line of the standard input file @cindex standard input and the result in each case is written to standard output followed by a @cindex standard output line break. @end itemize @cindex infinite streams This kind of application may be used on finite or infinite streams, provided that the lengths of the lines are finite, but preserves no state information from one line to the next. @node Byte Transducers, , Line Maps, Filter Mode Interface @subsection Byte Transducers The interface used when the @code{--byte-transducer} option is selected @cindex @code{byte-transducer} command line option allows an application to serve as a persistent stream processor suitable @cindex infinite streams for finite or infinite streams. The interface can be summarized by the following points. 
@itemize @bullet @item When it is first invoked, the function in the virtual code file is applied to an argument of @code{nil}, and is expected to return a pair @code{(@var{state},@var{output})}. The @var{state} format is unconstrained. The @var{output} must be a character string that will be written to standard output, but it may be the empty string. @item For each byte read from standard input, @code{avram} applies the function to the pair @code{(@var{state},@var{character})}, using the state obtained from the previous evaluation, and the character whose code is the byte. The purpose of the @var{state} field is therefore to provide a way for the application to remember something from one invocation to the next. @item The function is usually expected to return a pair @code{(@var{state},@var{output})} for each input byte, so that the state can be used on the next iteration, and the output can be written to standard output as a character string. @item If the function ever returns a value of @code{nil}, the computation terminates. @item If standard input comes to an end before the computation terminates, the function will be applied to a pair of the form @code{(@var{state},nil)} thereafter, but may continue to return @code{(@var{state},@var{output})} pairs for arbitrarily many more iterations. The @code{EOF} character is not explicitly passed to the function, but the end is detectable insofar as @code{nil} is not a representation for any character. @end itemize Unlike the situation with line maps, the output character strings do not have line breaks automatically appended, and the application must include them explicitly if required. The convention for @cindex Unix line breaks is system dependent. On Unix and GNU/Linux systems, character code 10 indicates a line break, but other systems may use character code 13 followed by character code 10. See @ref{Character Table} for the @cindex character codes representations of characters having these codes.
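
The cycle of invocations described in this section can be simulated with the following Python sketch. It is a model of the protocol only, not the code in @code{avram}. The names @code{run_byte_transducer}, @code{count_and_shout} and @code{eof_limit} are hypothetical, @code{nil} is modelled as @code{None}, and characters are modelled as one-character Python strings instead of the trees in @ref{Character Table}.

```python
def run_byte_transducer(function, data, eof_limit=1000):
    """Drive function over the bytes in data and return the collected output.

    eof_limit is a hypothetical safety bound on invocations after the
    end of input, so that the simulation always stops.
    """
    result = function(None)                   # first invocation: argument nil
    written = []
    pending = iter(data)
    eof_calls = 0
    while result is not None:                 # a nil result ends the computation
        state, output = result
        written.append(output)
        byte = next(pending, None)
        if byte is None:
            eof_calls += 1
            if eof_calls > eof_limit:
                break
            result = function((state, None))  # end of input: (state,nil)
        else:
            result = function((state, chr(byte)))
    return "".join(written)


def count_and_shout(argument):
    """Example application: upper-case the input, then report a byte count."""
    if argument is None:
        return (0, "")                        # initial state, empty output
    state, character = argument
    if character is None:                     # end of input detected
        if state is None:
            return None                       # already reported: terminate
        return (None, "\n%d bytes\n" % state)
    return (state + 1, character.upper())


print(run_byte_transducer(count_and_shout, b"abc"))
```

The example application uses its state to count bytes, emits one more output after the end of input, and then returns @code{nil} to terminate. It includes its own line breaks because none are appended automatically.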
@node Parameter Mode Interface, Virtual Code Semantics, Filter Mode Interface, Virtual Machine Specification @section Parameter Mode Interface @cindex parameter mode The virtual code file for a parameter mode application contains a tree representing a single function, which takes as its argument a data structure in the format described below. The format of the result returned by the function depends on the virtual machine options used on the command line, and the various alternatives are explained subsequently. @menu * Input Data Structure:: * Input for Mapped Applications:: * Output From Non-interactive Applications:: * Output From Interactive Applications:: @end menu @node Input Data Structure, Input for Mapped Applications, Parameter Mode Interface, Parameter Mode Interface @subsection Input Data Structure The data structure that is used as the argument to the parameter mode application incorporates all the information about the command line and @cindex environment @cindex command line the environment variables. It is in the form of a triple @code{((@var{files},@var{options}),@var{environs})}. The fields have these interpretations. @table @var @item files is a list of quadruples @code{((@var{date},@var{path}),(@var{preamble},@var{contents}))}, with one quadruple for each input file named on the command line (but not the virtual code file or the @code{avram} executable). The list will be in the same order as the filenames on the command line, and is not affected by options interspersed with them. The fields in each item have the following interpretations. @table @var @item date is the time stamp of the file as a character string in the usual @cindex time stamp @cindex date @cindex system time @cindex current time @cindex Unix Unix format, for example, @code{Fri Jan 19 14:34:44 GMT 2001}. If the file corresponds to standard input, the current system time and date are used. @item path is the full path of the file, expressed as a list of strings.
If the @cindex file names @cindex paths @cindex absolute path @cindex relative path file corresponds to standard input, the list is empty. Otherwise, the first string in the list is the file name. The next is the name of the file's parent directory, if any. The next is the parent of the parent, and so on. The root directory is indicated by the empty string, so that any path ending with the empty string is an absolute path, while any path ending with a non-empty string is relative to the current working directory. Path separators (i.e., slashes) are omitted. @item preamble In the case of a text file, or any file not conforming to the format in @cindex preamble @ref{File Format}, this field is @code{nil}. Otherwise, this field contains the preamble of the file expressed as a list of strings, or contains just the empty string if the file has no preamble. Any leading hashes in the preamble of the file are stripped. @item contents In the case of a text file (as indicated by the @var{preamble} field), this @cindex text files field will contain a list of character strings, with each line of the file contained in a character string. Otherwise, it can contain data in any format, which are obtained by converting the data section of the file to a tree. @end table @item options is a list of quadruples of the form @code{((@var{position},@var{longform}),(@var{keyword},@var{params}))} with one quadruple for each option appearing on the command line after the name of the virtual code file. @table @var @item position is a natural number indicating the position of the option on the command @cindex naturals @cindex command line line. The position numbers of all the options will form an ascending sequence, but not necessarily consecutive nor starting with zero. The missing numbers in the sequence will correspond to the positions of the file names on the command line, allowing their positions to be inferred by applications for which the position matters. 
@item longform is a boolean value which is true if the option starts with two or more @cindex booleans dashes but false otherwise. @item keyword is the key word of the option expressed as a character string. For example in the case of a command line option @kbd{--foo=bar,baz}, the keyword is @code{foo}. Leading dashes are stripped. @item params is a list of character strings identifying the parameters for the command line option in question. In the case of an option of the form @kbd{--foo=bar,baz}, the first character string in the list will be @code{bar} and the next will be @code{baz}. The same applies if the option is written @kbd{--foo bar,baz} or @kbd{--foo =bar,baz}. If there are no parameters associated with the option, the list is empty. @end table @item environs is a list of pairs of character strings, with one pair in the list for @cindex environment each environment variable. The identifier is the left string in the pair, and the value is the right. For example, if the environment contains the definition @code{OSTYPE=linux-gnu}, there will be a pair in the list whose left side is the string @code{OSTYPE} and whose right side is the string @code{linux-gnu}. @end table @node Input for Mapped Applications, Output From Non-interactive Applications, Input Data Structure, Parameter Mode Interface @subsection Input for Mapped Applications Applications invoked using the @option{--map-to-each-file} option @cindex @code{map-to-each-file} command line option benefit from a slightly different interface than the one described above. As the purpose of this option is to save memory by loading only one file at a time, the application does not have access to all input files named on the command line simultaneously within the same data structure. Although the data structure is of the type already described, the @var{files} field is not a list of arbitrary length. Instead, it is a list containing exactly one item for an input file. 
If @kbd{-} is used as a command line parameter, indicating standard input, then @var{files} will have another item pertaining to standard input. In no event will it have other than one or two items. The mapped application is expected to work by being applied individually to each of any number of separately constructed data structures, doing the same in each case as it would if that case were the only one. To make that possible, copies of the environment variables, the contents of standard input, and the list of application specific options are contained in the data structure used for every invocation. @cindex command line The position numbers in the options are adjusted for each invocation to reflect the position of the particular input file associated with it. For example, in the following command @display @kbd{avram --map-to-each-file mapster.avm fa.txt --data fb.dat --code fc.o} @end display @noindent the function in the virtual code file @file{mapster.avm} would be applied to each of three data structures, corresponding to the commands @display @kbd{avram mapster.avm fa.txt --data --code} @kbd{avram mapster.avm --data fb.dat --code} @kbd{avram mapster.avm --data --code fc.o} @end display @noindent If the relative positions of the options and filenames were important to the application, they could be reliably inferred from the position numbers. In the first case, they would be 1 and 2, implying that the file is in position 0. In the second case they would be 0 and 2, implying that the file is in position 1, and in the third case they would be 0 and 1, implying that the file is in position 2. (Of course, nothing compels an application to concern itself with the positions of its parameters, and the alternative might be preferable.) For the most part, any application that can operate on one file at a time without needing information from any others can be executed more economically with the @option{--map-to-each-file} option and few if any changes to the code. 
The effect will normally be analogous to the above example, subject to a few possible differences. @itemize @bullet @item If an application is supposed to do something by default when there are no file parameters or only standard input, it won't work as a mapped application, because if there are no file parameters it won't be executed at all. @item If a mapped application causes any output files to be generated, they may be written before other input files are read, possibly causing the input files to be overwritten if they have the same names, and causing subsequent invocations to use the overwritten versions. This behavior differs from that of loading all input files at the outset, which ensures that the application sees all of the original versions. The latter may be more convenient for maintaining a group of files in some sort of consistent state. @item If an application causes standard output to be written along with output files, normally standard output is written last as a security measure against malicious code altering the @option{--ask-to-overwrite} prompts by subtly clobbering the console. In a mapped application, standard output isn't always last because there may be more invocations to come. @end itemize @node Output From Non-interactive Applications, Output From Interactive Applications, Input for Mapped Applications, Parameter Mode Interface @subsection Output From Non-interactive Applications @cindex @code{interactive} command line option @cindex @code{step} command line option If a parameter mode application is not invoked with either of the @option{--interactive} or @option{--step} options, then it is deemed to be non-interactive, and therefore does not concern itself with executing shell commands. Instead, it simply specifies a list of output files to be created or updated on its behalf by @code{avram}.
The files are described by a list of quadruples @code{((@var{overwrite},@var{path}),(@var{preamble},@var{contents}))}, with one quadruple for each file. @cindex preamble @cindex paths @cindex standard input @cindex standard output In each quadruple, the @var{path}, @var{preamble}, and @var{contents} fields have the same interpretations as in the list of files in the input data structure described in @ref{Input Data Structure}, except that a @code{nil} path refers to standard output rather than to standard input. The @var{overwrite} field in each quadruple tells whether the file @cindex appending to files should be overwritten or appended. If the @var{overwrite} field is @code{nil} (i.e., the representation for the Boolean value of @code{false}) and a file already exists at the given path, the new contents will be appended to it. If the @var{overwrite} field is anything other than @code{nil} and/or no file exists with the given path, a new file is written or the extant one is overwritten. Note that the data file format specified in @ref{File Format} precludes appending anything to it, but the format of existing output files is not checked and nothing prevents data or text from being appended to one. @node Output From Interactive Applications, , Output From Non-interactive Applications, Parameter Mode Interface @subsection Output From Interactive Applications @cindex @code{interactive} command line option @cindex @code{step} command line option Parameter mode applications invoked with either of the @option{--interactive} or @option{--step} options are required to take the data structure described in @ref{Input Data Structure} as an argument but to return the virtual code for a function that will observe @cindex shell a certain protocol allowing shell commands to be executed on its behalf. 
The intent is that the virtual code file doesn't contain the real application @emph{per se}, but only something that builds the real one on the fly using configuration information from the input files and command line options. The format of the result returned by an interactive application, being a virtual code application itself, requires a full exposition of the virtual machine code semantics. This subject is deferred to @ref{Virtual Code Semantics}. The remainder of this section describes the protocol followed by the function returned by the interactive application rather than the application itself. Similarly to the case of a byte transducer described in @ref{Byte Transducers}, the basic pattern of interaction between @code{avram} and the function is a cycle of invocations. In general terms, the function is applied to a @code{nil} argument initially, and expected to return an initial state and initial output. Thereafter, the function is applied to a pair of the state returned on the previous iteration, and the next installment of input. The function returns further output and a new state, and the cycle continues until the function returns a value of @code{nil}, at which time the computation terminates. @menu * Line Oriented Interaction:: * Character Oriented Interaction:: * Mixed Modes of Interaction:: @end menu @node Line Oriented Interaction, Character Oriented Interaction, Output From Interactive Applications, Output From Interactive Applications @subsubsection Line Oriented Interaction Within this general pattern, more specific styles of interaction are possible. In the simplest one to explain first, the result returned by the function is always a data structure of the form @cindex command line @code{(@var{state},(@var{command lines},@var{prompts}))}, wherein the fields have these interpretations. @table @var @item state is a tree incorporating any data in any format that the application needs to remember from one invocation to the next. 
@item command lines is a list of character strings that are piped to the standard input stream of a separately spawned process. The process may persist from one invocation of the function to the next, or may be spawned each time. @item prompts is a non-empty list of character strings containing a suffix of the text expected from the standard output stream of the process as a result of sending the command lines to it. @end table On each iteration, @code{avram} sends the command line character strings @cindex spawning processes to a separately spawned process, with line breaks between them if there is more than one command. If a process remains from the previous iteration that has not terminated itself, the list of command lines is sent to the same process. If no such process already exists, the first string in the list of command lines is treated as a shell command and used to spawn the @cindex @code{exp_popen} process (using the @code{exp_popen} library function), and the remaining strings are sent to the newly spawned process. Normally processes spawned with commands that invoke interactive command line interpreters of their own, such as @command{bash}, @command{ftp} or @command{bc}, will persist indefinitely unless the command causing them to exit is issued or some other event kills them. Processes spawned with non-interactive commands, such as @command{ls} or @command{pwd}, will terminate when the last of their initial output has been received. In the case of processes that persist after being spawned, @code{avram} needs some way of knowing when to stop waiting for more output from them so that it doesn't get stuck waiting forever. This purpose is served by the @var{prompts} field. This field could contain a single string holding the last thing the process will send before becoming quiescent, such as the strings @code{bash$ } or @code{ftp> } in the above examples.
Alternatively, a sequence of more than one prompt string can be used to indicate that the corresponding sequence of lines must be detected. An empty string followed by @code{ftp> } would indicate that the @code{ftp> } prompt is expected to be immediately preceded by a line @cindex prompts break. There is also the option of using prompt strings to indicate a pattern that does not necessarily imply quiescence, but is a more convenient point at which to stop reading the output from the process. For processes spawned with commands that do not start their own interactive command line interpreters, such as @command{ls} or @command{pwd}, it may be preferable to read all the output from them until they terminate. To achieve this effect, the list of prompt strings should contain a single string consisting only of the @code{EOF} character (usually code 4) or of any other character that is certain not to occur in the output of the process. This technique is based on the assumption that the process was spawned originally with the command in question, not that such a command is sent to an existing shell process. In any case, when enough output has been received from the process, it is collected into a list of received strings including the prompt strings at the end (if they were received), and the function is applied to the pair @code{(@var{state},@var{received strings})}. If the cycle is to continue, the result returned by the function will include a new state, a new list of command lines, and a new list of prompt strings. A result of @code{nil} will cause the computation to terminate. Some unusual situations could occur in the course of line oriented interaction. They are summarized as follows.
@itemize @bullet @item If the process terminates before any pattern matching the prompt strings is received from it, all of the output from the process up to the point where it terminated is collected into the @var{received strings} list and passed to the function. This situation includes cases where the process terminates immediately upon being spawned, but not abnormal completion of the @code{exp_popen} library function, which is a fatal error. This feature of the interface is what allows @code{EOF} to be used for collecting all the output at once from a non-interactive command. @item If the list of @var{command lines} is empty, and no process currently exists due to a previous iteration, the effect is the same as if the process terminates unexpectedly before outputting anything. I.e., the function is applied to a pair containing an empty list of received strings. There is no good reason for an application to get into this situation. @item If the list of @var{command lines} is empty but a process persists from a previous iteration, no output is sent to it, but receiving from it proceeds normally. This feature of the interface could be used effectively by applications intended to process the received data in @cindex deadlock parts, but will cause deadlock if the process is already quiescent. @item All character strings have to consist of lists of valid representations of non-null characters according to @ref{Character Table}, or else there will be some fatal error messages. @item If the list of @var{prompt strings} contains only the empty string, @code{avram} will not wait to receive anything from the process, but will proceed with the next iteration immediately. If this effect is intended, care must be taken not to confuse the empty list of @var{prompt strings} with the list containing the empty string. The former indicates character oriented interaction, which is explained next. 
@end itemize @node Character Oriented Interaction, Mixed Modes of Interaction, Line Oriented Interaction, Output From Interactive Applications @subsubsection Character Oriented Interaction A character oriented style of interaction involves the function always returning a data structure of the form @code{(@var{state},(@var{command lines},nil))}. The @var{state} and @var{command lines} fields serve @cindex command line exactly the same purposes respectively as they do in the case of line oriented interaction. The field that would be occupied by the @var{prompt strings} list in the case of line oriented interaction is identically @code{nil} in this style. When this style is used, @code{avram} spawns a process and/or sends @cindex spawning processes command lines to it as in the case of line oriented interaction, but attempts to read only a single character from it per iteration. When the character is received, @code{avram} applies the function to the pair @code{(@var{state},@var{character})} in order to obtain the next state and the next list of command lines. If the process has terminated, a @code{nil} value is used in place of the character. If the process is quiescent, deadlock ensues. The character oriented style is a lower level protocol that shifts more of the burden of analyzing the process's output to the virtual code application. It can do anything line oriented interaction can do except proceeding immediately without waiting to receive any output from the process. It may also allow more general criteria (in effect) than the matching of a fixed prompt string to delimit the received data, for those pathological processes that may require such things. Applications using character oriented interaction need to deal with line @cindex line breaks breaks explicitly among the received characters, unlike the case with line oriented interaction, where the line breaks are implicit in the @cindex Unix list of received strings. 
Contrary to the convention for Unix text files, line breaks in the output of a process are indicated by character code 13 followed by character code 10. @node Mixed Modes of Interaction, , Character Oriented Interaction, Output From Interactive Applications @subsubsection Mixed Modes of Interaction An application is not confined exclusively to line oriented or character oriented interaction, but may switch from one style to the other between iterations, and signal its choice simply by the format of the data structure it returns. If the @var{prompt strings} field is non-empty, the interaction is line oriented, and if the field is empty, the interaction is character oriented. A function using both styles has to be prepared for whichever type of data it indicates, either a character or a list of character strings as the case may be. Another alternative is possible if the function returns a data structure in the form @code{(@var{files},nil)}. This structure includes neither a list of command lines nor a list of prompt strings, empty or otherwise, but does include a list of quadruples in the @var{files} field. The quadruples are of the form @code{((@var{overwrite},@var{path}),(@var{preamble},@var{contents}))}. The fields have the same interpretations as in the output from a non-interactive parameter mode application, as described in @ref{Output From Non-interactive Applications}, and will cause a list of files to be written in the same way. As an interactive application is able to cause the execution of arbitrary shell commands, it doesn't need @code{avram} to write files for it the way a non-interactive application does, so this feature does not provide any additional capabilities. However, it may be helpful as a matter of convenience. After the files are written, the function will be applied to the same result it returned, @code{(@var{files},nil)}.
There is no direct means of preserving unconstrained state information from previous iterations in this style of interaction. A likely scenario might therefore be that the function returns a file list after finishing its other business, and then returns @code{nil} on the next iteration to terminate. @node Virtual Code Semantics, , Parameter Mode Interface, Virtual Machine Specification @section Virtual Code Semantics As the previous sections explain, virtual code applications are defined in terms of mathematical functions. Up until this point, the discussion has focused on the interface between the function and the virtual machine interpreter, by detailing the arguments passed to the functions under various circumstances and the results they are expected to return in order to achieve various effects. The purpose of this section is to complete the picture by explaining how a given computable function can be expressed in virtual code, considering only functions operating on the trees described in @ref{Raw Material}. Functions manipulating trees of @code{nil} are undoubtedly a frivolous and abstract subject in themselves. One is obliged to refer back to the previous sections if in need of motivation. @menu * A New Operator:: * On Equality:: * A Minimal Set of Properties:: * A Simple Lisp Like Language:: * How @code{avram} Thinks:: * Variable Freedom:: * Metrics and Maintenance:: * Deconstruction:: * Recursion:: * Assignment:: * Predicates:: * Iteration:: * List Combinators:: * List Functions:: * Exception Handling:: * Interfaces to External Code:: * Vacant Address Space:: @end menu @node A New Operator, On Equality, Virtual Code Semantics, Virtual Code Semantics @subsection A New Operator With regard to a set of trees as described in @ref{Raw Material}, we can define a new binary operator. Unlike the @code{cons} operator, this one is not required to associate an element of the set with every possible pair of elements. 
For very many pairs of operands we will have nothing to say about its result. In fact, we require nothing of it beyond a few simple properties to be described presently. Because this is the only operator other than @code{cons}, there is no @cindex @code{cons} need to have a special notation for it, so it will be denoted by empty space. The tree associated by the operator with a pair of trees @code{@var{x}} and @code{@var{y}}, if any, will be expressed in the infix notation @code{@var{x} @var{y}}. For convenience, the operator is regarded as being right associative, so that @code{@var{a} @var{b} @var{c}} can be written for @code{@var{a} (@var{b} @var{c})}. @node On Equality, A Minimal Set of Properties, A New Operator, Virtual Code Semantics @subsection On Equality @cindex equality One example of a property this operator should have, for reasons that will not be immediately clear, is that for any trees @code{@var{x}} and @code{@var{k}}, the equality @code{cons(cons(nil,@var{k}),nil) @var{x}} = @code{@var{k}} always holds. Even though the present exposition opts for readability over formality, statements like these demand clarification of the notion of equality. Some of the more pedantic points in @ref{Raw Material} may be needed for the following ideas to hold water. @itemize @bullet @item As originally stipulated, it is always possible to distinguish @code{nil} from any other member of the set. We can therefore decide on this basis whether @code{@var{a} = @var{b}} whenever at least one of them is @code{nil}. @item Where neither @code{@var{a}} nor @code{@var{b}} is @code{nil} in an expression @code{@var{a} = @var{b}}, but rather something of the form @code{cons(@var{x},@var{y})}, the equality holds if and only if both pairs of corresponding subexpressions are equal. If at least one member of each pair of corresponding subexpressions is @code{nil}, the question is settled, but otherwise there is recourse to their respective subexpressions, and so on.
This condition follows from the uniqueness property of the @code{cons} operator. @item If one side of an equality is of the form @code{@var{x} @var{y}}, or is defined in terms of such an expression, but @code{(@var{x},@var{y})} is one of those pairs with which the operator associates no result, then the equality holds if and only if the other side is similarly ill defined. @item Statements involving universal quantification (i.e., @cindex universal quantification @cindex undefined expressions beginning with words similar to ``for any tree @code{@var{x}} @dots{}'') obviously do not apply to instances of a variable (@code{@var{x}}) outside the indicated set (trees). Hence, they are not refuted by any consequence of identifying a variable with an undefined expression. @end itemize Readers who are aware of such issues as pointer equality or intensional @cindex pointer equality @cindex pointers versus extensional equality of functions are urged to forget all about them in the context of this document, and abide only by what is stated. Other readers should ignore this paragraph. @node A Minimal Set of Properties, A Simple Lisp Like Language, On Equality, Virtual Code Semantics @subsection A Minimal Set of Properties For any trees @code{@var{x}}, @code{@var{y}}, and @code{@var{k}}, and any non-@code{nil} trees @code{@var{p}}, @code{@var{f}}, and @code{@var{g}}, the new invisible operator satisfies these conditions. In these expressions and hereafter, increasing abuse of notation is perpetrated by not writing the @code{cons} in expressions of the form @code{cons(@var{x},@var{y})}. 
@table @emph @item P0 @code{(nil,(nil,nil)) @var{x}} = @code{@var{x}} @item P1 @code{(nil,((nil,nil),nil)) (@var{x},@var{y})} = @code{@var{x}} @item P2 @code{(nil,(nil,(nil,nil))) (@var{x},@var{y})} = @code{@var{y}} @item P3 @code{((nil,@var{k}),nil) @var{x}} = @code{@var{k}} @item P4 @code{(((nil,(nil,nil)),nil),nil) (@var{f},@var{x})} = @code{@var{f} (@var{f},@var{x})} @item P5 @code{((@var{f},@var{g}),nil) @var{x}} = @code{@var{f} @var{g} @var{x}} @item P6 @code{((@var{f},nil),@var{g}) @var{x}} = @code{(@var{f} @var{x},@var{g} @var{x})} @item P7 @code{((@var{p},@var{f}),@var{g}) @var{x}} = @code{@var{f} @var{x}} if @code{@var{p} @var{x}} is a non-@code{nil} tree, but @code{@var{g} @var{x}} if @code{@var{p} @var{x}} = @code{nil} @end table @cindex properties @cindex operator properties Although other properties remain to be described, it is worth pausing at this point because there is ample food for thought in the ones already given. An obvious question would be that of their origin. The short answer is that they have been chosen arbitrarily to be true by definition of the operator. At best, the completion of the construction may lead to a more satisfactory answer based on aesthetic or engineering grounds. A more important question would be that of the relevance of the mystery operator and its properties to the stated purpose of this section, which is to specify the virtual machine code semantics. The answer lies in that the operator induces a function for any given tree @code{@var{t}}, such that the value returned by the function when given an argument @var{x} is @code{@var{t} @var{x}}. This function is the one that is implemented by the virtual code @var{t}, which is to say the way an application will behave if we put @var{t} in its virtual code file. 
An equivalent way of looking at the situation is that the virtual machine does nothing but compute the result of this operator, taking the tree in the virtual code file as its left operand and the input data as the right operand. By knowing what the operator will do with a given pair of operands, we know what to put into the virtual code file to get the function we want. @cindex universality @cindex Turing equivalence @cindex exceptions @cindex lists It is worthwhile to note that properties @emph{P0} to @emph{P7} are sufficient for universality in the sense of Turing equivalence. That means that any computable function could be implemented by the suitable choice of a tree @var{t} without recourse to any other properties of the operator. A compiler writer who finds this material boring could therefore stop reading at this point and carry out the task of targeting any general purpose programming language to the virtual machine based on the specifications already given. However, such an implementation would not take advantage of the features for list processing, exception handling, or profiling that are also built into the virtual machine and have yet to be described. @node A Simple Lisp Like Language, How @code{avram} Thinks, A Minimal Set of Properties, Virtual Code Semantics @subsection A Simple Lisp Like Language @cindex @code{silly} With a universal computational model already at our disposal, it will be easier to use the virtual machine to specify itself than to define all of it from scratch. For this purpose, we use the @code{silly} programming language, whose name is an acronym for SImple Lisp-like Language (Yeah right). The language serves essentially as a thin layer of symbolic names on top of the virtual machine code. 
Due to its poor support for modularity and abstraction, @code{silly} is not recommended for serious application development, but at least it has a shallow learning curve.@footnote{Previous releases of @code{avram} included a working @code{silly} compiler, but this has now been superseded by the Ursala programming language. Ursala includes @code{silly} as a subset for the most part, and the examples in this manual should compile and execute with very little modification.} @menu * Syntax:: * Semantics:: * Standard Library:: @end menu @node Syntax, Semantics, A Simple Lisp Like Language, A Simple Lisp Like Language @subsubsection Syntax @code{silly} has no reserved words, but it has equals signs, commas and parentheses built in. A concise but ambiguous grammar for it can be given as follows. @cindex syntax @cindex grammar @display @var{program} ::= @var{declaration}* @var{declaration} ::= @var{identifier} @code{=} @var{expression} @iftex @var{expression} ::= @code{()} | @var{identifier} | @code{(@var{expression})} | @code{(@var{expression},@var{expression})} | @var{expression} @var{expression} | @var{expression}@code{(}@var{expression}@code{)} | @var{expression}@code{(}@var{expression}@code{,}@var{expression}@code{)} @end iftex @ifinfo @var{expression} ::= () | @var{identifier} | (@var{expression}) | (@var{expression},@var{expression}) | @var{expression} @var{expression} | @var{expression}(@var{expression}) | @var{expression}(@var{expression},@var{expression}) @end ifinfo @end display @noindent @cindex precedence @cindex operator precedence The real grammar is consistent with this one but enforces right associativity for binary operations and higher precedence for juxtaposition without intervening white space. The declaration of any identifier must be unique and must precede its occurrence in any expression. Hence, cyclic dependences between declarations and ``recursive'' declarations are not allowed. 
@node Semantics, Standard Library, Syntax, A Simple Lisp Like Language @subsubsection Semantics Declarations in @code{silly} should be understood in the obvious way as preprocessor directives to perform parenthetic textual substitutions (similar to @code{#define id (exp)} in C). All identifiers in expressions are thereby eliminated during the preprocessing phase. @cindex semantic function The overall meaning of the program is the meaning of the expression in the last declaration. A denotational semantics for expressions is given by the following equations, where [[@code{@var{x}}]] should be read as ``the meaning of @code{@var{x}}'', and @code{@var{x}}, @code{@var{y}} and @code{@var{z}} are metavariables. (That is, they stand for any source code fragment that could fit there subject to the constraint, informally speaking, that it has to correspond to a connected subtree of the parse tree as enforced by the unambiguous grammar in the context of the rest of the program.) @display [[@code{()}]] = @code{nil} [[@code{(@var{x})}]] = [[@code{@var{x}}]] [[@code{(@var{x},@var{y})}]] = @code{cons(}[[@code{@var{x}}]]@code{,}[[@code{@var{y}}]]@code{)} [[@code{@var{x} @var{y}}]] = [[@code{@var{x}(@var{y})}]] = [[@code{@var{x}}]] [[@code{@var{y}}]] [[@code{@var{x} (@var{y},@var{z})}]] = [[@code{@var{x}(@var{y},@var{z})}]] = [[@code{@var{x}}]] [[@code{(@var{y},@var{z})}]] @end display @noindent Toy languages like this are among the few situations a @cindex denotational semantics denotational semantics stands a chance of clarifying more than it obfuscates, so the reader is invited to take a moment to savor it. @node Standard Library, , Semantics, A Simple Lisp Like Language @subsubsection Standard Library @code{silly} programs may be linked with library modules, which consist of @code{silly} source text to be concatenated with the user @cindex library modules @cindex standard prelude program prior to the preprocessing phase. 
Most @code{silly} programs are linked with the standard @code{silly} prelude, which contains the following declarations among others. @cindex @code{nil} @cindex @code{identity} @cindex @code{left} @cindex @code{right} @cindex @code{meta} @cindex @code{constant} @cindex @code{couple} @cindex @code{compose} @cindex @code{conditional} @example nil = () identity = (nil,(nil,nil)) left = (nil,((nil,nil),nil)) right = (nil,(nil,(nil,nil))) meta = (((nil,(nil,nil)),nil),nil) constant_nil = ((nil,nil),nil) couple = ((((left,nil),constant_nil),nil),right) compose = couple(identity,constant_nil) constant = couple(couple(constant_nil,identity),constant_nil) conditional = couple(couple(left,compose(left,right)),compose(right,right)) @end example There is a close correspondence between these declarations and the properties described in @ref{A Minimal Set of Properties}. A fitting analogy would be that the properties of the operator specify the virtual machine instruction set in a language independent way, and the @code{silly} library defines the instruction mnemonics for a virtual assembly language. The @cindex mnemonics relationship of some mnemonics to their corresponding instructions may be less clear than that of others, so they are all discussed next. @node How @code{avram} Thinks, Variable Freedom, A Simple Lisp Like Language, Virtual Code Semantics @subsection How @code{avram} Thinks The definitions in the standard @code{silly} library pertaining to the basic properties of the operator can provide a good intuitive illustration of how computations are performed by @code{avram}. This task is approached in the guise of a few trivial correctness proofs about them. Conveniently, as an infeasibly small language, @code{silly} is an ideal candidate for analysis by formal methods. 
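
The prelude declarations shown above can also be checked mechanically. The following Python sketch is only an illustration under stated assumptions: @code{nil} is modelled by @code{None}, @code{cons} by two-element tuples, and juxtaposition in @code{silly} by a call to a hypothetical function @code{ev}, which restates properties @emph{P0} through @emph{P7} so that the block runs on its own. Nothing here is taken from the @code{avram} sources.

```python
NIL = None  # assumed encoding: nil is None, cons(x, y) is the tuple (x, y)


def ev(t, x):
    """Apply tree t to tree x using properties P0-P7 only (a sketch)."""
    if t is NIL:
        raise ValueError("nil as left operand")
    a, b = t
    if a is NIL:
        if t == identity:                  # P0
            return x
        if t == left:                      # P1
            return x[0]
        if t == right:                     # P2
            return x[1]
        raise ValueError("not covered by P0-P7")
    u, v = a
    if b is NIL:
        if u is NIL:                       # P3
            return v
        if v is not NIL:                   # P5
            return ev(u, ev(v, x))
        if t == meta:                      # P4
            return ev(x[0], x)
        raise ValueError("not covered by P0-P7")
    if u is NIL:
        raise ValueError("not covered by P0-P7")
    if v is NIL:                           # P6
        return (ev(u, x), ev(b, x))
    return ev(v if ev(u, x) is not NIL else b, x)   # P7


# The prelude, with the silly expression f(x) written as ev(f, x).
nil = NIL
identity = (nil, (nil, nil))
left = (nil, ((nil, nil), nil))
right = (nil, (nil, (nil, nil)))
meta = (((nil, (nil, nil)), nil), nil)
constant_nil = ((nil, nil), nil)
couple = ((((left, nil), constant_nil), nil), right)
compose = ev(couple, (identity, constant_nil))
constant = ev(couple, (ev(couple, (constant_nil, identity)), constant_nil))
conditional = ev(couple, (ev(couple, (left, ev(compose, (left, right)))),
                          ev(compose, (right, right))))

# Spot checks on two sample trees p and q.
p, q = (nil, nil), ((nil, nil), nil)
print(ev(couple, (left, right)) == ((left, nil), right))     # True
print(ev(ev(compose, (left, right)), (p, (q, p))) == q)      # True
print(ev(constant, q) == ((nil, q), nil))                    # True
choose = ev(conditional, (left, (ev(constant, p), ev(constant, q))))
print(ev(choose, (q, nil)) == p, ev(choose, (nil, nil)) == q)  # True True
```

The checks only sample a few trees, so they illustrate the correspondence between the mnemonics and the operator properties without proving it.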
Technically the semantic function [[@dots{}]] has not been defined on identifiers, but we can easily extend it to them by stipulating that the meaning of an identifier @code{@var{x}} is the meaning of the program @cindex identifiers @code{@var{main} = @var{x}} when linked with a library containing the declaration of @code{@var{x}}, where @code{@var{main}} is an identifier not appearing elsewhere in the library. With this idea in mind, the following ``theorems'' can be stated, all of which have a similar proof. The variables @var{x} and @var{y} stand for any tree, and the variable @var{f} stands for any tree other than @code{nil}. @table @emph @item T0 [[@code{identity}]] @code{@var{x}} = @code{@var{x}} @item T1 [[@code{left}]] @code{(@var{x},@var{y})} = @code{@var{x}} @item T2 [[@code{right}]] @code{(@var{x},@var{y})} = @code{@var{y}} @item T4 [[@code{meta}]] @code{(@var{f},@var{x})} = @code{@var{f} (@var{f},@var{x})} @item T5 [[@code{constant_nil}]] @code{@var{x}} = @code{nil} @end table @noindent Replacing each identifier with its defining expression directly demonstrates a logical equivalence between the relevant theorem and one of the basic operator properties postulated in @ref{A Minimal Set of Properties}. For more of a challenge, it is possible to prove the next theorem. @table @emph @item T6 For non-@code{nil} @code{@var{f}} and @code{@var{g}}, ([[@code{couple}]] @code{(@var{f},@var{g})}) @code{@var{x}} = @code{(@var{f} @var{x},@var{g} @var{x})} @end table @noindent The proof is a routine calculation. Beware of the distinction between the mathematical @code{nil} and the @code{silly} identifier @code{nil}. 
@ifnotinfo @format ([[@code{couple}]] @code{(@var{f},@var{g})}) @code{@var{x}} = ([[@code{((((left,nil),constant_nil),nil),right)}]] @code{(@var{f},@var{g})}) @code{@var{x}} by substitution of @code{couple} with its definition in the standard library = (@code{((((}[[@code{left}]]@code{,}[[@code{nil}]]@code{),}[[@code{constant_nil}]]@code{),}[[@code{nil}]])@code{,}[[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by definition of the semantic function [[@dots{}]] regarding pairs = (@code{((((}[[@code{left}]]@code{,}[[@code{()}]]@code{),}[[@code{constant_nil}]]@code{),}[[@code{()}]])@code{,}[[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by substitution of @code{nil} from its definition in the standard library = (@code{((((}[[@code{left}]]@code{,}@code{nil}@code{),}[[@code{constant_nil}]]@code{),}@code{nil})@code{,}[[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by definition of the semantic function in the case of [[@code{()}]] = (@code{(}[[@code{left}]] @code{(@var{f},@var{g}),}[[@code{constant_nil}]] @code{(@var{f},@var{g})),}[[@code{right}]] @code{(@var{f},@var{g})}) @code{@var{x}} by property @emph{P6} (twice) = @code{((@var{f},nil),@var{g}) @var{x}} by theorems @emph{T1}, @emph{T2}, and @emph{T5} = @code{(@var{f} @var{x},@var{g} @var{x})} by property @emph{P6} again. 
@end format @end ifnotinfo @ifinfo @ifnothtml @format ([[@code{couple}]] @code{(@var{f},@var{g})}) @code{@var{x}} = ([[@code{((((left,nil),constant_nil),nil),right)}]] @code{(@var{f},@var{g})}) @code{@var{x}} by substitution of @code{couple} with its definition in the standard library = ( @code{( (((}[[@code{left}]]@code{,}[[@code{nil}]]@code{),}[[@code{constant_nil}]]@code{),}[[@code{nil}]])@code{,} [[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by definition of the semantic function [[@dots{}]] regarding pairs = ( @code{( (((}[[@code{left}]]@code{,}[[@code{()}]]@code{),}[[@code{constant_nil}]]@code{),}[[@code{()}]])@code{,} [[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by substitution of @code{nil} from its definition in the standard library = ( @code{( (((}[[@code{left}]]@code{,}@code{nil}@code{),}[[@code{constant_nil}]]@code{),}@code{nil})@code{,} [[@code{right}]]@code{)} @code{(@var{f},@var{g})}) @code{@var{x}} by definition of the semantic function in the case of [[@code{()}]] = ( @code{(} [[@code{left}]] @code{(@var{f},@var{g}),}[[@code{constant_nil}]] @code{(@var{f},@var{g})),} [[@code{right}]] @code{(@var{f},@var{g})}) @code{@var{x}} by property @emph{P6} (twice) = @code{((@var{f},nil),@var{g}) @var{x}} by theorems @emph{T1}, @emph{T2}, and @emph{T5} = @code{(@var{f} @var{x},@var{g} @var{x})} by property @emph{P6} again. @end format @end ifnothtml @end ifinfo Something to observe about this proof is that it might just as well have been done automatically. Every step is either the substitution of an identifier or a pattern match against existing theorems and properties of the operator. Another thing to note is that the use of identifiers and previously established theorems helps to make the proof human readable, but is not a logical necessity. An equivalent proof could have been expressed entirely in terms of the properties of the operator.
If one envisions a proof like this being performed blindly and mechanically, without the running commentary or other amenities, that would not be a bad way of thinking about what takes place when @code{avram} executes virtual code. Three more theorems have similar proofs. For non-@code{nil} trees @code{@var{p}}, @code{@var{f}} and @code{@var{g}}, and any trees @code{@var{x}} and @code{@var{k}}: @cindex @code{compose} @cindex @code{constant} @cindex @code{conditional} @table @emph @item T7 ([[@code{compose}]] @code{(@var{f},@var{g})}) @code{@var{x}} = @code{@var{f} @var{g} @var{x}} @item T8 ([[@code{constant}]] @code{@var{k}}) @code{@var{x}} = @code{@var{k}} @item T9 ([[@code{conditional}]] @code{(@var{p},(@var{f},@var{g}))}) @code{@var{x}} = @code{@var{f} @var{x}} if @code{@var{p} @var{x}} is non-@code{nil}, but @code{@var{g} @var{x}} if @code{@var{p} @var{x}} = @code{nil} @end table @noindent The proofs of these theorems are routine calculations analogous to the proof of @emph{T6}. Here is a proof of theorem @emph{T7} for good measure.
@ifnotinfo @format ([[@code{compose}]] @code{(@var{f},@var{g})}) @code{@var{x}} = ([[@code{couple(identity,constant_nil)}]] @code{(@var{f},@var{g})}) @code{@var{x}} @end format @end ifnotinfo @ifinfo @ifnothtml @format ([[@code{compose}]] @code{(@var{f},@var{g})}) @code{@var{x}} = ([[@code{couple(identity,constant_nil)}]] @code{(@var{f},@var{g})}) @code{@var{x}} @end format @end ifnothtml @end ifinfo @iftex @display @end display @noindent @end iftex by substitution of @code{compose} with its definition in the standard library @format = (@code{(}[[@code{couple}]] @code{(}[[@code{identity}]]@code{,}[[@code{constant_nil}]]@code{))(@var{f},@var{g})}) @code{@var{x}} by definition of the semantic function = @code{(}[[@code{identity}]] @code{(@var{f},@var{g}),}[[@code{constant_nil}]]@code{ (@var{f},@var{g})) @var{x}} by theorem @emph{T6} = @code{((@var{f},@var{g}),nil) @var{x}} by theorems @emph{T0} and @emph{T5} = @code{@var{f} @var{g} @var{x}} by property @emph{P5} of the operator. @end format @node Variable Freedom, Metrics and Maintenance, How @code{avram} Thinks, Virtual Code Semantics @subsection Variable Freedom The virtual code semantics is easier to specify using the @code{silly} language than it would be without it, but still awkward in some cases. An example is the following declaration from the standard library, @cindex @code{hired} @example hired = compose( compose, couple( constant compose, compose(couple,couple(constant,constant couple)))) @end example @noindent which is constructed in such a way as to imply the following theorem, provable by routine computation. @table @emph @item T9 @code{(}[[@code{hired}]] @code{@var{h}) (@var{f},@var{g})} = [[@code{compose}]]@code{(@var{h},}[[@code{couple}]]@code{(@var{f},@var{g}))} @end table @noindent Intuitively, @code{hired} represents a function that takes a given function to a higher order function. 
For example, if @code{f} were a function that adds two real numbers, @code{hired f} would be a function that takes two real valued functions to their pointwise sum. Apart from its cleverness, such an opaque way of defining a function has little to recommend it. The statement of the theorem about the function is more readable than the function definition itself, probably because the theorem liberally employs mathematical variables, whereas the @code{silly} language is variable free. On the other hand, it is not worthwhile to linger over further enhancements to the language, such as adding variables to it. The solution will be to indicate informally a general method of inferring a variable free function definition from an expression containing variables, and hereafter omit the more cumbersome definitions. @cindex @code{isolate} @cindex variables An algorithm called @code{isolate} does the job. The input to @code{isolate} is a pair @code{(@var{e},@var{x})}, where @code{@var{e}} is a syntactically correct @code{silly} expression in which the identifier @code{@var{x}} may occur, but no other identifiers dependent on @code{@var{x}} may occur (or else it's garbage-in/garbage-out). Output is a syntactically correct @code{silly} expression @code{@var{f}} in which the identifier @code{@var{x}} does not occur, such that [[@code{@var{e}}]] = [[@code{@var{f} @var{x}}]]. 
The algorithm is as follows, @display if @code{@var{e}} = @code{@var{x}} then return @code{identity} else if @code{@var{e}} is of the form @code{(@var{u},@var{v})} return @code{couple(isolate(@var{u},@var{x}),isolate(@var{v},@var{x}))} else if @code{@var{e}} is of the form @code{@var{u} @var{v}} return @code{(hired apply)(isolate(@var{u},@var{x}),isolate(@var{v},@var{x}))} else return @code{constant @var{e}} @end display @noindent @cindex equality where equality is by literal comparison of expressions, and the definition of @code{apply} is @cindex @code{apply} @example apply = (hired meta)((hired compose)(left,constant right),right) @end example @noindent which represents a function that does the same thing as the invisible operator. @table @emph @item T10 [[@code{apply}]] @code{(@var{f},@var{x})} = @code{@var{f} @var{x}} @end table The @code{isolate} algorithm can be generalized to functions of arbitrarily many variables, but in this document we will need only a unary and a binary version. The latter takes an expression @code{@var{e}} and a pair of identifiers @code{(@var{x},@var{y})} as input, and returns an expression @code{@var{f}} such that [[@code{@var{e}}]] = [[@code{@var{f} (@var{x},@var{y})}]]. @display if @code{@var{e}} = @code{@var{x}} then return @code{left} else if @code{@var{e}} = @code{@var{y}} then return @code{right} else if @code{@var{e}} is of the form @code{(@var{u},@var{v})} return @code{couple(isolate(@var{u},(@var{x},@var{y})),isolate(@var{v},(@var{x},@var{y})))} else if @code{@var{e}} is of the form @code{@var{u} @var{v}} return @code{(hired apply)(isolate(@var{u},(@var{x},@var{y})),isolate(@var{v},(@var{x},@var{y})))} else return @code{constant @var{e}} @end display It might be noted in passing that something similar to these algorithms would be needed in a compiler targeted to @code{avram} if the source were a functional language with variables. 
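
The unary algorithm can be sketched in Python as follows. This is only an
illustration under stated assumptions, not part of @code{avram} or of any
@code{silly} compiler. The tuple encoding of expressions, the helper
names @code{show} and @code{evaluate}, and the user functions @code{add}
and @code{double} are all hypothetical, and the combinators are modeled
as ordinary Python functions rather than as virtual code.

@example
# Expressions: an identifier is a str, a pair (u,v) is
# ("pair", u, v), and an application u v is ("app", u, v).

def isolate(e, x):
    # Return a variable free expression f such that f x means e.
    if e == x:
        return "identity"
    if isinstance(e, tuple):
        tag, u, v = e
        both = ("pair", isolate(u, x), isolate(v, x))
        if tag == "pair":
            return ("app", "couple", both)
        return ("app", ("app", "hired", "apply"), both)
    return ("app", "constant", e)

def show(e):
    # Render an expression in a silly-like concrete syntax.
    if isinstance(e, str):
        return e
    tag, u, v = e
    if tag == "pair":
        return "(" + show(u) + "," + show(v) + ")"
    head = show(u) if isinstance(u, str) else "(" + show(u) + ")"
    if isinstance(v, tuple) and v[0] == "pair":
        return head + show(v)
    return head + " " + show(v)

def evaluate(e, env):
    # Interpret an expression, looking identifiers up in env.
    if isinstance(e, str):
        return env[e]
    tag, u, v = e
    if tag == "pair":
        return (evaluate(u, env), evaluate(v, env))
    return evaluate(u, env)(evaluate(v, env))

def couple(fg):
    return lambda x: (fg[0](x), fg[1](x))

def hired(h):
    # Theorem T9: (hired h)(f,g) = compose(h,couple(f,g))
    return lambda fg: (lambda x: h(couple(fg)(x)))

env = dict(
    identity=lambda x: x,
    constant=lambda k: (lambda x: k),
    couple=couple,
    hired=hired,
    apply=lambda fx: fx[0](fx[1]),   # theorem T10
    add=lambda p: p[0] + p[1],       # hypothetical user function
    double=lambda n: 2 * n,          # hypothetical user function
)

# The expression add(x,double x), with x as the variable to isolate.
e = ("app", "add", ("pair", "x", ("app", "double", "x")))
f = isolate(e, "x")
print(show(f))
print(evaluate(f, env)(5) == evaluate(e, dict(env, x=5)))
@end example

@noindent
The first line printed is the variable free expression
@code{(hired apply)(constant add,couple(identity,(hired apply)(constant double,identity)))},
and the second is @code{True}, confirming for this one input that
applying the isolated function to a value agrees with evaluating the
original expression with @code{x} bound to that value.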
@node Metrics and Maintenance, Deconstruction, Variable Freedom, Virtual Code Semantics @subsection Metrics and Maintenance Certain features of the virtual machine pertain to software development and maintenance more than to implementing any particular function. The operations with the mnemonics @code{version}, @code{note}, @code{profile}, and @code{weight} are in this category. @menu * Version:: * Note:: * Profile:: * Weight:: @end menu @node Version, Note, Metrics and Maintenance, Metrics and Maintenance @subsubsection Version A virtual code application with exactly the following definition implements a function that returns a constant character string regardless of its argument. @cindex @code{version} @example version = ((nil,nil),((nil,nil),(nil,((nil,nil),nil)))) @end example @noindent The character string encodes the version number of the installed @code{avram} executable, for example @code{@value{VERSION}}, using the standard representation for characters. Although such an application is useless by itself, the intended use for this feature is to cope with the possibility that future versions of @code{avram} may include enhancements. Ideally, the maintainer of @code{avram} will update the version number when new enhancements are added. Applications can then detect whether they are available in the installed version by using this feature. If a needed enhancement is not available, the application can either make allowances or at least terminate gracefully. @node Note, Profile, Version, Metrics and Maintenance @subsubsection Note This operation allows arbitrary information or comments to be embedded in a virtual code application in such a way that it will be ignored by @code{avram} when executing it. For the @code{silly} language, a @code{note} function is defined in the standard prelude so as to imply the following theorem. 
@cindex @code{note} @cindex annotations @table @emph @item T11 [[@code{note}]] @code{(@var{f},@var{c})} = @code{((nil,nil),((nil,nil),(nil,(nil,(@var{f},@var{c})))))} @end table @noindent Intuitively, the argument @code{@var{f}} represents a function, and the argument @code{@var{c}} represents the comment, annotation, or whatever, that will be embedded but ignored in the virtual code. Semantically, a function with a note attached is the same as the function by itself, as the following property stipulates for any non-@code{nil} @code{@var{f}}. @table @emph @item P8 ([[@code{note}]] @code{(@var{f},@var{c})}) @code{@var{x}} = @code{@var{f} @var{x}} @end table A possible reason for using this feature might be to support a language that performs run-time type checking by hanging type tags on @cindex type tags everything. Another possible use would be to include symbolic information needed by a debugger. @node Profile, Weight, Note, Metrics and Maintenance @subsubsection Profile The virtual machine supports a profiling capability by way of this @cindex @file{profile.txt} feature. Profiling an application causes run time statistics about it to be written to a file @file{./profile.txt}. Profiled applications are of the form indicated in the following theorem, @table @emph @item T12 [[@code{profile}]] @code{(@var{f},@var{s})} = @code{((nil,nil),((nil,nil),(nil,((@var{f},@var{s}),nil))))} @end table @noindent where @code{@var{f}} stands for the virtual code of the application, and @code{@var{s}} stands for the name under which it is to be recorded in the file. The semantics of a profiled function is identical to the unprofiled form for any non-@code{nil} @code{@var{f}}.
@table @emph @item P9 ([[@code{profile}]] @code{(@var{f},@var{s})}) @code{@var{x}} = @code{@var{f} @var{x}} @end table Unlike the situation with @code{note}, the annotation @code{@var{s}} @cindex @code{note} used in profiled code is not in an unrestricted format but must be a character string in the standard representation (as in @ref{Representation of Numeric and Textual Data}), because this string needs to be written by @code{avram} to the file @file{./profile.txt}. Ordinarily this string will be the source code identifier of the function being profiled. When profiles are used in many parts of an application, an informative table is generated showing the time spent in each part. @node Weight, , Profile, Metrics and Maintenance @subsubsection Weight The following virtual code implements a function that returns the weight of its argument in the standard representation for natural numbers. @cindex @code{weight} @example weight = ((nil,nil),((nil,nil),(nil,(nil,nil)))) @end example @noindent The weight of a tree is zero if the tree is @code{nil}, and otherwise the sum of the weights of the two subtrees plus one. An algorithm to compute the weight of a tree would be trivial to implement without being built in to the virtual machine. The built in capability could also be used for purposes unrelated to code maintenance. Nevertheless, it is built in for the following reasons. @itemize @bullet @item Computing weights happened to be a bottleneck for a particular aspect of code generation that was of interest to the author, @cindex compression namely the compression of generated code. @item A built in implementation in C runs at least an order of magnitude faster than the equivalent implementation in virtual code. @item It runs even faster when repeated on the same data, by retrieving previously calculated weights rather than recalculating them.
@end itemize The only disadvantage of using this feature instead of implementing a function in virtual code to compute weights is that it relies on native @cindex native integer arithmetic @cindex overflow integer arithmetic and could overflow, causing a fatal error. Such an overflow has never occurred in practice, but is possible due to sharing, whereby the nominal weight of a tree could be exponentially larger than the actual amount of memory occupied by it. @node Deconstruction, Recursion, Metrics and Maintenance, Virtual Code Semantics @subsection Deconstruction Much of the time required for evaluating a function is devoted to @cindex deconstruction performing deconstruction operations, e.g., taking the left side of a pair, the tail of a list, the right side of the head of the tail, etc. Because these operations are so frequent, there are some features of the virtual machine to make them as efficient as possible. @menu * Field:: * Fan:: @end menu @node Field, Fan, Deconstruction, Deconstruction @subsubsection Field The virtual machine supports a generalization of the @code{left} and @cindex @code{left} @cindex @code{right} @code{right} deconstruction operations that is applicable to deeply nested structures. Use of this feature is conducive to code that is faster and more compact than is possible by relying on the primitive deconstructors alone. It may also be easier for a code optimizer to recognize and transform. The general form of a virtual code application to perform deconstruction is that it is a pair with a @code{nil} left side, and a non-@code{nil} right side. The right side indicates the nature of the deconstruction to be performed when the function is evaluated on an argument. To make the expression of deconstruction functions more readable in @code{silly}, the standard library contains the declaration @example field = couple(constant nil,identity) @end example @noindent which implies the following theorem.
@table @emph @item T13 [[@code{field}]] @code{@var{w}} = @code{(nil,@var{w})} @end table @cindex @code{field} The virtual machine recognizes an application in this form and evaluates it according to the following properties, where @code{@var{u}} and @code{@var{v}} are other than @code{nil}, but @code{@var{x}}, @code{@var{y}}, and @code{@var{z}} are unrestricted. @table @emph @item P10 ([[@code{field}]] @code{(@var{u},nil)}) @code{(@var{x},@var{y})} = ([[@code{field}]] @code{@var{u}}) @code{@var{x}} @item P11 ([[@code{field}]] @code{(nil,@var{v})}) @code{(@var{x},@var{y})} = ([[@code{field}]] @code{@var{v}}) @code{@var{y}} @item P12 ([[@code{field}]] @code{(@var{u},@var{v})}) @code{@var{z}} = @code{((}[[@code{field}]] @code{@var{u}) @var{z},(}[[@code{field}]] @code{@var{v}) @var{z})} @end table @noindent One might also add that ([[@code{field}]] @code{(nil,nil)}) @code{@var{z}} = @code{@var{z}}, but this statement would be equivalent to @emph{P0}. A suitable choice of the @code{field} operand permits the implementation of any deconstruction function expressible in terms of @code{compose}, @code{couple}, @code{identity}, @code{left} and @code{right}. For example, the application @code{couple(compose(right,right),left)} has an equivalent representation in @code{field((nil,(nil,(nil,nil))),((nil,nil),nil))}. The latter looks longer in @code{silly} but is smaller and faster in virtual code. @node Fan, , Field, Deconstruction @subsubsection Fan @cindex @code{fan} In cases where a deconstruction would be needed to apply the same function to both sides of a pair, the overhead can be avoided by means of a property of the virtual machine intended for that purpose. A @code{silly} definition of @code{fan} implying the following theorem is helpful in expressing such an application.
@table @emph @item T14 [[@code{fan}]] @code{@var{f}} = @code{((nil,nil),((nil,@var{f}),(nil,nil)))} @end table @noindent The virtual machine recognizes when an application has the form shown above, and uses @code{@var{f}} as a function to be applied to both sides of the argument. @table @emph @item P13 ([[@code{fan}]] @code{@var{f}}) @code{(@var{x},@var{y})} = @code{(@var{f} @var{x},@var{f} @var{y})} @end table @node Recursion, Assignment, Deconstruction, Virtual Code Semantics @subsection Recursion @cindex recursion Defining functions or programs self referentially is sometimes informally known as recursion. In functional languages, the clever use @cindex combinators @cindex functional programming of ``combinators'' is often preferred to this practice, and is in fact well supported by the virtual machine. However, some computations may be inexpressible without an explicitly ``recursive'' formulation, so there is some support for that as well. @menu * Recur:: * Refer:: @end menu @node Recur, Refer, Recursion, Recursion @subsubsection Recur @cindex @code{meta} @cindex @code{recur} A generalization of the form denoted by @code{meta} in @code{silly} is recognized by the virtual machine and allows a slightly more efficient encoding of recursive applications. An expression @code{recur @var{p}} has the representation indicated by this theorem, @table @emph @item T15 [[@code{recur}]] @code{@var{p}} = @code{(((nil,@var{p}),nil),nil)} @end table @noindent which implies that [[@code{meta}]] = [[@code{recur}]] @code{(nil,nil)}. If @code{@var{p}} is non-@code{nil}, a tree of the form [[@code{recur}]] @code{@var{p}} is interpreted as follows. Note that @emph{P4} is equivalent to the special case of this property for which @code{@var{p}} is @code{(nil,nil)}. 
@table @emph @item P14 ([[@code{recur}]] @code{@var{p}}) @code{@var{x}} = [[@code{meta}]] ([[@code{field}]] @code{@var{p}}) @code{@var{x}} @end table The rationale is that @code{meta} would very frequently be composed with a deconstruction @code{field @var{p}}, so the virtual machine saves some time and space by allowing the two of them to be encoded in a smaller tree with the combined meaning. @node Refer, , Recur, Recursion @subsubsection Refer @cindex @code{refer} In the style of recursive programming compelled by the available @code{meta} primitive, a function effectively requires a copy of its own machine code as its left argument. Bringing about that state of affairs is an interesting party trick. @cindex @code{bu} If we had a definition of @code{bu} in the standard library implying @table @emph @item T16 ([[@code{bu}]] @code{(@var{f},@var{k})}) @code{@var{x}} = @code{@var{f}(@var{k},@var{x})} @end table @noindent which for the sake of concreteness can be done like this, @example bu = (hired compose)( left, (hired couple)(compose(constant,right),constant identity)) @end example @noindent then a definition of @code{refer} as @example refer = (hired bu)(identity,identity) @end example @noindent would be consistent with the following property of the operator. @table @emph @item P15 ([[@code{refer}]] @code{@var{f}}) @code{@var{x}} = @code{@var{f} (@var{f},@var{x})} @end table @noindent The proof, as always, is a matter of routine calculation in the manner of the section on how @code{avram} thinks. However, this pattern would occur so frequently in recursively defined functions as to be a significant waste of space and time. 
Therefore, rather than requiring it to be defined in terms of other operations, the virtual machine specification recognizes a pattern of the form below with respect to property @emph{P15}, @table @emph @item T17 [[@code{refer}]] @code{@var{f}} = @code{(((@var{f},nil),nil),nil)} @end table @noindent and takes the property to be true by definition of the operator. A definition of @code{refer} consistent with @emph{T17} is therefore to @cindex standard library be found in the standard library, not the definition proposed above. @node Assignment, Predicates, Recursion, Virtual Code Semantics @subsection Assignment @cindex assignment @cindex imperative programming In an imperative programming paradigm, a machine consists partly of an ensemble of addressable storage locations, whose contents are changed over time by assignment statements. An assignment statement includes some computable function of the global machine state, and the address of the location whose contents will be overwritten with the value computed from the function when it is evaluated. Compiling a language containing assignment statements into virtual machine code suitable for @code{avram} might be facilitated by exploiting the following property. @table @emph @item P16 ([[@code{assign}]] @code{(@var{p},@var{f})}) @code{@var{x}} = [[@code{replace}]] @code{((@var{p},@var{f} @var{x}),@var{x})} @end table @noindent The identifier @code{assign} is used in @code{silly} to express a virtual code fragment having the form shown below, and @code{replace} corresponds to a further operation to be explained presently. @cindex @code{assign} @table @emph @item T18 [[@code{assign}]] @code{(@var{p},@var{f})} = @code{(((@var{p},@var{f}),nil),nil)} @end table This feature simulates assignment statements in the following way. The variable @code{@var{x}} in @emph{P16} corresponds intuitively to the set of addressable locations in the machine. 
The variable @code{@var{f}} corresponds to the function whose value will be stored in the location addressed by @code{@var{p}}. The result of a function expressed using @code{assign} is a new store similar to the argument @code{@var{x}}, but with the part of it in location @code{@var{p}} replaced by @code{@var{f} @var{x}}. A source text with a sequence of assignment statements could therefore be translated directly into a functional composition of trees in this form. @cindex storage locations The way storage locations are modeled in virtual code using this feature would be as nested pairs, and the address @code{@var{p}} of a location is a tree interpreted similarly to the trees used as operands to the @code{field} operator described in @ref{Field}, to specify deconstructions. In fact, @code{replace} can be defined as a minimal solution to the following equation. @cindex @code{replace} @table @emph @item E0 ([[@code{field}]] @code{@var{p}}) [[@code{replace}]] @code{((@var{p},@var{y}),@var{x})} = @code{@var{y}} @end table This equation regrettably does not lend itself to inferring the @code{silly} source for @code{replace} @cindex @code{isolate} using the @code{isolate} algorithm in @ref{Variable Freedom}, so an explicit construction is given in @ref{Replace}. This construction need not concern a reader who considers the equation a sufficiently precise specification in itself. In view of the way addresses for deconstruction are represented as trees, it would be entirely correct to infer from this equation that a tuple of values computed together can be assigned to a tuple of locations. The locations don't even have to be ``contiguous'', but could be anywhere in the tree representing the store, and the function is computed from the contents of all of them prior to the update. 
Hence, this simulation of assignment fails to capture the full inconvenience of imperative programming except in the special case of a single value assigned to a single location, but fortunately this case is the only one most languages allow. There is another benefit to this feature besides running languages with assignment statements in them, which is the support of abstract or opaque data structures. A function that takes an abstract data structure as an argument and returns something of the same type can be coded in a way that is independent of the fields it doesn't use. For example, a data structure with three fields having the field identifiers @code{foo}, @code{bar}, and @code{baz} in some source language might be represented as a tuple @code{((@var{foo contents},@var{bar contents}),@var{baz contents})} on the virtual code level. Compile time constants like @code{bar = ((nil,(nil,nil)),nil)} could be defined in an effort to hide the details of the representation, so that the virtual code @code{field bar} is used instead of @code{compose(right,left)}. Using field identifiers appropriately, a function that transforms such a structure by operating on the @code{bar} field could have the virtual @cindex @code{field} code @code{couple(couple(field foo,compose(f,field bar)),field baz)}. However, this code does not avoid depending on the representation of the data structure, because it relies on the assumption of the @code{foo} field being on the left of the left, and the @code{baz} field being on the right. On the other hand, the code @code{assign(bar,compose(f,field bar))} does the same job without depending on anything but the position of the @code{bar} field. Furthermore, if this position were to change relative to the others, the code maintenance would be limited to a recompilation. 
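
The behavior of @code{field}, @code{replace}, and @code{assign} on a
store of nested pairs can be sketched in Python as follows. This is a
minimal illustration under stated assumptions, not the construction
given in @ref{Replace}. Trees are modeled with @code{None} for
@code{nil} and two-element tuples for pairs, strings stand in for
arbitrary field contents, @code{replace} takes its three operands as
separate parameters, the store is assumed to contain every addressed
location already, and a pair of addresses is assumed to be updated left
side first. The address @code{foo} is derived for the example from the
layout of the tuple, whereas @code{bar} is the constant given above.

@example
NIL = None            # the empty tree
WHOLE = (NIL, NIL)    # address of the entire argument

def field(p):
    # Deconstructor for address p, after P10, P11, and P12.
    def run(z):
        if p == WHOLE:
            return z
        u, v = p
        if v is NIL:
            return field(u)(z[0])
        if u is NIL:
            return field(v)(z[1])
        return (field(u)(z), field(v)(z))
    return run

def replace(p, y, x):
    # Copy of store x with the part at address p replaced by y,
    # so that field(p)(replace(p, y, x)) == y as in equation E0.
    if p == WHOLE:
        return y
    u, v = p
    if v is NIL:
        return (replace(u, y, x[0]), x[1])
    if u is NIL:
        return (x[0], replace(v, y, x[1]))
    # Assumption: a pair of addresses takes a pair of values.
    return replace(v, y[1], replace(u, y[0], x))

def assign(p, f):
    # Property P16: the new store holds f x at address p.
    return lambda x: replace(p, f(x), x)

foo = (((NIL, NIL), NIL), NIL)    # left of the left (derived)
bar = ((NIL, (NIL, NIL)), NIL)    # right of the left
store = (("foo contents", "bar contents"), "baz contents")

# Transform only the bar field, leaving the others untouched.
shout = assign(bar, lambda x: field(bar)(x).upper())
# Assign a tuple of values to a tuple of locations at once.
swap = assign((foo, bar), lambda x: (field(bar)(x), field(foo)(x)))
print(shout(store))
print(swap(store))
@end example

@noindent
The function @code{shout} changes only the @code{bar} field and mentions
no other field position. The function @code{swap} exchanges the contents
of @code{foo} and @code{bar}, which works because both new values are
computed from the original store before either location is overwritten.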
@node Predicates, Iteration, Assignment, Virtual Code Semantics @subsection Predicates @cindex predicates A couple of operations are built into the virtual machine for performing tests efficiently. These functions return either @code{nil} for false or @code{(nil,nil)} for true, and are useful for example as a predicate @code{@var{p}} in programs of the form @code{conditional(@var{p},(@var{f},@var{g}))} among other things. In this example, the predicate is applied to the argument, a result of @code{(nil,nil)} causes @code{@var{f}} to be applied to it, and a result of @code{nil} causes @code{@var{g}} to be applied to it. @menu * Compare:: * Member:: @end menu @node Compare, Member, Predicates, Predicates @subsubsection Compare @cindex @code{compare} A function that performs comparison has the following very simple virtual code representation. @table @emph @item T19 [[@code{compare}]] = @code{(nil,nil)} @end table @noindent The proof of theorem @emph{T19} is that the standard @code{silly} prelude contains the declaration @code{compare = (nil,nil)}. Code in this form has the following semantics. @table @emph @item P17 For distinct trees @code{@var{x}} and @code{@var{y}}, [[@code{compare}]] @code{(@var{x},@var{y})} = @code{nil} @item P18 [[@code{compare}]] @code{(@var{x},@var{x})} = @code{(nil,nil)} @end table @noindent @cindex equality In other words, the virtual code @code{(nil,nil)} implements a function that takes a pair of trees and returns true if and only if they are equal. It would be fairly simple to write an equivalent virtual code application that implements this function if it were not realizable in this form by definition of the operator. However, this method is preferable because it saves space in virtual code and has a highly optimized implementation in C. @node Member, , Compare, Predicates @subsubsection Member Another built in predicate function has the virtual code shown below.
@cindex @code{member} @table @emph @item T20 [[@code{member}]] = @code{((nil,nil),((nil,nil),nil))} @end table @noindent As the mnemonic suggests, the function implemented by this code detects whether a given item is a member of a list. The left side of its argument is the item to be detected, and the right side is the list that may or may not contain it. Lists are represented as explained in @ref{Representation of Numeric and Textual Data}. The virtual code semantics can be specified by these three properties of the operator. @table @emph @item P19 [[@code{member}]] @code{(@var{x},nil)} = @code{nil} @item P20 [[@code{member}]] @code{(@var{x},(@var{x},@var{y}))} = @code{(nil,nil)} @item P21 For distinct trees @code{@var{x}} and @code{@var{y}}, [[@code{member}]] @code{(@var{x},(@var{y},@var{z}))} = [[@code{member}]] @code{(@var{x},@var{z})} @end table As in the case of @code{compare}, the implementation of @code{member} is well optimized in C, so this form is to be preferred over an ad hoc construction of a membership testing function in virtual code. @node Iteration, List Combinators, Predicates, Virtual Code Semantics @subsection Iteration @cindex recursion @cindex @code{iterate} One of many alternatives to recursion provided by the virtual machine is iteration, which allows an operation to be repeated until a condition is met. If the source language is imperative, this feature provides an easy means of translating loop statements to virtual code. In languages that allow functions to be treated as data, iteration can be regarded as a function that takes a predicate and a function as arguments, and returns a function that applies the given function repeatedly to its argument until the predicate is refuted. Iterative applications are expressed in virtual code by the pattern shown below.
@table @emph @item T21 [[@code{iterate}]] @code{(@var{p},@var{f})} = @code{((nil,nil),(nil,(@var{p},@var{f})))} @end table @noindent In the @code{silly} language, the @code{iterate} mnemonic plays the role of the function that takes the virtual code for a predicate @code{@var{p}} and a function @code{@var{f}} as arguments, and returns the virtual code for an iterating function. The code for an iterating function is recognized as such by the virtual machine emulator only if the subtrees @code{@var{f}} and @code{@var{p}} are other than @code{nil}. The resulting function tests the argument @code{@var{x}} with @code{@var{p}} and returns @code{@var{x}} if the predicate is false. @table @emph @item P22 ([[@code{iterate}]] @code{(@var{p},@var{f})}) @code{@var{x}} = @code{@var{x}} if @code{@var{p} @var{x}} = @code{nil} @end table @noindent If the predicate turns out to be true, @code{@var{f}} is applied to @code{@var{x}}, and then another iteration is performed. @table @emph @item P23 ([[@code{iterate}]] @code{(@var{p},@var{f})}) @code{@var{x}} = ([[@code{iterate}]] @code{(@var{p},@var{f})}) @code{@var{f} @var{x}} if @code{@var{p} @var{x}} is a non-@code{nil} tree @end table @node List Combinators, List Functions, Iteration, Virtual Code Semantics @subsection List Combinators @cindex lists @cindex imperative programming @cindex functional programming There is extensive support for operations on lists in the virtual code format. Use of these features is encouraged because they are conducive to tight code with explicit concurrency. Within an imperative programming paradigm, these features might perhaps have to be understood as design patterns or algorithmic skeletons. The present exposition takes a functional view, describing them in terms of operators that take functions as their arguments and return functions as their result. 
@menu * Map:: * Filter:: * Reduce:: * Sort:: * Transfer:: * Mapcur:: @end menu @node Map, Filter, List Combinators, List Combinators @subsubsection Map A virtual code application in the following form causes a function with non-@code{nil} virtual code @code{@var{f}} to be applied to every item in a list. @table @emph @item T22 [[@code{map}]] @code{@var{f}} = @code{((nil,nil),((nil,@var{f}),nil))} @end table @noindent @cindex @code{map} The @code{map} mnemonic is used in @code{silly} to express applications in this form as @code{map @var{f}}. This operation is also well known to Lisp users and functional programmers. The semantics is determined by these two operator properties (for non-@code{nil} @code{@var{f}}). @table @emph @item P24 ([[@code{map}]] @code{@var{f}}) @code{nil} = @code{nil} @item P25 ([[@code{map}]] @code{@var{f}}) @code{(@var{x},@var{y})} = @code{(@var{f} @var{x},(}[[@code{map}]] @code{@var{f}) @var{y})} @end table @noindent Note that the representation of lists described in @ref{Representation of Numeric and Textual Data}, is assumed. @node Filter, Reduce, Map, List Combinators @subsubsection Filter @cindex @code{filter} Another well known list operation is that which applies a predicate to every item of a list, and deletes those for which the predicate is false. For a predicate with virtual code @code{@var{p}}, such an application can be coded conveniently in this form, @table @emph @item T23 [[@code{filter}]] @code{@var{p}} = @code{((nil,nil),(nil,(@var{p},nil)))} @end table @noindent which is to say that writing @code{((nil,nil),(nil,(@var{p},nil)))} in @code{silly} is the same as writing @code{filter @var{p}}. The virtual machine detects code of this form provided that @code{@var{p}} is other than @code{nil}, and evaluates it consistently with the following properties, causing it to have the meaning that it does.
@table @emph @item P26 ([[@code{filter}]] @code{@var{p}}) @code{nil} = @code{nil} @item P27 ([[@code{filter}]] @code{@var{p}}) @code{(@var{x},@var{y})} = ([[@code{filter}]] @code{@var{p}}) @code{@var{y}} if @code{@var{p} @var{x}} = @code{nil} @item P28 ([[@code{filter}]] @code{@var{p}}) @code{(@var{x},@var{y})} = @code{(@var{x},}([[@code{filter}]] @code{@var{p}}) @code{@var{y})} if @code{@var{p} @var{x}} is a non-@code{nil} tree @end table @node Reduce, Sort, Filter, List Combinators @subsubsection Reduce @cindex @code{reduce} In the virtual code fragment shown below, @code{@var{f}} should be regarded as the virtual code for a binary operator, and @code{@var{k}} is a constant. @table @emph @item T24 [[@code{reduce}]] @code{(@var{f},@var{k})} = @code{((nil,nil),((@var{f},@var{k}),nil))} @end table @noindent By constructing a tree in the form shown, the @code{silly} mnemonic @code{reduce} effectively constructs the code for a function operating on lists in a particular way. The effect of evaluating an application in this form with an argument representing a list can be broken down into several cases. @itemize @bullet @item If the list is empty, then the value of @code{@var{k}} is the result. @item If the list has only one item, then that item is the result. @item If the list has exactly two items, the first being @code{@var{x}} and the second being @code{@var{y}}, then the result is @code{@var{f} (@var{x},@var{y})}. @item If the list has more than two items and an even number of them, the result is that of evaluating the application with the list of values obtained by partitioning the list into pairs of adjacent items, and evaluating @code{@var{f}} with each pair.
@item If the list has more than two items and an odd number of them, the result is that of evaluating the application with the list of values obtained by partitioning the list into pairs of adjacent items excluding the last one, evaluating @code{@var{f}} with each pair, and then appending the last item to the list of values. @end itemize @noindent In the last two cases, evaluation of the application is not necessarily finished after just one traversal of the list, because it has to carry on until the list is reduced to a single item. Some care has been taken to describe this operation in detail because it differs from comparable operations common to functional programming @cindex fold languages, variously known as fold or reduce. All of these operations could be used, for example, to compute the summation of a list of numbers. The crucial differences are as follows. @itemize @bullet @item Whereas a fold or a reduce is conventionally either of the left or right variety, this @code{reduce} is neither. @item The vacuous case result @code{@var{k}} is never used at all unless the argument is the empty list. @item This @code{reduce} is not very useful if the operator @code{@var{f}} is not associative. @end itemize The reason for defining the semantics of @code{reduce} in this way instead of the normal way is that a distributed implementation of this @cindex distributed implementation one could work in logarithmic time, so it's worth making it easy for a language processor to detect the pattern in case the virtual code is ever going to be targeted to such an implementation. Of course, nothing prevents the conventional left or right reduction semantics from being translated to virtual code by explicit recursion. @cindex recursion The precise semantics of this operation are as follows, where @code{@var{f}} is not @code{nil}, @code{@var{k}} is unconstrained, and @code{pairwise} represents a function to be explained presently. 
@cindex @code{iterate} @cindex @code{pairwise} @cindex @code{bu} @cindex @code{right} @table @emph @item P29 ([[@code{reduce}]] @code{(@var{f},@var{k})}) @code{nil} = @code{@var{k}} @item P30 ([[@code{reduce}]] @code{(@var{f},@var{k})}) @code{(@var{x},@var{y})} = @* @w{ }@w{ }@w{ } [[@code{left}]] ([[@code{bu(iterate,right)}]] [[@code{pairwise}]] @code{@var{f}}) @code{(@var{x},@var{y})} @end table @noindent The latter property leverages off some virtual machine features and @code{silly} code that has been defined already. The function implemented by [[@code{pairwise}]] @code{@var{f}} is the one that partitions its argument into pairs and evaluates @code{@var{f}} with each pair as described above. The combination of that with @code{bu(iterate,right)} results in an application that repeatedly performs [[@code{pairwise}]] @code{@var{f}} while the resulting list still has a tail (i.e., a @code{right} side), and stops when the list has only a single item (i.e., when @code{right} is false). The @code{left} operation then extracts the item. For the sake of completeness, it is tedious but straightforward to give an exact definition for @code{pairwise}. The short version is that it can be anything that satisfies these three equations. @table @emph @item E1 ([[@code{pairwise}]] @code{@var{f}}) @code{nil} = @code{nil} @item E2 ([[@code{pairwise}]] @code{@var{f}}) @code{(@var{x},nil)} = @code{(@var{x},nil)} @item E3 ([[@code{pairwise}]] @code{@var{f}}) @code{(@var{x},(@var{y},@var{z}))} = @code{(@var{f} (@var{x},@var{y}),}([[@code{pairwise}]] @code{@var{f}}) @code{@var{z})} @end table @noindent For the long version, see @ref{Pairwise}. @node Sort, Transfer, Reduce, List Combinators @subsubsection Sort @cindex @code{sort} Sorting is a frequently used operation that has the following standard representation in virtual code. 
@table @emph @item T25 [[@code{sort}]] @code{@var{p}} = @code{((nil,nil),((@var{p},nil),(nil,nil)))} @end table @noindent When an application in this form is evaluated with an operand representing a list, the result is a sorted version of the list. The way a list is sorted depends on what order is of interest. For example, numbers could be sorted in ascending or descending order, character strings could be sorted lexically or by length, etc. The value of @code{@var{p}} is therefore needed in sorting applications to specify the order. It contains the virtual code for a partial order relational operator. This operator can be evaluated with any pair of items selected from a list, and should have a value of true if the left one should precede the right under the ordering. For example, if numbers were to be sorted in ascending order, then @code{@var{p}} would compute the ``less than or equal'' relation, returning true if its operand were a pair of numbers in which the left is less than or equal to the right. The virtual code semantics for sorting applications is given by these two properties, wherein @code{@var{p}} is a non-@code{nil} tree, and @code{insert} is a @code{silly} mnemonic to be defined next. @cindex @code{insert} @table @emph @item P31 ([[@code{sort}]] @code{@var{p}}) @code{nil} = @code{nil} @item P32 ([[@code{sort}]] @code{@var{p}}) @code{(@var{x},@var{y})} = ([[@code{insert}]] @code{@var{p}}) @code{(@var{x},}([[@code{sort}]] @code{@var{p}}) @code{@var{y})} @end table @noindent These properties say that the empty list is already sorted, and a non-empty list is sorted if its tail is sorted and the head is then inserted in the proper place. This specification of sorting has nothing to do with the sorting algorithm that @code{avram} really uses. The meaning of insertion is convenient to specify in virtual code itself. It should satisfy these three equations.
@table @emph @item E4 ([[@code{insert}]] @code{@var{p}}) @code{(@var{x},nil)} = @code{(@var{x},nil)} @item E5 ([[@code{insert}]] @code{@var{p}}) @code{(@var{x},(@var{y},@var{z}))} = @code{(@var{y},}([[@code{insert}]] @code{@var{p}}) @code{(@var{x},@var{z}))} if @code{@var{p}(@var{x},@var{y})} = @code{nil} @item E6 ([[@code{insert}]] @code{@var{p}}) @code{(@var{x},(@var{y},@var{z}))} = @code{(@var{x},(@var{y},@var{z}))} if @code{@var{p}(@var{x},@var{y})} is a non-@code{nil} tree @end table @noindent Intuitively, the right argument, whether @code{nil} or @code{(@var{y},@var{z})}, represents a list that is already sorted. The left argument @code{@var{x}} therefore only needs to be compared to the head element, @code{@var{y}}, to ascertain whether or not it belongs at the beginning. If not, it should be inserted into the tail. A possible implementation of @code{insert} in @code{silly} is given in @ref{Insert}. @node Transfer, Mapcur, Sort, List Combinators @subsubsection Transfer @cindex @code{transfer} A particular interpretation is given to virtual code in the following form. @table @emph @item T26 [[@code{transfer}]] @code{@var{f}} = @code{((nil,nil),(nil,(nil,@var{f})))} @end table @noindent When code in this form is evaluated with an argument, the tree @cindex state transition function @code{@var{f}} is used as a state transition function, and the argument is used as a list to be traversed. The traversal begins with @code{@var{f}} being evaluated on @code{nil} to get the initial state and the initial output. Thereafter, each item of the list is paired with the current state to be evaluated with @code{@var{f}}, resulting in a list of outputs and the next state. The output resulting from the entire application is the cumulative concatenation of all outputs obtained in the course of evaluating @code{@var{f}}. The computation terminates when @code{@var{f}} yields a @code{nil} result.
If the list of inputs runs out before the computation terminates, @code{nil} values are used as inputs. This behavior is specified more precisely in the following property of the operator, which applies in the case of non-@code{nil} @code{@var{f}}. @cindex @code{transition} @table @emph @item P33 ([[@code{transfer}]] @code{@var{f}}) @code{@var{x}} = ([[@code{transition}]] @code{@var{f}}) @code{(nil,(@var{f} nil,@var{x}))} @end table Much of the @code{transfer} semantics is implicit in the meaning of @code{transition}. For any given application @code{@var{f}}, [[@code{transition}]] @code{@var{f}} is the virtual code for a function that takes the list traversal from one configuration to the next. A configuration is represented as a tuple, usually in the form @code{(@var{previous outputs},((@var{state},@var{output}),(@var{next input},@var{subsequent inputs})))}. A terminal configuration has the form @code{(@var{previous outputs},(nil,(@var{next input},@var{subsequent inputs})))}. A configuration may also have @code{nil} in place of the pair @code{(@var{next input},@var{subsequent inputs})} if no more input remains. In the non-degenerate case, the meaning of [[@code{transition}]] @code{@var{f}} is consistent with the following equation. @table @emph @item E7 ([[@code{transition}]] @code{@var{f}}) @code{(@var{y},((@var{s},@var{o}),(@var{i},@var{x})))} =@* @w{ }@w{ }@w{ }@w{ }([[@code{transition}]] @code{@var{f}}) @code{((@var{o},@var{y}),(@var{f} (@var{s},@var{i}),@var{x}))} @end table @noindent That is, the current output @code{@var{o}} is stored with previous outputs @code{@var{y}}, the next input @code{@var{i}} is removed from the configuration, and the next state and output are obtained from the evaluation of @code{@var{f}} with the state @code{@var{s}} and the next input @code{@var{i}}. In the case where no input remains, the transition function is consistent with the following equation. 
@table @emph @item E8 ([[@code{transition}]] @code{@var{f}}) @code{(@var{y},((@var{s},@var{o}),nil))} = @* @w{ }@w{ }@w{ }@w{ }([[@code{transition}]] @code{@var{f}}) @code{((@var{o},@var{y}),(@var{f} (@var{s},nil),nil))} @end table @noindent This case is similar to the previous one except that the @code{nil} value is used in place of the next input. Note that in either case, nothing about @code{@var{f}} depends on the particular way configurations are represented, except that it should have a state as its left argument and an input as its right argument. Finally, in the case of a terminal configuration, the transition function returns the cumulative output. @table @emph @item E9 ([[@code{transition}]] @code{@var{f}}) @code{(@var{y},(nil,@var{x}))} = [[@code{reduce(cat,nil)}]] [[@code{reverse}]] @code{@var{y}} @end table @noindent The @code{silly} code @code{reduce(cat,nil)} has the effect of @cindex @code{cat} @cindex concatenation flattening a list of lists into one long list, which is necessary insofar as the transition function will have generated not necessarily a single output but a list of outputs on each iteration. The @code{cat} mnemonic stands for list concatenation and is explained in @ref{Cat}. The reversal is necessary to cause the first generated output to be at the head of the list. List reversal is a built in operation of the virtual machine and is described in @ref{Reverse}. If such a function as @code{transition} seems implausible, its implementation in @code{silly} can be found in @ref{Transition}. It is usually more awkward to express a function in terms of a @code{transfer} than to code it directly using recursion or other list operations. However, this feature is provided by the virtual machine for several reasons. @itemize @bullet @item Functions in this form may be an easier translation target if the source is an imperative language. 
@item Translating from virtual code to asynchronous circuits or process @cindex asynchronous circuits networks has been a research interest of the author, and code in this @cindex author form lends itself to easy recognition and mapping onto discrete components. @item The @option{--byte-transducer} and @option{--interactive} command line options to @code{avram} cause an application to be invoked in a @cindex state transition function similar manner to the transition function in a @code{transfer} function, so this feature allows for easy simulation and troubleshooting of these applications without actually deploying them. @end itemize @node Mapcur, , Transfer, List Combinators @subsubsection Mapcur An alternative form of recursive definition is the following. @cindex @code{mapcur} @table @emph @item T27 [[@code{mapcur}]] @code{@var{p}} = @code{((nil,nil),((nil,nil),(@var{p},nil)))} @end table @noindent This form is convenient for applications that cause themselves to be @cindex recursion applied recursively to a list of arguments. It has this semantics. @table @emph @item P34 ([[@code{mapcur}]] @code{@var{p}}) @code{@var{x}} = [[@code{map meta}]] [[@code{distribute}]] ([[@code{field}]] @code{@var{p}}) @code{@var{x}} @end table @node List Functions, Exception Handling, List Combinators, Virtual Code Semantics @subsection List Functions In addition to the foregoing list operations, the virtual machine @cindex lists provides a number of canned functions operating on lists, namely concatenation, reversal, distribution, and transposition. These functions could be coded by other means if they were not built in, but the built in versions are faster and smaller. @menu * Cat:: * Reverse:: * Distribute:: * Transpose:: @end menu @node Cat, Reverse, List Functions, List Functions @subsubsection Cat The list concatenation operation has this representation in virtual code. 
@cindex @code{cat} @cindex concatenation @table @emph @item T28 [[@code{cat}]] = @code{((nil,nil),(nil,nil))} @end table @noindent This function takes a pair of lists as an argument, and returns the list obtained by appending the right one to the left. The semantics of concatenation is what one would expect. @table @emph @item P35 [[@code{cat}]] @code{(nil,@var{z})} = @code{@var{z}} @item P36 [[@code{cat}]] @code{((@var{x},@var{y}),@var{z})} = @code{(@var{x},}[[@code{cat}]] @code{(@var{y},@var{z}))} @end table @node Reverse, Distribute, Cat, List Functions @subsubsection Reverse @cindex @code{reverse} The function that reverses a list has the following representation in virtual code. @table @emph @item T29 [[@code{reverse}]] = @code{((nil,nil),(nil,(nil,nil)))} @end table @noindent This function takes a list as an argument, and returns the list consisting of the same items in the reverse order. The semantics is given by the following properties. @table @emph @item P37 [[@code{reverse}]] @code{nil} = @code{nil} @item P38 [[@code{reverse}]] @code{(@var{x},@var{y})} = [[@code{cat}]] ([[@code{reverse}]] @code{@var{y},(@var{x},nil)}) @end table @node Distribute, Transpose, Reverse, List Functions @subsubsection Distribute The function with the following virtual code representation is frequently useful for manipulating lists. @cindex @code{distribute} @table @emph @item T30 [[@code{distribute}]] = @code{(((nil,nil),nil),nil)} @end table @noindent This function takes a pair whose right side represents a list, and returns a list of pairs, with one pair for each item in the list. The left side of each pair is the left side of the original argument, and the right side is the corresponding item of the list. A semantics for this operation is specified by the following properties.
@table @emph @item P39 [[@code{distribute}]] @code{(@var{x},nil)} = @code{nil} @item P40 [[@code{distribute}]] @code{(@var{x},(@var{y},@var{z}))} = @code{((@var{x},@var{y}),}[[@code{distribute}]] @code{(@var{x},@var{z}))} @end table @node Transpose, , Distribute, List Functions @subsubsection Transpose The @code{transpose} operation has the following representation in virtual code. @table @emph @item T31 [[@code{transpose}]] = @code{((nil,nil),((nil,nil),(nil,nil)))} @end table @noindent @cindex @code{transpose} This function takes a list of equal length lists as an argument, and returns a list of lists as a result. In the resulting list, the first item is the list of all first items of lists in the argument. The next item is the list of all second items, and so on. In the specification of the semantics, the @code{silly} mnemonic @cindex @code{flat} @code{flat} is defined by @code{flat = reduce(cat,nil)} in the standard @code{silly} prelude, which means that it flattens a list of lists into one long list. @table @emph @item P41 [[@code{transpose}]] @code{@var{x}} = @code{nil} if [[@code{flat}]] @code{@var{x}} = @code{nil} @item P42 [[@code{transpose}]] @code{@var{x}} = @code{(}[[@code{map left}]] @code{@var{x},}[[@code{transpose}]] [[@code{map right}]] @code{@var{x})}@* @w{ }@w{ }@w{ } if [[@code{flat}]] @code{@var{x}} is a non-@code{nil} tree @end table @node Exception Handling, Interfaces to External Code, List Functions, Virtual Code Semantics @subsection Exception Handling @cindex exceptions In quite a few cases, the properties given for the operator up to this point do not imply any particular result. A good example would be an expression such as [[@code{left}]] @code{nil}, which appears to represent the left side of an empty pair. It can be argued that expressions like this have no sensible interpretation and should never be used, so it would be appropriate to leave them undefined. 
On the other hand, attempts to evaluate such expressions occur frequently by mistake, and in any case, the virtual machine emulator should be designed to do something reasonable about them if only for the sake of reporting the error. The chosen remedy for this situation addresses the need for error reporting, and also turns out to be useful in other ways. @menu * A Hierarchy of Sets:: * Operator Generalization:: * Error Messages:: * Expedient Error Messages:: * Computable Error Messages:: * Exception Handler Usage:: @end menu @node A Hierarchy of Sets, Operator Generalization, Exception Handling, Exception Handling @subsubsection A Hierarchy of Sets As indicated already, the virtual machine represents all functions and data as members of a set satisfying the properties in @ref{Raw Material}, namely a @code{nil} element and a @code{cons} operator for constructing trees or nested pairs of @code{nil}. However, it will be necessary to distinguish the results of computations that go wrong for exceptional reasons from normal results. Because any tree in the set could conceivably represent a normal result, we need to go outside the set to find an unambiguous representation of exceptional results. Because there may be many possible exceptional conditions, it will be helpful to have a large set of possible ways to encode them, and in fact there is no need to refrain from choosing a countably infinite set. Furthermore, it will be useful to distinguish between different levels of severity among exceptional conditions, so for this purpose a countably infinite hierarchy of mutually disjoint sets is used. In order to build on the theory already developed, the set that has been used up to this point will form the bottom level of the hierarchy, and its members will represent normal computational results. The members of sets on the higher levels in the hierarchy represent exceptional results. 
To avoid ambiguity, the term ``trees'' is reserved for members @cindex trees of the bottom set, as in ``for any tree @code{@var{x}} @dots{}''. Unless otherwise stated, variables like @code{@var{x}} and @code{@var{y}} are universally quantified over the bottom set only. @cindex universal quantification Because each set in the hierarchy is countably infinite, it is isomorphic to the bottom set. With respect to an arbitrary but fixed bijection between them, let @code{@var{x}_@var{n}} denote the image in the @code{@var{n}}th level set of a tree @code{@var{x}} in the bottom set. The level numbers in this notation start with zero, and we take @code{@var{x}_0} to be synonymous with @code{@var{x}}. For good measure, let @code{(@var{x}_@var{n})_@var{m}} = @code{@var{x}_(@var{n}+@var{m})}. @node Operator Generalization, Error Messages, A Hierarchy of Sets, Exception Handling @subsubsection Operator Generalization Each set in the hierarchy induces a structure preserving @code{cons} @cindex @code{cons} operator, denoted @code{cons_@var{n}} for the @code{@var{n}}th level set, and satisfying this equation. @table @emph @item E10 @code{cons_@var{n}(@var{x}_@var{n},@var{y}_@var{n})} = @code{(cons(@var{x},@var{y}))_@var{n}} @end table @noindent It will be convenient to generalize all of these @code{cons} operators to be defined on members of other sets than their own. @table @emph @item E11 For @code{@var{m}} greater than @code{@var{n}}, @w{ } @w{ } @w{ } @code{cons_@var{n}(@var{x}_@var{m},@var{y}_@var{p})} = @code{@var{x}_@var{m}} @end table @noindent In this equation, @code{@var{p}} is unrestricted. The intuition is that if the left operand of a @code{cons} is the result of a computation that went wrong due to an exceptional condition (more exceptional than @code{@var{n}}, the level already in effect), then the exceptional result becomes the whole result. 
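
As an informal illustration of @emph{E10} and @emph{E11}, the following
Python sketch models a member of the @code{@var{n}}th level set as a
pair consisting of a tree and a level number. The names @code{NIL} and
@code{cons_n} and the pair representation are assumptions made only for
this illustration and are not part of @code{avram}. Operand
combinations not covered by these two equations are rejected rather
than guessed.

@example
NIL = None  # stands for the nil tree (an assumption of this sketch)

def cons_n(n, a, b):
    """Generalized cons on level n, following E10 and E11 only.

    Each operand is a (tree, level) pair, so x_m is written (x, m).
    """
    (x, m), (y, p) = a, b
    if m > n:
        # E11: a left operand from a higher level becomes the whole
        # result, whatever the level p of the right operand may be
        return a
    if m == n and p == n:
        # E10: both operands are on level n, so the pair is built on
        # the bottom level and its image on level n is returned
        return ((x, y), n)
    raise ValueError("case not covered by E10 or E11")
@end example

@noindent
For instance, if @code{err} is any pair whose level is greater than
zero, then @code{cons_n(0, err, (NIL, 0))} returns @code{err}
unchanged.
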
It is tempting to hazard a slightly stronger statement, which is that equation @emph{E11} holds even if @code{@var{y}_@var{p}} is equal to some expression @code{@var{f} @var{x}} that is undefined according to the operator semantics. This stipulation would correspond to an implementation in which the right operand isn't evaluated after an error is detected in the left, but there are two problems with it. @itemize @bullet @item This semantics might unreasonably complicate a concurrent implementation of the virtual machine. If evaluation leads to non-termination in some cases where the result is undefined (as it certainly would in any possible implementation consistent with cases where it's defined), then the mechanism that evaluates the right side of a pair must be interruptible in case an exception is detected in the left. @item It is beyond the expressive power of the present mathematical framework to make such a statement, because it entails universal quantification over non-members of the constructed sets, which includes @cindex universal quantification almost everything. @end itemize @noindent Nevertheless, the implementation in @code{avram} is sequential and does indeed behave as proposed, with no practical difficulty. As for any deficiency in the theory, it could be banished by recasting the semantics in terms of a calculus of expressions with formal rules of manipulation. An operand to the @code{cons} operator would be identified not with a member of a semantic domain, but with the expression used to write it down, and then even ``undefinedness'' could be @cindex undefined expressions defined. However, the present author's preference in computing as in @cindex author life is to let some things remain a mystery rather than to abandon the quest for meaning entirely. A comparable condition applies in cases where the right side of a pair represents an exceptional result.
@table @emph @item E12 For @code{@var{m}} greater than @code{@var{n}}, @w{ } @w{ } @code{cons_@var{n}(@var{x}_@var{n},@var{y}_@var{m})} = @code{@var{y}_@var{m}} @end table Whereas an infinitude of @code{cons} operators has been needed, it will @cindex @code{cons} be possible to get by with only one invisible operator, as before, by generalizing it in the following way to operands on any level of the hierarchy. @table @emph @item P43 @code{@var{f}_@var{n} @var{x}_@var{n}} = @code{(@var{f} @var{x})_@var{n}} @item P44 For distinct @code{@var{n}} and @code{@var{m}}, @w{ }@code{@var{f}_@var{n} @var{x}_@var{m}} = @code{@var{x}_@var{m}} @end table @noindent That is, the result of evaluating two operands on the same level is the image relative to that level of the result of their respective images on the bottom level, but the result of evaluating two operands on different levels is the same as the right operand. @node Error Messages, Expedient Error Messages, Operator Generalization, Exception Handling @subsubsection Error Messages The basic strategy for representing the results of exceptional conditions arising from the evaluation of operands on a given level of the hierarchy will be to use an error message corresponding to the image of a list of character strings on the level above. Unfortunately, the official @code{silly} standard does not define character constants, but they are available as a vendor specific extension in @code{silly-me} (millennium edition), where character strings @cindex @code{silly-me} @cindex strings @cindex character strings may be enclosed in single quotes. The value of the semantic @cindex semantic function function [[@dots{}]] in the case of a character string is the list of representations of the characters, based on @ref{Character Table} and @ref{Representation of Numeric and Textual Data}. For the sake of consistency, each standard error message is a list of character strings, even though the list has only one string in it. 
If any exceptional condition is the result of a computation, it is written to standard error by @code{avram} as the list of character strings it represents. @table @emph @item P45 ([[@code{compare}]] @code{nil})@code{_@var{n}} = [[@code{('invalid comparison',nil)}]]@code{_(@var{n}+1)} @item P46 ([[@code{left}]] @code{nil})@code{_@var{n}} = [[@code{('invalid deconstruction',nil)}]]@code{_(@var{n}+1)} @item P47 ([[@code{right}]] @code{nil})@code{_@var{n}} = [[@code{('invalid deconstruction',nil)}]]@code{_(@var{n}+1)} @item P48 (([[@code{fan}]] @code{@var{f}}) @code{nil})@code{_@var{n}} = [[@code{('invalid deconstruction',nil)}]]@code{_(@var{n}+1)} @item P49 ([[@code{member}]] @code{nil})@code{_@var{n}} = [[@code{('invalid membership',nil)}]]@code{_(@var{n}+1)} @item P50 ([[@code{distribute}]] @code{nil})@code{_@var{n}} = [[@code{('invalid distribution',nil)}]]@code{_(@var{n}+1)} @item P51 ([[@code{cat}]] @code{nil})@code{_@var{n}} = [[@code{('invalid concatenation',nil)}]]@code{_(@var{n}+1)} @item P52 ([[@code{meta}]] @code{nil})@code{_@var{n}} = [[@code{('invalid recursion',nil)}]]@code{_(@var{n}+1)} @end table Note that by virtue of property @emph{P44}, there is no need for an application to make explicit checks for exceptional results at any point, because the exceptional result propagates through to the output of any function composed with the one that incurred it. For example, an application of the form @code{h = compose(f,right)}, which will cause an invalid deconstruction error if applied in filter mode to an empty file, imposes no requirement that @code{f} be written to accommodate that possibility (i.e., by checking for it) in order for the error to be reported properly. The following proof demonstrates that the meaning of @code{f} is irrelevant to the result. 
@format [[@code{compose(f,right)}]]@code{_0} @code{nil_0} = [[@code{f}]]@code{_0} [[@code{right}]]@code{_0} @code{nil}@code{_0} = [[@code{f}]]@code{_0} [[@code{('invalid deconstruction',nil)}]]@code{_1} = [[@code{('invalid deconstruction',nil)}]]@code{_1} @end format @noindent In an application @code{h = compose(f,g)}, the input validation therefore may be confined to the ``front @w{end'', @code{g}.} It will be recalled from the discussions of @code{recur} (@ref{Recur}) @cindex @code{recur} @cindex @code{transpose} and @code{transpose} (@ref{Transpose}) that the semantics of virtual code involving these forms is defined in terms of the @code{field} format for deconstruction functions (@ref{Field}), @cindex @code{field} which depends implicitly on the semantics of @code{left} and @code{right}, being a generalization of them. An invalid deconstruction @cindex @code{left} @cindex @code{right} message could therefore result from applications incorporating any of the forms of @code{recur}, @code{transpose}, or @code{field}. Invalid deconstructions could also arise from the @code{replace} operation @cindex @code{replace} @cindex assignment (@ref{Replace}), which is used for assignment (@ref{Assignment}), because @code{replace} is defined by virtual code, except as noted next. @node Expedient Error Messages, Computable Error Messages, Error Messages, Exception Handling @subsubsection Expedient Error Messages @cindex error messages Because there are so many ways to cause an invalid deconstruction, this message is the most common in practice and therefore the least informative. As a matter of convenience, @code{avram} takes the liberty of a slight departure from the virtual machine specification as written hitherto, and employs the following messages when invalid deconstructions occur respectively in the cases of recursion, transposition, and assignment. 
@itemize @bullet @item @code{invalid recursion} @item @code{invalid transpose} @item @code{invalid assignment} @end itemize @noindent That is, this section contradicts and supersedes what is stated at the end of @ref{Error Messages} and implied by the operator properties @emph{P14}, @emph{P16}, and @emph{P42}. It is also possible that user applications may modify the error messages by methods described in @ref{Computable Error Messages}. Whereas these three cases constitute an expedient variation on the semantics, there is another sense in which no possible implementation could conform faithfully to the specification. When an evaluation cannot be carried out because of insufficient space on the host machine, one of the following error messages may be the result. @itemize @bullet @item @code{memory overflow} @item @code{counter overflow} @end itemize @noindent These messages are treated in the same way as those that are caused by programming errors, and propagate to the final result written to standard error without any specific consideration by the application developer. The @code{counter overflow} message occurs only in connection with the built in weight function (@ref{Weight}). Other messages listed in @ref{Application Programming Errors} are also of this ilk. @node Computable Error Messages, Exception Handler Usage, Expedient Error Messages, Exception Handling @subsubsection Computable Error Messages The automatic generation and reporting of error messages provides a reasonable default behavior for applications that do not consider exceptional conditions. All applications and their input data are ordinarily members of the bottom level set in the hierarchy (@ref{A Hierarchy of Sets}). The error messages caused by invalid operations on this level are on the first level above the bottom, which are recognized as such and written to standard error without intervention from the application. However, there are two drawbacks to this style of dealing with exceptions.
@cindex exceptions @itemize @bullet @item An application developer may wish to translate error messages into terms that are meaningful to the user, not only by literally translating them from English to the local vernacular, but perhaps by relating the particular exceptional condition to application specific causes. While it is convenient for the ``back end'' code not to be required to intervene in the error reporting, it would be most inconvenient for it not to be able to do so. @item Some application specific errors might not correspond directly to any of the particular conditions detected automatically due to invalid operations, for example a semantic error in a syntactically correct input file. It might be convenient in such cases for an application to be able to define its own error messages but still have them reported automatically like the built in messages. @end itemize These situations suggest a need for some ability on the part of an application to operate on error messages themselves. Based on the operator semantics given so far, such an application is inexpressible, because for any application @code{@var{f}_0} and error message @w{@code{@var{x}_1}}, property @emph{P44} stipulates @code{@var{f}_0 @var{x}_1} = @code{@var{x}_1}, meaning that the resulting error message is unchanged. Therefore, we need to define another basic property of the operator. The following form of virtual code is used in applications that may need to operate on error messages. @cindex @code{guard} @table @emph @item T32 [[@code{guard}]] @code{(@var{f},@var{g})} = @code{((nil,@var{f}),@var{g})} @end table @noindent Code in this form has the following semantics. 
@table @emph @item P53 ([[@code{guard}]] @code{(@var{f},@var{g})})@code{_@var{n}} @code{@var{x}_@var{p}} = @code{@var{g}_(@var{n}+1) @var{f}_@var{n} @var{x}_@var{p}} @end table @noindent The intuitive explanation is that @code{@var{f}} is the main part of the application, and @code{@var{g}} is the part of the application that operates on the error message that comes from @code{@var{f}} if an exception occurs while it is being evaluated (i.e., the ``exception handler''). Typically the exception handler code implements a function that takes an error message as an argument and returns an error message as a result. Where there is no exception, the exception handler @code{@var{g}_(@var{n}+1)} is never used, because its argument will be on level @code{@var{n}}, and therefore unaffected by an application on level @code{@var{n}+1}. Exception handlers may have their own exception handlers, which will be invoked if the evaluation of the exception handler causes a further exception. Such an exception corresponds semantically to a value on the next level of the hierarchy of sets. @node Exception Handler Usage, , Computable Error Messages, Exception Handling @subsubsection Exception Handler Usage One way for this feature of the virtual machine to be used is to intercept and translate error messages to a more meaningful form. An application guarded as shown below causes messages of invalid deconstruction to be changed to @code{'syntax error'}. @display @code{main = guard( application, conditional( bu(compare,('invalid deconstruction',nil)), (constant ('syntax error',nil),identity)))} @end display @noindent The conditional compares its argument to the error message for an @cindex deconstruction invalid deconstruction, and if it matches, the syntax error message is returned, but otherwise the original message is returned. Note that an error message must be in the form of a list of character strings, so that it can be printed. 
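
The interplay of properties @emph{P43}, @emph{P44}, and @emph{P53} in
this example can be traced with a small Python model. It is only a
sketch under stated assumptions: a value on level @code{@var{n}} is
modelled as a pair of a payload and a level number, error messages are
plain Python lists of strings rather than trees, and the names
@code{apply_at}, @code{guard}, @code{right}, and @code{handler} are
hypothetical helpers invented for the illustration, not part of
@code{avram} or @code{silly}.

@example
def apply_at(n, f, arg):
    """Evaluate f on level n with arg, a (payload, level) pair."""
    payload, m = arg
    if m != n:
        # P44: operands on different levels, so the argument
        # passes through unchanged
        return arg
    # P43: same level; f itself may answer k levels up (k = 1 for
    # an error), and (x_n)_m = x_(n+m) gives the final level
    result, k = f(payload)
    return (result, n + k)

def guard(f, g):
    """P53: (guard(f,g))_n x_p = g_(n+1) f_n x_p."""
    def guarded(n, arg):
        return apply_at(n + 1, g, apply_at(n, f, arg))
    return guarded

def right(tree):
    # model of P47: deconstructing nil is an error one level up
    if tree is None:
        return (["invalid deconstruction"], 1)
    return (tree[1], 0)

def handler(message):
    # mirrors the conditional in the guarded application above
    if message == ["invalid deconstruction"]:
        return (["syntax error"], 0)
    return (message, 0)

main = guard(right, handler)
@end example

@noindent
In this model, @code{main(0, (None, 0))} yields the message
@code{["syntax error"]} on level 1, whereas for a valid argument such
as @code{((None, None), 0)} the result stays on level 0 and the
handler is bypassed.
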
Although the message of @code{'syntax error'} might not be very informative, at least it looks less like a crash. A real application should of course strive to do better than that. Exception handling features of the virtual machine can also be adapted by applications to raise their own exceptions with customized messages. @example error_messenger = guard(compose(compare,constant nil),constant ('syntax error',nil)) @end example @noindent This code fragment implements a function that causes a message of @code{'syntax error'} to be reported for any possible input. This code works by first causing an invalid comparison and then substituting its own error message. A function that always causes an error is not useful in itself, but might be used as part of an application in the following form. @example main = conditional(validation,(application,error_messenger)) @end example @noindent In this case, the application checks the validity of the input with a predicate, and invokes the error messenger if it is invalid. Although the previous examples return a fixed error message for each possible kind of error, it is also possible to have error messages that depend on the input data, as the next example shows. @cindex @code{bu} @cindex @code{guard} @cindex @code{identity} @cindex @code{apply} @cindex @code{hired} @example main = (hired apply)( compose( bu(guard,some_application), (hired constant)(constant 'invalid input was:',identity)), identity) @end example @noindent If the application causes an exception for any reason, the error message returned will include a complete listing of the input, prefaced by the words @code{'invalid input was:'}. This particular example works only if the input is a list of character strings, but could be adapted for other types of data by substituting an appropriate formatting function for the first identity. The formatting function would take the relevant data type to a list of character strings. 
Another possible variation would be to concatenate the invalid input listing with the error message that was generated, rather than just replacing it. As the last example may suggest, exception handlers turn out to be an @cindex debugging @cindex functional programming @cindex imperative programming essential debugging tool for functional programs, making them as easy to debug as imperative programs if not more so. This example forms the basis for a higher order function that wraps any given function with an exception handler that prints the argument causing it to crash. For arguments not causing a crash, the behavior is unchanged. Alternatively, code implementing a function that unconditionally reports its argument in an error message can be inserted at a strategic point in the application code similarly to a print statement. Finally, inspired use of exception handlers that concatenate their messages with previously generated messages can show something like a parameter stack dump when a recursively defined function crashes. These are all matters for a language designer and are not pursued further in this document. @node Interfaces to External Code, Vacant Address Space, Exception Handling, Virtual Code Semantics @subsection Interfaces to External Code A few other combinators have been incorporated into the virtual machine as alternatives to the style of interactive applications described in @ref{Output From Interactive Applications}. These make it possible to interface with external libraries and applications either by a simple function call, or by executing a run-time generated transducer as described previously. In either case, there is no need for any particular command line options to specify interactive invocation, nor for the application to be designed that way from the outset. Existing virtual code applications may therefore be enhanced to make use of these features without radical changes. 
To account for these additional capabilities, it is not entirely adequate to continue defining the virtual machine semantics in terms of a mathematical function, but it is done nevertheless due to the lack of any appealing alternative. Although most library functions are in fact functions in the sense that their outputs are determined by their arguments, they defy a concise specification within the present mathematical framework, especially insofar as they may involve finite precision floating point numbers. More problematically, the effect of interaction with a shell is neither well defined nor deterministic. The descriptions that follow presuppose a computational procedure associated with the following definitions but leave its exact nature unspecified. @menu * Library combinator:: * Have combinator:: * Interaction combinator:: @end menu @node Library combinator, Have combinator, Interfaces to External Code, Interfaces to External Code @subsubsection Library combinator The simplest and fastest method of interfacing to an external library is by way of a virtual machine combinator called @code{library}. It takes two non-empty character strings as arguments to a virtual code program of the form implied by the following property. @table @emph @item T33 [[@code{library}]] (@code{@var{x}},@code{@var{y}}) = @code{((nil,nil),((@var{x},@var{y}),(nil,nil)))} @end table @noindent Intuitively, @var{x} is the name of a library and @var{y} is the name of a function within the library. For example, if @var{x} is @code{'math'} and @var{y} is @code{'sqrt'}, then @code{library}(@var{x},@var{y}) represents the function that computes the square root of a floating point number as defined by the host machine's native C implementation, normally in IEEE double precision format. Different functions and libraries may involve other argument and result types, such as complex numbers, arrays, sparse matrices, or arbitrary precision numbers. 
A list of currently supported external library names with their functions and calling conventions is given in @ref{External Libraries}. On the virtual code side, all function arguments and results regardless of their types are encoded as nested pairs of @code{nil}, as always, and may be manipulated or stored as any other data, including storage and retrieval from files in @file{.avm} virtual code format (@ref{File Format}). However, on the C side, various memory management and caching techniques are employed to maintain this facade while allowing the libraries to operate on data in their native format. The details are given more fully in the API documentation, particularly in @ref{Type Conversions} and @ref{External Library Maintenance}. While this style is fast and convenient, it is limited either to libraries that have already been built into the virtual machine, or to those for which the user is prepared to implement a new interface module in C as described in @ref{Implementing new library functions}. @node Have combinator, Interaction combinator, Library combinator, Interfaces to External Code @subsubsection Have combinator As virtual machine interfaces to external libraries accumulate faster than they can be documented and may vary from one installation to another, it is helpful to have a way of interrogating the virtual machine for an up to date list of the installed libraries and functions. A combinator called @code{have} can be used to test for the availability of a library function. It takes the form @table @emph @item T34 [[@code{have}]] (@code{@var{x}},@code{@var{y}}) = @code{((nil,nil),((nil,@var{x}),(nil,@var{y})))} @end table @noindent where @var{x} is the name of a library and @var{y} is the name of a function within the library encoded as character strings. 
For example, if @var{x} is @code{'mtwist'} and @var{y} is @code{'u_disc'} (for the natural random number generator function in the Mersenne twister library) then @code{have(@var{x},@var{y})} is a function that returns a non-empty value if and only if that library is installed and that function is available within it. The actual argument to the function is ignored as the result depends only on the installed virtual machine configuration. In this sense, it acts like a @code{constant} combinator. One way for this combinator to be used is in code of the form @example portable_rng = conditional( have('mtwist','u_disc'), library('mtwist','u_disc'), some_replacement_function) @end example @noindent which will use the library function if available but otherwise use a replacement function. Code in this form makes the decision at run time, but it is also possible to express the function such that the check for library presence is made at compile time, as the following example shows, which will imply a slight improvement in performance. @example non_portable_rng = apply( conditional( have('mtwist','u_disc'), constant library('mtwist','u_disc'), constant some_replacement_function), 0) @end example @noindent This program would be non-portable in the sense that it would need to be recompiled for each installation if there were a chance that some of them might have the @code{mtwist} library and some might not, whereas the previous example would be binary compatible across all of them. @footnote{In practice both examples are equally portable because the @code{mtwist} source is distributed with @code{avram} so all installations will have it. Most libraries are distributed separately.} The actual value returned by a function @code{have(foo,bar)} is the list of pairs of strings @code{<(foo,bar)>} if the function is available, or the empty list otherwise. A non-empty list is represented as a pair @code{(head,tail)}, and an empty list as @code{nil}.
The angle bracket notation @code{<a,b,c...>} used here is an abbreviation for @code{(a,(b,(c...nil)))}. Either or both arguments to the @code{have} combinator can be a wild card, which is the string containing a single asterisk, @cindex wild cards @code{'*'}. In that case, the list of all available matching library names and function names will be returned. This feature can be used to find out what library functions are available without already knowing their names. If a library had a function named @code{'*'}, which clashes with the wild card string, the interpretation as a wild card would take precedence. @node Interaction combinator, , Have combinator, Interfaces to External Code @subsubsection Interaction combinator A further combinator allows virtual code applications to interact directly with any interactive console application using the @code{expect} library. The mechanism is similar to that of interactive applications documented in @ref{Output From Interactive Applications}, but attempts to be more convenient. Instead of being designed as an interactive application, any virtual code application may use this combinator to spawn a shell and interact with it in order to compute some desired result. The advantage of this combinator over the @code{library} combinator is that it requires no modification of the virtual machine to support new applications. It can also interact with applications that may reside on remote servers, that are implemented in languages other than C, or @cindex GNU R whose source code is unavailable. For example, the GNU R statistical package provides an interactive command to evaluate multivariate @cindex multivariate normal distribution normal distribution functions with an arbitrary covariance matrix, but @cindex covariance matrix the corresponding function is not provided by the @code{Rmath} C library (or any other free library, to the author's knowledge) because it is implemented in interpreted code.
This combinator makes it callable by an @code{avram} virtual code application nevertheless. The disadvantage compared to the @code{library} combinator is that there is more overhead in spawning a process than simply making a call to a built in function, and the programming interface is more complicated. The combinator takes the form @table @emph @item T35 [[@code{interact}]] @var{f} = @code{((nil,nil),(((nil,nil),nil),((nil,@var{f}),nil)))} @end table @noindent where @var{f} is the virtual code for a function that follows the same protocol described in @ref{Output From Interactive Applications}, except that it does not allow file output as described in @ref{Mixed Modes of Interaction}. The argument @code{x} is ignored when the expression @code{(interact f) x} is evaluated, similarly to the way the argument is ignored in an expression like @code{(constant k) x}. The result returned is a transcript of the dialogue that took place between @code{f} and the externally spawned shell, represented as a list of lists of strings for line oriented interaction, or a list of characters alternating with lists of strings in the case of character oriented interaction. The following example demonstrates a trivial use of the @code{interact} combinator to spawn an @code{ftp} client, do an @code{ls} command, and then @cindex ftp terminate the session. @example eof = <(nil,(nil,(((nil,nil),nil),(nil,nil))))> demo = interact conditional( conditional(identity,constant false,constant true), constant(0,<'ftp'>,<'ftp> '>), conditional( conditional(left,constant false,constant true), constant(1,<'ls',''>,<'','ftp> '>), conditional( compose(compare,couple(left,constant 1)), constant(2,<'bye',''>,<eof>), constant nil))) @end example @noindent Some liberties are taken with @code{silly} syntax in this example, in the way of using angle brackets to denote lists, and numbers to represent states.
@itemize @bullet @item The interacting transducer works by checking whether its argument is empty (via the @code{identity} function used as a predicate in the @code{conditional}, which is then negated). In that case it returns the triple containing the initial state of 0, the @code{ftp} shell command to spawn the client, and the @code{'ftp> '} prompt expected when the client has been spawned, both of the latter being lists of strings. @item If the argument is non-empty, then next it checks whether it is in the initial state of 0, (via the @code{left} function used as a predicate, referring to the state variable expected on the left of any given @code{(state,input)} pair, also negated). If so, it returns the triple containing the next state of 1, the @code{ls} command followed by an empty string to indicate a line break, and the expected prompt preceded by an empty string to match it only at the beginning of a line. @item Finally, it checks for state 1, in which case it issues the @code{bye} command to close the session, @code{eof} rather than a @cindex eof prompt to wait for termination of the client, and a state of 2. @item In the remaining state of 2, which needn't be explicitly tested because it is the only remaining possibility, the program returns a @code{nil} value to indicate that the computation has terminated. @end itemize Deadlock would be possible at any point if either party did not follow @cindex deadlock this protocol, but for this example it is not an issue. If an expression of the form @code{demo x} were to be evaluated, then regardless of the value of @code{x}, the value of the result would be as shown below. @example < <'ftp'>, <'ftp> '>, <'ls',''>, <'ls','Not connected.','ftp> '>, <'bye',''>, <'bye',''>> @end example @noindent That is, it would be a list of lists of strings, alternating between the output of the interactor and the output of the @code{ftp} client. 
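The control flow of the @code{demo} transducer can also be pictured as an ordinary state machine. The following C fragment is only an illustrative model and is not part of @code{avram} or its library. The names @code{struct step}, @code{NO_INPUT}, and @code{model_demo} are hypothetical, each list of strings is flattened into a single C string with a newline standing for a line break, and a @code{NULL} expectation stands in for @code{eof}.

```c
#include <stddef.h>

/* One step of the dialogue: the next state, the text to send, and the
   text to wait for.  A finished dialogue is flagged by done != 0. */
struct step {
  int done;
  int next_state;
  const char *send;
  const char *expect;   /* NULL models waiting for end of file */
};

#define NO_INPUT (-1)   /* hypothetical marker for the empty argument */

/* Model of the demo transducer described above. */
struct step model_demo(int state)
{
  struct step result = {0, 0, NULL, NULL};

  if (state == NO_INPUT) {        /* empty argument: spawn the client */
    result.next_state = 0;
    result.send = "ftp";
    result.expect = "ftp> ";
  } else if (state == 0) {        /* initial state: list the directory */
    result.next_state = 1;
    result.send = "ls\n";
    result.expect = "\nftp> ";    /* prompt at the beginning of a line */
  } else if (state == 1) {        /* close the session, await termination */
    result.next_state = 2;
    result.send = "bye\n";
    result.expect = NULL;
  } else {                        /* only state 2 remains: terminate */
    result.done = 1;
  }
  return result;
}
```

Calling @code{model_demo(NO_INPUT)} and then feeding each returned @code{next_state} back in walks through the same four cases as the list above.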
If the spawned application had been something non-trivial such as a computer algebra system or a command line web search utility, then it is easy to see how functions using this combinator can leverage off a wealth of available resources. @node Vacant Address Space, , Interfaces to External Code, Virtual Code Semantics @subsection Vacant Address Space Not every possible pattern has been used by the virtual machine as a way of encoding a function. The following patterns, where @code{@var{a}}, @code{@var{b}}, and @code{@var{c}} are non-@code{nil} trees, do not represent anything useful. @table @asis @item unary forms @code{((nil,nil),((nil,nil),(nil,((nil,@var{a}),nil))))}@* @code{((nil,nil),((nil,nil),(nil,(nil,(nil,@var{a})))))} @item binary forms @code{((nil,nil),((nil,nil),(@var{a},@var{b})))}@* @code{((nil,nil),((@var{a},nil),(@var{b},nil)))}@* @code{((nil,nil),((@var{a},nil),(nil,@var{b})))} @item ternary forms @code{((nil,nil),((@var{a},@var{b}),(@var{c},nil)))}@* @code{((nil,nil),((@var{a},@var{b}),(nil,@var{c})))}@* @code{((nil,nil),((@var{a},nil),(@var{b},@var{c})))}@* @code{((nil,nil),((nil,@var{a}),(@var{b},@var{c})))} @end table @noindent These patterns are detected by the virtual machine simply to avoid blowing it up, but they always cause an error message to be reported. @cindex @code{unsupported hook} @cindex @code{unrecognized combinator} @table @emph @item P55 For @code{@var{f}} matching any of the first three trees in the above list,@* @w{ }@w{ }@w{ }@code{@var{f}_@var{n} @var{x}_@var{n}} = [[@code{('unsupported hook',nil)}]]@code{_(@var{n}+1)} @item P56 For the remaining trees @code{@var{f}} in the above list,@* @w{ }@w{ }@w{ }@code{@var{f}_@var{n} @var{x}_@var{n}} = [[@code{('unrecognized combinator (code @var{m})',nil)}]]@code{_(@var{n}+1)} @end table @noindent Here, @code{@var{m}} is a numeric constant dependent on which tree @code{@var{f}} was used. 
The unsupported hook message is meant to be more informative than the unrecognized combinator message, suggesting that a feature intended for future use is not yet available. This list has been assembled for the benefit of readers considering the addition of backward compatible extensions to the virtual code semantics, who are undeterred by the facts that @itemize @bullet @item the computational model @cindex universality is already universal @item virtual code applications are already interoperable with all kinds of high performance software having a text based or console interface by way of the @code{interact} combinator @item an unlimited number of built in library functions can be added by way of the @code{library} combinator as described in @ref{Implementing new library functions} @item the C code in @code{avram} makes fairly @cindex pointers intricate use of pointers with a careful policy of reference counting and storage reclamation @item there is also a performance penalty incurred by @cindex reference count further extensions to the semantics, even for applications that don't use them, because a pattern recognition algorithm in the interpreter has more cases to consider. @end itemize Nevertheless, a new functional form combining a pair of functions to be interpreted in a new way by the virtual machine could be defined using any of the binary forms above, for example, with @code{@var{a}} as the virtual code for one of the functions and @code{@var{b}} as that of the other. Such a form would not conflict with any existing applications, provided that both @code{@var{a}} and @code{@var{b}} are not @code{nil}, which is true of any valid representation for a function. Virtual machine architects, take note. There are infinitely many trees @cindex trees fitting these patterns, but it would be possible to use them up by assigning them without adequate foresight. 
For example, if interpretations were assigned to the four ternary forms, the three binary forms, and one of the remaining unary forms, then the only unassigned pattern could be of the form @display @code{((nil,nil),((nil,nil),(nil,(nil,(nil,@var{a})))))} @end display @noindent Assigning an interpretation to it would leave no further room for backward compatible expansion. On the other hand, any tree of the following form also fits the above pattern, @display @code{((nil,nil),((nil,nil),(nil,(nil,(nil,(@var{b},@var{c}))))))} @end display @noindent with any values for @code{@var{b}} and @code{@var{c}}. Different meanings could be chosen for the case where both are @code{nil}, both are non-@code{nil}, or one is @code{nil} and the other non-@code{nil}, allowing two unary forms, one binary, and one constant. If at least one of these patterns is reserved for future enhancements, then a potentially inexhaustible supply of address space remains and there will be no need for incompatible changes later. @node Library Reference, Character Table, Virtual Machine Specification, Top @chapter Library Reference Much of the code developed for @code{avram} may be reusable in other projects, so it has been packaged into a library and documented in this chapter. For ease of reference, this chapter is organized with a separate section for each source file. For the most part, each source file encapsulates an abstract type and a number of related functions, except for a few cases where C makes such a design awkward. An attempt has been made to present the sections in a readable order as far as possible. The documentation in this chapter is confined to the application program interface (API), and does not delve unnecessarily into any details of the @cindex API implementation. A reader wishing to extend, modify, or troubleshoot the library itself can find additional information in the source code comments. 
These are more likely to be in sync with the code than this document may be, and are more readily accessible to someone working with the code. Some general points pertaining to the library are the following. @itemize @bullet @item Unlike the previous chapter, this chapter uses the word ``function'' in the C sense rather than the mathematical sense of the word. @item Internal errors are internal from the user's point of view, not the developer's (@ref{Internal Errors}). Invoking these functions in ways that are contrary to their specifications can certainly cause internal errors (not to mention segfaults). @item The library is definitely not thread safe, and thread safety is @cindex threads not a planned enhancement. The amount of locking required to make it thread safe would probably incur an objectionable performance penalty due to the complexity of the shared data structures involved, in addition to being very difficult to get right. If you need these facilities in a concurrent application, consider spawning a process for @cindex spawning processes each client of the library so as to keep their address spaces separate. @item The library files are built from the standard source distribution using GNU @command{libtool}. In the default directory hierarchy, they will be found either in @file{/usr/lib/libavram.*} or in @file{/usr/local/lib/libavram.*}. These directories will differ in a non-standard installation. @item The header files will probably be located in either @file{/usr/include/avm/*.h} or @file{/usr/local/include/avm/*.h} for a standard installation. @item All exported functions, macros and constants are preceded with @code{avm_}, so as to reduce the chance of name clashes with other libraries. Not all type declarations or field identifiers follow this convention, because that would be far too tedious. @item The library header files are designed to be compatible with C++ @cindex C++ but have been tested only with C. 
Please refer to platform specific documentation for further information on how to link library modules with your own code. @end itemize @menu * Lists:: * Characters and Strings:: * File Manipulation:: * Invocation:: * Version Management:: * Error Reporting:: * Profiling:: * Emulation Primitives:: * External Library Maintenance:: @end menu @node Lists, Characters and Strings, Library Reference, Library Reference @section Lists The basic data structure used for representing virtual code and data in the @code{avram} library is declared as a @code{list}. @cindex lists @cindex @code{head} field @cindex @code{tail} field The @code{list} type is a pointer to a structure having a @code{head} field and a @code{tail} field, which are also lists. The empty tree, @code{nil}, is represented by the C constant @code{NULL}. A tree of the form @code{cons(@var{a},@var{b})} is represented in C as a list whose @code{head} is the representation of @code{@var{a}} and whose @code{tail} is the representation of @code{@var{b}}. A number of other fields in the structure are maintained automatically and should not be touched. For that matter, even the @code{head} and @code{tail} fields should be considered read-only. Because of sharing, it is almost never valid to modify a list ``in place'', except for cases that are already covered by library functions. @menu * Simple Operations:: * Recoverable Operations:: * List Transformations:: * Type Conversions:: * Comparison:: * Deconstruction Functions:: * Indirection:: * The Universal Function:: @end menu @node Simple Operations, Recoverable Operations, Lists, Lists @subsection Simple Operations These functions are declared in the header file @code{lists.h}, which should be included in any C source file that uses them with a directive such as @code{@w{#include <avm/lists.h>}}. All of these functions except the first three have the potential to cause a memory overflow.
In that @cindex overflow event, a brief message is written to standard error and the process is killed rather than returning to the caller. It is possible for client programs requiring more robust behavior to do their own error handling by using the alternative versions of these operations described in the next section. @deftypefun void avm_initialize_lists () The function @code{avm_initialize_lists} should be called before any of the other ones in this section is called, because it sets up some internal data structures. Otherwise, the behavior of the other functions is undefined. @end deftypefun @deftypefun void avm_dispose (list @var{front}) This function deallocates the memory associated with a given list, either by consigning it to a cache maintained internally by the library, or by the standard @code{free} function if the cache is full. Shared lists are taken into account and handled properly according to a reference counting scheme. Lists should be freed only by this function, not by using @code{free} directly. @end deftypefun @deftypefun void avm_count_lists () If a client program aims to do its own storage reclamation, this function can be called optionally at the end of a run when it is believed that all lists have been freed. If any allocated lists remain at large, a warning will be printed to standard error. This function therefore provides a useful check for memory leaks. The overhead is small enough that it is feasible to leave this check in the production code. @end deftypefun @deftypefun list avm_copied (list @var{operand}) A copy of the argument list is returned by this function. The copy remains intact after the original is reclaimed. A typical use might be for retaining part of a list after the rest of it is no longer needed. In this example, a hypothetical @code{visit} function is applied to each item of a list @code{x}, and each item is reclaimed immediately after it has been visited.
@example while(x)@{ visit(x->head); old_x = x; x = avm_copied(x->tail); /* the right way */ avm_dispose(old_x); @} @end example This example allows each item in the list to be visited even as previously visited items are reclaimed, because @code{x} is copied at each iteration. This example contrasts with the next one, which will probably cause a segmentation fault. @cindex segmentation fault @example while(x)@{ visit(x->head); old_x = x; x = x->tail; /* the wrong way */ avm_dispose(old_x); @} @end example In the second example, a reference is made to a part of a list which no longer exists because it has been deallocated. In fact, the @code{avm_copied} function does nothing but increment a reference count, so it is a fast, constant time operation that requires @cindex reference count no additional memory allocation. Semantically this action is equivalent to creating a fresh copy of the list, because all list operations in the library deal with reference counts properly. @end deftypefun @deftypefun list avm_join (list @var{left}, list @var{right}) This function takes a pair of lists to a list in which the left is the head and the right is the tail. It may need to use @code{malloc} to allocate additional memory. If there is insufficient memory, an error message is written to standard error and the program exits. When the list returned by @code{avm_join} is eventually deallocated, the lists from which it was built are taken with it and must not be referenced again. For example, the following code is an error. @example z = avm_join(x,y); @dots{} avm_dispose(z); avm_print_list(x); /* error here */ @end example To accomplish something similar to this without an error, a copy of @code{x} should be made, as in the next example. 
@example z = avm_join(avm_copied(x),y); @dots{} avm_dispose(z); avm_print_list(x); /* original x still intact */ @end example @end deftypefun @deftypefun void avm_enqueue (list *@var{front}, list *@var{back}, list @var{operand}) @cindex queues A fast, simple way of building a list head first is provided by the @code{avm_enqueue} function. The @code{front} is a pointer to the beginning of the list being built, and the @code{back} is a pointer to the last item. The recommended way to use it would be something like this. @example front = back = NULL; avm_enqueue(&front,&back,item); avm_enqueue(&front,&back,next_item); avm_enqueue(&front,&back,another_item); @dots{} @end example It might be more typical for the calls to @code{avm_enqueue} to appear within a loop. In any case, after the above code is executed, the following postconditions will hold. @example front->head == item front->tail->head == next_item front->tail->tail->head == another_item back->head == another_item back->tail == NULL @end example The @code{avm_enqueue} function must never be used on a shared list, because it modifies its arguments in place. The only practical way to guarantee that a list is not shared is to initialize the @code{front} and @code{back} to @code{NULL} as shown before the first call to @code{avm_enqueue}, and to make no copies of @code{front} or @code{back} until after the last call to @code{avm_enqueue}. Because a list built with @code{avm_enqueue} is not shared, it is one of the few instances of a list that can have something harmlessly appended to it in place. For example, if the next line of code were @example back->tail = rest_of_list; @end example that would be acceptable assuming @code{rest_of_list} is not shared and does not conceal a dangling or cyclic reference, and if nothing further were enqueued. The items that are enqueued into a list are not copied and will be deallocated when the list is deallocated, so they must not be referenced thereafter.
A non-obvious violation of this convention is implicit in the following code. @example @dots{} avm_enqueue(&front,&back,x->head); @dots{} avm_dispose(front); avm_print_list(x); /* error here */ @end example This code might cause a segmentation fault because of the reference to @cindex segmentation fault @code{x} after its head has been deallocated. The following code is subject to the same problem, @example @dots{} avm_enqueue(&front,&back,x->head); @dots{} avm_dispose(x); avm_print_list(front); /* error here */ @end example as is the following. @example @dots{} avm_enqueue(&front,&back,x->head); @dots{} avm_dispose(x); /* front is now impossible to reclaim */ avm_dispose(front); @end example The problem with the last example is that it is not valid even to dispose of the same list more than once, albeit indirectly. If part of a list is intended to be enqueued temporarily or independently of its parent, the list should be copied explicitly, as the following code demonstrates. @example @dots{} avm_enqueue(&front,&back,avm_copied(x->head)); /* correct */ @dots{} avm_dispose(front); avm_print_list(x); @end example @end deftypefun @deftypefun counter avm_length (list @var{operand}) A @code{counter} is meant to be the longest unsigned integer available @cindex @code{counter} on the host machine, and is defined in @code{common.h}, which is automatically included whenever @code{lists.h} is included. The @code{avm_length} function returns the number of items in a list. If a list is @code{NULL}, a value of zero is returned. There is a possibility of a counter overflow error from this function (@ref{Overflow Errors}), but only on a platform where the @code{counter} type is shorter than the address length. @end deftypefun @deftypefun counter avm_area (list @var{operand}) This function is similar to @code{avm_length}, but it treats its argument as a list of lists and returns the summation of their lengths. 
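The relationship between length and area can be illustrated with a small standalone model. The following sketch does not use the @code{avram} library. The @code{struct node} type carries only the @code{head} and @code{tail} fields, and the names @code{model_join}, @code{model_length}, and @code{model_area} are hypothetical stand-ins chosen for this illustration, with @code{unsigned long} assumed in place of the library's @code{counter}.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the library's list type: head and tail only. */
typedef struct node *list;
struct node { list head; list tail; };

typedef unsigned long counter;   /* assumed stand-in for the real counter */

/* Build a cell from a head and a tail; this sketch aborts if out of memory. */
list model_join(list head, list tail)
{
  list cell = malloc(sizeof *cell);
  if (cell == NULL)
    abort();
  cell->head = head;
  cell->tail = tail;
  return cell;
}

/* Number of items reached by following the tail fields. */
counter model_length(list operand)
{
  counter n = 0;
  for (; operand != NULL; operand = operand->tail)
    n++;
  return n;
}

/* Sum of the lengths of the items, treating operand as a list of lists. */
counter model_area(list operand)
{
  counter total = 0;
  for (; operand != NULL; operand = operand->tail)
    total += model_length(operand->head);
  return total;
}
```

In this model, a list whose two items have lengths 2 and 3 has a length of 2 and an area of 5, while a list whose only item is @code{NULL} has a length of 1 and an area of 0.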
@end deftypefun @deftypefun list avm_natural (counter @var{number}) @cindex naturals This function takes a @code{counter} to its representation as a list, as described in @ref{Representation of Numeric and Textual Data}. That is, the number is represented as a list of bits, least significant bit first, with each zero bit represented by @code{NULL} and each one bit represented by a list whose @code{head} and @code{tail} are @code{NULL}. @end deftypefun @deftypefun void avm_print_list (list @var{operand}) The @code{avm_print_list} function is not used in any production code but retained in the library for debugging purposes. It prints a list to @cindex standard output standard output using an expression involving only commas and parentheses, as per the @code{silly} syntax (@ref{A Simple Lisp Like Language}). The results quickly become unintelligible for lists of any significant size. The function is recursively defined and will crash in the event of a stack overflow, which will occur in the case of very large or cyclic lists. @end deftypefun @deftypefun list avm_position (list @var{key}, list @var{table}, int *@var{fault}) This function searches for a @var{key} in a short @var{table} where each item is a possible key. If it's not found, a @code{NULL} value is returned. If it's found, a list representing a character encoding according to @ref{Character Table} is returned. The ASCII code of the character corresponding to the returned list is the position of the @var{key} in the @var{table}, assuming position numbers start with 1. The table should have a length of 255 or less. If it's longer and the @var{key} is found beyond that range, the higher order bits of the position number are ignored. The integer referenced by @var{fault} is set to a non-zero value in the event of a memory overflow, which could happen in the course of the list comparisons necessary for the search.
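The numbering convention can be sketched with a standalone model that uses C strings for keys instead of lists. The function @code{model_position} below is hypothetical and is not the library's implementation. It returns the character code directly as an @code{int}, uses -1 where the library would return @code{NULL}, and assumes that ignoring the higher order bits amounts to keeping the low 8 bits of the position number.

```c
#include <stddef.h>
#include <string.h>

/* Returns -1 when the key is absent (the library returns NULL instead).
   Otherwise returns the 1-based position of the first match, reduced to
   its low-order 8 bits so that it fits in a character code. */
int model_position(const char *key, const char *const table[],
                   size_t table_length)
{
  size_t i;

  for (i = 0; i < table_length; i++)
    if (strcmp(key, table[i]) == 0)
      return (int) ((i + 1) & 0xFF);
  return -1;
}
```

With a three item table, the first item yields 1 and the third yields 3. In a table of 257 items, a key found only in the last place yields 1 as well, which is why tables longer than 255 items are best avoided.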
@end deftypefun @node Recoverable Operations, List Transformations, Simple Operations, Lists @subsection Recoverable Operations The functions in this section are similar to the ones in the previous section except with regard to error handling. Whereas the other ones cause an error message to be printed and the process to exit in the event of an overflow, these return to the caller, whose responsibility it is to take appropriate action. The functions in both sections are declared in @file{lists.h}, and should be preceded by a call to @code{avm_initialize_lists}. @deftypefun list avm_recoverable_join (list @var{left}, list @var{right}) This function is similar to @code{avm_join}, but will return a @code{NULL} pointer if memory that was needed can not be allocated. A @code{NULL} pointer would never be the result of a join under normal circumstances, so the overflow can be detected by the caller. Regardless of whether overflow occurs, the arguments are deallocated by this function and should not be referenced thereafter. @end deftypefun @deftypefun void avm_recoverable_enqueue (list *@var{front}, list *@var{back}, list @var{operand}, int *@var{fault}) This version of the enqueue function will dispose of the @code{@var{operand}} if there isn't room to append another item and set @code{*@var{fault}} to a non-zero value. Other than that, it does the same as @code{avm_enqueue}. @end deftypefun @deftypefun counter avm_recoverable_length (list @var{operand}) This function checks for arithmetic overflow when calculating the length of a list, and returns a zero value if overflow occurs. The caller can detect the error by noting that zero is not the length of any list other than @code{NULL}. This kind of overflow is impossible unless the host does not have long enough integers for its address space. 
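The convention of reporting overflow as a zero length can be sketched as follows. The @code{struct node} layout is a simplified stand-in for the one in @file{lists.h}, the names beginning with @code{sketch_} are hypothetical, and the counter is deliberately narrowed to eight bits so that the overflow case is easy to provoke.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the list type; not the definition in lists.h. */
typedef struct node *list;
struct node { list head; list tail; };

/* Deliberately narrow counter (an assumption of this sketch). */
typedef uint8_t small_counter;

/* Returns the length, or zero if the count does not fit in the counter. */
small_counter sketch_recoverable_length(list operand)
{
  small_counter n = 0;
  for (; operand != NULL; operand = operand->tail) {
    if (n == UINT8_MAX)
      return 0;                  /* overflow is reported as zero */
    n++;
  }
  return n;
}

/* Caller-side check: zero is a valid length only for the NULL list. */
int sketch_length_overflowed(list operand, small_counter result)
{
  assert(operand != NULL || result == 0);
  return operand != NULL && result == 0;
}
```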
@end deftypefun @deftypefun counter avm_recoverable_area (list @var{operand}, int *@var{fault}) This function is similar to @code{avm_area}, except that it reacts differently to arithmetic overflow. The @code{fault} parameter should be the address of an integer known to the caller, which will be set to a non-zero value if overflow occurs. In that event, the value of zero will also be returned for the area. Note that it is possible for non-empty lists to have an area of zero, so this condition alone is not indicative of an error. @end deftypefun @deftypefun list avm_recoverable_natural (counter @var{number}) This function returns the @code{list} representation of a native unsigned long integer, provided that there is enough memory, similarly to the @code{avm_natural} function. Unlike that function, this one will return a value of @code{NULL} rather than exiting the program in the event of a memory overflow. The overflow can be detected by the caller insofar as a @code{NULL} @code{list} does not represent any number other than zero. @end deftypefun @node List Transformations, Type Conversions, Recoverable Operations, Lists @subsection List Transformations Some functions declared in @file{listfuns.h} are used to implement the operations described in @ref{List Functions}. These functions are able to report error messages in the event of overflow or other exceptional @cindex overflow @cindex exceptions @cindex error messages conditions, as described in @ref{Error Messages}. The error messages are represented as lists and returned to the caller. The occurrence of an error can be detected by the @code{*@var{fault}} flag being set to a non-zero value. None of these functions ever causes a program exit except in the event of an internal error. @deftypefun void avm_initialize_listfuns () This has to be called before any of the other functions in this section is called. It initializes the error message lists, among other things. 
@end deftypefun @deftypefun void avm_count_listfuns () At the end of a run, a call to this function can verify that no unreclaimed storage attributable to these functions persists. If it does, a warning is printed to standard error. If @code{avm_count_lists} is also used, it must be called after this function. @end deftypefun @deftypefun list avm_reversal (list @var{operand}, int *@var{fault}) The reversal of the list is returned by this function if no overflow occurs. A non-zero @code{*@var{fault}} and an error message are returned otherwise. The original @code{@var{operand}} still exists in its original order after this function is called. The amount of additional storage allocated is proportional only to the length of the list, not the size of its contents. @end deftypefun @deftypefun list avm_distribution (list @var{operand}, int *@var{fault}) This function performs the operation described in @ref{Distribute}. The invalid distribution message is returned in the event of a @code{NULL} operand. Otherwise, the returned value is the distributed list. In any event, the @code{@var{operand}} is unaffected. @end deftypefun @deftypefun list avm_concatenation (list @var{operand}, int *@var{fault}) @cindex concatenation The @code{@var{operand}} is treated as a pair of lists to be concatenated, with the left one in the @code{head} field and the right one in the @code{tail} field. The invalid concatenation message is returned in the event of a @code{NULL} @code{@var{operand}}. The result returned otherwise is the concatenation of the lists, but the given @code{@var{operand}} still exists unchanged. @end deftypefun @deftypefun list avm_transposition (list @var{operand}, int *@var{fault}) The operation performed by this function corresponds to that of @ref{Transpose}. Unlike other functions in this section, the operand passed to this function is deallocated, and must not be referenced @cindex @code{transpose} thereafter. 
The transposed list is accessible as the returned value of this function. If the original @code{@var{operand}} is still needed after a call to @code{avm_transposition}, only a copy of it should be passed to it, obtained from @code{avm_copied}. The invalid transpose error message is the result if the operand does not represent a list of equal length lists. @end deftypefun @deftypefun list avm_membership (list @var{operand}, int *@var{fault}) This function computes the membership predicate described in @cindex @code{member} @ref{Member}. The operand is a list in which the @code{tail} field is a list that will be searched for the item in the @code{head}. If the item is not found, a @code{NULL} list is returned, but otherwise a list with @code{NULL} @code{head} and @code{tail} fields is returned. If the operand is @code{NULL}, an error message of invalid membership is returned and @code{*@var{fault}} is set to a non-zero value. The @code{avm_membership} function calls @code{avm_binary_comparison} in order to compare lists, so the same efficiency and side-effect considerations are relevant to both (@ref{Comparison}). It is not necessary to @code{#include} the header file @code{compare.h} or to call @code{avm_initialize_compare} in order to use @code{avm_membership}, because they will be done automatically. @end deftypefun @deftypefun list avm_binary_membership (list @var{operand}, list @var{members}, int *@var{fault}); This function is the same as @code{avm_membership} except that it allows the element and the set of members to be passed as separate lists instead of being the head and the tail of the same list. @end deftypefun @deftypefun list avm_measurement (list @var{operand}, int *@var{fault}) This function implements the operation described in @ref{Weight}, which pertains to the weight of a tree. The returned value of this function is a list encoding the weight as a binary number, unless a counter overflow occurs, in which case it's an error message. 
As noted previously, the weight of a tree can easily be exponentially larger than the amount of @cindex native integer arithmetic memory it occupies, but this function uses native integer arithmetic for performance reasons. Hence, a counter overflow is a real possibility. @end deftypefun @node Type Conversions, Comparison, List Transformations, Lists @subsection Type Conversions External library functions accessed by the @code{library} combinator as explained in @ref{Library combinator} may operate on data other than the @code{list} type usually used by @code{avram}, such as floating point numbers and arrays, but a virtual code application must be able to represent the arguments and results of these functions in order to use them. As a matter of convention, a data structure occupying @var{size} bytes of contiguous storage on the host machine appears as a list of length @var{size} to a virtual code application, in which each item corresponds to a byte, and is represented according to @ref{Character Table}. In principle, a virtual code application invoking a library function to operate on a contiguous block of data, such as an IEEE double precision number, for example, would construct a list of eight character representations (one for each byte in a double precision number), and pass this list as an argument to the library function. The virtual machine would transparently convert this representation to the native floating point format, evaluate the function, and convert the result back to a list. In practice, high level language features beyond the scope of this document would insulate the programmer from some of the details on the application side as well. 
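The byte-per-item convention can be pictured with a short sketch in which an array of @code{unsigned char} plays the role of the list of character representations. The function names are hypothetical, and the sketch only models the correspondence between a value and its bytes, not the list construction performed by the virtual machine.

```c
#include <assert.h>
#include <string.h>

/* Copies the native bytes of a double into an array, one entry per byte.
   Each entry stands in for one character item of the list. */
void sketch_bytes_of_double(double x, unsigned char bytes[sizeof(double)])
{
  assert(sizeof(double) == 8);   /* eight bytes, as assumed in the text */
  memcpy(bytes, &x, sizeof x);
}

/* Rebuilds the double from its bytes, the inverse of the above. */
double sketch_double_of_bytes(const unsigned char bytes[sizeof(double)])
{
  double x;
  memcpy(&x, bytes, sizeof x);
  return x;
}
```

The order of the bytes is whatever the host machine uses natively, so a list built this way is meaningful only on a host with the same representation.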
To save the time of repeatedly converting between the list representation and the contiguous native binary representation, the structure referenced by a @code{list} pointer contains a @code{value} @cindex value field field which is a @code{void} pointer to a block of memory of unspecified type, and serves as a persistent cache of the value represented by the list. This field normally should be managed by the API rather than being accessed directly by client modules, but see the code in @file{mpfr.c} for an example of a situation in which it's appropriate to break this rule. (Generally these situations involve library functions operating on non-contiguous data.) @menu * Primitive types:: * One dimensional arrays:: * Two dimensional arrays:: * Related utility functions:: @end menu @node Primitive types, One dimensional arrays, Type Conversions, Type Conversions @subsubsection Primitive types A pair of functions in support of this abstraction is prototyped in @file{listfuns.h}. These functions will be of interest mainly to developers wishing to implement an interface to a new library module and make it accessible on the virtual side by way of the @code{library} combinator (@ref{Library combinator}). @deftypefun void *avm_value_of_list (list @var{operand}, list *@var{message}, int *@var{fault}) This function takes an @var{operand} representing a value used by a library function in the format described above (@ref{Type Conversions}) and returns a pointer to the value. The @code{value} field in the @var{operand} normally will point to the block of memory holding the value, and the @var{operand} itself will be a list of character representations whose binary encodings spell out the value as explained above. The @code{value} field need not be initialized on entry but it will be initialized as a side effect of being computed by this function. If it has been initialized due to a previous call with the same @var{operand}, this function is a fast constant time operation. 
The caller should not free the pointer returned by this function because a reference to its value will remain in the @var{operand}. When the @var{operand} itself is freed by @code{avm_dispose} (@ref{Simple Operations}), the value will go with it. If an error occurs during the evaluation of this function, the integer referenced by @var{fault} will be set to a non-zero value, and the list referenced by @var{message} will be assigned a representation of a list of strings describing the error. The @var{message} is freshly created and should be freed by the caller with @code{avm_dispose} when no longer needed. Possible error messages are @code{<'missing value'>}, in the case of @cindex missing value an empty @var{operand}, @code{<'invalid value'>} in the case of an @cindex invalid value @var{operand} that is not a list of character representations, and @code{<'memory overflow'>} if there was insufficient space to allocate the result. @end deftypefun @deftypefun list avm_list_of_value (void *@var{contents}, size_t @var{size}, int *@var{fault}) This function performs the inverse operation of @code{avm_value_of_list}, taking the address of an area of contiguously stored data and its @var{size} in bytes to a list representation. The length of the list returned is equal to the number of bytes of data, @var{size}, and each item of the list is a character representation for the corresponding byte as given by @ref{Character Table}. A copy of the memory area is made so that the original is no longer needed and may be freed by the caller. A pointer to this copy is returned by subsequent calls to @code{avm_value_of_list} when the result returned by this function is used as the @var{operand} parameter. If there is insufficient memory to allocate the result, the integer referenced by @var{fault} is set to a non-zero value, and a copy of the message @code{<'memory overflow'>} represented as a list is returned. 
This function could also cause a segmentation fault if it is @cindex segmentation fault passed an invalid pointer or a @var{size} that overruns the storage area. However, it is acceptable to specify a @var{size} that is less than the actual size of the given memory area to construct a list representing only the first part of it. The @var{size} must always be greater than zero. @end deftypefun @node One dimensional arrays, Two dimensional arrays, Primitive types, Type Conversions @subsubsection One dimensional arrays A couple of functions declared in @file{matcon.h} are concerned mainly with one dimensional arrays or vectors. They have been used for @cindex arrays vectors of double precision and complex numbers, but are applicable to @cindex vectors any base type that is contiguous and of a fixed size. The motivation for these functions is to enable a developer to present an API to virtual code applications wherein external library functions operating natively on one dimensional arrays of numbers are seen from the virtual side to operate on lists of numbers. Lists are the preferred container for interoperability with virtual code applications. @deftypefun void *avm_vector_of_list (list @var{operand}, size_t @var{item_size}, list *@var{message}, int *@var{fault}) This function calls @code{avm_value_of_list} (@ref{Primitive types}) for each item of the @var{operand} and puts all the values together into one contiguous block, whose address is returned. The given @var{item_size} is required to be the lengths of the items, all necessarily equal, and is required only for validation. For example, @var{item_size} is 8 for a list of double precision numbers, because they occupy 8 bytes each and are represented as lists of length 8. The total number of bytes allocated is the product of @var{item_size} and the length of the @var{operand}. 
Unlike the case of @code{avm_value_of_list} (@ref{Primitive types}), the result returned by this function should be explicitly freed by the caller when no longer needed. Any errors such as insufficient memory cause the integer referenced by @var{fault} to be assigned a non-zero value and the @var{message} to be assigned an error message represented as a list of strings. An error message of @code{<'bad vector specification'>} is possible in the case of an empty @var{operand} or one whose item lengths don't match the given @var{item_size}. Error messages caused by @code{avm_value_of_list} can also be generated by this function. Any non-empty error message should be reclaimed by the caller using @code{avm_dispose} (@ref{Simple Operations}). If an error occurs, a @code{NULL} pointer is returned. @end deftypefun @deftypefun list avm_list_of_vector (void *@var{vector}, int @var{num_items}, size_t @var{item_size}, int *@var{fault}) This function takes it on faith that an array of dimension @var{num_items} in which each item occupies @var{item_size} bytes begins at the address given by @var{vector}. A list representation of each item in the array is constructed by the function @code{avm_list_of_value} (@ref{Primitive types}), and a list of all of the lists thus obtained in order of their position in the array is returned. In the event of any errors caused by @code{avm_list_of_value} or errors due to insufficient memory, the error message is returned as the function result, and the integer referenced by @var{fault} is assigned a non-zero value. The error message is in the form of a list of character string representations. A segmentation fault is possible @cindex segmentation fault if @var{vector} is not a valid pointer or if the array size implied by misspecified values of @var{num_items} and @var{item_size} exceeds its actual size. 
@end deftypefun @node Two dimensional arrays, Related utility functions, One dimensional arrays, Type Conversions @subsubsection Two dimensional arrays Several other functions in @file{matcon.h} are meant to support conversions between matrices represented as lists of lists and arrays in a variety of representations. Dense matrices either square or @cindex matrices rectangular are accommodated, and symmetric square matrices can be stored with redundant entries omitted in either upper triangular or lower triangular format. Similarly to the vector operations (@ref{One dimensional arrays}) these functions are intended to allow a developer to present an interface to external libraries based on lists rather than arrays. The preferred convention for virtual code applications is to represent a matrix as a list of lists of entities (typically numbers), with one list for each row of the matrix. For example, a 3 by 3 matrix containing a value of @code{aij} in the @code{i}-th row and the @code{j}-th column would be represented by this list of three lists. @example < <a11,a12,a13>, <a21,a22,a23>, <a31,a32,a33>> @end example @noindent Such a representation is convenient for manipulation by virtual machine combinators, for example @code{transpose} (@ref{Transpose}), and is readily identified with the matrix it represents. If a matrix is symmetric (that is, with @code{aij} equal to @code{aji} for all values of @code{i} and @code{j}), only the lower triangular portion needs to be stored because the other entries are @cindex triangular matrix redundant. The list representation would be something like this. @example < <a11>, <a21,a22>, <a31,a32,a33>> @end example Another alternative for representing a symmetric matrix is to store only the upper triangular portion. In this case, a list such as the following would be used. @example < <a11,a12,a13>, <a22,a23>, <a33>> @end example @noindent The upper and lower triangular representations are distinguishable by whether or not the row lengths form an increasing sequence.
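The way the three list forms can be told apart by their row lengths is sketched below. The matrix is reduced to an array holding the length of each row, and the names @code{sketch_shape} and @code{sketch_classify} are hypothetical. The sketch only requires the lengths to be strictly increasing or decreasing, which is an assumption about how strictly the forms are checked.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_shape {
  SKETCH_RECTANGULAR,   /* all rows the same length */
  SKETCH_LOWER,         /* row lengths increase */
  SKETCH_UPPER,         /* row lengths decrease */
  SKETCH_IRREGULAR      /* none of the above */
};

/* Classifies a matrix by its row lengths, where lengths[i] is the
   number of entries in row i. */
enum sketch_shape sketch_classify(const size_t *lengths, size_t rows)
{
  int constant = 1, increasing = 1, decreasing = 1;
  size_t i;

  assert(lengths != NULL && rows > 0);
  for (i = 1; i < rows; i++) {
    if (lengths[i] != lengths[i - 1]) constant = 0;
    if (lengths[i] <= lengths[i - 1]) increasing = 0;
    if (lengths[i] >= lengths[i - 1]) decreasing = 0;
  }
  if (constant) return SKETCH_RECTANGULAR;
  if (increasing) return SKETCH_LOWER;
  if (decreasing) return SKETCH_UPPER;
  return SKETCH_IRREGULAR;
}
```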
In addition to representing symmetric matrices, these upper and lower @cindex symmetric matrix triangular forms are also appropriate for representing matrices whose remaining entries are zero, such as the factors in an LU decomposition. @cindex LU decomposition @deftypefun void *avm_matrix_of_list (int @var{square}, int @var{upper_triangular}, int @var{lower_triangular}, int @var{column_major}, list @var{operand}, size_t @var{item_size}, list *@var{message}, int *@var{fault}) This function converts a matrix in one of the list representations above to a contiguous array according to the given specifications. The array can contain elements of any fixed sized type of size @var{item_size}. The memory for it is allocated by this function and it should be freed by the caller when no longer needed. The input matrix is given by the list parameter, @var{operand}, and its format is described by the integer parameters @var{square}, @var{upper_triangular}, and @var{lower_triangular}. The number of bytes occupied by each entry is given by @var{item_size}. To the extent these specifications are redundant, they are used for validation. If any of the following conditions is not met, the integer referenced by @var{fault} is assigned a non-zero value and a copy of the message @code{<'bad matrix specification'>} represented as a list is assigned to the list referenced by @var{message}. Errors are also possible due to insufficient memory. @itemize @bullet @item The @var{operand} must be a list of lists of lists such that each item of each item has a length of @var{item_size}, and its items consist of character representations as required by @code{avm_value_of_list} (@ref{Primitive types}). @item If the lengths of the top level lists in the @var{operand} form an increasing sequence, the lower triangular representation is assumed and the @var{lower_triangular} parameter must have a non-zero value.
@item If the lengths of the top level lists in the @var{operand} form a decreasing sequence, the upper triangular representation is assumed and the @var{upper_triangular} parameter must have a non-zero value. @item At least one of @var{upper_triangular} or @var{lower_triangular} must be zero. @item If @var{square} has a non-zero value, then either all items of the @var{operand} must have the same length as the operand, or if it's triangular, then the longest one must have the same length as the operand. @item If the @var{operand} is neither square nor a triangular form, all items of it are required to have the same length. @end itemize The parameters @var{upper_triangular} or @var{lower_triangular} may be set to non-zero values even if the @var{operand} is not in one of the upper or lower triangular forms discussed above. In this case, the @var{operand} must be square or rectangular (i.e., with all items the same length), and the following interpretations apply. @itemize @bullet @item If @var{upper_triangular} is non-zero, the diagonal elements and the upper triangular portion of the input matrix are copied to the output. The lower triangle of the input is ignored and the lower triangle of the output is left uninitialized. @item If @var{lower_triangular} is non-zero, the diagonal elements and the lower triangular portion of the input matrix are copied to the output. The upper triangle of the input is ignored and the upper triangle of the output is left uninitialized. @end itemize The @var{column_major} parameter affects the form of the output array. If it is zero, then each row of the input matrix is stored in a contiguous block of memory in the output array, and if it is non-zero, each column is stored contiguously. @cindex Fortran The latter representation is also known as Fortran order and may be required by library functions written in Fortran. In all cases when a triangular form is specified, part of the output matrix is left uninitialized. 
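The effect of the @var{column_major} parameter on the output array can be pictured as a choice between two offset calculations. The following sketch uses zero-based indices and a hypothetical function name, and is meant only to illustrate the two layouts rather than to reproduce the library's code.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of entry (i,j), counting from zero, in an array holding
   a rows-by-cols matrix with entries of item_size bytes each. */
size_t sketch_offset(int column_major, size_t i, size_t j,
                     size_t rows, size_t cols, size_t item_size)
{
  assert(i < rows && j < cols);
  if (column_major)
    return (j * rows + i) * item_size;   /* columns contiguous (Fortran order) */
  return (i * cols + j) * item_size;     /* rows contiguous */
}
```

For a 2 by 3 matrix of 8 byte entries, for example, the entry in the second row and first column lies at byte offset 24 in row major order and at byte offset 8 in column major order.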
The redundant entries may be assigned if required by the @code{avm_matrix_reflection} function (@ref{Related utility functions}). @end deftypefun @deftypefun list avm_list_of_matrix (void *@var{matrix}, int @var{rows}, int @var{cols}, size_t @var{item_size}, int *@var{fault}) This function performs an inverse operation to @code{avm_matrix_of_list} by taking the address of a matrix stored as a contiguous array in the parameter @var{matrix} and constructing the list representation as discussed above. Only square and rectangular matrices in row major order are supported, but see @code{avm_matrix_transposition} for a way to convert between row major @cindex column major order and column major order (@ref{Related utility functions}). The parameters @var{rows}, @var{cols}, and @var{item_size} describe the form of the matrix. The list returned as a result will have a length of @var{rows}, and each item will be a list of length @var{cols}. Each item of the result corresponds to a row of the matrix, and each item of the items represents an entry of the matrix as a list of length @var{item_size}. These items could be passed to @code{avm_value_of_list}, for example, to obtain their values (@ref{Primitive types}). Memory is allocated by this function to create the list, which can be reclaimed by @code{avm_dispose} (@ref{Simple Operations}). If there is insufficient memory, the integer referenced by @var{fault} is assigned a non-zero value and the result returned is a list representation of the message @code{<'memory overflow'>}. The error message should be reclaimed by the caller as well using @code{avm_dispose}. @end deftypefun A packed storage representation for symmetric square matrices and @cindex packed arrays triangular matrices is of interest because it is used by some library functions, notably those in @code{LAPACK}, to save memory and thereby accommodate larger problems.
In this representation, column major @cindex column major order order is assumed, and either the lower or the upper triangle of the matrix is not explicitly stored. For example, a lower triangular @cindex triangular matrix matrix whose list representation corresponds to @example < <a11>, <a21,a22>, <a31,a32,a33>, <a41,a42,a43,a44>> @end example @noindent would be stored according to the memory map @cindex matrix memory map @example [a11 a21 a31 a41 a22 a32 a42 a33 a43 a44] @end example @noindent with @code{a11} at the beginning address. An upper triangular matrix @example < <a11,a12,a13,a14>, <a22,a23,a24>, <a33,a34>, <a44>> @end example @noindent would be stored according to the memory map @example [a11 a12 a22 a13 a23 a33 a14 a24 a34 a44]. @end example A couple of functions converting between list representations and packed array format are provided as described below. @deftypefun void *avm_packed_matrix_of_list (int @var{upper_triangular}, list @var{operand}, int @var{n}, size_t @var{item_size}, list *@var{message}, int *@var{fault}) If the @var{operand} is a list in one of the triangular forms explained above, then the @var{upper_triangular} parameter must be consistent with it, being non-zero if the @var{operand} is upper triangular and zero otherwise. If the @var{operand} is not in a triangular form, then each item of the operand must be a list of length @var{n}. In this case, the @var{upper_triangular} parameter indicates which triangle of the operand should be copied to the result, and the other triangle is ignored. In either case, the operand must have a length of @var{n}, and the items of its items must be lists of length @var{item_size} containing character representations as required by @code{avm_value_of_list} (@ref{Primitive types}).
If the input parameters are inconsistent or if there is insufficient memory to allocate the result, the integer referenced by @var{fault} is assigned a non-zero value, and the list referenced by @var{message} is assigned a copy of the list representation of @code{<'bad matrix specification'>} or @code{<'memory overflow'>}, respectively. A non-empty message must be reclaimed by the caller using @code{avm_dispose} (@ref{Simple Operations}). If there are no errors, the result is a pointer to a packed array representation of the @var{operand} as explained above. The memory for this result is allocated by this function and should be freed by the caller when no longer required. The number of bytes allocated will be @var{item_size} * (@var{n} * (@var{n} + 1))/2. @end deftypefun @deftypefun list avm_list_of_packed_matrix (int @var{upper_triangular}, void *@var{operand}, int @var{n}, size_t @var{item_size}, int *@var{fault}) This function performs an inverse operation to that of @code{avm_packed_matrix_of_list} given the address of a packed matrix stored according to one of the memory maps discussed above. The @var{operand} parameter holds the address, the parameter @var{n} gives the number of rows, and the @var{upper_triangular} parameter specifies which of the two possible memory maps to assume. If there is sufficient memory, the result returned is a list in one of the triangular forms described above, being upper triangular if the @var{upper_triangular} parameter is non-zero, with values of length @var{item_size} taken from the array. In the event of a memory overflow, the integer referenced by @var{fault} is assigned a non-zero value and the result is a copy of the message @code{<'memory overflow'>} represented as a list. A @cindex segmentation fault segmentation fault is possible if this function is passed an invalid pointer or dimension.
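The two packed memory maps described earlier in this section can be summarized by index formulas, sketched here with zero-based row and column numbers. The function names are hypothetical, and the formulas are derived from the memory maps shown above rather than taken from the library source.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries stored for an n by n triangular matrix. */
size_t sketch_packed_count(size_t n)
{
  return n * (n + 1) / 2;
}

/* Index of entry (i,j), counting from zero, in the packed column major
   array; multiply by item_size to obtain a byte offset. */
size_t sketch_packed_index(int upper_triangular, size_t i, size_t j, size_t n)
{
  assert(i < n && j < n);
  if (upper_triangular) {
    assert(i <= j);                      /* only the upper triangle is stored */
    return i + j * (j + 1) / 2;
  }
  assert(i >= j);                        /* only the lower triangle is stored */
  return i + j * (2 * n - j - 1) / 2;
}
```

With @var{n} equal to 4, the lower triangular formula places @code{a22} at index 4 and the upper triangular formula places @code{a14} at index 6, in agreement with the memory maps.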
@end deftypefun @node Related utility functions, , Two dimensional arrays, Type Conversions @subsubsection Related utility functions A small selection of additional functions that are likely to be of use to developers concerned with matrix operations has been incorporated into the API to save the trouble of reinventing them, although doing so would be straightforward. They are described in this section without further motivation. @deftypefun void *avm_matrix_transposition (void *@var{matrix}, int @var{rows}, int @var{cols}, size_t @var{item_size}) This function takes the address of an arbitrary rectangular @var{matrix} represented as a contiguous array (not a list) and transposes it in place. That is, this function transforms an @var{m} by @var{n} matrix to an @var{n} by @var{m} matrix by exchanging the @var{i},@var{j}th element with the @var{j},@var{i}th element for all values of @var{i} and @var{j}. The numbers of rows and columns in the @var{matrix} are given by the parameters @var{rows} and @var{cols}, respectively, and the size of the entries in bytes is given by @var{item_size}. The @var{matrix} is assumed to be in row major order, but this function is applicable to matrices in column major order if the caller @cindex column major order passes the number of columns in @var{rows} and the number of rows in @var{cols}. Alternatively, this function can be seen as a conversion between the row major and the column major representation of a matrix. An @var{m} by @var{n} matrix in row major order will be transformed to the same @var{m} by @var{n} matrix in column order, or from column order to row order. A notable feature of this function is that it allocates no memory so there is no possibility of a memory overflow even for very large matrices, unlike a naive implementation which would involve making a temporary copy of the matrix. There is a possibility of a segmentation @cindex segmentation fault fault if invalid pointers or dimensions are given. 
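One way to transpose a rectangular matrix in place without allocating memory is to follow the cycles of the permutation that the transposition induces on the array indices. The sketch below takes that approach; it is not necessarily the method used by @code{avm_matrix_transposition}, the names are hypothetical, and no attempt is made to guard against overflow of the index arithmetic for very large matrices.

```c
#include <assert.h>
#include <stddef.h>

/* Exchanges two non-overlapping items of n bytes each. */
static void sketch_swap(unsigned char *a, unsigned char *b, size_t n)
{
  while (n--) {
    unsigned char t = *a;
    *a++ = *b;
    *b++ = t;
  }
}

/* Transposes a rows-by-cols row major matrix in place by following the
   cycles of the index permutation; no heap memory is used. */
void sketch_transpose(void *matrix, size_t rows, size_t cols, size_t item_size)
{
  unsigned char *base = matrix;
  size_t total = rows * cols;
  size_t last, start, k;

  if (total < 2)
    return;
  assert(matrix != NULL && item_size > 0);
  last = total - 1;
  /* The item at index k moves to index (k * rows) % last; the first and
     the last items stay where they are. */
  for (start = 1; start < last; start++) {
    k = start;
    do
      k = (k * rows) % last;
    while (k > start);
    if (k < start)
      continue;                /* cycle already handled from a smaller index */
    for (k = (start * rows) % last; k != start; k = (k * rows) % last)
      sketch_swap(base + start * item_size, base + k * item_size, item_size);
  }
}
```

The price of avoiding a temporary copy in this sketch is that each cycle is traversed more than once, so the running time can exceed that of the naive method.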
@end deftypefun @deftypefun void *avm_matrix_reflection (int @var{upper_triangular}, void *@var{matrix}, int @var{n}, size_t @var{item_size}) This function takes a symmetric square @var{matrix} of dimension @var{n} containing entries of @var{item_size} bytes each and fills in the redundant entries. If @var{upper_triangular} is non-zero, the upper triangle of the @var{matrix} is copied to the lower triangle. If @var{upper_triangular} is zero, the lower triangular entries are copied to the upper triangle. These conventions assume row major order. If the @var{matrix} is in @cindex row major order column major order, then the caller can either transpose it in place @cindex column major order before and after this function by @code{avm_matrix_transposition}, or can complement the value of @var{upper_triangular}. Note that this function may be unnecessary for @code{LAPACK} library functions that ignore the redundant entries in a symmetric matrix, because they can be left uninitialized, but it is included for the sake of completeness. @end deftypefun @deftypefun list *avm_row_number_array (counter @var{m}, int *@var{fault}) A fast, memory efficient finite map from natural numbers to their list representations can be obtained by using this function as an alternative to @code{avm_natural} or @code{avm_recoverable_natural} when repeated evaluations of numbers within a known range are required (@ref{Simple Operations} and @ref{Recoverable Operations}). Given a positive integer @var{m}, this function allocates and returns an array of @var{m} lists whose @var{i}th entry is the list representation of the number @var{i} as explained in @ref{Representation of Numeric and Textual Data}. An amount of memory proportional to @var{m} is used for the array and its contents. If there is insufficient memory, a @code{NULL} value is returned and the integer referenced by @var{fault} is set to a non-zero value. 
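The idea of a table of precomputed number representations can be sketched as follows. The @code{struct node} layout is a simplified stand-in for the one in @file{lists.h}, all names beginning with @code{sketch_} are hypothetical, entries are assumed to be indexed from zero, and the sketch simply aborts on memory overflow instead of setting a fault flag. It also makes no attempt to share storage between entries.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the list type; not the definition in lists.h. */
typedef struct node *list;
struct node { list head; list tail; };
typedef unsigned long counter;

static list sketch_cons(list head, list tail)
{
  list cell = malloc(sizeof *cell);
  if (cell == NULL)
    abort();                     /* the sketch gives up on memory overflow */
  cell->head = head;
  cell->tail = tail;
  return cell;
}

/* Bits least significant first: NULL for 0, a cell with NULL fields for 1. */
list sketch_natural(counter number)
{
  if (number == 0)
    return NULL;
  return sketch_cons((number & 1) ? sketch_cons(NULL, NULL) : NULL,
                     sketch_natural(number >> 1));
}

/* Inverse of sketch_natural, used for checking. */
counter sketch_counter_of(list bits)
{
  counter value = 0, weight = 1;
  for (; bits != NULL; bits = bits->tail, weight <<= 1)
    if (bits->head != NULL)
      value |= weight;
  return value;
}

void sketch_dispose(list x)
{
  if (x == NULL)
    return;
  sketch_dispose(x->head);
  sketch_dispose(x->tail);
  free(x);
}

/* Array whose entry i represents the number i, for i from 0 to m - 1. */
list *sketch_row_numbers(counter m)
{
  list *rows;
  counter i;

  assert(m > 0);
  rows = malloc(m * sizeof *rows);
  if (rows == NULL)
    abort();
  for (i = 0; i < m; i++)
    rows[i] = sketch_natural(i);
  return rows;
}

void sketch_dispose_rows(counter m, list *rows)
{
  counter i;

  if (rows == NULL)
    return;
  for (i = 0; i < m; i++)
    sketch_dispose(rows[i]);
  free(rows);
}
```

Looking up an entry of the array is then a constant time operation, whereas constructing the representation afresh takes time proportional to the number of bits.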
@end deftypefun @deftypefun void avm_dispose_rows (counter @var{m}, list *@var{row_number}) This function reclaims an array @var{row_number} of size @var{m} returned by @code{avm_row_number_array}, and its contents if any. A @code{NULL} pointer is allowed as the @var{row_number} parameter and will have no effect, but an uninitialized pointer will cause a @cindex segmentation fault segmentation fault. @end deftypefun @deftypefun void avm_initialize_matcon (); This function initializes some static variables used by the functions declared in @file{matcon.h} and should be called before any of them is called or they might not perform according to specifications. @end deftypefun @deftypefun void avm_count_matcon (); This function frees the static variables allocated by @code{avm_initialize_matcon} and is used to verify the absence of memory leaks. It should be called after the last call to any functions in @file{matcon.h} but before @code{avm_count_lists} if the latter is being used (@ref{Simple Operations}). @end deftypefun @node Comparison, Deconstruction Functions, Type Conversions, Lists @subsection Comparison The file @file{compare.h} contains a few function declarations pertaining to the computation of the comparison predicate described in @ref{Compare}. Some of the work is done by static functions in @file{compare.c} that are not recommended entry points to the library. @deftypefun void avm_initialize_compare () @cindex @code{compare} This function should be called once before the first call to @code{avm_comparison}, as it initializes some necessary internal data structures. @end deftypefun @deftypefun void avm_count_compare () This function can be used to check for memory leaks, by detecting unreclaimed storage at the end of a run. The data structures relevant to comparison that could be reported as unreclaimed are known as ``decision'' nodes, but these should always be handled properly by the library without intervention. 
If @code{avm_count_lists} is also being used, the call to this function must precede it. @end deftypefun @deftypefun list avm_comparison (list @var{operand}, int *@var{fault}) This function takes a list operand representing a pair of trees and returns a list representing the logical value of their equality. If the operand is @code{NULL}, a message of invalid comparison is returned and the @code{*@var{fault}} is set to a non-zero value. If the @code{head} of the operand is unequal to the @code{tail}, a @code{NULL} value is returned. If they are equal, a list is returned whose @code{head} and @code{tail} are both @code{NULL}. The equality in question is structural @cindex pointer equality rather than pointer equality. The list operand to this function may be modified by this function, but not in a way that should make any difference to a client program. If two lists are found to be equal, or if even two sublists are found to be equal in the course of the comparison, one of them is deallocated and made to point to the other. This action saves memory and may make subsequent comparisons faster. However, it could disrupt client programs @cindex pointers that happen to be holding stale list pointers. @cindex discontiguous field As of @code{avram} version 0.6.0, a logical field called @code{discontiguous} has been added to the @code{node} record type declared in @code{lists.h}, which is checked by the comparison function. If a list node has its @code{discontiguous} field set to a non-zero value, and if it also has a non-null @code{value} field, then it won't be deallocated in the course of comparison even if it is found to be equal to something else. This feature can be used by client modules to create lists in which value fields refer to data structures that are meant to exist independently of them. See @file{mpfr.c} for an example. 
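The difference between structural and pointer equality can be shown with a small self-contained sketch. It uses a hypothetical node type with only @code{head} and @code{tail} fields and an explicit work stack in place of recursion. It does not perform the sharing optimization described above, and it is not the algorithm used in @file{compare.c}.

```c
#include <stdlib.h>

/* Hypothetical tree node; the library's own node type has more fields. */
struct sketch_node {
    struct sketch_node *head, *tail;
};

/* Returns 1 if the trees are structurally equal, 0 if they differ,
   and -1 if memory for the work stack runs out. */
int sketch_equal(const struct sketch_node *x, const struct sketch_node *y)
{
    size_t capacity = 16, depth = 0;   /* both measured in pairs */
    const struct sketch_node **stack = malloc(2 * capacity * sizeof *stack);
    int result = 1;

    if (stack == NULL)
        return -1;
    stack[0] = x;
    stack[1] = y;
    depth = 1;
    while (depth > 0 && result == 1) {
        depth--;
        x = stack[2 * depth];
        y = stack[2 * depth + 1];
        if (x == y)
            continue;                  /* same pointer, or both NULL */
        if (x == NULL || y == NULL) {
            result = 0;                /* one tree ends before the other */
            break;
        }
        if (depth + 2 > capacity) {    /* make room for two more pairs */
            const struct sketch_node **bigger;

            capacity *= 2;
            bigger = realloc(stack, 2 * capacity * sizeof *stack);
            if (bigger == NULL) {
                result = -1;
                break;
            }
            stack = bigger;
        }
        stack[2 * depth] = x->head;
        stack[2 * depth + 1] = y->head;
        depth++;
        stack[2 * depth] = x->tail;
        stack[2 * depth + 1] = y->tail;
        depth++;
    }
    free(stack);
    return result;
}
```

Two separately allocated trees of the same shape compare as equal here even though their addresses differ, which is the sense of equality intended by @code{avm_comparison}.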
This function is likely to have better performance and memory usage than a naive implementation of comparison, for the above reasons and also because of optimizations pertaining to comparison of lists representing characters. Moreover, it is not subject to stack overflow exceptions @cindex recursion because it is not written in a recursive style. @end deftypefun @deftypefun list avm_binary_comparison (list @var{left_operand}, list @var{right_operand}, int *@var{fault}); This function is the same as @code{avm_comparison} except that it allows the left and right operands to be passed as separate lists rather than taking them from the @code{head} and the @code{tail} of a single list. @end deftypefun @node Deconstruction Functions, Indirection, Comparison, Lists @subsection Deconstruction Functions A fast native implementation of the deconstruction operation is provided @cindex deconstruction by the functions declared in @file{decons.h}. @deftypefun void avm_initialize_decons () This should be called prior to the first call to @code{avm_deconstruction}, so as to initialize some necessary internal data structures. Results will be undefined if it is not. @end deftypefun @deftypefun void avm_count_decons () For ecologically sound memory management, this function should be called at the end of a run to verify that there have been no leaks due to the deconstruction functions, which there won't be unless the code in @file{decons.c} has been ineptly modified. An error message to the effect of unreclaimed ``points'' could be the result otherwise. @end deftypefun @deftypefun list avm_deconstruction (list @var{pointer}, list @var{operand}, int *@var{fault}) Deconstructions are performed by this function, as described in @ref{Field}. In the @code{silly} program notation (@ref{A Simple Lisp Like Language}), this function computes the value of ([[@code{field}]] @code{@var{pointer}}) @code{@var{operand}}. 
For example, using the fixed list @code{avm_join(NULL,NULL)} as the @code{@var{pointer}} parameter will cause a copy of the operand itself to be returned as the result. A @code{@var{pointer}} equal to @code{avm_join(NULL,avm_join(NULL,NULL))} will cause a copy of @code{operand->tail} to be returned, and so on. A @code{NULL} @code{@var{pointer}} causes an internal error. If the deconstruction is invalid, as in the case of the tail of an empty list, the invalid deconstruction error message is returned as the result, and the @code{*@var{fault}} parameter is set to a non-zero value. The @code{*@var{fault}} parameter is also set to a non-zero value in the event of a memory overflow, and the memory overflow message is returned. @end deftypefun @node Indirection, The Universal Function, Deconstruction Functions, Lists @subsection Indirection In some cases it is necessary to build a tree from the top down rather @cindex pointers than from the bottom up, when it is not known in advance what's on the bottom. Although the @code{list} type is a pointer itself, these situations call for a type of pointers to lists, which are declared as the @code{branch} type in @file{branches.h}. For example, if @code{b} is declared as a @code{branch} and @code{l} is declared as a @code{list}, it would be possible to write @code{b = &l}. Facilities are also provided for maintaining queues of branches, which @cindex queues are declared as the @code{branch_queue} type. This type is a pointer to a structure with two fields, @code{above} and @code{following}, where @code{above} is a @code{branch} and @code{following} is a @code{branch_queue}. These functions are used internally elsewhere in the library and might not be necessary for most client programs to use directly. @deftypefun void avm_initialize_branches () This must be done once before any of the other branch related functions is used, and creates some internal data structures. 
Results of the other functions are undefined if this one isn't called first.
@end deftypefun

@deftypefun void avm_count_branches ()
This function can be used at the end of a run to detect unreclaimed storage used for branches or branch queues. If any storage remains unreclaimed, a message about unreclaimed branches is written to standard error.
@end deftypefun

@deftypefun void avm_anticipate (branch_queue *@var{front}, branch_queue *@var{back}, branch @var{operand})
This function provides a simple queueing facility for branches. As with @code{avm_enqueue}, @code{front} and @code{back} should be initialized to @code{NULL} before the first call. Each call to this function will enqueue one item to the back, assuming enough memory is available, as the following example shows.

@example
front = NULL;
back = NULL;
l = avm_join(NULL,NULL);
avm_anticipate(&front,&back,&(l->head));
avm_anticipate(&front,&back,&(l->tail));
@end example

After the above code is executed, these postconditions will hold.

@example
front->above == &(l->head)
front->following->above == &(l->tail)
front->following == back
back->following == NULL
@end example

The name ``anticipate'' is used because ordinarily the queue contains positions in a tree to be filled in later. As usual, only unshared trees should be modified in place.
@end deftypefun

@deftypefun void avm_recoverable_anticipate (branch_queue *@var{front}, branch_queue *@var{back}, branch @var{operand}, int *@var{fault})
This function is similar to @code{avm_anticipate}, except that it will not exit with an error message in the event of an overflow error, but will simply set @code{*@var{fault}} to a non-zero value and return to the caller. If an overflow occurs, nothing about the queue is changed.
@end deftypefun @deftypefun void avm_enqueue_branch (branch_queue *@var{front}, branch_queue *@var{back}, int @var{received_bit}) A slightly higher level interface to the @code{avm_anticipate} function is provided by this function, which is useful for building a tree from @cindex trees a string of input bits in a format similar to the one described in @ref{Concrete Syntax}. This function should be called the first time with @code{front} and @code{back} having been initialized to represent a queue containing a @cindex queues single branch pointing to a list known to the caller. The list itself need not be allocated or initialized. An easy way of doing so would be the following. @example front = NULL; back = NULL; avm_anticipate(&front,&back,&my_list); @end example On each call to @code{avm_enqueue_branch}, the @code{@var{received_bit}} parameter is examined. If it is zero, nothing will be added to the queue, the list referenced by the front branch will be assigned @code{NULL}, and the front branch will be removed from the queue. If @code{@var{received_bit}} is a non-zero value, the list referenced by the front branch will be assigned to point to a newly created unshared list node, and two more branches will be appended to the queue. The first branch to be appended will point to the head of the newly created list node, and the second branch to be appended will point to the tail. If the sequence of bits conforms to the required concrete syntax, this function can be called for each of them in turn, and at the end of the sequence, the queue will be empty and the list referenced by the initial branch (i.e., @code{my_list}) will be the one specified by the bit string. If the sequence of bits does not conform to the required concrete syntax, the error can be detected insofar as the emptying of the queue will not coincide exactly with the last bit. 
The caller should check for the queue becoming prematurely empty due to syntax errors, because no message is reported by @code{avm_enqueue_branch} in that event, and subsequent attempts to enqueue anything are ignored. However, in the event of a memory overflow, an error message is reported and the process is terminated. @end deftypefun @deftypefun void avm_recoverable_enqueue_branch (branch_queue *@var{front}, branch_queue *@var{back}, int @var{received_bit}, int *@var{fault}) This function is similar to @code{avm_enqueue_branch} but will leave error handling to the caller in the event of insufficient memory to enqueue another branch. Instead of printing an error message and exiting, it will dispose of the queue, set the @code{@var{fault}} flag to a non-zero value, and return. Although the queue will be reclaimed, the lists referenced by the branches in it will persist. The list nodes themselves can be reclaimed by disposing of the list whose address was stored originally in the front branch. @end deftypefun @deftypefun void avm_dispose_branch_queue (branch_queue @var{front}) This function deallocates a branch queue by chasing the @code{following} fields in each one. It does nothing to the lists referenced by the branches in the queue. Rather than using @code{free} directly, client programs should use this function for deallocating branch queues, because it allows better performance by interacting with a local internal cache of free memory, and because it performs necessary bookkeeping for @code{avm_count_branches}. @end deftypefun @deftypefun void avm_dispose_branch (branch_queue @var{old}) This disposes of a single branch queue node rather than a whole queue. Otherwise, the same comments as those above apply. 
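The queue discipline described in this section, from @code{avm_anticipate} through @code{avm_enqueue_branch} to the disposal functions, can be sketched in self-contained C as follows. All of the @code{sketch_} names and types are hypothetical stand-ins written for illustration. They use @code{malloc} and @code{free} directly, whereas the library maintains its own cache and bookkeeping.

```c
#include <stdlib.h>

typedef struct sketch_node { struct sketch_node *head, *tail; } *sketch_list;
typedef sketch_list *sketch_branch;
typedef struct sketch_queue_node {
    sketch_branch above;
    struct sketch_queue_node *following;
} *sketch_branch_queue;

/* Append one branch to the back; returns non-zero if out of memory,
   in which case the queue is left unchanged. */
static int sketch_anticipate(sketch_branch_queue *front,
                             sketch_branch_queue *back, sketch_branch operand)
{
    sketch_branch_queue fresh = malloc(sizeof *fresh);

    if (fresh == NULL)
        return 1;
    fresh->above = operand;
    fresh->following = NULL;
    if (*front == NULL)
        *front = fresh;
    else
        (*back)->following = fresh;
    *back = fresh;
    return 0;
}

/* Free every queue node; the lists referenced by the branches are untouched. */
static void sketch_dispose_branch_queue(sketch_branch_queue front)
{
    while (front != NULL) {
        sketch_branch_queue next = front->following;
        free(front);
        front = next;
    }
}

/* Consume one bit: zero fills the front position with NULL, non-zero fills
   it with a new node and queues that node's head and tail positions. */
static int sketch_enqueue_branch(sketch_branch_queue *front,
                                 sketch_branch_queue *back, int received_bit)
{
    sketch_branch_queue old = *front;

    if (old == NULL)
        return 0;                      /* empty queue: the bit is ignored */
    if (received_bit == 0) {
        *(old->above) = NULL;
    } else {
        sketch_list fresh = calloc(1, sizeof *fresh);

        if (fresh == NULL)
            return 1;
        *(old->above) = fresh;
        if (sketch_anticipate(front, back, &fresh->head)
            || sketch_anticipate(front, back, &fresh->tail))
            return 1;                  /* caller disposes of the queue */
    }
    *front = old->following;
    if (*front == NULL)
        *back = NULL;
    free(old);
    return 0;
}

/* Build *result from count bits; returns the number of bits consumed when
   the queue empties, or -1 on memory failure or a premature end of input. */
int sketch_build(sketch_list *result, const int *bits, int count)
{
    sketch_branch_queue front = NULL, back = NULL;
    int used = 0;

    if (sketch_anticipate(&front, &back, result))
        return -1;
    while (front != NULL && used < count)
        if (sketch_enqueue_branch(&front, &back, bits[used++])) {
            sketch_dispose_branch_queue(front);
            return -1;
        }
    if (front != NULL) {               /* positions remain unfilled */
        sketch_dispose_branch_queue(front);
        return -1;
    }
    return used;
}
```

With the bits 1, 0, 1, 0, 0 the sketch builds a node whose head is @code{NULL} and whose tail is a node with two @code{NULL} fields, and the queue becomes empty exactly on the last bit.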
@end deftypefun @node The Universal Function, , Indirection, Lists @subsection The Universal Function @cindex universal function A function computing the result of the invisible operator used to specify the virtual code semantics in @ref{Virtual Code Semantics}, is easily available by way of a declaration in @file{apply.h}. @deftypefun void avm_initialize_apply () This function should be called by the client program at least once prior to the first call to @code{avm_apply} or @code{avm_recoverable_apply}. It causes certain internal data structures and error message texts to be initialized. @end deftypefun @deftypefun void avm_count_apply () This function should be used at the end of a run for the purpose of detecting and reporting any unreclaimed storage associated with functions in this section. If the function @code{avm_count_lists()} is also being used, it should be called after this one. @end deftypefun @deftypefun list avm_apply (list @var{operator}, list @var{operand}) This is the function that evaluates the operator used to describe the virtual code semantics. For example, the value of @code{@var{f} @var{x}} can be obtained as the result returned by @code{avm_apply(@var{f},@var{x})}. Both parameters to this function are deallocated unconditionally and should not be referenced again by the caller. If the parameters are needed subsequently, then only copies of them should be passed to @code{avm_apply} using @code{avm_copied}. This function is not guaranteed to terminate, and may cause a memory overflow error. In the event of an exceptional condition, the error message is written to standard error and the program is halted. There is no externally visible distinction between different levels of error conditions. @end deftypefun @deftypefun list avm_recoverable_apply (list @var{operator}, list @var{operand}, int *@var{fault}) This function is similar to @code{avm_apply} but leaves the responsibility of error handling with the caller. 
If any overflow or exceptional condition occurs, the result returned is a list representing the error message, and the @code{@var{fault}} flag is set to a non-zero value. This behavior contrasts with that of @code{avm_apply}, which will display the message to standard error and kill the process. @end deftypefun @node Characters and Strings, File Manipulation, Lists, Library Reference @section Characters and Strings @cindex character strings If a C program is to interact with a virtual code application by exchanging text, it uses the representation for characters described in @ref{Character Table}. This convention would be inconvenient without a suitable API, so the functions in this section address the need. These functions are declared in the header file @file{chrcodes.h}. Some of these functions have two forms, with one of them having the word @code{standard} as part of its name. The reason is to cope with multiple character encodings. Versions of @code{avram} prior to 0.1.0 @cindex character encodings @cindex multiple character encodings used a different character encoding than the one documented in @ref{Character Table}. The functions described in @ref{Version Management} can be used to select backward compatible operation with the older character encoding. The normal forms of the functions in this section will use the older character set if a backward compatibility mode is indicated, whereas the standard forms will use the character encoding documented in @ref{Character Table} regardless. Standard encodings should always be assumed for library and function @cindex standard character encoding names associated with the @code{library} combinator (@ref{Calling existing library functions}), and for values of lists defined by @code{avm_list_of_value} (@ref{Primitive types}), but version dependent encodings should be used for all other purposes such as error messages. 
Alternatively, the normal version dependent forms of the functions below can be used safely in any case if backward
@cindex backward compatibility
compatibility is not an issue. This distinction is viewed as a transitional feature of the API that will be discontinued eventually when support for the old character set is withdrawn and the @code{standard} forms are removed.

@deftypefun list avm_character_representation (int @var{character})
@end deftypefun
@deftypefun list avm_standard_character_representation (int @var{character})
This function takes an integer character code and returns a copy of the list representing it, as per the table in @ref{Character Table}. Because the copy is shared, no memory is allocated by this function so there is no possibility of overflow. Nevertheless, it is the responsibility of the caller to dispose of the list with @code{avm_dispose} when it is no longer needed, just as if the copy were not shared (@ref{Simple Operations}). For performance reasons, this function is implemented as a macro. If the argument is outside the range of zero to 255, it is masked into that range.
@end deftypefun

@deftypefun int avm_character_code (list @var{operand})
@end deftypefun
@deftypefun int avm_standard_character_code (list @var{operand})
This function takes a list as an argument and returns the corresponding character code, as per @ref{Character Table}. If the argument does not represent any character, a value of @code{-1} is returned.
@end deftypefun

@deftypefun list avm_strung (char *@var{string})
@end deftypefun
@deftypefun list avm_standard_strung (char *@var{string})
This function takes a pointer to a null terminated character string and returns the list obtained by translating each character into its list representation and enqueuing them together. Memory needs to be allocated for the result, and if there isn't enough available, an error message is written to standard error and the process is terminated.
This function is useful to initialize lists from hard coded strings at the beginning of a run, as in this example.

@example
hello_string = avm_strung("hello");
@end example

This form initializes a single string, but to initialize a one line message suitable for writing to a file, it would have to be a list of strings, as in this example.

@example
hello_message = avm_join(avm_strung("hello"),NULL);
@end example

The latter form is used internally by the library for initializing most of the various error messages that can be returned by other functions.
@end deftypefun

@deftypefun list avm_recoverable_strung (char *@var{string}, int *@var{fault});
@end deftypefun
@deftypefun list avm_recoverable_standard_strung (char *@var{string}, int *@var{fault});
This function is like @code{avm_strung} except that if it runs out of memory it sets the integer referenced by @var{fault} to a non-zero value and returns instead of terminating the process.
@end deftypefun

@deftypefun char *avm_unstrung (list @var{string}, list *@var{message}, int *@var{fault})
@end deftypefun
@deftypefun char *avm_standard_unstrung (list @var{string}, list *@var{message}, int *@var{fault})
This function performs an inverse operation to @code{avm_recoverable_strung}, taking a list representing a character string to the character string in ASCII null terminated form as per the standard C representation. Memory for the result is allocated by this function and should be freed by the caller. In the event of an exception, the integer referenced by @code{fault} is assigned a non-zero value and an error message represented as a list is assigned to the list referenced by @code{message}. The error message should be reclaimed by the caller with @code{avm_dispose} (@ref{Simple Operations}) if it is non-empty. Possible error messages are @code{<'memory overflow'>}, @code{<'counter overflow'>}, and @code{<'invalid text format'>}.
@end deftypefun @deftypefun list avm_scanned_list (char *@var{string}) An application that makes use of virtual code snippets or data that are known at compile time can use this function to initialize them. The argument is a string in the format described in @ref{Concrete Syntax}, and the result is the list representing it. For example, the program discussed in @ref{Example Script} could be hard coded into a C program by pasting the data from its virtual code file into an expression of this form. @example cat_program = avm_scanned_list("sKYQNTP\\"); @end example Note that the backslash character in the original data has to be preceded by an extra backslash in the C source, because backslashes usually mean something in C character constants. The @code{avm_scanned_list} function needs to allocate memory. If there isn't enough memory available, it writes a message to standard error and causes the process to exit. @end deftypefun @deftypefun list avm_multiscanned (char **@var{strings}) Sometimes it may be useful to initialize very large lists from strings, but some C compilers impose limitations on the maximum length of a string constant, and the ISO standard for C requires only 512 bytes. This function serves a similar purpose to @code{avm_scanned_list}, but allows the argument to be a pointer to a null terminated array of strings instead of one long string, thereby circumventing this limitation in the compiler. @example char *code[] = @{"sKYQ","NTP\\",NULL@}; ... cat_program = avm_multiscanned(code); @end example If there is insufficient memory to allocate the list this function needs to create, it causes an error message to be written to standard error, and then kills the process. @end deftypefun @deftypefun char* avm_prompt (list @var{prompt_strings}) This function takes a list representing a list of character strings, and returns its translation to a character string with the sequence 13 10 used as a separator. 
For example, given a tree of this form

@example
some_message = avm_join(
   avm_strung("hay"),
   avm_join(
      avm_strung("you"),
      NULL));
@end example

the result returned by @code{avm_prompt(some_message)} would be a pointer to a null terminated character string equivalent to the C constant @code{"hay\r\nyou"}, that is, with the character codes 13 and 10 between the two words. Error messages are printed and the process terminated in the event of either a memory overflow or an invalid character representation.

This function is used by @code{avram} in the evaluation of interactive
@cindex interactive applications
virtual code applications, whose output has to be compared to the output from a shell command in this format. The separator is chosen to be compatible with the @code{expect} library.
@end deftypefun

@deftypefun char* avm_recoverable_prompt (list @var{prompt_strings}, list *@var{message}, int *@var{fault})
This function performs the same operation as @code{avm_prompt} but allows the caller to handle exceptional conditions. If an exception such as a memory overflow occurs, the integer referenced by @code{fault} is assigned a non-zero value and a representation of the error message as a list of strings is assigned to the list referenced by @code{message}. This function is used by @code{avram} to evaluate the @code{interact} combinator (@ref{Interaction combinator}), when terminating in the event of an error would be inappropriate.
@end deftypefun

@deftypefun void avm_initialize_chrcodes ()
This function has to be called before any of the other character conversion functions in this section, or else their results are undefined. It performs the initialization of various internal data structures.
@end deftypefun

@deftypefun void avm_count_chrcodes ()
This function can be called at the end of a run, after the last call to any of the other functions in this section, but before @code{avm_count_lists} if that function is also being used. The purpose of this function is to detect and report memory leaks.
If any memory associated with any of these functions has not been reclaimed by the client program, a message giving the number of unreclaimed lists will be written to standard error. @end deftypefun @node File Manipulation, Invocation, Characters and Strings, Library Reference @section File Manipulation The functions described in this section provide an interface between virtual code applications and the host file system by converting between files or file names and their representations as lists. These conversions are necessary when passing a file to a virtual code application, or when writing a file received in the result of one. @menu * File Names:: * Raw Files:: * Formatted Input:: * Formatted Output:: @end menu @node File Names, Raw Files, File Manipulation, File Manipulation @subsection File Names A standard representation is used by virtual code applications for the @cindex file names path names of files, following the description in @ref{Input Data Structure}. The functions and constants declared in @code{fnames.h} provide an API for operating on file names in this form. @deftypefun list avm_path_representation (char *@var{path}) If a C program is to invoke a virtual code application and pass a path name to it as a parameter, this function can be used to generate the appropriate representation from a given character string. @example conf_path = avm_path_representation("/etc/resolve.conf"); @end example In this example, @code{conf_path} is a @code{list}. For potentially better portability, a C program can use the character constant @code{avm_path_separator_character} in place of the slashes in hard coded path names. 
Other useful constants are @code{avm_current_directory_prefix} as a @cindex @code{avm_path_separator_character} @cindex @code{avm_path_separator} @cindex @code{avm_current_directory_prefix} @cindex @code{avm_parent_directory_prefix} @cindex @code{avm_root_directory_prefix} portable replacement for @code{"./"}, as well as @code{avm_parent_directory_prefix} instead of @code{"../"}. There is also @code{avm_root_directory_prefix} for @code{"/"}. These three constants are null terminated strings, unlike @code{avm_path_separator_character}, which is a character. If a @code{NULL} pointer is passed as the @code{@var{path}}, a @code{NULL} list is returned, which is the path representation for standard input or standard output. If the address of an empty string is passed to this function as the @code{@var{path}}, the list of the empty string will be returned, which is the path representation for the root directory. Trailing path separators are ignored, so @code{"/"} is the same as the empty string. Some memory needs to be allocated for the result of this function. If the memory is not available, an error message is written to standard error and the process is terminated. @end deftypefun @deftypefun list avm_date_representation (char *@var{path}) This function is essentially a wrapper around the standard @code{ctime_r} function that not only gets the time stamp for a file at a given path, but transforms it to a list representation according to @ref{Character Table}. It needs to allocate memory for the result and will cause the program to exit with an error message if there is not enough memory available. The time stamp will usually be in a format like @code{Sun Mar 4 10:56:40 GMT 2001}. If for some reason the time stamp can not be obtained, the @cindex @code{unknown date} result will be a representation of the string @code{unknown date}. 
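The general idea of wrapping @code{ctime_r} with a fallback string can be sketched as below. This is an illustration only: it assumes a POSIX host, assumes that the time stamp of interest is the modification time (which the text above does not specify), and stops at an ordinary C string rather than building the list representation. The name @code{sketch_date_string} is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <time.h>

/* Writes a time stamp for the file at path into buffer, or the text
   "unknown date" if it cannot be obtained.  Returns buffer. */
char *sketch_date_string(const char *path, char *buffer, size_t size)
{
    struct stat status;
    char stamp[26];                    /* ctime_r needs at least 26 bytes */

    if (path != NULL && stat(path, &status) == 0
        && ctime_r(&status.st_mtime, stamp) != NULL) {
        stamp[strcspn(stamp, "\n")] = '\0';   /* drop the trailing newline */
        snprintf(buffer, size, "%s", stamp);
    } else {
        snprintf(buffer, size, "unknown date");
    }
    return buffer;
}
```

A caller would pass a buffer of at least 26 bytes. The resulting string could then be converted to a list with a function such as @code{avm_strung}.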
@end deftypefun @deftypefun char* avm_path_name (list @var{path}) This function is the inverse of @code{avm_path_representation}, in that it takes a list representing a path to the path name expressed as a character string. This function can be used in C programs that invoke virtual code applications returning paths as part of their results, so that the C program can get the path into a character string in order to open the file. If the @code{@var{path}} parameter is @code{NULL}, a @code{NULL} pointer is returned as the result. The calling program should check for a @cindex standard input @cindex standard output @code{NULL} result and interpret it as the path to standard input or standard output. The memory needed for the character string whose address is returned is allocated by this function if possible. The given @code{@var{path}} is not required to be consistent with the host file system, but it is required to consist of representations of non-null printable characters or spaces as lists per @ref{Character Table}. In the event of any error or overflow, control does not return to the caller, but an error message is printed and the program is aborted. The possible error messages from this function are the following. 
@cindex @code{counter overflow}
@cindex @code{memory overflow}
@cindex @code{null character in file name}
@cindex @code{bad character in file name}
@cindex @code{invalid file name}
@itemize @bullet
@item
@code{@var{program-name}: counter overflow (code @var{nn})}
@item
@code{@var{program-name}: memory overflow (code @var{nn})}
@item
@code{@var{program-name}: null character in file name}
@item
@code{@var{program-name}: bad character in file name}
@item
@code{@var{program-name}: invalid file name (code @var{nn})}
@end itemize
@end deftypefun

@deftypefun void avm_initialize_fnames ()
A few housekeeping operations relevant to internal data structures are performed by this function, so the client program must call it before using any of the other functions in this section.
@end deftypefun

@deftypefun void avm_count_fnames ()
This function can be used after the last call to any of the other functions in this section during a run, and it will detect memory leaks that may be attributable to code in these functions or misuse thereof. If any unreclaimed storage remains when this function is called, a warning message will be written to standard error. If the function @code{avm_count_lists} is also being used by the client, it should be called after this one.
@end deftypefun

@node Raw Files, Formatted Input, File Names, File Manipulation
@subsection Raw Files

Some low level operations involving lists and data files are provided by these functions, which are declared in the header file @file{rawio.h}.

@deftypefun list avm_received_list (FILE *@var{object}, char *@var{filename})
This function is a convenient way of transferring data directly from a raw format file into a list in memory. It might typically be used to load the virtual code for an application that has been written to a file by a compiler.

@table @code
@item @var{object}
is the address of a file which should already be open for reading before this function is called, and will be read from its current position.
@item @var{filename} should be set by the caller to the address of a null terminated string containing the name of the file, but is not used unless it needs to be printed as part of an error message. If it is a null pointer, standard input is assumed. @end table The result returned is a list containing data read from the file. The file format is described in @ref{File Format}. The preamble section of the file, if any, is ignored. If the file ends prematurely or otherwise conflicts with the format, the program is aborted with a message of @cindex @code{invalid raw file format} @display @code{@var{program-name}: invalid raw file format in @var{filename}} @end display written to standard error. The program will also be aborted by this function in the event of a memory overflow. The file is left open when this function returns, and could therefore be used to store other data after the end of the list. The end of a list is detected automatically by this function, and it reads no further, leaving the file position on the next character, if any. @end deftypefun @deftypefun void avm_send_list (FILE *@var{repository}, list @var{operand}, char *@var{filename}) This function can be used to transfer data from a list in memory to a file, essentially by implementing the printing algorithm described in @ref{Bit String Encoding}. @table @code @item @var{repository} is the address of a file already open for writing, to which the data are written starting from the current position. @item @var{operand} is the list containing the data to be written @item @var{filename} is the address of a null terminated string containing the name of the file that will be reported in an error message if necessary. @end table No preamble section is written by this function, but one could be @cindex preamble written to the file by the caller prior to calling it. Error messages are possible either because of i/o errors or because of insufficient memory. 
I/o errors are not fatal and will result only in a warning message being printed to standard error, but a memory overflow will cause the process to abort. An i/o error message reported by this function would be of the form @cindex @code{can't write} @display @code{@var{program-name}: can't write to @var{filename}} @end display followed by the diagnostic obtained from the standard @code{strerror} @cindex @code{strerror} function if it exists on the host platform. The file is left open when this function returns. @end deftypefun @deftypefun void avm_initialize_rawio () This function initializes some necessary data structures for the functions in this section, and should be called prior to them at the beginning of a run. @end deftypefun @deftypefun void avm_count_rawio () This function does nothing in the present version of the library, but should be called after the last call to all of the other functions in this section in order to maintain compatibility with future versions of the library. Future versions may decide to use this function to do some cleaning up of local data structures. @end deftypefun @node Formatted Input, Formatted Output, Raw Files, File Manipulation @subsection Formatted Input Some functions relating to the input of text files or data files with preambles are declared in the header file @file{formin.h}. The usage of these functions is as follows. @deftypefun list avm_preamble_and_contents (FILE *@var{source}, char *@var{filename}) This function loads a file of either text or data format into memory. @table @code @item @var{source} should be initialized by the caller as the address of a file already open for reading that will be read from its current position. @item @var{filename} should be set by the caller to the address of a null terminated character string giving the name of the file that will be used if an i/o error message needs to be written about it. If it is a @code{NULL} pointer, standard input is assumed. 
@end table The result returned by the function will be a list whose @code{head} @cindex preamble represents the preamble of the file and whose @code{tail} represents the contents. As a side effect, the input file will be closed, unless the @code{@var{filename}} parameter is @code{NULL}. If the file conforms to the format described in @ref{File Format}, the preamble is a list of character strings. In the result returned by the function, the @code{head} field will be a list with one item for each line in the file, and each item will be a list of character representations as in @ref{Character Table}, but with the leading hashes stripped. The @code{tail} will be the list specified by remainder of the file according to @ref{Concrete Syntax}. If the file has an empty preamble but is nevertheless a data file, the @code{head} will be a list whose @code{head} and @code{tail} are both @code{NULL}. If the file does not conform to the format in @ref{File Format}, then the @code{head} of the result will be @code{NULL}, and the @code{tail} will be a list of lists of character representations, with one for each line. Whether or not the file conforms to the format is determined on the fly, so this function is useful for situations in which the format is not known in advance. The conventions regarding the preamble and contents maintained by this function are the same as those used by virtual code applications as described in @ref{Standard Output Representation} and @ref{Input Data Structure}. The characters used for line breaks are not explicitly represented in @cindex line breaks the result. Depending on the host system, line breaks in text files may be represented either by the character code 10, or by the sequence 13 10. However, in order for the library to deal with binary files in a portable way, a line break always corresponds to a 10 as far as this function is concerned regardless of the host, and a 13 is treated like any other character. 
Hence, if this function were used on binary files that happened to have some 10s in them, the exact contents of a file could be reconstructed easily by appending a 10 to all but the last line and flattening the list. A considerable amount of memory may need to be allocated by this function in order to store the file as a list. If not enough memory is available, the function prints an error message to standard error and aborts rather than returning to the caller. However, i/o errors are not fatal, and will cause the function to print a warning but attempt to continue. @end deftypefun @deftypefun list avm_load (FILE *@var{source}, char *@var{filename}, int @var{raw}) Similarly to @code{avm_preamble_and_contents}, this function also loads a file into memory, but the format is specified in advance. @table @code @item @var{source} should be set by the caller to the address of an already open file for reading, which will be read from its current position. @item @var{filename} should be initialized by the caller as a pointer to a null terminated string containing the name of the file that will be reported to the user in the event of an error reading from it. If it is a @code{NULL} pointer, standard input is assumed. @item @var{raw} is set to a non-zero value by the caller to indicate that the file is expected to conform to the format in @ref{File Format}. If the file is an ordinary text file, then it should be set to zero. @end table In the case of a data file, which is when @code{@var{raw}} is non-zero, the result returned by this function will be a list representing the data section of the file and ignoring the preamble. In the case of a text file, the result will be a list of lists of character representations as per @ref{Character Table}, with one such list for each line in the file. Similar comments about line breaks to those mentioned under @code{avm_preamble_and_contents} are applicable. 
As a side effect of this function, the @code{@var{source}} file will be closed, unless the @code{@var{filename}} is a @code{NULL} pointer. This function is useful when the type of file is known in advance. If a data file is indicated by the @code{@var{raw}} parameter but the format is incorrect, an error message is reported and the process terminates. The error message will be of the form @display @code{@var{program-name}: invalid raw file format in @var{filename}} @end display Alternatively, if a text file is indicated by the @cindex @code{invalid raw file format} @code{@var{raw}} parameter, then no attempt is made to test whether it could be interpreted as data, even if it could be. This behavior differs from that of @code{avm_preamble_and_contents}, where a bad data file format causes the file to be treated as text, and a valid data file format, even in a ``text'' file, causes it to be treated as data. Memory requirements for this function are significant and will cause the process to abort with an error message in the event of insufficient free memory. Messages pertaining to i/o errors are also possible and are not fatal. @end deftypefun @deftypefun void avm_initialize_formin () This function should be called before either of the other functions in this section is called, as it initializes some necessary static data structures. Results of the other functions are undefined if this one is not called first. @end deftypefun @deftypefun void avm_count_formin () This function should be called after the last call to any of the other functions in this section, as it is necessary for cleaning up and reclaiming some internal data. If any storage remains unreclaimed due to memory leaks in these functions or to misuse of them, a warning message is written to standard error. If the function @code{avm_count_lists} is also being used by the client program, it should be called after this one. 
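The calling sequence for the functions in this section might be
sketched as follows. The include paths, the file name
@file{notes.txt}, and the use of @code{avm_initialize_lists} and
@code{avm_dispose} from the list management functions are assumptions
made for the sake of illustration rather than requirements stated in
this section.

@example
#include <stdio.h>
#include <avm/lists.h>   /* assumed install path */
#include <avm/formin.h>  /* assumed install path */

int
main (void)
@{
  FILE *source;
  list lines;

  avm_initialize_lists ();
  avm_initialize_formin ();
  source = fopen ("notes.txt", "rb");
  if (source != NULL)
    @{
      /* raw = 0 requests one list of character representations per line */
      lines = avm_load (source, "notes.txt", 0);
      /* the file name was not NULL, so avm_load has closed the file */
      avm_dispose (lines);
    @}
  avm_count_formin ();   /* before avm_count_lists, as noted above */
  avm_count_lists ();
  return 0;
@}
@end example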
@end deftypefun @node Formatted Output, , Formatted Input, File Manipulation @subsection Formatted Output The following functions pertaining to the output of text files or data files with @cindex preamble preambles are declared in the header file @file{formout.h}. @deftypefun void avm_output (FILE *@var{repository}, char *@var{filename}, list @var{preamble}, list @var{contents}, int @var{trace_mode}) This function writes either a text file or a data file in the format described in @ref{File Format}. The parameters have these interpretations. @table @code @item @var{repository} is the address of a file opened for writing by the caller, that will be written from its current position. @item @var{filename} is the address of a null terminated character string set by the caller to be the name of the file that will be reported to the user in the event of an i/o error. @item @var{preamble} is @code{NULL} in the case of a text file, but a list of character string representations as per @ref{Character Table}, in the case of a data file. If a data file is to be written with an empty preamble, then this list should have a @code{NULL} @code{head} and a @code{NULL} @code{tail}. @item @var{contents} is either a list of character string representations in the case of a text file, or is an unconstrained list in the case of a data file. @item @var{trace_mode} may be set to a non-zero value by the caller to request that everything written to a text file should be echoed to standard output. It is ignored in the case of a data file. @end table The effect of calling this function is to write the preamble and contents to the file in the format indicated by the preamble. The file is left open when this function returns. Line breaks are always written as character code 10, not as 13 10, @cindex line breaks regardless of the convention on the host system, so that files written by this function can be reliably read by other functions in the library.
Leading hashes are automatically added to the beginning of the lines in the preamble, except where they are unnecessary due to a continuation character on the previous line. This action enforces consistency with the file format, ensuring that anything written as a data file can be read back as one. The hashes are stripped automatically when the file is read by @code{avm_preamble_and_contents}. Another feature of this function is that it will mark any output file as executable if it is a data format file with a preamble whose first @cindex executable files character in the first line is an exclamation point. This feature makes it easier for a compiler implemented in virtual code to generate executable shell scripts directly. A fatal error is reported if any of the data required to be a character representation is not listed in the @ref{Character Table}. A fatal error can also be caused by a memory overflow. Possible error messages are the following. @cindex @code{invalid output preamble format} @cindex @code{invalid text format} @cindex @code{can't write} @itemize @bullet @item @code{@var{program-name}: invalid output preamble format} @item @code{@var{program-name}: invalid text format} @item @code{@var{program-name}: can't write to @var{filename}} @end itemize @cindex @code{strerror} In the last case, the error message will be followed by an explanation furnished by the standard @code{strerror} function if available. @end deftypefun @deftypefun void avm_output_as_directed (list @var{data}, int @var{ask_to_overwrite_mode}, int @var{verbose_mode}) This function writes an ensemble of files at specified paths in either text or data format, optionally interacting with the user through standard input and output. The parameters have these interpretations. @table @code @item @var{data} is a list in which each item specifies a file to be written.
@item @var{ask_to_overwrite_mode} may be set to a non-zero value by the calling program in order to have this function ask the user for permission to overwrite existing files. @item @var{verbose_mode} may be set to a non-zero value by the calling program to have this function print to standard output a list of the names of the files it writes. @end table A high level interface between virtual code applications and the file system is provided by this function. The @code{@var{data}} parameter format is compatible with the data structure returned by an application complying with the conventions in @ref{Output From Non-interactive Applications}. Each item in the @code{@var{data}} list should be a non-empty list whose @code{head} and @code{tail} are also non-empty. The fields in each item have the following relevance to the file it specifies. @itemize @bullet @item The @code{head} of the @code{head} is @code{NULL} if the file is to be opened for appending, and non-@code{NULL} if it is to be overwritten. @item The @code{tail} of the @code{head} represents a path as a list of character string representations, in a form suitable as an argument to @code{avm_path_name}. @item The @code{head} of the @code{tail} represents the preamble of the file, as either @code{NULL} for a text file or a non-empty list of character string representations for a data file. @item The @code{tail} of the @code{tail} represents the contents of the file, either as a list of character string representations for a text file or as a list in an unconstrained format for a data file. @end itemize For each item in the list, the function performs the following steps. @enumerate @item It decides whether to open a file for overwriting or appending based on the @code{head} of the @code{head}. @item It uses the @code{tail} of the @code{head} to find out the file name from @code{avm_path_name}, in order to open it.
@item If the @code{@var{ask_to_overwrite_mode}} flag is set and the file is found to exist already, the function will print one of the following messages to standard output, depending on whether the file is to be overwritten or appended. @itemize @bullet @item @code{@var{program-name}: overwrite @var{filename}? (y/n)} @item @code{@var{program-name}: append to @var{filename}? (y/n)} @end itemize It will then insist on either @kbd{y} or @kbd{n} as an answer before continuing. @item If the @code{@var{ask_to_overwrite_mode}} flag has not been set, or the file did not previously exist, or the answer of @kbd{y} was given, the preamble and contents of the file are then written with @code{avm_output}. @item If permission to write or append was denied, one of the following messages is reported to standard output, and the data that were to be written are lost. @cindex @code{not writing} file name @cindex @code{writing} file name @itemize @bullet @item @code{@var{program-name}: not writing @var{filename}} @item @code{@var{program-name}: not appending @var{filename}} @end itemize @item If permission was granted to write or append to the file or the @code{@var{verbose_mode}} flag is set, one of the messages @itemize @bullet @item @code{@var{program-name}: writing @var{filename}} @item @code{@var{program-name}: appending @var{filename}} @end itemize is sent to standard output. @end enumerate @cindex standard output If any files are to be written to standard output, which would be indicated by a @code{NULL} path, they are not written until all other files in the list are written. This feature is in the interest of @cindex security security, as it makes it more difficult for malicious or inept virtual code to alter the appearance of the console through standard output until after the interactive dialogue has taken place. Permission is not solicited for writing to standard output, and it will not be closed.
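As a sketch of the layout described above, a single item of the
@code{@var{data}} list could be assembled from its four fields with
@code{avm_join}. The helper name @code{file_specification} and the
include path are hypothetical, and the sketch assumes that
@code{avm_join} returns a list whose @code{head} and @code{tail} are
its two arguments, which is how the function appears to be used
elsewhere in this manual.

@example
#include <avm/lists.h>   /* assumed install path */

/* hypothetical helper, not part of the library */
list
file_specification (list overwrite, list path, list preamble, list contents)
@{
  /* head of the item: (overwrite flag, path) */
  /* tail of the item: (preamble, contents)   */
  return avm_join (avm_join (overwrite, path),
                   avm_join (preamble, contents));
@}
@end example

Under these assumptions, a @code{NULL} value for @code{overwrite}
would request appending, a @code{NULL} value for @code{path} would
select standard output, and a @code{NULL} value for @code{preamble}
would indicate a text file.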
Any of the fatal errors or i/o errors possible with @code{avm_output} or @code{avm_path_name} are also possible with this function, as well as the following additional ones. @cindex @code{invalid file specification} @cindex @code{can't close} @cindex @code{can't write} @itemize @bullet @item @code{@var{program-name}: invalid file specification} @item @code{@var{program-name}: can't close @var{filename}} @item @code{@var{program-name}: can't write @var{filename}} @end itemize The last two are non-fatal i/o errors that will be accompanied by an @cindex @code{strerror} explanation from the @code{strerror} function if the host supports it. The other message is the result of a badly formatted @code{@var{data}} parameter. @end deftypefun @deftypefun void avm_put_bytes (list @var{bytes}) This function takes a list of character representations, converts them to characters, and sends them to standard output. There is no chance of a memory overflow, but the following other errors are possible. @cindex @code{invalid text format} @cindex @code{can't write} @itemize @bullet @item @code{@var{program-name}: invalid text format (code @var{nn})} @item @code{@var{program-name}: can't write to standard output} @end itemize The latter is non-fatal, but the former causes the program to abort. It is caused when any member of the list @code{@var{bytes}} is not a character representation appearing in @ref{Character Table}. @end deftypefun @deftypefun void avm_initialize_formout () This function initializes some data structures used locally by the other functions in this section, and should be called at the beginning of a run before any of them is called. @end deftypefun @deftypefun void avm_count_formout () This function doesn't do anything in the current version of the library, but should be called after the last call to any of the other functions in this section. 
Future versions of the library might use this function for cleaning up some internal data structures, and client programs that call it will maintain compatibility with them. @end deftypefun @node Invocation, Version Management, File Manipulation, Library Reference @section Invocation The functions documented in this section can be used to incorporate the capabilities of a virtual machine emulator into other C programs with a minimal concern for the details of the required data structures and virtual code invocation conventions. @menu * Command Line Parsing:: * Execution Modes:: @end menu @node Command Line Parsing, Execution Modes, Invocation, Invocation @subsection Command Line Parsing @cindex command line A couple of functions declared in @file{cmdline.h} can be used to do all the necessary parsing of command lines and environment variables needed by virtual code applications. @deftypefun list avm_default_command_line (int @var{argc}, char *@var{argv}[], int @var{index}, char *@var{extension}, char *@var{paths}, int @var{default_to_stdin_mode}, int @var{force_text_input_mode}, int *@var{file_ordinal}) The purpose of this function is to build most of the data structure used by parameter mode applications, as described in @ref{Input Data Structure}, by parsing the command line according to @ref{Command Line Syntax}. The parameters have these interpretations. @table @code @item @var{argc} is the number of elements in the array referenced by @code{@var{argv}} @item @var{argv} is the address of an array of pointers to null terminated character strings holding command line arguments @item @var{index} is the position of the first element of @code{@var{argv}} to be considered. Those preceding it are ignored.
@item @var{extension} is the address of a string that will be appended to input file names given in @code{@var{argv}} in an effort to find the associated files @item @var{paths} is the address of a null terminated character string containing a colon separated list of directory names that will be searched for input files @item @var{default_to_stdin_mode} is set to a non-zero value by the caller if the contents of standard input should be read in the absence of input files @item @var{force_text_input_mode} is set to a non-zero value by the caller to indicate that input files should be read as text, using @code{avm_load} (rather than @code{avm_preamble_and_contents}, which would allow them to be either text or data). The @code{@var{preamble}} field of the returned file specifications will always be empty when this flag is set. @item @var{file_ordinal} is set to a pointer to an integer by the caller if only one file is to be loaded during each call. The value of the integer indicates which one it will be. @end table The result returned by this function is a list whose @code{head} is a list of file specifications and whose @code{tail} is a list of command line options intended for input to a virtual code application. The list of file specifications returned in the @code{head} of the result follows the same conventions as the @code{@var{data}} parameter to the function @code{avm_output_as_directed}, except that the @code{head} of the @code{head} of each item is a list representing the time stamp of the file as given by @code{avm_date_representation}. If the file is standard input, then it holds the current system date and time. If the @code{@var{file_ordinal}} parameter is @code{NULL}, then all files on the command line are loaded, but if it points to an integer @var{n}, then only the @var{n}th file is loaded, and @var{n} is incremented. If there is no @var{n}th file, a @code{NULL} value is returned as the entire result of the function.
For a series of calls, the integer should be initialized to zero by the caller before the first call. If standard input is indicated as one of the files on the command line (by a dash), then it is also loaded regardless of the @code{@var{file_ordinal}}, but a cached copy of it is used on subsequent calls after the first, so that the function does not actually attempt to reread it. If standard input is to be loaded, it must be finite for this function to work properly. The search strategy for files is described in @ref{Environment}, and makes use of the @code{@var{extension}} and @code{@var{paths}} parameters. In the list of command line options returned in the @code{tail} of the result, each item is a list with a non-empty @code{head} and @code{tail}, and is interpreted as follows. @itemize @bullet @item The @code{head} of the @code{head} is a list representing a natural number, as given by @code{avm_natural}, indicating the position of the option on the command line relative to the initial value of the @code{@var{index}} parameter. @item The @code{tail} of the @code{head} is a list which is @code{NULL} in the case of a ``short form'' option, written with a single dash on the command line, but is a list whose @code{head} and @code{tail} are @code{NULL} in the case of a ``long form'' option, written with two dashes. @item The @code{head} of the @code{tail} is a list representing a character string for the keyword of an option, for example @kbd{foo} in the case of an option written @kbd{--foo=bar,baz}. @item The @code{tail} of the @code{tail} is a list of lists representing character strings, with one item for each parameter associated with the option, for example, @kbd{bar} and @kbd{baz}. 
@end itemize If multiple calls to the function are made with differing values of @code{*@var{file_ordinal}} but other parameters unchanged, the same list of options will be returned each time, except insofar as the position numbers in the @code{head} of the @code{head} of each item are adjusted as explained in @ref{Input for Mapped Applications}. Any of the i/o errors or fatal errors associated with other file input operations are possible with this function as well. This non-fatal warning message is also possible. @cindex @code{search paths not supported} @display @code{@var{program-name}: warning: search paths not supported} @end display This error occurs if the library has been built on a platform that @cindex @file{argz.h} doesn't have the @file{argz.h} header file and the @code{@var{paths}} parameter is non-@code{NULL}. @end deftypefun @deftypefun list avm_environment (char *@var{env}[]) @cindex environment This function takes the address of a null terminated array of pointers to null terminated character strings of the form @code{"variable=value"}. The result returned is a list of lists, with one item for each element of the array. The @code{head} of each item is a representation of the left side of the corresponding string, and the @code{tail} is a representation of the right. This function is therefore useful along with @code{avm_default_command_line} for building the remainder of the data structure described in @ref{Parameter Mode Interface}. For example, a virtual machine emulator for non-interactive parameter mode applications with no bells and whistles could have the following form. 
@example int main(argc,argv,env) @dots{} @{ FILE *virtual_code_file; @dots{} avm_initialize_lists(); avm_initialize_apply(); avm_initialize_rawio(); avm_initialize_formout(); avm_initialize_cmdline(); virtual_code_file = fopen(argv[1],"rb"); operator = avm_received_list( virtual_code_file,argv[1]); fclose(virtual_code_file); command = avm_default_command_line(argc, argv,2,NULL,NULL,0,0,NULL); environs = avm_environment(env); operand = avm_join(command,environs); result = avm_apply(operator,operand); avm_output_as_directed(result,0,0); avm_dispose(result); @dots{} @} @end example The @code{avm_environment} function could cause the program to abort due to a memory overflow. For security reasons, it will also abort with an @cindex security error message if any non-printing characters are detected in its argument. (See @ref{Other Diagnostics and Warnings}.) @end deftypefun @deftypefun void avm_initialize_cmdline () This function initializes some local variables and should be called before any of the other functions in this section is called, or else their results are unpredictable. @end deftypefun @deftypefun void avm_count_cmdline () This function should be called after the last call to any of the other functions in this section, as it reclaims some locally allocated storage. If the @code{avm_count_lists} function is used, it should be called after this one. @end deftypefun @node Execution Modes, , Command Line Parsing, Invocation @subsection Execution Modes Some functions declared in @file{exmodes.h} are useful for executing interactive applications or filter mode transducers in a manner consistent with the specifications described in the previous chapter. @deftypefun void avm_interact (list @var{avm_interactor}, int @var{step_mode}, int @var{ask_to_overwrite_mode}, int @var{quiet_mode}) This function executes an interactive virtual code application. The parameters have these interpretations. 
@table @code @item @var{avm_interactor} is the virtual code for a function that performs as specified in @ref{Output From Interactive Applications}. @item @var{step_mode} will cause all shell commands to be echoed if set to a non-zero value, and will cause the program to pause after each shell command until a key is pressed. @item @var{ask_to_overwrite_mode} can be set to a non-zero value by the caller to cause the program to ask permission of the user to overwrite any existing files in cases where the virtual code returns a file list as described in @ref{Mixed Modes of Interaction}. @item @var{quiet_mode} can be set to a non-zero value to suppress console messages in the case of file output per @ref{Mixed Modes of Interaction}. @end table The meaning of this function is accessible to any reader willing to slog through @ref{Output From Interactive Applications}. The only subtle point is that the @code{@var{avm_interactor}} parameter in this function does not correspond to the virtual code application that @code{avram} reads from a virtual code file, but to the result computed when the application read from the file is applied to the data structure representing the command line and environment. Any of the memory overflows or i/o errors possible with other functions in the library are possible from this one as well, and will also cause it to print an error message and halt the program. A badly designed @cindex deadlock virtual code application could cause a deadlock, which will not be detected or reported. @end deftypefun @deftypefun void avm_trace_interaction () This function enables diagnostic output for the @code{avm_recoverable_interact} function. @end deftypefun @deftypefun void avm_disable_interaction () This function causes @code{avm_interact} and @code{avm_recoverable_interact} to terminate with an error instead of executing, as required by the @code{--jail} command line option.
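The relationship between the virtual code read from a file and the
interactor passed to @code{avm_interact} might be sketched as follows.
The function name @code{run_interactive}, the @code{jail_requested}
flag, and the include paths are hypothetical. The sketch assumes that
the caller has already loaded @code{operator} with
@code{avm_received_list} and built @code{operand} from the command
line and environment, and it leaves open whether @code{avm_interact}
takes ownership of its argument.

@example
#include <avm/lists.h>    /* assumed install path */
#include <avm/apply.h>    /* assumed install path */
#include <avm/exmodes.h>  /* assumed install path */

/* hypothetical client routine */
void
run_interactive (list operator, list operand, int jail_requested)
@{
  list interactor;

  if (jail_requested)
    avm_disable_interaction ();   /* avm_interact will then fail */
  /* the interactor is the result of the application, not the */
  /* virtual code that was read from the file */
  interactor = avm_apply (operator, operand);
  avm_interact (interactor, 0, 0, 0);
@}
@end example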
@end deftypefun @deftypefun list avm_recoverable_interact (list @var{interactor}, int @var{*fault}) This function is similar to @code{avm_interact} but always closes the pipe and performs no file i/o, and will return an error message rather than exiting. Otherwise it returns a transcript of the interaction as a list of lists of strings represented as lists of character encodings. It implements the @var{interact} combinator with the virtual code for the transducer function given as the parameter. A prior call to @code{avm_trace_interaction} will cause diagnostic information to be written to standard output when this function is executed. @end deftypefun @deftypefun void avm_byte_transduce (list @var{operator}) This function executes a filter mode byte transducer application, which behaves as described in @ref{Byte Transducers}. The argument is the virtual code for the application, which would be found in a virtual code file. There are limited opportunities for i/o errors, as only standard input and standard output are involved with this function, but fatal errors due to memory overflow are possible. @end deftypefun @deftypefun void avm_line_map (list @var{operator}) This function executes line mapped filter mode applications, which are explained in @ref{Line Maps}. The argument is the virtual code for the application. Similar comments to those above apply. @end deftypefun @deftypefun void avm_initialize_exmodes () This function should be called before any of the other functions in this section in order to initialize some local variables. Results are undefined if this function isn't called first. @end deftypefun @deftypefun void avm_count_exmodes () This function doesn't do anything in the present version of the library, but should be called after the last call to any of the other functions in this section in order to maintain compatibility with future versions, which may use it for cleaning up local variables.
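By way of illustration, a minimal emulator for filter mode byte
transducers might take the following form. The include paths and the
handling of a missing argument are assumptions, and the sketch does
not dispose of @code{operator} because this section does not say
whether @code{avm_byte_transduce} reclaims it.

@example
#include <stdio.h>
#include <avm/lists.h>    /* assumed install path */
#include <avm/apply.h>    /* assumed install path */
#include <avm/rawio.h>    /* assumed install path */
#include <avm/exmodes.h>  /* assumed install path */

int
main (int argc, char *argv[])
@{
  FILE *virtual_code_file;
  list operator;

  if (argc < 2)
    return 1;
  avm_initialize_lists ();
  avm_initialize_apply ();
  avm_initialize_rawio ();
  avm_initialize_exmodes ();
  virtual_code_file = fopen (argv[1], "rb");
  if (virtual_code_file == NULL)
    return 1;
  operator = avm_received_list (virtual_code_file, argv[1]);
  fclose (virtual_code_file);
  /* reads standard input and writes standard output */
  avm_byte_transduce (operator);
  avm_count_exmodes ();
  avm_count_rawio ();
  return 0;
@}
@end example

Replacing @code{avm_byte_transduce} with @code{avm_line_map} would
give the corresponding emulator for line mapped applications.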
@end deftypefun @node Version Management, Error Reporting, Invocation, Library Reference @section Version Management The @code{avram} library is designed to support any number of backward @cindex versions compatibility modes with itself, by way of some functions declared in @file{vman.h}. The assumption is that the library will go through a sequence of revisions during its life, each being identified by a unique number. In the event of a fork in the project, each branch will attempt to maintain compatibility at least with its own ancestors. @deftypefun void avm_set_version (char *@var{number}) This function can be used to delay the demise of a client program that uses the library but is not updated very often. The argument is a null terminated string representing a version number, such as @code{"@value{VERSION}"}. A call to this function requests that all library functions revert to their behavior as of that version in any cases where the current behavior is incompatible with it. It will also cause virtual code applications evaluated with @code{avm_apply} to detect a version number equal to the given one rather than the current one. (See @ref{Version}.) The program will exit with an internal error message if any function in the library has already interrogated the version number before this function is called, or if it is passed a null pointer. This problem can be avoided by calling it prior to any of the @code{avm_initialize} functions with a valid address. The program will exit with the message @display @code{@var{program-name}: multiple version specifications} @end display if this function is called more than once, even with the same number. If the number is not recognized as a present or past version, or is so old that it is no longer supported, the program will exit with this message. 
@display @code{avram: can't emulate version @var{number}} @end display Client programs that are built to last should allow the version number to be specified as an option by the user, and let virtual code applications that they execute take care of their own backward compatibility problems. This strategy will at least guard against changes in the virtual machine specification and other changes that do not affect the library API. @end deftypefun @deftypefun int avm_prior_to_version (char *@var{number}) This function takes the address of a null terminated string representing a version number as an argument, such as @code{"@value{VERSION}"}, and returns a non-zero value if the version currently being emulated predates it. If no call has been made to @code{avm_set_version} prior to the call to this function, the current version is assumed, and subsequent calls to @code{avm_set_version} will cause an internal error. The intended use for this function would be by a maintainer of the library introducing an enhancement that will not be backward compatible, who doesn't wish to break existing client programs and virtual code applications. For example, if a version @code{1.0} is developed at some time in the distant future, and it incorporates a previously unexpected way of doing something, code similar to the following could be used to maintain backward compatibility. @example if (avm_prior_to_version("1.0")) @{ /* do it the 0.x way */ @} else @{ /* do it the 1.0-and-later way */ @} @end example @noindent This function will cause an internal error if the parameter does not match any known past or present version number, or if it is a null pointer. @end deftypefun @deftypefun char* avm_version () This function returns the number of the version currently being emulated as the address of a null terminated string. The string whose address is returned should not be modified by the caller. 
If no call has been made to @code{avm_set_version} prior to the call to
this function, the current version is assumed, and subsequent calls to
@code{avm_set_version} will cause an internal error.
@end deftypefun

@node Error Reporting, Profiling, Version Management, Library Reference
@section Error Reporting

@cindex error messages
Most of the error reporting by other functions in the library is done by
way of the functions declared in @file{error.h}. These functions
communicate directly with the user through standard error. Client
programs should also use these functions where possible for the sake of
a uniform interface.

@deftypefun void avm_set_program_name (char *@var{argv0})
The argument to this function should be the address of a null terminated
string holding the name of the program to be reported in error messages
that begin with a program name. Typically this string will be the name
of the program as it was invoked on the command line, possibly with path
components stripped from it. An alternative would be to set it to the
name of a virtual code application being evaluated. If this function is
never called, the name @code{"avram"} is used by default. Space for a
copy of the program name is allocated by this function, and a fatal
memory overflow error is possible if there is insufficient space
available.
@end deftypefun

@deftypefun char* avm_program_name ()
This function returns a pointer to a null terminated character string
holding the program name presently in use. It will be either the name
most recently set by @code{avm_set_program_name}, or the default name
@code{"avram"} if none has been set. The string whose address is
returned should not be modified by the caller.
@end deftypefun

@deftypefun void avm_warning (char *@var{message})
This function writes the null terminated string whose address is given
to standard error, prefaced by the program name and followed by a line
break.
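
As a rough illustration of this message layout, the following
self-contained sketch composes a warning in a buffer instead of writing
it to standard error. The name @code{format_warning} and the
colon-and-space separator are assumptions made for the example (the
separator is inferred from the messages displayed elsewhere in this
manual), and the code is not taken from the library.

@example
#include <stdio.h>

/* Hypothetical helper, not part of the avram library. It composes
   "<program-name>: <message>" followed by a line break in buf and
   returns the length that the complete text requires, as snprintf
   does. */
int format_warning(char *buf, size_t size, const char *program_name,
                   const char *message)
@{
  /* assumed fallback, mirroring the documented default name */
  if (program_name == NULL)
    program_name = "avram";
  return snprintf(buf, size, "%s: %s\n", program_name, message);
@}
@end example

Under these assumptions, a call with the program name @code{"myprog"}
and the message @code{"disk nearly full"} would leave the text
@code{myprog: disk nearly full} followed by a line break in the buffer.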
@end deftypefun @deftypefun void avm_error (char *@var{message}) This function writes the null terminated string whose address is given to standard error, prefaced by the program name and followed by a line break, as @code{avm_warning}, but it then terminates the process with an exit code of 1. @end deftypefun @deftypefun void avm_fatal_io_error (char *@var{message}, char *@var{filename}, int @var{reason}) This function is useful for reporting errors caused in the course of reading or writing files. The message is written to standard error prefaced by the program name, and incorporating the name of the relevant file. The @code{@var{reason}} should be the error code obtained from the standard @code{errno} variable, which will be translated to an @cindex @code{strerror} informative message if possible by the standard @code{strerror} function and appended to the message. After the message is written, the process will terminate with an exit code of 1. @end deftypefun @deftypefun void avm_non_fatal_io_error (char *@var{message}, char *@var{filename}, int @var{reason}) This function does the same as @code{avm_fatal_io_error} except that it doesn't exit the program, and allows control to return to the caller, which should take appropriate action. @end deftypefun @deftypefun void avm_internal_error (int @var{code}) This function is used to report internal errors and halt the program. The error message is written to standard error prefaced by the program name and followed by a line break. The code should be a unique integer constant (i.e., not one that's used for any other internal error), that will be printed as part of the error message as an aid to the maintainer. This function should be used by client programs only in the event of conditions that constitute some violation of a required invariant. It indicates to the user that something has gone wrong with the program, for which a bug report would be appropriate. 
@end deftypefun @deftypefun void avm_reclamation_failure (char *@var{entity}, counter @var{count}) This function is used only by the @code{avm_count} functions to report unreclaimed storage. The @code{@var{count}} is the number of units of storage left unreclaimed, and the @code{@var{entity}} is the address of a null terminated string describing the type of unreclaimed entity, such as @code{"lists"} or @code{"branches"}. The message is written to standard error followed by a line break, but the program is not halted and control returns to the caller. @end deftypefun @node Profiling, Emulation Primitives, Error Reporting, Library Reference @section Profiling @cindex @file{profile.h} The functions declared in @file{profile.h} can be used for constructing and writing tables of run time statistics such as those mentioned in @ref{Files}, and @ref{Profile}. These functions maintain a database of structures, each recording the statistics for a particular virtual code fragment. Each structure in the database is identified by a unique key, which must be a list representing a character string. A pointer to such a structure @cindex @code{score} is declared to be of type @code{score}. For the most part, the data structure should be regarded as opaque by a client program, except for a field @code{reductions} of type @code{counter}, which may be modified arbitrarily by the client. The way these operations are used in the course of evaluating virtual code applications containing profile annotations is to add a structure to the database each time a new profiled code fragment is encountered, using the annotation as its key, and to increment the @code{reductions} @cindex annotations of the structure each time any constituent of the code gets a quantum of work done on it. Other ways of using these operations are left to the developer's discretion. 
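
The pattern described above, in which an entry is created the first
time a key is seen and its @code{reductions} are incremented
thereafter, might be pictured with the following self-contained sketch.
It is only a model under stated assumptions: the keys are ordinary C
strings rather than lists, the database is a simple linked list, and
the names @code{entry_for} and @code{work_done} are hypothetical and do
not belong to the library.

@example
#include <stdlib.h>
#include <string.h>

typedef unsigned long counter;

/* Simplified stand-in for a profile entry. The real structure is
   opaque apart from the reductions field. */
struct entry @{
  char *key;
  counter reductions;   /* work done, as defined by the client */
  struct entry *next;
@};

static struct entry *database = NULL;

/* Hypothetical retrieve-or-create lookup. Returns NULL only if
   memory is exhausted. */
struct entry *entry_for(const char *key)
@{
  struct entry *e;

  for (e = database; e != NULL; e = e->next)
    if (strcmp(e->key, key) == 0)
      return e;
  e = malloc(sizeof *e);
  if (e == NULL)
    return NULL;
  e->key = malloc(strlen(key) + 1);
  if (e->key == NULL) @{
    free(e);
    return NULL;
  @}
  strcpy(e->key, key);
  e->reductions = 0;    /* a new entry starts with no work recorded */
  e->next = database;
  database = e;
  return e;
@}

/* Records one quantum of work against the annotated fragment. */
void work_done(const char *annotation)
@{
  struct entry *e = entry_for(annotation);

  if (e != NULL)
    e->reductions++;
@}
@end example

In this model, calling @code{work_done} twice with the same annotation
leaves a single entry whose @code{reductions} field is 2.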
@deftypefun score avm_entries (list @var{team}, list *@var{message}, int *@var{fault}) This function retrieves or creates a data base entry given its key. The parameters have these interpretations. @table @code @item @var{team} is a list representing a character string that uniquely identifies the database entry to be retrieved or created. @item @var{message} is the address of a list known to the caller, which will be assigned a list representing an error message if any error occurs in the course of searching the database or creating a new entry. @item @var{fault} is the address of an integer that will be set to a non-zero value if any error is caused by this function. @end table The pointer returned by this function is the address of the record whose key is given by the @code{@var{team}} parameter. If such a record is already in the database, its address is returned, but otherwise a new one is created whose address is then returned. The @code{reductions} field of a newly created entry will be zero. In the course of searching the database, the @code{avm_compare} function is used, so the associated lists may be modified as noted in @ref{Comparison}. It is not necessary for a client to include the header file @file{compare.h} or to call @code{avm_initialize_compare} in order to use the profile operations, because they are done automatically. If an error message is assigned to the list referenced by @code{@var{message}}, the integer referenced by @code{@var{fault}} will be set to a non-zero value. The form of the error message will be a list in which each item is a list of character representations as per @ref{Character Table}. It is the responsibility of the caller to dispose of the error message. Currently the only possible error is a memory overflow, which in this case is non-fatal. @end deftypefun @deftypefun void avm_tally (char *@var{filename}) This function makes a table of the results stored in the data base built by the @code{avm_entries} function. 
The argument is the address of a null terminated character string containing the name of the file in which the results will be written. A file is opened and the table is written in a self explanatory text format, with columns labeled ``reductions'' and ``invocations'' among others. The latter contains the number of times the associated key was accessed through @code{avm_entries}. The data written to the file should be taken with a grain of salt. It is computed using native integer and floating point arithmetic, with no checks made for overflow or roundoff error, and no guarantee of cross @cindex reductions platform portability. The number of ``reductions'' means whatever the developer of the client program wants it to mean. The following error messages are possible with this function, which will be written to standard error. None of them is fatal. @cindex @code{can't write} @cindex @code{can't close} @cindex @code{invalid profile identifier} @itemize @bullet @item @code{@var{program-name}: can't write @var{filename}} @item @code{@var{program-name}: can't write to @var{filename}} @item @code{@var{program-name}: can't close @var{filename}} @item @code{@var{program-name}: invalid profile identifier} @end itemize The last message is reported if any record in the database has a key that is not a list of valid character representations. The others are @cindex @code{strerror} accompanied by an explanation from the standard @code{strerror} function if possible. @end deftypefun @deftypefun void avm_initialize_profile () This function should be called before any of the other functions in this section in order to initialize the data base. Results are undefined if it is not called first. @end deftypefun @deftypefun void avm_count_profile () This function can be called after the other functions in this section as a way of detecting memory leaks. If any storage remains unreclaimed that was created by the functions in this section, a warning message is written to standard error. 
If the @code{avm_count_lists} function is being used by the client program, it should be called after this one. @end deftypefun @node Emulation Primitives, External Library Maintenance, Profiling, Library Reference @section Emulation Primitives The functions documented in this section can be used to take very specific control over the evaluation of virtual code applications. It is unlikely that a client program will have any need for them unless it aims to replace or extend the @code{avm_apply} function. The virtual machine is somewhat removed from a conventional von Neumann model of computation, so emulating it in C or any other imperative language is less straightforward than one would prefer. An elaborate system of interdependent data structures is used to represent partially evaluated computations, which does not particularly lend itself to a convenient, modular API. The abstraction provided by the functions in this section is limited mainly to that of simple memory management and stack operations. Consequently, a developer wishing to build on them effectively would need to @emph{grok} the data structures involved, which are described in some detail. @menu * Lists of Pairs of Ports:: * Ports and Packets:: * Instruction Stacks:: @end menu @node Lists of Pairs of Ports, Ports and Packets, Emulation Primitives, Emulation Primitives @subsection Lists of Pairs of Ports @cindex @code{port} A @code{port} is the name given to a type of pointer used in the library as the address of a place where a computational result yet to be evaluated will be sent. Ports are discussed further in @ref{Ports and Packets}, but are mentioned here because it is sometimes necessary to employ a list of pairs of them. A pointer to such a list is declared as a @code{portal} type. 
It refers to a structure of the form @cindex @code{portal} @cindex @code{port_pair} @example struct port_pair @{ port left; port right; portal alters; @} @end example A small selection of functions for @code{portal} memory management is declared as follows in the header file @file{portals.h}. For reasons of C-ness, the type declarations themselves are forced to be in @file{lists.h}. @deftypefun portal avm_new_portal (portal @var{alters}) This function is used to create storage for a new @code{port_pair} structure, and returns a @code{portal} pointer to it if successful. If the storage can't be allocated, a @code{NULL} pointer is returned. The @code{alters} field of the result is initialized as the given parameter supplied by the caller. All other fields are filled with zeros. @end deftypefun @deftypefun void avm_seal (portal @var{fate}) This function performs the reclamation of storage associated with @code{portal} pointers, either by freeing them or by consigning them temporarily to a local cache for performance reasons. Client programs should use only this function for disposing of @code{portal} storage rather than using @code{free} directly, so as to allow accurate record keeping. @end deftypefun @deftypefun void avm_initialize_portals () This function should be called by a client program prior to calling either of the above memory management functions in order to initialize some local variables. Anomalous results are possible otherwise. @end deftypefun @deftypefun void avm_count_portals () This function should be called at the end of a run or after the last call to any of the other functions in this section as a way of detecting memory leaks associated with @code{portal} pointers. A warning message will be written to standard error if any remains unreclaimed. 
@end deftypefun @node Ports and Packets, Instruction Stacks, Lists of Pairs of Ports, Emulation Primitives @subsection Ports and Packets A pointer type declared as a @code{port} points to a structure in the following form, where a @code{flag} is an unsigned short integer type, and a @code{counter} is an unsigned long integer. @cindex @code{counter} @cindex @code{flag} @cindex @code{avm_packet} @example struct avm_packet @{ port parent; counter errors; portal descendents; list impetus, contents; flag predicating; @}; @end example @noindent For reasons that make sense to C, the @code{avm_packet} and @code{port} types are declared in @code{lists.h}, but a few memory management operations on them are available by way of functions declared in @file{ports.h}. The intended meaning of this structure is described presently, but first the memory management functions are as follows. @deftypefun port avm_newport (counter @var{errors}, port @var{parent}, int @var{predicating}) This function attempts to allocate storage for a new packet structure and returns its address if successful. If storage can not be allocated, a @code{NULL} pointer is returned. The @code{errors}, @code{parent}, and @code{predicating} fields are initialized with the parameters supplied by the caller. The rest of the structure is filled with zeros. A local memory cache is used for improved performance. @end deftypefun @deftypefun void avm_sever (port @var{appendage}) This function reclaims the storage associated with a @code{port}, either freeing it entirely or holding it in a local cache. None of the entities that may be referenced by pointers within the structure are affected. Only this function should be used by client programs for disposing of ports, not the @code{free} function directly, or some internal bookkeeping will be disrupted. An internal error results if the argument is a @code{NULL} pointer. 
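
The idea of holding released storage in a local cache while keeping
count of what is outstanding can be sketched as follows. This is a
generic free-list model under stated assumptions, not the library's
implementation: the @code{packet} structure is cut down to three
fields, and the names @code{new_packet}, @code{release_packet}, and
@code{unreclaimed_packets} are hypothetical.

@example
#include <stdlib.h>

/* Simplified stand-in for a packet; the real one has more fields. */
struct packet @{
  struct packet *parent;
  unsigned long errors;
  int predicating;
@};

static struct packet *cache = NULL;    /* released packets kept for reuse */
static unsigned long outstanding = 0;  /* allocated but not yet released */

/* Reuses a cached packet when one is available and otherwise falls
   back on malloc. Returns NULL if no storage can be obtained. */
struct packet *new_packet(unsigned long errors, struct packet *parent,
                          int predicating)
@{
  struct packet *p = cache;

  if (p != NULL)
    cache = p->parent;       /* the parent field links the cache */
  else if ((p = malloc(sizeof *p)) == NULL)
    return NULL;
  p->errors = errors;
  p->parent = parent;
  p->predicating = predicating;
  outstanding++;
  return p;
@}

/* Returns a packet to the cache. Nothing that its fields refer to
   is touched. */
void release_packet(struct packet *p)
@{
  p->parent = cache;
  cache = p;
  outstanding--;
@}

/* A non-zero result at the end of a run would indicate a leak. */
unsigned long unreclaimed_packets(void)
@{
  return outstanding;
@}
@end example

A scheme of this kind also suggests why disposing of storage with
@code{free} directly would upset the bookkeeping: the count of
outstanding packets would never return to zero.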
@end deftypefun @deftypefun void avm_initialize_ports () This function must be called prior to calling either of the two above, in order to initialize some static variables. @end deftypefun @deftypefun void avm_count_ports () This function may be called after the last call to any of the other functions in this section in order to detect and report unreclaimed storage associated with ports. A non-fatal warning will be written to standard error if any is detected, but otherwise there is no effect. @end deftypefun The interesting aspect of this data structure is the role it plays in capturing the state of a computation. For this purpose, it corresponds to a single node in a partially computed result to be represented by a @code{list} when it's finished. The nodes should be envisioned as a doubly-linked binary tree, except that the pair of @code{descendents} for each node is not yet known with certainty, so a list of alternatives must be maintained. Because the computation is not completed while this data structure exists, there are always some empty fields in it. For example, the @code{descendents} and the @code{contents} fields embody the same information, the latter doing so in a compact as opposed to a more expanded form. Hence, it would be redundant for both fields to be non-empty at the same time. The data structure is built initially with @code{descendents} and no @code{contents}, only to be transformed into one with @code{contents} and no @code{descendents}. The significance of each field in the structure can be summarized as follows. @table @code @item contents If the computational result destined for the @code{port} pointing to this packet is not complete, then this field is @code{NULL} and the @code{descendents} are being computed. Otherwise, it contains the result of the computation. 
@item descendents
This field points to a list of pairs of ports serving as the
destinations for an ensemble of concurrent computations.@footnote{Earlier
versions of @code{avram} included a bottom avoiding choice combinator
that required this feature, but which has been withdrawn. A single pair
of descendent ports would now suffice.}
The @code{head} and @code{tail} of the @code{contents} are to be
identified respectively with the @code{contents} of the @code{left} and
@code{right} @code{port} in the first pair to finish being computed.
@item parent
If this packet is addressed by the @code{left} or the @code{right}
@code{port} in one of the @code{descendents} of some other packet, then
this field points to that packet.
@item errors
A non-zero value in this field indicates that the result destined for
the @code{contents} of this packet is expected to be an error message.
If the exact level of error severity incurred in the computation of the
@code{contents} matches this number, then the contents can be assigned
the result, but otherwise the result should propagate to the
@code{contents} of the @code{parent}.
@item predicating
A non-zero value in this field implies that the result destined for the
@code{contents} of this packet is being computed in order to decide
which arm of a conditional function should be chosen. That is, a
@code{NULL} result calls for the arm that is invoked when the predicate
is false.
@item impetus
If the result destined for the @code{contents} of this packet is being
computed in order to transform a virtual code fragment from its original
form to an equivalent representation capable of being evaluated more
directly, this field points to a @code{list} node at the root of the
virtual code in its original form.
@end table

@cindex @code{interpretation}
@cindex @code{impetus}
One of the hitherto undocumented fields in a @code{list} node structure
declared in @file{lists.h} is called the @code{interpretation}, and is
of type @code{list}.
A client program delving into sufficient depth of detail to be concerned
with ports and packets may reasonably assign the @code{interpretation}
field of the @code{list} referenced by the @code{impetus} field in a
packet to be a copy of the @code{contents} of the packet when they are
eventually obtained. Doing so will save some time by eliminating the
need for it to be recomputed if the same virtual code should be executed
again.

@cindex @code{facilitator}
If this course is taken, the @code{facilitator} field in a @code{list}
node, also hitherto undocumented, should contain the address of the
packet referring to the list node as its @code{impetus}. The reason for
this additional link is so that it can be followed when the
@code{impetus} of the packet is to be cleared by @code{avm_dispose} in
the event that the @code{list} node is freed before the computation
completes. This action is performed in order to preclude a dangling
pointer in the @code{impetus} field.

@node Instruction Stacks, , Ports and Packets, Emulation Primitives
@subsection Instruction Stacks

A header file named @file{instruct.h} declares a number of memory
management and stack operations on a data structure of the following
form.

@cindex @code{instruction_node}
@example
struct instruction_node
@{
  port client;
  score sheet;
  struct avm_packet actor;
  struct avm_packet datum;
  instruction dependents;
@};
@end example

In this structure, an @code{instruction} is a pointer to an
@code{instruction_node}, a @code{score} is a pointer to a profile
database entry as discussed in @ref{Profiling}, and the @code{port} and
@code{avm_packet} types are as described in @ref{Ports and Packets}.

@cindex concurrency
This data structure is appropriate for a simple virtual machine code
evaluation strategy involving no concurrency. The strategy to evaluate
an expression @code{@var{f} @var{x}} would be based on a stack of these
nodes threaded through the @code{dependents} field, and would proceed
something like this.
@enumerate @item The stack is initialized to contain a single node having @code{@var{f}} in its @code{actor.contents} field, and @code{@var{x}} in its @code{datum.contents} field. @item The @code{client} in this node would refer to a static packet to whose @code{contents} field the final result will be delivered. @item The evaluator examines the @code{actor.contents} field on the top of the stack, detects by its form the operation it represents, and decides whether it corresponds to one that can be evaluated immediately by way of a canned function available in the library. List reversal, transposition, and comparison would be examples of such operations. @item If the operation can be performed in this way, the result is computed and assigned to the destination indicated by the @code{client} field. @item If the operation is not easy enough to perform immediately but is of a form recognizable as a combination of simpler operations, it is decomposed into the simpler operations, and each of them is strategically positioned on the stack so as to effect the evaluation of the combination. For example, if @code{@var{f}} were of the form @code{compose(@var{g},@var{h})} (@code{silly} notation), the node with @code{@var{f}} and @code{@var{x}} would be popped, but a node with @code{@var{g}} as its @code{actor.contents} would be pushed, and then a node with @code{@var{h}} as its @code{actor.contents} and @code{@var{x}} as its @code{datum.contents} would be pushed. Furthermore, the @code{client} field of the latter node would point to the @code{datum.contents} of the one with @code{@var{g}}, and the @code{client} field of the one with @code{@var{g}} would point wherever the @code{client} of the popped node used to point. @item If the operation indicated by the top @code{actor.contents} is neither implemented by a canned operation in the library nor easily decomposable into some that are, the evaluator can either give up or use virtual code to execute other virtual code. 
The latter trick is accomplished by pushing a node with @code{@var{f}} as its @code{datum.contents}, and a copy of a hard coded virtual code interpreter @code{@var{V}} as its @code{actor.contents}. The @code{client} of this node will point to the @code{@var{f}} in the original node so as to overwrite it when a simplified version is subsequently computed. The implementation of @code{@var{V}} is a straightforward exercise in @code{silly} programming. @item In any case, the evaluator would continue working on the stack until everything on it has been popped, at which time the result of the entire computation will be found in the packet addressed by the @code{client} in the original instruction node. @end enumerate What makes this strategy feasible to implement is the assumption of a sequential language, wherein synchronization incurs no cost and is automatic. The availability of any operand is implied by its position at the top of the stack. If you are reading this section with a view to @cindex threads implementing a concurrent or multi-threaded evaluation strategy, it will be apparent that further provisions would need to be made, such as that of a @code{data_ready} flag added to the @code{avm_packet} structure. The following functions support the use of stacks of instruction nodes that would be needed in an evaluation strategy such as the one above. @deftypefun int avm_scheduled (list @var{actor_contents}, counter @var{datum_errors}, list @var{datum_contents}, port @var{client}, instruction *@var{next}, score @var{sheet}) This function performs the memory allocation for instruction nodes. It attempts to create one and to initialize the fields with the given parameters, returning a pointer to it if successful. It returns a @code{NULL} pointer if the storage could not be allocated. 
Copies of the @code{list} parameters @code{actor_contents} and
@code{datum_contents} are made by this function using @code{avm_copied},
so the originals still exist as far as the caller is concerned and will
have to be deallocated separately from this structure. The copies are
made only if the allocation succeeds. Any fields other than those
indicated by the parameters to this function are filled with zeros in
the result.
@end deftypefun

@deftypefun void avm_retire (instruction *@var{done})
This function performs the storage reclamation of instructions, taking
as its argument the instruction to be reclaimed. The @code{list} fields
in the structure corresponding to the @code{list} parameters used when
it was created are specifically reclaimed as well, using
@code{avm_dispose}.

The argument to this function is the address of an @code{instruction}
rather than just an @code{instruction} so that the @code{instruction}
whose address is given may be reassigned as the @code{dependents} field
of the deallocated node. In this way, the instructions can form a stack
that is popped by this function.

This function cooperates with @code{avm_scheduled} in the use of a local
cache of instruction nodes in the interest of better performance. Client
modules should not attempt to allocate or reclaim instructions directly
with @code{malloc} or @code{free}, but use only these functions.

Passing a @code{NULL} pointer to this function causes a fatal internal
error.
@end deftypefun

@deftypefun void avm_reschedule (instruction *@var{next})
Given the address of an instruction pointer that may be regarded as the
top of a stack of instructions threaded through the @code{dependents}
field, this function exchanges the positions of the top instruction and
the one below it. A fatal internal error is caused if there are fewer
than two instructions in the stack.
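
The exchange amounts to relinking three pointers, as the following
self-contained sketch suggests. It is a model under stated assumptions
rather than the library's code: the @code{node} structure keeps only
the stacking link and an identifying tag, the name @code{swap_top_two}
is hypothetical, and an @code{assert} stands in for the fatal internal
error.

@example
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an instruction node. */
struct node @{
  int tag;
  struct node *dependents;
@};

/* Exchanges the top node of the stack with the one below it. */
void swap_top_two(struct node **next)
@{
  struct node *first, *second;

  /* fewer than two nodes is treated as a programming error */
  assert(next != NULL && *next != NULL && (*next)->dependents != NULL);
  first = *next;
  second = first->dependents;
  first->dependents = second->dependents;
  second->dependents = first;
  *next = second;
@}
@end example

If the stack holds nodes tagged 1, 2, and 3 from the top down, then it
holds 2, 1, and 3 after the call, and the rest of the stack below the
second node is unaffected.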
A use for this function arises in the course of evaluating virtual code
applications of the form
@code{conditional(@var{p},(@var{f},@var{g}))} (in @code{silly}
notation). The evaluation strategy would require pushing nodes for all
three constituents, but with @code{@var{p}} pushed last (therefore
evaluated first). The result of the evaluation of @code{@var{p}} would
require either the top one or the one below it to be popped without
being evaluated, depending on whether the result is empty.
@end deftypefun

@deftypefun void avm_initialize_instruct ()
This function should be called before any of the instruction memory
management functions is called in order to initialize some local data
structures. Results are unpredictable without it.
@end deftypefun

@deftypefun void avm_count_instruct ()
This function should be called after the last call to any of the other
functions in this section in order to detect and report unreclaimed
storage associated with them. A warning message will be written to
standard error if any unreclaimed instructions remain. This function
relies on the assumption that the memory management has been done only
by way of the above functions.
@end deftypefun

@node External Library Maintenance, , Emulation Primitives, Library Reference
@section External Library Maintenance

@cindex Fortran
External mathematical library functions such as those documented in
@ref{External Libraries} that are invoked from virtual code by the
@code{library} combinator (@ref{Library combinator}) are also accessible
from C by way of a uniform API implemented by the functions declared in
@code{libfuns.h}. This interface applies even to libraries implemented
in Fortran such as @code{minpack}.

This section briefly documents the functions in @code{libfuns.h} and
sets out some recommended guidelines for developers wishing to add
support for other external libraries.
@menu
* Calling existing library functions::
* Implementing new library functions::
* Working around library misfeatures::
@end menu

@node Calling existing library functions, Implementing new library functions, External Library Maintenance, External Library Maintenance
@subsection Calling existing library functions

Whatever data types a library function manipulates, its argument and its
result are each ultimately encoded by a single list as explained in
@ref{Type Conversions}. This representation allows all library functions
to be invoked by a uniform calling convention as detailed below.

@deftypefun list avm_library_call (list @var{library_name}, list @var{function_name}, list @var{argument}, int *@var{fault})
This function serves as an interpreter of external library functions,
mapping a @var{library_name}, a @var{function_name}, and an
@var{argument} to the result returned by the corresponding library
function for the given @var{argument}.

@cindex backward compatibility
The library and function names should be encoded as lists of character
representations, the same as the arguments that would be used with the
@code{library} combinator if it were being invoked by virtual code (with
attention to the backward compatibility issue explained in
@ref{Characters and Strings}).

If an error occurs in the course of evaluating a library function, the
integer referenced by @var{fault} will be assigned a non-zero value, and
the result will be a list of character string representations explaining
the error, such as @code{<'memory overflow'>}, for example. Otherwise,
the list returned will encode the result of the library function in a
way that depends on the particular function being evaluated.
@end deftypefun

@deftypefun list avm_have_library_call (list @var{library_name}, list @var{function_name}, int *@var{fault})
This function implements the @code{have} combinator described in
@ref{Have combinator}, which tests for the availability of a library
function.
The @var{library_name} and @var{function_name} parameters are as
explained above for @code{avm_library_call}, and @var{fault} could
signal an error similarly for this function as well.

The result returned will be an error message in the event of an error,
or a list of pairs of strings otherwise. The list will be empty if the
library function is not available. If the library function is available,
the list will contain a single pair, as in

@example
<(library_name,function_name)>
@end example

In addition, the list representation of the character string
@code{'*'} can be specified as either the library name or the function
name or both. This string is interpreted as a wild card and will cause
all matching pairs of library and function names to be returned in the
list.
@end deftypefun

@deftypefun void avm_initialize_libfuns ()
This function initializes some static data structures used by the two
functions above. It may be called optionally before the first call to
either of them, but will be called automatically if not.
@end deftypefun

@deftypefun void avm_count_libfuns ()
This function can be used as an aid to detecting memory leaks. It
reclaims any data structures allocated by @code{avm_initialize_libfuns}
and should be called towards the end of a run some time prior to
@code{avm_count_lists} (@ref{Simple Operations}), if the latter is being
used.
@end deftypefun

@node Implementing new library functions, Working around library misfeatures, Calling existing library functions, External Library Maintenance
@subsection Implementing new library functions

Adding more external libraries to @code{avram} is currently a manual
procedure requiring the attention of a developer conversant with C. To
support a new library called @code{foobar}, these steps need to be
followed at a minimum.
@itemize @bullet
@item
Create a new file called @file{foobar.h} under the
@file{avm/} directory in the main source tree whose name doesn't clash with any
@cindex header file
@cindex library interface header file
existing file names and preferably doesn't induce any proper prefixes
among them. This file should contain at least these function
declarations.
@example
extern list avm_foobar_call (list function_name,list argument,
                             int *fault);

extern list avm_have_foobar_call (list function_name,int *fault);

extern void avm_initialize_foobar ();

extern void avm_count_foobar ();
@end example
There should also be the usual preprocessor directives for
@file{include} files. The naming convention shown should be followed in
anticipation of automated support for these operations in the future.
@item
Add @file{foobar.h} to the list of other header files in
@file{avm/Makefile.am}.
@item
Create a new file called @file{foobar.c} under the @file{src/}
directory whose name doesn't clash with any existing file names to
@cindex library interface source file
store most of the library interface code. It can start out with stubs
for the functions declared in @file{foobar.h}.
@item
Add @file{foobar.c} to the list of other source files in
@file{src/Makefile.am}.
@item
Execute the following command in the main @file{avram-x.x.x} source
directory where the file @file{configure.in} is found.
@example
aclocal \
 && automake --gnu --add-missing \
 && autoconf
@end example
This command requires having @code{automake} and
@cindex automake
@cindex autoconf
@code{autoconf} installed on your system.
@item
Make the following changes to @file{libfuns.c}.
@itemize @bullet
@item
Add an @code{#include} line for @file{foobar.h} after the
@cindex include directives
other @code{include} directives.
@item
Add the string @code{"foobar"} to the end of the array of
@code{libnames} in @code{avm_initialize_libfuns}.
@item
Add a call to @code{avm_initialize_foobar} to the body of the same
function.
@item Add a call to @code{avm_count_foobar} to the body of @code{avm_count_libfuns}. @item Add a case of the form @example case nn: return avm_foobar_call(function_name,argument,fault); @end example after the last case in @code{avm_library_call}, being careful not to change the order, and using the same name as above in the file @file{foobar.h}. @item Add a case of the form @example case nn: looked_up = avm_have_foobar_call(function_name,fault); break; @end example after the last case in @code{avm_have_library_call}, being careful not to change the order, and using the same name as above in the file @file{foobar.h}. @end itemize @item Edit @file{foobar.c} and @file{foobar.h} to suit, periodically compiling and testing by executing @code{make}. @item Package and install at will. @end itemize The functions shown above have the obvious interpretations, namely that @code{avm_foobar_call} evaluates a library function from the @code{foobar} library, and @code{avm_have_foobar_call} tests for a function's availability. The latter should interpret wild cards as explained in @ref{Calling existing library functions}, but should return only a list of strings for the matching function names rather than a list of pairs of strings, as the library name is redundant. The remaining functions are for static initialization and reclamation. These functions should consist mainly of boilerplate code similar to the corresponding functions in any of the other library source files, which should be consulted as examples. The real work would be done by other functions called by them. These should be statically declared within the @file{.c} source file and normally not listed in the @file{.h} header file unless there is some reason to think they may be of more general use. Any externally visible functions should have names beginning with @code{avm_} to avoid name clashes. Some helpful hints are reported below for what they may be worth. 
@itemize @bullet @item The reason for doing this is to leverage off other people's intelligence, so generally @code{foobar.c} should contain only glue code for library routines developed elsewhere with great skill rather than reinventing them in some home grown way. @item The best numerical software is often written by Fortran @cindex Fortran programmers. Linking to a Fortran library is no problem on GNU systems provided that all variables are passed by reference and all arrays are converted to column order (@ref{Type Conversions}). @item Most C++ programmers have yet to reach a comparable standard, but C++ @cindex C++ libraries can also be linked by running @code{nm} on the static @cindex nm utility library file to find out the real names of the functions and @cindex c++filt utility @code{c++filt} to find out which is which. However, there is no obvious workaround for the use of so called derived classes by C++ programmers to simulate passing functions as parameters. @item Anything worth using can probably be found in the Debian @cindex Debian archive. @item Not all libraries are sensible candidates for interfaces to @code{avram}. Typical design flaws are @itemize @bullet @item irrepressible debugging messages written to @code{stderr} or @code{stdout} that are unfit for end user consumption @item deliberately crashing the application if @code{malloc} fails @item opaque data types with undocumented storage requirements @item opaque data types that would be useful to store persistently but have platform specific binary representations @item heavily state dependent @cindex state dependence semantics @item identifiers with clashing names @item restrictive @cindex licensing restrictions licenses @end itemize Some of these misfeatures have workarounds as explained next in @ref{Working around library misfeatures}, at least if there's nothing else wrong with the library. 
@end itemize Those who support @code{avram} are always prepared to assist in the dissemination of worthwhile contributed library modules under terms compatible with @ref{Copying}, and under separate copyrights if @cindex copyright preferred. Contributed modules can be integrated into the official source tree provided that they meet the following additional @cindex coding standards guidelines to those above. @itemize @bullet @item source code documentation and indentation according to GNU coding standards (@url{http://www.gnu.org/prep/standards}) @item sufficient stability for a semi-annual release cycle @item no run-time or compile-time dependence on any non-free software, although dynamic loading and client/server interaction are acceptable @item portable or at least unbreakable configuration by appropriate use of @cindex autoconf @code{autoconf} macros and conditional defines @item little or no state dependence at the level of the virtual code @cindex state dependence interface (i.e., pure functions or something like them, except for @cindex random number generators random number generators and related applications) @item adequate documentation for a section in @ref{External Libraries} @end itemize @node Working around library misfeatures, , Implementing new library functions, External Library Maintenance @subsection Working around library misfeatures As mentioned already (@ref{Implementing new library functions}), some common problems with external libraries that are worthwhile in other respects are that they may generate unwelcome console output while running, they may follow ill defined memory management policies, and they may handle exceptions just by crashing themselves along with the client module. An accumulation of techniques for coping with these issues (short of modifying the library source) has been collected into the API and made available by way of the header file @file{mwrap.h}. This section briefly documents how they might be put to use. 
@menu * Inept excess verbiage:: * Memory leaks:: * Suicidal exception handling:: @end menu @node Inept excess verbiage, Memory leaks, Working around library misfeatures, Working around library misfeatures @subsubsection Inept excess verbiage Although the author of a library function may take pride in putting its activities on display, it should be assumed that virtual code applications running on @code{avram} have other agendas for the console, so the library interface module should prevent direct output from the external library. More thoughtful API's may have a verbosity setting, which should be @cindex verbosity setting used in preference to this workaround, but failing that, it is easy to dispense with console output generated by calls to external library functions by using some combination of the following functions. @deftypefun void avm_turn_off_stdout () Calling this function will suppress all output to the standard output stream until the next time @code{avm_turn_on_stdout} is called. Additional calls to this function without intervening calls to @code{avm_turn_on_stdout} may be made safely with no effect. The standard output stream is flushed as a side effect of calling this function. @end deftypefun @deftypefun void avm_turn_on_stdout () Calling this function will allow output to the standard output stream to resume if it has been suppressed previously by a call to @code{avm_turn_off_stdout}. If @code{avm_turn_off_stdout} has not been previously called, this function has no effect. Any output that would have been sent to @code{stdout} during the time it was turned off will be lost. @end deftypefun @deftypefun void avm_turn_off_stderr () This function performs a similar service to that of @code{avm_turn_off_stdout} but pertains to the standard error stream. The standard error and the standard output streams are controlled independently even if both of them are piped to the same console. 
@end deftypefun

@deftypefun void avm_turn_on_stderr ()
This function performs a similar service to that of
@code{avm_turn_on_stdout} but pertains to the standard error stream.
@end deftypefun

As an example, the following code fragment will prevent any output to
standard output taking place as a side effect of @code{blather}, but
will allow error messages to standard error. Note that output should
not be left permanently turned off.

@example
...
#include ...
x = y + z;
avm_turn_off_stdout ();
w = blather (foo, bar, baz);
avm_turn_on_stdout ();
return w;
...
@end example

One possible issue with these functions is that they rely on a feature
of the GNU C library that might not be portable to non-GNU
@cindex portability
systems and has not been widely tested on other platforms. Another
issue is that a library function could be both careless enough to
clutter the console unconditionally and meticulous enough to check for
I/O errors after each attempted write. Writing while the output stream
is disabled will return an I/O error to the caller (i.e., to the
verbose library function) for appropriate action, which could include
terminating the process.

@node Memory leaks, Suicidal exception handling, Inept excess verbiage, Working around library misfeatures
@subsubsection Memory leaks

Incorrect memory management may undermine confidence in a library when
one wonders what else it gets wrong, but if the worst it does is leave
a few bytes unreclaimed, then help is at hand.

The first priority is to assess the seriousness of the
situation. Similarly to the way library functions are bracketed with
calls to those listed in @ref{Inept excess verbiage}, the following
functions are meant to be placed before and after a call to a library
function either for diagnostic purposes or production use.
@deftypefun void avm_manage_memory ()
After this function is called, all subsequent calls to the standard C
functions @code{malloc}, @code{free}, and @code{realloc} are intercepted
and logged until the next time @code{avm_dont_manage_memory} is called.
Furthermore, a complete record is maintained of the addresses and sizes
of all allocated areas of memory during this time in a persistent data
structure managed internally.
@end deftypefun

@deftypefun void avm_dont_manage_memory ()
Calling this function suspends the storage monitoring activities
initiated by calling @code{avm_manage_memory}, but the record of
allocated memory areas is not erased.
@end deftypefun

@deftypefun void avm_debug_memory ()
After this function is called and @code{avm_manage_memory} is also
called, the standard output stream will display a running account of
the sizes and addresses of all memory allocations or deallocations as
they occur until the next call to either
@code{avm_dont_debug_memory} or @code{avm_dont_manage_memory}.
@end deftypefun

@deftypefun void avm_dont_debug_memory ()
This function stops the output being sent to @code{stdout} caused by
@code{avm_debug_memory}, if any, but has no effect on the logging of
memory management events performed due to @code{avm_manage_memory}.
@end deftypefun

While the latter two are not useful in production code, they can help
to clarify an inadequately documented API during development by
experimentally identifying the functions that cause memory to be
allocated. They can also provide the answer to questions like whether
separate copies are made from arrays passed to functions (useful for
knowing when it's appropriate to free them).

Although the console output reveals everything there is to know about
memory management during the selected window, the question of
unreclaimed storage is more directly settled by the following
functions.
@deftypefun void avm_initialize_mwrap () This function has to be called before any other functions from @file{mwrap.h} in order to clean the slate and prepare the static data structures for use. This function might not have to be called explicitly if the client module is part of @code{avram}, whose main program would have already called it. There is no harm in calling it repeatedly. @end deftypefun @deftypefun void avm_count_mwrap () This function should be called after the last call to any other functions in @file{mwrap.h}, when it is expected that all storage that was allocated while @code{avm_manage_memory} was in effect should have been reclaimed. If there is no unreclaimed storage allocated during an interval when memory was being managed, this function returns uneventfully. However, if any storage remains unreclaimed, a message stating the number of bytes is written to @code{stderr}. If @code{avm_debug_memory} is also in effect when this function detects unreclaimed storage, an itemized list of the unreclaimed memory addresses and their sizes is written to standard output. @end deftypefun Of course, in order for @code{avm_count_mwrap} to report meaningful results, any memory that is allocated during the interval between calls to @code{avm_manage_memory} and @code{avm_dont_manage_memory} must have been given an opportunity to be reclaimed also while this logging mechanism is in effect. However, there may be arbitrarily many intervening intervals during which it is suspended. On the other hand, any storage that is allocated when memory is not being managed must not be freed at a time when it is (except for freeing a @code{NULL} pointer, which is tolerated but not encouraged). Doing so raises an internal error, causing termination @cindex internal error with extreme prejudice. This behavior is a precaution against library functions freeing storage that they didn't allocate, which would mean no memory is safe and it's better for @code{avram} not to continue. 
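
As a rough illustration of the diagnostic use described above, a
fragment along the following lines could bracket a single call to a
library routine under investigation during development. Only the
@code{avm_} functions are taken from this section. The names
@code{suspect_library_call} and @code{suspect_library_release} are
hypothetical stand-ins for whatever the external library exports, and
the include path is an assumption.

@example
#include <avm/mwrap.h>                  /* include path is an assumption */
...
avm_initialize_mwrap ();                /* harmless if already done */
avm_debug_memory ();                    /* itemize events on stdout */
avm_manage_memory ();                   /* start logging */
handle = suspect_library_call (foo);    /* hypothetical routine under test */
suspect_library_release (handle);       /* hypothetical counterpart */
avm_dont_manage_memory ();              /* suspend logging, keep the record */
avm_count_mwrap ();                     /* report anything left unreclaimed */
@end example

In this sketch the allocation and the release both occur while logging
is in effect, so any byte count reported on @code{stderr} can be
attributed to the library rather than to the caller.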
If these investigations uncover no evidence of a memory leak, then perhaps the relevant library functions are reliable enough to run without supervisory memory management. Alternatively, when memory leaks are indicated, the next function provides a simple remedy. @deftypefun void avm_free_managed_memory () This function causes all storage to be reclaimed that was allocated at any time while logging of memory allocation was in effect (i.e., whenever @code{avm_manage_memory} had been called more recently than @code{avm_dont_manage_memory}). When the storage is freed, no further record of it is maintained. A side effect of this function is to call @code{avm_dont_manage_memory} and therefore leave memory management turned off. @end deftypefun This last function when used in conjunction with the others is therefore the workaround for library functions that don't clean up after themselves. It may be important to do it for them if repeated calls to the library function are expected, which would otherwise cause unreclaimed storage to accumulate until it curtailed other operations. One small issue with this function is the assumption that unreclaimed storage is really a leak and not internal library data that is designed to persist between calls. If this assumption is not valid, breakage will occur. However, libraries deliberately making use of persistent data are likely to have initialization and destructor functions as part of their API's, so this assumption is often justified if they don't. An example of using these functions is given below. In this example, @code{allocated_library_object} is a hypothetical function exported by an external library that causes storage to be allocated, and @code{library_reclamation_routine} is provided by the same library ostensibly to reclaim the storage thus allocated. However, the latter is suspected of memory leaks. 
The variable @code{my_data} is declared and used by an @code{avram} developer who is presumably competent to reclaim it correctly, rather than it being part of an external library. Memory management is therefore enabled during the calls to the library routines but not at other times. The call to @code{avm_count_mwrap} is redundant immediately after a call to @code{avm_free_managed_memory}, because with all managed memory having been freed, no memory leak will ever be detected, but it is included for illustrative purposes. @example #include ... @{ void *behemoth; char *my_data; avm_initialize_mwrap (); avm_manage_memory (); behemoth = allocated_library_object (foo, bar); avm_dont_manage_memory (); my_data = (char *) malloc (100); ... free (my_data); avm_manage_memory (); library_reclamation_routine (&behemoth); avm_free_managed_memory (); avm_count_mwrap (); return; @} @end example It might be a cleaner solution in some sense to omit the call to @code{library_reclamation_routine} entirely, because the storage allocated during the call to @code{allocated_library_object} will be reclaimed perfectly well by @code{avm_free_managed_memory} without it. Doing so may also be the only option if the library reclamation routine is either extremely unreliable or non-existent. However, the style above is to be preferred for portability if possible. The memory management functions rely on the availability of the system header file @code{malloc.h}, and GNU C library features whose portability is not assured. If the required features are not detected on the host system at configuration time, conditional directives in the @code{avram} source will make the @code{avm_}* memory management functions perform no operations, and the responsibility for memory management will devolve to the possibly less robust external library implementation. 
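
The case of repeated calls mentioned earlier in this section might be
handled along the following lines. This is only a sketch under stated
assumptions: @code{leaky_measure} is a hypothetical library function
that returns its result by value but leaves storage unreclaimed,
@code{samples} and @code{n} are hypothetical variables belonging to
the caller, and the include path is assumed. Because
@code{avm_free_managed_memory} leaves memory management turned off as
a side effect, it is re-enabled at the top of each iteration.

@example
#include <avm/mwrap.h>     /* include path is an assumption */
...
@{
  double total;
  int i;

  total = 0.0;
  avm_initialize_mwrap ();
  for (i = 0; i < n; i++)
    @{
      avm_manage_memory ();                 /* re-enable logging each time */
      total += leaky_measure (samples[i]);  /* hypothetical library call */
      avm_free_managed_memory ();           /* reclaim, logging now off */
    @}
  ...
@}
@end example

Nothing allocated by the library is retained across iterations in this
arrangement, so it is suitable only when the result is returned by
value or copied into storage allocated while memory is not being
managed.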
@node Suicidal exception handling, , Memory leaks, Working around library misfeatures @subsubsection Suicidal exception handling An inconvenient characteristic of some external library functions is to terminate the program rather than returning an error status to the caller for routine events such as a failure of memory allocation. Although in many cases there is no simple workaround for this behavior, memory allocation failures at least can be detected and preventive action taken by using the functions described in this section. The general approach is to use memory management functions from @file{mwrap.h} as described previously (@ref{Memory leaks}), while additionally registering a return destination for a non-local jump to @cindex non-local jumps be taken in the event of a memory overflow. The jump is taken when an external library function calls @code{malloc} or @code{realloc} unsuccessfully. The jump avoids passing control back to the library function, thereby denying it the opportunity to abort, but restores the context to that of the jump destination almost as if the library function and all of its intervening callers had returned normally. The interface is similar to that of the standard @code{setjmp} @cindex setjmp function defined in the system header file @code{setjmp.h}, and in fact is built on it, but differs in that the client module does not explicitly refer to jump buffers. Instead, the @code{mwrap} module internally maintains a stack of return destinations. If a jump is taken, it always goes to the most recently registered destination. It may revert to the previously registered destination only when the current one is cleared. This organization provides the necessary flexibility for multiple clients and recursion, but it necessitates a protocol whereby each registration of a destination must be explicitly cleared exactly once. The following functions implement these two features. 
@deftypefun int avm_setjmp () This function specifies the point to which control will pass by a non-local jump if there is insufficient memory to complete a subsequent @code{malloc} or @code{realloc} operation. Only the operations that take place while memory is being managed due to @code{avm_manage_memory} are affected (@ref{Memory leaks}). The function returns zero when it is called normally and successfully registers the return point. It returns a non-zero value when it has been entered by a non-local jump (i.e., when @code{malloc} or @code{realloc} has reported insufficient memory while memory management is active), or when the return point could not be successfully registered due to insufficient memory. The client need not distinguish between these two cases, because both correspond to memory overflows and the destination must be cleared by @code{avm_clearjmp} regardless. When a non-zero value is returned due to this function being reached by a non-local jump, it has the side effects of reclaiming all managed memory by calling @code{avm_free_managed_memory} and disabling memory management by calling @code{avm_dont_manage_memory}. @end deftypefun @deftypefun void avm_clearjmp () This function cancels the effect of @code{avm_setjmp ()} by preventing further non-local jumps to its destination if the destination was successfully registered, or by acknowledging unsuccessful registration otherwise. It should be called before exiting any function that calls @code{avm_setjmp ()} or anomalous results may ensue. @end deftypefun The memory management functions @code{avm_manage_memory} and @code{avm_dont_manage_memory} can be useful with or without @code{avm_setjmp}, depending on how much of a workaround is needed for a given library. If a library does not abort on memory overflows, there is no need to use @code{avm_setjmp}, while it may still be appropriate to use the other functions against memory leaks. 
Calling @code{avm_clearjmp} is particularly important if a client
module with memory management that doesn't use @code{avm_setjmp} is
invoked subsequently to one that does, so that memory overflows in the
latter won't cause an attempted jump to a stale destination.

A further complication that arises from careful consideration of these
issues is the situation of a client module that does not intend to use
@code{avm_setjmp} but is called (perhaps indirectly) by one that
does. The latter will have registered a return destination that
remains active and valid even if the former refrains from doing so,
thereby allowing a branch to be taken that should have been prevented.
Although it is an unusual situation, it can be accommodated by the
following function.

@deftypefun void avm_setnonjump ()
This function temporarily inhibits non-local jumps to destinations
previously registered by @code{avm_setjmp} until the next time
@code{avm_clearjmp} is called. Thereafter, any previously registered
destinations are reinstated.
@end deftypefun

A sketch of how some of these functions might be used to cope with
library functions that would otherwise terminate the program in the
event of a memory overflow is shown below. The GNU @code{libc}
@cindex non-local jumps
reference manual contains a related discussion of non-local jumps.

@example
#include ...
int
foobar (foo, bar)
     ...
@{
  char *my_data;

  my_data = (char *) malloc (100);
  if (avm_setjmp () != 0)
    @{
      avm_clearjmp ();
      avm_turn_on_stdout ();             /* reaching here */
      free (my_data);                    /* means malloc */
      return ABNORMAL_STATUS;            /* failed below */
    @}
  avm_turn_off_stdout ();
  avm_manage_memory ();
  ...
  call_library_functions (foo, bar);     /* may jump */
  ...                                    /* to above */
  avm_free_managed_memory ();
  avm_turn_on_stdout ();
  avm_clearjmp ();
  free (my_data);                        /* reaching here means */
  return OK_STATUS;                      /* jumping wasn't done */
@}
@end example

Portability issues with these functions are not well known at this
@cindex portability
writing.
If the configuration script for @code{avram} fails to
detect the required features in @code{setjmp.h} on the host system,
conditional compilation directives will disable the functions
@code{avm_setjmp}, @code{avm_clearjmp}, and @code{avm_setnonjump}.
However, it may still be possible for the other @code{avm_}* memory
management functions to be configured.

If @code{setjmp} is not configured, the @code{avm_setjmp} function is
still callable but will always return a value of zero, and will
provide no protection against external library functions aborting the
program. The other two will perform no operation and return.

@node Character Table, Reference Implementations, Library Reference, Top
@appendix Character Table
@cindex character representations

This table lists the representations used by @code{avram} for
characters. The left column shows the character code in decimal. For
printable characters, the middle column shows the character. The right
column shows the representation used. For example, the letter @code{A}
has character code 65, and the representation
@code{(nil,(((nil,(nil,(nil,nil))),nil),(nil,nil)))}.

These representations were generated automatically to meet various
helpful criteria, and are not expected to change in future releases. No
character representation coincides with the representations used for
boolean values, natural numbers, character strings, pairs of
characters, or certain other data types beyond the scope of this
document. An easy algorithm for lexical sorting is possible. Subject to
these criteria, the smallest possible trees were chosen.
@example 0 (nil,(nil,(nil,((nil,nil),(nil,nil))))) 1 (nil,(nil,((nil,nil),(nil,nil)))) 2 (nil,(nil,((nil,nil),(nil,(nil,nil))))) 3 (nil,(nil,((nil,(nil,nil)),(nil,nil)))) 4 (nil,(nil,(((nil,nil),nil),(nil,nil)))) 5 (nil,(nil,(((nil,nil),(nil,nil)),nil))) 6 (nil,(nil,((((nil,nil),(nil,nil)),nil),nil))) 7 (nil,((nil,nil),(nil,nil))) 8 (nil,((nil,nil),(nil,(nil,nil)))) 9 (nil,((nil,nil),(nil,(nil,(nil,nil))))) 10 (nil,((nil,nil),(nil,(nil,(nil,(nil,nil)))))) 11 (nil,((nil,nil),(nil,((nil,nil),(nil,nil))))) 12 (nil,((nil,nil),(nil,((nil,(nil,nil)),nil)))) 13 (nil,((nil,nil),(nil,(((nil,nil),nil),nil)))) 14 (nil,((nil,nil),((nil,nil),(nil,nil)))) 15 (nil,((nil,nil),((nil,nil),(nil,(nil,nil))))) 16 (nil,((nil,nil),((nil,(nil,nil)),nil))) 17 (nil,((nil,nil),((nil,(nil,nil)),(nil,nil)))) 18 (nil,((nil,nil),((nil,(nil,(nil,nil))),nil))) 19 (nil,((nil,nil),(((nil,nil),nil),(nil,nil)))) 20 (nil,((nil,nil),(((nil,nil),(nil,nil)),nil))) 21 (nil,((nil,(nil,nil)),(nil,nil))) 22 (nil,((nil,(nil,nil)),(nil,(nil,nil)))) 23 (nil,((nil,(nil,nil)),(nil,(nil,(nil,nil))))) 24 (nil,((nil,(nil,nil)),(nil,((nil,nil),nil)))) 25 (nil,((nil,(nil,nil)),((nil,nil),nil))) 26 (nil,((nil,(nil,nil)),((nil,nil),(nil,nil)))) 27 (nil,((nil,(nil,nil)),((nil,(nil,nil)),nil))) 28 (nil,((nil,(nil,nil)),(((nil,nil),nil),nil))) 29 (nil,((nil,(nil,(nil,nil))),(nil,nil))) 30 (nil,((nil,(nil,(nil,nil))),(nil,(nil,nil)))) 31 (nil,((nil,(nil,(nil,nil))),((nil,nil),nil))) 32 (nil,((nil,(nil,(nil,(nil,nil)))),(nil,nil))) 33 ! 
(nil,((nil,(nil,((nil,nil),nil))),(nil,nil))) 34 " (nil,((nil,(nil,((nil,nil),(nil,nil)))),nil)) 35 # (nil,((nil,((nil,nil),nil)),(nil,nil))) 36 $ (nil,((nil,((nil,nil),nil)),(nil,(nil,nil)))) 37 % (nil,((nil,((nil,nil),(nil,nil))),nil)) 38 & (nil,((nil,((nil,nil),(nil,nil))),(nil,nil))) 39 ' (nil,((nil,((nil,nil),(nil,(nil,nil)))),nil)) 40 ( (nil,((nil,((nil,(nil,nil)),nil)),(nil,nil))) 41 ) (nil,((nil,((nil,(nil,nil)),(nil,nil))),nil)) 42 * (nil,((nil,(((nil,nil),nil),nil)),(nil,nil))) 43 + (nil,((nil,(((nil,nil),nil),(nil,nil))),nil)) 44 , (nil,((nil,(((nil,nil),(nil,nil)),nil)),nil)) 45 - (nil,(((nil,nil),nil),(nil,nil))) 46 . (nil,(((nil,nil),nil),(nil,(nil,nil)))) 47 / (nil,(((nil,nil),nil),(nil,(nil,(nil,nil))))) 48 0 (nil,(((nil,nil),nil),((nil,nil),(nil,nil)))) 49 1 (nil,(((nil,nil),nil),((nil,(nil,nil)),nil))) 50 2 (nil,(((nil,nil),(nil,nil)),nil)) 51 3 (nil,(((nil,nil),(nil,nil)),(nil,nil))) 52 4 (nil,(((nil,nil),(nil,nil)),(nil,(nil,nil)))) 53 5 (nil,(((nil,nil),(nil,nil)),((nil,nil),nil))) 54 6 (nil,(((nil,nil),(nil,(nil,nil))),nil)) 55 7 (nil,(((nil,nil),(nil,(nil,nil))),(nil,nil))) 56 8 (nil,(((nil,nil),(nil,(nil,(nil,nil)))),nil)) 57 9 (nil,(((nil,nil),((nil,nil),nil)),(nil,nil))) 58 : (nil,(((nil,nil),((nil,nil),(nil,nil))),nil)) 59 ; (nil,(((nil,nil),((nil,(nil,nil)),nil)),nil)) 60 < (nil,(((nil,(nil,nil)),nil),(nil,nil))) 61 = (nil,(((nil,(nil,nil)),nil),(nil,(nil,nil)))) 62 > (nil,(((nil,(nil,nil)),(nil,nil)),nil)) 63 ? 
(nil,(((nil,(nil,nil)),(nil,nil)),(nil,nil))) 64 @@ (nil,(((nil,(nil,nil)),(nil,(nil,nil))),nil)) 65 A (nil,(((nil,(nil,(nil,nil))),nil),(nil,nil))) 66 B (nil,(((nil,(nil,(nil,nil))),(nil,nil)),nil)) 67 C (nil,(((nil,((nil,nil),nil)),nil),(nil,nil))) 68 D (nil,(((nil,((nil,nil),nil)),(nil,nil)),nil)) 69 E (nil,((((nil,nil),nil),nil),(nil,nil))) 70 F (nil,((((nil,nil),nil),nil),(nil,(nil,nil)))) 71 G (nil,((((nil,nil),nil),(nil,nil)),nil)) 72 H (nil,((((nil,nil),nil),(nil,nil)),(nil,nil))) 73 I (nil,((((nil,nil),nil),(nil,(nil,nil))),nil)) 74 J (nil,((((nil,nil),(nil,nil)),nil),(nil,nil))) 75 K (nil,((((nil,nil),(nil,nil)),(nil,nil)),nil)) 76 L (nil,((((nil,(nil,nil)),nil),nil),(nil,nil))) 77 M (nil,((((nil,(nil,nil)),nil),(nil,nil)),nil)) 78 N (nil,(((((nil,nil),nil),nil),nil),(nil,nil))) 79 O (nil,(((((nil,nil),nil),nil),(nil,nil)),nil)) 80 P ((nil,nil),(nil,nil)) 81 Q ((nil,nil),(nil,(nil,nil))) 82 R ((nil,nil),(nil,(nil,(nil,nil)))) 83 S ((nil,nil),(nil,(nil,(nil,(nil,nil))))) 84 T ((nil,nil),(nil,(nil,(nil,(nil,(nil,nil)))))) 85 U ((nil,nil),(nil,(nil,((nil,(nil,nil)),nil)))) 86 V ((nil,nil),(nil,(nil,(((nil,nil),nil),nil)))) 87 W ((nil,nil),(nil,((nil,nil),(nil,nil)))) 88 X ((nil,nil),(nil,((nil,(nil,nil)),nil))) 89 Y ((nil,nil),(nil,((nil,(nil,nil)),(nil,nil)))) 90 Z ((nil,nil),(nil,((nil,(nil,(nil,nil))),nil))) 91 [ ((nil,nil),(nil,((nil,((nil,nil),nil)),nil))) 92 \ ((nil,nil),(nil,(((nil,nil),nil),nil))) 93 ] ((nil,nil),(nil,(((nil,nil),nil),(nil,nil)))) 94 ^ ((nil,nil),(nil,(((nil,nil),(nil,nil)),nil))) 95 _ ((nil,nil),(nil,(((nil,(nil,nil)),nil),nil))) 96 ` ((nil,nil),(nil,((((nil,nil),nil),nil),nil))) 97 a ((nil,nil),((nil,nil),(nil,nil))) 98 b ((nil,nil),((nil,nil),(nil,(nil,nil)))) 99 c ((nil,nil),((nil,nil),(nil,(nil,(nil,nil))))) 100 d ((nil,nil),((nil,nil),((nil,nil),(nil,nil)))) 101 e ((nil,nil),((nil,nil),((nil,(nil,nil)),nil))) 102 f ((nil,nil),((nil,(nil,nil)),nil)) 103 g ((nil,nil),((nil,(nil,nil)),(nil,nil))) 104 h 
((nil,nil),((nil,(nil,nil)),(nil,(nil,nil)))) 105 i ((nil,nil),((nil,(nil,nil)),((nil,nil),nil))) 106 j ((nil,nil),((nil,(nil,(nil,nil))),nil)) 107 k ((nil,nil),((nil,(nil,(nil,nil))),(nil,nil))) 108 l ((nil,nil),((nil,(nil,(nil,(nil,nil)))),nil)) 109 m ((nil,nil),((nil,((nil,nil),nil)),(nil,nil))) 110 n ((nil,nil),((nil,((nil,nil),(nil,nil))),nil)) 111 o ((nil,nil),((nil,((nil,(nil,nil)),nil)),nil)) 112 p ((nil,nil),(((nil,nil),nil),(nil,nil))) 113 q ((nil,nil),(((nil,nil),nil),(nil,(nil,nil)))) 114 r ((nil,nil),(((nil,nil),(nil,nil)),nil)) 115 s ((nil,nil),(((nil,nil),(nil,nil)),(nil,nil))) 116 t ((nil,nil),(((nil,nil),(nil,(nil,nil))),nil)) 117 u ((nil,nil),(((nil,(nil,nil)),nil),(nil,nil))) 118 v ((nil,nil),(((nil,(nil,nil)),(nil,nil)),nil)) 119 w ((nil,nil),((((nil,nil),nil),nil),(nil,nil))) 120 x ((nil,nil),((((nil,nil),nil),(nil,nil)),nil)) 121 y ((nil,nil),(((((nil,nil),nil),nil),nil),nil)) 122 z ((nil,(nil,nil)),(nil,nil)) 123 @{ ((nil,(nil,nil)),(nil,(nil,(nil,nil)))) 124 | ((nil,(nil,nil)),(nil,(nil,(nil,(nil,nil))))) 125 @} ((nil,(nil,nil)),(nil,((nil,nil),nil))) 126 ~ ((nil,(nil,nil)),(nil,((nil,nil),(nil,nil)))) 127 ((nil,(nil,nil)),(nil,((nil,(nil,nil)),nil))) 128 ((nil,(nil,nil)),((nil,nil),(nil,nil))) 129 ((nil,(nil,nil)),((nil,nil),(nil,(nil,nil)))) 130 ((nil,(nil,nil)),((nil,(nil,nil)),nil)) 131 ((nil,(nil,nil)),((nil,(nil,nil)),(nil,nil))) 132 ((nil,(nil,nil)),((nil,(nil,(nil,nil))),nil)) 133 ((nil,(nil,nil)),(((nil,nil),nil),(nil,nil))) 134 ((nil,(nil,nil)),(((nil,nil),(nil,nil)),nil)) 135 ((nil,(nil,(nil,nil))),(nil,nil)) 136 ((nil,(nil,(nil,nil))),(nil,(nil,nil))) 137 ((nil,(nil,(nil,nil))),(nil,(nil,(nil,nil)))) 138 ((nil,(nil,(nil,nil))),(nil,((nil,nil),nil))) 139 ((nil,(nil,(nil,nil))),((nil,nil),(nil,nil))) 140 ((nil,(nil,(nil,nil))),((nil,(nil,nil)),nil)) 141 ((nil,(nil,(nil,(nil,nil)))),(nil,nil)) 142 ((nil,(nil,(nil,(nil,nil)))),(nil,(nil,nil))) 143 ((nil,(nil,(nil,(nil,nil)))),((nil,nil),nil)) 144 
((nil,(nil,(nil,(nil,(nil,nil))))),(nil,nil)) 145 ((nil,(nil,(nil,((nil,nil),nil)))),(nil,nil)) 146 ((nil,(nil,((nil,nil),nil))),(nil,nil)) 147 ((nil,(nil,((nil,nil),(nil,nil)))),(nil,nil)) 148 ((nil,(nil,((nil,(nil,nil)),nil))),(nil,nil)) 149 ((nil,(nil,(((nil,nil),nil),nil))),(nil,nil)) 150 ((nil,((nil,nil),nil)),(nil,nil)) 151 ((nil,((nil,nil),nil)),(nil,(nil,nil))) 152 ((nil,((nil,nil),nil)),(nil,(nil,(nil,nil)))) 153 ((nil,((nil,nil),nil)),(nil,((nil,nil),nil))) 154 ((nil,((nil,nil),nil)),((nil,nil),(nil,nil))) 155 ((nil,((nil,nil),nil)),((nil,(nil,nil)),nil)) 156 ((nil,((nil,nil),(nil,nil))),(nil,nil)) 157 ((nil,((nil,nil),(nil,nil))),(nil,(nil,nil))) 158 ((nil,((nil,nil),(nil,(nil,nil)))),(nil,nil)) 159 ((nil,((nil,nil),((nil,nil),nil))),(nil,nil)) 160 ((nil,((nil,(nil,nil)),nil)),(nil,nil)) 161 ((nil,((nil,(nil,nil)),nil)),(nil,(nil,nil))) 162 ((nil,((nil,(nil,nil)),nil)),((nil,nil),nil)) 163 ((nil,((nil,(nil,nil)),(nil,nil))),(nil,nil)) 164 ((nil,((nil,(nil,(nil,nil))),nil)),(nil,nil)) 165 ((nil,((nil,((nil,nil),nil)),nil)),(nil,nil)) 166 ((nil,(((nil,nil),nil),nil)),(nil,nil)) 167 ((nil,(((nil,nil),nil),(nil,nil))),(nil,nil)) 168 ((nil,(((nil,nil),(nil,nil)),nil)),(nil,nil)) 169 ((nil,(((nil,(nil,nil)),nil),nil)),(nil,nil)) 170 ((nil,((((nil,nil),nil),nil),nil)),(nil,nil)) 171 (((nil,nil),nil),(nil,nil)) 172 (((nil,nil),nil),(nil,(nil,nil))) 173 (((nil,nil),nil),(nil,(nil,(nil,nil)))) 174 (((nil,nil),nil),(nil,(nil,(nil,(nil,nil))))) 175 (((nil,nil),nil),(nil,(nil,((nil,nil),nil)))) 176 (((nil,nil),nil),(nil,((nil,nil),nil))) 177 (((nil,nil),nil),(nil,((nil,nil),(nil,nil)))) 178 (((nil,nil),nil),(nil,((nil,(nil,nil)),nil))) 179 (((nil,nil),nil),(nil,(((nil,nil),nil),nil))) 180 (((nil,nil),nil),((nil,nil),(nil,nil))) 181 (((nil,nil),nil),((nil,nil),(nil,(nil,nil)))) 182 (((nil,nil),nil),((nil,(nil,nil)),nil)) 183 (((nil,nil),nil),((nil,(nil,nil)),(nil,nil))) 184 (((nil,nil),nil),((nil,(nil,(nil,nil))),nil)) 185 (((nil,nil),nil),(((nil,nil),nil),(nil,nil))) 
186 (((nil,nil),nil),(((nil,nil),(nil,nil)),nil)) 187 (((nil,nil),(nil,nil)),(nil,nil)) 188 (((nil,nil),(nil,nil)),(nil,(nil,nil))) 189 (((nil,nil),(nil,nil)),(nil,(nil,(nil,nil)))) 190 (((nil,nil),(nil,nil)),(nil,((nil,nil),nil))) 191 (((nil,nil),(nil,nil)),((nil,(nil,nil)),nil)) 192 (((nil,nil),(nil,(nil,nil))),(nil,nil)) 193 (((nil,nil),(nil,(nil,nil))),(nil,(nil,nil))) 194 (((nil,nil),(nil,(nil,(nil,nil)))),(nil,nil)) 195 (((nil,nil),(nil,((nil,nil),nil))),(nil,nil)) 196 (((nil,nil),((nil,nil),nil)),(nil,nil)) 197 (((nil,nil),((nil,nil),nil)),(nil,(nil,nil))) 198 (((nil,nil),((nil,nil),(nil,nil))),(nil,nil)) 199 (((nil,nil),((nil,(nil,nil)),nil)),(nil,nil)) 200 (((nil,nil),(((nil,nil),nil),nil)),(nil,nil)) 201 (((nil,(nil,nil)),nil),(nil,nil)) 202 (((nil,(nil,nil)),nil),(nil,(nil,nil))) 203 (((nil,(nil,nil)),nil),(nil,(nil,(nil,nil)))) 204 (((nil,(nil,nil)),nil),(nil,((nil,nil),nil))) 205 (((nil,(nil,nil)),nil),((nil,nil),(nil,nil))) 206 (((nil,(nil,nil)),nil),((nil,(nil,nil)),nil)) 207 (((nil,(nil,nil)),(nil,nil)),(nil,nil)) 208 (((nil,(nil,nil)),(nil,nil)),(nil,(nil,nil))) 209 (((nil,(nil,nil)),(nil,(nil,nil))),(nil,nil)) 210 (((nil,(nil,nil)),((nil,nil),nil)),(nil,nil)) 211 (((nil,(nil,(nil,nil))),nil),(nil,nil)) 212 (((nil,(nil,(nil,nil))),nil),(nil,(nil,nil))) 213 (((nil,(nil,(nil,nil))),nil),((nil,nil),nil)) 214 (((nil,(nil,(nil,nil))),(nil,nil)),(nil,nil)) 215 (((nil,(nil,(nil,(nil,nil)))),nil),(nil,nil)) 216 (((nil,(nil,((nil,nil),nil))),nil),(nil,nil)) 217 (((nil,((nil,nil),nil)),nil),(nil,nil)) 218 (((nil,((nil,nil),nil)),nil),(nil,(nil,nil))) 219 (((nil,((nil,nil),nil)),nil),((nil,nil),nil)) 220 (((nil,((nil,nil),nil)),(nil,nil)),(nil,nil)) 221 (((nil,((nil,nil),(nil,nil))),nil),(nil,nil)) 222 (((nil,((nil,(nil,nil)),nil)),nil),(nil,nil)) 223 (((nil,(((nil,nil),nil),nil)),nil),(nil,nil)) 224 ((((nil,nil),nil),nil),(nil,nil)) 225 ((((nil,nil),nil),nil),(nil,(nil,nil))) 226 ((((nil,nil),nil),nil),(nil,(nil,(nil,nil)))) 227 
((((nil,nil),nil),nil),(nil,((nil,nil),nil))) 228 ((((nil,nil),nil),nil),((nil,nil),nil)) 229 ((((nil,nil),nil),nil),((nil,nil),(nil,nil))) 230 ((((nil,nil),nil),nil),((nil,(nil,nil)),nil)) 231 ((((nil,nil),nil),nil),(((nil,nil),nil),nil)) 232 ((((nil,nil),nil),(nil,nil)),(nil,nil)) 233 ((((nil,nil),nil),(nil,nil)),(nil,(nil,nil))) 234 ((((nil,nil),nil),(nil,(nil,nil))),(nil,nil)) 235 ((((nil,nil),nil),((nil,nil),nil)),(nil,nil)) 236 ((((nil,nil),(nil,nil)),nil),(nil,nil)) 237 ((((nil,nil),(nil,nil)),nil),(nil,(nil,nil))) 238 ((((nil,nil),(nil,nil)),(nil,nil)),(nil,nil)) 239 ((((nil,nil),(nil,(nil,nil))),nil),(nil,nil)) 240 ((((nil,nil),((nil,nil),nil)),nil),(nil,nil)) 241 ((((nil,(nil,nil)),nil),nil),(nil,nil)) 242 ((((nil,(nil,nil)),nil),nil),(nil,(nil,nil))) 243 ((((nil,(nil,nil)),nil),nil),((nil,nil),nil)) 244 ((((nil,(nil,nil)),nil),(nil,nil)),(nil,nil)) 245 ((((nil,(nil,nil)),(nil,nil)),nil),(nil,nil)) 246 ((((nil,(nil,(nil,nil))),nil),nil),(nil,nil)) 247 ((((nil,((nil,nil),nil)),nil),nil),(nil,nil)) 248 (((((nil,nil),nil),nil),nil),(nil,nil)) 249 (((((nil,nil),nil),nil),nil),(nil,(nil,nil))) 250 (((((nil,nil),nil),nil),nil),((nil,nil),nil)) 251 (((((nil,nil),nil),nil),(nil,nil)),(nil,nil)) 252 (((((nil,nil),nil),(nil,nil)),nil),(nil,nil)) 253 (((((nil,nil),(nil,nil)),nil),nil),(nil,nil)) 254 (((((nil,(nil,nil)),nil),nil),nil),(nil,nil)) 255 ((((((nil,nil),nil),nil),nil),nil),(nil,nil)) @end example @node Reference Implementations, Changes, Character Table, Top @appendix Reference Implementations This appendix contains some @code{silly} source code for several functions that are mentioned in @ref{Virtual Code Semantics}, for specifying the virtual machine code semantics, namely @code{pairwise}, @code{transition}, @code{insert} and @code{replace}. The intention is to specify the virtual machine mathematically with a minimum of hand waving, by using only simple equations and small fragments of @code{silly} code, which has a straightforward semantics. 
However, the @code{silly} code fragments are more significant in some cases than what could fit into a few lines or be mechanically derived from an equation. The purpose of this appendix is therefore to avoid leaving any gaps in the construction by demonstrating that everything mentioned can be done. None of this code is needed for any practical purpose, because its functionality is inherent in the virtual machine, but it shows how certain operations would be specified if they were not built in. @menu * Pairwise:: * Insert:: * Replace:: * Transition:: @end menu @node Pairwise, Insert, Reference Implementations, Reference Implementations @section Pairwise @cindex @code{pairwise} This @code{silly} code fragment is mentioned in @ref{Reduce}, in the discussion of @code{reduce}, and is provided as an example of a solution to equations @emph{E1} to @emph{E3}. It is written in the style of a higher order function, in that it takes a function @code{@var{f}} as an argument and returns another function, [[@code{pairwise}]] @code{@var{f}} as a result. @example self = left argument = right head = left tail = right pairwise = compose( refer, compose( bu( conditional, conditional(argument,compose(tail,argument),constant nil)), couple( (hired couple)( (hired compose)( identity, constant (hired fan head)( argument, compose(tail,argument))), constant (hired meta)( self, compose(tail,compose(tail,argument)))), constant argument))) @end example @noindent To see how this works, one should evaluate it symbolically with an unknown @code{@var{f}}, which will result in some @code{silly} pseudocode, and then evaluate that symbolically with some sample lists. @node Insert, Replace, Pairwise, Reference Implementations @section Insert @cindex @code{insert} This function is mentioned in @ref{Sort}, on sorting. It takes the virtual code for a partial order relational operator and returns the code for a function of two arguments. 
The left argument is a list item and the right argument is a list of items of the same type, which is already sorted with respect to the relational operator given as the argument to @code{insert}. The result of the function returned by @code{insert} is a list similar to its right argument but with the left argument inserted in the proper position to maintain the order. This code makes use of the @code{self}, @code{argument}, @code{head} and @code{tail} declarations associated with @code{pairwise}. @example insert = bu(compose,refer) (hired conditional)( constant compose(right,argument), couple( (hired conditional)( (hired compose)( identity, constant compose( couple(left,compose(head,right)), argument)), constant ( argument, couple( compose(head,compose(right,argument)), (hired meta)( self, couple( compose(left,argument), compose(tail,compose(right,argument))))))), constant argument)) @end example As with the other higher order functions in this appendix, the only feasible ways to verify it would be either by formal proof or by some form of symbolic interpretation. @node Replace, Transition, Insert, Reference Implementations @section Replace @cindex @code{replace} This code is needed in the discussion of assignment in @ref{Assignment}, where it serves as a solution to equation @emph{E0}. The idea is that the function takes an argument of the form @code{((@var{locations},@var{values}),@var{store})} and returns the store with the values stored at the locations indicated.
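As an informal aid to reading the @code{silly} definition of @code{replace}, the same recursion can be modelled in Python. This is only a sketch under stated assumptions: @code{nil} is modelled by @code{None}, pairs by two-element tuples, and the helper names @code{left} and @code{right} and the three-argument signature are conveniences of the sketch rather than part of @code{silly} or @code{avram}. Taking the left or right side of @code{nil} is assumed here to yield @code{nil}.

```python
def left(t):
    # Left side of a pair; nil (None) is assumed to have a nil left side.
    return t[0] if t is not None else None


def right(t):
    # Right side of a pair; nil (None) is assumed to have a nil right side.
    return t[1] if t is not None else None


def replace(locations, values, store):
    # Hypothetical Python model of the recursion in the silly code.
    if store is None:
        # An empty store is treated as a pair of empty stores.
        return replace(locations, values, (None, None))
    l, r = left(locations), right(locations)
    if l is not None and r is not None:
        # A pair of locations takes a pair of values: left first, then right.
        updated = replace(l, left(values), store)
        return replace(r, right(values), updated)
    if l is not None:
        # The location lies within the left side of the store.
        return (replace(l, values, left(store)), right(store))
    if r is not None:
        # The location lies within the right side of the store.
        return (left(store), replace(r, values, right(store)))
    # Neither side is specified: the value takes the place of the store.
    return values


whole = (None, None)
on_left = (whole, None)
on_right = (None, whole)

single = replace(on_left, 'v', None)
both = replace((on_left, on_right), ('a', 'b'), ('x', 'y'))
```

In this model, storing at the left of an empty store gives a pair whose left side is the value, and storing a pair of values at a pair of locations updates both sides.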
@example locations = compose(left,compose(left,argument)) values = compose(right,compose(left,argument)) store = compose(right,argument) replace = refer conditional( store, ( conditional( compose(left,locations), ( conditional( compose(right,locations), ( (hired meta)( self, couple( (hired fan right)(locations,values), (hired meta)( self, couple( (hired fan left)(locations,values), store)))), couple( (hired meta)( self, couple( couple(compose(left,locations),values), compose(left,store))), compose(right,store)))), conditional( compose(right,locations), ( couple( compose(left,store), (hired meta)( self, couple( couple(compose(right,locations),values), compose(right,store)))), values)))), (hired meta)( self, couple(couple(locations,values),constant (nil,nil))))) @end example @node Transition, , Replace, Reference Implementations @section Transition This code is relevant to the discussion of @code{transfer} in @ref{Transfer}, where its specification is described in detail. When this code is evaluated on a virtual code application @code{@var{f}}, the result is the code for a transition function that takes one configuration to the next in the course of evaluating a transfer function, as specified in equations @emph{E7} to @emph{E9}. 
@cindex @code{transition} @example output_buffer = compose(left,argument) input_buffer = compose(right,compose(right,argument)) active = compose(left,compose(right,argument)) state = compose(left,active) output = compose(right,active) transition = bu(compose,refer) (hired bu(conditional,active))( (hired conditional)( constant input_buffer, bu(compose,(fan bu(hired meta,self))) (hired apply)( constant fan bu(couple,couple(output,output_buffer)), couple (fan bu(compose,couple))( couple( (hired apply)( hired, constant (state,compose(head,input_buffer))), constant compose(tail,input_buffer)), couple( (hired apply)(hired,constant(state,constant nil)), constant constant nil)))), constant compose(flat,compose(reverse,output_buffer))) @end example @node Changes, External Libraries, Reference Implementations, Top @appendix Changes This section is reserved for brief updates due to changes in the software that may be important enough to note temporarily until more thorough revisions to the document can be made. The lack of content here indicates that the current version is either completely up to date or in such a sorry state of neglect that even this section is obsolete. @node External Libraries, Copying, Changes, Top @appendix External Libraries Various functions are callable from virtual code applications by way of the @code{library} combinator as explained in @ref{Library combinator}. An expression (shown in @code{silly} syntax) of the form @code{library('foo','bar') x} applies a function named @code{'bar'} from a library named @code{'foo'} to an argument @code{x}. A brief overview of the libraries and functions can always be had by executing @example $ avram --external-libraries @end example @noindent The listing displayed by this command may show some that are not included here if this version of the documentation is not current or your installation has been locally enhanced. It may also lack some that are documented here if your installation is not fully equipped. 
Although the overview from the command line is adequate for a reminder, it is not informative enough to explain how each function should be used. The purpose of this section is to provide this information in greater detail. Some general comments are applicable to all libraries. Each library documented in this section can generate error messages in the event of exceptional conditions, that are documented individually. In addition to those, it's also possible for any library function to return error messages of @cindex unrecognized library @cindex unrecognized function name @example <'unrecognized library'> <'unrecognized @var{xxxx} function name'> @end example @noindent where @var{xxxx} is the name of a library. These indicate either that the library name is invalid, or the library name is valid but the function name is invalid, or that they're both valid but the library wasn't detected on the host when @code{avram} was compiled. A virtual code application can always avoid these errors by testing for the availability of a function using the @code{have} combinator (@ref{Have combinator}). In addition, any library function that operates on numerical values or lists thereof can return these messages in cases of invalid input. @cindex missing value @cindex invalid value @cindex bad vector specification @cindex bad matrix specification @example <'missing value'> <'invalid value'> <'bad vector specification'> <'bad matrix specification'> @end example @noindent These messages indicate that an input parameter that was required to be a valid representation of a floating point number, a vector, or a matrix was something other than that (@ref{Type Conversions}). The last could also occur if a parameter that is required to be a square matrix has unequal numbers of rows and columns. 
@menu * bes:: Bessel functions * complex:: native complex arithmetic * fftw:: fast Fourier transforms * glpk:: simplex linear programming * gsldif:: numerical differentiation * gslevu:: series acceleration * gslint:: numerical integration * harminv:: harmonic inversion * kinsol:: constrained non-linear optimization * lapack:: linear algebra * math:: native floating point arithmetic * mtwist:: random number generation * minpack:: non-linear optimization * mpfr:: arbitrary precision arithmetic * lpsolve:: mixed integer programming * rmath:: statistical and special functions * umf:: sparse matrices @end menu @node bes, complex, External Libraries, External Libraries @section @code{bes} An interface to the Bessel functions as defined in the GNU Scientific Library (gsl) is available to virtual code applications by invoking a function of the form @example library('bes',f) @end example @noindent where f is a character string identifying the Bessel function family. All functions in this library return a floating point number encoded as in @ref{math}. @menu * Bessel function calling conventions:: * Bessel function errors:: @end menu @node Bessel function calling conventions, Bessel function errors, bes, bes @subsection Bessel function calling conventions @cindex bessel functions The virtual code interface simplifies the gsl C language API by excluding the facilities for error estimates, omitting certain array valued functions, and subsuming sets of related functions within common ones where possible. The functions with names in the following group take an argument of the form @code{(n,x)}, where @code{n} identifies the member of the function family, and @code{x} is the argument to the function. 
@itemize @bullet @item @code{J} regular cylindrical Bessel functions @item @code{Y} irregular cylindrical Bessel functions @item @code{I} regular modified cylindrical Bessel functions @item @code{K} irregular modified cylindrical Bessel functions @end itemize For these functions, @code{n} can be either a natural number encoded as in @ref{Representation of Numeric and Textual Data}, or a floating point number encoded as in @ref{math}. The latter case specifies functions of a fractional order. The relevant gsl function is called based on the value and type of the parameter. Two further related families of functions follow the same calling convention. @itemize @bullet @item @code{Isc} scaled regular modified cylindrical Bessel functions @item @code{Ksc} scaled irregular modified cylindrical Bessel functions @end itemize @noindent The foregoing functions are related to those above by an exponential scale factor as documented in the gsl reference manual. Functions with names in the following group also take an argument of the form @code{(n,x)}, but are not defined for fractional orders and so require a natural number for @code{n}. @itemize @bullet @item @code{j} regular spherical Bessel functions @item @code{y} irregular spherical Bessel functions @item @code{isc} regular modified spherical Bessel functions @item @code{ksc} irregular modified spherical Bessel functions @end itemize The functions in the remaining group follow idiosyncratic calling conventions. @itemize @bullet @item @code{zJ0}, @code{zJ1} These take a natural number @code{n} and return the @code{n}th zero of the regular cylindrical Bessel function of order 0 or 1, respectively. @item @code{zJnu} This takes a pair @code{(nu,n)} where @code{nu} is the (fractional) order of a regular cylindrical Bessel function and @code{n} is a natural number. It returns the @code{n}th zero of the function.
@item @code{lnKnu} This takes a pair of floating point numbers @code{(nu,x)} where @code{nu} is the (fractional) order of an irregular modified cylindrical Bessel function and @code{x} is the argument to the function, and it returns the natural log of the function. @end itemize @node Bessel function errors, ,Bessel function calling conventions, bes @subsection Bessel function errors Memory overflows and unrecognized function names can happen as with other library interfaces. A message of @cindex bad bessel function call @example <'bad bessel function call'> @end example @noindent means that invalid input parameters were given, such as a fractional order to a function family that is defined only for natural orders. @node complex, fftw, bes, External Libraries @section @code{complex} Complex numbers are represented according to the ISO C standard as @cindex complex numbers arrays of two IEEE double precision floating point numbers of 8 bytes each, with the number representing the real part first. A small selection of operations on complex numbers is available by function calls of the form @code{library('complex',f)}. These functions are implemented by the host system's C library. One example is @code{library('complex','create')}, which takes a pair of floating point numbers @code{(@var{x},@var{y})} to a complex number whose real part is @var{x} and whose imaginary part is @var{y}. See @ref{math} for information about constructing floating point numbers. Other than that, the @code{complex} library functions @code{f} fall into three main groups, which are the real valued unary operations, the complex valued unary operations, and the complex valued binary operations. All of these operations are designated by their standard C names as documented elsewhere, such as the GNU @code{libc} reference manual, except as noted.
@table @bullet @item @asis{} real valued unary operations @example creal cimag cabs carg @end example @item @asis{} complex valued unary operations @example ccos cexp clog conj csin csqrt ctan csinh ccosh ctanh casinh cacosh catanh casin cacos catan @end example @item @asis{} complex valued binary operations @example cpow vid bus mul add sub div @end example @end table The last four correspond to the C language operators @code{*}, @code{+}, @code{-}, and @code{/} for complex numbers. The functions named @code{vid} and @code{bus} are similar to @code{div} and @code{sub}, respectively, but with the operands interchanged. That is, @example library('complex','vid') (x,y) @end example @noindent is equivalent to @example library('complex','div') (y,x) @end example All functions in this library taking complex numbers as input may also operate on real numbers, and binary operators can have either or both operands real. For real operands, a value of zero is inferred as the imaginary part. The result type of the function is the same regardless. @node fftw, glpk, complex, External Libraries @section @code{fftw} Some functions in the @code{fftw} fast Fourier transform library are @cindex Fourier transforms @cindex Hartley transforms callable by virtual code programs of the form @code{library('fftw',f)}, where @code{f} can be one of the following character strings. 
@table @asis @item @code{u_fw_dft} (uni-dimensional forward Discrete Fourier transform) @item @code{u_bw_dft} (uni-dimensional backward Discrete Fourier transform) @item @code{b_fw_dft} (bi-dimensional forward Discrete Fourier transform) @item @code{b_bw_dft} (bi-dimensional backward Discrete Fourier transform) @item @code{u_dht} (uni-dimensional Discrete Hartley transform) @item @code{b_dht} (bi-dimensional Discrete Hartley transform) @end table These stand for the discrete Fourier transform, in one dimension and two dimensions, either backward or forward, and the discrete Hartley transform in one dimension and two dimensions. The @code{fftw} library documentation (@url{http://www.fftw.org}) can give more information about the meaning of these transformations. The interface is somewhat simplified compared to the API for the @code{fftw} C library because there are no considerations of memory management or planning, nor any provision for dimensions higher than two. Furthermore, from the virtual side of the interface, these functions operate on lists rather than arrays. The one dimensional Fourier transforms take a list of complex numbers to a list of complex numbers (see @ref{complex}), and the one dimensional Hartley transforms take a list of reals to a list of reals (see @ref{math}). The two dimensional transforms are analogous but they take a matrix represented as a list of lists. Error messages pertaining to invalid input documented at the beginning of this section (@ref{External Libraries}) are relevant. Finally, unlike the native API for @code{fftw}, these transformations are scaled so that the backward transformation is the inverse of the forward, and the Hartley transformations are their own inverses (subject to roundoff error). 
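The scaling convention can be illustrated with naive transforms in Python. This is a sketch only: it does not call @code{fftw} or @code{avram}, the function names @code{dft} and @code{dht} are hypothetical, and the symmetric factor of 1/sqrt(n) is an assumption, chosen because it makes the Hartley transform its own inverse. The text above does not say how @code{avram} distributes the scale factor between the forward and backward Fourier transforms.

```python
import cmath
import math


def dft(xs, sign):
    # Naive O(n^2) discrete Fourier transform scaled by 1/sqrt(n).
    # sign = -1 gives the forward transform, sign = +1 the backward one.
    n = len(xs)
    scale = 1.0 / math.sqrt(n)
    return [scale * sum(x * cmath.exp(sign * 2j * math.pi * t * k / n)
                        for t, x in enumerate(xs))
            for k in range(n)]


def dht(xs):
    # Naive discrete Hartley transform with kernel cos + sin,
    # also scaled by 1/sqrt(n).
    n = len(xs)
    scale = 1.0 / math.sqrt(n)
    return [scale * sum(x * (math.cos(2 * math.pi * t * k / n)
                             + math.sin(2 * math.pi * t * k / n))
                        for t, x in enumerate(xs))
            for k in range(n)]


signal = [1.0, 2.0, 0.5, -1.0]
round_trip = dft(dft(signal, -1), +1)   # forward, then backward
hartley_twice = dht(dht(signal))        # Hartley applied twice
```

With this scaling, both @code{round_trip} and @code{hartley_twice} reproduce @code{signal} up to roundoff error, which is the property claimed for the virtual code interface.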
@node glpk, gsldif, fftw, External Libraries @section @code{glpk} The @code{glpk} library (@url{ftp://ftp.gnu.org/pub/gnu/glpk/}) solves linear programming problems by either the simplex algorithm or @cindex linear programming an interior point method. The API for C client programs involves a complicated protocol with many optional settings, which is simplified for the virtual machine interface. Specifically, the library gives a choice of only two functions, which can be expressed in the following forms. @example library('glpk','simplex') library('glpk','interior') @end example @noindent These functions have the same calling convention and should return generally the same output for identical inputs, but differences in performance, precision, and maybe correctness can be expected. The remainder of this section applies to both of them. @menu * glpk input parameters:: * glpk output:: * glpk errors:: * Additional glpk notes:: @end menu @node glpk input parameters, glpk output, glpk, glpk @subsection @code{glpk} input parameters The argument must be a triple of the form @code{(@var{c},(@var{m},@var{y}))}, subject to the following specification. @itemize @bullet @item @var{c} is a list of cost function coefficients as floating point numbers (see @ref{math}). There should be one item of @var{c} for each variable in the linear programming problem (note that there is no additive constant, which would require one extra). The interpretation of @var{c} is that an assignment of non-negative values to the variables @var{x} is sought to make the vector inner product @var{c} @var{x} as small as possible. @item @var{m} is a sparse matrix represented as a list of triples in the form @cindex sparse matrix @example <((@var{i},@var{j}),@var{a})...> @end example @noindent where @var{i} and @var{j} are row and column indices as natural numbers starting from 0 and @var{a} is a non-zero floating point number.
The presence of a triple @code{((@var{i},@var{j}),@var{a})} in the list indicates that the @var{i},@var{j}-th entry in the matrix has a value of @var{a}. Missing combinations of @var{i} and @var{j} indicate that the corresponding entry is zero. The interpretation of @var{m} is that together with @var{y} it specifies a system of equations the variables in the solution @var{x} must satisfy simultaneously, as explained below. @item @var{y} is a list of floating point numbers, with one number for each distinct value of @var{i} in @var{m}, above, needed to complete the equations. The interpretation of @var{y} is that in matrix notation, the condition @var{m} @var{x} = @var{y} must be met by any acceptable solution @var{x}. To put it another way, for each distinct value of @var{i}, the @var{i}-th item of @var{y} has to equal the sum over all @var{j} of @var{xj} @var{a}, where @var{a} is the real number appearing in the triple @code{((@var{i},@var{j}),@var{a})} in @var{m}, if any, and @var{xj} is the @var{j}-th variable of the solution. @end itemize @node glpk output, glpk errors, glpk input parameters, glpk @subsection @code{glpk} output If a solution meeting the constraints is found, it is returned as a list of pairs of the form @code{<(@var{i},@var{x})...>}, where each @var{i} is a natural number and each @var{x} is a floating point number giving the value obtained for the @var{i}-th variable numbered from zero. Any values of @var{i} that are omitted from the list indicate that the corresponding variable has a value of zero. If no solution is found due to infeasibility or because @code{glpk} just didn't find one, an empty list is returned. The lack of a solution is not treated as an exceptional condition. 
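The relationship between the input triple and the output list can be checked mechanically. The following Python sketch solves nothing and does not call @code{glpk}; it only models the data layout with ordinary Python lists and tuples (an assumption for illustration, since the real values are virtual machine lists of natural and floating point numbers). The helper names are hypothetical. It expands a sparse solution list, evaluates the cost @var{c} @var{x}, and tests the conditions @var{m} @var{x} = @var{y} and non-negativity.

```python
def expand_solution(pairs, n_variables):
    # Turn a list of (i, x) pairs into a dense list; omitted variables are zero.
    xs = [0.0] * n_variables
    for i, x in pairs:
        xs[i] = x
    return xs


def cost(c, xs):
    # Inner product of the cost coefficients with the solution.
    return sum(ci * xi for ci, xi in zip(c, xs))


def satisfies(m, y, xs, tolerance=1e-9):
    # Check m x = y for sparse m given as [((i, j), a), ...], and x >= 0.
    row_sums = [0.0] * len(y)
    for (i, j), a in m:
        row_sums[i] += a * xs[j]
    rows_ok = all(abs(s - yi) <= tolerance for s, yi in zip(row_sums, y))
    return rows_ok and all(x >= 0.0 for x in xs)


# Minimise x0 + 2 x1 subject to x0 + x1 = 4 and x0 - x2 = 1.
c = [1.0, 2.0, 0.0]
m = [((0, 0), 1.0), ((0, 1), 1.0), ((1, 0), 1.0), ((1, 2), -1.0)]
y = [4.0, 1.0]

candidate = [(0, 4.0), (2, 3.0)]   # x1 is omitted and therefore zero
xs = expand_solution(candidate, len(c))
```

Here @code{candidate} is a hand-made list in the output format rather than anything returned by the library; it satisfies both equations and has a cost of 4.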
@node glpk errors, Additional glpk notes, glpk output, glpk @subsection @code{glpk} errors Possible error messages are @example <'bad glpk specification'> @end example @noindent which means that the input did not conform to the description given above, and @example <'memory overflow'> @end example It is not considered an exceptional condition for no feasible solution to exist, and in that case an empty list is returned. The @code{glpk} documentation gives no assurance as to the correctness of reported solutions, so the user should also take the possibility of incorrect results into account. @node Additional glpk notes, , glpk errors, glpk @subsection Additional @code{glpk} notes A sparse matrix representation of @var{m} is used because in practice @cindex sparse matrix most linear programming problems have very sparse systems of equations. Only the constraint of non-negativity is admitted. Other @cindex constraints constraints such as upper bounds must be effected through a change of variables if required. The @code{glpk} library has a small memory leak, which @code{avram} corrects by methods described in @ref{Memory leaks}. 
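One way to express an upper bound within this restricted form is to introduce a slack variable: a bound of @var{u} on the @var{j}-th variable becomes an equation stating that the @var{j}-th variable plus a new non-negative variable equals @var{u}. Strictly speaking this adds a variable and an equation rather than changing variables, so it is offered only as one possible reading of the remark above. The Python sketch below uses a hypothetical helper, @code{add_upper_bound}, operating on plain lists and tuples laid out as in @ref{glpk input parameters}; it is not part of @code{avram} or @code{glpk}, and it assumes the rows of @var{m} are numbered consecutively from 0 so that the next free row index is the length of @var{y}.

```python
def add_upper_bound(c, m, y, j, u):
    # Return a new (c, m, y) in which variable j is also bounded above by u.
    slack = len(c)        # index of the new slack variable
    row = len(y)          # index of the new equation
    new_c = c + [0.0]     # the slack variable does not affect the cost
    new_m = m + [((row, j), 1.0), ((row, slack), 1.0)]
    new_y = y + [u]
    return new_c, new_m, new_y


# Maximise x0 (that is, minimise -x0) subject to x0 <= 5.
c, m, y = add_upper_bound([-1.0], [], [], 0, 5.0)
```

The resulting triple has two variables and one equation, and any non-negative solution of it keeps the original variable at or below the bound.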
@node gsldif, gslevu, glpk, External Libraries @section @code{gsldif} Numerical differentiation of a real valued function of a single real @cindex numerical differentiation variable can be done by a library function of the form @example library('gsldif',method) @end example @noindent where @code{method} is one of @itemize @bullet @item @code{'backward'} @item @code{'central'} @item @code{'forward'} @item @code{'t_backward'} @item @code{'t_central'} @item @code{'t_forward'} @end itemize @menu * gsldif input parameters:: * gsldif output:: * gsldif exceptions:: * Additional gsldif notes:: @end menu @node gsldif input parameters, gsldif output, gsldif, gsldif @subsection @code{gsldif} input parameters The argument to the functions with mnemonics of @code{backward}, @code{central} or @code{forward} is a pair @code{(@var{f},@var{x})}, where @var{f} is the virtual machine code for a real valued function of a real variable, and @var{x} is the input to @var{f} where the derivative is sought. Real numbers are represented according to @ref{math}. The argument to the functions with mnemonics of @code{t_backward}, @code{t_central} or @code{t_forward} is a pair @code{((@var{f},@var{t}),@var{x})}, where @var{f} and @var{x} are as above, and @var{t} is a tolerance represented as a floating point number. The tolerance is passed through to the GNU Scientific library (GSL) differentiation routines. When no tolerance is specified, the default is @code{1.0e-8}. @node gsldif output, gsldif exceptions, gsldif input parameters, gsldif @subsection @code{gsldif} output The result returned by @code{library('gsldif',method) (f,x)} or @code{library('gsldif',method) ((f,t),x)} is an approximation of the first derivative of @var{f} evaluated at @var{x}. The result is obtained by the one of the GNU Scientific Library (GSL) @cindex GNU Scientific Library functions for numerical differentiation that matches the virtual code function name. These functions are documented in the GSL reference manual. 
The three methods should have approximately the same results but may differ in numerical properties. @node gsldif exceptions, Additional gsldif notes, gsldif output, gsldif @subsection @code{gsldif} exceptions An error message of @cindex bad derivative specification @example <'bad derivative specification'> @end example @noindent will be returned if either the whole argument, @var{f}, or @var{x} is @code{nil}. Any error message caused by the evaluation of @var{f} will propagate to the result. @node Additional gsldif notes, , gsldif exceptions, gsldif @subsection Additional @code{gsldif} notes The function @var{f} may be any expressible virtual machine code function that takes a real argument to a real result, including one that uses other library functions. However, if @var{f} passes functions to other library functions as arguments, there is a constant overhead in stack space for each level, and a remote possibility of a @cindex segmentation fault segmentation fault if they are very deeply nested. Numerical instability is an issue for higher derivatives (i.e., differentiating a function that is obtained by differentiating another function). Some experimentation with larger tolerances may be needed. @node gslevu, gslint, gsldif, External Libraries @section @code{gslevu} This library exports a pair of functions of the form @example library('gslevu','accel') library('gslevu','utrunc') @end example @noindent that take a list of real numbers @var{x} to a pair of real numbers @code{(@var{s},@var{e})}. The idea is that @var{x} represents the first few terms of an infinite @cindex infinite series series whose sum converges, but only very slowly. The functions @cindex convergence extrapolate an estimate of the infinite summation by the Levin @cindex Levin u-transform u-transform as documented in the GNU Scientific Library reference manual. For well behaved series, considerably fewer terms are needed for an accurate estimate than a direct summation would require.
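The benefit of series acceleration can be seen in a small experiment. The Python sketch below deliberately uses Aitken's delta-squared process, a much simpler technique than the Levin u-transform used by the library, so it illustrates the idea of extrapolating from a few terms and not the algorithm behind @code{accel} or @code{utrunc}. All names are hypothetical, and no error estimate comparable to @var{e} is produced.

```python
import math


def partial_sums(terms):
    # Running totals of a list of terms.
    total = 0.0
    sums = []
    for t in terms:
        total += t
        sums.append(total)
    return sums


def aitken(sums):
    # One pass of Aitken's delta-squared process over a list of partial sums.
    out = []
    for s0, s1, s2 in zip(sums, sums[1:], sums[2:]):
        denominator = (s2 - s1) - (s1 - s0)
        if denominator == 0.0:
            out.append(s2)
        else:
            out.append(s2 - (s2 - s1) ** 2 / denominator)
    return out


# First ten terms of the Leibniz series for pi: 4 - 4/3 + 4/5 - ...
terms = [4.0 * (-1) ** k / (2 * k + 1) for k in range(10)]
direct = partial_sums(terms)[-1]
accelerated = aitken(partial_sums(terms))[-1]
```

With ten terms the direct sum is still wrong in the first decimal place, whereas the extrapolated value agrees with pi to about three decimal places.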
@menu * gslevu calling conventions:: * gslevu exceptions:: @end menu @node gslevu calling conventions, gslevu exceptions, gslevu, gslevu @subsection @code{gslevu} calling conventions The input to either of these functions is a list of real numbers represented as explained in @ref{math}. The result is a pair @code{(@var{s},@var{e})} holding an estimate of the sum, @var{s}, and an estimate of the error in the sum, @var{e}, each being a real number. Both functions compute the same sum, @var{s}, but the @code{utrunc} function is faster and @cindex infinite sum more memory efficient, using a less trustworthy method of estimating the error. @node gslevu exceptions, , gslevu calling conventions, gslevu @subsection @code{gslevu} exceptions If an empty list is passed as a parameter to a function in this library, an error message of @code{<'empty gslevu sequence'>} is returned. If there is insufficient memory, an error message of @code{<'memory overflow'>} is returned. Beyond these, the only relevant exceptional conditions are the general ones documented at the beginning of @ref{External Libraries}. @node gslint, harminv, gslevu, External Libraries @section @code{gslint} An interface to a selection of numerical integration routines from the @cindex numerical integration GNU Scientific Library is provided by functions of the form @example library('gslint',q) @end example @noindent where q can be one of @code{'qng'}, @code{'qng_tol'}, @code{'qagx'}, @code{'qagx_tol'}, @code{'qagp'}, or @code{'qagp_tol'}.
@menu * gslint input parameters:: * gslint output:: * gslint exceptions:: * Additional gslint notes:: @end menu @node gslint input parameters, gslint output, gslint, gslint @subsection @code{gslint} input parameters The library functions @code{qng} and @code{qagx} take an argument of the form @code{(@var{f},(@var{a},@var{b}))}, where @var{f} is a function to be integrated, @var{a} is the lower limit, and @var{b} is the upper limit, both limits being floating point numbers as in @ref{math}. The @code{qng_tol} and @code{qagx_tol} functions take an argument of @cindex tolerance the form @code{((@var{f},@var{t}),(@var{a},@var{b}))}, where @var{f}, @var{a}, and @var{b} are as above, and @var{t} is a specified tolerance. The @code{qagp} and @code{qagp_tol} functions take arguments of the form @code{(@var{f},@var{p})} and @code{((@var{f},@var{t}),@var{p})}, respectively, where @var{f} and @var{t} are as above, and @var{p} is an ordered list of real numbers specifying the limits of integration along with arbitrarily many intervening breakpoints. The integrand @var{f} is expressed in virtual machine code, and takes a single real argument to a real result. The argument and result of @var{f} are required to be floating point numbers as described in @ref{math}. Any expressible function of this type is acceptable, even one defined in terms of other integrals, so that a double or triple integral can be expressed easily, albeit at a high computational cost. However, a constant overhead in stack space is required for each nested library function call, and there is currently no mechanism to @cindex segmentation fault prevent segmentation faults due to a stack overflow. When no tolerance is specified, as with @code{qng}, @code{qagx}, and @code{qagp}, the tightest attainable tolerance is chosen by default, currently @code{2e-14}, in order to find the most accurate result possible.
A selection of progressively looser tolerances is tried automatically if the tightest one is not successful, stopping when either a solution is found or ten orders of magnitude are covered. If a tolerance is explicitly specified, as with @code{qng_tol}, @code{qagx_tol} or @code{qagp_tol}, only that tolerance is tried. @node gslint output, gslint exceptions, gslint input parameters, gslint @subsection @code{gslint} output In all cases, if no exception occurs, the result returned is an approximation of the integral of @var{f} over the interval from @var{a} to @var{b} or from the first item of @var{p} to the last. Results may differ in numerical properties depending on the integration method and the tolerance used. @itemize @bullet @item The @code{qagp}* and @code{qagx}* functions use an adaptive algorithm, @cindex adaptive integration @cindex non-adaptive integration whereas the @code{qng}* functions use a faster non-adaptive algorithm suitable only for smooth integrands. @item Faster and maybe more accurate results are obtained for discontinuous or non-differentiable integrands by the @code{qagp}* integration methods if the interior points in @var{p} are chosen to coincide with the discontinuities or corners. @item Larger tolerances are conducive to faster but less accurate results in most cases. @end itemize @node gslint exceptions, Additional gslint notes, gslint output, gslint @subsection @code{gslint} exceptions If an argument of an inappropriate form can be detected (such as an empty pair or one without floating point numbers), it causes an error message to be returned saying @cindex bad integral specification @example <'bad integral specification'> @end example @noindent Error messages signalled by the integrand @var{f} may also be reported, as well as any message returned by @code{gsl_strerror}. A typical cause for a @code{gsl_strerror} message would be an explicitly specified tolerance that is too tight. 
An error message of @cindex slow convergence @example <'slow convergence'> @end example @noindent is returned in the event of excessively many function evaluations (currently 3600 at each tolerance level). @node Additional gslint notes, , gslint exceptions, gslint @subsection Additional @code{gslint} notes The @code{qagx}* functions subsume the GSL variants @code{qags}, @code{qagiu}, @code{qagil}, and @code{qagi} for finite, semi-infinite, and infinite intervals, which are selected as appropriate based on the @cindex improper integrals limits of integration @var{a} and @var{b}. The @code{qagp} function reverts to the @code{qagx} function if there are only two points given in @var{p}. Fewer than two will cause an error. The library interface code relies on the standard @code{setjmp} @cindex setjmp utility found in the system header file @code{setjmp.h} to break out of integrals that don't converge after excessively many function evaluations. Non-termination has been an issue in the past with GSL integration routines for very badly behaved integrands, and the API provides no documented means for the user supplied integrand function to request a halt. Although it is meant to be standard, a host without @code{setjmp} will cause @code{avram} to be configured to abort the application with an error message in the event of non-convergence. This behavior is considered preferable to the alternative of non-termination. Usually an effective workaround in such cases is to specify a sufficiently loose tolerance explicitly by using one of the *@code{_tol} library functions. @node harminv, kinsol, gslint, External Libraries @section @code{harminv} The @code{harminv} library decomposes a complex valued function of a @cindex harminv discrete variable into a sum of decaying sinusoids given a finite sample. It uses a method with better accuracy and convergence than Fourier analysis or least squares curve fitting.
More information @cindex least squares @cindex Fourier transforms is available at @url{http://ab-initio.mit.edu/wiki/index.php/Harminv}. @menu * harminv input parameters:: * harminv output:: * harminv exceptions:: * Additional harminv notes:: @end menu @node harminv input parameters, harminv output, harminv, harminv @subsection @code{harminv} input parameters The virtual machine interface to the @code{harminv} library provides only a single function, callable as @example library('harminv','hsolve') @end example @noindent The input to this function is an operand of the form @example (signal,(fmin,fmax),nf) @end example @noindent where @itemize @bullet @item @code{signal} is a list of complex numbers containing samples of the function to be decomposed at equal time steps (@ref{complex} and @ref{Representation of Numeric and Textual Data}). @item @code{fmin} and @code{fmax} are the band limits expressed in units of inverse time steps as floating point numbers (@ref{math}). @item @code{nf} is the number of spectral basis functions expressed as a natural (@ref{Representation of Numeric and Textual Data}). @end itemize @noindent If a value of 0 is specified for @code{nf} a default value of @example min(300, (fmax - fmin) * n * 1.1) @end example @noindent is used, where @code{n} is the length of @code{signal}. The computation time increases cubically with @code{nf}. @node harminv output, harminv exceptions, harminv input parameters, harminv @subsection @code{harminv} output The result returned by a call to @example library('harminv','hsolve') @end example @noindent with valid input (@ref{harminv input parameters}) is a list of similar tuples of the form @example <(amplitude,frequency,decay,quality,error)...> @end example @noindent with all members being real valued except for the amplitudes, which are complex. 
Each tuple describes a function of the form @example f(t) = A * sin (frequency * t + P) * exp (-decay * t) @end example @noindent such that the summation of these functions approximates the original given signal (@ref{harminv input parameters}). The real amplitude @code{A} and phase @code{P} are given by the modulus and argument of the complex amplitude returned in the result, @example A = library('complex','cabs') amplitude P = library('complex','carg') amplitude @end example @noindent in terms of the complex library functions (@ref{complex}). The error values are measures of the goodness of fit, and the quality factors are defined as @example quality = (pi * |frequency| / decay) @end example @noindent It may be useful in some applications to ignore components with quality factors outside of a certain range. @node harminv exceptions, Additional harminv notes, harminv output, harminv @subsection @code{harminv} exceptions Various exceptional conditions are possible with the @code{harminv} library interface, and one of the following messages could be returned. Each of them has the form of a list containing a single character string. @itemize @bullet @item @code{unrecognized harminv function name} is reported in case of a function call of the form @code{library('harminv',f)} where @code{f} is anything other than the character string @code{'hsolve'}, this being the only function in the library. @item @code{bad harminv function call} is reported if the input parameters don't meet the specifications described in @ref{harminv input parameters}, or if @code{fmin} is greater than @code{fmax}. @item @code{bad vector specification} could be the result of a list of real numbers rather than complex numbers being passed as a @code{signal}. Real numbers can be converted to complex numbers using the @code{create} function from the @code{complex} library (@ref{complex}). 
@item @code{memory overflow} can occur if @code{avram} is operating very close to the limit of host memory, or perhaps if infeasibly large values are passed as @code{nf}. @item @code{counter overflow} is similar to a memory overflow. @end itemize @node Additional harminv notes, , harminv exceptions, harminv @subsection Additional @code{harminv} notes The @code{harminv} library interface requires the @code{harminv} and @code{lapack} libraries to be installed on the host system, and also requires standard complex number support from the system's C library. The author's installation of @code{avram} has been compiled against the Debian @code{harminv} development library package, which at this writing is unmaintained and is missing the necessary header file @file{harminv-int.h}, without which compilation of files including @file{harminv.h} fails. Some headers from @file{harminv.h} have been copied directly into @file{avram-x.x.x/src/harminv.c} under the @code{avram} source tree to avoid this dependence, so that @code{avram} will compile correctly on a Debian system. These may need to be updated if necessary to track the @code{harminv} source. @node kinsol, lapack, harminv, External Libraries @section @code{kinsol} The @code{kinsol} library (@url{http://www.llnl.gov/CASC/sundials/}) contains sophisticated routines for non-linear optimization and @cindex optimization @cindex non-linear optimization constrained non-linear optimization, some of which are available to @cindex constrained non-linear optimization virtual code applications by way of functions expressed as shown. @example library('kinsol',k) @end example @noindent The function name @code{k} is a string of the form @code{'@var{xy}_@var{zzzzz}'}. The field @var{zzzzz} specifies the optimization algorithm, which can be one of @code{dense}, @code{gmres}, @code{bicgs}, or @code{tfqmr}, following the names used by the API for @code{kinsol} in C.
The field @var{y} determines the way gradients are obtained, which is either @code{j} for a user supplied Jacobian, or @code{d} for finite differences computed by @code{kinsol}. The remaining field @var{x} is either @code{c} for constrained optimization, or @code{u} for unconstrained. Hence, the whole function name can be one of sixteen possible alternatives. @example cd_dense cd_gmres cd_bicgs cd_tfqmr ud_dense ud_gmres ud_bicgs ud_tfqmr cj_dense cj_gmres cj_bicgs cj_tfqmr uj_dense uj_gmres uj_bicgs uj_tfqmr @end example More specific information about the optimization algorithms can be found in the @code{kinsol} documentation at the above address. Different algorithms may perform better on different problems. @menu * kinsol input parameters:: * kinsol output:: * kinsol exceptions:: * Additional kinsol notes:: @end menu @node kinsol input parameters, kinsol output, kinsol, kinsol @subsection @code{kinsol} input parameters Functions whose names are of the form @code{@var{x}d_@var{zzzzz}} take an argument of the form @code{(@var{f},(@var{i},@var{o}))}, and functions whose names are of the form @code{@var{x}j_@var{zzzzz}} take an argument of the form @code{((@var{f},@var{j}),(@var{i},@var{o}))}. The parameters have these interpretations. @itemize @bullet @item @var{f} is a function to be optimized, expressed in virtual machine code. It takes a list of real numbers as input and returns a list of real numbers as output. The numbers must be in floating point format as described in @ref{math}. @item @var{j} is a function in virtual machine code that computes the Jacobian or partial derivatives of @var{f} for a given list of input @cindex Jacobian numbers. The exact calling convention for @var{j} depends on the optimization algorithm selected, as explained below. @item @var{i} is a list of real numbers suitable as an input for @var{f}. 
The exact values of the numbers in @var{i} are not crucial but the length of @var{i} is taken as an indication of the required length for any input list to @var{f}. In the case of constrained optimization problems (i.e., functions with names beginning with @code{c}), @var{i} must consist entirely of non-negative numbers. @item @var{o} is a list of numbers indicating the ``optimal'' output from @var{f} in the sense described below (@ref{kinsol output}). Its length is taken to indicate the usual length of an output returned by @var{f}. @end itemize If the optimization problem is being solved by either the @code{cj_dense} or the @code{uj_dense} method, the Jacobian parameter @var{j} is expected to take a list @var{v} of real numbers the length of @var{i} as input and return a list of lists of reals as output. The numbers are represented as described in @ref{math}. The outer list in the output from @var{j} is required to be the length of @var{o}, while each inner list is required to be the length of @var{i}. The output from @var{j} is interpreted as a matrix of the form described in @ref{Two dimensional arrays}. The entry in row @var{m} and column @var{n} is the partial derivative (evaluated at @var{v}) of the @var{m}-th component of the output of @var{f} with respect to the @var{n}-th item of the input list. For optimization problems being solved by the methods of @code{@var{x}j_gmres}, @code{@var{x}j_bicgs}, or @code{@var{x}j_tfqmr}, (i.e., where @var{x} is either @code{c} or @code{u}) the Jacobian function @var{j} follows a different convention that is meant to be more memory efficient. Given an argument of the form @code{(@var{m},@var{v})}, it returns only the @var{m}-th row of the matrix described above instead of the whole thing. The parameter @var{m} is a natural number less than the length of @var{o}, and @var{v} is a list of real numbers the length of @var{i} the same as above.
The number @var{m} is encoded as described in @ref{Representation of Numeric and Textual Data}. @node kinsol output, kinsol exceptions, kinsol input parameters, kinsol @subsection @code{kinsol} output The @code{kinsol} functions attempt to search the domain of @var{f} for a vector @var{v} the length of @var{i} to satisfy @code{@var{f}(@var{v}) = @var{o}} as closely as possible. In the case of constrained optimization, (i.e., functions whose names begin with @code{c}), only non-negative numbers are acceptable in @var{v}. The search for @var{v} will start in the vicinity of @var{i}. The value of @var{i} will therefore determine a unique solution if multiple solutions exist, and will save time if it is near a solution. In some cases when a solution can't be found due to non-convergence, @cindex non-convergence an empty list is returned. Non-convergence is not considered an exceptional condition. In all other cases where no exception occurs, the output from a @code{kinsol} function will be the list @var{v} of real numbers satisfying @code{@var{f}(@var{v}) = @var{o}} to the best possible tolerance. @cindex tolerance @node kinsol exceptions, Additional kinsol notes, kinsol output, kinsol @subsection @code{kinsol} exceptions @itemize @bullet @item Any error messages that may be generated in the course of evaluating the functions @var{f} and @var{j} will propagate to the result returned by the @code{kinsol} library functions. 
@item If there is insufficient memory to complete any operation, the result is a message of @example <'memory overflow'> @end example @item If the argument to the library function (i.e., @code{(@var{f},(@var{i},@var{o}))} or @code{((@var{f},@var{j}),(@var{i},@var{o}))}) fails to meet the required specifications in a detectable way, the result will be a message of @cindex bad kinsol specification @example <'bad kinsol specification'> @end example @item Any status returned by any @code{kinsol} API functions other than success or one of several types of non-convergence results in a message of @example <'kinsol error'> @end example @end itemize @node Additional kinsol notes, , kinsol exceptions, kinsol @subsection Additional @code{kinsol} notes When a user supplied Jacobian function @var{j} is specified, the @cindex Jacobian solution is likely to be found faster and more accurately. The Jacobian should be given if an analytical form for @var{f} is known, from which the Jacobian can be obtained easily by partial differentiation. If the Jacobian is unavailable, a finite difference method implemented internally by @code{kinsol} is used as a substitute and will usually yield acceptable results. Tolerances are not explicitly specified on the virtual side of the interface although the native @code{kinsol} API requires them. A range of tolerances over ten orders of magnitude is automatically tried before giving up. Similarly to the @code{glpk} and @code{lpsolve} library interfaces (@ref{glpk} and @ref{lpsolve}), the only expressible constraint through @cindex constraints the virtual code interface is that all variables are non-negative. Arbitrary upper and lower bounds can be simulated by appropriate variable substitutions in the formulation of the problem. 
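One possible form of such a substitution is sketched below in Python rather than in virtual code. The helper names @code{bounded_from_nonneg}, @code{nonneg_from_bounded}, and @code{substitute} are hypothetical, and the particular mapping is an assumption chosen for illustration, not something prescribed by @code{kinsol} or by @code{avram}. A non-negative variable u is mapped to lo + (hi - lo) * u / (1 + u), which covers the interval from lo up to but not including hi. A lower bound alone needs only lo + u.

```python
def bounded_from_nonneg(u, lo, hi):
    """Map a non-negative u into the half-open interval [lo, hi)."""
    return lo + (hi - lo) * u / (1.0 + u)

def nonneg_from_bounded(x, lo, hi):
    """Inverse map, e.g. to convert a bounded initial guess (requires x < hi)."""
    return (x - lo) / (hi - x)

def substitute(f, bounds):
    """Wrap f so that it accepts non-negative variables instead of bounded ones."""
    def g(us):
        xs = [bounded_from_nonneg(u, lo, hi) for u, (lo, hi) in zip(us, bounds)]
        return f(xs)
    return g

# Hypothetical system with x0 restricted to [-1, 1) and x1 to [2, 5).
def f(xs):
    return [xs[0] ** 2 + xs[1], xs[0] - xs[1]]

bounds = [(-1.0, 1.0), (2.0, 5.0)]
g = substitute(f, bounds)

# A bounded initial guess (0.5, 2.0) expressed in the non-negative variables.
u0 = [nonneg_from_bounded(x, lo, hi) for x, (lo, hi) in zip([0.5, 2.0], bounds)]
y0 = g(u0)  # same value as f([0.5, 2.0])
```

The wrapped function @code{g} would be the one given to a constrained solver, and any solution found in terms of u is mapped back through @code{bounded_from_nonneg} to recover the bounded variables.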
The @code{kinsol} library natively requires a system function @var{f} with equally many inputs as outputs, and will search only for the input associated with an output vector of all zeros, but the virtual code interface relaxes these requirements by allowing a function that transforms between lists of unequal lengths, and will search for the input of @var{f} causing it to match any given ``optimal'' output @var{o}. These effects are achieved by padding the shorter of the two vectors transparently and subtracting the specified optimum from the result. The @code{kinsol} library can be configured to use single precision, double precision, or extended precision arithmetic, but only a double precision configuration is compatible with @code{avram}. This condition is checked when @code{avram} is configured and it will not interface with alternative @code{kinsol} configurations. The @code{kinsol} library has some more advanced features to which this interface doesn't do justice, such as preconditioning, scaling, solution of systems with band limited Jacobians, and concurrent computation. @node lapack, math, kinsol, External Libraries @section @code{lapack} An arsenal of weapons grade linear algebra functions from the @cindex Fortran @code{LAPACK} Fortran library is accessible to virtual code @cindex linear algebra applications through library calls of the form @example library('lapack',f) @end example Each library function @code{f} invokes a @code{LAPACK} function of the same name, but the calling conventions on the virtual side are an artifact of the interface requiring their own documentation. Some functions that are part of @code{LAPACK} are not described here (mostly the so called computational and auxiliary routines, and @cindex single precision anything in single precision), because they are not accessible by the virtual code interface. 
@menu * lapack calling conventions:: * lapack exceptions:: * Additional lapack notes:: @end menu @node lapack calling conventions, lapack exceptions, lapack, lapack @subsection @code{lapack} calling conventions A table describing the inputs and outputs to the @code{lapack} library functions listed by their function names is given in this section. Some general points related to most of the functions are mentioned first. @itemize @bullet @item References to vectors, matrices, and packed matrices should be understood as their list representations explained in @ref{Type Conversions}. Although @code{LAPACK} internally uses column order arrays, the virtual code library interface exhibits a matrix as a list of lists with one inner list for each row. @item Some functions require a symmetric matrix as an input parameter. Any @cindex symmetric matrices input parameter that is required to be a symmetric matrix may be specified optionally either in square form or in triangular form as @cindex triangular matrices described in @ref{Two dimensional arrays}. If a square matrix form is used, symmetry is not checked and the lower triangular portion is ignored. @item Some function names are listed in pairs differing only in the first letter. Function names beginning with @code{d} pertain to vectors or matrices of real numbers (@ref{math}), and function names beginning with @code{z} pertain to complex numbers (@ref{complex}). The specifications of similarly named functions are otherwise identical. @end itemize @table @asis @item @code{dgesvx} @item @code{zgesvx} These library functions take a pair @code{(@var{a},@var{b})} where @var{a} is an @var{n} by @var{n} matrix and @var{b} is a vector of length @var{n}. If @var{a} is non-singular, they return a vector @var{x} such that @code{@var{a} @var{x} = @var{b}}. Otherwise they return an empty list. 
@item @code{dgelsd} @item @code{zgelsd} These functions generalize those above by taking a pair @code{(@var{a},@var{b})} where @var{a} is an @var{m} by @var{n} matrix and @var{b} is a vector of length @var{m}, with @var{m} greater than @var{n}. They return a vector @var{x} of length @var{n} to minimize the magnitude of @code{@var{b} - @var{a} @var{x}}. @item @code{dgesdd} @item @code{zgesdd} These functions take a list of @var{m} time series (i.e., vectors) each of length @var{n} and return a list of basis vectors each of length @var{n}. The basis vectors span the set of time series in the @cindex singular value decomposition given list according to the singular value decomposition (i.e., with the basis vectors forming a series in order of decreasing significance). The number of basis vectors is at most @code{@var{min}(@var{m},@var{n})} but could be less if the input time series aren't linearly independent. An empty list could be returned due to lack of convergence. @item @code{dgeevx} @item @code{zgeevx} These functions take a non-symmetric square matrix and return a pair @code{(@var{e},@var{v})} where @var{e} is a list of eigenvectors and @var{v} is a list of eigenvalues, both of which will @cindex eigenvectors contain only complex numbers. (N.B., both functions return complex results even though @code{dgeevx} takes real input.) They could also return @code{nil} due to a lack of convergence. @item @code{dpptrf} @item @code{zpptrf} These functions take a symmetric square matrix and return one of the Cholesky factors. The Cholesky factors are a pair @cindex Cholesky decomposition of triangular matrices, each equal to the transpose of the other, whose product is the original matrix. @itemize @bullet @item If the input matrix is specified in lower triangular form, the lower triangular Cholesky factor is returned. @item If the input matrix is specified in square or upper triangular form, the upper triangular Cholesky factor is returned. 
@item In either case, the result is returned in triangular form. @end itemize @item @code{dggglm} @item @code{zggglm} The input is a pair of matrices and a vector @cindex generalized least squares @cindex least squares @code{((@var{A},@var{B}),@var{d})}. The output is a pair of vectors @code{(@var{x},@var{y})} satisfying @code{@var{A}@var{x} + @var{B}@var{y} = @var{d}} for which the magnitude of @var{y} is minimal. The dimensions all have to be consistent, which means the number of rows in @var{A} and @var{B} is the length of @var{d}, the number of columns in @var{A} is the length of @var{x}, and the number of columns in @var{B} is the length of @var{y}. @item @code{dgglse} @item @code{zgglse} The input is of the form @code{((@var{A},@var{c}),(@var{B},@var{d}))} where @var{A} and @var{B} are matrices and @var{c} and @var{d} are vectors. The output is a vector @var{x} to minimize the magnitude of @code{@var{A}@var{x} - @var{c}} subject to the constraint that @code{@var{B}@var{x} = @var{d}}. The dimensions have to be consistent, which means @var{A} has @var{m} rows, @var{c} has length @var{m}, @var{B} has @var{p} rows, @var{d} has length @var{p}, both @var{A} and @var{B} have @var{n} columns, and the output @var{x} has length @var{n}. It is also a requirement that @code{@var{p} <= @var{n} <= @var{m} + @var{p}}. @item @code{dsyevr} This function takes a symmetric real matrix and returns a pair @code{(@var{e},@var{v})} where @var{e} is a list of eigenvectors and @var{v} is a list of eigenvalues. Both contain only real numbers. This function is fast and accurate but not as storage efficient as possible. If there is insufficient memory, it automatically invokes @code{dspev}. @item @code{dspev} This function takes a symmetric real matrix and returns a pair @code{(@var{e},@var{v})} where @var{e} is a list of eigenvectors and @var{v} is a list of eigenvalues. Both contain only real numbers. It uses roughly half the memory of @code{dsyevr} but is not as fast or accurate. 
@item @code{zheevr} This function takes a complex Hermitian matrix and returns a pair @cindex Hermitian matrix @code{(@var{e},@var{v})} where @var{e} is a list of eigenvectors and @var{v} is a list of eigenvalues. The eigenvectors are complex but the eigenvalues are real. @itemize @bullet @item A Hermitian matrix has @var{Aij} equal to the complex conjugate of @var{Aji}. @item Although not exactly symmetric, a Hermitian matrix may nevertheless be given in either upper or lower triangular form. @item This function is faster but less storage efficient than @code{zhpev}, and calls it automatically if it runs out of memory. @end itemize @item @code{zhpev} This function has the same inputs and approximate outputs as @code{zheevr} but is slower and more memory efficient because it uses only packed matrices. @end table @node lapack exceptions, Additional lapack notes, lapack calling conventions, lapack @subsection @code{lapack} exceptions @itemize @bullet @item Any of these functions can return a message of @example <'memory overflow'> @end example if it runs out of memory. @item If the input parameters don't meet the specification, they can also return @cindex bad lapack specification @example <'bad lapack specification'> @end example @item Any unexpected behavior from the @code{LAPACK} Fortran functions or irregular status returned by them is reported by the message @cindex lapack error @example <'lapack error'> @end example @noindent Getting to the bottom of it may require some debugging of the @code{avram} source code in the file @file{lapack.c}. @end itemize @node Additional lapack notes, , lapack exceptions, lapack @subsection Additional @code{lapack} notes The functions @code{dgesdd} and @code{zgesdd} are an effective dimensionality reduction technique for a large database of time @cindex dimensionality reduction series. 
A set of basis vectors can be computed once for the database, and then any time series in the database can be expressed as a linear combination thereof. To the extent that the data embody any redundant information, an approximate reconstruction of an individual series from the database will require fewer coefficients (maybe far fewer) in terms of the basis than the original length of the series. The library functions @code{dgelsd} and @code{zgelsd} are good for @cindex least squares finding least squares fits to empirical data. If the matrix parameter @var{a} is interpreted as a list of inputs and the vector parameter @var{b} as the list of corresponding output data from some unknown linear function of @var{n} variables @var{f}, then @var{x} is the list of coefficients whereby @var{f} achieves the optimum fit to the data in the least squares sense. These functions solve a special case of the problem solved by @cindex generalized least squares @cindex least squares @code{dggglm} and @code{zggglm} where the parameter @var{B} is the identity matrix. For the latter functions, the output vector @var{y} can be interpreted as a measure of the error, and @var{B} can be chosen to express unequal costs for errors at different points in the fitted function. Cholesky decompositions obtained by @code{dpptrf} and @code{zpptrf} @cindex Cholesky decomposition are useful for generating correlated random numbers. A population of vectors of uncorrelated standard normally distributed random numbers can be made to exhibit any correlations to order by multiplying all of @cindex correlation the vectors by the lower Cholesky factor of the desired covariance @cindex covariance matrix matrix. @node math, mtwist, lapack, External Libraries @section @code{math} The @code{math} library exports functions that operate on IEEE double precision floating point numbers using the host system's C library.
The numbers are represented natively as contiguous blocks of 8 bytes each, and on the virtual side as lists of eight character representations. (More explanation is given in @ref{Type Conversions}.) These functions take the form @example library('math',f) @end example @noindent where @code{f} is a character string identifying the function in most cases by its standard name in the C library. @menu * math library operators:: * math library predicates:: * math library conversion functions:: * math library exceptions:: * Additional math library notes:: @end menu @node math library operators, math library predicates, math, math @subsection @code{math} library operators The unary operators take a single real number to a real result. They @cindex trigonometric functions include @example ceil floor round trunc sin cos tan sinh cosh tanh asin acos atan asinh acosh atanh exp log sqrt cbrt expm1 log1p fabs @end example The binary operators take a pair of real numbers @code{(@var{x},@var{y})} to a single real number output. They include @example pow hypot atan2 remainder bus vid add sub mul div @end example @noindent where the last four correspond to the C language operators @code{+}, @code{-}, @code{*}, and @code{/}. The functions named @code{bus} and @code{vid} are like the @code{sub} and @code{div} functions, respectively, with the order of the operands reversed, as explained in @ref{complex}. The meanings of these operators are documented in the GNU @code{libc} reference manual or other C language references. They follow IEEE standards including proper handling of @code{nan} and infinity. @node math library predicates, math library conversion functions, math library operators, math @subsection @code{math} library predicates There is one binary predicate, @code{islessequal}, and several unary @cindex predicates predicates, @code{isinfinite}, @code{isnan}, @code{isnormal}, @code{isubnormal} and @code{iszero}. 
The predicate @code{islessequal} takes a pair of floating point numbers @code{(@var{x},@var{y})} as an argument, and returns @code{nil} for a false result and @code{(nil,nil)} for a true result. The unary predicates have the obvious interpretations as classification functions, and should probably be used in preference to comparison with constants in case the representations aren't unique. @node math library conversion functions, math library exceptions, math library predicates, math @subsection @code{math} library conversion functions The conversion function @code{strtod} takes a string representing a @cindex strtod floating point number in C format to its representation. This function is the primary means of creating or initializing floating point numbers in virtual code. A value of floating point 0.0 is returned if the string is not valid, but no exception is raised. The conversion @code{asprintf} is similar to the one by that name in @cindex asprintf C, but requires a pair @code{(@var{f},@var{x})} as an argument. The left side @var{f} is a character string containing a C style format conversion for exactly one double precision floating point number, such as @code{'%0.4e'}, and the parameter @var{x} is a floating point number. The result returned will be a character string expressing the number in the specified format. @node math library exceptions, Additional math library notes, math library conversion functions, math @subsection @code{math} library exceptions The most likely cause of an exception is an attempt to apply a @code{math} library function to @code{nil} or to an argument that doesn't represent a floating point number. In these cases, an error @cindex missing value @cindex invalid value message of @code{<'missing value'>} or @code{<'invalid value'>} will be the result. 
An error message of @code{<'invalid asprintf() specifier'>} is @cindex invalid asprintf specifier reported by the @code{asprintf} function if the format specifier @cindex asprintf pertains to a string, such as @code{'%s'}. This error is specifically trapped because the alternative would be a segmentation @cindex segmentation fault fault. Otherwise, invalid format specifiers are not detected or reported. Error messages of @code{<'invalid text format'>} can be generated @cindex invalid text format by conversion functions if any parameters that are meant to be character string representations are something else. There is always a chance of a @code{<'memory overflow'>} error if there is insufficient memory to allocate a result. @node Additional math library notes, , math library exceptions, math @subsection Additional @code{math} library notes Floating point exceptions such as division by zero are not specifically reported as exceptions, but invalid computations can be @cindex nan detected by the propagation of @code{nan} into the result, following standard conventions. The C function @code{feclearexcept (FE_ALL_EXCEPT)} is called before @cindex feclearexcept every floating point operation so that no lingering exception flags can affect it. There is no library predicate for exact comparison of floating point numbers, but none is required because the virtual machine's @code{compare} combinator will work on their representations as it @cindex compare combinator will on any other data. The usual caveats apply with regard to comparing floating point numbers in the presence of roundoff error. @node mtwist, minpack, math, External Libraries @section @code{mtwist} The @code{mtwist} library interfaces to a random number generator @cindex random numbers based on the Mersenne Twister algorithm. The algorithm has good properties but is not meant to be cryptographically secure.
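The lack of cryptographic security follows partly from the fact that a Mersenne Twister sequence is fully determined by its seed. The Python sketch below shows this with Python's own @code{random} module, which happens to use the same family of generator. It is only an analogy and says nothing about the code in @file{mtwist.c}. The function name @code{make_generator} is hypothetical, and seeding from the clock when no seed is given is an assumption chosen for the example.

```python
import random
import time

def make_generator(seed=None):
    """Return a Mersenne Twister generator, seeded from the clock by default."""
    if seed is None:
        seed = time.time_ns()   # assumption: any clock-derived integer will do
    return random.Random(seed)

# Two generators started from the same seed produce identical sequences.
a = make_generator(12345)
b = make_generator(12345)
draws_a = [a.random() for _ in range(5)]
draws_b = [b.random() for _ in range(5)]
print(draws_a == draws_b)   # True
```

Anyone who learns or guesses the seed can therefore reproduce every subsequent draw, which is acceptable for simulation but not for secrets.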
The library functions are of the form @example library('mtwist',f) @end example @noindent where @code{f} is one of the following character strings. @example bern u_cont u_disc u_path u_enum w_disc w_enum @end example Formally they are not mathematical functions because their results depend on a pseudo-random number that is not uniquely determined by their arguments. The numbers are generated deterministically in a sequence starting from a seed derived from the system clock at the time @code{avram} is launched, and each call uses the next number in the sequence. In so doing, it simulates a random draw from a uniformly distributed population. @menu * mtwist calling conventions:: * mtwist exceptions:: * Additional mtwist notes:: @end menu @node mtwist calling conventions, mtwist exceptions, mtwist, mtwist @subsection @code{mtwist} calling conventions All of the functions in this library simulate a random draw from a distribution. There is a choice of distribution statistics depending on the function used. @table @asis @item @code{bern} takes a floating point number @var{p} between 0 and 1, encoded as in @ref{math}, and returns a boolean value, either @code{(nil,nil)} for true or @code{nil} for false. A true value is returned only if a random draw from a uniform distribution ranging from 0 to 1 is less than @var{p}. This function therefore simulates a draw from a Bernoulli distribution. A @code{nil} value of @var{p} is treated as 1/2. @item @code{u_cont} takes a floating point number @var{x} as an argument, and returns a random draw from a continuous uniform distribution ranging from 0 to @var{x}. A @code{nil} value of @var{x} is treated as unity. @item @code{u_disc} simulates a draw from a uniform discrete distribution whose domain is the set of natural numbers from 0 to @var{n} - 1. The number @var{n} is given as a parameter to this function, and the returned value is the draw. @itemize @bullet @item The returned value will have at most 64 bits regardless of @var{n}.
@item Natural numbers are encoded as described in @ref{Representation of Numeric and Textual Data}. @item If a value of 0 is passed for @var{n}, the full 64 bit range is used. @end itemize @item @code{u_path} takes a pair of natural numbers @code{(@var{n},@var{m})} and returns a randomly chosen tree (@ref{Raw Material}) with 1 leaf and @var{n} non-leaves each having either a left or a right descendent but not both. The number @var{m} constrains the result to fall within the first @var{m} - 1 trees of this form enumerated by exhausting all possibilities at lower levels before admitting a right descendent at a higher level. Within these criteria, all possible results are equally probable. Both numbers are masked to 64 bits, but if @var{m} is zero, it is treated as 2^@var{n}. @item @code{u_enum} simulates a random draw from a uniform discrete distribution whose domain is enumerated. The argument to the function is a non-empty list, and the result is an item selected from the list, with all choices being equally probable. @item @code{w_disc} simulates a random draw from a non-uniform, or ``weighted'' discrete distribution whose domain is a set of consecutive natural numbers starting from zero. The argument to the function is a list giving the probability of each outcome starting from zero as a floating point number. Probabilities must be non-negative but needn't be normalized. @item @code{w_enum} simulates a random draw from a non-uniform, or ``weighted'' discrete distribution with an arbitrary domain enumerated in the argument. The argument is a list of pairs @code{<(@var{x},@var{p})..>}, where @var{x} is a possible outcome and @var{p} is its probability. The result returned is one of the values of @var{x} from the input list chosen at random according to the associated probability. Probabilities must be non-negative but needn't be normalized. 
@end table @node mtwist exceptions, Additional mtwist notes, mtwist calling conventions, mtwist @subsection @code{mtwist} exceptions @itemize @bullet @item @code{<'memory overflow'>} can be returned if there is insufficient memory to allocate a result. @item Messages of @code{<'missing value'>} and @code{<'invalid value'>} can be returned if any floating point argument is @code{nil} or is not a valid floating point number, unless there is a designated default interpretation for @code{nil} as in @code{bern} and @code{u_cont}. @item A message of @code{<'bad mtwist specification'>} is returned if an argument to the @code{bern} function is not in the range of 0 to 1, or if any probability passed to the @code{w_}* functions is negative. @item A message of @code{<'draw from empty list'>} is returned if an argument to the *@code{_enum} functions is @code{nil} or if an argument to @code{w_enum} contains @code{nil}. @end itemize @node Additional mtwist notes, , mtwist exceptions, mtwist @subsection Additional @code{mtwist} notes Although the @code{mtwist} library is ``external'', it requires no special configuration on the host because the uniform variate generator in the form developed by its original authors is short and elegant enough to be packaged easily within the @code{avram} distribution. All further embellishments are home grown despite the advice at the end of @ref{Implementing new library functions}. The @code{u_path} function is intended to allow sampling from a large population in logarithmic time when it is stored in a balanced tree. A left-heavy tree should be constructed initially with the data items all at the same level. Thereafter, a result returned by @code{u_path} with the appropriate dimensions can be used as an index into the tree for fast retrieval by the virtual machine's @code{field} combinator (@ref{Field}). The last three functions, @code{u_enum}, @code{w_disc}, and @code{w_enum} use an inversion method with a binary search. 
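A minimal Python sketch of an inversion method with a binary search is given here for orientation. It is not the code in @file{mtwist.c}. The names @code{make_w_disc} and @code{w_enum} used as Python functions are hypothetical, the uniform variate @var{u} is passed in explicitly so that the behavior is reproducible, and raising an error when all weights are zero is an assumption rather than documented behavior.

```python
import bisect
import itertools

def make_w_disc(weights):
    """Build a sampler for a weighted discrete distribution.

    The running totals are computed once in linear time, after which
    each draw costs only a binary search.
    """
    if not weights:
        raise ValueError("draw from empty list")
    if any(w < 0 for w in weights):
        raise ValueError("bad mtwist specification")
    totals = list(itertools.accumulate(weights))
    if totals[-1] <= 0:
        # assumption: a distribution with no positive weight is rejected
        raise ValueError("bad mtwist specification")

    def draw(u):
        """Map a uniform variate u in [0,1) to an outcome index."""
        index = bisect.bisect_right(totals, u * totals[-1])
        return min(index, len(totals) - 1)   # guard against roundoff at the top

    return draw

def w_enum(pairs, u):
    """Select an outcome x from a list of (x, p) pairs using variate u."""
    draw = make_w_disc([p for _, p in pairs])
    return pairs[draw(u)][0]

draw = make_w_disc([1.0, 3.0])        # weights need not be normalized
print(draw(0.1), draw(0.25), draw(0.9))
print(w_enum([("a", 1.0), ("b", 3.0)], 0.5))
```

Keeping the running totals inside the returned closure plays the role of a cache in this sketch: the linear setup is paid once per list and later draws reuse it.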
The first draw from a given list will take a time asymptotically proportional to the length of the list, but subsequent draws from the same list are considerably faster due to a persistent cache maintained transparently by @code{avram}. For lists whose length is up to 2^16, the time required for a subsequent draw consists mainly of constant overhead with a small logarithmic component in the length of the list. For longer lists, the time ramps up linearly by a small factor. Information allowing fast draws from up to sixteen lists can be cached simultaneously. If an application uses more than sixteen, the cached data are replaced in first-in first-out order. The size of the cache and the maximum list length for logarithmic time access can be adjusted easily by redefining constants in @file{mtwist.c} under the @code{avram} source tree, but will require recompilation. @node minpack, mpfr, mtwist, External Libraries @section @code{minpack} The @code{minpack} library contains functions to solve non-linear @cindex non-linear optimization optimization and least squares problems. The functions can be @cindex least squares expressed as @example library('minpack',f) @end example @noindent where @code{f} can be one of @code{'hybrd'}, @code{'hybrj'}, @code{'lmder'}, @code{'lmdif'}, or @code{'lmstr'}, following the names of the underlying Fortran subroutines. @menu * minpack calling conventions:: * minpack exceptions:: * Additional minpack notes:: @end menu @node minpack calling conventions, minpack exceptions, minpack, minpack @subsection @code{minpack} calling conventions The @code{minpack} library solves a similar problem to that of the @code{kinsol} library (@ref{kinsol}), and the two libraries have identical calling conventions at the level of the virtual code interface. 
The @code{hybrd} and @code{lmdif} functions take input arguments of the form @code{(@var{f},(@var{i},@var{o}))}, whereas @code{hybrj}, @code{lmder}, and @code{lmstr} take arguments of the form @code{((@var{f},@var{j}),(@var{i},@var{o}))}. The interpretations of these parameters are explained in @ref{kinsol input parameters}. For the @code{lmstr} function, the Jacobian function @var{j} takes @cindex Jacobian an argument @code{(@var{m},@var{v})} and returns only the @var{m}-th row of the Jacobian matrix. For @code{lmder} and @code{hybrj}, the Jacobian function takes only an input vector @var{v} and returns the whole matrix. These specifications are also explained further in relation to the @code{kinsol} library. The output from any minpack function is a vector @var{v} satisfying @code{@var{f}(@var{v}) = @var{o}} to the best possible tolerance if a @cindex tolerance solution is found. A range of tolerances over ten orders of magnitude is sampled starting from @code{1e-15}. If no solution is found, an empty list is returned. @node minpack exceptions, Additional minpack notes, minpack calling conventions, minpack @subsection @code{minpack} exceptions @itemize @bullet @item A message of @code{<'memory overflow'>} is possible any time @code{minpack} runs out of memory. @item A message of @code{<'bad minpack specification'>} will be returned @cindex bad minpack specification if an input argument recognizably violates the required specification. @item The @code{<'minpack error'>} message is returned in the event of any @cindex minpack error unexpected behavior or irregular status from the API. @item Any error messages reported by the system function @var{f} or the Jacobian function @var{j} are propagated to the result. 
@end itemize @node Additional minpack notes, , minpack exceptions, minpack @subsection Additional @code{minpack} notes The @code{lm}* functions are better suited to problems in which the system function @var{f} has more outputs than inputs, and the @code{hybr}* functions are better suited to the alternative. If either is called when the other is more appropriate, the job is handed off to the other automatically. The @code{lmstr} function is more memory efficient than the others because it doesn't compute the whole Jacobian matrix at once. Any @cindex Jacobian of the @code{lm}* functions is more memory efficient than the @code{kinsol} equivalent when the output list is sufficiently longer than the input list. Unlike @code{kinsol}, there is no provision in @code{minpack} for @cindex constrained optimization constrained optimization. The @code{minpack} documentation doesn't state whether it's @cindex re-entrancy re-entrant, but the odds are against it unless it uses no storage outside the user provided work areas. If it isn't re-entrant, anomalous effects could occur when a virtual code function being optimized calls another @code{minpack} function. A workaround would be to use an equivalent @code{kinsol} function, which is re-entrant by design. The @code{avram} configuration script searches for a C header file @cindex header file @file{minpack.h} on the host system in order to build an interface to this library. This file is specific to the Debian @code{minpack-dev} package and is not part of the upstream Fortran source. Configuring @code{avram} with an interface to the @code{minpack} library on a @cindex Debian non-Debian system may require the administrator to retrieve the header file manually from the Debian archive and place it under @file{/usr/include} before running the configuration script (in addition to installing the @code{minpack} library itself, of course). 
@node mpfr, lpsolve, minpack, External Libraries @section @code{mpfr} The @code{mpfr} library provides a rich assortment of floating point operations on arbitrary precision numbers (@url{http://www.mpfr.org}). These numbers are represented in a format that is not binary compatible with the standard IEEE floating point number format used by other libraries, but they offer superior numerical stability suitable for many ill conditioned problems. The virtual code interface to the @code{mpfr} library follows the native API to the extent of using the same names for most operations, but excludes features pertaining to i/o, mutable storage, and memory management. The @code{mpfr} library functions are invoked by an expression of the form @example library('mpfr',f) @end example @noindent Aside from a few exceptions as noted, @code{f} is a character string derived from the name of the related function from the @code{mpfr} C library as documented at the above address, but without the @code{mpfr_} prefix. The full complement of available functions is documented in the remainder of this section. @itemize @bullet @item References to natural numbers pertain to the list representation described in @ref{Representation of Numeric and Textual Data}. @item All functions that perform rounding use a mode of @code{GMP_RNDN} for @cindex rounding rounding to nearest, which is not explicitly specified on the virtual side. @end itemize @menu * mpfr binary operators:: * mpfr unary operators:: * mpfr binary operators with a natural operand:: * mpfr binary predicates:: * mpfr unary predicates:: * mpfr constants:: * mpfr functions with miscellaneous calling conventions:: * mpfr conversion functions:: * mpfr exceptions:: * Additional mpfr notes:: @end menu @node mpfr binary operators, mpfr unary operators, mpfr, mpfr @subsection @code{mpfr} binary operators Functions with these names take a pair of @code{mpfr} numbers @code{(@var{x},@var{y})} and return an @code{mpfr} number as a result. 
@itemize @bullet @item @code{add} @item @code{sub} @item @code{mul} @item @code{div} @item @code{pow} @item @code{atan2} @item @code{hypot} @item @code{min} @item @code{max} @item @code{vid} @item @code{bus} @end itemize Their semantics are similar to those listed in the @code{mpfr} documentation, with some minor qualifications. @itemize @bullet @item Unlike the native API, there is no third argument to which the result is assigned, because the result is the returned value. @item The precision of the result is the greater of the two precisions of the input numbers @var{x} and @var{y}. @item The @code{vid} and @code{bus} functions are added features of the virtual code interface, corresponding to division and subtraction with the order of the operands reversed, as explained in @ref{complex}. @end itemize Mathematically it might make more sense for the precision of the @cindex precision result to be the lesser of the two input precisions, but this way is more convenient for virtual code programs that perform binary operations on their input with hard coded constants, because it makes one size fit all. @node mpfr unary operators, mpfr binary operators with a natural operand, mpfr binary operators, mpfr @subsection @code{mpfr} unary operators Functions with these names take a single @code{mpfr} number as an argument and return a single @code{mpfr} number as a result. @cindex gamma functions @example sqr sqrt cbrt neg abs log log2 log10 exp exp2 exp10 cos sin tan acos asin atan cosh sinh tanh acosh asinh atanh lngamma expm1 eint gamma erf log1p nextbelow ceil floor round trunc frac nextabove erfc @end example The semantics of these functions are similar to those of their counterparts in the native API, with these provisions. @itemize @bullet @item The precision of the result is the precision of the argument. @item There is no second argument for assigning the result. 
@item The @code{nextabove} and @code{nextbelow} functions do not modify their arguments in place, but return a freshly allocated result like all other functions. @end itemize @node mpfr binary operators with a natural operand, mpfr binary predicates, mpfr unary operators, mpfr @subsection @code{mpfr} binary operators with a natural operand Functions with these names take an argument of the form @code{(@var{x},@var{n})}, where @var{x} is an @code{mpfr} number and @var{n} is a natural number. @itemize @bullet @item @code{root} @item @code{pow_ui} @item @code{mul_2ui} @item @code{div_2ui} @item @code{grow} @item @code{shrink} @end itemize The last two are specific to the virtual code interface, having no counterpart in the native API of the @code{mpfr} library. The @code{grow} function returns a copy of @var{x} with its precision increased by @var{n} bits, and the @code{shrink} function returns a copy of @var{x} with its precision reduced by @var{n} bits. @itemize @bullet @item The precisions are silently capped at the maximum or floored at the @cindex precision minimum allowable precisions if necessary. @item Increasing the precision by the @code{grow} function does not directly cause a more accurate result to be computed, but only pads an existing number with zeros. @item Decreasing the precision by the @code{shrink} function does not prevent valid bits from being discarded. @end itemize The appropriate way to use @code{grow} is to grow the precision of an operand before applying an operator to it, which will cause the result to be computed to the full precision. This capability is suitable for algorithms that iterate over increasing precisions until a stopping criterion is met. @node mpfr binary predicates, mpfr unary predicates, mpfr binary operators with a natural operand, mpfr @subsection @code{mpfr} binary predicates These predicates take a pair of @code{mpfr} numbers @code{(@var{x},@var{y})} as arguments and perform a logical operation. 
If the result is true, they return @code{(nil,nil)}, and if it's false, they return @code{nil}. @itemize @bullet @item @code{equal_p} @item @code{unequal_abs} @item @code{greater_p} @item @code{greaterequal_p} @item @code{less_p} @item @code{lessequal_p} @item @code{lessgreater_p} @end itemize The name of the function @code{unequal_abs}, for comparing absolute values, has been changed from @code{mpfr_cmpabs} to avoid confusion with the virtual machine's @code{compare} combinator. The @code{compare} combinator returns a @code{(nil,nil)} result (i.e., true) if the @cindex compare combinator operands are equal and a @code{nil} result if they're unequal, opposite from @code{unequal_abs}. @node mpfr unary predicates, mpfr constants, mpfr binary predicates, mpfr @subsection @code{mpfr} unary predicates Each of these predicates takes an @code{mpfr} number as an argument and performs a logical operation. If the result is true, it returns @code{(nil,nil)}, and otherwise it returns @code{nil}. @itemize @bullet @item @code{nan_p} @item @code{inf_p} @item @code{number_p} @item @code{zero_p} @item @code{integer_p} @end itemize @node mpfr constants, mpfr functions with miscellaneous calling conventions, mpfr unary predicates, mpfr @subsection @code{mpfr} constants Each of these functions takes a natural number as an argument specifying a precision, and returns a mathematical constant evaluated to that precision. @itemize @bullet @item @code{const_log2} @item @code{pi} @item @code{const_catalan} @item @code{inf} @item @code{ninf} @item @code{nan} @end itemize The name of the constant @code{pi} has been shortened from @code{mpfr_const_pi}. The functions @code{inf} and @code{ninf} return infinity and negative infinity, respectively. The encoding of @code{nan}, used to represent the results of undefined @cindex nan computations such as division by zero, is not unique even for a fixed precision. 
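The same phenomenon can be observed with ordinary IEEE double precision numbers, which are used in the Python sketch below only as a stand-in because the Python standard library has no @code{mpfr} binding. The helper names @code{double_from_bits} and @code{bits_from_double} are hypothetical, and the two bit patterns are arbitrary quiet @code{nan} encodings chosen for the example.

```python
import math
import struct

def double_from_bits(bits):
    """Reinterpret a 64 bit unsigned integer as an IEEE double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def bits_from_double(x):
    """Reinterpret an IEEE double as a 64 bit unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

# Two quiet nan values that differ only in their payload bits.
nan_a = double_from_bits(0x7FF8000000000000)
nan_b = double_from_bits(0x7FF8000000000001)

print(math.isnan(nan_a), math.isnan(nan_b))   # both are classified as nan
print(nan_a == nan_a)                         # numeric equality fails for nan
print(bits_from_double(nan_a) == bits_from_double(nan_b))
```

A comparison of representations can therefore fail even when both operands are undefined results, whereas a classification predicate gives the same answer for every encoding.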
Applications should test for undefined results using @code{nan_p} rather than by comparing a result to a hard coded @code{nan} (@ref{mpfr unary predicates}). @node mpfr functions with miscellaneous calling conventions, mpfr conversion functions, mpfr constants, mpfr @subsection @code{mpfr} functions with miscellaneous calling conventions Some functions listed below don't conform to any of the previously mentioned calling conventions. @table @asis @item @code{eq} This is a ternary operator taking a triple @code{(@var{prec},(@var{x},@var{y}))}, where @var{prec} is a natural number and @var{x} and @var{y} are @code{mpfr} numbers. It returns a result of @code{(nil,nil)} (i.e., true) if the numbers agree up to the specified precision measured in bits, and returns @code{nil} otherwise.@footnote{a potentially useful tool for algorithms concerned with numerical approximations despite its inexplicable malignment in the @code{mpfr} documentation} @item @code{urandomb} This function takes a natural number specifying a precision and @cindex random numbers returns a uniformly distributed pseudo-random number of that precision between 0 and 1. @item @code{prec} This function takes an @code{mpfr} number and returns a natural number as a result, which is the precision of the argument in bits. @item @code{sin_cos} This function takes an @code{mpfr} number @var{z} as an argument and returns a pair of @code{mpfr} numbers @code{(@var{x},@var{y})} as a result, where @var{x} is the sine of @var{z} and @var{y} is the cosine. The precisions of the results are the same as the precision of the argument. @end table @node mpfr conversion functions, mpfr exceptions, mpfr functions with miscellaneous calling conventions, mpfr @subsection @code{mpfr} conversion functions The functions described in this section convert between @code{mpfr} numbers and character strings, naturals, or standard IEEE floating point format (in their list representations). 
Where these functions have similar or equivalent counterparts in the @code{mpfr} library's native API, the names have been changed for mnemonic reasons. @table @asis @item @code{dbl2mp} The input is a standard floating point number as in @ref{math}. The result is an @code{mpfr} number equal to the input with a fixed precision, currently set to 160 bits. @item @code{mp2dbl} The input is an @code{mpfr} number, and the output is the best possible approximation to it by a standard double precision number. @item @code{str2mp} The input is a pair @code{(@var{prec},@var{s})}, where @var{prec} is a natural number specifying the precision, and @var{s} is a string expressing a floating point number in C format. The output is an @code{mpfr} number with the specified precision. @item @code{mp2str} The input is an @code{mpfr} number, and the output is a character string expressing the number in exponential decimal notation. Sufficiently many decimal digits are included in the string to express the full precision. @item @code{nat2mp} The input is a natural number represented as described in @ref{Representation of Numeric and Textual Data}, and the output is an @code{mpfr} number of sufficient precision to express the natural number exactly. @end table The @code{mp2str} function enhances the native @code{mpfr_get_str} function by properly formatting the output string rather than only listing the digits of the mantissa. The @code{nat2mp} function does not rely on the @code{mpfr} native integer conversion functions, so natural numbers with any number of bits up to @code{MP_PREC_MAX} can be used losslessly. There is currently no conversion in the other direction. @node mpfr exceptions, Additional mpfr notes, mpfr conversion functions, mpfr @subsection @code{mpfr} exceptions @itemize @bullet @item A message of @code{<'memory overflow'>} is possible any time @code{mpfr} runs out of memory.
@item A message of @code{<'bad mpfr specification'>} will be returned @cindex bad mpfr specification if an input argument recognizably violates the required specification. @item The @code{<'mpfr error'>} message is returned in the event of any @cindex mpfr error unexpected behavior or irregular status from the API. @item The message of @code{<'mpfr overflow'>} can be caused by the @code{nat2mp} function if a natural number has too many bits to be represented exactly as an @code{mpfr} number. @end itemize @node Additional mpfr notes, , mpfr exceptions, mpfr @subsection Additional @code{mpfr} notes The @code{eq} and @code{urandomb} functions depend not only on the @code{mpfr} library but on the @code{gmp} library @cindex gmp library (@url{http://ftp.gnu.org/gnu/gmp}). It's possible for them to be unavailable on a host without @code{gmp} even if the rest of the @code{mpfr} library is properly configured. The file @code{mpfr.c} in the @code{avram} source tree exports a couple of functions that may be of use to C hackers interested in further development of @code{avram} with @code{mpfr}. The functions @code{avm_mpfr_of_list} and @code{avm_list_of_mpfr} convert between the native representation for @code{mpfr} numbers and the caching list representation used by @code{avram} (@ref{Type Conversions}). This conversion is non-trivial because the numbers are not stored contiguously. @node lpsolve, rmath, mpfr, External Libraries @section @code{lpsolve} This library interface exports functions to solve linear programming @cindex mixed integer programming @cindex linear programming and mixed integer programming problems using the @code{lpsolve} package documented at @url{http://lpsolve.sourceforge.net/5.5/}. @noindent Of the two linear programming solvers currently interfaced with @code{avram}, this one is believed to be the more robust.
@menu * lpsolve calling conventions:: * lpsolve return values:: * lpsolve errors:: @end menu @node lpsolve calling conventions, lpsolve return values, lpsolve, lpsolve @subsection @code{lpsolve} calling conventions The library is able to solve linear and mixed integer programming problems, depending on which function is selected. The @cindex linear programming function to call the linear programming solver is of the form @itemize @bullet @item @code{library('lpsolve','stdform')} @end itemize @noindent @cindex mixed integer programming and the mixed integer programming functions are of the form @itemize @bullet @item @code{library('lpsolve','iform')} @item @code{library('lpsolve','bform')} @item @code{library('lpsolve','biform')} @end itemize @noindent The argument to the @code{stdform} function represents a triple @code{(@var{c},(@var{m},@var{y}))}, which has the same interpretation as described in @ref{glpk input parameters}. The arguments to the @code{iform}, @code{bform}, and @code{biform} functions are tuples @code{(@var{i},(@var{c},(@var{m},@var{y})))}, @code{(@var{b},(@var{c},(@var{m},@var{y})))}, and @code{((@var{b},@var{i}),(@var{c},(@var{m},@var{y})))}, respectively, where @var{c}, @var{m}, and @var{y} are as above, and @itemize @bullet @item @var{b} is a list of binary variable column indices @item @var{i} is a list of integer variable column indices @end itemize @noindent where column indices pertain to the constraint matrix, and are numbered from zero. Specifying some or all variables as integers directs the solver to seek only solutions in which those variables have integer values, and specifying any as binary directs the solver to seek only solutions in which those variables have values of zero or one. The IEEE floating point representation is used for all variables regardless (@ref{math}).
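The nesting of the four argument forms can be summarized by the following Python sketch, which uses Python tuples in place of virtual code pairs. It shows only the shapes. The function names are hypothetical, and the values standing for @var{c}, @var{m}, and @var{y} are opaque placeholders rather than valid problem data, whose format is described in @ref{glpk input parameters}.

```python
def stdform_arg(c, m, y):
    """Shape of the argument to stdform: (c,(m,y))."""
    return (c, (m, y))

def iform_arg(i, c, m, y):
    """Shape of the argument to iform: (i,(c,(m,y)))."""
    return (i, stdform_arg(c, m, y))

def bform_arg(b, c, m, y):
    """Shape of the argument to bform: (b,(c,(m,y)))."""
    return (b, stdform_arg(c, m, y))

def biform_arg(b, i, c, m, y):
    """Shape of the argument to biform: ((b,i),(c,(m,y)))."""
    return ((b, i), stdform_arg(c, m, y))

# Placeholders only: columns 0 and 2 binary, column 1 integer.
binary_columns = [0, 2]
integer_columns = [1]
print(biform_arg(binary_columns, integer_columns, "c", "m", "y"))
```

In every form the right side of the outermost pair is exactly the @code{stdform} triple, so a problem can be moved between the four functions by changing only the left side.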
@node lpsolve return values, lpsolve errors, lpsolve calling conventions, lpsolve @subsection @code{lpsolve} return values If a feasible and optimal solution is found, a list of values for the variables is returned in the form @code{<(@var{i},@var{x})...>}, where @var{i} is a natural number and @var{x} is a floating point number giving the value of the @var{i}-th variable numbered from zero. Values of @var{x} equal to zero are omitted. @node lpsolve errors, , lpsolve return values, lpsolve @subsection @code{lpsolve} errors If any calling conventions are not followed, an exception is raised and a diagnostic message of @code{bad lpsolve problem specification} is reported. If no feasible solution can be found, no exception is raised but an empty list is returned. @node rmath, umf, lpsolve, External Libraries @section @code{rmath} A selection of mathematical and statistical functions from the GNU R math library has a virtual code interface of the form @example library('rmath',f) @end example @noindent where @code{f} is a character string derived from the name of a function in the C language API described in the manual @file{R-exts.pdf}, available at @url{http://www.r-project.org}. Every function in the library returns a real result in the form of @ref{math}, but functions differ in the argument types. The arguments are tuples of real numbers and booleans that also closely follow the native API as explained below. @menu * rmath statistical functions:: * rmath miscellaneous functions:: * rmath exceptions:: @end menu @node rmath statistical functions, rmath miscellaneous functions, rmath, rmath @subsection @code{rmath} statistical functions Functions for evaluating random draws, density, cumulative probability and inverse cumulative probability are provided for some of the more frequently used probability distributions, which are chi-squared, non-central chi-squared, exponential, lognormal, normal, poisson, Student's t, and uniform. 
Each distribution is known by an abbreviated name and specified by one @cindex distributions @cindex probability distributions @cindex statistical distributions or two real parameters as listed below. Names of distributions in this table form the stem of a library function name. The names of the parameters such as @var{mu} and @var{sigma} are not explicitly mentioned when invoking the functions, but are listed here for reference. The precise definitions of the distribution functions and interpretations of these parameters can be found in standard texts on probability and statistics. @example chisq @var{df} nchisq @var{df}, @var{lambda} exp @var{scale} lnorm @var{logmean}, @var{logsd} norm @var{mu}, @var{sigma} pois @var{lambda} t @var{n} unif @var{a}, @var{b} @end example The virtual code interface follows a naming convention similar to the native API, in that function names beginning with @code{r} represent random draws from a distribution, with the argument to the function being the parameters specifying the distribution. Functions in this first group return a random draw from a distribution described by a single real parameter. @itemize @bullet @item @code{rchisq} @item @code{rexp} @item @code{rpois} @item @code{rt} @end itemize @noindent These next functions return random draws from distributions specified by a pair of parameters, @code{(@var{x},@var{y})}. @itemize @bullet @item @code{rnchisq} @item @code{rlnorm} @item @code{rnorm} @item @code{runif} @end itemize Functions whose names begin with @code{d} evaluate the probability density of a distribution at a given point. They require at least two real arguments, the first being the point whose probability density is sought, and the remaining ones being the parameters that specify the distribution. A boolean operand, which is @code{nil} for false and @code{(nil,nil)} for true, requests the logarithm of the density when true. 
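To make the role of the boolean operand concrete, here is a small Python sketch of a normal density that accepts the quadruple form @code{(@var{x},(@var{y},(@var{z},@var{a})))} used by @code{dnorm}. It is written from the textbook formula and is not the R math library code. Python's @code{False} and @code{True} stand in for @code{nil} and @code{(nil,nil)}, and the sketch assumes a positive @var{sigma} without checking it.

```python
import math

def dnorm(arg):
    """Normal density at x, or its logarithm when the flag is true.

    The argument is a nested tuple (x, (mu, (sigma, give_log))).
    """
    x, (mu, (sigma, give_log)) = arg
    z = (x - mu) / sigma
    log_density = -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))
    return log_density if give_log else math.exp(log_density)

print(dnorm((0.0, (0.0, (1.0, False)))))   # density at the mean
print(dnorm((0.0, (0.0, (1.0, True)))))    # logarithm of the same density
```

Requesting the logarithm directly is useful far out in the tails, where the density itself would underflow to zero but its logarithm remains representable.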
Functions with names in the following group take a triple with two real operands and a boolean, @code{(@var{x},(@var{y},@var{a}))}, and return a probability density. @itemize @bullet @item @code{dchisq} @item @code{dexp} @item @code{dpois} @item @code{dt} @end itemize @noindent The next functions pertain to distributions requiring two parameters to specify them, so they take a quadruple with three real operands and a boolean, @code{(@var{x},(@var{y},(@var{z},@var{a})))}. @itemize @bullet @item @code{dnchisq} @item @code{dlnorm} @item @code{dnorm} @item @code{dunif} @end itemize Functions whose names begin with @code{p} or @code{q} obtain @cindex cumulative probability cumulative probabilities or inverse cumulative probabilities respectively for a specified distribution. They require one real operand to identify the point whose probability or inverse probability is sought, and other real operands to parameterize the distribution, as above. There are also two boolean operands. The first is true in order to request a probability or inverse probability with respect to the lower tail as opposed to the upper, and the other is true to indicate that probabilities are to be expressed logarithmically. The argument to these functions is a quadruple with two real operands and two booleans, @code{(@var{x},(@var{y},(@var{a},@var{b})))}. @itemize @bullet @item @code{pchisq}, @code{qchisq} @item @code{pexp}, @code{qexp} @item @code{ppois}, @code{qpois} @item @code{pt}, @code{qt} @end itemize @noindent The remaining functions pertain to distributions parameterized by two real operands. These take a quintuple with three real operands and two booleans, @code{(@var{x},(@var{y},(@var{z},(@var{a},@var{b}))))}.
@itemize @bullet @item @code{pnchisq}, @code{qnchisq} @item @code{plnorm}, @code{qlnorm} @item @code{pnorm}, @code{qnorm} @item @code{punif}, @code{qunif} @end itemize @node rmath miscellaneous functions, rmath exceptions, rmath statistical functions, rmath @subsection @code{rmath} miscellaneous functions Some less frequently used real valued mathematical functions are also accessible by the @code{rmath} library interface. The functions with @cindex gamma functions names in this group take a single real operand. @example gammafn lgammafn digamma trigamma tetragamma pentagamma @end example @noindent The ones in this group take a pair of real operands @cindex bessel functions @code{(@var{x},@var{y})}. @example beta lbeta bessel_j bessel_y @end example @noindent Those remaining take a triple of real operands @code{(@var{x},(@var{y},@var{z}))}. @example bessel_i bessel_k @end example An alternative and better documented selection of Bessel functions is provided by the @code{bes} library interface (@ref{bes}). @node rmath exceptions, , rmath miscellaneous functions, rmath @subsection @code{rmath} exceptions The only exceptional condition specific to the @code{rmath} library interface is associated with the message @code{<'bad rmath specification'>}, which means that a tuple given as an argument @cindex bad rmath specification has the wrong number or types of operands. @node umf, ,rmath, External Libraries @section @code{umf} Systems of equations described by sparse matrices (i.e., matrices @cindex sparse matrices containing mostly zeros) arise in certain practical problems. The usual array representation in which zeros are explicitly stored would be prohibitive for large matrices occurring in many problems of interest. A more sophisticated approach is used by the @code{umf} library to manage memory efficiently, which is documented at @url{http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/SuiteSparse/UMFPACK/Doc/}. 
A virtual code interface to functions for solving sparse systems of equations by these methods is afforded by library functions of the form @example library('umf',f) @end example @noindent where the library function name, @code{f}, is a character string of the form @code{@var{tt}_@var{m}_@var{rrr}}. @itemize @bullet @item @var{tt} can be either @code{di} for real matrices, or @code{zi} for complex. @item @var{m} can be one of @code{a}, @code{t}, or @code{c} for solving a system given either by a matrix, its transpose, or its conjugate transpose, respectively, @cindex conjugate transpose corresponding to mnemonics @code{A}, @code{Aat} and @code{At} used in the C language API. @item @var{rrr} is either @code{trp} or @code{col}, to indicate a sparse matrix expressed either as a list of triples, or in packed column form, as documented below. @end itemize The complete set of function names for this library interface is as follows. @example di_a_trp di_a_col zi_a_trp zi_a_col di_t_trp di_t_col zi_t_trp zi_t_col zi_c_trp zi_c_col @end example @noindent Not all combinations are represented, because the conjugate transpose is relevant only to complex matrices. @menu * umf input parameters:: * umf output:: * umf exceptions:: * Additional umf notes:: @end menu @node umf input parameters, umf output, umf, umf @subsection @code{umf} input parameters For a square matrix @var{A} and a column vector @var{b}, the @code{umf} functions find the solution @var{x} to the matrix equation @var{M} @var{x} = @var{b}, where @var{M} is either @var{A}, the transpose of @var{A}, or its conjugate transpose. As noted above, the choice is determined by whether the function name is of the form *@code{_a_}*, *@code{_t_}*, or *@code{_c_}* respectively. The argument to any of these functions is a pair @code{(@var{A},@var{b})}, where @var{A} represents the matrix mentioned above and @var{b} represents the column vector.
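The naming scheme @code{@var{tt}_@var{m}_@var{rrr}} given earlier in this section can be enumerated mechanically. The following Python sketch is an illustration only and is not part of @code{avram}. The function name @code{umf_function_names} is hypothetical. It builds every combination of the three fields and drops the conjugate transpose variants for real matrices, which is the one restriction stated above.

```python
from itertools import product


def umf_function_names():
    """List the umf library function names implied by the naming scheme.

    The fields are m (a, t, or c), tt (di or zi), and rrr (trp or col).
    The conjugate transpose (c) is kept only for complex (zi) matrices.
    """
    names = []
    for m, tt, rrr in product(("a", "t", "c"), ("di", "zi"), ("trp", "col")):
        if m == "c" and tt == "di":
            continue  # conjugate transpose is relevant only to complex matrices
        names.append(f"{tt}_{m}_{rrr}")
    return names


if __name__ == "__main__":
    print(" ".join(umf_function_names()))
```

The result has ten names, in the same order as the list in the manual.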
The parameter @var{b} is required to be a list of numbers whose length matches the number of rows in the matrix. The numbers are either real numbers for the @code{di_}* functions (@ref{math}), or complex for the @code{zi_}* functions (@ref{complex}). There is a choice of representations for the parameter @var{A}, depending on whether the function being called is one of the *@code{_trp} functions or one of the *@code{_col} functions. For the *@code{_trp} functions, @var{A} is represented as a non-empty list of triples @code{<((@var{i},@var{j}),@var{v})...>}, where each item of the list corresponds to a non-zero entry in the matrix. @itemize @bullet @item The parameters @var{i} and @var{j} are natural numbers as in @ref{Representation of Numeric and Textual Data}. @item The value @var{v} is a real number for the @code{di_}*@code{_trp} functions or a complex number for the @code{zi_}*@code{_trp} functions. @item The presence of a triple ((@var{i},@var{j}),@var{v}) in the list signifies that the @var{i},@var{j}-th entry in the matrix @var{A} (numbered from zero) has a value of @var{v}. @end itemize For the *@code{_col} functions, the representation of @var{A} is more complicated but has a slight advantage in memory usage. It may also have an advantage in speed unless more time is wasted on the virtual side transforming a matrix to this representation than it saves. In this case, @var{A} is represented by a triple of the form @code{((@var{p},@var{i}),@var{v})}. The parameters @var{p} and @var{i} are lists of natural numbers. The parameter @var{v} is a list of real numbers for the @code{di_}*@code{_col} functions and complex numbers for the @code{zi_}*@code{_col} functions. They have the following interpretations. @itemize @bullet @item @var{v} is the list of non-zero entries in the matrix in @cindex column major order column major order. 
@item @var{i} has the same length as @var{v}, and each item of @var{i} is the row index of the corresponding item in @var{v}, numbered from zero. @item @var{p} has one item per column of the matrix, and each item identifies the starting position of a column in @var{v} and @var{i}, numbered from zero. @end itemize @noindent The first item of @var{p} is always zero. Further explanation of this format in terms of an array representation can be found in the file @file{UMFPACK_UserGuide.pdf}, available from the @code{umf} library home page at @url{http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/SuiteSparse/}. @node umf output, umf exceptions, umf input parameters, umf @subsection @code{umf} output If no exception occurs, the solution @var{x} to the matrix equation @var{M} @var{x} = @var{b} noted previously will be returned if one exists. The solution is represented as either a list of real numbers as in @ref{math}, or a list of complex numbers as in @ref{complex}. Real numbers are returned by the @code{di_}* functions, and complex numbers are returned by the @code{zi_}* functions. If no solution exists due to a singular matrix, an empty list is returned. The lack of a solution isn't treated as an exceptional condition. @node umf exceptions, Additional umf notes, umf output, umf @subsection @code{umf} exceptions If an exceptional condition arises from the use of this library, one of the following lists of character strings may be returned as the function result. @itemize @bullet @item @code{<'memory overflow'>} means the library function ran out of memory, most likely due to a matrix being too large. @item @code{<'bad umf specification'>} means an input parameter didn't @cindex bad umf specification conform to the appropriate format described above (@ref{umf input parameters}). @item @code{<'umf error'>} covers any unexpected behavior or abnormal @cindex umf error status returned by any function from the C language API.
@end itemize For the *@code{_trp} functions, a non-square matrix will cause the second exception above. For the *@code{_col} functions, a non-square matrix will cause the third exception or cause an empty result to be returned. The exceptions noted at the beginning of this section (@ref{External Libraries}) are also possible. @node Additional umf notes, , umf exceptions, umf @subsection Additional @code{umf} notes The C language API to @code{umf} provides many less frequently used features that are not part of the virtual code interface, some of which could be added by minor modifications to the file @file{umf.c} in the @code{avram} source tree. A set of @code{dl_}* and @code{zl_}* functions orthogonal to those presently accessible would enable matrices having billions of rows or columns by using long integers, but memory requirements on the virtual code side for problems of that scale are probably prohibitive for the foreseeable future. @node Copying, Function Index, External Libraries, Top @setfilename gpl.info @appendix GNU GENERAL PUBLIC LICENCE @center Version 2, June 1991 @display Copyright @copyright{} 1989, 1991 Free Software Foundation, Inc. 675 Mass Ave, Cambridge, MA 02139, USA Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. @end display @menu * Preamble:: * Terms and Conditions:: * How to Apply:: @end menu @node Preamble, Terms and Conditions, ,Copying @unnumberedsec Preamble The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software---to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it.
(Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. Finally, any free program is threatened constantly by software patents. We wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all. 
The precise terms and conditions for copying, distribution and modification follow. @node Terms and Conditions, How to Apply, Preamble, Copying @unnumberedsec TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION @enumerate @item This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The ``Program'', below, refers to any such program or work, and a ``work based on the Program'' means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term ``modification''.) Each licensee is addressed as ``you''. Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does. @item You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program. You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 
@item You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions: @enumerate a @item You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change. @item You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License. @item If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.) @end enumerate These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it. 
Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program. In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this License. @item You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following: @enumerate a @item Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, @item Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or, @item Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.) @end enumerate The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. 
However, as a special exception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable. If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code. @item You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance. @item You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Program or works based on it. @item Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 
You are not responsible for enforcing compliance by third parties to this License. @item If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program. If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply and the section as a whole is intended to apply in other circumstances. It is not the purpose of this section to induce you to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system, which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice. This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License. 
@item If the distribution and/or use of the Program is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Program under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License. @item The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and ``any later version'', you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation. @item If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. @iftex @heading NO WARRANTY @end iftex @ifinfo @center NO WARRANTY @end ifinfo @item BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. 
EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM ``AS IS'' WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. @item IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. @end enumerate @iftex @heading END OF TERMS AND CONDITIONS @end iftex @ifinfo @center END OF TERMS AND CONDITIONS @end ifinfo @page @node How to Apply, , Terms and Conditions, Copying @unnumberedsec How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the ``copyright'' line and a pointer to where the full notice is found. 
@smallexample @var{one line to give the program's name and an idea of what it does.} Copyright (C) 19@var{yy} @var{name of author} This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. @end smallexample Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: @smallexample Gnomovision version 69, Copyright (C) 19@var{yy} @var{name of author} Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'. This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details. @end smallexample The hypothetical commands @samp{show w} and @samp{show c} should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than @samp{show w} and @samp{show c}; they could even be mouse-clicks or menu items---whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a ``copyright disclaimer'' for the program, if necessary. Here is a sample; alter the names: @smallexample @group Yoyodyne, Inc., hereby disclaims all copyright interest in the program `Gnomovision' (which makes passes at compilers) written by James Hacker. 
@var{signature of Ty Coon}, 1 April 1989 Ty Coon, President of Vice @end group @end smallexample This General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Library General Public License instead of this License. @node Function Index, Concept Index, Copying, Top @unnumbered Function Index @printindex fn @node Concept Index, , Function Index, Top @unnumbered Concept Index @printindex cp @bye pcx has been ditched, but this section is left here in case someone decides to put it back @node pcx, lpsolve, mpfr, External Libraries @section @code{pcx} This library interface exports a function that uses the algorithm documented at @url{http://www-fp.mcs.anl.gov/otc/Tools/PCx/} to solve linear programming problems by an interior point method. The library @cindex linear programming function is of the form @example library('pcx','solution') @end example @noindent This function has the same calling conventions as @code{library('glpk','simplex')} and @code{library('glpk','interior')} (@ref{glpk}). @cindex internal errors Error messages are also the same as @ref{glpk errors}, except that a message text of @code{<'bad pcx specification'>} is used. In addition, @code{pcx} can cause a virtual machine internal error (code 70), which is raised when @code{avram} detects access anomalies by an external library, or in the worst case, a segmentation fault. @cindex segmentation fault A working configuration of this library interface requires a shared library @file{libPCx.so} to be installed on the host, which is not a feature of the standard installation distributed by the authors of @code{PCx}. However, the shared library can be built from the source package by any system administrator familiar with these procedures.
Alternatively, an unofficial patched version of the Debian @cindex Debian package of @code{pcx} that includes the shared library is available on the @code{avram} home page. This patch has been submitted to the Debian maintainer of @code{pcx} and may be expected in a future release. If @code{avram} has been built already on a system that has the @code{pcx} header files but not the shared library, it should be possible to get the @code{pcx} library interface to work just by installing the shared library without recompiling @code{avram}, because @code{pcx} is dynamically loaded due to licensing restrictions.