avram.info-1 286 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404405406407408409410411412413414415416417418419420421422423424425426427428429430431432433434435436437438439440441442443444445446447448449450451452453454455456457458459460461462463464465466467468469470471472473474475476477478479480481482483484485486487488489490491492493494495496497498499500501502503504505506507508509510511512513514515516517518519520521522523524525526527528529530531532533534535536537538539540541542543544545546547548549550551552553554555556557558559560561562563564565566567568569570571572573574575576577578579580581582583584585586587588589590591592593594595596597598599600601602603604605606607608609610611612613614615616617618619620621622623624625626627628629630631632633634635636637638639640641642643644645646647648649650651652653654655656657658659660661662663664665666667668669670671672673674675676677678679680681682683684685686687688689690691692693694695696697698699700701702703704705706707708709710711712713714715716717718719720721722723724725726727728729730731732733734735736737738739740741742743744745746747748749750751752753754755756757758759760761762763764765766767768769770771772773774775776777778779780781782783784785786787788789790791792793794795796797798799800801802803804805806807808809810811812813814815816817818819820821822823824825826827828829830831832833834835836837838839840841842843844845846847848849850851852853854855856857858859860861862863864865866867868869870871872873874875876877878879880881882883884885886887888889890891892893894895896897898899900901902903904905906907908909910911912913914915916917918919920921922923924925926927928929930931932933934935936937938939940941942943944945946947948949950951952953954955956957958959960961962963964965966967968969970971972973974975976977978979980981982983984985986987988989990991992993994995996997998999100010011002100310041005100610071008100910101011101210131014101510161017101810191020102110221023102410251026102710281029103010311032103310341035103610371038103910401041104210431044104510461047104810491050105110521053105410551056105710581059106010611062106310641065106610671068106910701071107210731074107510761077107810791080108110821083108410851086108710881089109010911092109310941095109610971098109911001101110211031104110511061107110811091110111111121113111411151116111711181119112011211122112311241125112611271128112911301131113211331134113511361137113811391140114111421143114411451146114711481149115011511152115311541155115611571158115911601161116211631164116511661167116811691170117111721173117411751176117711781179118011811182118311841185118611871188118911901191119211931194119511961197119811991200120112021203120412051206120712081209121012111212121312141215121612171218121912201221122212231224122512261227122812291230123112321233123412351236123712381239124012411242124312441245124612471248124912501251125212531254125512561257125812591260126112621263126412651266126712681269127012711272127312741275127612771278127912801281128212831284128512861287128812891290129112921293129412951296129712981299130013011302130313041305130613071308130913101311131213131314131513161317131813191320132113221323132413251326132713281329133013311332133313341335133613371338133913401341134213431344134513461347134813491350135113521353135413551356135713581359136013611362136313641365136613671368136913701371137213731374137513761377137813791380138113821383138413851386138713881389139013911392139313941395139613971398139914001401140214031404140514061407140814091410141114121413141414151416141714181419142014211422142314241425142614271428142914301431143214331434143514361437143814391440144114421443144414451446144714481449145014511452145314541455145614571458145914601461146214631464146514661467146814691470147114721473147414751476147714781479148014811482148314841485148614871488148914901491149214931494149514961497149814991500150115021503150415051506150715081509151015111512151315141515151615171518151915201521152215231524152515261527152815291530153115321533153415351536153715381539154015411542154315441545154615471548154915501551155215531554155515561557155815591560156115621563156415651566156715681569157015711572157315741575157615771578157915801581158215831584158515861587158815891590159115921593159415951596159715981599160016011602160316041605160616071608160916101611161216131614161516161617161816191620162116221623162416251626162716281629163016311632163316341635163616371638163916401641164216431644164516461647164816491650165116521653165416551656165716581659166016611662166316641665166616671668166916701671167216731674167516761677167816791680168116821683168416851686168716881689169016911692169316941695169616971698169917001701170217031704170517061707170817091710171117121713171417151716171717181719172017211722172317241725172617271728172917301731173217331734173517361737173817391740174117421743174417451746174717481749175017511752175317541755175617571758175917601761176217631764176517661767176817691770177117721773177417751776177717781779178017811782178317841785178617871788178917901791179217931794179517961797179817991800180118021803180418051806180718081809181018111812181318141815181618171818181918201821182218231824182518261827182818291830183118321833183418351836183718381839184018411842184318441845184618471848184918501851185218531854185518561857185818591860186118621863186418651866186718681869187018711872187318741875187618771878187918801881188218831884188518861887188818891890189118921893189418951896189718981899190019011902190319041905190619071908190919101911191219131914191519161917191819191920192119221923192419251926192719281929193019311932193319341935193619371938193919401941194219431944194519461947194819491950195119521953195419551956195719581959196019611962196319641965196619671968196919701971197219731974197519761977197819791980198119821983198419851986198719881989199019911992199319941995199619971998199920002001200220032004200520062007200820092010201120122013201420152016201720182019202020212022202320242025202620272028202920302031203220332034203520362037203820392040204120422043204420452046204720482049205020512052205320542055205620572058205920602061206220632064206520662067206820692070207120722073207420752076207720782079208020812082208320842085208620872088208920902091209220932094209520962097209820992100210121022103210421052106210721082109211021112112211321142115211621172118211921202121212221232124212521262127212821292130213121322133213421352136213721382139214021412142214321442145214621472148214921502151215221532154215521562157215821592160216121622163216421652166216721682169217021712172217321742175217621772178217921802181218221832184218521862187218821892190219121922193219421952196219721982199220022012202220322042205220622072208220922102211221222132214221522162217221822192220222122222223222422252226222722282229223022312232223322342235223622372238223922402241224222432244224522462247224822492250225122522253225422552256225722582259226022612262226322642265226622672268226922702271227222732274227522762277227822792280228122822283228422852286228722882289229022912292229322942295229622972298229923002301230223032304230523062307230823092310231123122313231423152316231723182319232023212322232323242325232623272328232923302331233223332334233523362337233823392340234123422343234423452346234723482349235023512352235323542355235623572358235923602361236223632364236523662367236823692370237123722373237423752376237723782379238023812382238323842385238623872388238923902391239223932394239523962397239823992400240124022403240424052406240724082409241024112412241324142415241624172418241924202421242224232424242524262427242824292430243124322433243424352436243724382439244024412442244324442445244624472448244924502451245224532454245524562457245824592460246124622463246424652466246724682469247024712472247324742475247624772478247924802481248224832484248524862487248824892490249124922493249424952496249724982499250025012502250325042505250625072508250925102511251225132514251525162517251825192520252125222523252425252526252725282529253025312532253325342535253625372538253925402541254225432544254525462547254825492550255125522553255425552556255725582559256025612562256325642565256625672568256925702571257225732574257525762577257825792580258125822583258425852586258725882589259025912592259325942595259625972598259926002601260226032604260526062607260826092610261126122613261426152616261726182619262026212622262326242625262626272628262926302631263226332634263526362637263826392640264126422643264426452646264726482649265026512652265326542655265626572658265926602661266226632664266526662667266826692670267126722673267426752676267726782679268026812682268326842685268626872688268926902691269226932694269526962697269826992700270127022703270427052706270727082709271027112712271327142715271627172718271927202721272227232724272527262727272827292730273127322733273427352736273727382739274027412742274327442745274627472748274927502751275227532754275527562757275827592760276127622763276427652766276727682769277027712772277327742775277627772778277927802781278227832784278527862787278827892790279127922793279427952796279727982799280028012802280328042805280628072808280928102811281228132814281528162817281828192820282128222823282428252826282728282829283028312832283328342835283628372838283928402841284228432844284528462847284828492850285128522853285428552856285728582859286028612862286328642865286628672868286928702871287228732874287528762877287828792880288128822883288428852886288728882889289028912892289328942895289628972898289929002901290229032904290529062907290829092910291129122913291429152916291729182919292029212922292329242925292629272928292929302931293229332934293529362937293829392940294129422943294429452946294729482949295029512952295329542955295629572958295929602961296229632964296529662967296829692970297129722973297429752976297729782979298029812982298329842985298629872988298929902991299229932994299529962997299829993000300130023003300430053006300730083009301030113012301330143015301630173018301930203021302230233024302530263027302830293030303130323033303430353036303730383039304030413042304330443045304630473048304930503051305230533054305530563057305830593060306130623063306430653066306730683069307030713072307330743075307630773078307930803081308230833084308530863087308830893090309130923093309430953096309730983099310031013102310331043105310631073108310931103111311231133114311531163117311831193120312131223123312431253126312731283129313031313132313331343135313631373138313931403141314231433144314531463147314831493150315131523153315431553156315731583159316031613162316331643165316631673168316931703171317231733174317531763177317831793180318131823183318431853186318731883189319031913192319331943195319631973198319932003201320232033204320532063207320832093210321132123213321432153216321732183219322032213222322332243225322632273228322932303231323232333234323532363237323832393240324132423243324432453246324732483249325032513252325332543255325632573258325932603261326232633264326532663267326832693270327132723273327432753276327732783279328032813282328332843285328632873288328932903291329232933294329532963297329832993300330133023303330433053306330733083309331033113312331333143315331633173318331933203321332233233324332533263327332833293330333133323333333433353336333733383339334033413342334333443345334633473348334933503351335233533354335533563357335833593360336133623363336433653366336733683369337033713372337333743375337633773378337933803381338233833384338533863387338833893390339133923393339433953396339733983399340034013402340334043405340634073408340934103411341234133414341534163417341834193420342134223423342434253426342734283429343034313432343334343435343634373438343934403441344234433444344534463447344834493450345134523453345434553456345734583459346034613462346334643465346634673468346934703471347234733474347534763477347834793480348134823483348434853486348734883489349034913492349334943495349634973498349935003501350235033504350535063507350835093510351135123513351435153516351735183519352035213522352335243525352635273528352935303531353235333534353535363537353835393540354135423543354435453546354735483549355035513552355335543555355635573558355935603561356235633564356535663567356835693570357135723573357435753576357735783579358035813582358335843585358635873588358935903591359235933594359535963597359835993600360136023603360436053606360736083609361036113612361336143615361636173618361936203621362236233624362536263627362836293630363136323633363436353636363736383639364036413642364336443645364636473648364936503651365236533654365536563657365836593660366136623663366436653666366736683669367036713672367336743675367636773678367936803681368236833684368536863687368836893690369136923693369436953696369736983699370037013702370337043705370637073708370937103711371237133714371537163717371837193720372137223723372437253726372737283729373037313732373337343735373637373738373937403741374237433744374537463747374837493750375137523753375437553756375737583759376037613762376337643765376637673768376937703771377237733774377537763777377837793780378137823783378437853786378737883789379037913792379337943795379637973798379938003801380238033804380538063807380838093810381138123813381438153816381738183819382038213822382338243825382638273828382938303831383238333834383538363837383838393840384138423843384438453846384738483849385038513852385338543855385638573858385938603861386238633864386538663867386838693870387138723873387438753876387738783879388038813882388338843885388638873888388938903891389238933894389538963897389838993900390139023903390439053906390739083909391039113912391339143915391639173918391939203921392239233924392539263927392839293930393139323933393439353936393739383939394039413942394339443945394639473948394939503951395239533954395539563957395839593960396139623963396439653966396739683969397039713972397339743975397639773978397939803981398239833984398539863987398839893990399139923993399439953996399739983999400040014002400340044005400640074008400940104011401240134014401540164017401840194020402140224023402440254026402740284029403040314032403340344035403640374038403940404041404240434044404540464047404840494050405140524053405440554056405740584059406040614062406340644065406640674068406940704071407240734074407540764077407840794080408140824083408440854086408740884089409040914092409340944095409640974098409941004101410241034104410541064107410841094110411141124113411441154116411741184119412041214122412341244125412641274128412941304131413241334134413541364137413841394140414141424143414441454146414741484149415041514152415341544155415641574158415941604161416241634164416541664167416841694170417141724173417441754176417741784179418041814182418341844185418641874188418941904191419241934194419541964197419841994200420142024203420442054206420742084209421042114212421342144215421642174218421942204221422242234224422542264227422842294230423142324233423442354236423742384239424042414242424342444245424642474248424942504251425242534254425542564257425842594260426142624263426442654266426742684269427042714272427342744275427642774278427942804281428242834284428542864287428842894290429142924293429442954296429742984299430043014302430343044305430643074308430943104311431243134314431543164317431843194320432143224323432443254326432743284329433043314332433343344335433643374338433943404341434243434344434543464347434843494350435143524353435443554356435743584359436043614362436343644365436643674368436943704371437243734374437543764377437843794380438143824383438443854386438743884389439043914392439343944395439643974398439944004401440244034404440544064407440844094410441144124413441444154416441744184419442044214422442344244425442644274428442944304431443244334434443544364437443844394440444144424443444444454446444744484449445044514452445344544455445644574458445944604461446244634464446544664467446844694470447144724473447444754476447744784479448044814482448344844485448644874488448944904491449244934494449544964497449844994500450145024503450445054506450745084509451045114512451345144515451645174518451945204521452245234524452545264527452845294530453145324533453445354536453745384539454045414542454345444545454645474548454945504551455245534554455545564557455845594560456145624563456445654566456745684569457045714572457345744575457645774578457945804581458245834584458545864587458845894590459145924593459445954596459745984599460046014602460346044605460646074608460946104611461246134614461546164617461846194620462146224623462446254626462746284629463046314632463346344635463646374638463946404641464246434644464546464647464846494650465146524653465446554656465746584659466046614662466346644665466646674668466946704671467246734674467546764677467846794680468146824683468446854686468746884689469046914692469346944695469646974698469947004701470247034704470547064707470847094710471147124713471447154716471747184719472047214722472347244725472647274728472947304731473247334734473547364737473847394740474147424743474447454746474747484749475047514752475347544755475647574758475947604761476247634764476547664767476847694770477147724773477447754776477747784779478047814782478347844785478647874788478947904791479247934794479547964797479847994800480148024803480448054806480748084809481048114812481348144815481648174818481948204821482248234824482548264827482848294830483148324833483448354836483748384839484048414842484348444845484648474848484948504851485248534854485548564857485848594860486148624863486448654866486748684869487048714872487348744875487648774878487948804881488248834884488548864887488848894890489148924893489448954896489748984899490049014902490349044905490649074908490949104911491249134914491549164917491849194920492149224923492449254926492749284929493049314932493349344935493649374938493949404941494249434944494549464947494849494950495149524953495449554956495749584959496049614962496349644965496649674968496949704971497249734974497549764977497849794980498149824983498449854986498749884989499049914992499349944995499649974998499950005001500250035004500550065007500850095010501150125013501450155016501750185019502050215022502350245025502650275028502950305031503250335034503550365037503850395040504150425043504450455046504750485049505050515052505350545055505650575058505950605061506250635064506550665067506850695070507150725073507450755076507750785079508050815082508350845085508650875088508950905091509250935094509550965097509850995100510151025103510451055106510751085109511051115112511351145115511651175118511951205121512251235124512551265127512851295130513151325133513451355136513751385139514051415142514351445145514651475148514951505151515251535154515551565157515851595160516151625163516451655166516751685169517051715172517351745175517651775178517951805181518251835184518551865187518851895190519151925193519451955196519751985199520052015202520352045205520652075208520952105211521252135214521552165217521852195220522152225223522452255226522752285229523052315232523352345235523652375238523952405241524252435244524552465247524852495250525152525253525452555256525752585259526052615262526352645265526652675268526952705271527252735274527552765277527852795280528152825283528452855286528752885289529052915292529352945295529652975298529953005301530253035304530553065307530853095310531153125313531453155316531753185319532053215322532353245325532653275328532953305331533253335334533553365337533853395340534153425343534453455346534753485349535053515352535353545355535653575358535953605361536253635364536553665367536853695370537153725373537453755376537753785379538053815382538353845385538653875388538953905391539253935394539553965397539853995400540154025403540454055406540754085409541054115412541354145415541654175418541954205421542254235424542554265427542854295430543154325433543454355436543754385439544054415442544354445445544654475448544954505451545254535454545554565457545854595460546154625463546454655466546754685469547054715472547354745475547654775478547954805481548254835484548554865487548854895490549154925493549454955496549754985499550055015502550355045505550655075508550955105511551255135514551555165517551855195520552155225523552455255526552755285529553055315532553355345535553655375538553955405541554255435544554555465547554855495550555155525553555455555556555755585559556055615562556355645565556655675568556955705571557255735574557555765577557855795580558155825583558455855586558755885589559055915592559355945595559655975598559956005601560256035604560556065607560856095610561156125613561456155616561756185619562056215622562356245625562656275628562956305631563256335634563556365637563856395640564156425643564456455646564756485649565056515652565356545655565656575658565956605661566256635664566556665667566856695670567156725673567456755676567756785679568056815682568356845685568656875688568956905691569256935694569556965697569856995700570157025703570457055706570757085709571057115712571357145715571657175718571957205721572257235724572557265727572857295730573157325733573457355736573757385739574057415742574357445745574657475748574957505751575257535754575557565757575857595760576157625763576457655766576757685769577057715772577357745775577657775778577957805781578257835784578557865787578857895790579157925793579457955796579757985799580058015802580358045805580658075808580958105811581258135814581558165817581858195820582158225823582458255826582758285829583058315832583358345835583658375838583958405841584258435844584558465847584858495850585158525853585458555856585758585859586058615862586358645865586658675868586958705871587258735874587558765877587858795880588158825883588458855886588758885889589058915892589358945895589658975898589959005901590259035904590559065907590859095910591159125913591459155916591759185919592059215922592359245925592659275928592959305931593259335934593559365937593859395940594159425943594459455946594759485949595059515952595359545955595659575958595959605961596259635964596559665967596859695970597159725973597459755976597759785979598059815982598359845985598659875988598959905991599259935994599559965997599859996000600160026003600460056006600760086009601060116012601360146015601660176018601960206021602260236024602560266027602860296030603160326033603460356036603760386039604060416042604360446045604660476048604960506051605260536054605560566057605860596060606160626063606460656066606760686069607060716072607360746075607660776078607960806081608260836084608560866087608860896090609160926093609460956096609760986099610061016102610361046105610661076108610961106111611261136114611561166117611861196120612161226123612461256126612761286129613061316132613361346135613661376138613961406141614261436144614561466147614861496150615161526153615461556156615761586159616061616162616361646165616661676168616961706171617261736174617561766177617861796180618161826183618461856186618761886189619061916192619361946195619661976198619962006201620262036204620562066207620862096210621162126213621462156216621762186219622062216222622362246225622662276228622962306231623262336234623562366237623862396240624162426243624462456246624762486249625062516252625362546255625662576258625962606261626262636264626562666267626862696270627162726273627462756276627762786279628062816282628362846285628662876288628962906291629262936294629562966297629862996300630163026303630463056306630763086309631063116312631363146315631663176318631963206321632263236324632563266327632863296330633163326333633463356336633763386339634063416342634363446345634663476348634963506351635263536354635563566357635863596360636163626363636463656366636763686369637063716372637363746375637663776378637963806381638263836384638563866387638863896390639163926393639463956396639763986399640064016402640364046405640664076408640964106411641264136414641564166417641864196420642164226423642464256426642764286429643064316432643364346435643664376438643964406441644264436444644564466447644864496450645164526453645464556456645764586459646064616462646364646465646664676468646964706471647264736474647564766477647864796480648164826483648464856486648764886489649064916492649364946495649664976498649965006501650265036504650565066507650865096510651165126513651465156516651765186519652065216522652365246525652665276528652965306531653265336534653565366537653865396540654165426543654465456546654765486549655065516552655365546555655665576558655965606561656265636564656565666567656865696570657165726573657465756576657765786579658065816582658365846585658665876588658965906591659265936594659565966597659865996600660166026603660466056606660766086609661066116612661366146615661666176618
  1. This is avram.info, produced by makeinfo version 4.13 from
  2. avram.texinfo.
  3. This file documents the `avram' command which is a virtual machine code
  4. interpreter
  5. Copyright (C) 2000, 2003, 2006-2010 Dennis Furey Permission is
  6. granted to make and distribute verbatim copies of this manual provided
  7. the copyright notice and this permission notice are preserved on all
  8. copies.
  9. Permission is granted to copy and distribute modified versions of
  10. this manual under the conditions for verbatim copying, provided that
  11. the entire resulting derived work is distributed under the terms of a
  12. permission notice identical to this one.
  13. Permission is granted to copy and distribute translations of this
  14. manual into another language, under the above conditions for modified
  15. versions, except that this permission notice may be stated in a
  16. translation approved by the Free Software Foundation.
  17. 
  18. File: avram.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir)
  19. This file documents `avram' version 0.13.0, which is a virtual
  20. machine code interpreter.
  21. * Menu:
  22. * Preface:: project aims and scope
  23. * User Manual:: command line options and usage
  24. * Virtual Machine Specification:: a guide for compiler writers
  25. * Library Reference:: how to reuse or enhance `avram'
  26. * Character Table:: representations for ASCII characters
  27. * Reference Implementations:: constructive computability proofs
  28. * Changes:: recent updates to the manual
  29. * External Libraries:: specifications and calling conventions
  30. * Copying:: license terms
  31. * Function Index:: for the shared library API
  32. * Concept Index::
  33. 
  34. File: avram.info, Node: Preface, Next: User Manual, Prev: Top, Up: Top
  35. Preface
  36. *******
  37. `avram' is a virtual machine code interpreter. It reads an input file
  38. containing a user-supplied application expressed in virtual machine
  39. code, and executes it on the host machine. The name is a quasi-acronym
  40. for "Applicative ViRtuAl Machine". Notable features are
  41. * strong support for functional programming operations (e.g., list
  42. processing)
  43. * interfaces to selected functions from mathematical libraries, such
  44. as
  45. * `gsl' (numerical integration, differentiation, and
  46. series acceleration)
  47. `http://www.gnu.org/software/gsl/'
  48. * `mpfr' (arbitrary precision arithmetic)
  49. `http://www.mpfr.org'
  50. * `minpack' (non-linear optimization)
  51. `http://ftp.netlib.org/minpack'
  52. * `lapack' (linear algebra)
  53. `http://ftp.netlib.org/lapack'
  54. * `fftw' (fast fourier transforms)
  55. `http://www.fftw.org'
  56. * `Rmath' (statistical and transcendental functions)
  57. `http://www.r-project.org'
  58. * `ufsparse' (sparse matrices)
  59. `http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/SuiteSparse/'
  60. * `glpk' (linear programming by the simplex method)
  61. `http://tech.groups.yahoo.com/group/lp_solve/'
  62. * `lpsolve' (mixed integer linear programming)
  63. `http://www.llnl.gov/CASC/sundials/'
  64. * `kinsol' (constrained non-linear optimization)
  65. `http://www.llnl.gov/CASC/sundials/'
  66. * interoperability of virtual code applications with other console
  67. applications or shells through the `expect' library
  68. * a simple high-level interface to files, environment variables and
  69. command line parameters
  70. * support for various styles of stateless or persistent stream
  71. processors (a.k.a. Unix filters)
  72. The reason for writing `avram' was that I wanted to do some work
  73. using a functional programming language, didn't like any functional
  74. programming languages that already existed, and felt that it would be
  75. less trouble to write a virtual machine emulator than the back end of a
  76. compiler. As of version 0.1.0, the first public release of `avram' as
  77. such in 2000, most of the code base had been in heavy use by me for
  78. about four years, running very reliably. At this writing some six years
  79. later, it has seen even more use with rarely any reliability issues, in
  80. some cases attacking large combinatorial problems for weeks or months
  81. at a time. These problems have involved both long running continuous
  82. execution, and batches of thousands of shorter jobs.
  83. Although the virtual machine is biased toward functional programming,
  84. it is officially language agnostic, so `avram' may be useful to anyone
  85. involved in the development of compilers for other programming,
  86. scripting, or special purpose languages. The crucial advantage of using
  87. it in your own project is that rather than troubling over address
  88. modes, register allocation, and other hassles inherent in generating
  89. native code, your compiler can just dump a fairly high level
  90. intermediate code representation of the source text to a file, and let
  91. the virtual machine emulator deal with the details. The tradeoff for
  92. using a presumably higher level interpreted language is that the
  93. performance is unlikely to be competitive with native code, but this
  94. issue is mitigated in the case of numerical applications whose heavy
  95. lifting is done by the external libraries mentioned above.
  96. Portability is an added bonus. The virtual code is binary compatible
  97. across all platforms. Versions of `avram' as of 0.1.0 and later are
  98. packaged using GNU autotools and should be possible to build on any
  99. platform supporting them. In particular, the package is known to have
  100. built successfully on MacOS, FreeBSD, Solaris (thanks to the compile
  101. farm at Sourceforge.net) Digital Unix, and Debian GNU/Linux for i386 and
  102. Alpha platforms, although it has not been extensively tested on all of
  103. them. Earlier versions were compiled and run successfully on Irix and
  104. even Windows-NT (with `gcc').
  105. This document is divided into three main parts, with possibly three
  106. different audiences, but they all depend on a basic familiarity with Unix
  107. or GNU/Linux systems.
  108. *note User Manual::
  109. essentially reproduces the information found in the manpage that
  110. is distributed with `avram' with a few extra examples and longer
  111. explanations. Properly deployed, `avram' should be almost entirely
  112. hidden from end users by wrapper scripts, so the "users" to whom
  113. this part is relevant would be those involved in preparing these
  114. scripts (a matter of choosing the right command line options).
  115. Depending on the extent to which this task is automated by a
  116. compiler, that may include the compiler writer or the developers
  117. of applications.
  118. *note Virtual Machine Specification::
  119. documents much of what one would need to know in order to write a
  120. compiler that generates code executable by `avram'. That includes
  121. the complete virtual machine code semantics and file formats. It
  122. would also be possible to implement a compatible replacement for
  123. `avram' from scratch based on the information in this chapter, in
  124. case anyone has anything against C, my coding style, or the GPL.
  125. (A few patches to make it `lint' cleanly or a new implementation
  126. in good pedagogical Java without pointers would both be instructive
  127. exercises. ;-))
  128. *note Library Reference::
  129. includes documentation on the application program interface and
  130. recommended entry points for the C library distributed with
  131. `avram'. This information would be of use to those wishing to
  132. develop applications incorporating similar features, or to reuse
  133. the code for unrelated purposes. It might also be useful to anyone
  134. wishing to develop C or C++ applications that read or write data
  135. files in the format used by `avram'.
  136. 
  137. File: avram.info, Node: User Manual, Next: Virtual Machine Specification, Prev: Preface, Up: Top
  138. 1 User Manual
  139. *************
  140. This chapter provides the basic information on how to use `avram' to
  141. execute virtual machine code applications.
  142. `avram' is invoked by typing a command at a shell prompt in one of
  143. these three forms.
  144. `avram' [_general options_]
  145. `avram' [_filter mode options_] CODEFILE[`.avm']
  146. `avram' [_parameter mode options_] CODEFILE[`.avm'] [_parameters_]
  147. In the second case, `avram' reads from standard input, and may of
  148. course appear as part of commands such as
  149. `avram' [_filter mode options_] CODEFILE[`.avm'] < INPUTFILE
  150. ANOTHERCOMMAND | `avram' [_filter mode options_] CODEFILE[`.avm']
  151. When `avram' is invoked with the name of an input file (with a default
  152. extension `.avm'), it reads virtual machine code from the file and
  153. executes it on the host machine.
  154. The virtual code format used by `avram' is designed to support the
  155. features of functional or applicative programming languages. Although
  156. this chapter documents only the usage of `avram' and not the internals,
  157. it will be helpful to keep in mind that the virtual machine code
  158. expresses a mathematical function rather than a program in the
  159. conventional sense. As such, it performs no action directly, but may be
  160. applied in a choice of ways by the user of `avram' according to the
  161. precise operation required.
  162. The following sections provide information in greater detail about
  163. usage and diagnostics.
  164. * Menu:
  165. * General Options:: getting help and version information
  166. * Modes of Operation:: stream processing or file oriented
  167. * Filter Mode Options:: how to run a stream processor
  168. * Parameter Mode Options:: how to have an application use files
  169. * Command Line Syntax:: application-independent conventions
  170. * Diagnostics:: explanation of error messages
  171. * Security:: running untrusted applications
  172. * Example Script:: how to unburden the end users
  173. * Files:: miscellaneous files used
  174. * Environment:: environment variables
  175. * Bugs:: hall of shame
  176. 
  177. File: avram.info, Node: General Options, Next: Modes of Operation, Prev: User Manual, Up: User Manual
  178. 1.1 General Options
  179. ===================
  180. Regardless of whatever other command line parameters are given, `avram'
  181. accepts the following parameters:
  182. `-h, --help'
  183. Show a summary of options and exit.
  184. `-V,-v, --version'
  185. Show the version of program and a short copyleft message and exit.
  186. `--emulation=VERSION'
  187. Be backward compatible with an older version of `avram'. This
  188. option should include a valid version number, for example
  189. `0.13.0', which is the version of `avram' to be emulated. It can
  190. make virtual code applications future proof, assuming that future
  191. versions of `avram' correctly support backward compatibility. It
  192. may be used in conjunction with any other option in any mode of
  193. operation. This copy of the user manual has not been updated
  194. since version 0.13.0 of `avram', so it is unable to document
  195. incompatibilities with later versions. The latest version of the
  196. manual may be found at `http://www.lsbu.ac.uk/~fureyd/avram'.
  197. `-e, --external-libraries'
  198. Show a list of libraries with which `avram' has been linked and
  199. whose functions therefore could be called from virtual machine
  200. programs. This growing list currently includes selected functions
  201. from `fftw', `glpk', `gsl', `kinsol', `lapack', `minpack', `mpfr',
  202. `lpsolve', `Rmath' and `ufsparse' (see *note Preface::) which are
  203. documented further in *note External Libraries::.
  204. `-j, --jail'
  205. This option disables execution of shell commands by virtual code
  206. applications, which is normally possible by default even for
  207. nominally non-interactive applications (see *note Parameter Mode
  208. Options::). A virtual code application attempting to spawn a shell
  209. (using the `interact' combinator) when this option is selected will
  210. encounter an exception rather than successful completion of the
  211. operation. This option is provided as a security feature for
  212. running untrusted code (see *note Security::), and is incompatible
  213. with `-i', `-t', and `-s'.
  214. `-f, --force-text-input'
  215. Normally `avram' will try to guess by looking at a file whether it
  216. is an ordinary text file or one that has been written in the
  217. virtual code file format, and choose a different internal
  218. representation accordingly. An application may require one
  219. representation or the other. This option tells `avram' to treat
  220. all input files other than the virtual code file (named in the
  221. first command line parameter) as text files regardless of whether
  222. or not it would be possible to interpret them otherwise. This
  223. option may be used in combination with any other option.
  224. 
  225. File: avram.info, Node: Modes of Operation, Next: Filter Mode Options, Prev: General Options, Up: User Manual
  226. 1.2 Modes of Operation
  227. ======================
  228. Apart from to the capability to print brief help messages and exit,
  229. there are two main modes of operation, depending on which options are
  230. specified on the command line before the virtual code file name.
  231. For the purpose of choosing the mode of operation, the virtual code
  232. filename is taken to be the first command line argument not beginning
  233. with a dash. Other conventions relevant to application specific
  234. parameters are detailed in *note Command Line Syntax::.
  235. * Menu:
  236. * Filter Mode::
  237. * Parameter Mode::
  238. 
  239. File: avram.info, Node: Filter Mode, Next: Parameter Mode, Prev: Modes of Operation, Up: Modes of Operation
  240. 1.2.1 Filter Mode
  241. -----------------
  242. In filter mode, the argument to the function given by the virtual code
  243. is taken from standard input, and the result is written to standard
  244. output, except for error messages resulting from a failure to evaluate
  245. the function, which are written to standard error. *Note
  246. Diagnostics::. Filter mode is indicated whenever these three conditions
  247. are all met.
  248. * Either at least one of the filter mode options appears on the
  249. command line preceding the first filename parameter, or there are
  250. no options at all. *Note Filter Mode Options::.
  251. * Exactly one filename parameter appears on the command line, which
  252. is the name of the virtual machine code file.
  253. * Either the filename comes last on the command line, or the
  254. `--unparameterized' option precedes it, causing everything
  255. following it to be ignored.
  256. Examples:
  257. `avram mynewapp < inputfilename'
  258. In this example, filter mode is recognized by default because
  259. there are no options or input files on the command line to indicate
  260. otherwise. (The input file redirected into standard input is not
  261. treated by the shell as a command line argument.)
  262. `cat somefile | avram -r coolprog > outputfile'
  263. In this example, the `-r' option gives it away, being one of the
  264. filter mode options, in addition to the fact that there are no
  265. input file parameters or application-specific options.
  266. `avram -u devilmaycare.avm --bogusoption ignoredparameter'
  267. In this case, filter mode is forced by the `-u' option despite
  268. indications to the contrary.
  269. 
  270. File: avram.info, Node: Parameter Mode, Prev: Filter Mode, Up: Modes of Operation
  271. 1.2.2 Parameter Mode
  272. --------------------
  273. In parameter mode, the argument to the function given by the virtual
  274. code is a data structure containing environment variables and command
  275. line parameters including files, application specific options, and
  276. possibly standard input. The result obtained by evaluating the function
  277. is either a data structure representing a set of files to be written,
  278. which may include standard output, or a sequence of shell commands to
  279. be executed, or a combination of both. Parameter mode is indicated
  280. whenever either of these conditions is met.
  281. * Any of the parameter mode options appears on the command line
  282. preceding the first filename parameter.
  283. *Note Parameter Mode Options::.
  284. * At least one additional filename parameter or option follows the
  285. first filename parameter, and the option `--unparameterized' does
  286. not precede it.
  287. Examples:
  288. `avram --map-to-each-file prettyprinter.avm *.c *.h --extra-pretty'
  289. In this example, parameter mode is indicated both by the parameter
  290. mode option `--map-to-each-file' and by the presence of input file
  291. names and the `--extra-pretty' option. The latter is specific to
  292. the hypothetical `prettyprinter.avm' virtual code application, as
  293. indicated by its position on the command line, and is therefore
  294. passed to it by `avram'.
  295. `cat ~/specfile | avram reportgenerator -v - /var/log/syslog'
  296. In this example, a hypothetical parameter mode application
  297. `reportgenerator' is able to read `~/specfile' from standard input
  298. because of the `-' used as a parameter.
  299. `avram --parameterized grepenv'
  300. In this example, a hypothetical application that searches shell
  301. variables is invoked in parameter mode even with no input files or
  302. application specific options, because of the `--parameterized'
  303. option. Parameter mode invocation is required by the application
  304. to give it access to the environment.
  305. `avram grepenv --search-targets=PATH,MANPATH'
  306. This example shows an application specific option with both a
  307. keyword and a parameter list. They suffice to indicate parameter
  308. mode without an explicit `--parameterized' option.
  309. 
  310. File: avram.info, Node: Filter Mode Options, Next: Parameter Mode Options, Prev: Modes of Operation, Up: User Manual
  311. 1.3 Filter Mode Options
  312. =======================
  313. The options available in filter mode are listed below. Except as
  314. otherwise noted, all options are mutually exclusive. Ordinarily a given
  315. application will require certain fixed settings of these options and
  316. will not work properly if they are set inappropriately.
  317. `-r, `--raw-output''
  318. Normally the result obtained by evaluating the function in the
  319. virtual code file must be a list of character strings, which is
  320. written as such to standard output. However, if this option is
  321. selected, the form of the result is unconstrained, and it will be
  322. written in a data file format that is not human readable but can
  323. be used by other applications. This option is incompatible with
  324. any other options except `-u'.
  325. `-c, --choice-of-output'
  326. When this option is used, the evaluation of the function given by
  327. the virtual machine code will be expected to yield a data
  328. structure from which `avram' will ascertain whether standard
  329. output should be written in text or raw data format. This option
  330. should be used only if application is aware of it. It is
  331. incompatible with any other options except `-u'.
  332. `-l, --line-map'
  333. Normally the entire contents of standard input up to `EOF' are
  334. loaded into memory and used as the argument to the function in the
  335. virtual code file. However, this option causes standard input to
  336. be read a line at a time, with the function applied individually
  337. to each line, and its result in each case written immediately to
  338. standard output. A given application either requires this option
  339. or does not, and will not work properly in the alternative. This
  340. option implies `--force-text-input' and is incompatible with any
  341. other option except `-u'.
  342. `-b, --byte-transducer'
  343. This option causes standard input to be read one character at a
  344. time, evaluating the function given by the virtual code file each
  345. time. The function is used as a state transition function that
  346. takes a state and input to a next state and output. The output is
  347. written concurrently with the input operations. A given
  348. application will not work properly with an inappropriate setting
  349. of this option. This option implies `--force-text-input' and is
  350. incompatible with any other option except `-u'.
  351. `-u, --unparameterized'
  352. Normally `avram' guesses whether to use filter mode or parameter
  353. mode depending on whether there are any parameters. Selecting this
  354. option forces it to operate in filter mode regardless. Any
  355. parameters that may appear on the command line after the virtual
  356. code file name are ignored. This option may be used in conjunction
  357. with any other filter mode option.
  358. 
  359. File: avram.info, Node: Parameter Mode Options, Next: Command Line Syntax, Prev: Filter Mode Options, Up: User Manual
  360. 1.4 Parameter Mode Options
  361. ==========================
  362. The parameter mode options are listed below. Except as otherwise noted,
  363. any combination of parameter mode options may be selected together, and
  364. except as noted, the settings of these options can be varied without
  365. breaking the application.
  366. `-q, --quiet'
  367. `avram' normally informs the user when writing an output file with
  368. a short message to standard output. This option suppresses such
  369. messages. This option is compatible with any application and any
  370. other parameter mode option except `-a'.
  371. `-a, --ask-to-overwrite'
  372. Selecting this option will cause `avram' to ask permission
  373. interactively before overwriting an existing file, and to refrain
  374. from overwriting it without permission, in which case the contents
  375. that were to be written will be lost. This option overrides `-q'
  376. and is compatible with any other parameter mode option or
  377. application.
  378. `-.EXT'
  379. An option beginning with a dash followed by a period specifies a
  380. default extension for input file names. If `avram' doesn't find a
  381. file named on the command line, and the filename doesn't already
  382. contain a period, `avram' will try to find a file having a similar
  383. name but with the default extension appended. The default
  384. extension given by this option takes precedence over the hard
  385. coded default extensions of `.fun' and `.avm'. At most one default
  386. extension can be supplied. This option is compatible with any
  387. other parameter mode option and compatible with any application.
  388. `-d, --default-to-stdin'
  389. If no filename parameter appears on the command line (other than
  390. the name of the virtual code file), this option directs `avram' to
  391. read the contents of standard input as if it were specified as a
  392. command line parameter. (Standard input can also be specified
  393. explicitly as a dash. See *note Command Line Syntax::.) This
  394. option is compatible with any application and any other parameter
  395. mode option except `-m'.
  396. `-m, --map-to-each-file'
  397. Normally `avram' loads the entire contents of all files named on
  398. the command line into memory so as to evaluate the virtual machine
  399. code application on all of them together. This option can be used
  400. to save memory in the case of applications that operate on
  401. multiple files independently. It causes `avram' to load only one
  402. file at a time and to perform the relevant evaluation and output
  403. before loading the next one. Application specific options and
  404. standard input (if specified) are read only once and reused. This
  405. option is incompatible with `-d', and not necessarily compatible
  406. with all applications, although some may work both with and
  407. without it.
  408. `-i, --interactive'
  409. This option is used in the case of applications that interact with
  410. other programs through shell commands. An application that is
  411. meant to be invoked in this way requires this option and will not
  412. work without it, nor will applications that are not of this type
  413. work with it. This option is implied by `-t' and `-s', and is
  414. compatible with any other parameter mode option.
  415. `-s, --step'
  416. This option is used in the case of applications that interact with
  417. other programs through shell commands, similarly to `-i', and can
  418. substitute for it (see above). The option has the additional
  419. effect of causing shell commands issued by `avram' on behalf of
  420. the application to be written with their results to standard
  421. output, and to cause `avram' to pause after displaying each shell
  422. command until a key is pressed. This capability may be useful for
  423. debugging or auditing purposes but does not otherwise alter the
  424. effects of the application. This option is compatible with any
  425. other parameter mode option.
  426. `-t, --trace'
  427. This option is used in the case of applications that interact with
  428. other programs through shell commands, but only by way of the
  429. `interact' combinator, for which it provides developers a means of
  430. low level debugging, particularly deadlock detection. When this
  431. option is selected, a verbose trace of all characters exchanged
  432. between the functional transducer and the external application are
  433. written to standard output, along with some additional control flow
  434. diagnostics. This option is compatible with any other parameter
  435. mode option.
  436. `-p, --parameterized'
  437. Normally `avram' tries to guess whether to operate in filter mode
  438. or parameter mode based on the options used and the parameters. If
  439. there are no parameters and no options, it will default to filter
  440. mode, and try to read standard input. However, if this option is
  441. selected, it will use parameter mode (and therefore not try to read
  442. standard input unless required).
  443. 
  444. File: avram.info, Node: Command Line Syntax, Next: Diagnostics, Prev: Parameter Mode Options, Up: User Manual
  445. 1.5 Command Line Syntax
  446. =======================
  447. The command line parameters that follow the virtual code file name when
  448. `avram' is used in parameter mode (*note Parameter Mode::) are
  449. dependent on the specific application. However, all supported
  450. applications are constrained for implementation reasons to observe
  451. certain uniform conventions regarding their command line parameters,
  452. which are documented here to avoid needless duplication.
  453. The shell divides the command line into "arguments" separated by
  454. white space. Arguments containing white space or special characters
  455. used by the shell must be quoted or protected as usual. File names with
  456. wild cards in them are expanded by the shell before `avram' sees them.
  457. `avram' then extracts from the sequence of arguments a sequence of
  458. filenames and a sequence of options. Each option consists of a keyword
  459. and an optional parameter list. Filenames, keywords, and parameter
  460. lists are distinguished according to the following criteria.
  461. 1. An argument is treated as a keyword iff it meets these three
  462. conditions.
  463. a. It starts with a dash.
  464. b. It doesn't contain an equals sign.
  465. c. It doesn't consist solely of a dash.
  466. 2. An argument is treated as a parameter list iff it meets these four
  467. conditions.
  468. a. It doesn't begin with a dash.
  469. b. It either begins with an equals sign or doesn't contain one.
  470. c. It immediately follows an argument beginning with a dash, not
  471. containing an equals sign and not consisting solely of a dash.
  472. d. At least one of the following is true.
  473. 1. It doesn't contain a period, tilde, or path separator.
  474. 2. It contains a comma.
  475. 3. It can be interpreted as a C formatted floating point
  476. number.
  477. 3. An argument is treated as an input file name iff it meets these
  478. four conditions.
  479. a. It doesn't begin with a dash.
  480. b. It doesn't contain an equals sign.
  481. c. It doesn't contain a comma.
  482. d. At least one of the following is true.
  483. 1. It contains a period, tilde, or path separator.
  484. 2. It doesn't immediately follow an argument beginning with
  485. a dash, not consisting solely of a dash, and not
  486. containing an equals sign.
  487. 4. If an argument contains an equals sign but doesn't begin with one,
  488. the part on the left of the first equals sign is treated as a
  489. keyword and the part on the right is treated as a parameter list.
  490. 5. An argument consisting solely of a dash is taken to represent the
  491. standard input file.
  492. 6. An argument not fitting any of the above classifications is an
  493. error.
  494. These conventions are needed for `avram' to detect input file names
  495. in a general, position independent way, so that it can preload the files
  496. on behalf of the application. Many standard Unix utilities follow these conventions
  497. to a large extent, the exceptions being those that employ non-filename
  498. arguments without distinguishing syntax, and use positional or other ad
  499. hoc methods of command line interpretation. A drop-in replacement for
  500. such an application could nevertheless be implemented using `avram'
  501. with an appropriate wrapper script, similar to the approach recommended
  502. in *note Example Script::, but with suitable keywords inserted prior to
  503. the ambiguous arguments.
  504. 
  505. File: avram.info, Node: Diagnostics, Next: Security, Prev: Command Line Syntax, Up: User Manual
  506. 1.6 Diagnostics
  507. ===============
  508. The means exists for virtual code applications to have run time error
  509. messages written to standard error on their behalf by `avram'. Any
  510. error messages not documented here originate with an application and
  511. should be documented by it.
  512. Most error messages originating from `avram' are prefaced by the
  513. application name (i.e., the name of the file containing the virtual
  514. machine code), but will be prefaced by `avram:' if the error is caused
  515. by a problem loading this file itself. Error messages originating from
  516. virtual code applications are the responsibility of their respective
  517. authors and might not be prefaced by the application name.
  518. The run time errors not specifically raised by the application can be
  519. classified as internal errors, i/o errors, overflow errors, file format
  520. errors, application programming errors, and configuration related
  521. errors.
  522. Some error messages include a code number. The number identifies the
  523. specific point in the source code where the condition was detected, for
  524. the benefit of the person maintaining it.
  525. * Menu:
  526. * Internal Errors::
  527. * i/o Errors::
  528. * Overflow Errors::
  529. * File Format Errors::
  530. * Application Programming Errors::
  531. * Configuration Related Errors::
  532. * Other Diagnostics and Warnings::
  533. 
  534. File: avram.info, Node: Internal Errors, Next: i/o Errors, Prev: Diagnostics, Up: Diagnostics
  535. 1.6.1 Internal Errors
  536. ---------------------
  537. Internal errors should never occur unless the `avram' source code has
  538. been carelessly modified, except as noted in *note Bugs::. There are
  539. two kinds.
  540. `APPLICATION-NAME: virtual machine internal error (code NN)'
  541. Most internal errors would be reported by a message of this form
  542. if they were to occur. It indicates that some required invariant
  543. was not maintained. In such cases, the program terminates
  544. immediately, and any results already produced are suspect.
  545. `APPLICATION-NAME: NN unreclaimed STRUCT-NAMES'
  546. A message of this form could be printed at the end of an otherwise
  547. successful run. `avram' maintains a count of the number of units
  548. allocated for various data structures, and checks that they are
  549. all reclaimed eventually as a safeguard against memory leaks. This
  550. message indicates that some memory remains unaccounted for.
  551. If a repeatable internal error is discovered, please email a bug
  552. report and a small representative test case to the author at
  553. <[email protected]>. Include the version number of
  554. `avram', which you can get by running `avram --version'.
  555. 
  556. File: avram.info, Node: i/o Errors, Next: Overflow Errors, Prev: Internal Errors, Up: Diagnostics
  557. 1.6.2 i/o Errors
  558. ----------------
  559. These error messages are prefaced with the name of the application. A
  560. further explanation as to the reason, obtained from the standard
  561. `strerror()' utility, is appended to the messages below if possible.
  562. `APPLICATION-NAME: can't read FILENAME'
  563. A file was not able to be opened for reading, typically because it
  564. was not found or because the user does not have permission. The
  565. file name is displayed with special characters expanded but
  566. without any default extensions or search paths that may have been
  567. tried. If you think a file exists and should have been found,
  568. there may be a problem with your `AVMINPUTS' environment variable
  569. (*note Environment::).
  570. `APPLICATION-NAME: can't write FILENAME'
  571. A file was not able to be opened for writing.
  572. `APPLICATION-NAME: can't write to FILENAME'
  573. A file was successfully opened for writing but became impossible to
  574. write thereafter.
  575. `APPLICATION-NAME: can't spawn COMMAND'
  576. An attempt to execute a shell command on behalf of an interactive
  577. application failed during the `exp_popen()' call to the
  578. `libexpect' library.
  579. `APPLICATION-NAME: can't close FILENAME'
  580. A call to the standard C procedure `fclose()' failed due to
  581. unforeseen circumstances. The error is non-fatal but the file
  582. should be checked for missing data.
  583. 
  584. File: avram.info, Node: Overflow Errors, Next: File Format Errors, Prev: i/o Errors, Up: Diagnostics
  585. 1.6.3 Overflow Errors
  586. ---------------------
  587. These errors are reported by the application name prefacing one of the
  588. following messages, except as noted below.
  589. `APPLICATION-NAME: counter overflow (code NN)'
  590. An overflow occurred in an unsigned long integer being used as a
  591. reference counter or something similar. This situation is very
  592. unlikely.
  593. `APPLICATION-NAME: memory overflow (code NN)'
  594. There wasn't enough memory to build an internal data structure. The
  595. most likely cause is an attempt to operate on input files that are
  596. too large. Standard remedies apply.
  597. The memory overflow or counter overflow messages can also be reported
  598. without the application name preface or a code number. In these cases,
  599. they arise in the course of evaluating the function given by the
  600. application, rather than by loading the input files.
  601. A counter overflow in this case is possible if the application
  602. attempts to compute the size of a very large, shared structure using
  603. native integer arithmetic.
  604. Memory overflows are possible due to insufficient memory for a valid
  605. purpose, but may also occur due to a non-terminating recursion in the
  606. virtual machine code. To prevent thrashing or other bad effects from
  607. runaway code, the `ulimit' shell command is your friend.
  608. 
  609. File: avram.info, Node: File Format Errors, Next: Application Programming Errors, Prev: Overflow Errors, Up: Diagnostics
  610. 1.6.4 File Format Errors
  611. ------------------------
  612. Certain application crashes result from an application not adhering to
  613. the required conventions about data and file formats, or because the
  614. application was invoked in the wrong mode (*note Modes of Operation::).
  615. These are the following.
  616. `APPLICATION-NAME: invalid text format (code NN)'
  617. An application that was expected to return a string of characters
  618. to be written to a text file returned data that did not correspond
  619. to any valid character representation.
  620. `APPLICATION-NAME: null character in prompt'
  621. An interactive application (invoked rightly or wrongly with `-i',
  622. `-t', or `-s') is required to exchange strings of non-null
  623. characters internally with `avram', and used a null.
  624. `APPLICATION-NAME: invalid file name (code NN)'
  625. The data structure representing a file obtained from an application
  626. has a name consisting of something other than character strings.
  627. This error could be the result of a filter mode application (*note
  628. Filter Mode::) being invoked in parameter mode.
  629. (*note Parameter Mode::)
  630. `APPLICATION-NAME: null character in file name'
  631. Similar to the above errors.
  632. `APPLICATION-NAME: bad character in file name'
  633. Slashes, backslashes, and unprintable characters other than spaces
  634. are also prohibited in file names.
  635. `APPLICATION-NAME: invalid output preamble format'
  636. According the format used by `avram' for data files, a data file
  637. may contain an optional text portion, known as the preamble. This
  638. error occurs when a data file obtained from an application can not
  639. be written because the preamble is something other than a list of
  640. character strings.
  641. `APPLICATION-NAME: invalid file specification'
  642. This error occurs in situations where the data structure for a file
  643. obtained by evaluating the application is too broken to permit any
  644. more specific diagnosis.
  645. `avram: invalid raw file format in APPLICATION-NAME'
  646. The file containing the virtual machine code was not able to be
  647. loaded, because the code was not in a recognizable format. Either
  648. the file has become corrupted, the compiler that generated it has a
  649. bug in it, or the wrong file was used as a virtual code file.
  650. 
  651. File: avram.info, Node: Application Programming Errors, Next: Configuration Related Errors, Prev: File Format Errors, Up: Diagnostics
  652. 1.6.5 Application Programming Errors
  653. ------------------------------------
  654. A further class of application crashes results from miscellaneous bugs
  655. in the application. These require the application to be debugged and
  656. have no user level explanation or workaround, but are listed here for
  657. reference. These messages are not normally prefaced by the application
  658. name when reported unless the application elects to do so, except for
  659. the `invalid profile identifier' message.
  660. * `invalid recursion'
  661. * `invalid comparison'
  662. * `invalid deconstruction'
  663. * `invalid transpose'
  664. * `invalid membership'
  665. * `invalid distribution'
  666. * `invalid concatenation'
  667. * `invalid assignment'
  668. * `unrecognized combinator (code NN)'
  669. * `APPLICATION-NAME: invalid profile identifier'
  670. * `unsupported hook'
  671. 
  672. File: avram.info, Node: Configuration Related Errors, Next: Other Diagnostics and Warnings, Prev: Application Programming Errors, Up: Diagnostics
  673. 1.6.6 Configuration Related Errors
  674. ----------------------------------
  675. The source code distribution of `avram' incorporates a flexible
  676. configuration script allowing it to be installed on a variety of
  677. platforms. Not all platforms allow support for all features. It is also
  678. anticipated that new features may be added to `avram' from time to
  679. time. Some problems may therefore occur due to features not being
  680. supported at your site for either of these reasons. The following error
  681. messages are relevant to these situations.
  682. `unsupported hook'
  683. If it's not simply due to an application programming error (*note
  684. Application Programming Errors::) this message may be the result of
  685. trying to use an application that requires a newer version of
  686. `avram' than the one installed, even though applications should
  687. avoid this problem by checking the version number at run time. If
  688. this is the reason, the solution would be to install the latest
  689. version.
  690. `APPLICATION-NAME: I need avram linked with FOO, BAR and BAZ.'
  691. A message of the this form indicates that a new installation may be
  692. needed. At this writing (11/11/1), `avram' may report this message
  693. with respect to `libexpect5.32', `tcl8.3', and `libutil' if any of
  694. the `-i', `-t', or `-s' options is used on a system where not all
  695. of these libraries were detected when `avram' was installed from a
  696. source distribution. (See *note Parameter Mode Options::.)
  697. Because `avram' is useful even without interactive applications,
  698. these libraries are not considered absolute prerequisites by the
  699. configuration script.
  700. `avram: can't emulate version VERSION'
  701. The `--emulation=VERSION' option obviously won't work if the
  702. requested version is newer than the installed version, or if it is
  703. not a valid version number (*note General Options::). When that
  704. happens, this message is printed instead and `avram' terminates.
  705. `avram: multiple version specifications'
  706. The `--emulation=VERSION' option can be used at most once on a
  707. command line. This message is printed if it is used more than
  708. once. If you only typed it once and got this message, check your
  709. aliases and wrapper scripts before reporting a bug.
  710. `avram: unrecognized option: OPTION-NAME'
  711. may mean that a command line option has been misspelled, or may be
  712. another sign of an obsolete version of `avram'. This message will
  713. be followed by a usage summary similar to that of the `--help'
  714. option. (*note General Options::).
  715. `APPLICATION-NAME: warning: search paths not supported'
  716. If the `argz.h' header file was not detected during configuration,
  717. `avram' will not be able to support search paths in the
  718. `AVMINPUTS' environment variable (*note Environment::). This
  719. message is a warning that the environment variable is being
  720. ignored. If the warning is followed by an i/o error
  721. (*note i/o Errors::), the latter may be due to a file being in a
  722. path that was not searched for this reason. A workaround is to
  723. specify the full path names of all input files outside the current
  724. working directory. If you don't need search paths, you can get rid
  725. of this message by undefining `AVMINPUTS'.
  726. 
  727. File: avram.info, Node: Other Diagnostics and Warnings, Prev: Configuration Related Errors, Up: Diagnostics
  728. 1.6.7 Other Diagnostics and Warnings
  729. ------------------------------------
  730. `avram: multiple -.EXT options; all but last ignored'
  731. This message is written when more than one default extension is
  732. given as a command line parameter. At most one default extension
  733. is allowed. If more than one is given, only the last one is used.
  734. The error is non-fatal and `avram' will try to continue. If you
  735. need more than one default extension, consider using the hard
  736. coded default extensions of `.fun' and `.avm', or hacking the
  737. shell script in which the `avram' command line appears.
  738. `APPLICATION NAME: empty operator'
  739. This message probably means that the virtual code file is corrupt
  740. or invalid.
  741. usage summary
  742. For any errors in usage not covered by other diagnostics, such as
  743. incompatible combinations of options, `avram' prints a message to
  744. standard error giving a brief summary of options, similar to the
  745. output from `avram --help'. (See *note General Options::.)
  746. 
  747. File: avram.info, Node: Security, Next: Example Script, Prev: Diagnostics, Up: User Manual
  748. 1.7 Security
  749. ============
  750. A few obvious security considerations are relevant to running untrusted
  751. virtual code applications. These points are only as reliable as the
  752. assumption that the `avram' executable has not been modified to the
  753. contrary.
  754. * The applications with the best protection from malicious code are
  755. those that run in filter mode, because they have no access to any
  756. information not presented to them in standard input, nor the
  757. ability to affect anything other than the contents of standard
  758. output (provided that the `--jail' command line option is used).
  759. The worst they can do is use up a lot of memory, which can be
  760. prevented with the `ulimit' command. Unfortunately, not all
  761. applications are usable in this mode.
  762. * Parameter mode applications that do not involve the `-i', `-t' or
  763. `-s' options are almost as safe (also assuming `--jail'). They
  764. have (read-only) access to environment variables, and to the files
  765. that are indicated explicitly on the command line. If standard
  766. input is one of the files (as indicated by the use of `-' as a
  767. parameter), the virtual code application may infer the current
  768. date and time. However, a parameter mode application may write
  769. any file that the user has permission to write. The
  770. `--ask-to-overwrite' option should be used for better security, or
  771. at least the `--quiet' option should not be used. The virtual
  772. code can neither override nor detect the use of these options.
  773. * Interactive parameter mode applications (those that use either the `-i',
  774. `-t' or `-s' options) are the least secure because they can
  775. execute arbitrary shell commands on behalf of the user. This
  776. statement also applies to filter mode and parameter mode
  777. applications where the `--jail' option is not used. Use of
  778. `--step' is preferable to `-i' for making an audit trail of all
  779. commands executed, but the application could probably subvert it.
  780. The `--step' option may be slightly better because it can allow
  781. the user to inspect each command and interrupt it if appropriate.
  782. However, in most cases a command will not be displayed until it is
  783. already executed. Commands executed by non-interactive
  784. applications normally will display no output to that effect. A
  785. `chroot' environment may be the only secure way of running
  786. untrusted interactive applications.
  787. 
  788. File: avram.info, Node: Example Script, Next: Files, Prev: Security, Up: User Manual
  789. 1.8 Example Script
  790. ==================
  791. It is recommended that the application developer (or the compiler)
  792. package virtual machine code applications as shell scripts with the
  793. `avram' command line embedded in them. This style relieves the user of
  794. the need to remember the appropriate virtual machine options for
  795. invoking the application, which are always the same for a given
  796. application, or even to be aware of the virtual machine at all.
  797. Here is a script that performs a similar operation to the standard Unix
  798. `cat' utility.
  799. #!/bin/sh
  800. #\
  801. exec avram --force-text-input --default-to-stdin "$0" "$@"
  802. sKYQNTP\
  803. That is, it copies the contents of a file whose name is given on the
  804. command line to standard output, or copies standard input to standard
  805. output if no file name is given. This script can be marked executable (with
  806. `chmod') and run by any user with the directory of the `avram'
  807. executable in his or her `PATH' environment variable, even if `avram'
  808. had to be installed in a non-standard directory such as `~/bin'.
  809. The idea for this script is blatantly lifted from the `wish' manpage.
  810. The first line of the script invokes a shell to process what follows.
  811. The shell treats the second line as a comment and ignores it. Based on
  812. the third line, the shell invokes `avram' with the indicated options,
  813. the script itself as the next argument, and whatever command line
  814. parameters were initially supplied by the user as the remaining
  815. arguments. The rest of the script after that line is never processed by
  816. the shell.
  817. When `avram' attempts to load the shell script as a virtual machine
  818. code file, which happens as a result of it being executed by the shell,
  819. it treats the first line as a comment and ignores it. It also treats
  820. the second line as a comment, but takes heed of the trailing backslash,
  821. which is interpreted as a comment continuation character. It therefore
  822. also treats the third line as a comment and ignores it. Starting with
  823. the fourth line, it reads the virtual code, which is in a binary data
  824. format encoded with printable characters, and evaluates it.
  825. 
  826. File: avram.info, Node: Files, Next: Environment, Prev: Example Script, Up: User Manual
  827. 1.9 Files
  828. =========
  829. `./profile.txt'
  830. This file is written automatically by `avram' on behalf of
  831. applications that include profile annotations. It lists the number
  832. of invocations for each annotated part of the application, the
  833. total amount of time spent on it (in relative units), the average
  834. amount of time for each invocation, and the percentage of time
  835. relative to the remainder of the application. The exact format is
  836. undocumented and subject to change.
  837. 
  838. File: avram.info, Node: Environment, Next: Bugs, Prev: Files, Up: User Manual
  839. 1.10 Environment
  840. ================
  841. An environment variable `AVMINPUTS' can be made to store a list of
  842. directories (using the `set' or `export' commands) that `avram' will
  843. search for input files. The directories should be separated by colons,
  844. similarly to the `PATH' environment variable.
  845. The search paths in `AVMINPUTS' apply only to the names of input
  846. files given on the command line (*note Command Line Syntax::) when
  847. `avram' is invoked in parameter mode (*note Parameter Mode::). They do
  848. not apply to the name of the virtual code file, which is always assumed
  849. to be either absolute or relative to the current working directory
  850. (this assumption being preferable in the case of a script like that of
  851. *note Example Script::).
  852. Starting in the first directory in the list of `AVMINPUTS', `avram'
  853. searches for a file exactly as its name appears on the command line
  854. (subject to the expansion of special characters by the shell). If it is
  855. not found and the name does not contain a period, but a command line
  856. option of `-.EXT' has been used, `avram' will then search for a file
  857. with that name combined with the extension `.EXT'. If `-.EXT' has not
  858. been used or if no matching file is found with it, `avram' tries the
  859. extensions of `.avm' and `.fun' in that order, provided the given file
  860. name contained no periods. If no match is found for any of those cases,
  861. `avram' proceeds to search the next directory in the list obtained from
  862. `AVMINPUTS', and so on. It stops searching when the first match is
  863. found. For subsequent input files, the search begins again at the first
  864. directory.
  865. If `AVMINPUTS' is defined, the current working directory is not
  866. searched for input files unless it is listed. If it is empty or not defined,
  867. a default list of search paths is used, currently
  868. .:/usr/local/lib/avm:/usr/lib/avm:/lib/avm:/opt/avm:/opt/lib/avm\
  869. :/usr/local/share/avm:/usr/share/avm:/share/avm:/opt/avm:/opt/share/avm
  870. These paths are defined in `avram.c' and can be changed by recompiling
  871. it.
  872. 
  873. File: avram.info, Node: Bugs, Prev: Environment, Up: User Manual
  874. 1.11 Bugs
  875. =========
  876. There are no known bugs outstanding, except for any that may be
  877. inherent in the external library functions. However, `avram' has been
  878. used most extensively on GNU/Linux systems, and the prospect of
  879. portability issues with new or lesser used features on other systems
  880. can't be excluded.
  881. Though not observed in practice, it's theoretically possible to blow
  882. the stack by passing enough functions as arguments to library functions
  883. that pass more functions to library functions (e.g., by using nested
  884. calls to the gsl integration functions meant for a single variable to
  885. evaluate a very high dimensional multiple integral). In all other cases
  886. only dynamic heap storage or a constant amount of stack space is used.
  887. In particular, this issue is _not_ relevant to virtual code
  888. applications that don't use external libraries, or that don't pass
  889. functions to them as arguments.
  890. `avram' is designed to recover gracefully from memory overflows by
  891. always checking for `NULL' results from `malloc()' or otherwise
  892. trapping functions that allocate memory. In the event of an overflow,
  893. it conveys an appropriate error message to the virtual code application
  894. to be handled by the usual exception handling mechanisms. However,
  895. there is currently no way for a virtual code application to detect in
  896. advance whether sufficient memory is available, nor for it to resume
  897. normal operation once an exception occurs. Furthermore, it has been
  898. observed on some systems including Irix and 2.4 series Linux kernels
  899. that the `avram' process is killed automatically for attempting to
  900. allocate too much memory rather than given the chance to recover.
  901. Please send bug reports to <[email protected]>.
  902. 
  903. File: avram.info, Node: Virtual Machine Specification, Next: Library Reference, Prev: User Manual, Up: Top
  904. 2 Virtual Machine Specification
  905. *******************************
  906. This chapter contains a description of the virtual machine implemented
  907. by `avram', from the point of view of a person wishing to write a
  908. compiler that generates code for it. Before reading this chapter,
  909. readers should at least skim *note User Manual:: in order to see the big
  910. picture. Topics covered in this chapter include data representations,
  911. virtual code semantics, and file formats. A toy programming language is
  912. introduced for illustrative purposes. The sections in this chapter might
  913. not make sense if read out of order the first time through. The last
  914. section, *note Virtual Code Semantics::, contains many equations that
  915. may be difficult to read in the info or html renderings. The printed
  916. version is recommended for anyone who really wants to comprehend this
  917. material.
  918. * Menu:
  919. * Raw Material::
  920. * Concrete Syntax::
  921. * File Format::
  922. * Representation of Numeric and Textual Data::
  923. * Filter Mode Interface::
  924. * Parameter Mode Interface::
  925. * Virtual Code Semantics::
  926. 
  927. File: avram.info, Node: Raw Material, Next: Concrete Syntax, Prev: Virtual Machine Specification, Up: Virtual Machine Specification
  928. 2.1 Raw Material
  929. ================
  930. The purpose of this section is to instill some basic concepts about the
  931. way information is stored or communicated by the virtual machine, which
  932. may be necessary for an understanding of subsequent sections.
  933. The virtual machine represents both programs and data as members of a
  934. semantic domain that is straightforward to describe. Lisp users and
  935. functional programmers may recognize familiar concepts of atoms and lists
  936. in this description. However, these terms are avoided for the moment,
  937. in order to keep this presentation self contained and to prevent
  938. knowledgeable readers from inferring any unintended meanings.
  939. As a rule, it is preferable to avoid overspecifying any theoretical
  940. artifact. In this spirit, the set of entities with which the virtual
  941. machine is concerned can be defined purely in terms of the properties we
  942. need it to have.
  943. _A distinguished element_
  944. A particular element of the set is designated, arbitrarily or
  945. otherwise, as a distinguished element. Given any element of the
  946. set, it is always possible to decide whether or not it is the
  947. distinguished element. The set is non-empty and such an element
  948. exists.
  949. _A binary operator_
  950. A map from pairs of elements of the set to elements of the set
  951. exists and meets these conditions.
  952. * It associates a _unique_ element of the set with any given
  953. ordered pair of elements from the set.
  954. * It does not associate the distinguished element with any pair
  955. of elements.
  956. For the sake of concreteness, an additional constraint is needed:
  957. _the set has no proper subset satisfying the above conditions_. Any
  958. number of constructions remain within these criteria, but there is no
  959. need to restrict them further, because they are all equivalent for our
  960. purposes.
  961. To see that these properties provide all the structure we need for
  962. general purpose computation, we may suppose some given set `S' and an
  963. operator `cons' having them are fixed, and infer the following points.
  964. * `S' contains at least one element, the distinguished element. Call
  965. it `nil'.
  966. * The pair `(nil,nil)' is a pair of elements of `S', so there must
  967. be an element of `S' that `cons' associates with it. We can denote
  968. this element `cons(nil,nil)'.
  969. * As no pair of elements is associated with the distinguished
  970. element, `cons(nil,nil)' must differ from `nil', so `S' contains
  971. at least two distinct elements.
  972. * The pair `(nil,cons(nil,nil))' therefore differs from `(nil,nil)',
  973. but because it is yet another pair of elements from `S', there
  974. must be an element associated with it by the operator. We can
  975. denote this element as `cons(nil,cons(nil,nil))'.
  976. * Inasmuch as the operator associates every pair of elements with a
  977. _unique_ element, `cons(nil,cons(nil,nil))' must differ from the
  978. element associated with any other pair of elements, so it must
  979. differ from `cons(nil,nil)', and we conclude that `nil',
  980. `cons(nil,nil)' and `cons(nil,cons(nil,nil))' constitute three
  981. distinct elements of the set `S'.
  982. * By defining `cons(cons(nil,nil),nil)' and
  983. `cons(cons(nil,nil),cons(nil,nil))' analogously and following a
  984. similar line of reasoning, one may establish the existence of two
  985. more distinct elements of `S'.
  986. It is not difficult to see that an argument in more general terms
  987. could show that the inclusion of infinitely many elements in `S' is
  988. mandated by the properties of the `cons' operator. Furthermore, every
  989. element of `S' other than `nil' owes its inclusion to being associated
  990. with some other pair of elements by `cons', because if it were not, its
  991. exclusion would permit a proper subset of `S' to meet all of the above
  992. conditions. We can conclude that `S' contains exactly `nil' and the
  993. countable infinitude of elements of the form `cons(x,y)', where `x' and
  994. `y' are either `nil' or something of the form `cons(...)' themselves.
  995. Some specific examples of sets and operators that have the required
  996. properties are as follows.
  997. * the set of natural numbers, with `0' as the distinguished element,
  998. and the `cons' operator defined by `cons(X,Y) = ((X+Y)(X+Y+1))/2 +
  999. Y + 1'
  1000. * a set of balanced strings of parentheses, with `()' as the
  1001. distinguished element, and `cons' defined as string concatenation
  1002. followed by enclosure in parentheses
  1003. * a set of ordered binary trees, with the empty tree as the
  1004. distinguished element, and the `cons' operator as that which takes
  1005. an ordered pair of trees to the tree having them as its descendents
  1006. * a set containing only its own Cartesian product and an arbitrary
  1007. but fixed element `nil', with `cons' being the identity function
  1008. Each of these models may suggest a different implementation, some of
  1009. which are more practical than others. The remainder of this document is
  1010. phrased somewhat imprecisely in terms of a combination of the latter
  1011. two. The nature of the set in question is not considered further, and
  1012. elements of the set are described as "trees" or "lists". The distinguished
  1013. element is denoted by `nil' and the operator by `cons'. Where no
  1014. ambiguity results, `cons(x,y)' may be written simply as `(x,y)'. These
  1015. terms should not be seen as constraints on the implementation.
  1016. 
  1017. File: avram.info, Node: Concrete Syntax, Next: File Format, Prev: Raw Material, Up: Virtual Machine Specification
  1018. 2.2 Concrete Syntax
  1019. ===================
  1020. The previous section has developed a basic vocabulary for statements
  1021. such as "the virtual machine code for the identity function is `(nil,(nil,nil))'",
  1022. which are elaborated extensively in the subsequent sections on code and
  1023. data formats. However, a description in this style would be inadequate
  1024. without an explanation of how such an entity as `(nil,(nil,nil))' is
  1025. communicated to `avram' in a virtual machine code file. The purpose of
  1026. this section is to fill the gap by explaining exactly how any given
  1027. tree would be transformed to its concrete representation.
  1028. The syntax is based on a conversion of the trees to bit strings, followed
  1029. by grouping the bits into blocks of six, which are then encoded by
  1030. printable characters. Although anyone is free to modify `avram', it is
  1031. recommended that the concrete syntax described here be maintained for
  1032. the sake of portability of virtual machine code applications.
  1033. Building a tree by reading the data from a file requires a more
  1034. difficult algorithm than the one presented in this section, and is not
  1035. considered because it's not strictly necessary for a compiler.
  1036. Procedures for both reading and writing are available to C and C++
  1037. users as part of the `avram' library, and are also easily invoked on
  1038. the virtual code level.
  1039. * Menu:
  1040. * Bit String Encoding::
  1041. * Blocking::
  1042. 
  1043. File: avram.info, Node: Bit String Encoding, Next: Blocking, Prev: Concrete Syntax, Up: Concrete Syntax
  1044. 2.2.1 Bit String Encoding
  1045. -------------------------
  1046. The conversion from trees to bit strings might have been done in several ways,
  1047. perhaps the most obvious being based on a preorder traversal with each
  1048. vertex printed as it is traversed. By this method, the entire encoding
  1049. of the left descendent would precede that of the right in the bit
  1050. string. This alternative is therefore rejected because it imposes
  1051. unnecessary serialization on communication.
  1052. It is preferable for the encodings of both descendents of a tree to
  1053. be interleaved to allow concurrent transmission. Although there is
  1054. presently no distributed implementation of the virtual machine and hence none
  1055. that takes advantage of this possibility, it is better to plan ahead
  1056. than to be faced with backward compatibility problems later.
  1057. The preferred algorithm for encoding a tree as a bit string employs a
  1058. queue. The queue contains trees and allows them to be processed in a first-in
  1059. first-out order. Intuitively, the algorithm works by traversing the
  1060. tree in level order. To print a tree `T' as a string of `1's and `0's,
  1061. it performs the following steps.
  1062. Initialize the queue to contain only `T'
  1063. while the queue is not empty do
  1064. if the front element of the queue is `nil' then
  1065. print `0'
  1066. else if the front element of the queue is of the form `cons(x,y)' then
  1067. print `1'
  1068. append `x' to the back of the queue
  1069. append `y' to the back of the queue
  1070. end if
  1071. remove the front element of the queue
  1072. end while
  1073. This algorithm presupposes that any given tree `cons(x,y)' can be
  1074. "deconstructed" to obtain `x' and `y'. The computability of such an
  1075. operation is assured in theory by the uniqueness property of the `cons'
  1076. operator, regardless of the representation chosen. If the trees are
  1077. implemented with pointers in the obvious way, their deconstruction is a
  1078. trivial constant time operation.
  1079. As an example, running the following tree through the above algorithm
  1080. results in the bit string
  1081. `111111101011110010001001100010100010100100100'.
  1082. cons(
  1083. cons(
  1084. cons(nil,cons(nil,cons(nil,nil))),
  1085. cons(nil,cons(nil,nil))),
  1086. cons(
  1087. cons(
  1088. cons(nil,cons(nil,cons(nil,cons(nil,nil)))),
  1089. cons(nil,nil)),
  1090. cons(
  1091. cons(
  1092. cons(nil,cons(nil,cons(cons(nil,cons(nil,nil)),nil))),
  1093. cons(nil,nil)),
  1094. nil)))
  1095. 
  1096. File: avram.info, Node: Blocking, Prev: Bit String Encoding, Up: Concrete Syntax
  1097. 2.2.2 Blocking
  1098. --------------
  1099. After the bit string is obtained as described above, it is grouped into
  1100. blocks of six. Continuing with the example, the string
  1101. 111111101011110010001001100010100010100100100
  1102. would be grouped as
  1103. 111111 101011 110010 001001 100010 100010 100100 100
  1104. Because the number of bits isn't a multiple of six, the last group has
  1105. to be padded with zeros, to give
  1106. 111111 101011 110010 001001 100010 100010 100100 100000
  1107. Each of these six bit substrings is then treated as a binary number,
  1108. with the most significant bit on the left. The numbers expressed in
  1109. decimal are
  1110. 63 43 50 9 34 34 36 32
  1111. The character codes for the characters to be written are obtained by
  1112. adding sixty to each of these numbers, so as to ensure that they will be
  1113. printable characters. The resulting character codes are
  1114. 123 103 110 69 94 94 96 92
  1115. which implies that the tree in the example could be written to a file as
  1116. `{gnE^^`\'.
  1117. 
  1118. File: avram.info, Node: File Format, Next: Representation of Numeric and Textual Data, Prev: Concrete Syntax, Up: Virtual Machine Specification
  1119. 2.3 File Format
  1120. ===============
  1121. A virtual code file consists of an optional text preamble, followed by
  1122. the concrete representation for a tree. The latter uses the syntax
  1123. described in the previous section. The purpose of this section is to
  1124. specify the remaining details of the file format.
  1125. The format for virtual code files may also be used for other purposes
  1126. by virtual code applications, as it is automatically detected and parsed
  1127. by `avram' when used in an input file, and can be automatically written
  1128. to output files at the discretion of the application.
  1129. Other than virtual code files, input files not conforming to this
  1130. format are not an error as far as `avram' is concerned, because they are assumed
  1131. to be text files. Applications can detect in virtual code the
  1132. assumption that is made and report an error if appropriate.
  1133. Although the data file format includes no checksums or other explicit methods
  1134. of error detection, the concrete syntax itself provides a good measure
  1135. of protection against undetected errors. The probability is vanishingly
  1136. small that a random alteration to any valid encoding leaves it intact,
  1137. because every bit in the sequence either mandates or prohibits the
  1138. occurrence of two more bits somewhere after it. Errors in different
  1139. parts of the file would have to be consistent with one another to go
  1140. unnoticed.
  1141. * Menu:
  1142. * Preamble Section::
  1143. * Data Section::
  1144. 
  1145. File: avram.info, Node: Preamble Section, Next: Data Section, Prev: File Format, Up: File Format
  1146. 2.3.1 Preamble Section
  1147. ----------------------
  1148. * A file may contain at most one preamble.
  1149. * The preamble, if any, is a consecutive sequence of lines beginning
  1150. with the first line in the file.
  1151. * The first line of the preamble must begin with a hash (`#')
  1152. character.
  1153. * Subsequent lines of the preamble must either begin with a hash, or
  1154. immediately follow a line that ends with a backslash (`\')
  1155. character (or both).
  1156. 
  1157. File: avram.info, Node: Data Section, Prev: Preamble Section, Up: File Format
  1158. 2.3.2 Data Section
  1159. ------------------
  1160. * The data or virtual code section of the file begins on the first
  1161. line of the file that isn't part of the preamble.
  1162. * The data section may not contain any hashes, white space, or other
  1163. extraneous characters other than line breaks.
  1164. * If line breaks are ignored, the data section contains a sequence
  1165. of characters expressing a single tree in the concrete syntax
  1166. described in *note Concrete Syntax::.
  1167. 
  1168. File: avram.info, Node: Representation of Numeric and Textual Data, Next: Filter Mode Interface, Prev: File Format, Up: Virtual Machine Specification
  1169. 2.4 Representation of Numeric and Textual Data
  1170. ==============================================
  1171. As noted already, virtual code applications are specified by functions
  1172. operating on elements of a set having the properties described in *note
  1173. Raw Material::, which are convenient to envision as ordered binary
  1174. trees or pairs of `nil'. However, virtual code applications normally
  1175. deal with numeric or textual data, for example when they refer to the
  1176. contents of a text file. It is therefore necessary for the application
  1177. and the virtual machine emulator to agree on a way of describing textual
  1178. or numeric data with these trees.
  1179. The purpose of this section is to explain the basic data structures
  1180. used in the exchange of information between `avram' and a virtual code
  1181. application. For example, an explanation is needed for statements like
  1182. "an application invoked with the `--baz' option is expected to return a
  1183. pair `(FOO,BAR)', where `FOO' is a list of character strings ...", that
  1184. are made subsequently in this document. Such statements should be
  1185. understood as referring to the trees representing the pairs, lists,
  1186. character strings, etc., according to the conventions explained below.
  1187. _Characters_
  1188. An arbitrarily chosen set of 256 trees is used to represent the
  1189. character set. They are listed in *note Character Table::. For
  1190. example, the letter `A' is represented by
  1191. `(nil,(((nil,(nil,(nil,nil))),nil),(nil,nil)))'. That means that
  1192. when an application wants the letter `A' written to a text file, it
  1193. returns something with this tree in it.
  1194. _Booleans_
  1195. The value of `false' is represented by `nil', and the value of
  1196. `true' is represented by `(nil,nil)'.
  1197. _Pairs_
  1198. Given any two items of data X1 and X2, having the respective
  1199. representations R1 and R2, the pair `(X1,X2)' has the
  1200. representation `cons(R1,R2)'.
  1201. _Lists_
  1202. A list of the items X1, X2 ... XN with respective representations
  1203. R1 through RN is represented by the tree
  1204. `cons(R1,cons(R2...cons(RN,nil)...))'. In other words, lists are
  1205. represented as pairs whose left sides are the heads and whose
  1206. right sides are the tails. The empty list is identified with
  1207. `nil'. Lists of arbitrary finite length can be accommodated.
  1208. _Naturals_
  1209. A number of the form `B0 + 2B1 + 4B2 + ... + 2^n BN', where each
  1210. `Bi' is `0' or `1', is represented by a tree of the form
  1211. `cons(T0,cons(T1...cons(TN,nil)...))' where each `Ti' is `nil' if
  1212. the corresponding `Bi' is `0', and `(nil,nil)' otherwise. Note that
  1213. the numbers `Bi' are exactly the bits written in the binary
  1214. expansion of the number, with `B0' being the least significant bit.
  1215. _Strings_
  1216. are represented as lists of characters.
  1217. `avram' imposes no more of a "type discipline" than necessary to a
  1218. workable interface between it and an application. This selection of
  1219. types and constructors should not be seen as constraining what a
  1220. compiler writer may wish to have in a source language.
  1221. 
  1222. File: avram.info, Node: Filter Mode Interface, Next: Parameter Mode Interface, Prev: Representation of Numeric and Textual Data, Up: Virtual Machine Specification
  1223. 2.5 Filter Mode Interface
  1224. =========================
  1225. From the point of view of the application developer or compiler writer,
  1226. there are parameter mode applications, which are discussed in *note
  1227. Parameter Mode Interface::, and filter mode applications, which are
  1228. discussed in this section. Of the latter, there are mainly three kinds:
  1229. those that read one character at a time, those that read a line at a
  1230. time, and those that read the whole standard input file at once. Each
  1231. of them is invoked with different options and expected to follow
  1232. different calling conventions. This section summarizes these
  1233. conventions.
  1234. * Menu:
  1235. * Loading All of Standard Input at Once::
  1236. * Line Maps::
  1237. * Byte Transducers::
  1238. 
  1239. File: avram.info, Node: Loading All of Standard Input at Once, Next: Line Maps, Prev: Filter Mode Interface, Up: Filter Mode Interface
  1240. 2.5.1 Loading All of Standard Input at Once
  1241. -------------------------------------------
  1242. Unless `--line-map' or `--byte-transducer' is used as a command line
  1243. option when the application is invoked, the contents of standard input
  1244. are loaded entirely into memory by `avram' before evaluation of the
  1245. virtual code begins. This interface is obviously not appropriate for
  1246. infinite streams.
  1247. The virtual code application in this mode of operation is treated as
  1248. a single function taking the entire contents of standard input as its
  1249. argument, and returning the entire contents of standard output as its
  1250. result. Hence, this interface is one of the simplest available.
  1251. * Menu:
  1252. * Standard Input Representation::
  1253. * Standard Output Representation::
  1254. 
  1255. File: avram.info, Node: Standard Input Representation, Next: Standard Output Representation, Prev: Loading All of Standard Input at Once, Up: Loading All of Standard Input at Once
  1256. 2.5.1.1 Standard Input Representation
  1257. .....................................
  1258. The representation for the standard input file used as the argument to
  1259. the function depends both on the file format and on the command line
  1260. options specified when the application is invoked. The `--unparameterized'
  1261. and `--raw-output' options make no difference to the input
  1262. representation, and the `--line-map' and `--byte-transducer' options
  1263. are not relevant to this mode of operation. That leaves four possible
  1264. combined settings of the `--choice-of-output' and `--force-text-input'
  1265. options. If standard input conforms to the data file format
  1266. specification *note File Format::, the following effects are possible.
  1267. * If neither `--choice-of-output' nor `--force-text-input' is used,
  1268. the argument to the function will be given directly by the tree
  1269. encoded in the data section of the file. The preamble of the file
  1270. will be ignored.
  1271. * If the `--choice-of-output' option is used and the
  1272. `--force-text-input' option is not used, the argument to the
  1273. function will be a pair `(PREAMBLE,CONTENTS)', where PREAMBLE is a
  1274. list of character strings taken from the preamble of the file
  1275. (with leading hashes stripped), and CONTENTS is the tree
  1276. represented in the data section of the file.
  1277. * If the `--choice-of-output' option is not used and the
  1278. `--force-text-input' option is used, the argument to the function
  1279. will be the whole file as a list of character strings. I.e., both
  1280. the preamble and the data sections are included, hashes are not
  1281. stripped from the preamble, and the data section is not converted
  1282. to the tree it represents.
  1283. * If the `--choice-of-output' option is used and the
  1284. `--force-text-input' option is also used, the argument to the the
  1285. function will be a pair `(nil,CONTENTS)', where the contents are
  1286. the list of character strings as in the previous case.
  1287. If standard input does not conform to the data file format
  1288. specification in *note File Format::, then it is assumed to be a text
  1289. file. The `--force-text-input' option makes no difference, and there are
  1290. only two possible effects, depending on whether `--choice-of-output' is
  1291. used. They correspond to the latter two cases above, where
  1292. `--force-text-input' is used.
  1293. The idea of the `--choice-of-output' option is that it is used only
  1294. for applications that are smart enough to be aware of the
  1295. `(PREAMBLE,CONTENTS)' convention. A non-empty preamble implies a data
  1296. file whose contents could be any type, but an empty preamble implies a
  1297. text file whose contents can only be a list of character strings. (In
  1298. the case of a data file with no preamble, the list of the empty string
  1299. is used for the preamble to distinguish it from a text file.)
  1300. Dumb applications that never want to deal with anything but text
  1301. files should be invoked with `--force-text-input'. Otherwise, they have
  1302. to be prepared for either text or data as arguments.
  1303. The use of both options at once is unproductive as far as the input
  1304. format is concerned, but may be justified when the output is to be a
  1305. data file and the input is text only.
  1306. 
  1307. File: avram.info, Node: Standard Output Representation, Prev: Standard Input Representation, Up: Loading All of Standard Input at Once
  1308. 2.5.1.2 Standard Output Representation
  1309. ......................................
  1310. As in the case of standard input, the representation for standard output
  1311. that the function is expected to return depends on the command line
  1312. options with which the application is invoked. The only relevant options
  1313. are `--raw-output' and `--choice-of-output', which are mutually
  1314. exclusive.
  1315. * If neither option is selected, the result returned by the function
  1316. must be a list of character strings.
  1317. * If `--raw-output' is used, the result returned by the function is
  1318. unconstrained, and it will be written as a data file with no
  1319. preamble, following the format specified in *note File Format::.
  1320. * If `--choice-of-output' is used, the result returned by the
  1321. function must be a pair `(PREAMBLE,CONTENTS)'.
  1322. In the last case, the preamble determines how the file will be
  1323. written. If it is meant to be a text file, the preamble should be
  1324. `nil', and the contents should be a list of character strings. If it is
  1325. meant to be a data file, the preamble should be a non-empty list of
  1326. character strings, and the format of the contents is unconstrained. To
  1327. express a data file with no preamble, the preamble should be the list
  1328. containing the empty string, rather than being empty.
  1329. In the result returned by the function, the preamble lines should not
  1330. include leading hash characters, because they are automatically added to
  1331. the output to enforce consistency with the data file format. However,
  1332. they should include trailing backslashes as continuation characters
  1333. where appropriate. The hashes that are automatically added will be
  1334. automatically stripped by `avram' on behalf of whatever application
  1335. uses the file.
  1336. Any file can be written as a list of character strings, even "text"
  1337. files that are full of unprintable characters, and even "text" files
  1338. that happen to conform to the format used for data files. However, if
  1339. the application intends to write a data file in the standard format used
  1340. by other virtual code applications, it can do so more quickly and easily
  1341. by having the virtual machine do the formatting automatically with the
  1342. `--choice-of-output' option than by implementing the algorithm in *note
  1343. Concrete Syntax::, from scratch in virtual code.
  1344. 
  1345. File: avram.info, Node: Line Maps, Next: Byte Transducers, Prev: Loading All of Standard Input at Once, Up: Filter Mode Interface
  1346. 2.5.2 Line Maps
  1347. ---------------
  1348. Virtual code applications invoked with the `--line-map' option (with or
  1349. without the `--unparameterized' option) adhere to a very simple
  1350. interface.
  1351. * The argument to the function is a character string, and the result
  1352. must also be a character string.
  1353. * The function is applied to each line of the standard input file and
  1354. the result in each case is written to standard output followed by a line
  1355. break.
  1356. This kind of application may be used on finite or infinite streams,
  1357. provided that the lengths of the lines are finite, but preserves no
  1358. state information from one line to the next.
  1359. 
  1360. File: avram.info, Node: Byte Transducers, Prev: Line Maps, Up: Filter Mode Interface
  1361. 2.5.3 Byte Transducers
  1362. ----------------------
  1363. The interface used when the `--byte-transducer' option is selected allows
  1364. an application to serve as a persistent stream processor suitable for
  1365. finite or infinite streams. The interface can be summarized by the
  1366. following points.
  1367. * When it is first invoked, the function in the virtual code file is
  1368. applied to an argument of `nil', and is expected to return a pair
  1369. `(STATE,OUTPUT)'. The STATE format is unconstrained. The OUTPUT
  1370. must be a character string that will be written to standard
  1371. output, but it may be the empty string.
  1372. * For each byte read from standard input, `avram' applies the
  1373. function to the pair `(STATE,CHARACTER)', using the state obtained
  1374. from previous evaluation, and the character whose code is the
  1375. byte. The purpose of the STATE field is therefore to provide a way
  1376. for the application to remember something from one invocation to
  1377. the next.
  1378. * The function is usually expected to return a pair `(STATE,OUTPUT)'
  1379. for each input byte, so that the state can be used on the next
  1380. iteration, and the output can be written to standard output as a
  1381. character string.
  1382. * If the function ever returns a value of `nil', the computation
  1383. terminates.
  1384. * If standard input comes to an end before the computation
  1385. terminates, the function will be applied to a pair of the form
  1386. `(STATE,nil)' thereafter, but may continue to return
  1387. `(STATE,OUTPUT)' pairs for arbitrarily many more iterations. The
  1388. `EOF' character is not explicitly passed to the function, but the
  1389. end is detectable insofar as `nil' is not a representation for any
  1390. character.
  1391. Unlike the situation with line maps, the output character strings do
  1392. not have line breaks automatically appended, and the application must
  1393. include them explicitly if required. The convention for line breaks is
  1394. system dependent. On Unix and GNU/Linux systems, character code 10
  1395. indicates a line break, but other systems may use character code 13
  1396. followed by character code 10. See *note Character Table:: for the representations
  1397. of characters having these codes.
  1398. 
  1399. File: avram.info, Node: Parameter Mode Interface, Next: Virtual Code Semantics, Prev: Filter Mode Interface, Up: Virtual Machine Specification
  1400. 2.6 Parameter Mode Interface
  1401. ============================
  1402. The virtual code file for a parameter mode application contains a tree
  1403. representing a single function, which takes as its argument a data
  1404. structure in the format described below. The format of the result
  1405. returned by the function depends on the virtual machine options used on
  1406. the command line, and the various alternatives are explained
  1407. subsequently.
  1408. * Menu:
  1409. * Input Data Structure::
  1410. * Input for Mapped Applications::
  1411. * Output From Non-interactive Applications::
  1412. * Output From Interactive Applications::
  1413. 
  1414. File: avram.info, Node: Input Data Structure, Next: Input for Mapped Applications, Prev: Parameter Mode Interface, Up: Parameter Mode Interface
  1415. 2.6.1 Input Data Structure
  1416. --------------------------
  1417. The data structure that is used as the argument to the parameter mode
  1418. application incorporates all the information about the command line and the
  1419. environment variables. It is in the form of a triple
  1420. `((FILES,OPTIONS),ENVIRONS)'. The fields have these interpretations.
  1421. FILES
  1422. is a list of quadruples `((DATE,PATH),(PREAMBLE,CONTENTS))', with
  1423. one quadruple for each input file named on the command line (but
  1424. not the virtual code file or the `avram' executable). The list
  1425. will be in the same order as the filenames on the command line,
  1426. and is not affected by options interspersed with them. The fields
  1427. in each item have the following interpretations.
  1428. DATE
  1429. is the time stamp of the file in as a character string in the
  1430. usual Unix format, for example, `Fri Jan 19 14:34:44 GMT
  1431. 2001'. If the file corresponds to standard input, the current
  1432. system time and date are used.
  1433. PATH
  1434. is the full path of the file, expressed as a list of strings.
  1435. If the file corresponds to standard input, the list is empty.
  1436. Otherwise, the first string in the list is the file name. The
  1437. next is the name of the file's parent directory, if any. The
  1438. next is the parent of the parent, and so on. The root
  1439. directory is indicated by the empty string, so that any path
  1440. ending with the empty string is an absolute path, while any
  1441. path ending with a non-empty string is relative to the
  1442. current working directory. Path separators (i.e., slashes)
  1443. are omitted.
  1444. PREAMBLE
  1445. In the case of a text file, or any file not conforming to the
  1446. format in *note File Format::, this field is `nil'.
  1447. Otherwise, this field contains the preamble of the file
  1448. expressed as a list of strings, or contains just the empty
  1449. string if the file has no preamble. Any leading hashes in the
  1450. preamble of the file are stripped.
  1451. CONTENTS
  1452. In the case of a text file (as indicated by the PREAMBLE
  1453. field), this field will contain a list of character strings,
  1454. with each line of the file contained in a character string.
  1455. Otherwise, it can contain data in any format, which are
  1456. obtained by converting the data section of the file to a tree.
  1457. OPTIONS
  1458. is a list of quadruples of the form
  1459. `((POSITION,LONGFORM),(KEYWORD,PARAMS))' with one quadruple for
  1460. each option appearing on the command line after the name of the
  1461. virtual code file.
  1462. POSITION
  1463. is a natural number indicating the position of the option on
  1464. the command line. The position numbers of all the options
  1465. will form an ascending sequence, but not necessarily
  1466. consecutive nor starting with zero. The missing numbers in
  1467. the sequence will correspond to the positions of the file
  1468. names on the command line, allowing their positions to be
  1469. inferred by applications for which the position matters.
  1470. LONGFORM
  1471. is a boolean value which is true if the option starts with
  1472. two or more dashes but false otherwise.
  1473. KEYWORD
  1474. is the key word of the option expressed as a character
  1475. string. For example in the case of a command line option
  1476. `--foo=bar,baz', the keyword is `foo'. Leading dashes are
  1477. stripped.
  1478. PARAMS
  1479. is a list of character strings identifying the parameters for
  1480. the command line option in question. In the case of an option
  1481. of the form `--foo=bar,baz', the first character string in
  1482. the list will be `bar' and the next will be `baz'. The same
  1483. applies if the option is written `--foo bar,baz' or `--foo
  1484. =bar,baz'. If there are no parameters associated with the
  1485. option, the list is empty.
  1486. ENVIRONS
  1487. is a list of pairs of character strings, with one pair in the list
  1488. for each environment variable. The identifier is the left string
  1489. in the pair, and the value is the right. For example, if the
  1490. environment contains the definition `OSTYPE=linux-gnu', there will
  1491. be a pair in the list whose left side is the string `OSTYPE' and
  1492. whose right side is the string `linux-gnu'.
  1493. 
  1494. File: avram.info, Node: Input for Mapped Applications, Next: Output From Non-interactive Applications, Prev: Input Data Structure, Up: Parameter Mode Interface
  1495. 2.6.2 Input for Mapped Applications
  1496. -----------------------------------
  1497. Applications invoked using the `--map-to-each-file' option benefit from
  1498. a slightly different interface than the one described above. As the
  1499. purpose of this option is to save memory by loading only one file at a
  1500. time, the application does not have access to all input files named on
  1501. the command line simultaneously within the same data structure.
  1502. Although the data structure is of the type already described, the FILES
  1503. field is not a list of arbitrary length. Instead, it is a list
  1504. containing exactly one item for an input file. If `-' is used as a
  1505. command line parameter, indicating standard input, then FILES will have
  1506. another item pertaining to standard input. In no event will it have
  1507. other than one or two items.
  1508. The mapped application is expected to work by being applied
  1509. individually to each of any number of separately constructed data
  1510. structures, doing the same in each case as it would if that case were
  1511. the only one. To make that possible, copies of the environment
  1512. variables, the contents of standard input, and the list of application
  1513. specific options are contained in the data structure used for every
  1514. invocation.
  1515. The position numbers in the options are adjusted for each invocation
  1516. to reflect the position of the particular input file associated with it.
  1517. For example, in the following command
  1518. `avram --map-to-each-file mapster.avm fa.txt --data fb.dat --code fc.o'
  1519. the function in the virtual code file `mapster.avm' would be applied
  1520. to each of three data structures, corresponding to the commands
  1521. `avram mapster.avm fa.txt --data --code'
  1522. `avram mapster.avm --data fb.dat --code'
  1523. `avram mapster.avm --data --code fc.o'
  1524. If the relative positions of the options and filenames were important to
  1525. the application, they could be reliably inferred from the position
  1526. numbers. In the first case, they would be 1 and 2, implying that the
  1527. file is in position 0. In the second case they would be 0 and 2,
  1528. implying that the file is in position 1, and in the third case they
  1529. would be 0 and 1, implying that the file is in position 2. (Of course,
  1530. nothing compels an application to concern itself with the positions of
  1531. its parameters, and the alternative might be preferable.)
  1532. For the most part, any application that can operate on one file at a
  1533. time without needing information from any others can be executed more
  1534. economically with the `--map-to-each-file' option and few if any
  1535. changes to the code. The effect will normally be analogous to the above
  1536. example, subject to a few possible differences.
  1537. * If an application is supposed to do something by default when
  1538. there are no file parameters or only standard input, it won't work
  1539. as a mapped application, because if there are no file parameters
  1540. it won't be executed at all.
  1541. * If a mapped application causes any output files to be generated,
  1542. they may be written before other input files are read, possibly
  1543. causing the input files to be overwritten if they have the same
  1544. names, and causing subsequent invocations to use the overwritten
  1545. versions. This behavior differs from that of loading all input
  1546. files at the outset, which ensures the application seeing all of
  1547. the original versions. The latter may be more convenient for
  1548. maintaining a group of files in some sort of consistent state.
  1549. * If an application causes standard output to be written along with
  1550. output files, normally standard output is written last as a
  1551. security measure against malicious code altering the
  1552. `--ask-to-overwrite' prompts by subtly clobbering the console. In
  1553. a mapped application, standard output isn't always last because
  1554. there may be more invocations to come.
  1555. 
  1556. File: avram.info, Node: Output From Non-interactive Applications, Next: Output From Interactive Applications, Prev: Input for Mapped Applications, Up: Parameter Mode Interface
  1557. 2.6.3 Output From Non-interactive Applications
  1558. ----------------------------------------------
  1559. If a parameter mode application is not invoked with either of the
  1560. `--interactive' or `--step' options, then it is deemed to be
  1561. non-interactive, and therefore does not concern itself with executing
  1562. shell commands. Instead, it simply specifies a list of output files to
  1563. be created or updated on its behalf by `avram'.
  1564. The files are described by a list of quadruples
  1565. `((OVERWRITE,PATH),(PREAMBLE,CONTENTS))', with one quadruple for each
  1566. file.
  1567. In each quadruple, the PATH, PREAMBLE, and CONTENTS fields have the
  1568. same interpretations as in the list of files in the input data
  1569. structure described in *note Input Data Structure::, except that a
  1570. `nil' path refers to standard output rather than to standard input.
  1571. The OVERWRITE field in each quadruple tells whether the file should
  1572. be overwritten or appended. If the OVERWRITE field is `nil' (i.e., the
  1573. representation for the Boolean value of `false') and a file already
  1574. exists at the given path, the new contents will be appended to it. If
  1575. the OVERWRITE field is anything other than `nil' and/or no file exists
  1576. with the given path, a new file is written or the extant one is
  1577. overwritten. Note that the data file format specified in *note File
  1578. Format:: precludes appending anything to it, but the format of existing
  1579. output files is not checked and nothing prevents data or text from
  1580. being appended to one.
  1581. 
  1582. File: avram.info, Node: Output From Interactive Applications, Prev: Output From Non-interactive Applications, Up: Parameter Mode Interface
  1583. 2.6.4 Output From Interactive Applications
  1584. ------------------------------------------
  1585. Parameter mode applications invoked with either of the `--interactive'
  1586. or `--step' options are required to take the data structure described
  1587. in *note Input Data Structure:: as an argument but to return the
  1588. virtual code for a function that will observe a certain protocol
  1589. allowing shell commands to be executed on its behalf. The intent is
  1590. that the virtual code file doesn't contain the real application _per
  1591. se_, but only something that builds the real one on the fly using
  1592. configuration information from the input files and command line options.
  1593. The format of the result returned by an interactive application,
  1594. being a virtual code application itself, requires a full exposition of
  1595. the virtual machine code semantics. This subject is deferred to *note
  1596. Virtual Code Semantics::. The remainder of this section describes the
  1597. protocol followed by the function returned by the interactive
  1598. application rather than the application itself.
  1599. Similarly to the case of a byte transducer described in *note Byte
  1600. Transducers::, the basic pattern of interaction between `avram' and the
  1601. function is a cycle of invocations. In general terms, the function is
  1602. applied to a `nil' argument initially, and expected to return an
  1603. initial state and initial output. Thereafter, the function is applied
  1604. to a pair of the state returned on the previous iteration, and the next
  1605. installment of input. The function returns further output and a new
  1606. state, and the cycle continues until the function returns a value of
  1607. `nil', at which time the computation terminates.
  1608. * Menu:
  1609. * Line Oriented Interaction::
  1610. * Character Oriented Interaction::
  1611. * Mixed Modes of Interaction::
  1612. 
  1613. File: avram.info, Node: Line Oriented Interaction, Next: Character Oriented Interaction, Prev: Output From Interactive Applications, Up: Output From Interactive Applications
  1614. 2.6.4.1 Line Oriented Interaction
  1615. .................................
  1616. Within this general pattern, more specific styles of interaction are
  1617. possible. In the simplest one to explain first, the result returned by
  1618. the function is always a data structure of the form `(STATE,(COMMAND
  1619. LINES,PROMPTS))', wherein the fields have these interpretations.
  1620. STATE
  1621. is a tree incorporating any data in any format that the application
  1622. needs to remember from one invocation to the next.
  1623. COMMAND LINES
  1624. is a list of character strings that are piped to the standard input
  1625. stream of a separately spawned process. The process may persist
  1626. from one invocation of the function to the next, or may be spawned
  1627. each time.
  1628. PROMPTS
  1629. is a non-empty list of character strings containing a suffix of
  1630. the text expected from the standard output stream of the process
  1631. as a result of sending the command lines to it.
  1632. On each iteration, `avram' sends the command line character strings to
  1633. a separately spawned process, with line breaks between them if there
  1634. are more than one command. If a process remains from the previous
  1635. iteration that has not terminated itself, the list of command lines is
  1636. sent to the same process. If no such process already exists, the first
  1637. string in the list of command lines is treated as a shell command and
  1638. used to spawn the process (using the `exp_popen' library function), and
  1639. the remaining strings are sent to the newly spawned process.
  1640. Normally processes spawned with commands that invoke interactive
  1641. command line interpreters of their own, such as `bash', `ftp' or `bc',
  1642. will persist indefinitely unless the command causing them to exit is
  1643. issued or some other event kills them. Processes spawned with
  1644. non-interactive commands, such as `ls' or `pwd', will terminate when
  1645. the last of their initial output has been received.
  1646. In the case of processes that persist after being spawned, `avram'
  1647. needs some way of knowing when to stop waiting for more output from them
  1648. so that it doesn't get stuck waiting forever. This purpose is served by
  1649. the PROMPTS field. This field could contain a single string holding the
  1650. last thing the process will send before becoming quiescent, such as the
  1651. strings `bash$ ' or `ftp> ' in the above examples. Alternatively, a
  1652. sequence of more than one prompt string can be used to indicate that
  1653. the corresponding sequence of lines must be detected. An empty string
  1654. followed by `ftp> ' would indicate that the `ftp> ' prompt is expected
  1655. to be immediately preceded by a line break. There is also the option of
  1656. using prompt strings to indicate a pattern that does not necessarily
  1657. imply quiescence, but is a more convenient point at which to stop
  1658. reading the output from the process.
  1659. For processes spawned with commands that do not start their own
  1660. interactive command line interpreters, such as `ls' or `pwd', it may be
  1661. preferable to read all the output from them until they terminate. To
  1662. achieve this effect, the list of prompt strings should contain only the
  1663. single string containing only the single `EOF' character (usually code
  1664. 4) or any other character that is certain not to occur in the output of
  1665. the process. This technique is based on the assumption that the process
  1666. was spawned originally with the command in question, not that such a
  1667. command is sent to an existing shell process.
  1668. In any case, when enough output has been received from the process,
  1669. it is collected into a list of received strings including the prompt
  1670. strings at the end (if they were received), and the function is applied
  1671. to the pair `(STATE,RECEIVED STRINGS)'. If the cycle is to continue,
  1672. the result returned by the function will include a new state, a new
  1673. list of command lines, and a new list of prompt strings. A result of
  1674. `nil' will cause the computation to terminate.
  1675. There are some unusual situations that could occur in the course of
  1676. line oriented interaction, and are summarized as follows.
  1677. * If the process terminates before any pattern matching the prompt
  1678. strings is received from it, all of the output from the process up
  1679. to the point where it terminated is collected into the RECEIVED
  1680. STRINGS list and passed to the function. This situation includes
  1681. cases where the process terminates immediately upon being spawned,
  1682. but not abnormal completion of the `exp_popen' library function,
  1683. which is a fatal error. This feature of the interface is what
  1684. allows `EOF' to be used for collecting all the output at once from
  1685. a non-interactive command.
  1686. * If the list of COMMAND LINES is empty, and no process currently
  1687. exists due to a previous iteration, the effect is the same as if
  1688. the process terminates unexpectedly before outputting anything.
  1689. I.e., the function is applied to a pair containing an empty list
  1690. of received strings. There is no good reason for an application to
  1691. get into this situation.
  1692. * If the list of COMMAND LINES is empty but a process persists from
  1693. a previous iteration, no output is sent to it, but receiving from
  1694. it proceeds normally. This feature of the interface could be used
  1695. effectively by applications intended to process the received data
  1696. in parts, but will cause deadlock if the process is already
  1697. quiescent.
  1698. * All character strings have to consist of lists of valid
  1699. representations of non-null characters according to *note
  1700. Character Table::, or else there will be some fatal error messages.
  1701. * If the list of PROMPT STRINGS contains only the empty string,
  1702. `avram' will not wait to receive anything from the process, but
  1703. will proceed with the next iteration immediately. If this effect is
  1704. intended, care must be taken not to confuse the empty list of
  1705. PROMPT STRINGS with the list containing the empty string. The
  1706. former indicates character oriented interaction, which is
  1707. explained next.
  1708. 
  1709. File: avram.info, Node: Character Oriented Interaction, Next: Mixed Modes of Interaction, Prev: Line Oriented Interaction, Up: Output From Interactive Applications
  1710. 2.6.4.2 Character Oriented Interaction
  1711. ......................................
  1712. A character oriented style of interaction involves the function always
  1713. returning a data structure of the form `(STATE,(COMMAND LINES,nil))'.
  1714. The STATE and COMMAND LINES fields serve exactly the same purposes
  1715. respectively as they do in the case of line oriented interaction. The
  1716. field that would be occupied by the PROMPT STRINGS list in the case of
  1717. line oriented interaction is identically `nil' in this style.
  1718. When this style is used, `avram' spawns a process and/or sends command
  1719. lines to it as in the case of line oriented interaction, but attempts
  1720. to read only a single character from it per iteration. When the
  1721. character is received, `avram' applies the function to the pair
  1722. `(STATE,CHARACTER)' in order to obtain the next state and the next list
  1723. of command lines. If the process has terminated, a `nil' value is used
  1724. in place of the character. If the process is quiescent, deadlock ensues.
  1725. The character oriented style is a lower level protocol that shifts
  1726. more of the burden of analyzing the process's output to the virtual code
  1727. application. It can do anything line oriented interaction can do except
  1728. proceeding immediately without waiting to receive any output from the
  1729. process. It may also allow more general criteria (in effect) than the
  1730. matching of a fixed prompt string to delimit the received data, for
  1731. those pathological processes that may require such things.
  1732. Applications using character oriented interaction need to deal with
  1733. line breaks explicitly among the received characters, unlike the case
  1734. with line oriented interaction, where the line breaks are implicit in
  1735. the list of received strings. Contrary to the convention for Unix text
  1736. files, line breaks in the output of a process are indicated by character
  1737. code 13 followed by character code 10.
  1738. 
  1739. File: avram.info, Node: Mixed Modes of Interaction, Prev: Character Oriented Interaction, Up: Output From Interactive Applications
  1740. 2.6.4.3 Mixed Modes of Interaction
  1741. ..................................
  1742. An application is not confined exclusively to line oriented or character
  1743. oriented interaction, but may switch from one style to the other between
  1744. iterations, and signal its choice simply by the format of the data
  1745. structure it returns. If the PROMPT STRINGS field is non-empty, the
  1746. interaction is line oriented, and if the field is empty, the
  1747. interaction is character oriented. A function using both styles has to
  1748. be prepared for whichever type of data it indicates, either a character
  1749. or a list of character strings as the case may be.
  1750. Another alternative is possible if the function returns a data
  1751. structure in the form `(FILES,nil)'. This structure includes neither a
  1752. list of command lines nor a list of prompt strings, empty or otherwise,
  1753. but does include a list of quadruples in the FILES field. The
  1754. quadruples are of the form `((OVERWRITE,PATH),(PREAMBLE,CONTENTS))'.
  1755. The fields have the same interpretations as in the output from a
  1756. non-interactive parameter mode application, as described in *note
  1757. Output From Non-interactive Applications::, and will cause a list of
  1758. files to be written in the same way.
  1759. As an interactive application is able cause the execution of
  1760. arbitrary shell commands, it doesn't need `avram' to write files for it
  1761. the way a non-interactive application does, so this feature does not
  1762. provide any additional capabilities. However, it may be helpful as a
  1763. matter of convenience.
  1764. After the files are written, the function will be applied to the same
  1765. result it returned, `(FILES,nil)'. There is no direct means of
  1766. preserving unconstrained state information from previous iterations in
  1767. this style of interaction. A likely scenario might therefore be that
  1768. the function returns a file list after finishing its other business, and
  1769. then returns `nil' on the next iteration to terminate.
  1770. 
  1771. File: avram.info, Node: Virtual Code Semantics, Prev: Parameter Mode Interface, Up: Virtual Machine Specification
  1772. 2.7 Virtual Code Semantics
  1773. ==========================
  1774. As the previous sections explain, virtual code applications are defined
  1775. in terms of mathematical functions. Up until this point, the discussion
  1776. has focused on the interface between the function and the virtual
  1777. machine interpreter, by detailing the arguments passed to the functions
  1778. under various circumstances and the results they are expected to return
  1779. in order to achieve various effects.
  1780. The purpose of this section is to complete the picture by explaining
  1781. how a given computable function can be expressed in virtual code,
  1782. considering only functions operating on the trees described in *note
  1783. Raw Material::. Functions manipulating trees of `nil' are undoubtedly
  1784. a frivolous and abstract subject in themselves. One is obliged to refer
  1785. back to the previous sections if in need of motivation.
  1786. * Menu:
  1787. * A New Operator::
  1788. * On Equality::
  1789. * A Minimal Set of Properties::
  1790. * A Simple Lisp Like Language::
  1791. * How `avram' Thinks::
  1792. * Variable Freedom::
  1793. * Metrics and Maintenance::
  1794. * Deconstruction::
  1795. * Recursion::
  1796. * Assignment::
  1797. * Predicates::
  1798. * Iteration::
  1799. * List Combinators::
  1800. * List Functions::
  1801. * Exception Handling::
  1802. * Interfaces to External Code::
  1803. * Vacant Address Space::
  1804. 
  1805. File: avram.info, Node: A New Operator, Next: On Equality, Prev: Virtual Code Semantics, Up: Virtual Code Semantics
  1806. 2.7.1 A New Operator
  1807. --------------------
  1808. With regard to a set of trees as described in *note Raw Material::, we
  1809. can define a new binary operator. Unlike the `cons' operator, this one
  1810. is not required to associate an element of the set with every possible
  1811. pair of elements. For very many pairs of operands we will have nothing
  1812. to say about its result. In fact, we require nothing of it beyond a few
  1813. simple properties to be described presently.
  1814. Because this is the only other operator than `cons', there is no need
  1815. to have a special notation for it, so it will be denoted by empty
  1816. space. The tree associated by the operator with a pair of trees `X' and
  1817. `Y', if any, will be expressed in the infix notation `X Y'. For
  1818. convenience, the operator is regarded as being right associative, so
  1819. that `A B C' can be written for `A (B C)'.
  1820. 
  1821. File: avram.info, Node: On Equality, Next: A Minimal Set of Properties, Prev: A New Operator, Up: Virtual Code Semantics
  1822. 2.7.2 On Equality
  1823. -----------------
  1824. One example of a property this operator should have, for reasons that
  1825. will not be immediately clear, is that for any trees `X' and `K', the
  1826. equality `cons(cons(nil,`K),nil) X' = `K'' always holds. Even though
  1827. the present exposition opts for readability over formality, statements
  1828. like these demand clarification of the notion of equality. Some of the
  1829. more pedantic points in *note Raw Material:: may be needed for the
  1830. following ideas to hold water.
  1831. * As originally stipulated, it is always possible to distinguish
  1832. `nil' from any member of the set. We can therefore decide on this
  1833. basis whether `A = B' whenever at least one of them is `nil'.
  1834. * Where neither `A' nor `B' is `nil' in an expression `A = B', but
  1835. rather something of the form `cons(X,Y)', the equality holds if
  1836. and only if both pairs of corresponding subexpressions are equal.
  1837. If at least one member of each pair of corresponding
  1838. subexpressions is `nil', the question is settled, but otherwise
  1839. there is recourse to their respective subexpressions, and so on.
  1840. This condition follows from the uniqueness property of the `cons'
  1841. operator.
  1842. * If one side of an equality is of the form `X Y', or is defined in
  1843. terms of such an expression, but `(X,Y)' is one of those pairs
  1844. with which the operator associates no result, then the equality
  1845. holds if and only if the other side is similarly ill defined.
  1846. * Statements involving universal quantification (i.e., beginning
  1847. with words similar to "for any tree `X' ...") obviously do not
  1848. apply to instances of a variable (`X') outside the indicated set
  1849. (trees). Hence, they are not refuted by any consequence of
  1850. identifying a variable with an undefined expression.
  1851. Readers who are aware of such issues as pointer equality or
  1852. intensional versus extensional equality of functions are urged to
  1853. forget all about them in the context of this document, and abide only
  1854. by what is stated. Other readers should ignore this paragraph.
  1855. 
  1856. File: avram.info, Node: A Minimal Set of Properties, Next: A Simple Lisp Like Language, Prev: On Equality, Up: Virtual Code Semantics
  1857. 2.7.3 A Minimal Set of Properties
  1858. ---------------------------------
  1859. For any trees `X', `Y', and `K', and any non-`nil' trees `P', `F', and
  1860. `G', the new invisible operator satisfies these conditions. In these
  1861. expressions and hereafter, increasing abuse of notation is perpetrated
  1862. by not writing the `cons' in expressions of the form `cons(X,Y)'.
  1863. _P0_
  1864. `(nil,(nil,nil)) X' = `X'
  1865. _P1_
  1866. `(nil,((nil,nil),nil)) (X,Y)' = `X'
  1867. _P2_
  1868. `(nil,(nil,(nil,nil))) (X,Y)' = `Y'
  1869. _P3_
  1870. `((nil,K),nil) X' = `K'
  1871. _P4_
  1872. `(((nil,(nil,nil)),nil),nil) (F,X)' = `F (F,X)'
  1873. _P5_
  1874. `((F,G),nil) X' = `F G X'
  1875. _P6_
  1876. `((F,nil),G) X' = `(F X,G X)'
  1877. _P7_
  1878. `((P,F),G) X' = `F X' if `P X' is a non-`nil' tree, but `G X' if
  1879. `P X' = `nil'
  1880. Although other properties remain to be described, it is worth
  1881. pausing at this point because there is ample food for thought in the
  1882. ones already given. An obvious question would be that of their origin.
  1883. The short answer is that they have been chosen arbitrarily to be true by
  1884. definition of the operator. At best, the completion of the construction
  1885. may lead to a more satisfactory answer based on aesthetic or engineering
  1886. grounds.
  1887. A more important question would be that of the relevance of the
  1888. mystery operator and its properties to the stated purpose of this
  1889. section, which is to specify the virtual machine code semantics. The
  1890. answer lies in that the operator induces a function for any given tree
  1891. `T', such that the value returned by the function when given an argument
  1892. X is `T X'. This function is the one that is implemented by the virtual
  1893. code T, which is to say the way an application will behave if we put T
  1894. in its virtual code file. An equivalent way of looking at the situation
  1895. is that the virtual machine does nothing but compute the result of this
  1896. operator, taking the tree in the virtual code file as its left operand
  1897. and the input data as the right operand. By knowing what the operator
  1898. will do with a given pair of operands, we know what to put into the
  1899. virtual code file to get the function we want.
  1900. It is worthwhile to note that properties _P0_ to _P7_ are sufficient
  1901. for universality in the sense of Turing equivalence. That means that
  1902. any computable function could be implemented by the suitable choice of
  1903. a tree T without recourse to any other properties of the operator. A
  1904. compiler writer who finds this material boring could therefore stop
  1905. reading at this point and carry out the task of targeting any general
  1906. purpose programming language to the virtual machine based on the
  1907. specifications already given. However, such an implementation would not
  1908. take advantage of the features for list processing, exception handling,
  1909. or profiling that are also built into the virtual machine and have yet
  1910. to be described.
  1911. 
  1912. File: avram.info, Node: A Simple Lisp Like Language, Next: How `avram' Thinks, Prev: A Minimal Set of Properties, Up: Virtual Code Semantics
  1913. 2.7.4 A Simple Lisp Like Language
  1914. ---------------------------------
  1915. With a universal computational model already at our disposal, it will be
  1916. easier to use the virtual machine to specify itself than to define all
  1917. of it from scratch. For this purpose, we use the `silly' programming
  1918. language, whose name is an acronym for SImple Lisp-like Language (Yeah
  1919. right). The language serves essentially as a thin layer of symbolic
  1920. names on top of the virtual machine code. Due to its poor support for
  1921. modularity and abstraction, `silly' is not recommended for serious
  1922. application development, but at least it has a shallow learning
  1923. curve.(1)
  1924. * Menu:
  1925. * Syntax::
  1926. * Semantics::
  1927. * Standard Library::
  1928. ---------- Footnotes ----------
  1929. (1) Previous releases of `avram' included a working `silly'
  1930. compiler, but this has now been superseded by the Ursala programming
  1931. language. Ursala includes `silly' as a subset for the most part, and
  1932. the examples in this manual should compile and execute with very little
  1933. modification.
  1934. 
  1935. File: avram.info, Node: Syntax, Next: Semantics, Prev: A Simple Lisp Like Language, Up: A Simple Lisp Like Language
  1936. 2.7.4.1 Syntax
  1937. ..............
  1938. `silly' has no reserved words, but it has equals signs, commas and
  1939. parentheses built in. A concise but ambiguous grammar for it can be
  1940. given as follows.
  1941. PROGRAM ::= DECLARATION*
  1942. DECLARATION ::= IDENTIFIER `=' EXPRESSION
  1943. EXPRESSION ::= () | IDENTIFIER
  1944. | (EXPRESSION)
  1945. | (EXPRESSION,EXPRESSION)
  1946. | EXPRESSION EXPRESSION
  1947. | EXPRESSION(EXPRESSION)
  1948. | EXPRESSION(EXPRESSION,EXPRESSION)
  1949. The real grammar is consistent with this one but enforces right
  1950. associativity for binary operations and higher precedence for
  1951. juxtaposition without intervening white space.
  1952. The declaration of any identifier must be unique and must precede its
  1953. occurrence in any expression. Hence, cyclic dependences between
  1954. declarations and "recursive" declarations are not allowed.
  1955. 
  1956. File: avram.info, Node: Semantics, Next: Standard Library, Prev: Syntax, Up: A Simple Lisp Like Language
  1957. 2.7.4.2 Semantics
  1958. .................
  1959. Declarations in `silly' should be understood in the obvious way as
  1960. preprocessor directives to perform parenthetic textual substitutions
  1961. (similar to `#define id (exp)' in C). All identifiers in expressions
  1962. are thereby eliminated during the preprocessing phase.
  1963. The overall meaning of the program is the meaning of the expression
  1964. in the last declaration. A denotational semantics for expressions is
  1965. given by the following equations, where [[`X']] should be read as "the
  1966. meaning of `X'", and `X', `Y' and `Z' are metavariables. (That is, they
  1967. stand for any source code fragment that could fit there subject to the
  1968. constraint, informally speaking, that it has to correspond to a
  1969. connected subtree of the parse tree as enforced by the unambiguous
  1970. grammar in the context of the rest of the program.)
  1971. [[`()']] = `nil'
  1972. [[`(X)']] = [[`X']]
  1973. [[`(X,Y)']] = `cons('[[`X']]`,'[[`Y']]`)'
  1974. [[`X Y']] = [[`X(Y)']] = [[`X']] [[`Y']]
  1975. [[`X (Y,Z)']] = [[`X(Y,Z)']] = [[`X']] [[`(Y,Z)']]
  1976. Toy languages like this are among the few situations a denotational
  1977. semantics stands a chance of clarifying more than it obfuscates, so the
  1978. reader is invited to take a moment to savor it.
  1979. 
  1980. File: avram.info, Node: Standard Library, Prev: Semantics, Up: A Simple Lisp Like Language
  1981. 2.7.4.3 Standard Library
  1982. ........................
  1983. `silly' programs may be linked with library modules, which consist of
  1984. `silly' source text to be concatenated with the user program prior to
  1985. the preprocessing phase. Most `silly' programs are linked with the
  1986. standard `silly' prelude, which contains the following declarations
  1987. among others.
  1988. nil = ()
  1989. identity = (nil,(nil,nil))
  1990. left = (nil,((nil,nil),nil))
  1991. right = (nil,(nil,(nil,nil)))
  1992. meta = (((nil,(nil,nil)),nil),nil)
  1993. constant_nil = ((nil,nil),nil)
  1994. couple = ((((left,nil),constant_nil),nil),right)
  1995. compose = couple(identity,constant_nil)
  1996. constant = couple(couple(constant_nil,identity),constant_nil)
  1997. conditional =
  1998. couple(couple(left,compose(left,right)),compose(right,right))
  1999. There is a close correspondence between these declarations and the
  2000. properties described in *note A Minimal Set of Properties::. A fitting
  2001. analogy would be that the properties of the operator specify the
  2002. virtual machine instruction set in a language independent way, and the
  2003. `silly' library defines the instruction mnemonics for a virtual
  2004. assembly language. The relationship of some mnemonics to their
  2005. corresponding instructions may be less clear than that of others, so
  2006. they are all discussed next.
  2007. 
  2008. File: avram.info, Node: How `avram' Thinks, Next: Variable Freedom, Prev: A Simple Lisp Like Language, Up: Virtual Code Semantics
  2009. 2.7.5 How `avram' Thinks
  2010. ------------------------
  2011. The definitions in the standard `silly' library pertaining to the basic
  2012. properties of the operator can provide a good intuitive illustration of
  2013. how computations are performed by `avram'. This task is approached in
  2014. the guise of a few trivial correctness proofs about them. Conveniently,
  2015. as an infeasibly small language, `silly' is an ideal candidate for
  2016. analysis by formal methods.
  2017. Technically the semantic function [[...]] has not been defined on
  2018. identifiers, but we can easily extend it to them by stipulating that the
  2019. meaning of an identifier `X' is the meaning of the program `MAIN = X'
  2020. when linked with a library containing the declaration of `X', where
  2021. `MAIN' is an identifier not appearing elsewhere in the library.
  2022. With this idea in mind, the following "theorems" can be stated, all
  2023. of which have a similar proof. The variables X and Y stand for any
  2024. tree, and the variable F stands for any tree other than `nil'.
  2025. _T0_
  2026. [[`identity']] `X' = `X'
  2027. _T1_
  2028. [[`left']] `(X,Y)' = `X'
  2029. _T2_
  2030. [[`right']] `(X,Y)' = `Y'
  2031. _T4_
  2032. [[`meta']] `(F,X)' = `F (F,X)'
  2033. _T5_
  2034. [[`constant_nil']] `X' = `nil'
  2035. Replacing each identifier with its defining expression directly
  2036. demonstrates a logical equivalence between the relevant theorem and one
  2037. of the basic operator properties postulated in *note A Minimal Set of
  2038. Properties::.
  2039. For more of a challenge, it is possible to prove the next theorem.
  2040. _T6_
  2041. For non-`nil' `F' and `G', ([[`couple']] `(F,G)') `X' = `(F X,G X)'
  2042. The proof is a routine calculation. Beware of the distinction between
  2043. the mathematical `nil' and the `silly' identifier `nil'.
  2044. ([[`couple']] `(F,G)') `X'
  2045. = ([[`((((left,nil),constant_nil),nil),right)']] `(F,G)') `X'
  2046. by substitution of `couple' with its definition in the standard library
  2047. = (
  2048. `(
  2049. ((('[[`left']]`,'[[`nil']]`),'[[`constant_nil']]`),'[[`nil']])`,'
  2050. [[`right']]`)'
  2051. `(F,G)')
  2052. `X'
  2053. by definition of the semantic function [[...]] regarding pairs
  2054. = (
  2055. `(
  2056. ((('[[`left']]`,'[[`()']]`),'[[`constant_nil']]`),'[[`()']])`,'
  2057. [[`right']]`)'
  2058. `(F,G)')
  2059. `X'
  2060. by substitution of `nil' from its definition in the standard library
  2061. = (
  2062. `(
  2063. ((('[[`left']]`,'[[`nil']]`),'[[`constant_nil']]`),'[[`nil']])`,'
  2064. [[`right']]`)'
  2065. `(F,G)')
  2066. `X'
  2067. by definition of the semantic function in the case of [[`()']]
  2068. = (
  2069. `('
  2070. [[`left']] `(F,G),'[[`constant_nil']] `(F,G)),'
  2071. [[`right']] `(F,G)')
  2072. `X'
  2073. by property _P6_ (twice)
  2074. = `((F,nil),G) X'
  2075. by theorems _T1_, _T2_, and _T5_
  2076. = `(F X,G X)'
  2077. by property _P6_ again.
  2078. Something to observe about this proof is that it might just as well
  2079. have been done automatically. Every step is either the substitution of
  2080. an identifier or a pattern match against existing theorems and
  2081. properties of the operator. Another thing to note is that the use of
  2082. identifiers and previously established theorems helps to make the proof
  2083. human readable, but is not a logical necessity. An equivalent proof
  2084. could have been expressed entirely in terms of the properties of the
  2085. operator. If one envisions a proof like this being performed blindly and
  2086. mechanically, without the running commentary or other amenities, that
  2087. would not be a bad way of thinking about what takes place when `avram'
  2088. executes virtual code.
  2089. Three more theorems have similar proofs. For non-`nil' trees `P',
  2090. `F' and `G', and any trees `X' and `K':
  2091. _T7_
  2092. ([[`compose']] `(F,G)') X = F G X
  2093. _T8_
  2094. ([[`constant']] `K') X = K
  2095. _T9_
  2096. ([[`conditional']] `(P,(F,G)') X = `F X' if `P X' is non-`nil',
  2097. but `G X' if `P X' = `nil'
  2098. The proofs of these theorems are routine calculations analogous to the
  2099. proof of _T6_. Here is a proof of theorem _T7_ for good measure.
  2100. ([[`compose']] `(F,G)') `X'
  2101. = ([[`couple(identity,constant_nil)']] `(F,G)') `X'
  2102. by substitution of `compose' with its definition in the standard
  2103. library
  2104. = (`('[[`couple']] `('[[`identity']]`,'[[`constant_nil']]`))(F,G)') `X'
  2105. by definition of the semantic function
  2106. = `('[[`identity']] `(F,G),'[[`constant_nil']]` (F,G)) X'
  2107. by theorem _T6_
  2108. = `((F,G),nil) X'
  2109. by theorems _T0_ and _T5_
  2110. = `F G X'
  2111. by property _P5_ of the operator.
  2112. 
  2113. File: avram.info, Node: Variable Freedom, Next: Metrics and Maintenance, Prev: How `avram' Thinks, Up: Virtual Code Semantics
  2114. 2.7.6 Variable Freedom
  2115. ----------------------
  2116. The virtual code semantics is easier to specify using the `silly'
  2117. language than it would be without it, but still awkward in some cases.
  2118. An example is the following declaration from the standard library,
  2119. hired = compose(
  2120. compose,
  2121. couple(
  2122. constant compose,
  2123. compose(couple,couple(constant,constant couple))))
  2124. which is constructed in such a way as to imply the following theorem,
  2125. provable by routine computation.
  2126. _T9_
  2127. `('[[`hired']] `H) (F,G)' = [[`compose']]`(H,'[[`couple']]`(F,G))'
  2128. Intuitively, `hired' represents a function that takes a given function
  2129. to a higher order function. For example, if `f' were a function that
  2130. adds two real numbers, `hired f' would be a function that takes two
  2131. real valued functions to their pointwise sum.
  2132. Apart from its cleverness, such an opaque way of defining a function
  2133. has little to recommend it. The statement of the theorem about the
  2134. function is more readable than the function definition itself, probably
  2135. because the theorem liberally employs mathematical variables, whereas
  2136. the `silly' language is variable free. On the other hand, it is not
  2137. worthwhile to linger over further enhancements to the language, such as
  2138. adding variables to it. The solution will be to indicate informally a
  2139. general method of inferring a variable free function definition from an
  2140. expression containing variables, and hereafter omit the more cumbersome
  2141. definitions.
  2142. An algorithm called `isolate' does the job. The input to `isolate'
  2143. is a pair `(E,X)', where `E' is a syntactically correct `silly'
  2144. expression in which the identifier `X' may occur, but no other
  2145. identifiers dependent on `X' may occur (or else it's
  2146. garbage-in/garbage-out). Output is a syntactically correct `silly'
  2147. expression `F' in which the identifier `X' does not occur, such that
  2148. [[`E']] = [[`F X']]. The algorithm is as follows,
  2149. if `E' = `X' then
  2150. return `identity'
  2151. else if `E' is of the form `(U,V)'
  2152. return `couple(isolate(U,X),isolate(V,X))'
  2153. else if `E' is of the form `U V'
  2154. return `(hired apply)(isolate(U,X),isolate(V,X))'
  2155. else
  2156. return `constant E'
  2157. where equality is by literal comparison of expressions, and the
  2158. definition of `apply' is
  2159. apply = (hired meta)((hired compose)(left,constant right),right)
  2160. which represents a function that does the same thing as the invisible
  2161. operator.
  2162. _T10_
  2163. [[`apply']] `(F,X)' = `F X'
  2164. The `isolate' algorithm can be generalized to functions of
  2165. arbitrarily many variables, but in this document we will need only a
  2166. unary and a binary version. The latter takes an expression `E' and a
  2167. pair of identifiers `(X,Y)' as input, and returns an expression `F'
  2168. such that [[`E']] = [[`F (X,Y)']].
  2169. if `E' = `X' then
  2170. return `left'
  2171. else if `E' = `Y' then
  2172. return `right'
  2173. else if `E' is of the form `(U,V)'
  2174. return `couple(isolate(U,(X,Y)),isolate(V,(X,Y)))'
  2175. else if `E' is of the form `U V'
  2176. return `(hired apply)(isolate(U,(X,Y)),isolate(V,(X,Y)))'
  2177. else
  2178. return `constant E'
  2179. It might be noted in passing that something similar to these
  2180. algorithms would be needed in a compiler targeted to `avram' if the
  2181. source were a functional language with variables.
  2182. 
  2183. File: avram.info, Node: Metrics and Maintenance, Next: Deconstruction, Prev: Variable Freedom, Up: Virtual Code Semantics
  2184. 2.7.7 Metrics and Maintenance
  2185. -----------------------------
  2186. Certain features of the virtual machine pertain to software development
  2187. and maintenance more than to implementing any particular function. The
  2188. operations with the mnemonics `version', `note', `profile', and
  2189. `weight' are in this category.
  2190. * Menu:
  2191. * Version::
  2192. * Note::
  2193. * Profile::
  2194. * Weight::
  2195. 
  2196. File: avram.info, Node: Version, Next: Note, Prev: Metrics and Maintenance, Up: Metrics and Maintenance
  2197. 2.7.7.1 Version
  2198. ...............
  2199. A virtual code application with exactly the following definition
  2200. implements a function that returns a constant character string
  2201. regardless of its argument.
  2202. version = ((nil,nil),((nil,nil),(nil,((nil,nil),nil))))
  2203. The character string encodes the version number of the installed
  2204. `avram' executable, for example `0.13.0', using the standard
  2205. representation for characters.
  2206. Although such an application is useless by itself, the intended use
  2207. for this feature is to cope with the possibility that future versions of
  2208. `avram' may include enhancements. Ideally, the maintainer of `avram'
  2209. will update the version number when new enhancements are added.
  2210. Applications can then detect whether they are available in the
  2211. installed version by using this feature. If a needed enhancement is not
  2212. available, the application can either make allowances or at least
  2213. terminate gracefully.
  2214. 
  2215. File: avram.info, Node: Note, Next: Profile, Prev: Version, Up: Metrics and Maintenance
  2216. 2.7.7.2 Note
  2217. ............
  2218. This operation allows arbitrary information or comments to be embedded
  2219. in a virtual code application in such a way that it will be ignored by
  2220. `avram' when executing it. For the `silly' language, a `note' function
  2221. is defined in the standard prelude so as to imply the following theorem.
  2222. _T11_
  2223. [[`note']] `(F,C)' = `((nil,nil),((nil,nil),(nil,(nil,(F,C)))))'
  2224. Intuitively, the argument `F' represents a function, and the argument
  2225. `c' represents the comment, annotation, or whatever, that will be
  2226. embedded but ignored in the virtual code.
  2227. Semantically, a function with a note attached is the same as the
  2228. function by itself, as the following property stipulates for any
  2229. non-`nil' `F'.
  2230. _P8_
  2231. ([[`note']] `(F,C)') `X' = `F X'
  2232. A possible reason for using this feature might be to support a
  2233. language that performs run-time type checking by hanging type tags on everything.
  2234. Another possible use would be to include symbolic information needed by
  2235. a debugger.
  2236. 
  2237. File: avram.info, Node: Profile, Next: Weight, Prev: Note, Up: Metrics and Maintenance
  2238. 2.7.7.3 Profile
  2239. ...............
  2240. The virtual machine supports a profiling capability by way of this feature.
  2241. Profiling an application causes run time statistics about it to be
  2242. written to a file `./profile.txt'. Profiled applications are of the
  2243. form indicated in the following theorem
  2244. _T12_
  2245. [[`profile']] `(F,S)' = `((nil,nil),((nil,nil),(nil,((F,S),nil))))'
  2246. where `F' stands for the virtual code of the application, and `S'
  2247. stands for the name of it to be written to the file. The semantics of
  2248. a profiled function is identical to the unprofiled form for any
  2249. non-`nil' `F'.
  2250. _P9_
  2251. ([[`profile']] `(F,S)') `X' = `F X'
  2252. Unlike the situation with `note', the annotation `S' of used in
  2253. profiled code is not in an unrestricted format but must be a character
  2254. string in the standard representation (as in *note Representation of
  2255. Numeric and Textual Data::), because this string needs to be written by
  2256. `avram' to the file `./profile.txt'. Ordinarily this string will be the
  2257. source code identifier of the function being profiled.
  2258. When profiles are used in many parts of an application, an
  2259. informative table is generated showing the time spent in each part.
  2260. 
  2261. File: avram.info, Node: Weight, Prev: Profile, Up: Metrics and Maintenance
  2262. 2.7.7.4 Weight
  2263. ..............
  2264. The following virtual code implements a function that returns the weight
  2265. of its argument in the standard representation for natural numbers.
  2266. weight = ((nil,nil),((nil,nil),(nil,(nil,nil))))
  2267. The weight of a tree is zero if the tree is `nil', and otherwise the
  2268. sum of the weights of the two subtrees plus one.
  2269. An algorithm to compute the weight of a tree would be trivial to
  2270. implement without being built in to the virtual machine. The built in
  2271. capability could also be used for purposes unrelated to code
  2272. maintenance. Nevertheless, it is built in for the following reasons.
  2273. * Computing weights happened to be a bottleneck for a particular
  2274. aspect of code generation that was of interest to the author, namely
  2275. the compression of generated code.
  2276. * A built in implementation in C runs at least an order of magnitude
  2277. faster than the equivalent implementation in virtual code.
  2278. * It runs even faster when repeated on the same data, by retrieving
  2279. previously calculated weights rather than recalculating them.
  2280. The only disadvantage of using this feature instead of implementing a
  2281. function in virtual code to compute weights is that it relies on native integer
  2282. arithmetic and could overflow, causing a fatal error. It has never
  2283. occurred in practice, but is possible due to sharing, whereby the
  2284. nominal weight of a tree could be exponentially larger than the actual
  2285. amount of memory occupied by it.
  2286. 
  2287. File: avram.info, Node: Deconstruction, Next: Recursion, Prev: Metrics and Maintenance, Up: Virtual Code Semantics
  2288. 2.7.8 Deconstruction
  2289. --------------------
  2290. Much of the time required for evaluating a function is devoted to performing
  2291. deconstruction operations, e.g., taking the left side of a pair, the
  2292. tail of a list, the right side of the head of the tail, etc.. Because
  2293. these operations are so frequent, there are some features of the
  2294. virtual machine to make them as efficient as possible.
  2295. * Menu:
  2296. * Field::
  2297. * Fan::
  2298. 
  2299. File: avram.info, Node: Field, Next: Fan, Prev: Deconstruction, Up: Deconstruction
  2300. 2.7.8.1 Field
  2301. .............
  2302. The virtual machine supports a generalization of the `left' and `right'
  2303. deconstruction operations that is applicable to deeply nested
  2304. structures. Use of this feature is conducive to code that is faster and
  2305. more compact than is possible by relying on the primitive deconstructors
  2306. alone. It may also be easier for a code optimizer to recognize and
  2307. transform.
  2308. The general form of a virtual code application to perform
  2309. deconstruction is that it is a pair with a `nil' left side, and a
  2310. non-`nil' right side. The right side indicates the nature of the
  2311. deconstruction to be performed when the function is evaluated on an
  2312. argument.
  2313. To make the expression of deconstruction functions more readable in
  2314. `silly', the standard library contains the declaration
  2315. field = couple(constant nil,identity)
  2316. which implies the following theorem.
  2317. _T13_
  2318. [[`field']] `W' = `(nil,W)'
  2319. The virtual machine recognizes an application in this form and
  2320. evaluates it according to the following properties, where `U' and `V'
  2321. are other than `nil', but `X', `Y', and `Z' are unrestricted.
  2322. _P10_
  2323. ([[`field']] `(U,nil)') `(X,Y)' = ([[`field']] `U') `X'
  2324. _P11_
  2325. ([[`field']] `(nil,V)') `(X,Y)' = ([[`field']] `V') `Y'
  2326. _P12_
  2327. ([[`field']] `(U,`v')') `Z' = `(('[[`field']] `U) Z,('[[`field']]
  2328. `V) Z)'
  2329. One might also add that ([[`field']] `(nil,nil)') `Z' = `Z', but this
  2330. statement would be equivalent to _P0_.
  2331. A suitable choice of the `field' operand permits the implementation
  2332. of any deconstruction function expressible in terms of `compose',
  2333. `couple', `identity', `left' and `right'. For example, the application
  2334. `couple(compose(right,right),left)' has an equivalent representation in
  2335. `field((nil,(nil,(nil,nil))),((nil,nil),nil))'. The latter looks longer
  2336. in `silly' but is smaller and faster in virtual code.
  2337. 
  2338. File: avram.info, Node: Fan, Prev: Field, Up: Deconstruction
  2339. 2.7.8.2 Fan
  2340. ...........
  2341. In cases where a deconstructions would be needed to apply the same
  2342. function to both sides of a pair, the overhead can be avoided by means
  2343. of a property of the virtual machine intended for that purpose.
  2344. A `silly' definition of `fan' implying the following theorem is
  2345. helpful in expressing such an application.
  2346. _T14_
  2347. [[`fan']] `F' = `((nil,nil),((nil,F),(nil,nil)))'
  2348. The virtual machine recognizes when an application has the form shown
  2349. above, and uses `F' as a function to be applied to both sides of the
  2350. argument.
  2351. _P13_
  2352. ([[`fan']] `F') `(X,Y)' = `(F X,F Y)'
  2353. 
  2354. File: avram.info, Node: Recursion, Next: Assignment, Prev: Deconstruction, Up: Virtual Code Semantics
  2355. 2.7.9 Recursion
  2356. ---------------
  2357. Defining functions or programs self referentially is sometimes
  2358. informally known as recursion. In functional languages, the clever use of
  2359. "combinators" is often preferred to this practice, and is in fact well
  2360. supported by the virtual machine. However, some computations may be
  2361. inexpressible without an explicitly "recursive" formulation, so there
  2362. is some support for that as well.
  2363. * Menu:
  2364. * Recur::
  2365. * Refer::
  2366. 
  2367. File: avram.info, Node: Recur, Next: Refer, Prev: Recursion, Up: Recursion
  2368. 2.7.9.1 Recur
  2369. .............
  2370. A generalization of the form denoted by `meta' in `silly' is recognized
  2371. by the virtual machine and allows a slightly more efficient encoding of
  2372. recursive applications. An expression `recur P' has the representation
  2373. indicated by this theorem,
  2374. _T15_
  2375. [[`recur']] `P' = `(((nil,P),nil),nil)'
  2376. which implies that [[`meta']] = [[`recur']] `(nil,nil)'.
  2377. If `P' is non-`nil', a tree of the form [[`recur']] `P' is
  2378. interpreted as follows. Note that _P4_ is equivalent to the special
  2379. case of this property for which `P' is `(nil,nil)'.
  2380. _P14_
  2381. ([[`recur']] `P') `X' = [[`meta']] ([[`field']] `P') `X'
  2382. The rationale is that `meta' would very frequently be composed with
  2383. a deconstruction `field P', so the virtual machine saves some time and
  2384. space by allowing the two of them to be encoded in a smaller tree with
  2385. the combined meaning.
  2386. 
  2387. File: avram.info, Node: Refer, Prev: Recur, Up: Recursion
  2388. 2.7.9.2 Refer
  2389. .............
  2390. In the style of recursive programming compelled by the available `meta'
  2391. primitive, a function effectively requires a copy of its own machine
  2392. code as its left argument. Bringing about that state of affairs is an
  2393. interesting party trick.
  2394. If we had a definition of `bu' in the standard library implying
  2395. _T16_
  2396. ([[`bu']] `(F,K)') `X' = `F(K,X)'
  2397. which for the sake of concreteness can be done like this,
  2398. bu = (hired compose)(
  2399. left,
  2400. (hired couple)(compose(constant,right),constant identity))
  2401. then a definition of `refer' as
  2402. refer = (hired bu)(identity,identity)
  2403. would be consistent with the following property of the operator.
  2404. _P15_
  2405. ([[`refer']] `F') `X' = `F (F,X)'
  2406. The proof, as always, is a matter of routine calculation in the manner
  2407. of the section on how `avram' thinks.
  2408. However, this pattern would occur so frequently in recursively
  2409. defined functions as to be a significant waste of space and time.
  2410. Therefore, rather than requiring it to be defined in terms of other
  2411. operations, the virtual machine specification recognizes a pattern of
  2412. the form below with respect to property _P15_,
  2413. _T17_
  2414. [[`refer']] `F' = `(((F,nil),nil),nil)'
  2415. and takes the property to be true by definition of the operator. A
  2416. definition of `refer' consistent with _T17_ is therefore to be found in
  2417. the standard library, not the definition proposed above.
  2418. 
  2419. File: avram.info, Node: Assignment, Next: Predicates, Prev: Recursion, Up: Virtual Code Semantics
  2420. 2.7.10 Assignment
  2421. -----------------
  2422. In an imperative programming paradigm, a machine consists partly of an
  2423. ensemble of addressable storage locations, whose contents are changed
  2424. over time by assignment statements. An assignment statement includes
  2425. some computable function of the global machine state, and the address of
  2426. the location whose contents will be overwritten with the value computed
  2427. from the function when it is evaluated.
  2428. Compiling a language containing assignment statements into virtual
  2429. machine code suitable for `avram' might be facilitated by exploiting
  2430. the following property.
  2431. _P16_
  2432. ([[`assign']] `(P,F)') `X' = [[`replace']] `((P,F X),X)'
  2433. The identifier `assign' is used in `silly' to express a virtual code
  2434. fragment having the form shown below, and `replace' corresponds to a
  2435. further operation to be explained presently.
  2436. _T18_
  2437. [[`assign']] `(P,F)' = `(((P,F),nil),nil)'
  2438. This feature simulates assignment statements in the following way.
  2439. The variable `X' in _P16_ corresponds intuitively to the set of
  2440. addressable locations in the machine. The variable `F' corresponds to
  2441. the function whose value will be stored in the location addressed by
  2442. `P'. The result of a function expressed using `assign' is a new store
  2443. similar to the argument `X', but with the part of it in location `P'
  2444. replaced by `F X'. A source text with a sequence of assignment
  2445. statements could therefore be translated directly into a functional
  2446. composition of trees in this form.
  2447. The way storage locations are modeled in virtual code using this
  2448. feature would be as nested pairs, and the address `P' of a location is
  2449. a tree interpreted similarly to the trees used as operands to the
  2450. `field' operator described in *note Field::, to specify
  2451. deconstructions. In fact, `replace' can be defined as a minimal
  2452. solution to the following equation.
  2453. _E0_
  2454. ([[`field']] `P') [[`replace']] `((P,Y),X)' = `Y'
  2455. This equation regrettably does not lend itself to inferring the
  2456. `silly' source for `replace' using the `isolate' algorithm in *note
  2457. Variable Freedom::, so an explicit construction is given in *note
  2458. Replace::. This construction need not concern a reader who considers
  2459. the equation a sufficiently precise specification in itself.
  2460. In view of the way addresses for deconstruction are represented as
  2461. trees, it would be entirely correct to infer from this equation that a
  2462. tuple of values computed together can be assigned to a tuple of
  2463. locations. The locations don't even have to be "contiguous", but could
  2464. be anywhere in the tree representing the store, and the function is
  2465. computed from the contents of all of them prior to the update. Hence,
  2466. this simulation of assignment fails to capture the full inconvenience of
  2467. imperative programming except in the special case of a single value
  2468. assigned to a single location, but fortunately this case is the only one
  2469. most languages allow.
  2470. There is another benefit to this feature besides running languages
  2471. with assignment statements in them, which is the support of abstract or
  2472. opaque data structures. A function that takes an abstract data structure
  2473. as an argument and returns something of the same type can be coded in a
  2474. way that is independent of the fields it doesn't use. For example, a
  2475. data structure with three fields having the field identifiers `foo',
  2476. `bar', and `baz' in some source language might be represented as a
  2477. tuple `((FOO CONTENTS,BAR CONTENTS),BAZ CONTENTS)' on the virtual code
  2478. level. Compile time constants like `bar = ((nil,(nil,nil)),nil)' could
  2479. be defined in an effort to hide the details of the representation, so
  2480. that the virtual code `field bar' is used instead of
  2481. `compose(right,left)'. Using field identifiers appropriately, a
  2482. function that transforms such a structure by operating on the `bar'
  2483. field could have the virtual code `couple(couple(field
  2484. foo,compose(f,field bar)),field baz)'. However, this code does not
  2485. avoid depending on the representation of the data structure, because it
  2486. relies on the assumption of the `foo' field being on the left of the
  2487. left, and the `baz' field being on the right. On the other hand, the
  2488. code `assign(bar,compose(f,field bar))' does the same job without
  2489. depending on anything but the position of the `bar' field. Furthermore,
  2490. if this position were to change relative to the others, the code
  2491. maintenance would be limited to a recompilation.
  2492. 
  2493. File: avram.info, Node: Predicates, Next: Iteration, Prev: Assignment, Up: Virtual Code Semantics
  2494. 2.7.11 Predicates
  2495. -----------------
  2496. A couple of operations are built into the virtual machine for performing
  2497. tests efficiently. These functions return either `nil' for false or
  2498. `(nil,nil)' for true, and are useful for example as a predicate `P' in
  2499. programs of the form `conditional(P,(F,G))' among other things. In this
  2500. example, the predicate is applied to the argument, a result of
  2501. `(nil,nil)' causes `F' to be applied to it, and a result of `nil'
  2502. causes `G' to be applied to it.
  2503. * Menu:
  2504. * Compare::
  2505. * Member::
  2506. 
  2507. File: avram.info, Node: Compare, Next: Member, Prev: Predicates, Up: Predicates
  2508. 2.7.11.1 Compare
  2509. ................
  2510. A function that performs comparison has a the following very simple
  2511. virtual code representation.
  2512. _T19_
  2513. [[`compare']] = `(nil,nil)'
  2514. The proof of theorem _T19_ is that the standard `silly' prelude
  2515. contains the declaration `compare = (nil,nil)'. Code in this form has
  2516. the following semantics.
  2517. _P17_
  2518. For distinct trees `X' and `Y', [[`compare']] `(X,Y)' = `nil'
  2519. _P18_
  2520. [[`compare']] `(X,X)' = `(nil,nil)'
  2521. In other words, the virtual code `(nil,nil)' implements a function that
  2522. takes a pair of trees and returns true if and only if they are equal.
  2523. It would be fairly simple to write an equivalent virtual code
  2524. application that implements this function if it were not realizable in
  2525. this form by definition of the operator. However, this method is
  2526. preferable because it saves space in virtual code and has a highly
  2527. optimized implementation in C.
  2528. 
  2529. File: avram.info, Node: Member, Prev: Compare, Up: Predicates
  2530. 2.7.11.2 Member
  2531. ...............
  2532. Another built in predicate function has the virtual code shown below.
  2533. _T20_
  2534. [[`member']] = `((nil,nil),((nil,nil),nil))'
  2535. As the mnemonic suggests, the function implemented by this code detects
  2536. whether a given item is a member of a list. The left side of its
  2537. argument is the item to be detected, and the right side is the list that
  2538. may or may not contain it. Lists are represented as explained in *note
  2539. Representation of Numeric and Textual Data::.
  2540. The virtual code semantics can be specified by these three
  2541. properties of the operator.
  2542. _P19_
  2543. [[`member']] `(X,nil)' = `nil'
  2544. _P20_
  2545. [[`member']] `(X,(X,Y))' = `(nil,nil)'
  2546. _P21_
  2547. For distinct trees `X' and `Y', [[`member']] `(X,(Y,Z))' =
  2548. [[`member']] `(X,`z')'
  2549. As in the case of `compare', the implementation of `member' is well
  2550. optimized in C, so this form is to be preferred over an ad hoc
  2551. construction of a membership testing function in virtual code.
  2552. 
  2553. File: avram.info, Node: Iteration, Next: List Combinators, Prev: Predicates, Up: Virtual Code Semantics
  2554. 2.7.12 Iteration
  2555. ----------------
  2556. One of many alternatives to recursion provided by the virtual machine is
  2557. iteration, which allows an operation to be repeated until a condition is
  2558. met. If the source language is imperative, this feature provides an easy
  2559. means of translating loop statements to virtual code. In languages that
  2560. allow functions to be treated as data, iteration can be regarded as a
  2561. function that takes a predicate and a function as arguments, and
  2562. returns a function that applies the given function repeatedly to its
  2563. argument until the predicate is refuted.
  2564. Iterative applications are expressed in virtual code by the pattern
  2565. shown below.
  2566. _T21_
  2567. [[`iterate']] `(P,F)' = `((nil,nil),(nil,(P,F)))'
  2568. In the `silly' language, the `iterate' mnemonic plays the role of the
  2569. function that takes the virtual code for a predicate `P' and a function
  2570. `F' as arguments, and returns the virtual code for an iterating
  2571. function.
  2572. The code for an iterating function is recognized as such by the
  2573. virtual machine emulator only if the subtrees `F' and `P' are other
  2574. than `nil'. The resulting function tests the argument `X' with `P' and
  2575. returns `X' if the predicate is false.
  2576. _P22_
  2577. ([[`iterate']] `(P,F)') `X' = `X' if `P X' = `nil'
  2578. If the predicate turns out to be true, `F' is applied to `X', and then
  2579. another iteration is performed.
  2580. _P23_
  2581. ([[`iterate']] `(P,F)') `X' = ([[`iterate']] `(P,F)') `F X' if `P
  2582. X' is a non-`nil' tree
  2583. 
  2584. File: avram.info, Node: List Combinators, Next: List Functions, Prev: Iteration, Up: Virtual Code Semantics
  2585. 2.7.13 List Combinators
  2586. -----------------------
  2587. There is extensive support for operations on lists in the virtual code
  2588. format. Use of these features is encouraged because they are conducive
  2589. to tight code with explicit concurrency. Within an imperative
  2590. programming paradigm, these features might perhaps have to be understood
  2591. as design patterns or algorithmic skeletons. The present exposition
  2592. takes a functional view, describing them in terms of operators that take
  2593. functions as their arguments and return functions as their result.
  2594. * Menu:
  2595. * Map::
  2596. * Filter::
  2597. * Reduce::
  2598. * Sort::
  2599. * Transfer::
  2600. * Mapcur::
  2601. 
  2602. File: avram.info, Node: Map, Next: Filter, Prev: List Combinators, Up: List Combinators
  2603. 2.7.13.1 Map
  2604. ............
  2605. A virtual code application in the following form causes a function with
  2606. non-`nil' virtual code `F' to be applied to every item in a list.
  2607. _T22_
  2608. [[`map']] `F' = `((nil,nil),((nil,F),nil))'
  2609. The `map' mnemonic is used in `silly' to express applications in this
  2610. form as `map F'. This operation is also well known to lisp users and
  2611. functional programmers. The semantics is determined by these two
  2612. operator properties (for non-`nil' `F').
  2613. _P24_
  2614. ([[`map']] `F') `nil' = `nil'
  2615. _P25_
  2616. ([[`map']] `F') `(X,Y)' = `(F X,('[[`map']] `F) Y)'
  2617. Note that the representation of lists described in *note Representation
  2618. of Numeric and Textual Data::, is assumed.
  2619. 
  2620. File: avram.info, Node: Filter, Next: Reduce, Prev: Map, Up: List Combinators
  2621. 2.7.13.2 Filter
  2622. ...............
  2623. Another well known list operation is that which applies a predicate to
  2624. every item of a list, and deletes those for which the predicate is
  2625. false. For a predicate with virtual code `P', such an application can
  2626. be coded conveniently in this form,
  2627. _T23_
  2628. [[`filter']] `P' = `((nil,nil),(nil,(P,nil)))'
  2629. which is to say that writing `((nil,nil),(nil,(P,nil)))' in `silly' is
  2630. the same as writing `filter P'.
  2631. The virtual machine detects code of this form provided that `P' is
  2632. other than `nil', and evaluates it consistently with the following
  2633. properties, causing it to have the meaning that it does.
  2634. _P26_
  2635. ([[`filter']] `P') `nil' = `nil'
  2636. _P27_
  2637. ([[`filter']] `P') `(X,Y)' = ([[`filter']] `P') `Y' if `P `X'' =
  2638. `nil'
  2639. _P28_
  2640. ([[`filter']] `P') `(X,Y)' = `(X,'([[`filter']] `P') `Y)' if `P X'
  2641. is a non-`nil' tree
  2642. 
  2643. File: avram.info, Node: Reduce, Next: Sort, Prev: Filter, Up: List Combinators
  2644. 2.7.13.3 Reduce
  2645. ...............
  2646. In the virtual code fragment shown below, `F' should be regarded as the
  2647. virtual code for a binary operator, and `K' is a constant.
  2648. _T24_
  2649. [[`reduce']] `(F,K)' = `((nil,nil),((F,K),nil))'
  2650. By constructing a tree in the form shown, the `silly' mnemonic `reduce'
  2651. effectively constructs the code for a function operating on lists in a
  2652. particular way.
  2653. The effect of evaluating an application in this form with an argument
  2654. representing a list can be broken down into several cases.
  2655. * If the list is empty, then the value of `K' is the result.
  2656. * If the list has only one item, then that item is the result.
  2657. * if the list has exactly two items, the first being `X' and the
  2658. second being `Y', then the result is `F (X,Y)'.
  2659. * If the list has more than two items and an even number of them, the
  2660. result is that of evaluating the application with the list of
  2661. values obtained by partitioning the list into pairs of adjacent
  2662. items, and evaluating `F' with each pair.
  2663. * If the list has more than two items and an odd number of them, the
  2664. result is that of evaluating the application with the list of
  2665. values obtained by partitioning the list into pairs of adjacent
  2666. items excluding the last one, evaluating `F' with each pair, and
  2667. then appending the last item to the list of values.
  2668. In the last two cases, evaluation of the application is not necessarily
  2669. finished after just one traversal of the list, because it has to carry
  2670. on until the list is reduced to a single item.
  2671. Some care has been taken to describe this operation in detail
  2672. because it differs from comparable operations common to functional
  2673. programming languages, variously known as fold or reduce. All of these
  2674. operations could be used, for example, to compute the summation of a
  2675. list of numbers. The crucial differences are as follows.
  2676. * Whereas a fold or a reduce is conventionally either of the left or
  2677. right variety, this `reduce' is neither.
  2678. * The vacuous case result `K' is never used at all unless the
  2679. argument is the empty list.
  2680. * This `reduce' is not very useful if the operator `F' is not
  2681. associative.
  2682. The reason for defining the semantics of `reduce' in this way
  2683. instead of the normal way is that a distributed implementation of this one
  2684. could work in logarithmic time, so it's worth making it easy for a
  2685. language processor to detect the pattern in case the virtual code is
  2686. ever going to be targeted to such an implementation. Of course, nothing
  2687. prevents the conventional left or right reduction semantics from being
  2688. translated to virtual code by explicit recursion.
  2689. The precise semantics of this operation are as follows, where `F' is
  2690. not `nil', `K' is unconstrained, and `pairwise' represents a function
  2691. to be explained presently.
  2692. _P29_
  2693. ([[`reduce']] `(F,K)') `nil' = `K'
  2694. _P30_
  2695. ([[`reduce']] `(F,K)') `(X,Y)' =
  2696. [[`left']] ([[`bu(iterate,right)']] [[`pairwise']] `F') `(X,Y)'
  2697. The latter property leverages off some virtual machine features and
  2698. `silly' code that has been defined already. The function implemented by
  2699. [[`pairwise']] `F' is the one that partitions its argument into pairs
  2700. and evaluates `F' with each pair as described above. The combination of
  2701. that with `bu(iterate,right)' results in an application that repeatedly
  2702. performs [[`pairwise']] `F' while the resulting list still has a tail
  2703. (i.e., a `right' side), and stops when the list has only a single item
  2704. (i.e., when `right' is false). The `left' operation then extracts the
  2705. item.
  2706. For the sake of completeness, it is tedious but straightforward to
  2707. give an exact definition for `pairwise'. The short version is that it
  2708. can be anything that satisfies these three equations.
  2709. _E1_
  2710. ([[`pairwise']] `F') `nil' = `nil'
  2711. _E2_
  2712. ([[`pairwise']] `F') `(X,nil)' = `(X,nil)'
  2713. _E3_
  2714. ([[`pairwise']] `F') `(X,(Y,Z))' = `(F (X,Y),'([[`pairwise']] `F')
  2715. `Z)'
  2716. For the long version, see *note Pairwise::.
  2717. 
  2718. File: avram.info, Node: Sort, Next: Transfer, Prev: Reduce, Up: List Combinators
  2719. 2.7.13.4 Sort
  2720. .............
  2721. Sorting is a frequently used operation that has the following standard
  2722. representation in virtual code.
  2723. _T25_
  2724. [[`sort']] `P' = `((nil,nil),((P,nil),(nil,nil)))'
  2725. When an application in this form is evaluated with an operand
  2726. representing a list, the result is a sorted version of the list.
  2727. The way a list is sorted depends on what order is of interest. For
  2728. example, numbers could be sorted in ascending or descending order,
  2729. character strings could be sorted lexically or by length, etc.. The
  2730. value of `P' is therefore needed in sorting applications to specify the
  2731. order. It contains the virtual code for a partial order relational
  2732. operator. This operator can be evaluated with any pair of items
  2733. selected from a list, and should have a value of true if the left one
  2734. should precede the right under the ordering. For example, if numbers
  2735. were to be sorted in ascending order, then `P' would compute the "less
  2736. or equal" relation, returning true if its operand were a pair of
  2737. numbers in which the left is less or equal to the right.
  2738. The virtual code semantics for sorting applications is given by these
  2739. two properties, wherein `P' is a non-`nil' tree, and `insert' is a
  2740. `silly' mnemonic to be defined next.
  2741. _P31_
  2742. ([[`sort']] `P') `nil' = `nil'
  2743. _P32_
  2744. ([[`sort']] `P') `(X,Y)' = ([[`insert']] `P') `(X,'([[`sort']]
  2745. `P') `Y)'
  2746. These properties say that the empty list is already sorted, and a
  2747. non-empty list is sorted if its tail is sorted and the head is then
  2748. inserted in the proper place. This specification of sorting has nothing
  2749. to do with the sorting algorithm that `avram' really uses.
  2750. The meaning of insertion is convenient to specify in virtual code
  2751. itself. It should satisfy these three equations.
  2752. _E4_
  2753. ([[`insert']] `P') `(X,nil)' = `(X,nil)'
  2754. _E5_
  2755. ([[`insert']] `P') `(X,(Y,Z))' = `(Y,'([[`insert']] `P') `(X,Z))'
  2756. if `P(X,Y)' = `nil'
  2757. _E6_
  2758. ([[`insert']] `P') `(X,(Y,Z)') = `(X,(Y,Z))' if `P(X,Y)' is a
  2759. non-`nil' tree
  2760. Intuitively, the right argument, whether `nil' or `(Y,Z)', represents a
  2761. list that is already sorted. The left argument `X' therefore only needs
  2762. to be compared to the head element, `Y' to ascertain whether or not it
  2763. belongs at the beginning. If not, it should be inserted into the tail.
  2764. A possible implementation of `insert' in `silly' is given in *note
  2765. Insert::.
  2766. 
  2767. File: avram.info, Node: Transfer, Next: Mapcur, Prev: Sort, Up: List Combinators
  2768. 2.7.13.5 Transfer
  2769. .................
  2770. A particular interpretation is given to virtual code in the following
  2771. form.
  2772. _T26_
  2773. [[`transfer']] `F' = `((nil,nil),(nil,(nil,F)))'
  2774. When code in this form is evaluated with an argument, the tree `F' is
  2775. used as a state transition function, and the argument is used as a list
  2776. to be traversed. The traversal begins with `F' being evaluated on `nil'
  2777. to get the initial state and the initial output. Thereafter, each item
  2778. of the list is paired with the current state to be evaluated with `F',
  2779. resulting in a list of output and the next state. The output resulting
  2780. from the entire application is the cumulative concatenation of all
  2781. outputs obtained in the course of evaluating `F'. The computation
  2782. terminates when `F' yields a `nil' result. If the list of inputs runs
  2783. out before the computation terminates, `nil' values are used as inputs.
  2784. This behavior is specified more precisely in the following property
  2785. of the operator, which applies in the case of non-`nil' `F'.
  2786. _P33_
  2787. ([[`transfer']] `F') `X' = ([[`transition']] `F') `(nil,(F nil,X))'
  2788. Much of the `transfer' semantics is implicit in the meaning of
  2789. `transition'. For any given application `F', [[`transition']] `F' is
  2790. the virtual code for a function that takes the list traversal from one
  2791. configuration to the next. A configuration is represented as a tuple,
  2792. usually in the form `(PREVIOUS OUTPUTS,((STATE,OUTPUT),(NEXT
  2793. INPUT,SUBSEQUENT INPUTS)))'. A terminal configuration has the form
  2794. `(PREVIOUS OUTPUTS,(nil,(NEXT INPUT,SUBSEQUENT INPUTS)))'. A
  2795. configuration may also have `nil' in place of the pair `(NEXT
  2796. INPUT,SUBSEQUENT INPUTS)' if no more input remains.
  2797. In the non-degenerate case, the meaning of [[`transition']] `F' is
  2798. consistent with the following equation.
  2799. _E7_
  2800. ([[`transition']] `F') `(Y,((S,O),(I,X)))' =
  2801. ([[`transition']] `F') `((O,Y),(F (S,I),X))'
  2802. That is, the current output `O' is stored with previous outputs `Y',
  2803. the next input `I' is removed from the configuration, and the next
  2804. state and output are obtained from the evaluation of `F' with the state
  2805. `S' and the next input `I'.
  2806. In the case where no input remains, the transition function is
  2807. consistent with the following equation.
  2808. _E8_
  2809. ([[`transition']] `F') `(Y,((S,O),nil))' =
  2810. ([[`transition']] `F') `((O,Y),(F (S,nil),nil))'
  2811. This case is similar to the previous one except that the `nil' value is
  2812. used in place of the next input. Note that in either case, nothing
  2813. about `F' depends on the particular way configurations are represented,
  2814. except that it should have a state as its left argument and an input as
  2815. its right argument.
  2816. Finally, in the case of a terminal configuration, the transition
  2817. function returns the cumulative output.
  2818. _E9_
  2819. ([[`transition']] `F') `(Y,(nil,X))' = [[`reduce(cat,nil)']]
  2820. [[`reverse']] `Y'
  2821. The `silly' code `reduce(cat,nil)' has the effect of flattening a list
  2822. of lists into one long list, which is necessary insofar as the
  2823. transition function will have generated not necessarily a single output
  2824. but a list of outputs on each iteration. The `cat' mnemonic stands for
  2825. list concatenation and is explained in *note Cat::. The reversal is
  2826. necessary to cause the first generated output to be at the head of the
  2827. list. List reversal is a built in operation of the virtual machine and
  2828. is described in *note Reverse::.
  2829. If such a function as `transition' seems implausible, its
  2830. implementation in `silly' can be found in *note Transition::.
  2831. It is usually more awkward to express a function in terms of a
  2832. `transfer' than to code it directly using recursion or other list
  2833. operations. However, this feature is provided by the virtual machine for
  2834. several reasons.
  2835. * Functions in this form may be an easier translation target if the
  2836. source is an imperative language.
  2837. * Translating from virtual code to asynchronous circuits or process networks
  2838. has been a research interest of the author, and code in this form
  2839. lends itself to easy recognition and mapping onto discrete
  2840. components.
  2841. * The `--byte-transducer' and `--interactive' command line options
  2842. to `avram' cause an application to be invoked in a similar manner
  2843. to the transition function in a `transfer' function, so this
  2844. feature allows for easy simulation and troubleshooting of these
  2845. applications without actually deploying them.
  2846. 
  2847. File: avram.info, Node: Mapcur, Prev: Transfer, Up: List Combinators
  2848. 2.7.13.6 Mapcur
  2849. ...............
  2850. An alternative form of recursive definition is the following.
  2851. _T27_
  2852. [[`mapcur']] `P' = `((nil,nil),((nil,nil),(P,nil)))'
  2853. This form is convenient for applications that cause themselves to be applied
  2854. recursively to a list of arguments. It has this semantics.
  2855. _P34_
  2856. ([[`mapcur']] `P') `X' = [[`map meta']] [[`distribute']]
  2857. ([[`field']] `P') `X'
  2858. 
  2859. File: avram.info, Node: List Functions, Next: Exception Handling, Prev: List Combinators, Up: Virtual Code Semantics
  2860. 2.7.14 List Functions
  2861. ---------------------
  2862. In addition to the foregoing list operations, the virtual machine provides
  2863. a number of canned functions operating on lists, namely concatenation,
  2864. reversal, distribution, and transposition. These functions could be
  2865. coded by other means if they were not built in, but the built in
  2866. versions are faster and smaller.
  2867. * Menu:
  2868. * Cat::
  2869. * Reverse::
  2870. * Distribute::
  2871. * Transpose::
  2872. 
  2873. File: avram.info, Node: Cat, Next: Reverse, Prev: List Functions, Up: List Functions
  2874. 2.7.14.1 Cat
  2875. ............
  2876. The list concatenation operation has this representation in virtual
  2877. code.
  2878. _T28_
  2879. [[`cat']] = `((nil,nil),(nil,nil))'
  2880. This function takes a pair of lists as an argument, an returns the list
  2881. obtained by appending the right one to the left. The semantics of
  2882. concatenation is what one would expect.
  2883. _P35_
  2884. [[`cat']] `(nil,Z)' = `Z'
  2885. _P36_
  2886. [[`cat']] `((X,Y),Z)' = `(X,'[[`cat']] `(Y,`z'))'
  2887. 
  2888. File: avram.info, Node: Reverse, Next: Distribute, Prev: Cat, Up: List Functions
  2889. 2.7.14.2 Reverse
  2890. ................
  2891. The function that reverses a list has the following representation in
  2892. virtual code.
  2893. _T29_
  2894. [[`reverse']] = `((nil,nil),(nil,(nil,nil)))'
  2895. This function takes a list as an argument, and returns a the list
  2896. consisting of the same items in the reverse order. The semantics is
  2897. given by the following properties.
  2898. _P37_
  2899. [[`reverse']] `nil' = `nil'
  2900. _P38_
  2901. [[`reverse']] `(X,Y)' = [[`cat']] ([[`reverse']] `Y,(X,nil)')
  2902. 
  2903. File: avram.info, Node: Distribute, Next: Transpose, Prev: Reverse, Up: List Functions
  2904. 2.7.14.3 Distribute
  2905. ...................
  2906. The function with the following virtual code representation is
  2907. frequently useful for manipulating lists.
  2908. _T30_
  2909. `distribute' = `(((nil,nil),nil),nil)'
  2910. This function takes a pair whose right side represents a list, and
  2911. returns a list of pairs, with one pair for each item in the list. The
  2912. left side of each pair is the left side of the original argument, and
  2913. the right side is the corresponding item of the list. A semantics for
  2914. this operation is specified by the following properties.
  2915. _P39_
  2916. [[`distribute']] `(X,nil)' = `nil'
  2917. _P40_
  2918. [[`distribute']] `(X,(Y,Z))' = `((X,Y),'[[`distribute']] `(X,Z))'
  2919. 
  2920. File: avram.info, Node: Transpose, Prev: Distribute, Up: List Functions
  2921. 2.7.14.4 Transpose
  2922. ..................
  2923. The `transpose' operation has the following representation in virtual
  2924. code.
  2925. _T31_
  2926. [[`transpose']] = `((nil,nil),((nil,nil),(nil,nil)))'
  2927. This function takes a list of equal length lists as an argument, and
  2928. returns a list of lists as a result. In the resulting list, the first
  2929. item is the list of all first items of lists in the argument. The next
  2930. item is the list of all second items, and so on.
  2931. In the specification of the semantics, the `silly' mnemonic `flat'
  2932. is defined by `flat = reduce(cat,nil)' in the standard `silly' prelude,
  2933. which means that it flattens a list of lists into one long list.
  2934. _P41_
  2935. [[`transpose']] `X' = `nil' if [[`flat']] `X' = `nil'
  2936. _P42_
  2937. [[`transpose']] `X' = `('[[`map left']] `X,'[[`transpose']] [[`map
  2938. right']] `X)'
  2939. if [[`flat']] `X' is a non-`nil' tree
  2940. 
  2941. File: avram.info, Node: Exception Handling, Next: Interfaces to External Code, Prev: List Functions, Up: Virtual Code Semantics
  2942. 2.7.15 Exception Handling
  2943. -------------------------
  2944. In quite a few cases, the properties given for the operator up to this
  2945. point do not imply any particular result. A good example would be an
  2946. expression such as [[`left']] `nil', which appears to represent the
  2947. left side of an empty pair. It can be argued that expressions like this
  2948. have no sensible interpretation and should never be used, so it would
  2949. be appropriate to leave them undefined. On the other hand, attempts to
  2950. evaluate such expressions occur frequently by mistake, and in any case,
  2951. the virtual machine emulator should be designed to do something
  2952. reasonable about them if only for the sake of reporting the error. The
  2953. chosen remedy for this situation addresses the need for error
  2954. reporting, and also turns out to be useful in other ways.
  2955. * Menu:
  2956. * A Hierarchy of Sets::
  2957. * Operator Generalization::
  2958. * Error Messages::
  2959. * Expedient Error Messages::
  2960. * Computable Error Messages::
  2961. * Exception Handler Usage::
  2962. 
  2963. File: avram.info, Node: A Hierarchy of Sets, Next: Operator Generalization, Prev: Exception Handling, Up: Exception Handling
  2964. 2.7.15.1 A Hierarchy of Sets
  2965. ............................
  2966. As indicated already, the virtual machine represents all functions and
  2967. data as members of a set satisfying the properties in *note Raw
  2968. Material::, namely a `nil' element and a `cons' operator for
  2969. constructing trees or nested pairs of `nil'. However, it will be
  2970. necessary to distinguish the results of computations that go wrong for
  2971. exceptional reasons from normal results. Because any tree in the set
  2972. could conceivably represent a normal result, we need to go outside the
  2973. set to find an unambiguous representation of exceptional results.
  2974. Because there may be many possible exceptional conditions, it will
  2975. be helpful to have a large set of possible ways to encode them, and in
  2976. fact there is no need to refrain from choosing a countably infinite
  2977. set. Furthermore, it will be useful to distinguish between different
  2978. levels of severity among exceptional conditions, so for this purpose a
  2979. countably infinite hierarchy of mutually disjoint sets is used.
  2980. In order to build on the theory already developed, the set that has
  2981. been used up to this point will form the bottom level of the hierarchy,
  2982. and its members will represent normal computational results. The
  2983. members of sets on the higher levels in the hierarchy represent
  2984. exceptional results. To avoid ambiguity, the term "trees" is reserved
  2985. for members of the bottom set, as in "for any tree `X' ...". Unless
  2986. otherwise stated, variables like `X' and `Y' are universally quantified
  2987. over the bottom set only.
  2988. Because each set in the hierarchy is countably infinite, it is
  2989. isomorphic to the bottom set. With respect to an arbitrary but fixed
  2990. bijection between them, let `X_N' denote the image in the `N'th level
  2991. set of a tree `X' in the bottom set. The level numbers in this notation
  2992. start with zero, and we take `X_0' to be synonymous with `X'. For good
  2993. measure, let `(X_N)_M' = `X_(N+M)'.
  2994. 
  2995. File: avram.info, Node: Operator Generalization, Next: Error Messages, Prev: A Hierarchy of Sets, Up: Exception Handling
  2996. 2.7.15.2 Operator Generalization
  2997. ................................
  2998. Each set in the hierarchy induces a structure preserving `cons' operator,
  2999. denoted `cons_N' for the `N'th level set, and satisfying this equation.
  3000. _E10_
  3001. `cons_N(X_N,Y_N)' = `(cons(X,Y))_N'
  3002. It will be convenient to generalize all of these `cons' operators to be
  3003. defined on members of other sets than their own.
  3004. _E11_
  3005. For `M' greater than `N', `cons_N(X_M,Y_P)' = `X_M'
  3006. In this equation, `P' is unrestricted. The intuition is that if the
  3007. left operand of a `cons' is the result of a computation that went wrong
  3008. due to an exceptional condition (more exceptional than `N', the level
  3009. already in effect), then the exceptional result becomes the whole
  3010. result.
  3011. It is tempting to hazard a slightly stronger statement, which is that
  3012. this equation holds even if `Y_P' is equal to some expression `F X'
  3013. that is undefined according to the operator semantics. This stipulation
  3014. would correspond to an implementation in which the right operand isn't
  3015. evaluated after an error is detected in the left, but there are two
  3016. problems with it.
  3017. * This semantics might unreasonably complicate a concurrent
  3018. implementation of the virtual machine. If evaluation leads to
  3019. non-termination in some cases where the result is undefined (as it
  3020. certainly would in any possible implementation consistent with
  3021. cases where it's defined), then the mechanism that evaluates the
  3022. right side of a pair must be interruptible in case an exception is
  3023. detected in the left.
  3024. * It is beyond the expressive power of the present mathematical
  3025. framework to make such a statement, because it entails universal
  3026. quantification over non-members of the constructed sets, which
  3027. includes almost everything.
  3028. Nevertheless, the implementation in `avram' is sequential and does
  3029. indeed behave as proposed, with no practical difficulty. As for any
  3030. deficiency in the theory, it could be banished by recasting the
  3031. semantics in terms of a calculus of expressions with formal rules of
  3032. manipulation. An operand to the `cons' operator would be identified not
  3033. with a member of a semantic domain, but with the expression used to
  3034. write it down, and then even "undefinedness" could be defined. However,
  3035. the present author's preference in computing as in life is to let some
  3036. things remain a mystery rather than to abandon the quest for meaning
  3037. entirely.
  3038. A comparable condition applies in cases where the right side of a
  3039. pair represents an exceptional result.
  3040. _E12_
  3041. For `M' greater than `N', `cons_N(X_N,Y_M)' = `Y_M'
  3042. Whereas an infinitude of `cons' operators has been needed, it will be
  3043. possible to get by with only one invisible operator, as before, by
  3044. generalizing it in the following way to operands on any level of the
  3045. hierarchy.
  3046. _P43_
  3047. `F_N X_N' = `(F X)_N'
  3048. _P44_
  3049. For distinct `N' and `M', `F_N X_M' = `X_M'
  3050. That is, the result of evaluating two operands on the same level is the
  3051. image relative to that level of the result of their respective images on
  3052. the bottom level, but the result of evaluating two operands on different
  3053. levels is the same as the right operand.
  3054. 
  3055. File: avram.info, Node: Error Messages, Next: Expedient Error Messages, Prev: Operator Generalization, Up: Exception Handling
  3056. 2.7.15.3 Error Messages
  3057. .......................
  3058. The basic strategy for representing the results of exceptional
  3059. conditions arising from the evaluation of operands on a given level of
  3060. the hierarchy will be to use an error message corresponding to the image
  3061. of a list of character strings on the level above.
  3062. Unfortunately, the official `silly' standard does not define
  3063. character constants, but they are available as a vendor specific
  3064. extension in `silly-me' (millennium edition), where character strings may
  3065. be enclosed in single quotes. The value of the semantic function
  3066. [[...]] in the case of a character string is the list of
  3067. representations of the characters, based on *note Character Table:: and
  3068. *note Representation of Numeric and Textual Data::.
  3069. For the sake of consistency, each standard error message is a list of
  3070. character strings, even though the list has only one string in it. If
  3071. any exceptional condition is the result of a computation, it is written
  3072. to standard error by `avram' as the list of character strings it
  3073. represents.
  3074. _P45_
  3075. ([[`compare']] `nil')`_N' = [[`('invalid
  3076. comparison',nil)']]`_(N+1)'
  3077. _P46_
  3078. ([[`left']] `nil')`_N' = [[`('invalid
  3079. deconstruction',nil)']]`_(N+1)'
  3080. _P47_
  3081. ([[`right']] `nil')`_N' = [[`('invalid
  3082. deconstruction',nil)']]`_(N+1)'
  3083. _P48_
  3084. (([[`fan']] `F') `nil')`_N' = [[`('invalid
  3085. deconstruction',nil)']]`_(N+1)'
  3086. _P49_
  3087. ([[`member']] `nil')`_N' = [[`('invalid
  3088. membership',nil)']]`_(N+1)'
  3089. _P50_
  3090. ([[`distribute']] `nil')`_N' = [[`('invalid
  3091. distribution',nil)']]`_(N+1)'
  3092. _P51_
  3093. ([[`cat']] `nil')`_N' = [[`('invalid
  3094. concatenation',nil)']]`_(N+1)'
  3095. _P52_
  3096. ([[`meta']] `nil')`_N' = [[`('invalid recursion',nil)']]`_(N+1)'
  3097. Note that by virtue of property _P44_, there is no need for an
  3098. application to make explicit checks for exceptional results at any
  3099. point, because the exceptional result propagates through to the output
  3100. of any function composed with the one that incurred it. For example, an
  3101. application of the form `h = compose(f,right)', which will cause an
  3102. invalid deconstruction error if applied in filter mode to an empty file,
  3103. imposes no requirement that `f' be written to accommodate that
  3104. possibility (i.e., by checking for it) in order for the error to be
  3105. reported properly. The following proof demonstrates that the meaning of
  3106. `f' is irrelevant to the result.
  3107. [[`compose(f,right)']]`_0' `nil_0'
  3108. = [[`f']]`_0' [[`right']]`_0' `nil'`_0'
  3109. = [[`f']]`_0' [[`('invalid deconstruction',nil)']]`_1'
  3110. = [[`('invalid deconstruction',nil)']]`_1'
  3111. In an application `h = compose(f,g)', the input validation therefore
  3112. may be confined to the "front end", `g'.
  3113. It will be recalled from the discussions of `recur' (*note Recur::) and
  3114. `transpose' (*note Transpose::) that the semantics of virtual code
  3115. involving these forms is defined in terms of the `field' format for
  3116. deconstruction functions (*note Field::), which depends implicitly on
  3117. the semantics of `left' and `right', being a generalization of them. An
  3118. invalid deconstruction message could therefore result from applications
  3119. incorporating any of the forms of `recur', `transpose', or `field'.
  3120. Invalid deconstructions could also arise from the `replace' operation (*note
  3121. Replace::), which is used for assignment (*note Assignment::), because
  3122. `replace' is defined by virtual code, except as noted next.
  3123. 
  3124. File: avram.info, Node: Expedient Error Messages, Next: Computable Error Messages, Prev: Error Messages, Up: Exception Handling
  3125. 2.7.15.4 Expedient Error Messages
  3126. .................................
  3127. Because there are so many ways to cause an invalid deconstruction, this
  3128. message is the most common in practice and therefore the least
  3129. informative. As a matter of convenience, `avram' takes the liberty of a
  3130. slight departure from the virtual machine specification as written
  3131. hitherto, and employs the following messages when invalid
  3132. deconstructions occur respectively in the cases of recursion,
  3133. transposition, and assignment.
  3134. * `invalid recursion'
  3135. * `invalid transpose'
  3136. * `invalid assignment'
  3137. That is, this section contradicts and supersedes what is stated at the
  3138. end of *note Error Messages:: and implied by the operator properties
  3139. _P14_, _P16_, and _P42_. It is also possible that user applications may
  3140. modify the error messages by methods described in *note Computable
  3141. Error Messages::.
  3142. Whereas these three cases constitute an expedient variation on the
  3143. semantics, there is another sense in which no possible implementation
  3144. could conform faithfully to the specification. When an evaluation can
  3145. not be carried out because of insufficient space on the host machine,
  3146. one of the following error messages may be the result.
  3147. * `memory overflow'
  3148. * `counter overflow'
  3149. These messages are treated in the same way as those that are caused by
  3150. programming errors, and propagate to the final result written to
  3151. standard error without any specific consideration by the application
  3152. developer. The latter occurs only in connection with the built in weight
  3153. function (*note Weight::). Other messages listed in *note Application
  3154. Programming Errors:: are also of this ilk.
  3155. 
  3156. File: avram.info, Node: Computable Error Messages, Next: Exception Handler Usage, Prev: Expedient Error Messages, Up: Exception Handling
  3157. 2.7.15.5 Computable Error Messages
  3158. ..................................
  3159. The automatic generation and reporting of error messages provides a
  3160. reasonable default behavior for applications that do not consider
  3161. exceptional conditions. All applications and their input data are
  3162. ordinarily members of the bottom level set in the hierarchy (*note A
  3163. Hierarchy of Sets::). The error messages caused by invalid operations
  3164. on this level are on the first level above the bottom, which are
  3165. recognized as such and written to standard error without intervention
  3166. from the application. However, there are two drawbacks to this style of
  3167. dealing with exceptions.
  3168. * An application developer may wish to translate error messages into
  3169. terms that are meaningful to the user, not only by literally
  3170. translating them from English to the local vernacular, but perhaps
  3171. by relating the particular exceptional condition to application
  3172. specific causes. While it is convenient for the "back end" code
  3173. not to be required to intervene in the error reporting, it would
  3174. be most inconvenient for it not to be able to do so.
  3175. * Some application specific errors might not correspond directly to
  3176. any of the particular conditions detected automatically due to
  3177. invalid operations, for example a semantic error in a
  3178. syntactically correct input file. It might be convenient in such
  3179. cases for an application to be able to define its own error
  3180. messages but still have them reported automatically like the built
  3181. in messages.
  3182. These situations suggest a need for some ability on the part of an
  3183. application to operate on error messages themselves. Based on the
  3184. operator semantics given so far, such an application is inexpressible,
  3185. because for any application `F_0' and error message `X_1', property
  3186. _P44_ stipulates `F_0 X_1' = `X_1', meaning that the resulting error
  3187. message is unchanged. Therefore, we need to define another basic
  3188. property of the operator.
  3189. The following form of virtual code is used in applications that may
  3190. need to operate on error messages.
  3191. _T32_
  3192. [[`guard']] `(F,G)' = `((nil,F),G)'
  3193. Code in this form has the following semantics.
  3194. _P53_
  3195. ([[`guard']] `(F,G)')`_N' `X_P' = `G_(N+1) F_N X_P'
  3196. The intuitive explanation is that `F' is the main part of the
  3197. application, and `G' is the part of the application that operates on
  3198. the error message that comes from `F' if an exception occurs while it
  3199. is being evaluated (i.e., the "exception handler"). Typically the
  3200. exception handler code implements a function that takes an error
  3201. message as an argument and returns an error message as a result.
  3202. Where there is no exception, the exception handler `G_(N+1)' is
  3203. never used, because its argument will be on level `N', and therefore
  3204. unaffected by an application on level `N+1'.
  3205. Exception handlers may have their own exception handlers, which will
  3206. be invoked if the evaluation of the exception handler causes a further
  3207. exception. Such an exception corresponds semantically to a value on the
  3208. next level of the hierarchy of sets.
  3209. 
  3210. File: avram.info, Node: Exception Handler Usage, Prev: Computable Error Messages, Up: Exception Handling
  3211. 2.7.15.6 Exception Handler Usage
  3212. ................................
  3213. One way for this feature of the virtual machine to be used is to
  3214. intercept and translate error messages to a more meaningful form. An
  3215. application guarded as shown below causes messages of invalid
  3216. deconstruction to be changed to `'syntax error''.
  3217. `main = guard(
  3218. application,
  3219. conditional(
  3220. bu(compare,('invalid deconstruction',nil)),
  3221. (constant ('syntax error',nil),identity)))'
  3222. The conditional compares its argument to the error message for an invalid
  3223. deconstruction, and if it matches, the syntax error message is
  3224. returned, but otherwise the original message is returned. Note that an
  3225. error message must be in the form of a list of character strings, so
  3226. that it can be printed. Although the message of `'syntax error'' might
  3227. not be very informative, at least it looks less like a crash. A real
  3228. application should of course strive to do better than that.
  3229. Exception handling features of the virtual machine can also be
  3230. adapted by applications to raise their own exceptions with customized
  3231. messages.
  3232. error_messenger =
  3233. guard(compose(compare,constant nil),constant ('syntax error',nil))
  3234. This code fragment implements a function that causes a message of
  3235. `'syntax error'' to be reported for any possible input. This code
  3236. works by first causing an invalid comparison and then substituting its
  3237. own error message. A function that always causes an error is not useful
  3238. in itself, but might be used as part of an application in the following
  3239. form.
  3240. main = conditional(validation,(application,error_messenger))
  3241. In this case, the application checks the validity of the input with a
  3242. predicate, and invokes the error messenger if it is invalid.
  3243. Although the previous examples return a fixed error message for each
  3244. possible kind of error, it is also possible to have error messages that
  3245. depend on the input data, as the next example shows.
  3246. main = (hired apply)(
  3247. compose(
  3248. bu(guard,some_application),
  3249. (hired constant)(constant 'invalid input was:',identity)),
  3250. identity)
  3251. If the application causes an exception for any reason, the error message
  3252. returned will include a complete listing of the input, prefaced by the
  3253. words `'invalid input was:''. This particular example works only if the
  3254. input is a list of character strings, but could be adapted for other
  3255. types of data by substituting an appropriate formatting function for the
  3256. first identity. The formatting function would take the relevant data
  3257. type to a list of character strings. Another possible variation would
  3258. be to concatenate the invalid input listing with the error message that
  3259. was generated, rather than just replacing it.
  3260. As the last example may suggest, exception handlers turn out to be an essential
  3261. debugging tool for functional programs, making them as easy to debug as
  3262. imperative programs if not more so. This example forms the basis for a
  3263. higher order function that wraps any given function with an exception
  3264. handler that prints the argument causing it to crash. For arguments not
  3265. causing a crash, the behavior is unchanged. Alternatively, code
  3266. implementing a function that unconditionally reports its argument in an
  3267. error message can be inserted at a strategic point in the application
  3268. code similarly to a print statement. Finally, inspired use of exception
  3269. handlers that concatenate their messages with previously generated
  3270. messages can show something like a parameter stack dump when a
  3271. recursively defined function crashes. These are all matters for a
  3272. language designer and are not pursued further in this document.
  3273. 
  3274. File: avram.info, Node: Interfaces to External Code, Next: Vacant Address Space, Prev: Exception Handling, Up: Virtual Code Semantics
  3275. 2.7.16 Interfaces to External Code
  3276. ----------------------------------
  3277. A few other combinators have been incorporated into the virtual machine
  3278. as alternatives to the style of interactive applications described in
  3279. *note Output From Interactive Applications::. These make it possible to
  3280. interface with external libraries and applications either by a simple
  3281. function call, or by executing a run-time generated transducer as
  3282. described previously. In either case, there is no need for any
  3283. particular command line options to specify interactive invocation, nor
  3284. for the application to be designed that way from the outset. Existing
  3285. virtual code applications may therefore be enhanced to make use of
  3286. these features without radical changes.
  3287. To account for these additional capabilities, it is not entirely
  3288. adequate to continue defining the virtual machine semantics in terms of
  3289. a mathematical function, but it is done nevertheless due to the lack of
  3290. any appealing alternative. Although most library functions are in fact
  3291. functions in the sense that their outputs are determined by their
  3292. arguments, they defy a concise specification within the present
  3293. mathematical framework, especially insofar as they may involve finite
  3294. precision floating point numbers. More problematically, the effect of
  3295. interaction with a shell is neither well defined nor deterministic.
  3296. The descriptions that follow presuppose a computational procedure
  3297. associated with the following definitions but leave its exact nature
  3298. unspecified.
  3299. * Menu:
  3300. * Library combinator::
  3301. * Have combinator::
  3302. * Interaction combinator::
  3303. 
  3304. File: avram.info, Node: Library combinator, Next: Have combinator, Prev: Interfaces to External Code, Up: Interfaces to External Code
  3305. 2.7.16.1 Library combinator
  3306. ...........................
  3307. The simplest and fastest method of interfacing to an external library
  3308. is by way of a virtual machine combinator called `library'. It takes
  3309. two non-empty character strings as arguments to a virtual code program
  3310. of the form implied by the following property.
  3311. _T33_
  3312. [[`library']] (`X',`Y') = `((nil,nil),((X,Y),(nil,nil)))'
  3313. Intuitively, X is the name of a library and Y is the name of a function
  3314. within the library. For example, if X is `'math'' and Y is `'sqrt'',
  3315. then `library'(X,Y) represents the function that computes the square
  3316. root of a floating point number as defined by the host machine's native
  3317. C implementation, normally in IEEE double precision format. Different
  3318. functions and libraries may involve other argument and result types,
  3319. such as complex numbers, arrays, sparse matrices, or arbitrary
  3320. precision numbers. A list of currently supported external library names
  3321. with their functions and calling conventions is given in *note External
  3322. Libraries::.
  3323. On the virtual code side, all function arguments and results
  3324. regardless of their types are encoded as nested pairs of `nil', as
  3325. always, and may be manipulated or stored as any other data, including
  3326. storage and retrieval from files in `.avm' virtual code format (*note
  3327. File Format::). However, on the C side, various memory management and
  3328. caching techniques are employed to maintain this facade while allowing
  3329. the libraries to operate on data in their native format. The details
  3330. are given more fully in the API documentation, particularly in *note
  3331. Type Conversions:: and *note External Library Maintenance::.
  3332. While this style is fast and convenient, it is limited either to
  3333. libraries that have already been built into the virtual machine, or to
  3334. those for which the user is prepared to implement a new interface
  3335. module in C as described in *note Implementing new library functions::.
  3336. 
  3337. File: avram.info, Node: Have combinator, Next: Interaction combinator, Prev: Library combinator, Up: Interfaces to External Code
  3338. 2.7.16.2 Have combinator
  3339. ........................
  3340. As virtual machine interfaces to external libraries accumulate faster
  3341. than they can be documented and may vary from one installation to
  3342. another, it is helpful to have a way of interrogating the virtual
  3343. machine for an up to date list of the installed libraries and
  3344. functions. A combinator called `have' can be used to test for the
  3345. availability of a library function. It takes the form
  3346. _T34_
  3347. [[`have']] (`X',`Y') = `((nil,nil),((nil,X),(nil,Y)))'
  3348. where X is the name of a library and Y is the name of a function within
  3349. the library encoded as character strings. For example, if X is
  3350. `'mtwist'' and Y is `'u_disc'' (for the natural random number generator
  3351. function in the Mersenne twistor library) then `have(X,Y)' is a
  3352. function that returns a non-empty value if an only if that library is
  3353. installed and that function is available within it. The actual argument
  3354. to the function is ignored as the result depends only on the installed
  3355. virtual machine configuration. In this sense, it acts like a `constant'
  3356. combinator.
  3357. One way for this combinator to be used is in code of the form
  3358. portable_rng =
  3359. conditional(
  3360. have('mtwist','u_disc'),
  3361. library('mtwist','u_disc'),
  3362. some_replacement_function)
  3363. which will use the library function if available but otherwise use a
  3364. replacement function. Code in this form makes the decision at run time,
  3365. but it is also possible to express the function such that the check for
  3366. library presence is made at compile time, as the following example
  3367. shows, which will imply a slight improvement in performance.
  3368. non_portable_rng =
  3369. apply(
  3370. conditional(
  3371. have('mtwist','u_disc'),
  3372. constant library('mtwist','u_disc'),
  3373. constant some_replacement_function),
  3374. 0)
  3375. This program would be non-portable in the sense that it would need to
  3376. be recompiled for each installation if there were a chance that some of
  3377. them might have the `mtwist' library and some might not, whereas the
  3378. previous example would be binary compatible across all of them. (1)
  3379. The actual value returned by a function `have(foo,bar)' is the list
  3380. of pairs of strings `<(foo,bar)>' if the function is available, or the
  3381. empty list otherwise. A non-empty list is represented as a pair
  3382. `(head,tail)', and an empty list as `nil'. The angle bracket notation
  3383. `<a,b,c...>' used here is an abbreviation for `(a,(b,(c...nil)))'.
  3384. Either or both arguments to the `have' combinator can be a wildcard,
  3385. which is the string containing a single asterisk, `'*''. In that case,
  3386. the list of all available matching library names and function names
  3387. will be returned. This feature can be used to find out what library
  3388. functions are available without already knowing their names.
  3389. If a library had a function named `'*'', which clashes with the wild
  3390. card string, the interpretation as a wild card would take precedence.
  3391. ---------- Footnotes ----------
  3392. (1) In practice both examples are equally portable because the
  3393. `mtwist' source is distributed with `avram' so all installations will
  3394. have it. Most libraries are distributed separately.
  3395. 
  3396. File: avram.info, Node: Interaction combinator, Prev: Have combinator, Up: Interfaces to External Code
  3397. 2.7.16.3 Interaction combinator
  3398. ...............................
  3399. A further combinator allows virtual code applications to interact
  3400. directly with any interactive console application using the `expect'
  3401. library. The mechanism is similar to that of interactive applications
  3402. documented in the *note Output From Interactive Applications::, but
  3403. attempts to be more convenient. Instead of being designed as an
  3404. interactive application, any virtual code application may use this
  3405. combinator to spawn a shell and interact with it in order to compute
  3406. some desired result.
  3407. The advantage of this combinator over the `library' combinator is
  3408. that it requires no modification of the virtual machine to support new
  3409. applications. It can also interact with applications that may reside on
  3410. remote servers, that are implemented languages other than C, or whose
  3411. source code is unavailable. For example, the GNU R statistical package
  3412. provides an interactive command to evaluate multivariate normal
  3413. distribution functions with an arbitrary covariance matrix, but the
  3414. corresponding function is not provided by the `Rmath' C library (or any
  3415. other free library, to the author's knowledge) because it is
  3416. implemented in interpreted code. This combinator makes it callable by
  3417. an `avram' virtual code application nevertheless. The disadvantage
  3418. compared to the `library' combinator is that there is more overhead in
  3419. spawning a process than simply making a call to a built in function,
  3420. and the programming interface is more complicated.
  3421. The combinator takes the form
  3422. _T35_
  3423. [[`interact']] F = `((nil,nil),(((nil,nil),nil),((nil,F),nil)))'
  3424. where F is the virtual code for a function that follows the same
  3425. protocol described in *note Output From Interactive Applications::,
  3426. except that it does not allow file output as described in *note Mixed
  3427. Modes of Interaction::. The argument `x' is ignored when the expression
  3428. `(interact f) x' is evaluated, similarly to the way the argument is
  3429. ignored in an expression like `(constant k) x'. The result returned is
  3430. a transcript of the dialogue that took place between `f' and the
  3431. externally spawned shell, represented as a list of lists of strings for
  3432. line oriented interaction, or a list of characters alternating with
  3433. lists of strings in the case of character oriented interaction.
  3434. The following example demonstrates a trivial use of the `interact'
  3435. combinator to spawn an `ftp' client, do an `ls' command, and then terminate
  3436. the session.
  3437. eof = <(nil,(nil,(((nil,nil),nil),(nil,nil))))>
  3438. demo =
  3439. interact conditional(
  3440. conditional(identity,constant false,constant true),
  3441. constant(0,<'ftp'>,<'ftp> '>),
  3442. conditional(
  3443. conditional(left,constant false,constant true),
  3444. constant(1,<'ls',''>,<'','ftp> '>),
  3445. conditional(
  3446. compose(compare,couple(left,constant 1)),
  3447. constant(2,<'bye',''>,<eof>),
  3448. constant nil)))
  3449. Some liberties are taken with `silly' syntax in this example, in the
  3450. way of using angle brackets to denote lists, and numbers to represent
  3451. states.
  3452. * The interacting transducer works by checking whether its argument
  3453. is empty (via the `identity' function used as a predicate in the
  3454. `conditional', which is then negated). In that case it returns the
  3455. triple containing the initial state of 0, the `ftp' shell command
  3456. to spawn the client, and the `'ftp> '' prompt expected when the
  3457. client has been spawned, both of the latter being lists of strings.
  3458. * If the argument is non-empty, then next it checks whether it is in
  3459. the initial state of 0, (via the `left' function used as a
  3460. predicate, referring to the state variable expected on the left of
  3461. any given `(state,input)' pair, also negated). If so, it returns
  3462. the triple containing the next state of 1, the `ls' command
  3463. followed by an empty string to indicate a line break, and the
  3464. expected prompt preceded by an empty string to match it only at
  3465. the beginning of a line.
  3466. * Finally, it checks for state 1, in which case it issues the `bye'
  3467. command to close the session, `eof' rather than a prompt to wait
  3468. for termination of the client, and a state of 2.
  3469. * In the remaining state of 2, which needn't be explicitly tested
  3470. because it is the only remaining possibility, the program returns a
  3471. `nil' value to indicate that the computation has terminated.
  3472. Deadlock would be possible at any point if either party did not
  3473. follow this protocol, but for this example it is not an issue. If an
  3474. expression of the form `demo x' were to be evaluated, then regardless
  3475. of the value of `x', the value of the result would be as shown below.
  3476. <
  3477. <'ftp'>,
  3478. <'ftp> '>,
  3479. <'ls',''>,
  3480. <'ls','Not connected.','ftp> '>,
  3481. <'bye',''>,
  3482. <'bye',''>>
  3483. That is, it would be a list of lists of strings, alternating between the
  3484. output of the interactor and the output of the `ftp' client. If the
  3485. spawned application had been something non-trivial such as a computer
  3486. algebra system or a command line web search utility, then it is easy to
  3487. see how functions using this combinator can leverage off a wealth of
  3488. available resources.
  3489. 
  3490. File: avram.info, Node: Vacant Address Space, Prev: Interfaces to External Code, Up: Virtual Code Semantics
  3491. 2.7.17 Vacant Address Space
  3492. ---------------------------
  3493. Not every possible pattern has been used by the virtual machine as a way
  3494. of encoding a function. The following patterns, where `A', `B', and `C'
  3495. are non-`nil' trees, do not represent anything useful.
  3496. unary forms
  3497. `((nil,nil),((nil,nil),(nil,((nil,A),nil))))'
  3498. `((nil,nil),((nil,nil),(nil,(nil,(nil,A)))))'
  3499. binary forms
  3500. `((nil,nil),((nil,nil),(A,B)))'
  3501. `((nil,nil),((A,nil),(B,nil)))'
  3502. `((nil,nil),((A,nil),(nil,B)))'
  3503. ternary forms
  3504. `((nil,nil),((A,B),(C,nil)))'
  3505. `((nil,nil),((A,B),(nil,C)))'
  3506. `((nil,nil),((A,nil),(B,C)))'
  3507. `((nil,nil),((nil,A),(B,C)))'
  3508. These patterns are detected by the virtual machine simply to avoid
  3509. blowing it up, but they always cause an error message to be reported.
  3510. _P55_
  3511. For `F' matching any of the first three trees in the above list,
  3512. `F_N X_N' = [[`('unsupported hook',nil)']]`_(N+1)'
  3513. _P56_
  3514. For the remaining trees `F' in the above list,
  3515. `F_N X_N' = [[`('unrecognized combinator (code
  3516. M)',nil)']]`_(N+1)'
  3517. Here, `M' is a numeric constant dependent on which tree `F' was used.
  3518. The unsupported hook message is meant to be more informative than the
  3519. unrecognized combinator message, suggesting that a feature intended for
  3520. future use is not yet available.
  3521. This list has been assembled for the benefit of readers considering
  3522. the addition of backward compatible extensions to the virtual code
  3523. semantics, who are undeterred by the facts that
  3524. * the computational model is already universal
  3525. * virtual code applications are already interoperable with all kinds
  3526. of high performance software having a text based or console
  3527. interface by way of the `interact' combinator
  3528. * an unlimited number of built in library functions can be added by
  3529. way of the `library' combinator as described in *note Implementing
  3530. new library functions::
  3531. * the C code in `avram' makes fairly intricate use of pointers with
  3532. a careful policy of reference counting and storage reclamation
  3533. * there is also a performance penalty incurred by further extensions
  3534. to the semantics, even for applications that don't use them,
  3535. because a pattern recognition algorithm in the interpreter has
  3536. more cases to consider.
  3537. Nevertheless, a new functional form combining a pair of functions to
  3538. be interpreted in a new way by the virtual machine could be defined
  3539. using any of the binary forms above, for example, with `A' as the
  3540. virtual code for one of the functions and `B' as that of the other.
  3541. Such a form would not conflict with any existing applications, provided
  3542. that both `A' and `B' are not `nil', which is true of any valid
  3543. representation for a function.
  3544. Virtual machine architects, take note. There are infinitely many
  3545. trees fitting these patterns, but it would be possible to use them up by
  3546. assigning them without adequate foresight. For example, if
  3547. interpretations were assigned to the four ternary forms, the three
  3548. binary forms, and one of the remaining unary forms, then the only
  3549. unassigned pattern could be of the form
  3550. `((nil,nil),((nil,nil),(nil,(nil,(nil,A)))))'
  3551. Assigning an interpretation to it would leave no further room for
  3552. backward compatible expansion. On the other hand, any tree of the
  3553. following form also fits the above pattern,
  3554. `((nil,nil),((nil,nil),(nil,(nil,(nil,(B,C))))))'
  3555. with any values for `B' and `C'. Different meanings could be chosen for
  3556. the case where both are `nil', both are non-`nil', or one is `nil' and
  3557. the other non-`nil', allowing two unary forms, one binary, and one
  3558. constant. If at least one of these patterns is reserved for future
  3559. enhancements, then a potentially inexhaustible supply of address space
  3560. remains and there will be no need for incompatible changes later.
  3561. 
  3562. File: avram.info, Node: Library Reference, Next: Character Table, Prev: Virtual Machine Specification, Up: Top
  3563. 3 Library Reference
  3564. *******************
  3565. Much of the code developed for `avram' may be reusable in other
  3566. projects, so it has been packaged into a library and documented in this
  3567. chapter. For ease of reference, this chapter is organized with a
  3568. separate section for each source file. For the most part, each source
  3569. file encapsulates an abstract type and a number of related functions,
  3570. except for a few cases where C makes such a design awkward. An attempt
  3571. has been made to present the sections in a readable order as far as
  3572. possible.
  3573. The documentation in this chapter is confined to the application
  3574. program interface (API), and does not delve unnecessarily into any
  3575. details of the implementation. A reader wishing to extend, modify, or
  3576. troubleshoot the library itself can find additional information in the
  3577. source code comments. These are more likely to be in sync with the code
  3578. than this document may be, and are more readily accessible to someone
  3579. working with the code.
  3580. Some general points pertaining to the library are the following.
  3581. * Unlike the previous chapter, this chapter uses the word "function"
  3582. in the C sense rather than the mathematical sense of the word.
  3583. * Internal errors are internal from the user's point of view, not
  3584. the developer's (*note Internal Errors::). Invoking these
  3585. functions in ways that are contrary to their specifications can
  3586. certainly cause internal errors (not to mention segfaults).
  3587. * The library is definitely not thread safe, and thread safety is not
  3588. a planned enhancement. The amount of locking required to make it
  3589. thread safe would probably incur an objectionable performance
  3590. penalty due to the complexity of the shared data structures
  3591. involved, in addition to being very difficult to get right. If you
  3592. need these facilities in a concurrent application, consider
  3593. spawning a process for each client of the library so as to keep
  3594. their address spaces separate.
  3595. * The library files are built from the standard source distribution
  3596. using GNU `libtool'. In the default directory hierarchy, they will
  3597. be found either in `/usr/lib/libavram.*' or in
  3598. `/usr/local/lib/libavram.*'. These directories will differ in a
  3599. non-standard installation.
  3600. * The header files will probably be located in either
  3601. `/usr/include/avm/*.h' or `/usr/local/include/avm/*.h' for a
  3602. standard installation.
  3603. * All exported functions, macros and constants are preceded with
  3604. `avm_', so as to reduce the chance of name clashes with other
  3605. libraries. Not all type declarations or field identifiers follow
  3606. this convention, because that would be far too tedious.
  3607. * The library header files are designed to be compatible with C++ but
  3608. have been tested only with C. Please refer to platform specific
  3609. documentation for further information on how to link library
  3610. modules with your own code.
  3611. * Menu:
  3612. * Lists::
  3613. * Characters and Strings::
  3614. * File Manipulation::
  3615. * Invocation::
  3616. * Version Management::
  3617. * Error Reporting::
  3618. * Profiling::
  3619. * Emulation Primitives::
  3620. * External Library Maintenance::
  3621. 
  3622. File: avram.info, Node: Lists, Next: Characters and Strings, Prev: Library Reference, Up: Library Reference
  3623. 3.1 Lists
  3624. =========
  3625. The basic data structure used for representing virtual code and data in
  3626. the `avram' library is declared as a `list'. The `list' type is a
  3627. pointer to a structure having a `head' field and a `tail' field, which
  3628. are also lists. The empty tree, `nil', is represented by the C constant
  3629. `NULL'. A tree of the form `cons(A,B)' is represented in C as a list
  3630. whose `head' is the representation of `A' and whose `tail' is the
  3631. representation of `B'.
  3632. A number of other fields in the structure are maintained
  3633. automatically and should not be touched. For that matter, even the
  3634. `head' and `tail' fields should be considered read-only. Because of
  3635. sharing, it is almost never valid to modify a list "in place", except
  3636. for cases that are already covered by library functions.
  3637. * Menu:
  3638. * Simple Operations::
  3639. * Recoverable Operations::
  3640. * List Transformations::
  3641. * Type Conversions::
  3642. * Comparison::
  3643. * Deconstruction Functions::
  3644. * Indirection::
  3645. * The Universal Function::
  3646. 
  3647. File: avram.info, Node: Simple Operations, Next: Recoverable Operations, Prev: Lists, Up: Lists
  3648. 3.1.1 Simple Operations
  3649. -----------------------
  3650. These functions are declared in the header file `lists.h', which should
  3651. be included in any C source file that uses them with a directive such
  3652. as `#include <avm/lists.h>'. All of these functions except the first
  3653. three have the potential cause a memory overflow. In that event, a
  3654. brief message is written to standard error and the process is killed
  3655. rather than returning to the caller. It is possible for client programs
  3656. requiring more robust behavior to do their own error handling by using
  3657. the alternative versions of these operations described in the next
  3658. section.
  3659. -- Function: void avm_initialize_lists ()
  3660. The function `avm_initialize_lists' should be called before any of
  3661. the other ones in this section is called, because it sets up some
  3662. internal data structures. Otherwise, the behavior of the other
  3663. functions is undefined.
  3664. -- Function: void avm_dispose (list FRONT)
  3665. This function deallocates the memory associated with a given list,
  3666. either by consigning it to a cache maintained internally by the
  3667. library, or by the standard `free' function if the cache is full.
  3668. Shared lists are taken into account and handled properly according
  3669. to a reference counting scheme. Lists should be freed only by this
  3670. function, not by using `free' directly.
  3671. -- Function: void avm_count_lists ()
  3672. If a client program aims to do its own storage reclamation, this
  3673. function can be called optionally at the end of a run when it is
  3674. believed that all lists have been freed. If any allocated lists
  3675. remain at large, a warning will be printed to standard error. This
  3676. function therefore provides a useful check for memory leaks.
  3677. Overhead is small enough that it is not infeasible to leave this
  3678. check in the production code.
  3679. -- Function: list avm_copied (list OPERAND)
  3680. A copy of the argument list is returned by this function. The copy
  3681. remains intact after the original is reclaimed. A typical use
  3682. might be for retaining part of a list after the rest of it is no
  3683. longer needed. In this example, a list `x' is traversed by a
  3684. hypothetical `visit' function to each item, which is then
  3685. immediately reclaimed.
  3686. while(x){
  3687. visit(x->head);
  3688. old_x = x;
  3689. x = avm_copied(x->tail); /* the right way */
  3690. avm_dispose(old_x);
  3691. }
  3692. This example allows each item in the list to be visited even as
  3693. previously visited items are reclaimed, because `x' is copied at
  3694. each iteration. This example contrasts with the next one, which
  3695. will probably cause a segmentation fault.
  3696. while(x){
  3697. visit(x->head);
  3698. old_x = x;
  3699. x = x->tail; /* the wrong way */
  3700. avm_dispose(old_x);
  3701. }
  3702. In the second example, a reference is made to a part of a list
  3703. which no longer exists because it has been deallocated.
  3704. In fact, the `avm_copied' function does nothing but increment a
  3705. reference count, so it is a fast, constant time operation that
  3706. requires no additional memory allocation. Semantically this action
  3707. is equivalent to creating a fresh copy of the list, because all
  3708. list operations in the library deal with reference counts properly.
  3709. -- Function: list avm_join (list LEFT, list RIGHT)
  3710. This function takes a pair of lists to a list in which the left is
  3711. the head and the right is the tail. It may need to use `malloc' to
  3712. allocate additional memory. If there is insufficient memory, an
  3713. error message is written to standard error and the program exits.
  3714. When the list returned by `avm_join' is eventually deallocated, the
  3715. lists from which it was built are taken with it and must not be
  3716. referenced again. For example, the following code is an error.
  3717. z = avm_join(x,y);
  3718. ...
  3719. avm_dispose(z);
  3720. avm_print_list(x); /* error here */
  3721. To accomplish something similar to this without an error, a copy of
  3722. `x' should be made, as in the next example.
  3723. z = avm_join(avm_copied(x),y);
  3724. ...
  3725. avm_dispose(z);
  3726. avm_print_list(x); /* original x still intact */
  3727. -- Function: void avm_enqueue (list *FRONT, list *BACK, list OPERAND)
  3728. A fast simple way of building a list head first is provided by the
  3729. `enqueue' function. The `front' is a pointer to the beginning of
  3730. the list being built, and the `back' is a pointer to the last
  3731. item. The recommended way to use it would be something like this.
  3732. front = back = NULL;
  3733. avm_enqueue(&front,&back,item);
  3734. avm_enqueue(&front,&back,next_item);
  3735. avm_enqueue(&front,&back,another_item);
  3736. ...
  3737. It might be more typical for the calls to `avm_enqueue' to appear
  3738. within a loop. In any case, after the above code is executed, the
  3739. following postconditions will hold.
  3740. front->head == item
  3741. front->tail->head == next_item
  3742. front->tail->tail->head == another_item
  3743. back->head == another_item
  3744. back->tail == NULL
  3745. The `avm_enqueue' function must never be used on a shared list,
  3746. because it modifies its arguments in place. The only practical way
  3747. to guarantee that a list is not shared is to initialize the
  3748. `front' and `back' to `NULL' as shown before the first call to
  3749. `avm_enqueue', and to make no copies of `front' or `back' until
  3750. after the last call to `avm_enqueue'.
  3751. Because a list built with `avm_enqueue' is not shared, it is one
  3752. of the few instances of a list that can have something harmlessly
  3753. appended to it in place. For example, if the next line of code were
  3754. back->tail = rest_of_list;
  3755. that would be acceptable assuming `rest_of_list' is not shared and
  3756. does not conceal a dangling or cyclic reference, and if nothing
  3757. further were enqueued.
  3758. The items that are enqueued into a list are not copied and will be
  3759. deallocated when the list is deallocated, so they must not be
  3760. referenced thereafter. A non-obvious violation of this convention
  3761. is implicit in the following code.
  3762. ...
  3763. avm_enqueue(&front,&back,x->head);
  3764. ...
  3765. avm_dispose(front);
  3766. avm_print_list(x); /* error here */
  3767. This code might cause a segmentation fault because of the
  3768. reference to `x' after its head has been deallocated. The
  3769. following code is subject to the same problem,
  3770. ...
  3771. avm_enqueue(&front,&back,x->head);
  3772. ...
  3773. avm_dispose(x);
  3774. avm_print_list(front); /* error here */
  3775. as is the following.
  3776. ...
  3777. avm_enqueue(&front,&back,x->head);
  3778. ...
  3779. avm_dispose(x); /* front is now impossible to reclaim */
  3780. avm_dispose(front);
  3781. The problem with the last example is that it is not valid even to
  3782. dispose of the same list more than once, albeit indirectly.
  3783. If part of a list is intended to be enqueued temporarily or
  3784. independently of its parent, the list should be copied explicitly,
  3785. as the following code demonstrates.
  3786. ...
  3787. avm_enqueue(&front,&back,avm_copied(x->head)); /* correct */
  3788. ...
  3789. avm_dispose(front);
  3790. avm_print_list(x);
  3791. -- Function: counter avm_length (list OPERAND)
  3792. A `counter' is meant to be the longest unsigned integer available on
  3793. the host machine, and is defined in `common.h', which is
  3794. automatically included whenever `lists.h' is included. The
  3795. `avm_length' function returns the number of items in a list. If a
  3796. list is `NULL', a value of zero is returned. There is a possibility
  3797. of a counter overflow error from this function (*note Overflow
  3798. Errors::), but only on a platform where the `counter' type is
  3799. shorter than the address length.
  3800. -- Function: counter avm_area (list OPERAND)
  3801. This function is similar to `avm_length', but it treats its
  3802. argument as a list of lists and returns the summation of their
  3803. lengths.
  3804. -- Function: list avm_natural (counter NUMBER)
  3805. This function takes a `counter' to its representation as a list, as
  3806. described in *note Representation of Numeric and Textual Data::.
  3807. That is, the number is represented as a list of bits, least
  3808. significant bit first, with each zero bit represented by `NULL'
  3809. and each one bit represented by a list whose `head' and `tail' are
  3810. `NULL'.
  3811. -- Function: void avm_print_list (list OPERAND)
  3812. The `avm_print_list' function is not used in any production code
  3813. but retained in the library for debugging purposes. It prints a
  3814. list to standard output using an expression involving only commas
  3815. and parentheses, as per the `silly' syntax (*note A Simple Lisp
  3816. Like Language::). The results quickly become unintelligible for
  3817. lists of any significant size. The function is recursively
  3818. defined and will crash in the event of a stack overflow, which
  3819. will occur in the case of very large or cyclic lists.
  3820. -- Function: list avm_position (list KEY, list TABLE, int *FAULT)
  3821. This function searches for a KEY in a short TABLE where each item
  3822. is a possible key.
  3823. If it's not found, a `NULL' value is returned. If it's found, a
  3824. list representing a character encoding according to *note
  3825. Character Table:: is returned.
  3826. The ascii code of the character corresponding to the returned list
  3827. is the position of the KEY in the TABLE, assuming position numbers
  3828. start with 1.
  3829. The table should have a length of 255 or less. If it's longer and
  3830. the KEY is found beyond that range, the higher order bits of the
  3831. position number are ignored.
  3832. The integer referenced by FAULT is set to a non-zero value in the
  3833. event of a memory overflow, which could happen in the course of
  3834. the list comparisons necessary for the search.
  3835. 
  3836. File: avram.info, Node: Recoverable Operations, Next: List Transformations, Prev: Simple Operations, Up: Lists
  3837. 3.1.2 Recoverable Operations
  3838. ----------------------------
  3839. The functions in this section are similar to the ones in the previous
  3840. section except with regard to error handling. Whereas the other ones
  3841. cause an error message to be printed and the process to exit in the
  3842. event of an overflow, these return to the caller, whose responsibility
  3843. it is to take appropriate action. The functions in both sections are
  3844. declared in `lists.h', and should be preceded by a call to
  3845. `avm_initialize_lists'.
  3846. -- Function: list avm_recoverable_join (list LEFT, list RIGHT)
  3847. This function is similar to `avm_join', but will return a `NULL'
  3848. pointer if memory that was needed can not be allocated. A `NULL'
  3849. pointer would never be the result of a join under normal
  3850. circumstances, so the overflow can be detected by the caller.
  3851. Regardless of whether overflow occurs, the arguments are
  3852. deallocated by this function and should not be referenced
  3853. thereafter.
  3854. -- Function: void avm_recoverable_enqueue (list *FRONT, list *BACK,
  3855. list OPERAND, int *FAULT)
  3856. This version of the enqueue function will dispose of the `OPERAND'
  3857. if there isn't room to append another item and set `*FAULT' to a
  3858. non-zero value. Other than that, it does the same as `avm_enqueue'.
  3859. -- Function: counter avm_recoverable_length (list OPERAND)
  3860. This function checks for arithmetic overflow when calculating the
  3861. length of a list, and returns a zero value if overflow occurs. The
  3862. caller can detect the error by noting that zero is not the length
  3863. of any list other than `NULL'. This kind of overflow is impossible
  3864. unless the host does not have long enough integers for its address
  3865. space.
  3866. -- Function: counter avm_recoverable_area (list OPERAND, int *FAULT)
  3867. This function is similar to `avm_area', except that it reacts
  3868. differently to arithmetic overflow. The `fault' parameter should be
  3869. the address of an integer known to the caller, which will be set
  3870. to a non-zero value if overflow occurs. In that event, the value
  3871. of zero will also be returned for the area. Note that it is
  3872. possible for non-empty lists to have an area of zero, so this
  3873. condition alone is not indicative of an error.
  3874. -- Function: list avm_recoverable_natural (counter NUMBER)
  3875. This function returns the `list' representation of a native
  3876. unsigned long integer, provided that there is enough memory,
  3877. similarly to the `avm_natural' function. Unlike that function,
  3878. this one will return a value of `NULL' rather than exiting the
  3879. program in the event of a memory overflow. The overflow can be
  3880. detected by the caller insofar as a `NULL' `list' does not
  3881. represent any number other than zero.
  3882. 
  3883. File: avram.info, Node: List Transformations, Next: Type Conversions, Prev: Recoverable Operations, Up: Lists
  3884. 3.1.3 List Transformations
  3885. --------------------------
  3886. Some functions declared in `listfuns.h' are used to implement the
  3887. operations described in *note List Functions::. These functions are able
  3888. to report error messages in the event of overflow or other exceptional conditions,
  3889. as described in *note Error Messages::. The error messages are
  3890. represented as lists and returned to the caller. The occurrence of an
  3891. error can be detected by the `*FAULT' flag being set to a non-zero
  3892. value. None of these functions ever causes a program exit except in the
  3893. event of an internal error.
  3894. -- Function: void avm_initialize_listfuns ()
  3895. This has to be called before any of the other functions in this
  3896. section is called. It initializes the error message lists, among
  3897. other things.
  3898. -- Function: void avm_count_listfuns ()
  3899. At the end of a run, a call to this function can verify that no
  3900. unreclaimed storage attributable to these functions persists. If it
  3901. does, a warning is printed to standard error. If `avm_count_lists'
  3902. is also used, it must be called after this function.
  3903. -- Function: list avm_reversal (list OPERAND, int *FAULT)
  3904. The reversal of the list is returned by this function if no
  3905. overflow occurs. A non-zero `*FAULT' and an error message are
  3906. returned otherwise. The original `OPERAND' still exists in its
  3907. original order after this function is called. The amount of
  3908. additional storage allocated is proportional only to the length of
  3909. the list, not the size of its contents.
  3910. -- Function: list avm_distribution (list OPERAND, int *FAULT)
  3911. This function performs the operation described in *note
  3912. Distribute::. The invalid distribution message is returned in the
  3913. event of a `NULL' operand. Otherwise, the returned value is the
  3914. distributed list. In any event, the `OPERAND' is unaffected.
  3915. -- Function: list avm_concatenation (list OPERAND, int *FAULT)
  3916. The `OPERAND' is treated as a pair of lists to be concatenated,
  3917. with the left one in the `head' field and the right one in the
  3918. `tail' field. The invalid concatenation message is returned in the
  3919. event of a `NULL' `OPERAND'. The result returned otherwise is the
  3920. concatenation of the lists, but the given `OPERAND' still exists
  3921. unchanged.
  3922. -- Function: list avm_transposition (list OPERAND, int *FAULT)
  3923. The operation performed by this function corresponds to that of
  3924. *note Transpose::. Unlike other functions in this section, the
  3925. operand passed to this function is deallocated, and must not be
  3926. referenced thereafter. The transposed list is accessible as the
  3927. returned value of this function. If the original `OPERAND' is
  3928. still needed after a call to `avm_transposition', only a copy of
  3929. it should be passed to it, obtained from `avm_copied'. The invalid
  3930. transpose error message is the result if the operand does not
  3931. represent a list of equal length lists.
  3932. -- Function: list avm_membership (list OPERAND, int *FAULT)
  3933. This function computes the membership predicate described in *note
  3934. Member::. The operand is a list in which the `tail' field is a
  3935. list that will be searched for the item in the `head'. If the item
  3936. is not found, a `NULL' list is returned, but otherwise a list with
  3937. `NULL' `head' and `tail' fields is returned. If the operand is
  3938. `NULL', an error message of invalid membership is returned and
  3939. `*FAULT' is set to a non-zero value.
  3940. The `avm_membership' function calls `avm_binary_comparison' in
  3941. order to compare lists, so the same efficiency and side-effect
  3942. considerations are relevant to both (*note Comparison::). It is not
  3943. necessary to `#include' the header file `compare.h' or to call
  3944. `avm_initialize_compare' in order to use `avm_membership', because
  3945. they will be done automatically.
  3946. -- Function: list avm_binary_membership (list OPERAND, list MEMBERS,
  3947. int *FAULT);
  3948. This function is the same as `avm_membership' except that it
  3949. allows the element and the set of members to be passed as separate
  3950. lists instead of being the head and the tail of the same list.
  3951. -- Function: list avm_measurement (list OPERAND, int *FAULT)
  3952. This function implements the operation described in *note
  3953. Weight::, which pertains to the weight of a tree. The returned
  3954. value of this function is a list encoding the weight as a binary
  3955. number, unless a counter overflow occurs, in which case it's an
  3956. error message. As noted previously, the weight of a tree can
  3957. easily be exponentially larger than the amount of memory it
  3958. occupies, but this function uses native integer arithmetic for
  3959. performance reasons. Hence, a counter overflow is a real
  3960. possibility.
  3961. 
  3962. File: avram.info, Node: Type Conversions, Next: Comparison, Prev: List Transformations, Up: Lists
  3963. 3.1.4 Type Conversions
  3964. ----------------------
  3965. External library functions accessed by the `library' combinator as
  3966. explained in *note Library combinator:: may operate on data other than
  3967. the `list' type usually used by `avram', such as floating point numbers
  3968. and arrays, but a virtual code application must be able to represent
  3969. the arguments and results of these functions in order to use them. As a
  3970. matter of convention, a data structure occupying SIZE bytes of
  3971. contiguous storage on the host machine appears as a list of length SIZE
  3972. to a virtual code application, in which each item corresponds to a
  3973. byte, and is represented according to *note Character Table::.
  3974. In principle, a virtual code application invoking a library function
  3975. to operate on a contiguous block of data, such as an IEEE double
  3976. precision number, for example, would construct a list of eight
  3977. character representations (one for each byte in a double precision
  3978. number), and pass this list as an argument to the library function. The
  3979. virtual machine would transparently convert this representation to the
  3980. native floating point format, evaluate the function, and convert the
  3981. result back to a list. In practice, high level language features
  3982. beyond the scope of this document would insulate the programmer from
  3983. some of the details on the application side as well.
  3984. To save the time of repeatedly converting between the list
  3985. representation and the contiguous native binary representation, the
  3986. structure referenced by a `list' pointer contains a `value' field which
  3987. is a `void' pointer to a block of memory of unspecified type, and
  3988. serves as a persistent cache of the value represented by the list. This
  3989. field normally should be managed by the API rather than being accessed
  3990. directly by client modules, but see the code in `mpfr.c' for an example
  3991. of a situation in which it's appropriate to break this rule. (Generally
  3992. these situations involve library functions operating on non-contiguous
  3993. data.)
  3994. * Menu:
  3995. * Primitive types::
  3996. * One dimensional arrays::
  3997. * Two dimensional arrays::
  3998. * Related utility functions::
  3999. 
  4000. File: avram.info, Node: Primitive types, Next: One dimensional arrays, Prev: Type Conversions, Up: Type Conversions
  4001. 3.1.4.1 Primitive types
  4002. .......................
  4003. A pair of functions in support of this abstraction is prototyped in
  4004. `listfuns.h'. These functions will be of interest mainly to developers
  4005. wishing to implement an interface to a new library module and make it
  4006. accessible on the virtual side by way of the `library' combinator
  4007. (*note Library combinator::).
  4008. -- Function: void *avm_value_of_list (list OPERAND, list *MESSAGE, int
  4009. *FAULT)
  4010. This function takes an OPERAND representing a value used by a
  4011. library function in the format described above (*note Type
  4012. Conversions::) and returns a pointer to the value.
  4013. The `value' field in the OPERAND normally will point to the block
  4014. of memory holding the value, and the OPERAND itself will be a list
  4015. of character representations whose binary encodings spell out the
  4016. value as explained above.
  4017. The `value' field need not be initialized on entry but it will be
  4018. initialized as a side effect of being computed by this function.
  4019. If it has been initialized due to a previous call with the same
  4020. OPERAND, this function is a fast constant time operation.
  4021. The caller should not free the pointer returned by this function
  4022. because a reference to its value will remain in the OPERAND. When
  4023. the OPERAND itself is freed by `avm_dispose' (*note Simple
  4024. Operations::), the value will go with it.
  4025. If an error occurs during the evaluation of this function, the
  4026. integer referenced by FAULT will be set to a non-zero value, and
  4027. the list referenced by MESSAGE will be assigned a representation of
  4028. a list of strings describing the error. The MESSAGE is freshly
  4029. created and should be freed by the caller with `avm_dispose' when
  4030. no longer needed.
  4031. Possible error messages are `<'missing value'>', in the case of an
  4032. empty OPERAND, `<'invalid value'>' in the case of an OPERAND that
  4033. is not a list of character representations, and `<'memory
  4034. overflow'>' if there was insufficient space to allocate the result.
  4035. -- Function: list avm_list_of_value (void *CONTENTS, size_t SIZE, int
  4036. *FAULT)
  4037. This function performs the inverse operation of
  4038. `avm_value_of_list', taking the address of an area of contiguously
  4039. stored data and its SIZE in bytes to a list representation. The
  4040. length of the list returned is equal to the number of bytes of
  4041. data, SIZE, and each item of the list is a character
  4042. representation for the corresponding byte as given by *note
  4043. Character Table::.
  4044. A copy of the memory area is made so that the original is no longer
  4045. needed and may be freed by the caller. A pointer to this copy is
  4046. returned by subsequent calls to `avm_value_of_list' when the
  4047. result returned by this function is used as the OPERAND parameter.
  4048. If there is insufficient memory to allocate the result, the integer
  4049. referenced by FAULT is set to a non-zero value, and a copy of the
  4050. message `<'memory overflow'>' represented as a list is returned.
  4051. This function could also cause a segmentation fault if it is passed
  4052. an invalid pointer or a SIZE that overruns the storage area.
  4053. However, it is acceptable to specify a SIZE that is less than the
  4054. actual size of the given memory area to construct a list
  4055. representing only the first part of it. The SIZE must always be
  4056. greater than zero.
  4057. 
  4058. File: avram.info, Node: One dimensional arrays, Next: Two dimensional arrays, Prev: Primitive types, Up: Type Conversions
  4059. 3.1.4.2 One dimensional arrays
  4060. ..............................
  4061. A couple of functions declared in `matcon.h' are concerned mainly with
  4062. one dimensional arrays or vectors. They have been used for vectors of
  4063. double precision and complex numbers, but are applicable to any base
  4064. type that is contiguous and of a fixed size.
  4065. The motivation for these functions is to enable a developer to
  4066. present an API to virtual code applications wherein external library
  4067. functions operating natively on one dimensional arrays of numbers are
  4068. seen from the virtual side to operate on lists of numbers. Lists are the
  4069. preferred container for interoperability with virtual code applications.
  4070. -- Function: void *avm_vector_of_list (list OPERAND, size_t ITEM_SIZE,
  4071. list *MESSAGE, int *FAULT)
  4072. This function calls `avm_value_of_list' (*note Primitive types::)
  4073. for each item of the OPERAND and puts all the values together into
  4074. one contiguous block, whose address is returned.
  4075. The given ITEM_SIZE is required to be the lengths of the items,
  4076. all necessarily equal, and is required only for validation. For
  4077. example, ITEM_SIZE is 8 for a list of double precision numbers,
  4078. because they occupy 8 bytes each and are represented as lists of
  4079. length 8.
  4080. The total number of bytes allocated is the product of ITEM_SIZE
  4081. and the length of the OPERAND. Unlike the case of
  4082. `avm_value_of_list' (*note Primitive types::), the result returned
  4083. by this function should be explicitly freed by the caller when no
  4084. longer needed.
  4085. Any errors such as insufficient memory cause the integer
  4086. referenced by FAULT to be assigned a non-zero value and the
  4087. MESSAGE to be assigned an error message represented as a list of
  4088. strings. An error message of `<'bad vector specification'>' is
  4089. possible in the case of an empty OPERAND or one whose item lengths
  4090. don't match the given ITEM_SIZE. Error messages caused by
  4091. `avm_value_of_list' can also be generated by this function. Any
  4092. non-empty error message should be reclaimed by the caller using
  4093. `avm_dispose' (*note Simple Operations::). If an error occurs, a
  4094. `NULL' pointer is returned.
  4095. -- Function: list avm_list_of_vector (void *VECTOR, int NUM_ITEMS,
  4096. size_t ITEM_SIZE, int *FAULT)
  4097. This function takes it on faith that an array of dimension
  4098. NUM_ITEMS in which each item occupies ITEM_SIZE bytes begins at
  4099. the address given by VECTOR. A list representation of each item in
  4100. the array is constructed by the function `avm_list_of_value'
  4101. (*note Primitive types::), and a list of all of the lists thus
  4102. obtained in order of their position in the array is returned.
  4103. In the event of any errors caused by `avm_list_of_value' or errors
  4104. due to insufficient memory, the error message is returned as the
  4105. function result, and the integer referenced by FAULT is assigned a
  4106. non-zero value. The error message is in the form of a list of
  4107. character string representations. A segmentation fault is possible if
  4108. VECTOR is not a valid pointer or if the array size implied by
  4109. misspecified values of NUM_ITEMS and ITEM_SIZE exceeds its actual
  4110. size.
  4111. 
  4112. File: avram.info, Node: Two dimensional arrays, Next: Related utility functions, Prev: One dimensional arrays, Up: Type Conversions
  4113. 3.1.4.3 Two dimensional arrays
  4114. ..............................
  4115. Several other functions in `matcon.h' are meant to support conversions
  4116. between matrices represented as lists of lists and arrays in a variety
  4117. of representations. Dense matrices either square or rectangular are
  4118. accommodated, and symmetric square matrices can be stored with
  4119. redundant entries omitted in either upper trangular or lower triangular
  4120. format.
  4121. Similarly to the vector operations (*note One dimensional arrays::)
  4122. these functions are intended to allow a developer to present an
  4123. interface to external libraries based on lists rather than arrays.
  4124. The preferred convention for virtual code applications is to
  4125. represent a matrix as a list of lists of entities (typically numbers),
  4126. with one list for each row of the matrix. For example, a 3 by 3 matrix
  4127. containing a value of `aij' in the `i'-th row and the `j'-th column
  4128. would be represented by this list of three lists.
  4129. <
  4130. <a11,a12,a13>,
  4131. <a21,a22,a23>,
  4132. <a31,a32,a33>>
  4133. Such a representation is convenient for manipulation by virtual machine
  4134. combinators, for example `transpose' (*note Transpose::), and is
  4135. readily identified with the matrix it represents.
  4136. If a matrix is symmetric (that is, with `aij' equal to `aji' for all
  4137. values of `i' and `j'), only the lower triangular portion needs to be
  4138. stored because the other entries are redundant. The list
  4139. representatation would be something like this.
  4140. <
  4141. <a11>,
  4142. <a21,a22>,
  4143. <a31,a32,a33>>
  4144. Another alternative for representing a symmetric matrix is to store
  4145. only the upper triangular portion. In this case, a list such as the
  4146. following would be used.
  4147. <
  4148. <a11,a12,a13>,
  4149. <a22,a23>,
  4150. <a33>>
  4151. The upper and lower triangular representations are distinguishable by
  4152. whether or not the row lengths form an increasing sequence.
  4153. In addition to representing symmetric matrices, these upper and lower triangular
  4154. forms are also appropriate for representing matrices whose remaining
  4155. entries are zero, such as the factors in an LU decomposition.
  4156. -- Function: void *avm_matrix_of_list (int SQUARE, int
  4157. UPPER_TRIANGULAR, int LOWER_TRIANGULAR, int COLUMN_MAJOR,
  4158. list OPERAND, size_t ITEM_SIZE, list *MESSAGE, int *FAULT)
  4159. This function converts a matrix in one of the list representations
  4160. above to a contiguous array according to the given specifications.
  4161. The array can contain elements of any fixed sized type of size
  4162. ITEM_SIZE. The memory for it is allocated by this function and it
  4163. should be freed by the caller when no longer needed.
  4164. The input matrix is given by the list parameter, OPERAND, and its
  4165. format is described by the integer parameters SQUARE,
  4166. UPPER_TRIANGULAR, and LOWER_TRIANGULAR. The number of bytes
  4167. occupied by each entry is given by ITEM_SIZE.
  4168. To the extent these specifications are redundant, they are used for
  4169. validation. If any of the following conditions is not met, the
  4170. integer referenced by FAULT is assigned a non-zero value and a copy
  4171. of the message `<'bad matrix specification'>' represented as a
  4172. list is assigned to the list referenced by MESSAGE. Errors are
  4173. also possible due to insufficient memory.
  4174. * The OPERAND must be a list of lists of lists such that each
  4175. item of each item is has a length of ITEM_SIZE, and its items
  4176. consist of character representations as required by
  4177. `avm_value_of_list' (*note Primitive types::).
  4178. * If the lengths of the top level lists in the OPERAND form an
  4179. increasing sequence, the lower triangular representation is
  4180. assumed and the LOWER_TRIANGULAR parameter must have a
  4181. non-zero value.
  4182. * If the lengths of the top level lists in the OPERAND form a
  4183. decreasing sequence, the upper triangular representation is
  4184. assumed and the UPPER_TRIANGULAR parameter must have a
  4185. non-zero value.
  4186. * At least one of UPPER_TRIANGULAR or LOWER_TRIANGULAR must be
  4187. zero.
  4188. * If SQUARE has a non-zero value, then either all items of the
  4189. OPERAND must have the same length as the operand, or if it's
  4190. triangular, then the longest one must have the same length as
  4191. the operand.
  4192. * If the OPERAND is neither square nor a triangular form, all
  4193. items of it are required to have the same length.
  4194. The parameters UPPER_TRIANGULAR or LOWER_TRIANGULAR may be set to
  4195. non-zero values even if the OPERAND is not in one of the upper or
  4196. lower triangular forms discussed above. In this case, the OPERAND
  4197. must be square or rectangular (i.e., with all items the same
  4198. length), and the following interpretations apply.
  4199. * If UPPER_TRIANGULAR is non-zero, the diagonal elements and the
  4200. upper triangular portion of the input matrix are copied to
  4201. the output. The lower triangle of the input is ignored and
  4202. the lower triangle of the output is left uninitialized.
  4203. * If LOWER_TRIANGULAR is non-zero, the diagonal elements and the
  4204. lower triangular portion of the input matrix are copied to
  4205. the output. The upper triangle of the input is ignored and
  4206. the upper triangle of the output is left uninitialized.
  4207. The COLUMN_MAJOR parameter affects the form of the output array.
  4208. If it is zero, then each row of the input matrix is stored in a
  4209. contiguous block of memory in the output array, and if it is
  4210. non-zero, each column is stored contiguously. The latter
  4211. representation is also known as Fortran order and may be required
  4212. by library functions written in Fortran.
  4213. In all cases when a triangular form is specified, part of the
  4214. output matrix is left uninitialized. The redundant entries may be
  4215. assigned if required by the `avm_reflect_matrix' function (*note
  4216. Related utility functions::).
  4217. -- Function: list avm_list_of_matrix (void *MATRIX, int ROWS, int
  4218. COLS, size_t ITEM_SIZE, int *FAULT)
  4219. This function performs an inverse operation to
  4220. `avm_matrix_of_list' by taking the address of a matrix stored as a
  4221. contiguous array in the parameter MATRIX and constructing the list
  4222. representation as discussed above. Only square and rectangular
  4223. matrices in row major order are supported, but see
  4224. `avm_matrix_transposition' for a way to convert between row major and
  4225. column major order (*note Related utility functions::).
  4226. The parameters ROWS, COLS, and ITEM_SIZE describe the form of the
  4227. matrix. The list returned as a result will have a length of ROWS,
  4228. and each item will be a list of length COLS. Each item of the
  4229. result corresponds to a row of the matrix, and each item of the
  4230. items represents the an entry of the matrix as a list of length
  4231. ITEM_SIZE. These items could be passed to `avm_value_of_list', for
  4232. example, to obtain their values (*note Primitive types::).
  4233. Memory is allocated by this function to create the list, which can
  4234. be reclaimed by `avm_dispose' (*note Simple Operations::). If
  4235. there is insufficient memory, the integer referenced by FAULT is
  4236. assigned a non-zero value and the result returned is a list
  4237. representation of the message `<'memory overflow'>'. The error
  4238. message be reclaimed by the caller as well using `avm_dispose'.
  4239. A packed storage representation for symmetric square matrices and triangular
  4240. matrices is of interest because it is used by some library functions,
  4241. notably those in `LAPACK', to save memory and thereby accommodate
  4242. larger problems. In this representation, column major order is assumed,
  4243. and either the lower or the upper triangle of the matrix is not
  4244. explicitly stored. For example, a lower triangular matrix whose list
  4245. representation corresponds to
  4246. <
  4247. <a11>,
  4248. <a21,a22>,
  4249. <a31,a32,a33>,
  4250. <a41,a42,a43,a44>>
  4251. would be stored according to the memory map
  4252. [a11 a21 a31 a41 a22 a32 a42 a33 a43 a44]
  4253. with `a11' at the beginning address. An upper triangular matrix
  4254. <
  4255. <a11,a12,a13,a14>,
  4256. <a22,a23,a24>,
  4257. <a33,a34>,
  4258. <a44>>
  4259. would be stored according to the memory map
  4260. [a11 a12 a22 a13 a23 a33 a14 a24 a34 a44].
  4261. A couple of functions converting between list representations and
  4262. packed array format are provided as described below.
  4263. -- Function: void *avm_packed_matrix_of_list (int UPPER_TRIANGULAR,
  4264. list OPERAND, int N, size_t ITEM_SIZE, list *MESSAGE, int
  4265. *FAULT)
  4266. If the OPERAND is a list in one of the triangular forms explained
  4267. above, then the UPPER_TRIANGULAR parameter must be consisitent
  4268. with it, being non-zero if the OPERAND is upper triangular and
  4269. zero otherwise.
  4270. If the OPERAND is not in a triangular form, then each item of the
  4271. operand must be a list of length N. In this case, the
  4272. UPPER_TRIANGULAR parameter indicates which triangle of the operand
  4273. should be copied to the result, and the other triangle is ignored.
  4274. In either case, the operand must have a length of N, and the items
  4275. of its items must be lists of length ITEM_SIZE containing
  4276. character representations as required by `avm_value_of_list'
  4277. (*note Primitive types::).
  4278. If the input parameters are inconsistent or if there is
  4279. insufficient memory to allocate the result, the integer referenced
  4280. by FAULT is assigned a non-zero value, and the list referenced by
  4281. MESSAGE is assigned a copy of the list representation of `<'bad
  4282. matrix specification'>' or `<'memory overflow'>', respectively. A
  4283. non-empty message must be reclaimed by the caller using
  4284. `avm_dispose' (*note Simple Operations::).
  4285. If there are no errors, the result is a pointer to a packed array
  4286. representation of the OPERAND as explained above. The memory for
  4287. this result is allocated by this function and should be freed by
  4288. the caller when no longer required. The number of bytes allocated
  4289. will be ITEM_SIZE * (N * (N + 1))/2.
  4290. -- Function: list avm_list_of_packed_matrix (int UPPER_TRIANGULER,void
  4291. *OPERAND, int N, size_t ITEM_SIZE, int *FAULT)
  4292. This function performs an inverse operation to that of
  4293. `avm_packed_matrix_of_list' given the address of a packed matrix
  4294. stored according to one of the memory maps discussed above. The
  4295. OPERAND parameter holds the address, the parameter N gives the
  4296. number of rows, and the UPPER_TRIANGULAR parameter specifies which
  4297. of the two possible memory maps to assume.
  4298. If there is sufficient memory, the result returned is a list in
  4299. one of the triangular forms described above, being upper
  4300. triangular if the UPPER_TRIANGULAR parameter is non-zero, with
  4301. values of length ITEM_SIZE taken from the array.
  4302. In the event of a memory overflow, the integer referenced by FAULT
  4303. is assigned a non-zero value and the result is a copy of the
  4304. message `<'memory overflow'>' represented as a list. A segmentation
  4305. fault is possible if this function is passed an invalid pointer or
  4306. dimension.
  4307. 
  4308. File: avram.info, Node: Related utility functions, Prev: Two dimensional arrays, Up: Type Conversions
  4309. 3.1.4.4 Related utility functions
  4310. .................................
  4311. A small selection of additional functions that are likely to be of use
  4312. to developers concerned with matrix operations has been incorporated
  4313. into the API to save the trouble of reinventing them, although doing so
  4314. would be straightforward. They are described in this section without
  4315. further motivation.
  4316. -- Function: void *avm_matrix_transposition (void *MATRIX, int ROWS,
  4317. int COLS, size_t ITEM_SIZE)
  4318. This function takes the address of an arbitrary rectangular MATRIX
  4319. represented as a contiguous array (not a list) and transposes it
  4320. in place. That is, this function transforms an M by N matrix to an
  4321. N by M matrix by exchanging the I,Jth element with the J,Ith
  4322. element for all values of I and J.
  4323. The numbers of rows and columns in the MATRIX are given by the
  4324. parameters ROWS and COLS, respectively, and the size of the
  4325. entries in bytes is given by ITEM_SIZE.
  4326. The MATRIX is assumed to be in row major order, but this function
  4327. is applicable to matrices in column major order if the caller passes
  4328. the number of columns in ROWS and the number of rows in COLS.
  4329. Alternatively, this function can be seen as a conversion between
  4330. the row major and the column major representation of a matrix. An M
  4331. by N matrix in row major order will be transformed to the same M
  4332. by N matrix in column order, or from column order to row order.
  4333. A notable feature of this function is that it allocates no memory
  4334. so there is no possibility of a memory overflow even for very large
  4335. matrices, unlike a naive implementation which would involve making
  4336. a temporary copy of the matrix. There is a possibility of a
  4337. segmentation fault if invalid pointers or dimensions are given.
  4338. -- Function: void *avm_matrix_reflection (int UPPER_TRIANGULAR, void
  4339. *MATRIX, int N, size_t ITEM_SIZE)
  4340. This function takes a symmetric square MATRIX of dimension N
  4341. containing entries of ITEM_SIZE bytes each and fills in the
  4342. redundant entries.
  4343. If UPPER_TRIANGULAR is non-zero, the upper triangle of the MATRIX
  4344. is copied to the lower triangle. If UPPER_TRIANGULAR is zero, the
  4345. lower triangular entries are copied to the upper triangle.
  4346. These conventions assume row major order. If the MATRIX is in column
  4347. major order, then the caller can either transpose it in place before
  4348. and after this function by `avm_matrix_transposition', or can
  4349. complement the value of UPPER_TRIANGULAR.
  4350. Note that this function may be unnecessary for `LAPACK' library
  4351. functions that ignore the redundant entries in a symmetric matrix,
  4352. because they can be left uninitialized, but it is included for the
  4353. sake of completeness.
  4354. -- Function: list *avm_row_number_array (counter M, int *FAULT)
  4355. A fast, memory efficient finite map from natural numbers to their
  4356. list representations can be obtained by using this function as an
  4357. alternative to `avm_natural' or `avm_recoverable_natural' when
  4358. repeated evaluations of numbers within a known range are required
  4359. (*note Simple Operations:: and *note Recoverable Operations::).
  4360. Given a positive integer M, this function allocates and returns an
  4361. array of M lists whose Ith entry is the list representation of the
  4362. number I as explained in *note Representation of Numeric and
  4363. Textual Data::.
  4364. An amount of memory proportional to M is used for the array and
  4365. its contents. If there is insufficient memory, a `NULL' value is
  4366. returned and the integer referenced by FAULT is set to a non-zero
  4367. value.
  4368. -- Function: void avm_dispose_rows (counter M, list *ROW_NUMBER)
  4369. This function reclaims an array ROW_NUMBER of size M returned by
  4370. `avm_row_number_array', and its contents if any. A `NULL' pointer
  4371. is allowed as the ROW_NUMBER parameter and will have no effect,
  4372. but an uninitialized pointer will cause a segmentation fault.
  4373. -- Function: void avm_initialize_matcon ();
  4374. This function initializes some static variables used by the
  4375. functions declared in `matcon.h' and should be called before any
  4376. of them is called or they might not perform according to
  4377. specifications.
  4378. -- Function: void avm_count_matcon ();
  4379. This function frees the static variables allocated by
  4380. `avm_initialize_matcon' and is used to verify the absence of
  4381. memory leaks. It should be called after the last call to any
  4382. functions in `matcon.h' but before `avm_count_lists' if the latter
  4383. is being used (*note Simple Operations::).
  4384. 
  4385. File: avram.info, Node: Comparison, Next: Deconstruction Functions, Prev: Type Conversions, Up: Lists
  4386. 3.1.5 Comparison
  4387. ----------------
  4388. The file `compare.h' contains a few function declarations pertaining to
  4389. the computation of the comparison predicate described in *note
  4390. Compare::. Some of the work is done by static functions in `compare.c'
  4391. that are not recommended entry points to the library.
  4392. -- Function: void avm_initialize_compare ()
  4393. This function should be called once before the first call to
  4394. `avm_comparison', as it initializes some necessary internal data
  4395. structures.
  4396. -- Function: void avm_count_compare ()
  4397. This function can be used to check for memory leaks, by detecting
  4398. unreclaimed storage at the end of a run. The data structures
  4399. relevant to comparison that could be reported as unreclaimed are
  4400. known as "decision" nodes, but these should always be handled
  4401. properly by the library without intervention. If `avm_count_lists'
  4402. is also being used, the call to this function must precede it.
  4403. -- Function: list avm_comparison (list OPERAND, int *FAULT)
  4404. This function takes a list operand representing a pair of trees and
  4405. returns a list representing the logical value of their equality.
  4406. If the operand is `NULL', a message of invalid comparison is
  4407. returned and the `*FAULT' is set to a non-zero value. If the
  4408. `head' of the operand is unequal to the `tail', a `NULL' value is
  4409. returned. If they are equal, a list is returned whose `head' and
  4410. `tail' are both `NULL'. The equality in question is structural rather
  4411. than pointer equality.
  4412. The list operand to this function may be modified by this
  4413. function, but not in a way that should make any difference to a
  4414. client program. If two lists are found to be equal, or if even two
  4415. sublists are found to be equal in the course of the comparison,
  4416. one of them is deallocated and made to point to the other. This
  4417. action saves memory and may make subsequent comparisons faster.
  4418. However, it could disrupt client programs that happen to be
  4419. holding stale list pointers.
  4420. As of `avram' version 0.6.0, a logical field called
  4421. `discontiguous' has been added to the `node' record type declared
  4422. in `lists.h', which is checked by the comparison function. If a
  4423. list node has its `discontiguous' field set to a non-zero value,
  4424. and if it also has a non-null `value' field, then it won't be
  4425. deallocated in the course of comparison even if it is found to be
  4426. equal to something else. This feature can be used by client
  4427. modules to create lists in which value fields refer to data
  4428. structures that are meant to exist independently of them. See
  4429. `mpfr.c' for an example.
  4430. This function is likely to have better performance and memory
  4431. usage than a naive implementation of comparison, for the above
  4432. reasons and also because of optimizations pertaining to comparison
  4433. of lists representing characters. Moreover, it is not subject to
  4434. stack overflow exceptions because it is not written in a recursive
  4435. style.
  4436. -- Function: list avm_binary_comparison (list LEFT_OPERAND, list
  4437. RIGHT_OPERAND, int *FAULT);
  4438. This function is the same as `avm_comparison' except that it
  4439. allows the left and right operands to be passed as separate lists
  4440. rather than taking them from the `head' and the `tail' of a single
  4441. list.
  4442. 
  4443. File: avram.info, Node: Deconstruction Functions, Next: Indirection, Prev: Comparison, Up: Lists
  4444. 3.1.6 Deconstruction Functions
  4445. ------------------------------
  4446. A fast native implementation of the deconstruction operation is provided by
  4447. the functions declared in `decons.h'.
  4448. -- Function: void avm_initialize_decons ()
  4449. This should be called prior to the first call to
  4450. `avm_deconstruction', so as to initialize some necessary internal
  4451. data structures. Results will be undefined if it is not.
  4452. -- Function: void avm_count_decons ()
  4453. For ecologically sound memory management, this function should be
  4454. called at the end of a run to verify that there have been no leaks
  4455. due to the deconstruction functions, which there won't be unless
  4456. the code in `decons.c' has been ineptly modified. An error message
  4457. to the effect of unreclaimed "points" could be the result
  4458. otherwise.
  4459. -- Function: list avm_deconstruction (list POINTER, list OPERAND, int
  4460. *FAULT)
  4461. Deconstructions are performed by this function, as described in
  4462. *note Field::. In the `silly' program notation (*note A Simple
  4463. Lisp Like Language::), this function computes the value of
  4464. ([[`field']] `POINTER') `OPERAND'.
  4465. For example, using the fixed list `avm_join(NULL,NULL)' as the
  4466. `POINTER' parameter will cause a copy of the operand itself to be
  4467. returned as the result. A `POINTER' equal to
  4468. `avm_join(NULL,avm_join(NULL,NULL))' will cause a copy of
  4469. `operand->tail' to be returned, and so on. A `NULL' `POINTER'
  4470. causes an internal error.
  4471. If the deconstruction is invalid, as in the case of the tail of an
  4472. empty list, the invalid deconstruction error message is returned
  4473. as the result, and the `*FAULT' parameter is set to a non-zero
  4474. value. The `*FAULT' parameter is also set to a non-zero value in
  4475. the event of a memory overflow, and the memory overflow message is
  4476. returned.
  4477. 
  4478. File: avram.info, Node: Indirection, Next: The Universal Function, Prev: Deconstruction Functions, Up: Lists
  4479. 3.1.7 Indirection
  4480. -----------------
  4481. In some cases it is necessary to build a tree from the top down rather than
  4482. from the bottom up, when it is not known in advance what's on the
  4483. bottom. Although the `list' type is a pointer itself, these situations
  4484. call for a type of pointers to lists, which are declared as the
  4485. `branch' type in `branches.h'. For example, if `b' is declared as a
  4486. `branch' and `l' is declared as a `list', it would be possible to write
  4487. `b = &l'.
  4488. Facilities are also provided for maintaining queues of branches,
  4489. which are declared as the `branch_queue' type. This type is a pointer to
  4490. a structure with two fields, `above' and `following', where `above' is
  4491. a `branch' and `following' is a `branch_queue'.
  4492. These functions are used internally elsewhere in the library and
  4493. might not be necessary for most client programs to use directly.
  4494. -- Function: void avm_initialize_branches ()
  4495. This must be done once before any of the other branch related
  4496. functions is used, and creates some internal data structures.
  4497. Results of the other functions are undefined if this one isn't
  4498. called first.
  4499. -- Function: void avm_count_branches ()
  4500. This function can be used at the end of a run to detect unreclaimed
  4501. storage used for branches or branch queues. If any storage remains
  4502. unreclaimed, a message about unreclaimed branches is written to
  4503. standard error.
  4504. -- Function: void avm_anticipate (branch_queue *FRONT, branch_queue
  4505. *BACK, branch OPERAND)
  4506. This function provides a simple queueing facility for branches.
  4507. Similarly to the case with `avm_enqueue', `front' and `back'
  4508. should be initialized to `NULL' before the first call. Each call
  4509. to this function will enqueue one item to the back, assuming
  4510. enough memory is available, as the following example shows.
  4511. front = NULL;
  4512. back = NULL;
  4513. l = avm_join(NULL,NULL);
  4514. anticipate(&front,&back,&(l->head));
  4515. anticipate(&front,&back,&(l->tail));
  4516. After the above code is executed, these postconditions will hold.
  4517. front->above == &(l->head)
  4518. front->following->above == &(l->tail)
  4519. front->following == back
  4520. back->following == NULL
  4521. The name "anticipate" is used because ordinarily the queue contains
  4522. positions in a tree to be filled in later. As usual, only unshared
  4523. trees should be modified in place.
  4524. -- Function: void avm_recoverable_anticipate (branch_queue *FRONT,
  4525. branch_queue *BACK, branch OPERAND, int *FAULT)
  4526. This function is similar to `avm_anticipate', except that it will
  4527. not exit with an error message in the event of an overflow error,
  4528. but will simply set `*FAULT' to a non-zero value and return to the
  4529. caller. If an overflow occurs, nothing about the queue is changed.
  4530. -- Function: void avm_enqueue_branch (branch_queue *FRONT,
  4531. branch_queue *BACK, int RECEIVED_BIT)
  4532. A slightly higher level interface to the `avm_anticipate' function
  4533. is provided by this function, which is useful for building a tree
  4534. from a string of input bits in a format similar to the one
  4535. described in *note Concrete Syntax::.
  4536. This function should be called the first time with `front' and
  4537. `back' having been initialized to represent a queue containing a single
  4538. branch pointing to a list known to the caller. The list itself
  4539. need not be allocated or initialized. An easy way of doing so
  4540. would be the following.
  4541. front = NULL;
  4542. back = NULL;
  4543. avm_anticipate(&front,&back,&my_list);
  4544. On each call to `avm_enqueue_branch', the `RECEIVED_BIT' parameter
  4545. is examined. If it is zero, nothing will be added to the queue,
  4546. the list referenced by the front branch will be assigned `NULL',
  4547. and the front branch will be removed from the queue. If
  4548. `RECEIVED_BIT' is a non-zero value, the list referenced by the
  4549. front branch will be assigned to point to a newly created unshared
  4550. list node, and two more branches will be appended to the queue. The
  4551. first branch to be appended will point to the head of the newly
  4552. created list node, and the second branch to be appended will point
  4553. to the tail.
  4554. If the sequence of bits conforms to the required concrete syntax,
  4555. this function can be called for each of them in turn, and at the
  4556. end of the sequence, the queue will be empty and the list
  4557. referenced by the initial branch (i.e., `my_list') will be the one
  4558. specified by the bit string. If the sequence of bits does not
  4559. conform to the required concrete syntax, the error can be detected
  4560. insofar as the emptying of the queue will not coincide exactly
  4561. with the last bit.
  4562. The caller should check for the queue becoming prematurely empty
  4563. due to syntax errors, because no message is reported by
  4564. `avm_enqueue_branch' in that event, and subsequent attempts to
  4565. enqueue anything are ignored. However, in the event of a memory
  4566. overflow, an error message is reported and the process is
  4567. terminated.
  4568. -- Function: void avm_recoverable_enqueue_branch (branch_queue *FRONT,
  4569. branch_queue *BACK, int RECEIVED_BIT, int *FAULT)
  4570. This function is similar to `avm_enqueue_branch' but will leave
  4571. error handling to the caller in the event of insufficient memory to
  4572. enqueue another branch. Instead of printing an error message and
  4573. exiting, it will dispose of the queue, set the `FAULT' flag to a
  4574. non-zero value, and return. Although the queue will be reclaimed,
  4575. the lists referenced by the branches in it will persist. The list
  4576. nodes themselves can be reclaimed by disposing of the list whose
  4577. address was stored originally in the front branch.
  4578. -- Function: void avm_dispose_branch_queue (branch_queue FRONT)
  4579. This function deallocates a branch queue by chasing the `following'
  4580. fields in each one. It does nothing to the lists referenced by the
  4581. branches in the queue.
  4582. Rather than using `free' directly, client programs should use this
  4583. function for deallocating branch queues, because it allows better
  4584. performance by interacting with a local internal cache of free
  4585. memory, and because it performs necessary bookkeeping for
  4586. `avm_count_branches'.
  4587. -- Function: void avm_dispose_branch (branch_queue OLD)
  4588. This disposes of a single branch queue node rather than a whole
  4589. queue. Otherwise, the same comments as those above apply.
  4590. 
  4591. File: avram.info, Node: The Universal Function, Prev: Indirection, Up: Lists
  4592. 3.1.8 The Universal Function
  4593. ----------------------------
  4594. A function computing the result of the invisible operator used to
  4595. specify the virtual code semantics in *note Virtual Code Semantics::, is
  4596. easily available by way of a declaration in `apply.h'.
  4597. -- Function: void avm_initialize_apply ()
  4598. This function should be called by the client program at least once
  4599. prior to the first call to `avm_apply' or `avm_recoverable_apply'.
  4600. It causes certain internal data structures and error message texts
  4601. to be initialized.
  4602. -- Function: void avm_count_apply ()
  4603. This function should be used at the end of a run for the purpose of
  4604. detecting and reporting any unreclaimed storage associated with
  4605. functions in this section. If the function `avm_count_lists()' is
  4606. also being used, it should be called after this one.
  4607. -- Function: list avm_apply (list OPERATOR, list OPERAND)
  4608. This is the function that evaluates the operator used to describe
  4609. the virtual code semantics. For example, the value of `F X' can be
  4610. obtained as the result returned by `avm_apply(F,X)'.
  4611. Both parameters to this function are deallocated unconditionally
  4612. and should not be referenced again by the caller. If the
  4613. parameters are needed subsequently, then only copies of them
  4614. should be passed to `avm_apply' using `avm_copied'.
  4615. This function is not guaranteed to terminate, and may cause a
  4616. memory overflow error. In the event of an exceptional condition,
  4617. the error message is written to standard error and the program is
  4618. halted. There is no externally visible distinction between
  4619. different levels of error conditions.
  4620. -- Function: list avm_recoverable_apply (list OPERATOR, list OPERAND,
  4621. int *FAULT)
  4622. This function is similar to `avm_apply' but leaves the
  4623. responsibility of error handling with the caller. If any overflow
  4624. or exceptional condition occurs, the result returned is a list
  4625. representing the error message, and the `FAULT' flag is set to a
  4626. non-zero value. This behavior contrasts with that of `avm_apply',
  4627. which will display the message to standard error and kill the
  4628. process.
  4629. 
  4630. File: avram.info, Node: Characters and Strings, Next: File Manipulation, Prev: Lists, Up: Library Reference
  4631. 3.2 Characters and Strings
  4632. ==========================
  4633. If a C program is to interact with a virtual code application by
  4634. exchanging text, it uses the representation for characters described in
  4635. *note Character Table::. This convention would be inconvenient without
  4636. a suitable API, so the functions in this section address the need. These
  4637. functions are declared in the header file `chrcodes.h'.
  4638. Some of these functions have two forms, with one of them having the
  4639. word `standard' as part of its name. The reason is to cope with
  4640. multiple character encodings. Versions of `avram' prior to 0.1.0 used a
  4641. different character encoding than the one documented in *note Character
  4642. Table::. The functions described in *note Version Management:: can be
  4643. used to select backward compatible operation with the older character
  4644. encoding. The normal forms of the functions in this section will use
  4645. the older character set if a backward compatibility mode is indicated,
  4646. whereas the standard forms will use the character encoding documented
  4647. in *note Character Table:: regardless.
  4648. Standard encodings should always be assumed for library and function names
  4649. associated with the `library' combinator (*note Calling existing
  4650. library functions::), and for values of lists defined by
  4651. `avm_list_of_value' (*note Primitive types::), but version dependent
  4652. encodings should be used for all other purposes such as error messages.
  4653. Alternatively, the normal version dependent forms of the functions
  4654. below can be used safely in any case if backward compatibility is not
  4655. an issue. This distinction is viewed as a transitional feature of the
  4656. API that will be discontinued eventually when support for the old
  4657. character set is withdrawn and the `standard' forms are be removed.
  4658. -- Function: list avm_character_representation (int CHARACTER)
  4659. -- Function: list avm_standard_character_representation (int CHARACTER)
  4660. This function takes an integer character code and returns a copy of
  4661. the list representing it, as per the table in *note Character
  4662. Table::. Because the copy is shared, no memory is allocated by this
  4663. function so there is no possibility of overflow. Nevertheless, it
  4664. is the responsibility of the caller dispose of the list when it is
  4665. no longer needed by `avm_dispose', just as if the copy were not
  4666. shared (*note Simple Operations::). For performance reasons, this
  4667. function is implemented as a macro. If the argument is outside the
  4668. range of zero to 255, it is masked into that range.
  4669. -- Function: int avm_character_code (list OPERAND)
  4670. -- Function: int avm_standard_character_code (list OPERAND)
  4671. This function takes a list as an argument and returns the
  4672. corresponding character code, as per *note Character Table::. If
  4673. the argument does not represent any character, a value of `-1' is
  4674. returned.
  4675. -- Function: list avm_strung (char *STRING)
  4676. -- Function: list avm_standard_strung (char *STRING)
  4677. This function takes a pointer to a null terminated character
  4678. string and returns the list obtained by translating each character
  4679. into its list representation and enqueuing them together. Memory
  4680. needs to be allocated for the result, and if there isn't enough
  4681. available, an error message is written to standard error and the
  4682. process is terminated. This function is useful to initialize lists
  4683. from hard coded strings at the beginning of a run, as in this
  4684. example.
  4685. hello_string = avm_strung("hello");
  4686. This form initializes a single string, but to initialize a one line
  4687. message suitable for writing to a file, it would have to be a list
  4688. of strings, as in this example.
  4689. hello_message = avm_join(avm_strung("hello"),NULL);
  4690. The latter form is used internally by the library for initializing
  4691. most of the various error messages that can be returned by other
  4692. functions.
  4693. -- Function: list avm_recoverable_strung (char *STRING, int *FAULT);
  4694. -- Function: list avm_recoverable_standard_strung (char *STRING, int
  4695. *FAULT);
  4696. This function is like `avm_strung' except that if it runs out of
  4697. memory it sets the integer referenced by FAULT to a non-zero value
  4698. and returns instead of terminating the process.
  4699. -- Function: char *avm_unstrung (list STRING, list *MESSAGE, int
  4700. *FAULT)
  4701. -- Function: char *avm_standard_unstrung (list STRING, list *MESSAGE,
  4702. int *FAULT)
  4703. This function performs an inverse operation to
  4704. `avm_recoverable_strung', taking a list representing a character
  4705. string to the character string in ASCII null terminated form as per
  4706. the standard C representation. Memory is allocated for the result
  4707. by this function which should be freed by the caller.
  4708. In the event of an exception, the integer referenced by `fault' is
  4709. assigned a non-zero value and an error message represented as a
  4710. list is assigned to the list referenced by `message'. The error
  4711. message should be reclaimed by the caller with `avm_dispose'
  4712. (*note Simple Operations:: if it is non-empty. Possible error
  4713. messages are `<'memory overflow'>', `<'counter overflow'>', and
  4714. `<'invalid text format'>'.
  4715. -- Function: list avm_scanned_list (char *STRING)
  4716. An application that makes use of virtual code snippets or data
  4717. that are known at compile time can use this function to initialize
  4718. them. The argument is a string in the format described in *note
  4719. Concrete Syntax::, and the result is the list representing it. For
  4720. example, the program discussed in *note Example Script:: could be
  4721. hard coded into a C program by pasting the data from its virtual
  4722. code file into an expression of this form.
  4723. cat_program = avm_scanned_list("sKYQNTP\\");
  4724. Note that the backslash character in the original data has to be
  4725. preceded by an extra backslash in the C source, because backslashes
  4726. usually mean something in C character constants.
  4727. The `avm_scanned_list' function needs to allocate memory. If there
  4728. isn't enough memory available, it writes a message to standard
  4729. error and causes the process to exit.
  4730. -- Function: list avm_multiscanned (char **STRINGS)
  4731. Sometimes it may be useful to initialize very large lists from
  4732. strings, but some C compilers impose limitations on the maximum
  4733. length of a string constant, and the ISO standard for C requires
  4734. only 512 bytes. This function serves a similar purpose to
  4735. `avm_scanned_list', but allows the argument to be a pointer to a
  4736. null terminated array of strings instead of one long string,
  4737. thereby circumventing this limitation in the compiler.
  4738. char *code[] = {"sKYQ","NTP\\",NULL};
  4739. ...
  4740. cat_program = avm_multiscanned(code);
  4741. If there is insufficient memory to allocate the list this function
  4742. needs to create, it causes an error message to be written to
  4743. standard error, and then kills the process.
  4744. -- Function: char* avm_prompt (list PROMPT_STRINGS)
  4745. This function takes a list representing a list of character
  4746. strings, and returns its translation to a character string with
  4747. the sequence 13 10 used as a separator. For example, given a tree
  4748. of this form
  4749. some_message = avm_join(
  4750. avm_strung("hay"),
  4751. avm_join(
  4752. avm_strung("you"),
  4753. NULL));
  4754. the result returned by `prompt_strings(some_message)' would be a
  4755. pointer to a null terminated character string equivalent to the C
  4756. constant `"hay\13\10you"'.
  4757. Error messages are printed and the process terminated in the event
  4758. of either a memory overflow or an invalid character representation.
  4759. This function is used by `avram' in the evaluation of interactive virtual
  4760. code applications, whose output has to be compared to the output
  4761. from a shell command in this format. The separator is chosen to be
  4762. compatible with the `expect' library.
  4763. -- Function: char* avm_recoverable_prompt (list PROMPT_STRINGS, list
  4764. *MESSAGE, int *FAULT)
  4765. This function performs the same operation as `avm_prompt' but
  4766. allows the caller to handle exceptional conditions. If an exception
  4767. such as a memory overflow occurs, the integer referenced by
  4768. `fault' is assigned a non-zero value and a representation of the
  4769. error message as a list of strings is assigned to the list
  4770. referenced by `message'.
  4771. This function is used to by `avram' to evaluate the `interact'
  4772. combinator (*note Interaction combinator::), when terminating in
  4773. the event of an error would be inappropriate.
  4774. -- Function: void avm_initialize_chrcodes ()
  4775. This function has to be called before any of the other character
  4776. conversion functions in this section, or else their results are
  4777. undefined. It performs the initialization of various internal data
  4778. structures.
  4779. -- Function: void avm_count_chrcodes ()
  4780. This function can be called at the end of a run, after the last
  4781. call to any of the other functions in this section, but before
  4782. `avm_count_lists' if that function is also being used. The purpose
  4783. of this function is to detect and report memory leaks. If any
  4784. memory associated with any of these functions has not been
  4785. reclaimed by the client program, a message giving the number of
  4786. unreclaimed lists will be written to standard error.
  4787. 
  4788. File: avram.info, Node: File Manipulation, Next: Invocation, Prev: Characters and Strings, Up: Library Reference
  4789. 3.3 File Manipulation
  4790. =====================
  4791. The functions described in this section provide an interface between
  4792. virtual code applications and the host file system by converting
  4793. between files or file names and their representations as lists. These
  4794. conversions are necessary when passing a file to a virtual code
  4795. application, or when writing a file received in the result of one.
  4796. * Menu:
  4797. * File Names::
  4798. * Raw Files::
  4799. * Formatted Input::
  4800. * Formatted Output::
  4801. 
  4802. File: avram.info, Node: File Names, Next: Raw Files, Prev: File Manipulation, Up: File Manipulation
  4803. 3.3.1 File Names
  4804. ----------------
  4805. A standard representation is used by virtual code applications for the path
  4806. names of files, following the description in *note Input Data
  4807. Structure::. The functions and constants declared in `fnames.h' provide
  4808. an API for operating on file names in this form.
  4809. -- Function: list avm_path_representation (char *PATH)
  4810. If a C program is to invoke a virtual code application and pass a
  4811. path name to it as a parameter, this function can be used to
  4812. generate the appropriate representation from a given character
  4813. string.
  4814. conf_path = avm_path_representation("/etc/resolve.conf");
  4815. In this example, `conf_path' is a `list'. For potentially better
  4816. portability, a C program can use the character constant
  4817. `avm_path_separator_character' in place of the slashes in hard
  4818. coded path names.
  4819. Other useful constants are `avm_current_directory_prefix' as a portable
  4820. replacement for `"./"', as well as `avm_parent_directory_prefix'
  4821. instead of `"../"'. There is also `avm_root_directory_prefix' for
  4822. `"/"'. These three constants are null terminated strings, unlike
  4823. `avm_path_separator_character', which is a character.
  4824. If a `NULL' pointer is passed as the `PATH', a `NULL' list is
  4825. returned, which is the path representation for standard input or
  4826. standard output. If the address of an empty string is passed to
  4827. this function as the `PATH', the list of the empty string will be
  4828. returned, which is the path representation for the root directory.
  4829. Trailing path separators are ignored, so `"/"' is the same as the
  4830. empty string.
  4831. Some memory needs to be allocated for the result of this function.
  4832. If the memory is not available, an error message is written to
  4833. standard error and the process is terminated.
  4834. -- Function: list avm_date_representation (char *PATH)
  4835. This function is essentially a wrapper around the standard
  4836. `ctime_r' function that not only gets the time stamp for a file at
  4837. a given path, but transforms it to a list representation according
  4838. to *note Character Table::. It needs to allocate memory for the
  4839. result and will cause the program to exit with an error message if
  4840. there is not enough memory available.
  4841. The time stamp will usually be in a format like `Sun Mar 4 10:56:40
  4842. GMT 2001'. If for some reason the time stamp can not be obtained,
  4843. the result will be a representation of the string `unknown date'.
  4844. -- Function: char* avm_path_name (list PATH)
  4845. This function is the inverse of `avm_path_representation', in that
  4846. it takes a list representing a path to the path name expressed as
  4847. a character string. This function can be used in C programs that
  4848. invoke virtual code applications returning paths as part of their
  4849. results, so that the C program can get the path into a character
  4850. string in order to open the file.
  4851. If the `PATH' parameter is `NULL', a `NULL' pointer is returned as
  4852. the result. The calling program should check for a `NULL' result
  4853. and interpret it as the path to standard input or standard output.
  4854. The memory needed for the character string whose address is
  4855. returned is allocated by this function if possible. The given
  4856. `PATH' is not required to be consistent with the host file system,
  4857. but it is required to consist of representations of non-null
  4858. printable characters or spaces as lists per *note Character
  4859. Table::. In the event of any error or overflow, control does not
  4860. return to the caller, but an error message is printed and the
  4861. program is aborted. The possible error messages from this function
  4862. are the following.
  4863. * `PROGRAM-NAME: counter overflow (code NN)'
  4864. * `PROGRAM-NAME: memory overflow (code NN)'
  4865. * `PROGRAM-NAME: null character in file name'
  4866. * `PROGRAM-NAME: bad character in file name'
  4867. * `PROGRAM-NAME: invalid file name (code NN)'
  4868. -- Function: void avm_initialize_fnames ()
  4869. A few housekeeping operations relevant to internal data structures
  4870. are performed by this function, making it necessary to be called
  4871. by the client program prior to using any of the other ones.
  4872. -- Function: void avm_count_fnames ()
  4873. This function can be used after the the last call to any of the
  4874. other functions in this section during a run, and it will detect
  4875. memory leaks that may be attributable to code in these functions
  4876. or misuse thereof. If any unreclaimed storage remains when this
  4877. function is called, a warning message will be written to standard
  4878. error. If the function `avm_count_lists' is also being used by the
  4879. client, it should be called after this one.
  4880. 
  4881. File: avram.info, Node: Raw Files, Next: Formatted Input, Prev: File Names, Up: File Manipulation
  4882. 3.3.2 Raw Files
  4883. ---------------
  4884. Some low level operations involving lists and data files are provided by
  4885. these functions, which are declared in the header file `rawio.h'.
  4886. -- Function: list avm_received_list (FILE *OBJECT, char *FILENAME)
  4887. This function is a convenient way of transferring data directly
  4888. from a raw format file into a list in memory. It might typically
  4889. be used to load the virtual code for an application that has been
  4890. written to a file by a compiler.
  4891. `OBJECT'
  4892. is the address of a file which should already be open for
  4893. reading before this function is called, and will be read from
  4894. its current position.
  4895. `FILENAME'
  4896. should be set by the caller to the address of a null
  4897. terminated string containing the name of the file, but is not
  4898. used unless it needs to be printed as part of an error
  4899. message. If it is a null pointer, standard input is assumed.
  4900. The result returned is a list containing data read from the file.
  4901. The file format is described in *note File Format::. The preamble
  4902. section of the file, if any, is ignored. If the file ends
  4903. prematurely or otherwise conflicts with the format, the program is
  4904. aborted with a message of
  4905. `PROGRAM-NAME: invalid raw file format in FILENAME'
  4906. written to standard error. The program will also be aborted by this
  4907. function in the event of a memory overflow.
  4908. The file is left open when this function returns, and could
  4909. therefore be used to store other data after the end of the list.
  4910. The end of a list is detected automatically by this function, and
  4911. it reads no further, leaving the file position on the next
  4912. character, if any.
  4913. -- Function: void avm_send_list (FILE *REPOSITORY, list OPERAND, char
  4914. *FILENAME)
  4915. This function can be used to transfer data from a list in memory
  4916. to a file, essentially by implementing the printing algorithm
  4917. described in *note Bit String Encoding::.
  4918. `REPOSITORY'
  4919. is the address of a file already open for writing, to which
  4920. the data are written starting from the current position.
  4921. `OPERAND'
  4922. is the list containing the data to be written
  4923. `FILENAME'
  4924. is the address of a null terminated string containing the
  4925. name of the file that will be reported in an error message if
  4926. necessary.
  4927. No preamble section is written by this function, but one could be written
  4928. to the file by the caller prior to calling it. Error messages are
  4929. possible either because of i/o errors or because of insufficient
  4930. memory. I/o errors are not fatal and will result only in a warning
  4931. message being printed to standard error, but a memory overflow will
  4932. cause the process to abort. An i/o error message reported by this
  4933. function would be of the form
  4934. `PROGRAM-NAME: can't write to FILENAME'
  4935. followed by the diagnostic obtained from the standard `strerror' function
  4936. if it exists on the host platform. The file is left open when this
  4937. function returns.
  4938. -- Function: void avm_initialize_rawio ()
  4939. This function initializes some necessary data structures for the
  4940. functions in this section, and should be called prior to them at
  4941. the beginning of a run.
  4942. -- Function: void avm_count_rawio ()
  4943. This function does nothing in the present version of the library,
  4944. but should be called after the last call to all of the other
  4945. functions in this section in order to maintain compatibility with
  4946. future versions of the library. Future versions may decide to use
  4947. this function to do some cleaning up of local data structures.
  4948. 
  4949. File: avram.info, Node: Formatted Input, Next: Formatted Output, Prev: Raw Files, Up: File Manipulation
  4950. 3.3.3 Formatted Input
  4951. ---------------------
  4952. Some functions relating to the input of text files or data files with
  4953. preambles are declared in the header file `formin.h'. The usage of
  4954. these functions is as follows.
  4955. -- Function: list avm_preamble_and_contents (FILE *SOURCE, char
  4956. *FILENAME)
  4957. This function loads a file of either text or data format into
  4958. memory.
  4959. `SOURCE'
  4960. should be initialized by the caller as the address of a file
  4961. already open for reading that will be read from its current
  4962. position.
  4963. `FILENAME'
  4964. should be set by the caller to the address of a null
  4965. terminated character string giving the name of the file that
  4966. will be used if an i/o error message needs to be written
  4967. about it. If it is a `NULL' pointer, standard input is
  4968. assumed.
  4969. The result returned by the function will be a list whose `head' represents
  4970. the preamble of the file and whose `tail' represents the contents.
  4971. As a side effect, the input file will be closed, unless the
  4972. `FILENAME' parameter is `NULL'.
  4973. If the file conforms to the format described in *note File
  4974. Format::, the preamble is a list of character strings. In the
  4975. result returned by the function, the `head' field will be a list
  4976. with one item for each line in the file, and each item will be a
  4977. list of character representations as in *note Character Table::,
  4978. but with the leading hashes stripped. The `tail' will be the list
  4979. specified by remainder of the file according to *note Concrete
  4980. Syntax::. If the file has an empty preamble but is nevertheless a
  4981. data file, the `head' will be a list whose `head' and `tail' are
  4982. both `NULL'.
  4983. If the file does not conform to the format in *note File Format::,
  4984. then the `head' of the result will be `NULL', and the `tail' will
  4985. be a list of lists of character representations, with one for each
  4986. line.
  4987. Whether or not the file conforms to the format is determined on
  4988. the fly, so this function is useful for situations in which the
  4989. format is not known in advance. The conventions regarding the
  4990. preamble and contents maintained by this function are the same as
  4991. those used by virtual code applications as described in *note
  4992. Standard Output Representation:: and *note Input Data Structure::.
  4993. The characters used for line breaks are not explicitly represented
  4994. in the result. Depending on the host system, line breaks in text
  4995. files may be represented either by the character code 10, or by
  4996. the sequence 13 10. However, in order for the library to deal with
  4997. binary files in a portable way, a line break always corresponds to
  4998. a 10 as far as this function is concerned regardless of the host,
  4999. and a 13 is treated like any other character. Hence, if this
  5000. function were used on binary files that happened to have some 10s
  5001. in them, the exact contents of a file could be reconstructed
  5002. easily by appending a 10 to all but the last line and flattening
  5003. the list.
  5004. A considerable amount of memory may need to be allocated by this
  5005. function in order to store the file as a list. If not enough
  5006. memory is available, the function prints an error message to
  5007. standard error and aborts rather than returning to the caller.
  5008. However, i/o errors are not fatal, and will cause the function to
  5009. print a warning but attempt to continue.
  5010. -- Function: list avm_load (FILE *SOURCE, char *FILENAME, int RAW)
  5011. Similarly to `avm_preamble_and_contents', this function also loads
  5012. a file into memory, but the format is specified in advance.
  5013. `SOURCE'
  5014. should be set by the caller to the address of an already open
  5015. file for reading, which will be read from its current
  5016. position.
  5017. `FILENAME'
  5018. should be initialized by the caller as a pointer to a null
  5019. terminated string containing the name of the file that will
  5020. be reported to the user in the event of an error reading from
  5021. it. If it is a `NULL' pointer, standard input is assumed.
  5022. `RAW'
  5023. is set to a non-zero value by the caller to indicate that the
  5024. file is expected to conform to the format in *note File
  5025. Format::. If the file is an ordinary text file, then it
  5026. should be set to zero.
  5027. In the case of a data file, which is when `RAW' is non-zero, the
  5028. result returned by this function will be a list representing the
  5029. data section of the file and ignoring the preamble. In the case of
  5030. a text file, the result will be a list of lists of character
  5031. representations as per *note Character Table::, with one such list
  5032. for each line in the file. Similar comments about line breaks to
  5033. those mentioned under `avm_preamble_and_contents' are applicable.
  5034. As a side effect of this function, the `SOURCE' file will be
  5035. closed, unless the `FILENAME' is a `NULL' pointer.
  5036. This function is useful when the type of file is known in advance.
  5037. If a data file is indicated by the `RAW' parameter but the format
  5038. is incorrect, an error message is reported and the process
  5039. terminates. The error message will be of the form
  5040. `PROGRAM-NAME: invalid raw file format in FILENAME'
  5041. Alternatively, if a text file is indicated by the `RAW' parameter,
  5042. then no attempt is made to test whether it could be interpreted as
  5043. data, even if it could be. This behavior differs from that of
  5044. `avm_preamble_and_contents', where a bad data file format causes
  5045. the file to be treated as text, and a valid data file format, even
  5046. in a "text" file, causes it to be treated as data.
  5047. Memory requirements for this function are significant and will
  5048. cause the process to abort with an error message in the event of
  5049. insufficient free memory. Messages pertaining to i/o errors are
  5050. also possible and are not fatal.
  5051. -- Function: void avm_initialize_formin ()
  5052. This function should be called before either of the other
  5053. functions in this section is called, as it initializes some
  5054. necessary static data structures. Results of the other functions
  5055. are undefined if this one is not called first.
  5056. -- Function: void avm_count_formin ()
  5057. This function should be called after the last call to any of the
  5058. other functions in this section, as it is necessary for cleaning
  5059. up and reclaiming some internal data. If any storage remains
  5060. unreclaimed due to memory leaks in these functions or to misuse of
  5061. them, a warning message is written to standard error. If the
  5062. function `avm_count_lists' is also being used by the client
  5063. program, it should be called after this one.
  5064. 
  5065. File: avram.info, Node: Formatted Output, Prev: Formatted Input, Up: File Manipulation
  5066. 3.3.4 Formatted Output
  5067. ----------------------
  5068. The following functions pertaining to the output of text files or data
  5069. files with preambles are declared in the header file `formout.h'.
  5070. -- Function: void avm_output (FILE *REPOSITORY, char *FILENAME, list
  5071. PREAMBLE, list CONTENTS, int TRACE_MODE)
  5072. This function writes a either a text file or a data file in the
  5073. format described in *note File Format::. The parameters have these
  5074. interpretations.
  5075. `REPOSITORY'
  5076. is the address of a file opened for writing by the caller,
  5077. that will be written from its current position.
  5078. `FILENAME'
  5079. is the address of a null terminated character string set by
  5080. the caller to be the name of the file that will be reported
  5081. to the user in the event of an i/o error.
  5082. `PREAMBLE'
  5083. is `NULL' in the case of a text file, but a list of character
  5084. string representations as per *note Character Table::, in the
  5085. case of a data file. If a data file has is to be written with
  5086. an empty preamble, then this list should have a `NULL' `head'
  5087. and a `NULL' `tail'.
  5088. `CONTENTS'
  5089. is either a list of character string representations in the
  5090. case of a text file, or is an unconstrained list in the case
  5091. of a data file.
  5092. `TRACE_MODE'
  5093. may be set to a non-zero value by the caller to request that
  5094. everything written to a text file should be echoed to
  5095. standard output. It is ignored in the case of a data file.
  5096. The effect of calling this function is to write the preamble and
  5097. contents to the file in the format indicated by the preamble. The
  5098. file is left open when this function returns.
  5099. Line breaks are always written as character code 10, not as 13 10, regardless
  5100. of the convention on the host system, so that files written by
  5101. this function can be reliably read by other functions in the
  5102. library.
  5103. Leading hashes are automatically added to the beginning of the
  5104. lines in the preamble, except where they are unnecessary due to a
  5105. continuation character on the previous line. This action enforces
  5106. consistency with the file format, ensuring that anything written
  5107. as a data file can be read back as one. The hashes are stripped
  5108. automatically when the file is read by `avm_preamble_and_contents'.
  5109. Another feature of this function is that it will mark any output
  5110. file as executable if it is a data format file with a prelude
  5111. whose first character in the first line is an exclamation point.
  5112. This feature makes it easier for a compiler implemented in virtual
  5113. code to generate executable shell scripts directly.
  5114. A fatal error is reported if any of the data required to be a
  5115. character representation is not listed in the *note Character
  5116. Table::. A fatal error can also be caused by a memory overflow.
  5117. Possible error messages are the following.
  5118. * `PROGRAM-NAME: invalid output preamble format'
  5119. * `PROGRAM-NAME: invalid text format'
  5120. * `PROGRAM-NAME: can't write to FILENAME'
  5121. In the last case, the error message will be followed by an
  5122. explanation furnished by the standard `strerror' function if
  5123. available.
  5124. -- Function: void avm_output_as_directed (list DATA, int
  5125. ASK_TO_OVERWRITE_MODE, int VERBOSE_MODE)
  5126. This function writes an ensemble of files at specified paths in
  5127. either text or data format, optionally interacting with the user
  5128. through standard input and output. The parameters have these
  5129. interpretations.
  5130. `DATA'
  5131. is a list in which each item specifies a file to be written.
  5132. `ASK_TO_OVERWRITE_MODE'
  5133. may be set to a non-zero value by the calling program in
  5134. order to have this function ask the user for permission to
  5135. overwrite existing files.
  5136. `VERBOSE_MODE'
  5137. may be set to a non-zero value by the calling program to have
  5138. this function print to standard output a list of the names of
  5139. the files it writes.
  5140. A high level interface between virtual code applications and the
  5141. file system is provided by this function. The `DATA' parameter
  5142. format is compatible with the the data structure returned by an
  5143. application complying with the conventions in *note Output From
  5144. Non-interactive Applications::.
  5145. Each item in the `DATA' list should be a non-empty list whose
  5146. `head' and `tail' are also non-empty. The fields in each item have
  5147. the following relevance to the file it specifies.
  5148. * The `head' of the `head' is `NULL' if the file is to be
  5149. opened for appending, and non-`NULL' if it is to be
  5150. overwritten.
  5151. * The `tail' of the `head' represents a path as a list of
  5152. character string representations, in a form suitable as an
  5153. argument to `avm_path_name'.
  5154. * The `head' of the `tail' represents the preamble of the file,
  5155. as either `NULL' for a text file or a non-empty list of
  5156. character string representations for a data file.
  5157. * The `tail' of the `tail' represents the contents of the file,
  5158. either as a list of character string representations for a
  5159. text file or as a list in an unconstrained format for a data
  5160. file.
  5161. For each item in the list, the function performs the following
  5162. steps.
  5163. 1. It decides whether to open a file for overwriting or
  5164. appending based on the `head' of the `head'.
  5165. 2. It uses the `tail' of the `head' to find out the file name
  5166. from `avm_path_name', in order to open it.
  5167. 3. If the `ASK_TO_OVERWRITE_MODE' flag is set and the file is
  5168. found to exist already, the function will print one of the
  5169. following messages to standard output, depending on whether
  5170. the file is to be overwritten or appended.
  5171. * `PROGRAM-NAME: overwrite FILENAME? (y/n)'
  5172. * `PROGRAM-NAME: append to FILENAME? (y/n)'
  5173. It will then insist on either `y' or `n' as an answer before
  5174. continuing.
  5175. 4. If the `ASK_TO_OVERWRITE' flag has not been set, or the file
  5176. did not previously exist, or the answer of `y' was given, the
  5177. preamble and contents of the file are then written with
  5178. `avm_output'.
  5179. 5. If permission to write or append was denied, one of the
  5180. following messages is reported to standard output, and the
  5181. data that were to be written are lost.
  5182. * `PROGRAM-NAME: not writing FILENAME'
  5183. * `PROGRAM-NAME: not appending FILENAME'
  5184. 6. If permission was granted to write or append to the file or
  5185. the `VERBOSE_MODE' flag is set, one of the messages
  5186. * `PROGRAM-NAME: writing FILENAME'
  5187. * `PROGRAM-NAME: appending FILENAME'
  5188. is sent to standard output.
  5189. If any files are to be written to standard output, which would be
  5190. indicated by a `NULL' path, they are not written until all other
  5191. files in the list are written. This feature is in the interest of security,
  5192. as it makes it more difficult for malicious or inept virtual code
  5193. to alter the appearance of the console through standard output
  5194. until after the interactive dialogue has taken place. Permission
  5195. is not solicited for writing to standard output, and it will not
  5196. be closed.
  5197. Any of the fatal errors or i/o errors possible with `avm_output' or
  5198. `avm_path_name' are also possible with this function, as well as
  5199. the following additional ones.
  5200. * `PROGRAM-NAME: invalid file specification'
  5201. * `PROGRAM-NAME: can't close FILENAME'
  5202. * `PROGRAM-NAME: can't write FILENAME'
  5203. The last two are non-fatal i/o errors that will be accompanied by
  5204. an explanation from the `strerror' function if the host supports
  5205. it. The other message is the result of a badly formatted `DATA'
  5206. parameter.
  5207. -- Function: void avm_put_bytes (list BYTES)
  5208. This function takes a list of character representations, converts
  5209. them to characters, and sends them to standard output. There is no
  5210. chance of a memory overflow, but the following other errors are
  5211. possible.
  5212. * `PROGRAM-NAME: invalid text format (code NN)'
  5213. * `PROGRAM-NAME: can't write to standard output'
  5214. The latter is non-fatal, but the former causes the program to
  5215. abort. It is caused when any member of the list `BYTES' is not a
  5216. character representation appearing in *note Character Table::.
  5217. -- Function: void avm_initialize_formout ()
  5218. This function initializes some data structures used locally by the
  5219. other functions in this section, and should be called at the
  5220. beginning of a run before any of them is called.
  5221. -- Function: void avm_count_formout ()
  5222. This function doesn't do anything in the current version of the
  5223. library, but should be called after the last call to any of the
  5224. other functions in this section. Future versions of the library
  5225. might use this function for cleaning up some internal data
  5226. structures, and client programs that call it will maintain
  5227. compatibility with them.
  5228. 
  5229. File: avram.info, Node: Invocation, Next: Version Management, Prev: File Manipulation, Up: Library Reference
  5230. 3.4 Invocation
  5231. ==============
  5232. The functions documented in this section can be used to incorporate the
  5233. capabilities of a virtual machine emulator into other C programs with a
  5234. minimal concern for the details of the required data structures and
  5235. virtual code invocation conventions.
  5236. * Menu:
  5237. * Command Line Parsing::
  5238. * Execution Modes::