avram.info-1 286 KB

12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697989910010110210310410510610710810911011111211311411511611711811912012112212312412512612712812913013113213313413513613713813914014114214314414514614714814915015115215315415515615715815916016116216316416516616716816917017117217317417517617717817918018118218318418518618718818919019119219319419519619719819920020120220320420520620720820921021121221321421521621721821922022122222322422522622722822923023123223323423523623723823924024124224324424524624724824925025125225325425525625725825926026126226326426526626726826927027127227327427527627727827928028128228328428528628728828929029129229329429529629729829930030130230330430530630730830931031131231331431531631731831932032132232332432532632732832933033133233333433533633733833934034134234334434534634734834935035135235335435535635735835936036136236336436536636736836937037137237337437537637737837938038138238338438538638738838939039139239339439539639739839940040140240340440540640740840941041141241341441541641741841942042142242342442542642742842943043143243343443543643743843944044144244344444544644744844945045145245345445545645745845946046146246346446546646746846947047147247347447547647747847948048148248348448548648748848949049149249349449549649749849950050150250350450550650750850951051151251351451551651751851952052152252352452552652752852953053153253353453553653753853954054154254354454554654754854955055155255355455555655755855956056156256356456556656756856957057157257357457557657757857958058158258358458558658758858959059159259359459559659759859960060160260360460560660760860961061161261361461561661761861962062162262362462562662762862963063163263363463563663763863964064164264364464564664764864965065165265365465565665765865966066166266366466566666766866967067167267367467567667767867968068168268368468568668768868969069169269369469569669769869970070170270370470570670770870971071171271371471571671771871972072172272372472572672772872973073173273373473573673773873974074174274374474574674774874975075175275375475575675775875976076176276376476576676776876977077177277377477577677777877978078178278378478578678778878979079179279379479579679779879980080180280380480580680780880981081181281381481581681781881982082182282382482582682782882983083183283383483583683783883984084184284384484584684784884985085185285385485585685785885986086186286386486586686786886987087187287387487587687787887988088188288388488588688788888989089189289389489589689789889990090190290390490590690790890991091191291391491591691791891992092192292392492592692792892993093193293393493593693793893994094194294394494594694794894995095195295395495595695795895996096196296396496596696796896997097197297397497597697797897998098198298398498598698798898999099199299399499599699799899910001001100210031004100510061007100810091010101110121013101410151016101710181019102010211022102310241025102610271028102910301031103210331034103510361037103810391040104110421043104410451046104710481049105010511052105310541055105610571058105910601061106210631064106510661067106810691070107110721073107410751076107710781079108010811082108310841085108610871088108910901091109210931094109510961097109810991100110111021103110411051106110711081109111011111112111311141115111611171118111911201121112211231124112511261127112811291130113111321133113411351136113711381139114011411142114311441145114611471148114911501151115211531154115511561157115811591160116111621163116411651166116711681169117011711172117311741175117611771178117911801181118211831184118511861187118811891190119111921193119411951196119711981199120012011202120312041205120612071208120912101211121212131214121512161217121812191220122112221223122412251226122712281229123012311232123312341235123612371238123912401241124212431244124512461247124812491250125112521253125412551256125712581259126012611262126312641265126612671268126912701271127212731274127512761277127812791280128112821283128412851286128712881289129012911292129312941295129612971298129913001301130213031304130513061307130813091310131113121313131413151316131713181319132013211322132313241325132613271328132913301331133213331334133513361337133813391340134113421343134413451346134713481349135013511352135313541355135613571358135913601361136213631364136513661367136813691370137113721373137413751376137713781379138013811382138313841385138613871388138913901391139213931394139513961397139813991400140114021403140414051406140714081409141014111412141314141415141614171418141914201421142214231424142514261427142814291430143114321433143414351436143714381439144014411442144314441445144614471448144914501451145214531454145514561457145814591460146114621463146414651466146714681469147014711472147314741475147614771478147914801481148214831484148514861487148814891490149114921493149414951496149714981499150015011502150315041505150615071508150915101511151215131514151515161517151815191520152115221523152415251526152715281529153015311532153315341535153615371538153915401541154215431544154515461547154815491550155115521553155415551556155715581559156015611562156315641565156615671568156915701571157215731574157515761577157815791580158115821583158415851586158715881589159015911592159315941595159615971598159916001601160216031604160516061607160816091610161116121613161416151616161716181619162016211622162316241625162616271628162916301631163216331634163516361637163816391640164116421643164416451646164716481649165016511652165316541655165616571658165916601661166216631664166516661667166816691670167116721673167416751676167716781679168016811682168316841685168616871688168916901691169216931694169516961697169816991700170117021703170417051706170717081709171017111712171317141715171617171718171917201721172217231724172517261727172817291730173117321733173417351736173717381739174017411742174317441745174617471748174917501751175217531754175517561757175817591760176117621763176417651766176717681769177017711772177317741775177617771778177917801781178217831784178517861787178817891790179117921793179417951796179717981799180018011802180318041805180618071808180918101811181218131814181518161817181818191820182118221823182418251826182718281829183018311832183318341835183618371838183918401841184218431844184518461847184818491850185118521853185418551856185718581859186018611862186318641865186618671868186918701871187218731874187518761877187818791880188118821883188418851886188718881889189018911892189318941895189618971898189919001901190219031904190519061907190819091910191119121913191419151916191719181919192019211922192319241925192619271928192919301931193219331934193519361937193819391940194119421943194419451946194719481949195019511952195319541955195619571958195919601961196219631964196519661967196819691970197119721973197419751976197719781979198019811982198319841985198619871988198919901991199219931994199519961997199819992000200120022003200420052006200720082009201020112012201320142015201620172018201920202021202220232024202520262027202820292030203120322033203420352036203720382039204020412042204320442045204620472048204920502051205220532054205520562057205820592060206120622063206420652066206720682069207020712072207320742075207620772078207920802081208220832084208520862087208820892090209120922093209420952096209720982099210021012102210321042105210621072108210921102111211221132114211521162117211821192120212121222123212421252126212721282129213021312132213321342135213621372138213921402141214221432144214521462147214821492150215121522153215421552156215721582159216021612162216321642165216621672168216921702171217221732174217521762177217821792180218121822183218421852186218721882189219021912192219321942195219621972198219922002201220222032204220522062207220822092210221122122213221422152216221722182219222022212222222322242225222622272228222922302231223222332234223522362237223822392240224122422243224422452246224722482249225022512252225322542255225622572258225922602261226222632264226522662267226822692270227122722273227422752276227722782279228022812282228322842285228622872288228922902291229222932294229522962297229822992300230123022303230423052306230723082309231023112312231323142315231623172318231923202321232223232324232523262327232823292330233123322333233423352336233723382339234023412342234323442345234623472348234923502351235223532354235523562357235823592360236123622363236423652366236723682369237023712372237323742375237623772378237923802381238223832384238523862387238823892390239123922393239423952396239723982399240024012402240324042405240624072408240924102411241224132414241524162417241824192420242124222423242424252426242724282429243024312432243324342435243624372438243924402441244224432444244524462447244824492450245124522453245424552456245724582459246024612462246324642465246624672468246924702471247224732474247524762477247824792480248124822483248424852486248724882489249024912492249324942495249624972498249925002501250225032504250525062507250825092510251125122513251425152516251725182519252025212522252325242525252625272528252925302531253225332534253525362537253825392540254125422543254425452546254725482549255025512552255325542555255625572558255925602561256225632564256525662567256825692570257125722573257425752576257725782579258025812582258325842585258625872588258925902591259225932594259525962597259825992600260126022603260426052606260726082609261026112612261326142615261626172618261926202621262226232624262526262627262826292630263126322633263426352636263726382639264026412642264326442645264626472648264926502651265226532654265526562657265826592660266126622663266426652666266726682669267026712672267326742675267626772678267926802681268226832684268526862687268826892690269126922693269426952696269726982699270027012702270327042705270627072708270927102711271227132714271527162717271827192720272127222723272427252726272727282729273027312732273327342735273627372738273927402741274227432744274527462747274827492750275127522753275427552756275727582759276027612762276327642765276627672768276927702771277227732774277527762777277827792780278127822783278427852786278727882789279027912792279327942795279627972798279928002801280228032804280528062807280828092810281128122813281428152816281728182819282028212822282328242825282628272828282928302831283228332834283528362837283828392840284128422843284428452846284728482849285028512852285328542855285628572858285928602861286228632864286528662867286828692870287128722873287428752876287728782879288028812882288328842885288628872888288928902891289228932894289528962897289828992900290129022903290429052906290729082909291029112912291329142915291629172918291929202921292229232924292529262927292829292930293129322933293429352936293729382939294029412942294329442945294629472948294929502951295229532954295529562957295829592960296129622963296429652966296729682969297029712972297329742975297629772978297929802981298229832984298529862987298829892990299129922993299429952996299729982999300030013002300330043005300630073008300930103011301230133014301530163017301830193020302130223023302430253026302730283029303030313032303330343035303630373038303930403041304230433044304530463047304830493050305130523053305430553056305730583059306030613062306330643065306630673068306930703071307230733074307530763077307830793080308130823083308430853086308730883089309030913092309330943095309630973098309931003101310231033104310531063107310831093110311131123113311431153116311731183119312031213122312331243125312631273128312931303131313231333134313531363137313831393140314131423143314431453146314731483149315031513152315331543155315631573158315931603161316231633164316531663167316831693170317131723173317431753176317731783179318031813182318331843185318631873188318931903191319231933194319531963197319831993200320132023203320432053206320732083209321032113212321332143215321632173218321932203221322232233224322532263227322832293230323132323233323432353236323732383239324032413242324332443245324632473248324932503251325232533254325532563257325832593260326132623263326432653266326732683269327032713272327332743275327632773278327932803281328232833284328532863287328832893290329132923293329432953296329732983299330033013302330333043305330633073308330933103311331233133314331533163317331833193320332133223323332433253326332733283329333033313332333333343335333633373338333933403341334233433344334533463347334833493350335133523353335433553356335733583359336033613362336333643365336633673368336933703371337233733374337533763377337833793380338133823383338433853386338733883389339033913392339333943395339633973398339934003401340234033404340534063407340834093410341134123413341434153416341734183419342034213422342334243425342634273428342934303431343234333434343534363437343834393440344134423443344434453446344734483449345034513452345334543455345634573458345934603461346234633464346534663467346834693470347134723473347434753476347734783479348034813482348334843485348634873488348934903491349234933494349534963497349834993500350135023503350435053506350735083509351035113512351335143515351635173518351935203521352235233524352535263527352835293530353135323533353435353536353735383539354035413542354335443545354635473548354935503551355235533554355535563557355835593560356135623563356435653566356735683569357035713572357335743575357635773578357935803581358235833584358535863587358835893590359135923593359435953596359735983599360036013602360336043605360636073608360936103611361236133614361536163617361836193620362136223623362436253626362736283629363036313632363336343635363636373638363936403641364236433644364536463647364836493650365136523653365436553656365736583659366036613662366336643665366636673668366936703671367236733674367536763677367836793680368136823683368436853686368736883689369036913692369336943695369636973698369937003701370237033704370537063707370837093710371137123713371437153716371737183719372037213722372337243725372637273728372937303731373237333734373537363737373837393740374137423743374437453746374737483749375037513752375337543755375637573758375937603761376237633764376537663767376837693770377137723773377437753776377737783779378037813782378337843785378637873788378937903791379237933794379537963797379837993800380138023803380438053806380738083809381038113812381338143815381638173818381938203821382238233824382538263827382838293830383138323833383438353836383738383839384038413842384338443845384638473848384938503851385238533854385538563857385838593860386138623863386438653866386738683869387038713872387338743875387638773878387938803881388238833884388538863887388838893890389138923893389438953896389738983899390039013902390339043905390639073908390939103911391239133914391539163917391839193920392139223923392439253926392739283929393039313932393339343935393639373938393939403941394239433944394539463947394839493950395139523953395439553956395739583959396039613962396339643965396639673968396939703971397239733974397539763977397839793980398139823983398439853986398739883989399039913992399339943995399639973998399940004001400240034004400540064007400840094010401140124013401440154016401740184019402040214022402340244025402640274028402940304031403240334034403540364037403840394040404140424043404440454046404740484049405040514052405340544055405640574058405940604061406240634064406540664067406840694070407140724073407440754076407740784079408040814082408340844085408640874088408940904091409240934094409540964097409840994100410141024103410441054106410741084109411041114112411341144115411641174118411941204121412241234124412541264127412841294130413141324133413441354136413741384139414041414142414341444145414641474148414941504151415241534154415541564157415841594160416141624163416441654166416741684169417041714172417341744175417641774178417941804181418241834184418541864187418841894190419141924193419441954196419741984199420042014202420342044205420642074208420942104211421242134214421542164217421842194220422142224223422442254226422742284229423042314232423342344235423642374238423942404241424242434244424542464247424842494250425142524253425442554256425742584259426042614262426342644265426642674268426942704271427242734274427542764277427842794280428142824283428442854286428742884289429042914292429342944295429642974298429943004301430243034304430543064307430843094310431143124313431443154316431743184319432043214322432343244325432643274328432943304331433243334334433543364337433843394340434143424343434443454346434743484349435043514352435343544355435643574358435943604361436243634364436543664367436843694370437143724373437443754376437743784379438043814382438343844385438643874388438943904391439243934394439543964397439843994400440144024403440444054406440744084409441044114412441344144415441644174418441944204421442244234424442544264427442844294430443144324433443444354436443744384439444044414442444344444445444644474448444944504451445244534454445544564457445844594460446144624463446444654466446744684469447044714472447344744475447644774478447944804481448244834484448544864487448844894490449144924493449444954496449744984499450045014502450345044505450645074508450945104511451245134514451545164517451845194520452145224523452445254526452745284529453045314532453345344535453645374538453945404541454245434544454545464547454845494550455145524553455445554556455745584559456045614562456345644565456645674568456945704571457245734574457545764577457845794580458145824583458445854586458745884589459045914592459345944595459645974598459946004601460246034604460546064607460846094610461146124613461446154616461746184619462046214622462346244625462646274628462946304631463246334634463546364637463846394640464146424643464446454646464746484649465046514652465346544655465646574658465946604661466246634664466546664667466846694670467146724673467446754676467746784679468046814682468346844685468646874688468946904691469246934694469546964697469846994700470147024703470447054706470747084709471047114712471347144715471647174718471947204721472247234724472547264727472847294730473147324733473447354736473747384739474047414742474347444745474647474748474947504751475247534754475547564757475847594760476147624763476447654766476747684769477047714772477347744775477647774778477947804781478247834784478547864787478847894790479147924793479447954796479747984799480048014802480348044805480648074808480948104811481248134814481548164817481848194820482148224823482448254826482748284829483048314832483348344835483648374838483948404841484248434844484548464847484848494850485148524853485448554856485748584859486048614862486348644865486648674868486948704871487248734874487548764877487848794880488148824883488448854886488748884889489048914892489348944895489648974898489949004901490249034904490549064907490849094910491149124913491449154916491749184919492049214922492349244925492649274928492949304931493249334934493549364937493849394940494149424943494449454946494749484949495049514952495349544955495649574958495949604961496249634964496549664967496849694970497149724973497449754976497749784979498049814982498349844985498649874988498949904991499249934994499549964997499849995000500150025003500450055006500750085009501050115012501350145015501650175018501950205021502250235024502550265027502850295030503150325033503450355036503750385039504050415042504350445045504650475048504950505051505250535054505550565057505850595060506150625063506450655066506750685069507050715072507350745075507650775078507950805081508250835084508550865087508850895090509150925093509450955096509750985099510051015102510351045105510651075108510951105111511251135114511551165117511851195120512151225123512451255126512751285129513051315132513351345135513651375138513951405141514251435144514551465147514851495150515151525153515451555156515751585159516051615162516351645165516651675168516951705171517251735174517551765177517851795180518151825183518451855186518751885189519051915192519351945195519651975198519952005201520252035204520552065207520852095210521152125213521452155216521752185219522052215222522352245225522652275228522952305231523252335234523552365237523852395240524152425243524452455246524752485249525052515252525352545255525652575258525952605261526252635264526552665267526852695270527152725273527452755276527752785279528052815282528352845285528652875288528952905291529252935294529552965297529852995300530153025303530453055306530753085309531053115312531353145315531653175318531953205321532253235324532553265327532853295330533153325333533453355336533753385339534053415342534353445345534653475348534953505351535253535354535553565357535853595360536153625363536453655366536753685369537053715372537353745375537653775378537953805381538253835384538553865387538853895390539153925393539453955396539753985399540054015402540354045405540654075408540954105411541254135414541554165417541854195420542154225423542454255426542754285429543054315432543354345435543654375438543954405441544254435444544554465447544854495450545154525453545454555456545754585459546054615462546354645465546654675468546954705471547254735474547554765477547854795480548154825483548454855486548754885489549054915492549354945495549654975498549955005501550255035504550555065507550855095510551155125513551455155516551755185519552055215522552355245525552655275528552955305531553255335534553555365537553855395540554155425543554455455546554755485549555055515552555355545555555655575558555955605561556255635564556555665567556855695570557155725573557455755576557755785579558055815582558355845585558655875588558955905591559255935594559555965597559855995600560156025603560456055606560756085609561056115612561356145615561656175618561956205621562256235624562556265627562856295630563156325633563456355636563756385639564056415642564356445645564656475648564956505651565256535654565556565657565856595660566156625663566456655666566756685669567056715672567356745675567656775678567956805681568256835684568556865687568856895690569156925693569456955696569756985699570057015702570357045705570657075708570957105711571257135714571557165717571857195720572157225723572457255726572757285729573057315732573357345735573657375738573957405741574257435744574557465747574857495750575157525753575457555756575757585759576057615762576357645765576657675768576957705771577257735774577557765777577857795780578157825783578457855786578757885789579057915792579357945795579657975798579958005801580258035804580558065807580858095810581158125813581458155816581758185819582058215822582358245825582658275828582958305831583258335834583558365837583858395840584158425843584458455846584758485849585058515852585358545855585658575858585958605861586258635864586558665867586858695870587158725873587458755876587758785879588058815882588358845885588658875888588958905891589258935894589558965897589858995900590159025903590459055906590759085909591059115912591359145915591659175918591959205921592259235924592559265927592859295930593159325933593459355936593759385939594059415942594359445945594659475948594959505951595259535954595559565957595859595960596159625963596459655966596759685969597059715972597359745975597659775978597959805981598259835984598559865987598859895990599159925993599459955996599759985999600060016002600360046005600660076008600960106011601260136014601560166017601860196020602160226023602460256026602760286029603060316032603360346035603660376038603960406041604260436044604560466047604860496050605160526053605460556056605760586059606060616062606360646065606660676068606960706071607260736074607560766077607860796080608160826083608460856086608760886089609060916092609360946095609660976098609961006101610261036104610561066107610861096110611161126113611461156116611761186119612061216122612361246125612661276128612961306131613261336134613561366137613861396140614161426143614461456146614761486149615061516152615361546155615661576158615961606161616261636164616561666167616861696170617161726173617461756176617761786179618061816182618361846185618661876188618961906191619261936194619561966197619861996200620162026203620462056206620762086209621062116212621362146215621662176218621962206221622262236224622562266227622862296230623162326233623462356236623762386239624062416242624362446245624662476248624962506251625262536254625562566257625862596260626162626263626462656266626762686269627062716272627362746275627662776278627962806281628262836284628562866287628862896290629162926293629462956296629762986299630063016302630363046305630663076308630963106311631263136314631563166317631863196320632163226323632463256326632763286329633063316332633363346335633663376338633963406341634263436344634563466347634863496350635163526353635463556356635763586359636063616362636363646365636663676368636963706371637263736374637563766377637863796380638163826383638463856386638763886389639063916392639363946395639663976398639964006401640264036404640564066407640864096410641164126413641464156416641764186419642064216422642364246425642664276428642964306431643264336434643564366437643864396440644164426443644464456446644764486449645064516452645364546455645664576458645964606461646264636464646564666467646864696470647164726473647464756476647764786479648064816482648364846485648664876488648964906491649264936494649564966497649864996500650165026503650465056506650765086509651065116512651365146515651665176518651965206521652265236524652565266527652865296530653165326533653465356536653765386539654065416542654365446545654665476548654965506551655265536554655565566557655865596560656165626563656465656566656765686569657065716572657365746575657665776578657965806581658265836584658565866587658865896590659165926593659465956596659765986599660066016602660366046605660666076608660966106611661266136614661566166617661866196620
  1. This is avram.info, produced by makeinfo version 4.13 from
  2. avram.texinfo.
  3. This file documents the `avram' command which is a virtual machine code
  4. interpreter
  5. Copyright (C) 2000, 2003, 2006-2010, 2012 Dennis Furey Permission is
  6. granted to make and distribute verbatim copies of this manual provided
  7. the copyright notice and this permission notice are preserved on all
  8. copies.
  9. Permission is granted to copy and distribute modified versions of
  10. this manual under the conditions for verbatim copying, provided that
  11. the entire resulting derived work is distributed under the terms of a
  12. permission notice identical to this one.
  13. Permission is granted to copy and distribute translations of this
  14. manual into another language, under the above conditions for modified
  15. versions, except that this permission notice may be stated in a
  16. translation approved by the Free Software Foundation.
  17. 
  18. File: avram.info, Node: Top, Next: Preface, Prev: (dir), Up: (dir)
  19. This file documents `avram' version 0.13.0, which is a virtual
  20. machine code interpreter.
  21. * Menu:
  22. * Preface:: project aims and scope
  23. * User Manual:: command line options and usage
  24. * Virtual Machine Specification:: a guide for compiler writers
  25. * Library Reference:: how to reuse or enhance `avram'
  26. * Character Table:: representations for ASCII characters
  27. * Reference Implementations:: constructive computability proofs
  28. * Changes:: recent updates to the manual
  29. * External Libraries:: specifications and calling conventions
  30. * Copying:: license terms
  31. * Function Index:: for the shared library API
  32. * Concept Index::
  33. 
  34. File: avram.info, Node: Preface, Next: User Manual, Prev: Top, Up: Top
  35. Preface
  36. *******
  37. `avram' is a virtual machine code interpreter. It reads an input file
  38. containing a user-supplied application expressed in virtual machine
  39. code, and executes it on the host machine. The name is a quasi-acronym
  40. for "Applicative ViRtuAl Machine". Notable features are
  41. * strong support for functional programming operations (e.g., list
  42. processing)
  43. * interfaces to selected functions from mathematical libraries, such
  44. as
  45. * `gsl' (numerical integration, differentiation, and
  46. series acceleration)
  47. `http://www.gnu.org/software/gsl/'
  48. * `mpfr' (arbitrary precision arithmetic)
  49. `http://www.mpfr.org'
  50. * `minpack' (non-linear optimization)
  51. `http://www.netlib.org/minpack'
  52. * `lapack' (linear algebra)
  53. `http://www.netlib.org/lapack'
  54. * `fftw' (fast fourier transforms)
  55. `http://www.fftw.org'
  56. * `Rmath' (statistical and transcendental functions)
  57. `http://www.r-project.org'
  58. * `ufsparse' (sparse matrices)
  59. `http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/SuiteSparse/'
  60. * `glpk' (linear programming by the simplex method)
  61. `http://www.gnu.org/software/glpk'
  62. * `lpsolve' (mixed integer linear programming)
  63. `http://sourceforge.net/projects/lpsolve/'
  64. * `kinsol' (constrained non-linear optimization)
  65. `http://www.llnl.gov/CASC/sundials/'
  66. * interoperability of virtual code applications with other console
  67. applications or shells through the `expect' library
  68. * a simple high-level interface to files, environment variables and
  69. command line parameters
  70. * support for various styles of stateless or persistent stream
  71. processors (a.k.a. Unix filters)
  72. The reason for writing `avram' was that I wanted to do some work
  73. using a functional programming language, didn't like any functional
  74. programming languages that already existed, and felt that it would be
  75. less trouble to write a virtual machine emulator than the back end of a
  76. compiler. As of version 0.1.0, the first public release of `avram' as
  77. such in 2000, most of the code base had been in heavy use by me for
  78. about four years, running very reliably. At this writing some six years
  79. later, it has seen even more use with rarely any reliability issues, in
  80. some cases attacking large combinatorial problems for weeks or months
  81. at a time. These problems have involved both long running continuous
  82. execution, and batches of thousands of shorter jobs.
  83. Although the virtual machine is biased toward functional programming,
  84. it is officially language agnostic, so `avram' may be useful to anyone
  85. involved in the development of compilers for other programming,
  86. scripting, or special purpose languages. The crucial advantage of using
  87. it in your own project is that rather than troubling over address
  88. modes, register allocation, and other hassles inherent in generating
  89. native code, your compiler can just dump a fairly high level
  90. intermediate code representation of the source text to a file, and let
  91. the virtual machine emulator deal with the details. The tradeoff for
  92. using a presumably higher level interpreted language is that the
  93. performance is unlikely to be competitive with native code, but this
  94. issue is mitigated in the case of numerical applications whose heavy
  95. lifting is done by the external libraries mentioned above.
  96. Portability is an added bonus. The virtual code is binary compatible
  97. across all platforms. Versions of `avram' as of 0.1.0 and later are
  98. packaged using GNU autotools and should be possible to build on any
  99. platform supporting them. In particular, the package is known to have
  100. built successfully on MacOS, FreeBSD, Solaris (thanks to the compile
  101. farm at Sourceforge.net) Digital Unix, and Debian GNU/Linux for i386 and
  102. Alpha platforms, although it has not been extensively tested on all of
  103. them. Earlier versions were compiled and run successfully on Irix and
  104. even Windows-NT (with `gcc').
  105. This document is divided into three main parts, with possibly three
  106. different audiences, but they all depend on a basic familiarity with Unix
  107. or GNU/Linux systems.
  108. *note User Manual::
  109. essentially reproduces the information found in the manpage that
  110. is distributed with `avram' with a few extra examples and longer
  111. explanations. Properly deployed, `avram' should be almost entirely
  112. hidden from end users by wrapper scripts, so the "users" to whom
  113. this part is relevant would be those involved in preparing these
  114. scripts (a matter of choosing the right command line options).
  115. Depending on the extent to which this task is automated by a
  116. compiler, that may include the compiler writer or the developers
  117. of applications.
  118. *note Virtual Machine Specification::
  119. documents much of what one would need to know in order to write a
  120. compiler that generates code executable by `avram'. That includes
  121. the complete virtual machine code semantics and file formats. It
  122. would also be possible to implement a compatible replacement for
  123. `avram' from scratch based on the information in this chapter, in
  124. case anyone has anything against C, my coding style, or the GPL.
  125. (A few patches to make it `lint' cleanly or a new implementation
  126. in good pedagogical Java without pointers would both be instructive
  127. exercises. ;-))
  128. *note Library Reference::
  129. includes documentation on the application program interface and
  130. recommended entry points for the C library distributed with
  131. `avram'. This information would be of use to those wishing to
  132. develop applications incorporating similar features, or to reuse
  133. the code for unrelated purposes. It might also be useful to anyone
  134. wishing to develop C or C++ applications that read or write data
  135. files in the format used by `avram'.
  136. 
  137. File: avram.info, Node: User Manual, Next: Virtual Machine Specification, Prev: Preface, Up: Top
  138. 1 User Manual
  139. *************
  140. This chapter provides the basic information on how to use `avram' to
  141. execute virtual machine code applications.
  142. `avram' is invoked by typing a command at a shell prompt in one of
  143. these three forms.
  144. `avram' [_general options_]
  145. `avram' [_filter mode options_] CODEFILE[`.avm']
  146. `avram' [_parameter mode options_] CODEFILE[`.avm'] [_parameters_]
  147. In the second case, `avram' reads from standard input, and may of
  148. course appear as part of commands such as
  149. `avram' [_filter mode options_] CODEFILE[`.avm'] < INPUTFILE
  150. ANOTHERCOMMAND | `avram' [_filter mode options_] CODEFILE[`.avm']
  151. When `avram' is invoked with the name of an input file (with a default
  152. extension `.avm'), it reads virtual machine code from the file and
  153. executes it on the host machine.
  154. The virtual code format used by `avram' is designed to support the
  155. features of functional or applicative programming languages. Although
  156. this chapter documents only the usage of `avram' and not the internals,
  157. it will be helpful to keep in mind that the virtual machine code
  158. expresses a mathematical function rather than a program in the
  159. conventional sense. As such, it performs no action directly, but may be
  160. applied in a choice of ways by the user of `avram' according to the
  161. precise operation required.
  162. The following sections provide information in greater detail about
  163. usage and diagnostics.
  164. * Menu:
  165. * General Options:: getting help and version information
  166. * Modes of Operation:: stream processing or file oriented
  167. * Filter Mode Options:: how to run a stream processor
  168. * Parameter Mode Options:: how to have an application use files
  169. * Command Line Syntax:: application-independent conventions
  170. * Diagnostics:: explanation of error messages
  171. * Security:: running untrusted applications
  172. * Example Script:: how to unburden the end users
  173. * Files:: miscellaneous files used
  174. * Environment:: environment variables
  175. * Bugs:: hall of shame
  176. 
  177. File: avram.info, Node: General Options, Next: Modes of Operation, Prev: User Manual, Up: User Manual
  178. 1.1 General Options
  179. ===================
  180. Regardless of whatever other command line parameters are given, `avram'
  181. accepts the following parameters:
  182. `-h, --help'
  183. Show a summary of options and exit.
  184. `-V,-v, --version'
  185. Show the version of program and a short copyleft message and exit.
  186. `--emulation=VERSION'
  187. Be backward compatible with an older version of `avram'. This
  188. option should include a valid version number, for example
  189. `0.13.0', which is the version of `avram' to be emulated. It can
  190. make virtual code applications future proof, assuming that future
  191. versions of `avram' correctly support backward compatibility. It
  192. may be used in conjunction with any other option in any mode of
  193. operation. This copy of the user manual has not been updated
  194. since version 0.13.0 of `avram', so it is unable to document
  195. incompatibilities with later versions. The latest version of the
  196. manual may be found at `http://www.lsbu.ac.uk/~fureyd/avram'.
  197. `-e, --external-libraries'
  198. Show a list of libraries with which `avram' has been linked and
  199. whose functions therefore could be called from virtual machine
  200. programs. This growing list currently includes selected functions
  201. from `fftw', `glpk', `gsl', `kinsol', `lapack', `minpack', `mpfr',
  202. `lpsolve', `Rmath' and `ufsparse' (see *note Preface::) which are
  203. documented further in *note External Libraries::.
  204. `-j, --jail'
  205. This option disables execution of shell commands by virtual code
  206. applications, which is normally possible by default even for
  207. nominally non-interactive applications (see *note Parameter Mode
  208. Options::). A virtual code application attempting to spawn a shell
  209. (using the `interact' combinator) when this option is selected will
  210. encounter an exception rather than successful completion of the
  211. operation. This option is provided as a security feature for
  212. running untrusted code (see *note Security::), and is incompatible
  213. with `-i', `-t', and `-s'.
  214. `-f, --force-text-input'
  215. Normally `avram' will try to guess by looking at a file whether it
  216. is an ordinary text file or one that has been written in the
  217. virtual code file format, and choose a different internal
  218. representation accordingly. An application may require one
  219. representation or the other. This option tells `avram' to treat
  220. all input files other than the virtual code file (named in the
  221. first command line parameter) as text files regardless of whether
  222. or not it would be possible to interpret them otherwise. This
  223. option may be used in combination with any other option.
  224. 
  225. File: avram.info, Node: Modes of Operation, Next: Filter Mode Options, Prev: General Options, Up: User Manual
  226. 1.2 Modes of Operation
  227. ======================
  228. Apart from to the capability to print brief help messages and exit,
  229. there are two main modes of operation, depending on which options are
  230. specified on the command line before the virtual code file name.
  231. For the purpose of choosing the mode of operation, the virtual code
  232. filename is taken to be the first command line argument not beginning
  233. with a dash. Other conventions relevant to application specific
  234. parameters are detailed in *note Command Line Syntax::.
  235. * Menu:
  236. * Filter Mode::
  237. * Parameter Mode::
  238. 
  239. File: avram.info, Node: Filter Mode, Next: Parameter Mode, Prev: Modes of Operation, Up: Modes of Operation
  240. 1.2.1 Filter Mode
  241. -----------------
  242. In filter mode, the argument to the function given by the virtual code
  243. is taken from standard input, and the result is written to standard
  244. output, except for error messages resulting from a failure to evaluate
  245. the function, which are written to standard error. *Note
  246. Diagnostics::. Filter mode is indicated whenever these three conditions
  247. are all met.
  248. * Either at least one of the filter mode options appears on the
  249. command line preceding the first filename parameter, or there are
  250. no options at all. *Note Filter Mode Options::.
  251. * Exactly one filename parameter appears on the command line, which
  252. is the name of the virtual machine code file.
  253. * Either the filename comes last on the command line, or the
  254. `--unparameterized' option precedes it, causing everything
  255. following it to be ignored.
  256. Examples:
  257. `avram mynewapp < inputfilename'
  258. In this example, filter mode is recognized by default because
  259. there are no options or input files on the command line to indicate
  260. otherwise. (The input file redirected into standard input is not
  261. treated by the shell as a command line argument.)
  262. `cat somefile | avram -r coolprog > outputfile'
  263. In this example, the `-r' option gives it away, being one of the
  264. filter mode options, in addition to the fact that there are no
  265. input file parameters or application-specific options.
  266. `avram -u devilmaycare.avm --bogusoption ignoredparameter'
  267. In this case, filter mode is forced by the `-u' option despite
  268. indications to the contrary.
  269. 
  270. File: avram.info, Node: Parameter Mode, Prev: Filter Mode, Up: Modes of Operation
  271. 1.2.2 Parameter Mode
  272. --------------------
  273. In parameter mode, the argument to the function given by the virtual
  274. code is a data structure containing environment variables and command
  275. line parameters including files, application specific options, and
  276. possibly standard input. The result obtained by evaluating the function
  277. is either a data structure representing a set of files to be written,
  278. which may include standard output, or a sequence of shell commands to
  279. be executed, or a combination of both. Parameter mode is indicated
  280. whenever either of these conditions is met.
  281. * Any of the parameter mode options appears on the command line
  282. preceding the first filename parameter.
  283. *Note Parameter Mode Options::.
  284. * At least one additional filename parameter or option follows the
  285. first filename parameter, and the option `--unparameterized' does
  286. not precede it.
  287. Examples:
  288. `avram --map-to-each-file prettyprinter.avm *.c *.h --extra-pretty'
  289. In this example, parameter mode is indicated both by the parameter
  290. mode option `--map-to-each-file' and by the presence of input file
  291. names and the `--extra-pretty' option. The latter is specific to
  292. the hypothetical `prettyprinter.avm' virtual code application, as
  293. indicated by its position on the command line, and is therefore
  294. passed to it by `avram'.
  295. `cat ~/specfile | avram reportgenerator -v - /var/log/syslog'
  296. In this example, a hypothetical parameter mode application
  297. `reportgenerator' is able to read `~/specfile' from standard input
  298. because of the `-' used as a parameter.
  299. `avram --parameterized grepenv'
  300. In this example, a hypothetical application that searches shell
  301. variables is invoked in parameter mode even with no input files or
  302. application specific options, because of the `--parameterized'
  303. option. Parameter mode invocation is required by the application
  304. to give it access to the environment.
  305. `avram grepenv --search-targets=PATH,MANPATH'
  306. This example shows an application specific option with both a
  307. keyword and a parameter list. They suffice to indicate parameter
  308. mode without an explicit `--parameterized' option.
  309. 
  310. File: avram.info, Node: Filter Mode Options, Next: Parameter Mode Options, Prev: Modes of Operation, Up: User Manual
  311. 1.3 Filter Mode Options
  312. =======================
  313. The options available in filter mode are listed below. Except as
  314. otherwise noted, all options are mutually exclusive. Ordinarily a given
  315. application will require certain fixed settings of these options and
  316. will not work properly if they are set inappropriately.
  317. `-r, `--raw-output''
  318. Normally the result obtained by evaluating the function in the
  319. virtual code file must be a list of character strings, which is
  320. written as such to standard output. However, if this option is
  321. selected, the form of the result is unconstrained, and it will be
  322. written in a data file format that is not human readable but can
  323. be used by other applications. This option is incompatible with
  324. any other options except `-u'.
  325. `-c, --choice-of-output'
  326. When this option is used, the evaluation of the function given by
  327. the virtual machine code will be expected to yield a data
  328. structure from which `avram' will ascertain whether standard
  329. output should be written in text or raw data format. This option
  330. should be used only if application is aware of it. It is
  331. incompatible with any other options except `-u'.
  332. `-l, --line-map'
  333. Normally the entire contents of standard input up to `EOF' are
  334. loaded into memory and used as the argument to the function in the
  335. virtual code file. However, this option causes standard input to
  336. be read a line at a time, with the function applied individually
  337. to each line, and its result in each case written immediately to
  338. standard output. A given application either requires this option
  339. or does not, and will not work properly in the alternative. This
  340. option implies `--force-text-input' and is incompatible with any
  341. other option except `-u'.
  342. `-b, --byte-transducer'
  343. This option causes standard input to be read one character at a
  344. time, evaluating the function given by the virtual code file each
  345. time. The function is used as a state transition function that
  346. takes a state and input to a next state and output. The output is
  347. written concurrently with the input operations. A given
  348. application will not work properly with an inappropriate setting
  349. of this option. This option implies `--force-text-input' and is
  350. incompatible with any other option except `-u'.
  351. `-u, --unparameterized'
  352. Normally `avram' guesses whether to use filter mode or parameter
  353. mode depending on whether there are any parameters. Selecting this
  354. option forces it to operate in filter mode regardless. Any
  355. parameters that may appear on the command line after the virtual
  356. code file name are ignored. This option may be used in conjunction
  357. with any other filter mode option.
  358. 
  359. File: avram.info, Node: Parameter Mode Options, Next: Command Line Syntax, Prev: Filter Mode Options, Up: User Manual
  360. 1.4 Parameter Mode Options
  361. ==========================
  362. The parameter mode options are listed below. Except as otherwise noted,
  363. any combination of parameter mode options may be selected together, and
  364. except as noted, the settings of these options can be varied without
  365. breaking the application.
  366. `-q, --quiet'
  367. `avram' normally informs the user when writing an output file with
  368. a short message to standard output. This option suppresses such
  369. messages. This option is compatible with any application and any
  370. other parameter mode option except `-a'.
  371. `-a, --ask-to-overwrite'
  372. Selecting this option will cause `avram' to ask permission
  373. interactively before overwriting an existing file, and to refrain
  374. from overwriting it without permission, in which case the contents
  375. that were to be written will be lost. This option overrides `-q'
  376. and is compatible with any other parameter mode option or
  377. application.
  378. `-.EXT'
  379. An option beginning with a dash followed by a period specifies a
  380. default extension for input file names. If `avram' doesn't find a
  381. file named on the command line, and the filename doesn't already
  382. contain a period, `avram' will try to find a file having a similar
  383. name but with the default extension appended. The default
  384. extension given by this option takes precedence over the hard
  385. coded default extensions of `.fun' and `.avm'. At most one default
  386. extension can be supplied. This option is compatible with any
  387. other parameter mode option and compatible with any application.
  388. `-d, --default-to-stdin'
  389. If no filename parameter appears on the command line (other than
  390. the name of the virtual code file), this option directs `avram' to
  391. read the contents of standard input as if it were specified as a
  392. command line parameter. (Standard input can also be specified
  393. explicitly as a dash. See *note Command Line Syntax::.) This
  394. option is compatible with any application and any other parameter
  395. mode option except `-m'.
  396. `-m, --map-to-each-file'
  397. Normally `avram' loads the entire contents of all files named on
  398. the command line into memory so as to evaluate the virtual machine
  399. code application on all of them together. This option can be used
  400. to save memory in the case of applications that operate on
  401. multiple files independently. It causes `avram' to load only one
  402. file at a time and to perform the relevant evaluation and output
  403. before loading the next one. Application specific options and
  404. standard input (if specified) are read only once and reused. This
  405. option is incompatible with `-d', and not necessarily compatible
  406. with all applications, although some may work both with and
  407. without it.
  408. `-i, --interactive'
  409. This option is used in the case of applications that interact with
  410. other programs through shell commands. An application that is
  411. meant to be invoked in this way requires this option and will not
  412. work without it, nor will applications that are not of this type
  413. work with it. This option is implied by `-t' and `-s', and is
  414. compatible with any other parameter mode option.
  415. `-s, --step'
  416. This option is used in the case of applications that interact with
  417. other programs through shell commands, similarly to `-i', and can
  418. substitute for it (see above). The option has the additional
  419. effect of causing shell commands issued by `avram' on behalf of
  420. the application to be written with their results to standard
  421. output, and to cause `avram' to pause after displaying each shell
  422. command until a key is pressed. This capability may be useful for
  423. debugging or auditing purposes but does not otherwise alter the
  424. effects of the application. This option is compatible with any
  425. other parameter mode option.
  426. `-t, --trace'
  427. This option is used in the case of applications that interact with
  428. other programs through shell commands, but only by way of the
  429. `interact' combinator, for which it provides developers a means of
  430. low level debugging, particularly deadlock detection. When this
  431. option is selected, a verbose trace of all characters exchanged
  432. between the functional transducer and the external application are
  433. written to standard output, along with some additional control flow
  434. diagnostics. This option is compatible with any other parameter
  435. mode option.
  436. `-p, --parameterized'
  437. Normally `avram' tries to guess whether to operate in filter mode
  438. or parameter mode based on the options used and the parameters. If
  439. there are no parameters and no options, it will default to filter
  440. mode, and try to read standard input. However, if this option is
  441. selected, it will use parameter mode (and therefore not try to read
  442. standard input unless required).
  443. 
  444. File: avram.info, Node: Command Line Syntax, Next: Diagnostics, Prev: Parameter Mode Options, Up: User Manual
  445. 1.5 Command Line Syntax
  446. =======================
  447. The command line parameters that follow the virtual code file name when
  448. `avram' is used in parameter mode (*note Parameter Mode::) are
  449. dependent on the specific application. However, all supported
  450. applications are constrained for implementation reasons to observe
  451. certain uniform conventions regarding their command line parameters,
  452. which are documented here to avoid needless duplication.
  453. The shell divides the command line into "arguments" separated by
  454. white space. Arguments containing white space or special characters
  455. used by the shell must be quoted or protected as usual. File names with
  456. wild cards in them are expanded by the shell before `avram' sees them.
  457. `avram' then extracts from the sequence of arguments a sequence of
  458. filenames and a sequence of options. Each option consists of a keyword
  459. and an optional parameter list. Filenames, keywords, and parameter
  460. lists are distinguished according to the following criteria.
  461. 1. An argument is treated as a keyword iff it meets these three
  462. conditions.
  463. a. It starts with a dash.
  464. b. It doesn't contain an equals sign.
  465. c. It doesn't consist solely of a dash.
  466. 2. An argument is treated as a parameter list iff it meets these four
  467. conditions.
  468. a. It doesn't begin with a dash.
  469. b. It either begins with an equals sign or doesn't contain one.
  470. c. It immediately follows an argument beginning with a dash, not
  471. containing an equals sign and not consisting solely of a dash.
  472. d. At least one of the following is true.
  473. 1. It doesn't contain a period, tilde, or path separator.
  474. 2. It contains a comma.
  475. 3. It can be interpreted as a C formatted floating point
  476. number.
  477. 3. An argument is treated as an input file name iff it meets these
  478. four conditions.
  479. a. It doesn't begin with a dash.
  480. b. It doesn't contain an equals sign.
  481. c. It doesn't contain a comma.
  482. d. At least one of the following is true.
  483. 1. It contains a period, tilde, or path separator.
  484. 2. It doesn't immediately follow an argument beginning with
  485. a dash, not consisting solely of a dash, and not
  486. containing an equals sign.
  487. 4. If an argument contains an equals sign but doesn't begin with one,
  488. the part on the left of the first equals sign is treated as a
  489. keyword and the part on the right is treated as a parameter list.
  490. 5. An argument consisting solely of a dash is taken to represent the
  491. standard input file.
  492. 6. An argument not fitting any of the above classifications is an
  493. error.
  494. These conventions are needed for `avram' to detect input file names
  495. in a general, position independent way, so that it can preload the files
  496. on behalf of the application. Many standard Unix utilities follow these conventions
  497. to a large extent, the exceptions being those that employ non-filename
  498. arguments without distinguishing syntax, and use positional or other ad
  499. hoc methods of command line interpretation. A drop-in replacement for
  500. such an application could nevertheless be implemented using `avram'
  501. with an appropriate wrapper script, similar to the approach recommended
  502. in *note Example Script::, but with suitable keywords inserted prior to
  503. the ambiguous arguments.
  504. 
  505. File: avram.info, Node: Diagnostics, Next: Security, Prev: Command Line Syntax, Up: User Manual
  506. 1.6 Diagnostics
  507. ===============
  508. The means exists for virtual code applications to have run time error
  509. messages written to standard error on their behalf by `avram'. Any
  510. error messages not documented here originate with an application and
  511. should be documented by it.
  512. Most error messages originating from `avram' are prefaced by the
  513. application name (i.e., the name of the file containing the virtual
  514. machine code), but will be prefaced by `avram:' if the error is caused
  515. by a problem loading this file itself. Error messages originating from
  516. virtual code applications are the responsibility of their respective
  517. authors and might not be prefaced by the application name.
  518. The run time errors not specifically raised by the application can be
  519. classified as internal errors, i/o errors, overflow errors, file format
  520. errors, application programming errors, and configuration related
  521. errors.
  522. Some error messages include a code number. The number identifies the
  523. specific point in the source code where the condition was detected, for
  524. the benefit of the person maintaining it.
  525. * Menu:
  526. * Internal Errors::
  527. * i/o Errors::
  528. * Overflow Errors::
  529. * File Format Errors::
  530. * Application Programming Errors::
  531. * Configuration Related Errors::
  532. * Other Diagnostics and Warnings::
  533. 
  534. File: avram.info, Node: Internal Errors, Next: i/o Errors, Prev: Diagnostics, Up: Diagnostics
  535. 1.6.1 Internal Errors
  536. ---------------------
  537. Internal errors should never occur unless the `avram' source code has
  538. been carelessly modified, except as noted in *note Bugs::. There are
  539. two kinds.
  540. `APPLICATION-NAME: virtual machine internal error (code NN)'
  541. Most internal errors would be reported by a message of this form
  542. if they were to occur. It indicates that some required invariant
  543. was not maintained. In such cases, the program terminates
  544. immediately, and any results already produced are suspect.
  545. `APPLICATION-NAME: NN unreclaimed STRUCT-NAMES'
  546. A message of this form could be printed at the end of an otherwise
  547. successful run. `avram' maintains a count of the number of units
  548. allocated for various data structures, and checks that they are
  549. all reclaimed eventually as a safeguard against memory leaks. This
  550. message indicates that some memory remains unaccounted for.
  551. If a repeatable internal error is discovered, please email a bug
  552. report and a small representative test case to
  553. <[email protected]> or file an issue on the Avram github page.
  554. Include the version number of `avram', which you can get by running
  555. `avram --version'.
  556. 
  557. File: avram.info, Node: i/o Errors, Next: Overflow Errors, Prev: Internal Errors, Up: Diagnostics
  558. 1.6.2 i/o Errors
  559. ----------------
  560. These error messages are prefaced with the name of the application. A
  561. further explanation as to the reason, obtained from the standard
  562. `strerror()' utility, is appended to the messages below if possible.
  563. `APPLICATION-NAME: can't read FILENAME'
  564. A file was not able to be opened for reading, typically because it
  565. was not found or because the user does not have permission. The
  566. file name is displayed with special characters expanded but
  567. without any default extensions or search paths that may have been
  568. tried. If you think a file exists and should have been found,
  569. there may be a problem with your `AVMINPUTS' environment variable
  570. (*note Environment::).
  571. `APPLICATION-NAME: can't write FILENAME'
  572. A file was not able to be opened for writing.
  573. `APPLICATION-NAME: can't write to FILENAME'
  574. A file was successfully opened for writing but became impossible to
  575. write thereafter.
  576. `APPLICATION-NAME: can't spawn COMMAND'
  577. An attempt to execute a shell command on behalf of an interactive
  578. application failed during the `exp_popen()' call to the
  579. `libexpect' library.
  580. `APPLICATION-NAME: can't close FILENAME'
  581. A call to the standard C procedure `fclose()' failed due to
  582. unforeseen circumstances. The error is non-fatal but the file
  583. should be checked for missing data.
  584. 
  585. File: avram.info, Node: Overflow Errors, Next: File Format Errors, Prev: i/o Errors, Up: Diagnostics
  586. 1.6.3 Overflow Errors
  587. ---------------------
  588. These errors are reported by the application name prefacing one of the
  589. following messages, except as noted below.
  590. `APPLICATION-NAME: counter overflow (code NN)'
  591. An overflow occurred in an unsigned long integer being used as a
  592. reference counter or something similar. This situation is very
  593. unlikely.
  594. `APPLICATION-NAME: memory overflow (code NN)'
  595. There wasn't enough memory to build an internal data structure. The
  596. most likely cause is an attempt to operate on input files that are
  597. too large. Standard remedies apply.
  598. The memory overflow or counter overflow messages can also be reported
  599. without the application name preface or a code number. In these cases,
  600. they arise in the course of evaluating the function given by the
  601. application, rather than by loading the input files.
  602. A counter overflow in this case is possible if the application
  603. attempts to compute the size of a very large, shared structure using
  604. native integer arithmetic.
  605. Memory overflows are possible due to insufficient memory for a valid
  606. purpose, but may also occur due to a non-terminating recursion in the
  607. virtual machine code. To prevent thrashing or other bad effects from
  608. runaway code, the `ulimit' shell command is your friend.
  609. 
  610. File: avram.info, Node: File Format Errors, Next: Application Programming Errors, Prev: Overflow Errors, Up: Diagnostics
  611. 1.6.4 File Format Errors
  612. ------------------------
  613. Certain application crashes result from an application not adhering to
  614. the required conventions about data and file formats, or because the
  615. application was invoked in the wrong mode (*note Modes of Operation::).
  616. These are the following.
  617. `APPLICATION-NAME: invalid text format (code NN)'
  618. An application that was expected to return a string of characters
  619. to be written to a text file returned data that did not correspond
  620. to any valid character representation.
  621. `APPLICATION-NAME: null character in prompt'
  622. An interactive application (invoked rightly or wrongly with `-i',
  623. `-t', or `-s') is required to exchange strings of non-null
  624. characters internally with `avram', and used a null.
  625. `APPLICATION-NAME: invalid file name (code NN)'
  626. The data structure representing a file obtained from an application
  627. has a name consisting of something other than character strings.
  628. This error could be the result of a filter mode application (*note
  629. Filter Mode::) being invoked in parameter mode.
  630. (*note Parameter Mode::)
  631. `APPLICATION-NAME: null character in file name'
  632. Similar to the above errors.
  633. `APPLICATION-NAME: bad character in file name'
  634. Slashes, backslashes, and unprintable characters other than spaces
  635. are also prohibited in file names.
  636. `APPLICATION-NAME: invalid output preamble format'
  637. According the format used by `avram' for data files, a data file
  638. may contain an optional text portion, known as the preamble. This
  639. error occurs when a data file obtained from an application can not
  640. be written because the preamble is something other than a list of
  641. character strings.
  642. `APPLICATION-NAME: invalid file specification'
  643. This error occurs in situations where the data structure for a file
  644. obtained by evaluating the application is too broken to permit any
  645. more specific diagnosis.
  646. `avram: invalid raw file format in APPLICATION-NAME'
  647. The file containing the virtual machine code was not able to be
  648. loaded, because the code was not in a recognizable format. Either
  649. the file has become corrupted, the compiler that generated it has a
  650. bug in it, or the wrong file was used as a virtual code file.
  651. 
  652. File: avram.info, Node: Application Programming Errors, Next: Configuration Related Errors, Prev: File Format Errors, Up: Diagnostics
  653. 1.6.5 Application Programming Errors
  654. ------------------------------------
  655. A further class of application crashes results from miscellaneous bugs
  656. in the application. These require the application to be debugged and
  657. have no user level explanation or workaround, but are listed here for
  658. reference. These messages are not normally prefaced by the application
  659. name when reported unless the application elects to do so, except for
  660. the `invalid profile identifier' message.
  661. * `invalid recursion'
  662. * `invalid comparison'
  663. * `invalid deconstruction'
  664. * `invalid transpose'
  665. * `invalid membership'
  666. * `invalid distribution'
  667. * `invalid concatenation'
  668. * `invalid assignment'
  669. * `unrecognized combinator (code NN)'
  670. * `APPLICATION-NAME: invalid profile identifier'
  671. * `unsupported hook'
  672. 
  673. File: avram.info, Node: Configuration Related Errors, Next: Other Diagnostics and Warnings, Prev: Application Programming Errors, Up: Diagnostics
  674. 1.6.6 Configuration Related Errors
  675. ----------------------------------
  676. The source code distribution of `avram' incorporates a flexible
  677. configuration script allowing it to be installed on a variety of
  678. platforms. Not all platforms allow support for all features. It is also
  679. anticipated that new features may be added to `avram' from time to
  680. time. Some problems may therefore occur due to features not being
  681. supported at your site for either of these reasons. The following error
  682. messages are relevant to these situations.
  683. `unsupported hook'
  684. If it's not simply due to an application programming error (*note
  685. Application Programming Errors::) this message may be the result of
  686. trying to use an application that requires a newer version of
  687. `avram' than the one installed, even though applications should
  688. avoid this problem by checking the version number at run time. If
  689. this is the reason, the solution would be to install the latest
  690. version.
  691. `APPLICATION-NAME: I need avram linked with FOO, BAR and BAZ.'
  692. A message of the this form indicates that a new installation may be
  693. needed. At this writing (11/11/1), `avram' may report this message
  694. with respect to `libexpect5.32', `tcl8.3', and `libutil' if any of
  695. the `-i', `-t', or `-s' options is used on a system where not all
  696. of these libraries were detected when `avram' was installed from a
  697. source distribution. (See *note Parameter Mode Options::.)
  698. Because `avram' is useful even without interactive applications,
  699. these libraries are not considered absolute prerequisites by the
  700. configuration script.
  701. `avram: can't emulate version VERSION'
  702. The `--emulation=VERSION' option obviously won't work if the
  703. requested version is newer than the installed version, or if it is
  704. not a valid version number (*note General Options::). When that
  705. happens, this message is printed instead and `avram' terminates.
  706. `avram: multiple version specifications'
  707. The `--emulation=VERSION' option can be used at most once on a
  708. command line. This message is printed if it is used more than
  709. once. If you only typed it once and got this message, check your
  710. aliases and wrapper scripts before reporting a bug.
  711. `avram: unrecognized option: OPTION-NAME'
  712. may mean that a command line option has been misspelled, or may be
  713. another sign of an obsolete version of `avram'. This message will
  714. be followed by a usage summary similar to that of the `--help'
  715. option. (*note General Options::).
  716. `APPLICATION-NAME: warning: search paths not supported'
  717. If the `argz.h' header file was not detected during configuration,
  718. `avram' will not be able to support search paths in the
  719. `AVMINPUTS' environment variable (*note Environment::). This
  720. message is a warning that the environment variable is being
  721. ignored. If the warning is followed by an i/o error
  722. (*note i/o Errors::), the latter may be due to a file being in a
  723. path that was not searched for this reason. A workaround is to
  724. specify the full path names of all input files outside the current
  725. working directory. If you don't need search paths, you can get rid
  726. of this message by undefining `AVMINPUTS'.
  727. 
  728. File: avram.info, Node: Other Diagnostics and Warnings, Prev: Configuration Related Errors, Up: Diagnostics
  729. 1.6.7 Other Diagnostics and Warnings
  730. ------------------------------------
  731. `avram: multiple -.EXT options; all but last ignored'
  732. This message is written when more than one default extension is
  733. given as a command line parameter. At most one default extension
  734. is allowed. If more than one is given, only the last one is used.
  735. The error is non-fatal and `avram' will try to continue. If you
  736. need more than one default extension, consider using the hard
  737. coded default extensions of `.fun' and `.avm', or hacking the
  738. shell script in which the `avram' command line appears.
  739. `APPLICATION NAME: empty operator'
  740. This message probably means that the virtual code file is corrupt
  741. or invalid.
  742. usage summary
  743. For any errors in usage not covered by other diagnostics, such as
  744. incompatible combinations of options, `avram' prints a message to
  745. standard error giving a brief summary of options, similar to the
  746. output from `avram --help'. (See *note General Options::.)
  747. 
  748. File: avram.info, Node: Security, Next: Example Script, Prev: Diagnostics, Up: User Manual
  749. 1.7 Security
  750. ============
  751. A few obvious security considerations are relevant to running untrusted
  752. virtual code applications. These points are only as reliable as the
  753. assumption that the `avram' executable has not been modified to the
  754. contrary.
  755. * The applications with the best protection from malicious code are
  756. those that run in filter mode, because they have no access to any
  757. information not presented to them in standard input, nor the
  758. ability to affect anything other than the contents of standard
  759. output (provided that the `--jail' command line option is used).
  760. The worst they can do is use up a lot of memory, which can be
  761. prevented with the `ulimit' command. Unfortunately, not all
  762. applications are usable in this mode.
  763. * Parameter mode applications that do not involve the `-i', `-t' or
  764. `-s' options are almost as safe (also assuming `--jail'). They
  765. have (read-only) access to environment variables, and to the files
  766. that are indicated explicitly on the command line. If standard
  767. input is one of the files (as indicated by the use of `-' as a
  768. parameter), the virtual code application may infer the current
  769. date and time. However, a parameter mode application may write
  770. any file that the user has permission to write. The
  771. `--ask-to-overwrite' option should be used for better security, or
  772. at least the `--quiet' option should not be used. The virtual
  773. code can neither override nor detect the use of these options.
  774. * Interactive parameter mode applications (those that use either the `-i',
  775. `-t' or `-s' options) are the least secure because they can
  776. execute arbitrary shell commands on behalf of the user. This
  777. statement also applies to filter mode and parameter mode
  778. applications where the `--jail' option is not used. Use of
  779. `--step' is preferable to `-i' for making an audit trail of all
  780. commands executed, but the application could probably subvert it.
  781. The `--step' option may be slightly better because it can allow
  782. the user to inspect each command and interrupt it if appropriate.
  783. However, in most cases a command will not be displayed until it is
  784. already executed. Commands executed by non-interactive
  785. applications normally will display no output to that effect. A
  786. `chroot' environment may be the only secure way of running
  787. untrusted interactive applications.
  788. 
  789. File: avram.info, Node: Example Script, Next: Files, Prev: Security, Up: User Manual
  790. 1.8 Example Script
  791. ==================
  792. It is recommended that the application developer (or the compiler)
  793. package virtual machine code applications as shell scripts with the
  794. `avram' command line embedded in them. This style relieves the user of
  795. the need to remember the appropriate virtual machine options for
  796. invoking the application, which are always the same for a given
  797. application, or even to be aware of the virtual machine at all.
  798. Here is a script that performs a similar operation to the standard Unix
  799. `cat' utility.
  800. #!/bin/sh
  801. #\
  802. exec avram --force-text-input --default-to-stdin "$0" "$@"
  803. sKYQNTP\
  804. That is, it copies the contents of a file whose name is given on the
  805. command line to standard output, or copies standard input to standard
  806. output if no file name is given. This script can be marked executable (with
  807. `chmod') and run by any user with the directory of the `avram'
  808. executable in his or her `PATH' environment variable, even if `avram'
  809. had to be installed in a non-standard directory such as `~/bin'.
  810. The idea for this script is blatantly lifted from the `wish' manpage.
  811. The first line of the script invokes a shell to process what follows.
  812. The shell treats the second line as a comment and ignores it. Based on
  813. the third line, the shell invokes `avram' with the indicated options,
  814. the script itself as the next argument, and whatever command line
  815. parameters were initially supplied by the user as the remaining
  816. arguments. The rest of the script after that line is never processed by
  817. the shell.
  818. When `avram' attempts to load the shell script as a virtual machine
  819. code file, which happens as a result of it being executed by the shell,
  820. it treats the first line as a comment and ignores it. It also treats
  821. the second line as a comment, but takes heed of the trailing backslash,
  822. which is interpreted as a comment continuation character. It therefore
  823. also treats the third line as a comment and ignores it. Starting with
  824. the fourth line, it reads the virtual code, which is in a binary data
  825. format encoded with printable characters, and evaluates it.
  826. 
  827. File: avram.info, Node: Files, Next: Environment, Prev: Example Script, Up: User Manual
  828. 1.9 Files
  829. =========
  830. `./profile.txt'
  831. This file is written automatically by `avram' on behalf of
  832. applications that include profile annotations. It lists the number
  833. of invocations for each annotated part of the application, the
  834. total amount of time spent on it (in relative units), the average
  835. amount of time for each invocation, and the percentage of time
  836. relative to the remainder of the application. The exact format is
  837. undocumented and subject to change.
  838. 
  839. File: avram.info, Node: Environment, Next: Bugs, Prev: Files, Up: User Manual
  840. 1.10 Environment
  841. ================
  842. An environment variable `AVMINPUTS' can be made to store a list of
  843. directories (using the `set' or `export' commands) that `avram' will
  844. search for input files. The directories should be separated by colons,
  845. similarly to the `PATH' environment variable.
  846. The search paths in `AVMINPUTS' apply only to the names of input
  847. files given on the command line (*note Command Line Syntax::) when
  848. `avram' is invoked in parameter mode (*note Parameter Mode::). They do
  849. not apply to the name of the virtual code file, which is always assumed
  850. to be either absolute or relative to the current working directory
  851. (this assumption being preferable in the case of a script like that of
  852. *note Example Script::).
  853. Starting in the first directory in the list of `AVMINPUTS', `avram'
  854. searches for a file exactly as its name appears on the command line
  855. (subject to the expansion of special characters by the shell). If it is
  856. not found and the name does not contain a period, but a command line
  857. option of `-.EXT' has been used, `avram' will then search for a file
  858. with that name combined with the extension `.EXT'. If `-.EXT' has not
  859. been used or if no matching file is found with it, `avram' tries the
  860. extensions of `.avm' and `.fun' in that order, provided the given file
  861. name contained no periods. If no match is found for any of those cases,
  862. `avram' proceeds to search the next directory in the list obtained from
  863. `AVMINPUTS', and so on. It stops searching when the first match is
  864. found. For subsequent input files, the search begins again at the first
  865. directory.
  866. If `AVMINPUTS' is defined, the current working directory is not
  867. searched for input files unless it is listed. If it is empty or not defined,
  868. a default list of search paths is used, currently
  869. .:/usr/local/lib/avm:/usr/lib/avm:/lib/avm:/opt/avm:/opt/lib/avm\
  870. :/usr/local/share/avm:/usr/share/avm:/share/avm:/opt/avm:/opt/share/avm
  871. These paths are defined in `avram.c' and can be changed by recompiling
  872. it.
  873. 
  874. File: avram.info, Node: Bugs, Prev: Environment, Up: User Manual
  875. 1.11 Bugs
  876. =========
  877. There are no known bugs outstanding, except for any that may be
  878. inherent in the external library functions. However, `avram' has been
  879. used most extensively on GNU/Linux systems, and the prospect of
  880. portability issues with new or lesser used features on other systems
  881. can't be excluded.
  882. Though not observed in practice, it's theoretically possible to blow
  883. the stack by passing enough functions as arguments to library functions
  884. that pass more functions to library functions (e.g., by using nested
  885. calls to the gsl integration functions meant for a single variable to
  886. evaluate a very high dimensional multiple integral). In all other cases
  887. only dynamic heap storage or a constant amount of stack space is used.
  888. In particular, this issue is _not_ relevant to virtual code
  889. applications that don't use external libraries, or that don't pass
  890. functions to them as arguments.
  891. `avram' is designed to recover gracefully from memory overflows by
  892. always checking for `NULL' results from `malloc()' or otherwise
  893. trapping functions that allocate memory. In the event of an overflow,
  894. it conveys an appropriate error message to the virtual code application
  895. to be handled by the usual exception handling mechanisms. However,
  896. there is currently no way for a virtual code application to detect in
  897. advance whether sufficient memory is available, nor for it to resume
  898. normal operation once an exception occurs. Furthermore, it has been
  899. observed on some systems including Irix and 2.4 series Linux kernels
  900. that the `avram' process is killed automatically for attempting to
  901. allocate too much memory rather than given the chance to recover.
  902. Please send bug reports to <[email protected]> or file an
  903. issue on the Avram github page.
  904. 
  905. File: avram.info, Node: Virtual Machine Specification, Next: Library Reference, Prev: User Manual, Up: Top
  906. 2 Virtual Machine Specification
  907. *******************************
  908. This chapter contains a description of the virtual machine implemented
  909. by `avram', from the point of view of a person wishing to write a
  910. compiler that generates code for it. Before reading this chapter,
  911. readers should at least skim *note User Manual:: in order to see the big
  912. picture. Topics covered in this chapter include data representations,
  913. virtual code semantics, and file formats. A toy programming language is
  914. introduced for illustrative purposes. The sections in this chapter might
  915. not make sense if read out of order the first time through. The last
  916. section, *note Virtual Code Semantics::, contains many equations that
  917. may be difficult to read in the info or html renderings. The printed
  918. version is recommended for anyone who really wants to comprehend this
  919. material.
  920. * Menu:
  921. * Raw Material::
  922. * Concrete Syntax::
  923. * File Format::
  924. * Representation of Numeric and Textual Data::
  925. * Filter Mode Interface::
  926. * Parameter Mode Interface::
  927. * Virtual Code Semantics::
  928. 
  929. File: avram.info, Node: Raw Material, Next: Concrete Syntax, Prev: Virtual Machine Specification, Up: Virtual Machine Specification
  930. 2.1 Raw Material
  931. ================
  932. The purpose of this section is to instill some basic concepts about the
  933. way information is stored or communicated by the virtual machine, which
  934. may be necessary for an understanding of subsequent sections.
  935. The virtual machine represents both programs and data as members of a
  936. semantic domain that is straightforward to describe. Lisp users and
  937. functional programmers may recognize familiar concepts of atoms and lists
  938. in this description. However, these terms are avoided for the moment,
  939. in order to keep this presentation self contained and to prevent
  940. knowledgeable readers from inferring any unintended meanings.
  941. As a rule, it is preferable to avoid overspecifying any theoretical
  942. artifact. In this spirit, the set of entities with which the virtual
  943. machine is concerned can be defined purely in terms of the properties we
  944. need it to have.
  945. _A distinguished element_
  946. A particular element of the set is designated, arbitrarily or
  947. otherwise, as a distinguished element. Given any element of the
  948. set, it is always possible to decide whether or not it is the
  949. distinguished element. The set is non-empty and such an element
  950. exists.
  951. _A binary operator_
  952. A map from pairs of elements of the set to elements of the set
  953. exists and meets these conditions.
  954. * It associates a _unique_ element of the set with any given
  955. ordered pair of elements from the set.
  956. * It does not associate the distinguished element with any pair
  957. of elements.
  958. For the sake of concreteness, an additional constraint is needed:
  959. _the set has no proper subset satisfying the above conditions_. Any
  960. number of constructions remain within these criteria, but there is no
  961. need to restrict them further, because they are all equivalent for our
  962. purposes.
  963. To see that these properties provide all the structure we need for
  964. general purpose computation, we may suppose some given set `S' and an
  965. operator `cons' having them are fixed, and infer the following points.
  966. * `S' contains at least one element, the distinguished element. Call
  967. it `nil'.
  968. * The pair `(nil,nil)' is a pair of elements of `S', so there must
  969. be an element of `S' that `cons' associates with it. We can denote
  970. this element `cons(nil,nil)'.
  971. * As no pair of elements is associated with the distinguished
  972. element, `cons(nil,nil)' must differ from `nil', so `S' contains
  973. at least two distinct elements.
  974. * The pair `(nil,cons(nil,nil))' therefore differs from `(nil,nil)',
  975. but because it is yet another pair of elements from `S', there
  976. must be an element associated with it by the operator. We can
  977. denote this element as `cons(nil,cons(nil,nil))'.
  978. * Inasmuch as the operator associates every pair of elements with a
  979. _unique_ element, `cons(nil,cons(nil,nil))' must differ from the
  980. element associated with any other pair of elements, so it must
  981. differ from `cons(nil,nil)', and we conclude that `nil',
  982. `cons(nil,nil)' and `cons(nil,cons(nil,nil))' constitute three
  983. distinct elements of the set `S'.
  984. * By defining `cons(cons(nil,nil),nil)' and
  985. `cons(cons(nil,nil),cons(nil,nil))' analogously and following a
  986. similar line of reasoning, one may establish the existence of two
  987. more distinct elements of `S'.
  988. It is not difficult to see that an argument in more general terms
  989. could show that the inclusion of infinitely many elements in `S' is
  990. mandated by the properties of the `cons' operator. Furthermore, every
  991. element of `S' other than `nil' owes its inclusion to being associated
  992. with some other pair of elements by `cons', because if it were not, its
  993. exclusion would permit a proper subset of `S' to meet all of the above
  994. conditions. We can conclude that `S' contains exactly `nil' and the
  995. countable infinitude of elements of the form `cons(x,y)', where `x' and
  996. `y' are either `nil' or something of the form `cons(...)' themselves.
  997. Some specific examples of sets and operators that have the required
  998. properties are as follows.
  999. * the set of natural numbers, with `0' as the distinguished element,
  1000. and the `cons' operator defined by `cons(X,Y) = ((X+Y)(X+Y+1))/2 +
  1001. Y + 1'
  1002. * a set of balanced strings of parentheses, with `()' as the
  1003. distinguished element, and `cons' defined as string concatenation
  1004. followed by enclosure in parentheses
  1005. * a set of ordered binary trees, with the empty tree as the
  1006. distinguished element, and the `cons' operator as that which takes
  1007. an ordered pair of trees to the tree having them as its descendents
  1008. * a set containing only its own Cartesian product and an arbitrary
  1009. but fixed element `nil', with `cons' being the identity function
  1010. Each of these models may suggest a different implementation, some of
  1011. which are more practical than others. The remainder of this document is
  1012. phrased somewhat imprecisely in terms of a combination of the latter
  1013. two. The nature of the set in question is not considered further, and
  1014. elements of the set are described as "trees" or "lists". The distinguished
  1015. element is denoted by `nil' and the operator by `cons'. Where no
  1016. ambiguity results, `cons(x,y)' may be written simply as `(x,y)'. These
  1017. terms should not be seen as constraints on the implementation.
  1018. 
  1019. File: avram.info, Node: Concrete Syntax, Next: File Format, Prev: Raw Material, Up: Virtual Machine Specification
  1020. 2.2 Concrete Syntax
  1021. ===================
  1022. The previous section has developed a basic vocabulary for statements
  1023. such as "the virtual machine code for the identity function is `(nil,(nil,nil))'",
  1024. which are elaborated extensively in the subsequent sections on code and
  1025. data formats. However, a description in this style would be inadequate
  1026. without an explanation of how such an entity as `(nil,(nil,nil))' is
  1027. communicated to `avram' in a virtual machine code file. The purpose of
  1028. this section is to fill the gap by explaining exactly how any given
  1029. tree would be transformed to its concrete representation.
  1030. The syntax is based on a conversion of the trees to bit strings, followed
  1031. by grouping the bits into blocks of six, which are then encoded by
  1032. printable characters. Although anyone is free to modify `avram', it is
  1033. recommended that the concrete syntax described here be maintained for
  1034. the sake of portability of virtual machine code applications.
  1035. Building a tree by reading the data from a file requires a more
  1036. difficult algorithm than the one presented in this section, and is not
  1037. considered because it's not strictly necessary for a compiler.
  1038. Procedures for both reading and writing are available to C and C++
  1039. users as part of the `avram' library, and are also easily invoked on
  1040. the virtual code level.
  1041. * Menu:
  1042. * Bit String Encoding::
  1043. * Blocking::
  1044. 
  1045. File: avram.info, Node: Bit String Encoding, Next: Blocking, Prev: Concrete Syntax, Up: Concrete Syntax
  1046. 2.2.1 Bit String Encoding
  1047. -------------------------
  1048. The conversion from trees to bit strings might have been done in several ways,
  1049. perhaps the most obvious being based on a preorder traversal with each
  1050. vertex printed as it is traversed. By this method, the entire encoding
  1051. of the left descendent would precede that of the right in the bit
  1052. string. This alternative is therefore rejected because it imposes
  1053. unnecessary serialization on communication.
  1054. It is preferable for the encodings of both descendents of a tree to
  1055. be interleaved to allow concurrent transmission. Although there is
  1056. presently no distributed implementation of the virtual machine and hence none
  1057. that takes advantage of this possibility, it is better to plan ahead
  1058. than to be faced with backward compatibility problems later.
  1059. The preferred algorithm for encoding a tree as a bit string employs a
  1060. queue. The queue contains trees and allows them to be processed in a first-in
  1061. first-out order. Intuitively, the algorithm works by traversing the
  1062. tree in level order. To print a tree `T' as a string of `1's and `0's,
  1063. it performs the following steps.
  1064. Initialize the queue to contain only `T'
  1065. while the queue is not empty do
  1066. if the front element of the queue is `nil' then
  1067. print `0'
  1068. else if the front element of the queue is of the form `cons(x,y)' then
  1069. print `1'
  1070. append `x' to the back of the queue
  1071. append `y' to the back of the queue
  1072. end if
  1073. remove the front element of the queue
  1074. end while
  1075. This algorithm presupposes that any given tree `cons(x,y)' can be
  1076. "deconstructed" to obtain `x' and `y'. The computability of such an
  1077. operation is assured in theory by the uniqueness property of the `cons'
  1078. operator, regardless of the representation chosen. If the trees are
  1079. implemented with pointers in the obvious way, their deconstruction is a
  1080. trivial constant time operation.
  1081. As an example, running the following tree through the above algorithm
  1082. results in the bit string
  1083. `111111101011110010001001100010100010100100100'.
  1084. cons(
  1085. cons(
  1086. cons(nil,cons(nil,cons(nil,nil))),
  1087. cons(nil,cons(nil,nil))),
  1088. cons(
  1089. cons(
  1090. cons(nil,cons(nil,cons(nil,cons(nil,nil)))),
  1091. cons(nil,nil)),
  1092. cons(
  1093. cons(
  1094. cons(nil,cons(nil,cons(cons(nil,cons(nil,nil)),nil))),
  1095. cons(nil,nil)),
  1096. nil)))
  1097. 
  1098. File: avram.info, Node: Blocking, Prev: Bit String Encoding, Up: Concrete Syntax
  1099. 2.2.2 Blocking
  1100. --------------
  1101. After the bit string is obtained as described above, it is grouped into
  1102. blocks of six. Continuing with the example, the string
  1103. 111111101011110010001001100010100010100100100
  1104. would be grouped as
  1105. 111111 101011 110010 001001 100010 100010 100100 100
  1106. Because the number of bits isn't a multiple of six, the last group has
  1107. to be padded with zeros, to give
  1108. 111111 101011 110010 001001 100010 100010 100100 100000
  1109. Each of these six bit substrings is then treated as a binary number,
  1110. with the most significant bit on the left. The numbers expressed in
  1111. decimal are
  1112. 63 43 50 9 34 34 36 32
  1113. The character codes for the characters to be written are obtained by
  1114. adding sixty to each of these numbers, so as to ensure that they will be
  1115. printable characters. The resulting character codes are
  1116. 123 103 110 69 94 94 96 92
  1117. which implies that the tree in the example could be written to a file as
  1118. `{gnE^^`\'.
  1119. 
  1120. File: avram.info, Node: File Format, Next: Representation of Numeric and Textual Data, Prev: Concrete Syntax, Up: Virtual Machine Specification
  1121. 2.3 File Format
  1122. ===============
  1123. A virtual code file consists of an optional text preamble, followed by
  1124. the concrete representation for a tree. The latter uses the syntax
  1125. described in the previous section. The purpose of this section is to
  1126. specify the remaining details of the file format.
  1127. The format for virtual code files may also be used for other purposes
  1128. by virtual code applications, as it is automatically detected and parsed
  1129. by `avram' when used in an input file, and can be automatically written
  1130. to output files at the discretion of the application.
  1131. Other than virtual code files, input files not conforming to this
  1132. format are not an error as far as `avram' is concerned, because they are assumed
  1133. to be text files. Applications can detect in virtual code the
  1134. assumption that is made and report an error if appropriate.
  1135. Although the data file format includes no checksums or other explicit methods
  1136. of error detection, the concrete syntax itself provides a good measure
  1137. of protection against undetected errors. The probability is vanishingly
  1138. small that a random alteration to any valid encoding leaves it intact,
  1139. because every bit in the sequence either mandates or prohibits the
  1140. occurrence of two more bits somewhere after it. Errors in different
  1141. parts of the file would have to be consistent with one another to go
  1142. unnoticed.
  1143. * Menu:
  1144. * Preamble Section::
  1145. * Data Section::
  1146. 
  1147. File: avram.info, Node: Preamble Section, Next: Data Section, Prev: File Format, Up: File Format
  1148. 2.3.1 Preamble Section
  1149. ----------------------
  1150. * A file may contain at most one preamble.
  1151. * The preamble, if any, is a consecutive sequence of lines beginning
  1152. with the first line in the file.
  1153. * The first line of the preamble must begin with a hash (`#')
  1154. character.
  1155. * Subsequent lines of the preamble must either begin with a hash, or
  1156. immediately follow a line that ends with a backslash (`\')
  1157. character (or both).
  1158. 
  1159. File: avram.info, Node: Data Section, Prev: Preamble Section, Up: File Format
  1160. 2.3.2 Data Section
  1161. ------------------
  1162. * The data or virtual code section of the file begins on the first
  1163. line of the file that isn't part of the preamble.
  1164. * The data section may not contain any hashes, white space, or other
  1165. extraneous characters other than line breaks.
  1166. * If line breaks are ignored, the data section contains a sequence
  1167. of characters expressing a single tree in the concrete syntax
  1168. described in *note Concrete Syntax::.
  1169. 
  1170. File: avram.info, Node: Representation of Numeric and Textual Data, Next: Filter Mode Interface, Prev: File Format, Up: Virtual Machine Specification
  1171. 2.4 Representation of Numeric and Textual Data
  1172. ==============================================
  1173. As noted already, virtual code applications are specified by functions
  1174. operating on elements of a set having the properties described in *note
  1175. Raw Material::, which are convenient to envision as ordered binary
  1176. trees or pairs of `nil'. However, virtual code applications normally
  1177. deal with numeric or textual data, for example when they refer to the
  1178. contents of a text file. It is therefore necessary for the application
  1179. and the virtual machine emulator to agree on a way of describing textual
  1180. or numeric data with these trees.
  1181. The purpose of this section is to explain the basic data structures
  1182. used in the exchange of information between `avram' and a virtual code
  1183. application. For example, an explanation is needed for statements like
  1184. "an application invoked with the `--baz' option is expected to return a
  1185. pair `(FOO,BAR)', where `FOO' is a list of character strings ...", that
  1186. are made subsequently in this document. Such statements should be
  1187. understood as referring to the trees representing the pairs, lists,
  1188. character strings, etc., according to the conventions explained below.
  1189. _Characters_
  1190. An arbitrarily chosen set of 256 trees is used to represent the
  1191. character set. They are listed in *note Character Table::. For
  1192. example, the letter `A' is represented by
  1193. `(nil,(((nil,(nil,(nil,nil))),nil),(nil,nil)))'. That means that
  1194. when an application wants the letter `A' written to a text file, it
  1195. returns something with this tree in it.
  1196. _Booleans_
  1197. The value of `false' is represented by `nil', and the value of
  1198. `true' is represented by `(nil,nil)'.
  1199. _Pairs_
  1200. Given any two items of data X1 and X2, having the respective
  1201. representations R1 and R2, the pair `(X1,X2)' has the
  1202. representation `cons(R1,R2)'.
  1203. _Lists_
  1204. A list of the items X1, X2 ... XN with respective representations
  1205. R1 through RN is represented by the tree
  1206. `cons(R1,cons(R2...cons(RN,nil)...))'. In other words, lists are
  1207. represented as pairs whose left sides are the heads and whose
  1208. right sides are the tails. The empty list is identified with
  1209. `nil'. Lists of arbitrary finite length can be accommodated.
  1210. _Naturals_
  1211. A number of the form `B0 + 2B1 + 4B2 + ... + 2^n BN', where each
  1212. `Bi' is `0' or `1', is represented by a tree of the form
  1213. `cons(T0,cons(T1...cons(TN,nil)...))' where each `Ti' is `nil' if
  1214. the corresponding `Bi' is `0', and `(nil,nil)' otherwise. Note that
  1215. the numbers `Bi' are exactly the bits written in the binary
  1216. expansion of the number, with `B0' being the least significant bit.
  1217. _Strings_
  1218. are represented as lists of characters.
  1219. `avram' imposes no more of a "type discipline" than necessary to a
  1220. workable interface between it and an application. This selection of
  1221. types and constructors should not be seen as constraining what a
  1222. compiler writer may wish to have in a source language.
  1223. 
  1224. File: avram.info, Node: Filter Mode Interface, Next: Parameter Mode Interface, Prev: Representation of Numeric and Textual Data, Up: Virtual Machine Specification
  1225. 2.5 Filter Mode Interface
  1226. =========================
  1227. From the point of view of the application developer or compiler writer,
  1228. there are parameter mode applications, which are discussed in *note
  1229. Parameter Mode Interface::, and filter mode applications, which are
  1230. discussed in this section. Of the latter, there are mainly three kinds:
  1231. those that read one character at a time, those that read a line at a
  1232. time, and those that read the whole standard input file at once. Each
  1233. of them is invoked with different options and expected to follow
  1234. different calling conventions. This section summarizes these
  1235. conventions.
  1236. * Menu:
  1237. * Loading All of Standard Input at Once::
  1238. * Line Maps::
  1239. * Byte Transducers::
  1240. 
  1241. File: avram.info, Node: Loading All of Standard Input at Once, Next: Line Maps, Prev: Filter Mode Interface, Up: Filter Mode Interface
  1242. 2.5.1 Loading All of Standard Input at Once
  1243. -------------------------------------------
  1244. Unless `--line-map' or `--byte-transducer' is used as a command line
  1245. option when the application is invoked, the contents of standard input
  1246. are loaded entirely into memory by `avram' before evaluation of the
  1247. virtual code begins. This interface is obviously not appropriate for
  1248. infinite streams.
  1249. The virtual code application in this mode of operation is treated as
  1250. a single function taking the entire contents of standard input as its
  1251. argument, and returning the entire contents of standard output as its
  1252. result. Hence, this interface is one of the simplest available.
  1253. * Menu:
  1254. * Standard Input Representation::
  1255. * Standard Output Representation::
  1256. 
  1257. File: avram.info, Node: Standard Input Representation, Next: Standard Output Representation, Prev: Loading All of Standard Input at Once, Up: Loading All of Standard Input at Once
  1258. 2.5.1.1 Standard Input Representation
  1259. .....................................
  1260. The representation for the standard input file used as the argument to
  1261. the function depends both on the file format and on the command line
  1262. options specified when the application is invoked. The `--unparameterized'
  1263. and `--raw-output' options make no difference to the input
  1264. representation, and the `--line-map' and `--byte-transducer' options
  1265. are not relevant to this mode of operation. That leaves four possible
  1266. combined settings of the `--choice-of-output' and `--force-text-input'
  1267. options. If standard input conforms to the data file format
  1268. specification *note File Format::, the following effects are possible.
  1269. * If neither `--choice-of-output' nor `--force-text-input' is used,
  1270. the argument to the function will be given directly by the tree
  1271. encoded in the data section of the file. The preamble of the file
  1272. will be ignored.
  1273. * If the `--choice-of-output' option is used and the
  1274. `--force-text-input' option is not used, the argument to the
  1275. function will be a pair `(PREAMBLE,CONTENTS)', where PREAMBLE is a
  1276. list of character strings taken from the preamble of the file
  1277. (with leading hashes stripped), and CONTENTS is the tree
  1278. represented in the data section of the file.
  1279. * If the `--choice-of-output' option is not used and the
  1280. `--force-text-input' option is used, the argument to the function
  1281. will be the whole file as a list of character strings. I.e., both
  1282. the preamble and the data sections are included, hashes are not
  1283. stripped from the preamble, and the data section is not converted
  1284. to the tree it represents.
  1285. * If the `--choice-of-output' option is used and the
  1286. `--force-text-input' option is also used, the argument to the the
  1287. function will be a pair `(nil,CONTENTS)', where the contents are
  1288. the list of character strings as in the previous case.
  1289. If standard input does not conform to the data file format
  1290. specification in *note File Format::, then it is assumed to be a text
  1291. file. The `--force-text-input' option makes no difference, and there are
  1292. only two possible effects, depending on whether `--choice-of-output' is
  1293. used. They correspond to the latter two cases above, where
  1294. `--force-text-input' is used.
  1295. The idea of the `--choice-of-output' option is that it is used only
  1296. for applications that are smart enough to be aware of the
  1297. `(PREAMBLE,CONTENTS)' convention. A non-empty preamble implies a data
  1298. file whose contents could be any type, but an empty preamble implies a
  1299. text file whose contents can only be a list of character strings. (In
  1300. the case of a data file with no preamble, the list of the empty string
  1301. is used for the preamble to distinguish it from a text file.)
  1302. Dumb applications that never want to deal with anything but text
  1303. files should be invoked with `--force-text-input'. Otherwise, they have
  1304. to be prepared for either text or data as arguments.
  1305. The use of both options at once is unproductive as far as the input
  1306. format is concerned, but may be justified when the output is to be a
  1307. data file and the input is text only.
  1308. 
  1309. File: avram.info, Node: Standard Output Representation, Prev: Standard Input Representation, Up: Loading All of Standard Input at Once
  1310. 2.5.1.2 Standard Output Representation
  1311. ......................................
  1312. As in the case of standard input, the representation for standard output
  1313. that the function is expected to return depends on the command line
  1314. options with which the application is invoked. The only relevant options
  1315. are `--raw-output' and `--choice-of-output', which are mutually
  1316. exclusive.
  1317. * If neither option is selected, the result returned by the function
  1318. must be a list of character strings.
  1319. * If `--raw-output' is used, the result returned by the function is
  1320. unconstrained, and it will be written as a data file with no
  1321. preamble, following the format specified in *note File Format::.
  1322. * If `--choice-of-output' is used, the result returned by the
  1323. function must be a pair `(PREAMBLE,CONTENTS)'.
  1324. In the last case, the preamble determines how the file will be
  1325. written. If it is meant to be a text file, the preamble should be
  1326. `nil', and the contents should be a list of character strings. If it is
  1327. meant to be a data file, the preamble should be a non-empty list of
  1328. character strings, and the format of the contents is unconstrained. To
  1329. express a data file with no preamble, the preamble should be the list
  1330. containing the empty string, rather than being empty.
  1331. In the result returned by the function, the preamble lines should not
  1332. include leading hash characters, because they are automatically added to
  1333. the output to enforce consistency with the data file format. However,
  1334. they should include trailing backslashes as continuation characters
  1335. where appropriate. The hashes that are automatically added will be
  1336. automatically stripped by `avram' on behalf of whatever application
  1337. uses the file.
  1338. Any file can be written as a list of character strings, even "text"
  1339. files that are full of unprintable characters, and even "text" files
  1340. that happen to conform to the format used for data files. However, if
  1341. the application intends to write a data file in the standard format used
  1342. by other virtual code applications, it can do so more quickly and easily
  1343. by having the virtual machine do the formatting automatically with the
  1344. `--choice-of-output' option than by implementing the algorithm in *note
  1345. Concrete Syntax::, from scratch in virtual code.
  1346. 
  1347. File: avram.info, Node: Line Maps, Next: Byte Transducers, Prev: Loading All of Standard Input at Once, Up: Filter Mode Interface
  1348. 2.5.2 Line Maps
  1349. ---------------
  1350. Virtual code applications invoked with the `--line-map' option (with or
  1351. without the `--unparameterized' option) adhere to a very simple
  1352. interface.
  1353. * The argument to the function is a character string, and the result
  1354. must also be a character string.
  1355. * The function is applied to each line of the standard input file and
  1356. the result in each case is written to standard output followed by a line
  1357. break.
  1358. This kind of application may be used on finite or infinite streams,
  1359. provided that the lengths of the lines are finite, but preserves no
  1360. state information from one line to the next.
  1361. 
  1362. File: avram.info, Node: Byte Transducers, Prev: Line Maps, Up: Filter Mode Interface
  1363. 2.5.3 Byte Transducers
  1364. ----------------------
  1365. The interface used when the `--byte-transducer' option is selected allows
  1366. an application to serve as a persistent stream processor suitable for
  1367. finite or infinite streams. The interface can be summarized by the
  1368. following points.
  1369. * When it is first invoked, the function in the virtual code file is
  1370. applied to an argument of `nil', and is expected to return a pair
  1371. `(STATE,OUTPUT)'. The STATE format is unconstrained. The OUTPUT
  1372. must be a character string that will be written to standard
  1373. output, but it may be the empty string.
  1374. * For each byte read from standard input, `avram' applies the
  1375. function to the pair `(STATE,CHARACTER)', using the state obtained
  1376. from previous evaluation, and the character whose code is the
  1377. byte. The purpose of the STATE field is therefore to provide a way
  1378. for the application to remember something from one invocation to
  1379. the next.
  1380. * The function is usually expected to return a pair `(STATE,OUTPUT)'
  1381. for each input byte, so that the state can be used on the next
  1382. iteration, and the output can be written to standard output as a
  1383. character string.
  1384. * If the function ever returns a value of `nil', the computation
  1385. terminates.
  1386. * If standard input comes to an end before the computation
  1387. terminates, the function will be applied to a pair of the form
  1388. `(STATE,nil)' thereafter, but may continue to return
  1389. `(STATE,OUTPUT)' pairs for arbitrarily many more iterations. The
  1390. `EOF' character is not explicitly passed to the function, but the
  1391. end is detectable insofar as `nil' is not a representation for any
  1392. character.
  1393. Unlike the situation with line maps, the output character strings do
  1394. not have line breaks automatically appended, and the application must
  1395. include them explicitly if required. The convention for line breaks is
  1396. system dependent. On Unix and GNU/Linux systems, character code 10
  1397. indicates a line break, but other systems may use character code 13
  1398. followed by character code 10. See *note Character Table:: for the representations
  1399. of characters having these codes.
  1400. 
  1401. File: avram.info, Node: Parameter Mode Interface, Next: Virtual Code Semantics, Prev: Filter Mode Interface, Up: Virtual Machine Specification
  1402. 2.6 Parameter Mode Interface
  1403. ============================
  1404. The virtual code file for a parameter mode application contains a tree
  1405. representing a single function, which takes as its argument a data
  1406. structure in the format described below. The format of the result
  1407. returned by the function depends on the virtual machine options used on
  1408. the command line, and the various alternatives are explained
  1409. subsequently.
  1410. * Menu:
  1411. * Input Data Structure::
  1412. * Input for Mapped Applications::
  1413. * Output From Non-interactive Applications::
  1414. * Output From Interactive Applications::
  1415. 
  1416. File: avram.info, Node: Input Data Structure, Next: Input for Mapped Applications, Prev: Parameter Mode Interface, Up: Parameter Mode Interface
  1417. 2.6.1 Input Data Structure
  1418. --------------------------
  1419. The data structure that is used as the argument to the parameter mode
  1420. application incorporates all the information about the command line and the
  1421. environment variables. It is in the form of a triple
  1422. `((FILES,OPTIONS),ENVIRONS)'. The fields have these interpretations.
  1423. FILES
  1424. is a list of quadruples `((DATE,PATH),(PREAMBLE,CONTENTS))', with
  1425. one quadruple for each input file named on the command line (but
  1426. not the virtual code file or the `avram' executable). The list
  1427. will be in the same order as the filenames on the command line,
  1428. and is not affected by options interspersed with them. The fields
  1429. in each item have the following interpretations.
  1430. DATE
  1431. is the time stamp of the file in as a character string in the
  1432. usual Unix format, for example, `Fri Jan 19 14:34:44 GMT
  1433. 2001'. If the file corresponds to standard input, the current
  1434. system time and date are used.
  1435. PATH
  1436. is the full path of the file, expressed as a list of strings.
  1437. If the file corresponds to standard input, the list is empty.
  1438. Otherwise, the first string in the list is the file name. The
  1439. next is the name of the file's parent directory, if any. The
  1440. next is the parent of the parent, and so on. The root
  1441. directory is indicated by the empty string, so that any path
  1442. ending with the empty string is an absolute path, while any
  1443. path ending with a non-empty string is relative to the
  1444. current working directory. Path separators (i.e., slashes)
  1445. are omitted.
  1446. PREAMBLE
  1447. In the case of a text file, or any file not conforming to the
  1448. format in *note File Format::, this field is `nil'.
  1449. Otherwise, this field contains the preamble of the file
  1450. expressed as a list of strings, or contains just the empty
  1451. string if the file has no preamble. Any leading hashes in the
  1452. preamble of the file are stripped.
  1453. CONTENTS
  1454. In the case of a text file (as indicated by the PREAMBLE
  1455. field), this field will contain a list of character strings,
  1456. with each line of the file contained in a character string.
  1457. Otherwise, it can contain data in any format, which are
  1458. obtained by converting the data section of the file to a tree.
  1459. OPTIONS
  1460. is a list of quadruples of the form
  1461. `((POSITION,LONGFORM),(KEYWORD,PARAMS))' with one quadruple for
  1462. each option appearing on the command line after the name of the
  1463. virtual code file.
  1464. POSITION
  1465. is a natural number indicating the position of the option on
  1466. the command line. The position numbers of all the options
  1467. will form an ascending sequence, but not necessarily
  1468. consecutive nor starting with zero. The missing numbers in
  1469. the sequence will correspond to the positions of the file
  1470. names on the command line, allowing their positions to be
  1471. inferred by applications for which the position matters.
  1472. LONGFORM
  1473. is a boolean value which is true if the option starts with
  1474. two or more dashes but false otherwise.
  1475. KEYWORD
  1476. is the key word of the option expressed as a character
  1477. string. For example in the case of a command line option
  1478. `--foo=bar,baz', the keyword is `foo'. Leading dashes are
  1479. stripped.
  1480. PARAMS
  1481. is a list of character strings identifying the parameters for
  1482. the command line option in question. In the case of an option
  1483. of the form `--foo=bar,baz', the first character string in
  1484. the list will be `bar' and the next will be `baz'. The same
  1485. applies if the option is written `--foo bar,baz' or `--foo
  1486. =bar,baz'. If there are no parameters associated with the
  1487. option, the list is empty.
  1488. ENVIRONS
  1489. is a list of pairs of character strings, with one pair in the list
  1490. for each environment variable. The identifier is the left string
  1491. in the pair, and the value is the right. For example, if the
  1492. environment contains the definition `OSTYPE=linux-gnu', there will
  1493. be a pair in the list whose left side is the string `OSTYPE' and
  1494. whose right side is the string `linux-gnu'.
  1495. 
  1496. File: avram.info, Node: Input for Mapped Applications, Next: Output From Non-interactive Applications, Prev: Input Data Structure, Up: Parameter Mode Interface
  1497. 2.6.2 Input for Mapped Applications
  1498. -----------------------------------
  1499. Applications invoked using the `--map-to-each-file' option benefit from
  1500. a slightly different interface than the one described above. As the
  1501. purpose of this option is to save memory by loading only one file at a
  1502. time, the application does not have access to all input files named on
  1503. the command line simultaneously within the same data structure.
  1504. Although the data structure is of the type already described, the FILES
  1505. field is not a list of arbitrary length. Instead, it is a list
  1506. containing exactly one item for an input file. If `-' is used as a
  1507. command line parameter, indicating standard input, then FILES will have
  1508. another item pertaining to standard input. In no event will it have
  1509. other than one or two items.
  1510. The mapped application is expected to work by being applied
  1511. individually to each of any number of separately constructed data
  1512. structures, doing the same in each case as it would if that case were
  1513. the only one. To make that possible, copies of the environment
  1514. variables, the contents of standard input, and the list of application
  1515. specific options are contained in the data structure used for every
  1516. invocation.
  1517. The position numbers in the options are adjusted for each invocation
  1518. to reflect the position of the particular input file associated with it.
  1519. For example, in the following command
  1520. `avram --map-to-each-file mapster.avm fa.txt --data fb.dat --code fc.o'
  1521. the function in the virtual code file `mapster.avm' would be applied
  1522. to each of three data structures, corresponding to the commands
  1523. `avram mapster.avm fa.txt --data --code'
  1524. `avram mapster.avm --data fb.dat --code'
  1525. `avram mapster.avm --data --code fc.o'
  1526. If the relative positions of the options and filenames were important to
  1527. the application, they could be reliably inferred from the position
  1528. numbers. In the first case, they would be 1 and 2, implying that the
  1529. file is in position 0. In the second case they would be 0 and 2,
  1530. implying that the file is in position 1, and in the third case they
  1531. would be 0 and 1, implying that the file is in position 2. (Of course,
  1532. nothing compels an application to concern itself with the positions of
  1533. its parameters, and the alternative might be preferable.)
  1534. For the most part, any application that can operate on one file at a
  1535. time without needing information from any others can be executed more
  1536. economically with the `--map-to-each-file' option and few if any
  1537. changes to the code. The effect will normally be analogous to the above
  1538. example, subject to a few possible differences.
  1539. * If an application is supposed to do something by default when
  1540. there are no file parameters or only standard input, it won't work
  1541. as a mapped application, because if there are no file parameters
  1542. it won't be executed at all.
  1543. * If a mapped application causes any output files to be generated,
  1544. they may be written before other input files are read, possibly
  1545. causing the input files to be overwritten if they have the same
  1546. names, and causing subsequent invocations to use the overwritten
  1547. versions. This behavior differs from that of loading all input
  1548. files at the outset, which ensures the application seeing all of
  1549. the original versions. The latter may be more convenient for
  1550. maintaining a group of files in some sort of consistent state.
  1551. * If an application causes standard output to be written along with
  1552. output files, normally standard output is written last as a
  1553. security measure against malicious code altering the
  1554. `--ask-to-overwrite' prompts by subtly clobbering the console. In
  1555. a mapped application, standard output isn't always last because
  1556. there may be more invocations to come.
  1557. 
  1558. File: avram.info, Node: Output From Non-interactive Applications, Next: Output From Interactive Applications, Prev: Input for Mapped Applications, Up: Parameter Mode Interface
  1559. 2.6.3 Output From Non-interactive Applications
  1560. ----------------------------------------------
  1561. If a parameter mode application is not invoked with either of the
  1562. `--interactive' or `--step' options, then it is deemed to be
  1563. non-interactive, and therefore does not concern itself with executing
  1564. shell commands. Instead, it simply specifies a list of output files to
  1565. be created or updated on its behalf by `avram'.
  1566. The files are described by a list of quadruples
  1567. `((OVERWRITE,PATH),(PREAMBLE,CONTENTS))', with one quadruple for each
  1568. file.
  1569. In each quadruple, the PATH, PREAMBLE, and CONTENTS fields have the
  1570. same interpretations as in the list of files in the input data
  1571. structure described in *note Input Data Structure::, except that a
  1572. `nil' path refers to standard output rather than to standard input.
  1573. The OVERWRITE field in each quadruple tells whether the file should
  1574. be overwritten or appended. If the OVERWRITE field is `nil' (i.e., the
  1575. representation for the Boolean value of `false') and a file already
  1576. exists at the given path, the new contents will be appended to it. If
  1577. the OVERWRITE field is anything other than `nil' and/or no file exists
  1578. with the given path, a new file is written or the extant one is
  1579. overwritten. Note that the data file format specified in *note File
  1580. Format:: precludes appending anything to it, but the format of existing
  1581. output files is not checked and nothing prevents data or text from
  1582. being appended to one.
  1583. 
  1584. File: avram.info, Node: Output From Interactive Applications, Prev: Output From Non-interactive Applications, Up: Parameter Mode Interface
  1585. 2.6.4 Output From Interactive Applications
  1586. ------------------------------------------
  1587. Parameter mode applications invoked with either of the `--interactive'
  1588. or `--step' options are required to take the data structure described
  1589. in *note Input Data Structure:: as an argument but to return the
  1590. virtual code for a function that will observe a certain protocol
  1591. allowing shell commands to be executed on its behalf. The intent is
  1592. that the virtual code file doesn't contain the real application _per
  1593. se_, but only something that builds the real one on the fly using
  1594. configuration information from the input files and command line options.
  1595. The format of the result returned by an interactive application,
  1596. being a virtual code application itself, requires a full exposition of
  1597. the virtual machine code semantics. This subject is deferred to *note
  1598. Virtual Code Semantics::. The remainder of this section describes the
  1599. protocol followed by the function returned by the interactive
  1600. application rather than the application itself.
  1601. Similarly to the case of a byte transducer described in *note Byte
  1602. Transducers::, the basic pattern of interaction between `avram' and the
  1603. function is a cycle of invocations. In general terms, the function is
  1604. applied to a `nil' argument initially, and expected to return an
  1605. initial state and initial output. Thereafter, the function is applied
  1606. to a pair of the state returned on the previous iteration, and the next
  1607. installment of input. The function returns further output and a new
  1608. state, and the cycle continues until the function returns a value of
  1609. `nil', at which time the computation terminates.
  1610. * Menu:
  1611. * Line Oriented Interaction::
  1612. * Character Oriented Interaction::
  1613. * Mixed Modes of Interaction::
  1614. 
  1615. File: avram.info, Node: Line Oriented Interaction, Next: Character Oriented Interaction, Prev: Output From Interactive Applications, Up: Output From Interactive Applications
  1616. 2.6.4.1 Line Oriented Interaction
  1617. .................................
  1618. Within this general pattern, more specific styles of interaction are
  1619. possible. In the simplest one to explain first, the result returned by
  1620. the function is always a data structure of the form `(STATE,(COMMAND
  1621. LINES,PROMPTS))', wherein the fields have these interpretations.
  1622. STATE
  1623. is a tree incorporating any data in any format that the application
  1624. needs to remember from one invocation to the next.
  1625. COMMAND LINES
  1626. is a list of character strings that are piped to the standard input
  1627. stream of a separately spawned process. The process may persist
  1628. from one invocation of the function to the next, or may be spawned
  1629. each time.
  1630. PROMPTS
  1631. is a non-empty list of character strings containing a suffix of
  1632. the text expected from the standard output stream of the process
  1633. as a result of sending the command lines to it.
  1634. On each iteration, `avram' sends the command line character strings to
  1635. a separately spawned process, with line breaks between them if there
  1636. are more than one command. If a process remains from the previous
  1637. iteration that has not terminated itself, the list of command lines is
  1638. sent to the same process. If no such process already exists, the first
  1639. string in the list of command lines is treated as a shell command and
  1640. used to spawn the process (using the `exp_popen' library function), and
  1641. the remaining strings are sent to the newly spawned process.
  1642. Normally processes spawned with commands that invoke interactive
  1643. command line interpreters of their own, such as `bash', `ftp' or `bc',
  1644. will persist indefinitely unless the command causing them to exit is
  1645. issued or some other event kills them. Processes spawned with
  1646. non-interactive commands, such as `ls' or `pwd', will terminate when
  1647. the last of their initial output has been received.
  1648. In the case of processes that persist after being spawned, `avram'
  1649. needs some way of knowing when to stop waiting for more output from them
  1650. so that it doesn't get stuck waiting forever. This purpose is served by
  1651. the PROMPTS field. This field could contain a single string holding the
  1652. last thing the process will send before becoming quiescent, such as the
  1653. strings `bash$ ' or `ftp> ' in the above examples. Alternatively, a
  1654. sequence of more than one prompt string can be used to indicate that
  1655. the corresponding sequence of lines must be detected. An empty string
  1656. followed by `ftp> ' would indicate that the `ftp> ' prompt is expected
  1657. to be immediately preceded by a line break. There is also the option of
  1658. using prompt strings to indicate a pattern that does not necessarily
  1659. imply quiescence, but is a more convenient point at which to stop
  1660. reading the output from the process.
  1661. For processes spawned with commands that do not start their own
  1662. interactive command line interpreters, such as `ls' or `pwd', it may be
  1663. preferable to read all the output from them until they terminate. To
  1664. achieve this effect, the list of prompt strings should contain only the
  1665. single string containing only the single `EOF' character (usually code
  1666. 4) or any other character that is certain not to occur in the output of
  1667. the process. This technique is based on the assumption that the process
  1668. was spawned originally with the command in question, not that such a
  1669. command is sent to an existing shell process.
  1670. In any case, when enough output has been received from the process,
  1671. it is collected into a list of received strings including the prompt
  1672. strings at the end (if they were received), and the function is applied
  1673. to the pair `(STATE,RECEIVED STRINGS)'. If the cycle is to continue,
  1674. the result returned by the function will include a new state, a new
  1675. list of command lines, and a new list of prompt strings. A result of
  1676. `nil' will cause the computation to terminate.
  1677. There are some unusual situations that could occur in the course of
  1678. line oriented interaction, and are summarized as follows.
  1679. * If the process terminates before any pattern matching the prompt
  1680. strings is received from it, all of the output from the process up
  1681. to the point where it terminated is collected into the RECEIVED
  1682. STRINGS list and passed to the function. This situation includes
  1683. cases where the process terminates immediately upon being spawned,
  1684. but not abnormal completion of the `exp_popen' library function,
  1685. which is a fatal error. This feature of the interface is what
  1686. allows `EOF' to be used for collecting all the output at once from
  1687. a non-interactive command.
  1688. * If the list of COMMAND LINES is empty, and no process currently
  1689. exists due to a previous iteration, the effect is the same as if
  1690. the process terminates unexpectedly before outputting anything.
  1691. I.e., the function is applied to a pair containing an empty list
  1692. of received strings. There is no good reason for an application to
  1693. get into this situation.
  1694. * If the list of COMMAND LINES is empty but a process persists from
  1695. a previous iteration, no output is sent to it, but receiving from
  1696. it proceeds normally. This feature of the interface could be used
  1697. effectively by applications intended to process the received data
  1698. in parts, but will cause deadlock if the process is already
  1699. quiescent.
  1700. * All character strings have to consist of lists of valid
  1701. representations of non-null characters according to *note
  1702. Character Table::, or else there will be some fatal error messages.
  1703. * If the list of PROMPT STRINGS contains only the empty string,
  1704. `avram' will not wait to receive anything from the process, but
  1705. will proceed with the next iteration immediately. If this effect is
  1706. intended, care must be taken not to confuse the empty list of
  1707. PROMPT STRINGS with the list containing the empty string. The
  1708. former indicates character oriented interaction, which is
  1709. explained next.
  1710. 
  1711. File: avram.info, Node: Character Oriented Interaction, Next: Mixed Modes of Interaction, Prev: Line Oriented Interaction, Up: Output From Interactive Applications
  1712. 2.6.4.2 Character Oriented Interaction
  1713. ......................................
  1714. A character oriented style of interaction involves the function always
  1715. returning a data structure of the form `(STATE,(COMMAND LINES,nil))'.
  1716. The STATE and COMMAND LINES fields serve exactly the same purposes
  1717. respectively as they do in the case of line oriented interaction. The
  1718. field that would be occupied by the PROMPT STRINGS list in the case of
  1719. line oriented interaction is identically `nil' in this style.
  1720. When this style is used, `avram' spawns a process and/or sends command
  1721. lines to it as in the case of line oriented interaction, but attempts
  1722. to read only a single character from it per iteration. When the
  1723. character is received, `avram' applies the function to the pair
  1724. `(STATE,CHARACTER)' in order to obtain the next state and the next list
  1725. of command lines. If the process has terminated, a `nil' value is used
  1726. in place of the character. If the process is quiescent, deadlock ensues.
  1727. The character oriented style is a lower level protocol that shifts
  1728. more of the burden of analyzing the process's output to the virtual code
  1729. application. It can do anything line oriented interaction can do except
  1730. proceeding immediately without waiting to receive any output from the
  1731. process. It may also allow more general criteria (in effect) than the
  1732. matching of a fixed prompt string to delimit the received data, for
  1733. those pathological processes that may require such things.
  1734. Applications using character oriented interaction need to deal with
  1735. line breaks explicitly among the received characters, unlike the case
  1736. with line oriented interaction, where the line breaks are implicit in
  1737. the list of received strings. Contrary to the convention for Unix text
  1738. files, line breaks in the output of a process are indicated by character
  1739. code 13 followed by character code 10.
  1740. 
  1741. File: avram.info, Node: Mixed Modes of Interaction, Prev: Character Oriented Interaction, Up: Output From Interactive Applications
  1742. 2.6.4.3 Mixed Modes of Interaction
  1743. ..................................
  1744. An application is not confined exclusively to line oriented or character
  1745. oriented interaction, but may switch from one style to the other between
  1746. iterations, and signal its choice simply by the format of the data
  1747. structure it returns. If the PROMPT STRINGS field is non-empty, the
  1748. interaction is line oriented, and if the field is empty, the
  1749. interaction is character oriented. A function using both styles has to
  1750. be prepared for whichever type of data it indicates, either a character
  1751. or a list of character strings as the case may be.
  1752. Another alternative is possible if the function returns a data
  1753. structure in the form `(FILES,nil)'. This structure includes neither a
  1754. list of command lines nor a list of prompt strings, empty or otherwise,
  1755. but does include a list of quadruples in the FILES field. The
  1756. quadruples are of the form `((OVERWRITE,PATH),(PREAMBLE,CONTENTS))'.
  1757. The fields have the same interpretations as in the output from a
  1758. non-interactive parameter mode application, as described in *note
  1759. Output From Non-interactive Applications::, and will cause a list of
  1760. files to be written in the same way.
  1761. As an interactive application is able cause the execution of
  1762. arbitrary shell commands, it doesn't need `avram' to write files for it
  1763. the way a non-interactive application does, so this feature does not
  1764. provide any additional capabilities. However, it may be helpful as a
  1765. matter of convenience.
  1766. After the files are written, the function will be applied to the same
  1767. result it returned, `(FILES,nil)'. There is no direct means of
  1768. preserving unconstrained state information from previous iterations in
  1769. this style of interaction. A likely scenario might therefore be that
  1770. the function returns a file list after finishing its other business, and
  1771. then returns `nil' on the next iteration to terminate.
  1772. 
  1773. File: avram.info, Node: Virtual Code Semantics, Prev: Parameter Mode Interface, Up: Virtual Machine Specification
  1774. 2.7 Virtual Code Semantics
  1775. ==========================
  1776. As the previous sections explain, virtual code applications are defined
  1777. in terms of mathematical functions. Up until this point, the discussion
  1778. has focused on the interface between the function and the virtual
  1779. machine interpreter, by detailing the arguments passed to the functions
  1780. under various circumstances and the results they are expected to return
  1781. in order to achieve various effects.
  1782. The purpose of this section is to complete the picture by explaining
  1783. how a given computable function can be expressed in virtual code,
  1784. considering only functions operating on the trees described in *note
  1785. Raw Material::. Functions manipulating trees of `nil' are undoubtedly
  1786. a frivolous and abstract subject in themselves. One is obliged to refer
  1787. back to the previous sections if in need of motivation.
  1788. * Menu:
  1789. * A New Operator::
  1790. * On Equality::
  1791. * A Minimal Set of Properties::
  1792. * A Simple Lisp Like Language::
  1793. * How `avram' Thinks::
  1794. * Variable Freedom::
  1795. * Metrics and Maintenance::
  1796. * Deconstruction::
  1797. * Recursion::
  1798. * Assignment::
  1799. * Predicates::
  1800. * Iteration::
  1801. * List Combinators::
  1802. * List Functions::
  1803. * Exception Handling::
  1804. * Interfaces to External Code::
  1805. * Vacant Address Space::
  1806. 
  1807. File: avram.info, Node: A New Operator, Next: On Equality, Prev: Virtual Code Semantics, Up: Virtual Code Semantics
  1808. 2.7.1 A New Operator
  1809. --------------------
  1810. With regard to a set of trees as described in *note Raw Material::, we
  1811. can define a new binary operator. Unlike the `cons' operator, this one
  1812. is not required to associate an element of the set with every possible
  1813. pair of elements. For very many pairs of operands we will have nothing
  1814. to say about its result. In fact, we require nothing of it beyond a few
  1815. simple properties to be described presently.
  1816. Because this is the only other operator than `cons', there is no need
  1817. to have a special notation for it, so it will be denoted by empty
  1818. space. The tree associated by the operator with a pair of trees `X' and
  1819. `Y', if any, will be expressed in the infix notation `X Y'. For
  1820. convenience, the operator is regarded as being right associative, so
  1821. that `A B C' can be written for `A (B C)'.
  1822. 
  1823. File: avram.info, Node: On Equality, Next: A Minimal Set of Properties, Prev: A New Operator, Up: Virtual Code Semantics
  1824. 2.7.2 On Equality
  1825. -----------------
  1826. One example of a property this operator should have, for reasons that
  1827. will not be immediately clear, is that for any trees `X' and `K', the
  1828. equality `cons(cons(nil,`K),nil) X' = `K'' always holds. Even though
  1829. the present exposition opts for readability over formality, statements
  1830. like these demand clarification of the notion of equality. Some of the
  1831. more pedantic points in *note Raw Material:: may be needed for the
  1832. following ideas to hold water.
  1833. * As originally stipulated, it is always possible to distinguish
  1834. `nil' from any member of the set. We can therefore decide on this
  1835. basis whether `A = B' whenever at least one of them is `nil'.
  1836. * Where neither `A' nor `B' is `nil' in an expression `A = B', but
  1837. rather something of the form `cons(X,Y)', the equality holds if
  1838. and only if both pairs of corresponding subexpressions are equal.
  1839. If at least one member of each pair of corresponding
  1840. subexpressions is `nil', the question is settled, but otherwise
  1841. there is recourse to their respective subexpressions, and so on.
  1842. This condition follows from the uniqueness property of the `cons'
  1843. operator.
  1844. * If one side of an equality is of the form `X Y', or is defined in
  1845. terms of such an expression, but `(X,Y)' is one of those pairs
  1846. with which the operator associates no result, then the equality
  1847. holds if and only if the other side is similarly ill defined.
  1848. * Statements involving universal quantification (i.e., beginning
  1849. with words similar to "for any tree `X' ...") obviously do not
  1850. apply to instances of a variable (`X') outside the indicated set
  1851. (trees). Hence, they are not refuted by any consequence of
  1852. identifying a variable with an undefined expression.
  1853. Readers who are aware of such issues as pointer equality or
  1854. intensional versus extensional equality of functions are urged to
  1855. forget all about them in the context of this document, and abide only
  1856. by what is stated. Other readers should ignore this paragraph.
  1857. 
  1858. File: avram.info, Node: A Minimal Set of Properties, Next: A Simple Lisp Like Language, Prev: On Equality, Up: Virtual Code Semantics
  1859. 2.7.3 A Minimal Set of Properties
  1860. ---------------------------------
  1861. For any trees `X', `Y', and `K', and any non-`nil' trees `P', `F', and
  1862. `G', the new invisible operator satisfies these conditions. In these
  1863. expressions and hereafter, increasing abuse of notation is perpetrated
  1864. by not writing the `cons' in expressions of the form `cons(X,Y)'.
  1865. _P0_
  1866. `(nil,(nil,nil)) X' = `X'
  1867. _P1_
  1868. `(nil,((nil,nil),nil)) (X,Y)' = `X'
  1869. _P2_
  1870. `(nil,(nil,(nil,nil))) (X,Y)' = `Y'
  1871. _P3_
  1872. `((nil,K),nil) X' = `K'
  1873. _P4_
  1874. `(((nil,(nil,nil)),nil),nil) (F,X)' = `F (F,X)'
  1875. _P5_
  1876. `((F,G),nil) X' = `F G X'
  1877. _P6_
  1878. `((F,nil),G) X' = `(F X,G X)'
  1879. _P7_
  1880. `((P,F),G) X' = `F X' if `P X' is a non-`nil' tree, but `G X' if
  1881. `P X' = `nil'
  1882. Although other properties remain to be described, it is worth
  1883. pausing at this point because there is ample food for thought in the
  1884. ones already given. An obvious question would be that of their origin.
  1885. The short answer is that they have been chosen arbitrarily to be true by
  1886. definition of the operator. At best, the completion of the construction
  1887. may lead to a more satisfactory answer based on aesthetic or engineering
  1888. grounds.
  1889. A more important question would be that of the relevance of the
  1890. mystery operator and its properties to the stated purpose of this
  1891. section, which is to specify the virtual machine code semantics. The
  1892. answer lies in that the operator induces a function for any given tree
  1893. `T', such that the value returned by the function when given an argument
  1894. X is `T X'. This function is the one that is implemented by the virtual
  1895. code T, which is to say the way an application will behave if we put T
  1896. in its virtual code file. An equivalent way of looking at the situation
  1897. is that the virtual machine does nothing but compute the result of this
  1898. operator, taking the tree in the virtual code file as its left operand
  1899. and the input data as the right operand. By knowing what the operator
  1900. will do with a given pair of operands, we know what to put into the
  1901. virtual code file to get the function we want.
  1902. It is worthwhile to note that properties _P0_ to _P7_ are sufficient
  1903. for universality in the sense of Turing equivalence. That means that
  1904. any computable function could be implemented by the suitable choice of
  1905. a tree T without recourse to any other properties of the operator. A
  1906. compiler writer who finds this material boring could therefore stop
  1907. reading at this point and carry out the task of targeting any general
  1908. purpose programming language to the virtual machine based on the
  1909. specifications already given. However, such an implementation would not
  1910. take advantage of the features for list processing, exception handling,
  1911. or profiling that are also built into the virtual machine and have yet
  1912. to be described.
  1913. 
  1914. File: avram.info, Node: A Simple Lisp Like Language, Next: How `avram' Thinks, Prev: A Minimal Set of Properties, Up: Virtual Code Semantics
  1915. 2.7.4 A Simple Lisp Like Language
  1916. ---------------------------------
  1917. With a universal computational model already at our disposal, it will be
  1918. easier to use the virtual machine to specify itself than to define all
  1919. of it from scratch. For this purpose, we use the `silly' programming
  1920. language, whose name is an acronym for SImple Lisp-like Language (Yeah
  1921. right). The language serves essentially as a thin layer of symbolic
  1922. names on top of the virtual machine code. Due to its poor support for
  1923. modularity and abstraction, `silly' is not recommended for serious
  1924. application development, but at least it has a shallow learning
  1925. curve.(1)
  1926. * Menu:
  1927. * Syntax::
  1928. * Semantics::
  1929. * Standard Library::
  1930. ---------- Footnotes ----------
  1931. (1) Previous releases of `avram' included a working `silly'
  1932. compiler, but this has now been superseded by the Ursala programming
  1933. language. Ursala includes `silly' as a subset for the most part, and
  1934. the examples in this manual should compile and execute with very little
  1935. modification.
  1936. 
  1937. File: avram.info, Node: Syntax, Next: Semantics, Prev: A Simple Lisp Like Language, Up: A Simple Lisp Like Language
  1938. 2.7.4.1 Syntax
  1939. ..............
  1940. `silly' has no reserved words, but it has equals signs, commas and
  1941. parentheses built in. A concise but ambiguous grammar for it can be
  1942. given as follows.
  1943. PROGRAM ::= DECLARATION*
  1944. DECLARATION ::= IDENTIFIER `=' EXPRESSION
  1945. EXPRESSION ::= () | IDENTIFIER
  1946. | (EXPRESSION)
  1947. | (EXPRESSION,EXPRESSION)
  1948. | EXPRESSION EXPRESSION
  1949. | EXPRESSION(EXPRESSION)
  1950. | EXPRESSION(EXPRESSION,EXPRESSION)
  1951. The real grammar is consistent with this one but enforces right
  1952. associativity for binary operations and higher precedence for
  1953. juxtaposition without intervening white space.
  1954. The declaration of any identifier must be unique and must precede its
  1955. occurrence in any expression. Hence, cyclic dependences between
  1956. declarations and "recursive" declarations are not allowed.
  1957. 
  1958. File: avram.info, Node: Semantics, Next: Standard Library, Prev: Syntax, Up: A Simple Lisp Like Language
  1959. 2.7.4.2 Semantics
  1960. .................
  1961. Declarations in `silly' should be understood in the obvious way as
  1962. preprocessor directives to perform parenthetic textual substitutions
  1963. (similar to `#define id (exp)' in C). All identifiers in expressions
  1964. are thereby eliminated during the preprocessing phase.
  1965. The overall meaning of the program is the meaning of the expression
  1966. in the last declaration. A denotational semantics for expressions is
  1967. given by the following equations, where [[`X']] should be read as "the
  1968. meaning of `X'", and `X', `Y' and `Z' are metavariables. (That is, they
  1969. stand for any source code fragment that could fit there subject to the
  1970. constraint, informally speaking, that it has to correspond to a
  1971. connected subtree of the parse tree as enforced by the unambiguous
  1972. grammar in the context of the rest of the program.)
  1973. [[`()']] = `nil'
  1974. [[`(X)']] = [[`X']]
  1975. [[`(X,Y)']] = `cons('[[`X']]`,'[[`Y']]`)'
  1976. [[`X Y']] = [[`X(Y)']] = [[`X']] [[`Y']]
  1977. [[`X (Y,Z)']] = [[`X(Y,Z)']] = [[`X']] [[`(Y,Z)']]
  1978. Toy languages like this are among the few situations a denotational
  1979. semantics stands a chance of clarifying more than it obfuscates, so the
  1980. reader is invited to take a moment to savor it.
  1981. 
  1982. File: avram.info, Node: Standard Library, Prev: Semantics, Up: A Simple Lisp Like Language
  1983. 2.7.4.3 Standard Library
  1984. ........................
  1985. `silly' programs may be linked with library modules, which consist of
  1986. `silly' source text to be concatenated with the user program prior to
  1987. the preprocessing phase. Most `silly' programs are linked with the
  1988. standard `silly' prelude, which contains the following declarations
  1989. among others.
  1990. nil = ()
  1991. identity = (nil,(nil,nil))
  1992. left = (nil,((nil,nil),nil))
  1993. right = (nil,(nil,(nil,nil)))
  1994. meta = (((nil,(nil,nil)),nil),nil)
  1995. constant_nil = ((nil,nil),nil)
  1996. couple = ((((left,nil),constant_nil),nil),right)
  1997. compose = couple(identity,constant_nil)
  1998. constant = couple(couple(constant_nil,identity),constant_nil)
  1999. conditional =
  2000. couple(couple(left,compose(left,right)),compose(right,right))
  2001. There is a close correspondence between these declarations and the
  2002. properties described in *note A Minimal Set of Properties::. A fitting
  2003. analogy would be that the properties of the operator specify the
  2004. virtual machine instruction set in a language independent way, and the
  2005. `silly' library defines the instruction mnemonics for a virtual
  2006. assembly language. The relationship of some mnemonics to their
  2007. corresponding instructions may be less clear than that of others, so
  2008. they are all discussed next.
  2009. 
  2010. File: avram.info, Node: How `avram' Thinks, Next: Variable Freedom, Prev: A Simple Lisp Like Language, Up: Virtual Code Semantics
  2011. 2.7.5 How `avram' Thinks
  2012. ------------------------
  2013. The definitions in the standard `silly' library pertaining to the basic
  2014. properties of the operator can provide a good intuitive illustration of
  2015. how computations are performed by `avram'. This task is approached in
  2016. the guise of a few trivial correctness proofs about them. Conveniently,
  2017. as an infeasibly small language, `silly' is an ideal candidate for
  2018. analysis by formal methods.
  2019. Technically the semantic function [[...]] has not been defined on
  2020. identifiers, but we can easily extend it to them by stipulating that the
  2021. meaning of an identifier `X' is the meaning of the program `MAIN = X'
  2022. when linked with a library containing the declaration of `X', where
  2023. `MAIN' is an identifier not appearing elsewhere in the library.
  2024. With this idea in mind, the following "theorems" can be stated, all
  2025. of which have a similar proof. The variables X and Y stand for any
  2026. tree, and the variable F stands for any tree other than `nil'.
  2027. _T0_
  2028. [[`identity']] `X' = `X'
  2029. _T1_
  2030. [[`left']] `(X,Y)' = `X'
  2031. _T2_
  2032. [[`right']] `(X,Y)' = `Y'
  2033. _T4_
  2034. [[`meta']] `(F,X)' = `F (F,X)'
  2035. _T5_
  2036. [[`constant_nil']] `X' = `nil'
  2037. Replacing each identifier with its defining expression directly
  2038. demonstrates a logical equivalence between the relevant theorem and one
  2039. of the basic operator properties postulated in *note A Minimal Set of
  2040. Properties::.
  2041. For more of a challenge, it is possible to prove the next theorem.
  2042. _T6_
  2043. For non-`nil' `F' and `G', ([[`couple']] `(F,G)') `X' = `(F X,G X)'
  2044. The proof is a routine calculation. Beware of the distinction between
  2045. the mathematical `nil' and the `silly' identifier `nil'.
  2046. ([[`couple']] `(F,G)') `X'
  2047. = ([[`((((left,nil),constant_nil),nil),right)']] `(F,G)') `X'
  2048. by substitution of `couple' with its definition in the standard library
  2049. = (
  2050. `(
  2051. ((('[[`left']]`,'[[`nil']]`),'[[`constant_nil']]`),'[[`nil']])`,'
  2052. [[`right']]`)'
  2053. `(F,G)')
  2054. `X'
  2055. by definition of the semantic function [[...]] regarding pairs
  2056. = (
  2057. `(
  2058. ((('[[`left']]`,'[[`()']]`),'[[`constant_nil']]`),'[[`()']])`,'
  2059. [[`right']]`)'
  2060. `(F,G)')
  2061. `X'
  2062. by substitution of `nil' from its definition in the standard library
  2063. = (
  2064. `(
  2065. ((('[[`left']]`,'[[`nil']]`),'[[`constant_nil']]`),'[[`nil']])`,'
  2066. [[`right']]`)'
  2067. `(F,G)')
  2068. `X'
  2069. by definition of the semantic function in the case of [[`()']]
  2070. = (
  2071. `('
  2072. [[`left']] `(F,G),'[[`constant_nil']] `(F,G)),'
  2073. [[`right']] `(F,G)')
  2074. `X'
  2075. by property _P6_ (twice)
  2076. = `((F,nil),G) X'
  2077. by theorems _T1_, _T2_, and _T5_
  2078. = `(F X,G X)'
  2079. by property _P6_ again.
  2080. Something to observe about this proof is that it might just as well
  2081. have been done automatically. Every step is either the substitution of
  2082. an identifier or a pattern match against existing theorems and
  2083. properties of the operator. Another thing to note is that the use of
  2084. identifiers and previously established theorems helps to make the proof
  2085. human readable, but is not a logical necessity. An equivalent proof
  2086. could have been expressed entirely in terms of the properties of the
  2087. operator. If one envisions a proof like this being performed blindly and
  2088. mechanically, without the running commentary or other amenities, that
  2089. would not be a bad way of thinking about what takes place when `avram'
  2090. executes virtual code.
  2091. Three more theorems have similar proofs. For non-`nil' trees `P',
  2092. `F' and `G', and any trees `X' and `K':
  2093. _T7_
  2094. ([[`compose']] `(F,G)') X = F G X
  2095. _T8_
  2096. ([[`constant']] `K') X = K
  2097. _T9_
  2098. ([[`conditional']] `(P,(F,G)') X = `F X' if `P X' is non-`nil',
  2099. but `G X' if `P X' = `nil'
  2100. The proofs of these theorems are routine calculations analogous to the
  2101. proof of _T6_. Here is a proof of theorem _T7_ for good measure.
  2102. ([[`compose']] `(F,G)') `X'
  2103. = ([[`couple(identity,constant_nil)']] `(F,G)') `X'
  2104. by substitution of `compose' with its definition in the standard
  2105. library
  2106. = (`('[[`couple']] `('[[`identity']]`,'[[`constant_nil']]`))(F,G)') `X'
  2107. by definition of the semantic function
  2108. = `('[[`identity']] `(F,G),'[[`constant_nil']]` (F,G)) X'
  2109. by theorem _T6_
  2110. = `((F,G),nil) X'
  2111. by theorems _T0_ and _T5_
  2112. = `F G X'
  2113. by property _P5_ of the operator.
  2114. 
  2115. File: avram.info, Node: Variable Freedom, Next: Metrics and Maintenance, Prev: How `avram' Thinks, Up: Virtual Code Semantics
  2116. 2.7.6 Variable Freedom
  2117. ----------------------
  2118. The virtual code semantics is easier to specify using the `silly'
  2119. language than it would be without it, but still awkward in some cases.
  2120. An example is the following declaration from the standard library,
  2121. hired = compose(
  2122. compose,
  2123. couple(
  2124. constant compose,
  2125. compose(couple,couple(constant,constant couple))))
  2126. which is constructed in such a way as to imply the following theorem,
  2127. provable by routine computation.
  2128. _T9_
  2129. `('[[`hired']] `H) (F,G)' = [[`compose']]`(H,'[[`couple']]`(F,G))'
  2130. Intuitively, `hired' represents a function that takes a given function
  2131. to a higher order function. For example, if `f' were a function that
  2132. adds two real numbers, `hired f' would be a function that takes two
  2133. real valued functions to their pointwise sum.
  2134. Apart from its cleverness, such an opaque way of defining a function
  2135. has little to recommend it. The statement of the theorem about the
  2136. function is more readable than the function definition itself, probably
  2137. because the theorem liberally employs mathematical variables, whereas
  2138. the `silly' language is variable free. On the other hand, it is not
  2139. worthwhile to linger over further enhancements to the language, such as
  2140. adding variables to it. The solution will be to indicate informally a
  2141. general method of inferring a variable free function definition from an
  2142. expression containing variables, and hereafter omit the more cumbersome
  2143. definitions.
  2144. An algorithm called `isolate' does the job. The input to `isolate'
  2145. is a pair `(E,X)', where `E' is a syntactically correct `silly'
  2146. expression in which the identifier `X' may occur, but no other
  2147. identifiers dependent on `X' may occur (or else it's
  2148. garbage-in/garbage-out). Output is a syntactically correct `silly'
  2149. expression `F' in which the identifier `X' does not occur, such that
  2150. [[`E']] = [[`F X']]. The algorithm is as follows,
  2151. if `E' = `X' then
  2152. return `identity'
  2153. else if `E' is of the form `(U,V)'
  2154. return `couple(isolate(U,X),isolate(V,X))'
  2155. else if `E' is of the form `U V'
  2156. return `(hired apply)(isolate(U,X),isolate(V,X))'
  2157. else
  2158. return `constant E'
  2159. where equality is by literal comparison of expressions, and the
  2160. definition of `apply' is
  2161. apply = (hired meta)((hired compose)(left,constant right),right)
  2162. which represents a function that does the same thing as the invisible
  2163. operator.
  2164. _T10_
  2165. [[`apply']] `(F,X)' = `F X'
  2166. The `isolate' algorithm can be generalized to functions of
  2167. arbitrarily many variables, but in this document we will need only a
  2168. unary and a binary version. The latter takes an expression `E' and a
  2169. pair of identifiers `(X,Y)' as input, and returns an expression `F'
  2170. such that [[`E']] = [[`F (X,Y)']].
  2171. if `E' = `X' then
  2172. return `left'
  2173. else if `E' = `Y' then
  2174. return `right'
  2175. else if `E' is of the form `(U,V)'
  2176. return `couple(isolate(U,(X,Y)),isolate(V,(X,Y)))'
  2177. else if `E' is of the form `U V'
  2178. return `(hired apply)(isolate(U,(X,Y)),isolate(V,(X,Y)))'
  2179. else
  2180. return `constant E'
  2181. It might be noted in passing that something similar to these
  2182. algorithms would be needed in a compiler targeted to `avram' if the
  2183. source were a functional language with variables.
  2184. 
  2185. File: avram.info, Node: Metrics and Maintenance, Next: Deconstruction, Prev: Variable Freedom, Up: Virtual Code Semantics
  2186. 2.7.7 Metrics and Maintenance
  2187. -----------------------------
  2188. Certain features of the virtual machine pertain to software development
  2189. and maintenance more than to implementing any particular function. The
  2190. operations with the mnemonics `version', `note', `profile', and
  2191. `weight' are in this category.
  2192. * Menu:
  2193. * Version::
  2194. * Note::
  2195. * Profile::
  2196. * Weight::
  2197. 
  2198. File: avram.info, Node: Version, Next: Note, Prev: Metrics and Maintenance, Up: Metrics and Maintenance
  2199. 2.7.7.1 Version
  2200. ...............
  2201. A virtual code application with exactly the following definition
  2202. implements a function that returns a constant character string
  2203. regardless of its argument.
  2204. version = ((nil,nil),((nil,nil),(nil,((nil,nil),nil))))
  2205. The character string encodes the version number of the installed
  2206. `avram' executable, for example `0.13.0', using the standard
  2207. representation for characters.
  2208. Although such an application is useless by itself, the intended use
  2209. for this feature is to cope with the possibility that future versions of
  2210. `avram' may include enhancements. Ideally, the maintainer of `avram'
  2211. will update the version number when new enhancements are added.
  2212. Applications can then detect whether they are available in the
  2213. installed version by using this feature. If a needed enhancement is not
  2214. available, the application can either make allowances or at least
  2215. terminate gracefully.
  2216. 
  2217. File: avram.info, Node: Note, Next: Profile, Prev: Version, Up: Metrics and Maintenance
  2218. 2.7.7.2 Note
  2219. ............
  2220. This operation allows arbitrary information or comments to be embedded
  2221. in a virtual code application in such a way that it will be ignored by
  2222. `avram' when executing it. For the `silly' language, a `note' function
  2223. is defined in the standard prelude so as to imply the following theorem.
  2224. _T11_
  2225. [[`note']] `(F,C)' = `((nil,nil),((nil,nil),(nil,(nil,(F,C)))))'
  2226. Intuitively, the argument `F' represents a function, and the argument
  2227. `c' represents the comment, annotation, or whatever, that will be
  2228. embedded but ignored in the virtual code.
  2229. Semantically, a function with a note attached is the same as the
  2230. function by itself, as the following property stipulates for any
  2231. non-`nil' `F'.
  2232. _P8_
  2233. ([[`note']] `(F,C)') `X' = `F X'
  2234. A possible reason for using this feature might be to support a
  2235. language that performs run-time type checking by hanging type tags on everything.
  2236. Another possible use would be to include symbolic information needed by
  2237. a debugger.
  2238. 
  2239. File: avram.info, Node: Profile, Next: Weight, Prev: Note, Up: Metrics and Maintenance
  2240. 2.7.7.3 Profile
  2241. ...............
  2242. The virtual machine supports a profiling capability by way of this feature.
  2243. Profiling an application causes run time statistics about it to be
  2244. written to a file `./profile.txt'. Profiled applications are of the
  2245. form indicated in the following theorem
  2246. _T12_
  2247. [[`profile']] `(F,S)' = `((nil,nil),((nil,nil),(nil,((F,S),nil))))'
  2248. where `F' stands for the virtual code of the application, and `S'
  2249. stands for the name of it to be written to the file. The semantics of
  2250. a profiled function is identical to the unprofiled form for any
  2251. non-`nil' `F'.
  2252. _P9_
  2253. ([[`profile']] `(F,S)') `X' = `F X'
  2254. Unlike the situation with `note', the annotation `S' of used in
  2255. profiled code is not in an unrestricted format but must be a character
  2256. string in the standard representation (as in *note Representation of
  2257. Numeric and Textual Data::), because this string needs to be written by
  2258. `avram' to the file `./profile.txt'. Ordinarily this string will be the
  2259. source code identifier of the function being profiled.
  2260. When profiles are used in many parts of an application, an
  2261. informative table is generated showing the time spent in each part.
  2262. 
  2263. File: avram.info, Node: Weight, Prev: Profile, Up: Metrics and Maintenance
  2264. 2.7.7.4 Weight
  2265. ..............
  2266. The following virtual code implements a function that returns the weight
  2267. of its argument in the standard representation for natural numbers.
  2268. weight = ((nil,nil),((nil,nil),(nil,(nil,nil))))
  2269. The weight of a tree is zero if the tree is `nil', and otherwise the
  2270. sum of the weights of the two subtrees plus one.
  2271. An algorithm to compute the weight of a tree would be trivial to
  2272. implement without being built in to the virtual machine. The built in
  2273. capability could also be used for purposes unrelated to code
  2274. maintenance. Nevertheless, it is built in for the following reasons.
  2275. * Computing weights happened to be a bottleneck for a particular
  2276. aspect of code generation that was of interest to the author, namely
  2277. the compression of generated code.
  2278. * A built in implementation in C runs at least an order of magnitude
  2279. faster than the equivalent implementation in virtual code.
  2280. * It runs even faster when repeated on the same data, by retrieving
  2281. previously calculated weights rather than recalculating them.
  2282. The only disadvantage of using this feature instead of implementing a
  2283. function in virtual code to compute weights is that it relies on native integer
  2284. arithmetic and could overflow, causing a fatal error. It has never
  2285. occurred in practice, but is possible due to sharing, whereby the
  2286. nominal weight of a tree could be exponentially larger than the actual
  2287. amount of memory occupied by it.
  2288. 
  2289. File: avram.info, Node: Deconstruction, Next: Recursion, Prev: Metrics and Maintenance, Up: Virtual Code Semantics
  2290. 2.7.8 Deconstruction
  2291. --------------------
  2292. Much of the time required for evaluating a function is devoted to performing
  2293. deconstruction operations, e.g., taking the left side of a pair, the
  2294. tail of a list, the right side of the head of the tail, etc.. Because
  2295. these operations are so frequent, there are some features of the
  2296. virtual machine to make them as efficient as possible.
  2297. * Menu:
  2298. * Field::
  2299. * Fan::
  2300. 
  2301. File: avram.info, Node: Field, Next: Fan, Prev: Deconstruction, Up: Deconstruction
  2302. 2.7.8.1 Field
  2303. .............
  2304. The virtual machine supports a generalization of the `left' and `right'
  2305. deconstruction operations that is applicable to deeply nested
  2306. structures. Use of this feature is conducive to code that is faster and
  2307. more compact than is possible by relying on the primitive deconstructors
  2308. alone. It may also be easier for a code optimizer to recognize and
  2309. transform.
  2310. The general form of a virtual code application to perform
  2311. deconstruction is that it is a pair with a `nil' left side, and a
  2312. non-`nil' right side. The right side indicates the nature of the
  2313. deconstruction to be performed when the function is evaluated on an
  2314. argument.
  2315. To make the expression of deconstruction functions more readable in
  2316. `silly', the standard library contains the declaration
  2317. field = couple(constant nil,identity)
  2318. which implies the following theorem.
  2319. _T13_
  2320. [[`field']] `W' = `(nil,W)'
  2321. The virtual machine recognizes an application in this form and
  2322. evaluates it according to the following properties, where `U' and `V'
  2323. are other than `nil', but `X', `Y', and `Z' are unrestricted.
  2324. _P10_
  2325. ([[`field']] `(U,nil)') `(X,Y)' = ([[`field']] `U') `X'
  2326. _P11_
  2327. ([[`field']] `(nil,V)') `(X,Y)' = ([[`field']] `V') `Y'
  2328. _P12_
  2329. ([[`field']] `(U,`v')') `Z' = `(('[[`field']] `U) Z,('[[`field']]
  2330. `V) Z)'
  2331. One might also add that ([[`field']] `(nil,nil)') `Z' = `Z', but this
  2332. statement would be equivalent to _P0_.
  2333. A suitable choice of the `field' operand permits the implementation
  2334. of any deconstruction function expressible in terms of `compose',
  2335. `couple', `identity', `left' and `right'. For example, the application
  2336. `couple(compose(right,right),left)' has an equivalent representation in
  2337. `field((nil,(nil,(nil,nil))),((nil,nil),nil))'. The latter looks longer
  2338. in `silly' but is smaller and faster in virtual code.
  2339. 
  2340. File: avram.info, Node: Fan, Prev: Field, Up: Deconstruction
  2341. 2.7.8.2 Fan
  2342. ...........
  2343. In cases where a deconstructions would be needed to apply the same
  2344. function to both sides of a pair, the overhead can be avoided by means
  2345. of a property of the virtual machine intended for that purpose.
  2346. A `silly' definition of `fan' implying the following theorem is
  2347. helpful in expressing such an application.
  2348. _T14_
  2349. [[`fan']] `F' = `((nil,nil),((nil,F),(nil,nil)))'
  2350. The virtual machine recognizes when an application has the form shown
  2351. above, and uses `F' as a function to be applied to both sides of the
  2352. argument.
  2353. _P13_
  2354. ([[`fan']] `F') `(X,Y)' = `(F X,F Y)'
  2355. 
  2356. File: avram.info, Node: Recursion, Next: Assignment, Prev: Deconstruction, Up: Virtual Code Semantics
  2357. 2.7.9 Recursion
  2358. ---------------
  2359. Defining functions or programs self referentially is sometimes
  2360. informally known as recursion. In functional languages, the clever use of
  2361. "combinators" is often preferred to this practice, and is in fact well
  2362. supported by the virtual machine. However, some computations may be
  2363. inexpressible without an explicitly "recursive" formulation, so there
  2364. is some support for that as well.
  2365. * Menu:
  2366. * Recur::
  2367. * Refer::
  2368. 
  2369. File: avram.info, Node: Recur, Next: Refer, Prev: Recursion, Up: Recursion
  2370. 2.7.9.1 Recur
  2371. .............
  2372. A generalization of the form denoted by `meta' in `silly' is recognized
  2373. by the virtual machine and allows a slightly more efficient encoding of
  2374. recursive applications. An expression `recur P' has the representation
  2375. indicated by this theorem,
  2376. _T15_
  2377. [[`recur']] `P' = `(((nil,P),nil),nil)'
  2378. which implies that [[`meta']] = [[`recur']] `(nil,nil)'.
  2379. If `P' is non-`nil', a tree of the form [[`recur']] `P' is
  2380. interpreted as follows. Note that _P4_ is equivalent to the special
  2381. case of this property for which `P' is `(nil,nil)'.
  2382. _P14_
  2383. ([[`recur']] `P') `X' = [[`meta']] ([[`field']] `P') `X'
  2384. The rationale is that `meta' would very frequently be composed with
  2385. a deconstruction `field P', so the virtual machine saves some time and
  2386. space by allowing the two of them to be encoded in a smaller tree with
  2387. the combined meaning.
  2388. 
  2389. File: avram.info, Node: Refer, Prev: Recur, Up: Recursion
  2390. 2.7.9.2 Refer
  2391. .............
  2392. In the style of recursive programming compelled by the available `meta'
  2393. primitive, a function effectively requires a copy of its own machine
  2394. code as its left argument. Bringing about that state of affairs is an
  2395. interesting party trick.
  2396. If we had a definition of `bu' in the standard library implying
  2397. _T16_
  2398. ([[`bu']] `(F,K)') `X' = `F(K,X)'
  2399. which for the sake of concreteness can be done like this,
  2400. bu = (hired compose)(
  2401. left,
  2402. (hired couple)(compose(constant,right),constant identity))
  2403. then a definition of `refer' as
  2404. refer = (hired bu)(identity,identity)
  2405. would be consistent with the following property of the operator.
  2406. _P15_
  2407. ([[`refer']] `F') `X' = `F (F,X)'
  2408. The proof, as always, is a matter of routine calculation in the manner
  2409. of the section on how `avram' thinks.
  2410. However, this pattern would occur so frequently in recursively
  2411. defined functions as to be a significant waste of space and time.
  2412. Therefore, rather than requiring it to be defined in terms of other
  2413. operations, the virtual machine specification recognizes a pattern of
  2414. the form below with respect to property _P15_,
  2415. _T17_
  2416. [[`refer']] `F' = `(((F,nil),nil),nil)'
  2417. and takes the property to be true by definition of the operator. A
  2418. definition of `refer' consistent with _T17_ is therefore to be found in
  2419. the standard library, not the definition proposed above.
  2420. 
  2421. File: avram.info, Node: Assignment, Next: Predicates, Prev: Recursion, Up: Virtual Code Semantics
  2422. 2.7.10 Assignment
  2423. -----------------
  2424. In an imperative programming paradigm, a machine consists partly of an
  2425. ensemble of addressable storage locations, whose contents are changed
  2426. over time by assignment statements. An assignment statement includes
  2427. some computable function of the global machine state, and the address of
  2428. the location whose contents will be overwritten with the value computed
  2429. from the function when it is evaluated.
  2430. Compiling a language containing assignment statements into virtual
  2431. machine code suitable for `avram' might be facilitated by exploiting
  2432. the following property.
  2433. _P16_
  2434. ([[`assign']] `(P,F)') `X' = [[`replace']] `((P,F X),X)'
  2435. The identifier `assign' is used in `silly' to express a virtual code
  2436. fragment having the form shown below, and `replace' corresponds to a
  2437. further operation to be explained presently.
  2438. _T18_
  2439. [[`assign']] `(P,F)' = `(((P,F),nil),nil)'
  2440. This feature simulates assignment statements in the following way.
  2441. The variable `X' in _P16_ corresponds intuitively to the set of
  2442. addressable locations in the machine. The variable `F' corresponds to
  2443. the function whose value will be stored in the location addressed by
  2444. `P'. The result of a function expressed using `assign' is a new store
  2445. similar to the argument `X', but with the part of it in location `P'
  2446. replaced by `F X'. A source text with a sequence of assignment
  2447. statements could therefore be translated directly into a functional
  2448. composition of trees in this form.
  2449. The way storage locations are modeled in virtual code using this
  2450. feature would be as nested pairs, and the address `P' of a location is
  2451. a tree interpreted similarly to the trees used as operands to the
  2452. `field' operator described in *note Field::, to specify
  2453. deconstructions. In fact, `replace' can be defined as a minimal
  2454. solution to the following equation.
  2455. _E0_
  2456. ([[`field']] `P') [[`replace']] `((P,Y),X)' = `Y'
  2457. This equation regrettably does not lend itself to inferring the
  2458. `silly' source for `replace' using the `isolate' algorithm in *note
  2459. Variable Freedom::, so an explicit construction is given in *note
  2460. Replace::. This construction need not concern a reader who considers
  2461. the equation a sufficiently precise specification in itself.
  2462. In view of the way addresses for deconstruction are represented as
  2463. trees, it would be entirely correct to infer from this equation that a
  2464. tuple of values computed together can be assigned to a tuple of
  2465. locations. The locations don't even have to be "contiguous", but could
  2466. be anywhere in the tree representing the store, and the function is
  2467. computed from the contents of all of them prior to the update. Hence,
  2468. this simulation of assignment fails to capture the full inconvenience of
  2469. imperative programming except in the special case of a single value
  2470. assigned to a single location, but fortunately this case is the only one
  2471. most languages allow.
  2472. There is another benefit to this feature besides running languages
  2473. with assignment statements in them, which is the support of abstract or
  2474. opaque data structures. A function that takes an abstract data structure
  2475. as an argument and returns something of the same type can be coded in a
  2476. way that is independent of the fields it doesn't use. For example, a
  2477. data structure with three fields having the field identifiers `foo',
  2478. `bar', and `baz' in some source language might be represented as a
  2479. tuple `((FOO CONTENTS,BAR CONTENTS),BAZ CONTENTS)' on the virtual code
  2480. level. Compile time constants like `bar = ((nil,(nil,nil)),nil)' could
  2481. be defined in an effort to hide the details of the representation, so
  2482. that the virtual code `field bar' is used instead of
  2483. `compose(right,left)'. Using field identifiers appropriately, a
  2484. function that transforms such a structure by operating on the `bar'
  2485. field could have the virtual code `couple(couple(field
  2486. foo,compose(f,field bar)),field baz)'. However, this code does not
  2487. avoid depending on the representation of the data structure, because it
  2488. relies on the assumption of the `foo' field being on the left of the
  2489. left, and the `baz' field being on the right. On the other hand, the
  2490. code `assign(bar,compose(f,field bar))' does the same job without
  2491. depending on anything but the position of the `bar' field. Furthermore,
  2492. if this position were to change relative to the others, the code
  2493. maintenance would be limited to a recompilation.
  2494. 
  2495. File: avram.info, Node: Predicates, Next: Iteration, Prev: Assignment, Up: Virtual Code Semantics
  2496. 2.7.11 Predicates
  2497. -----------------
  2498. A couple of operations are built into the virtual machine for performing
  2499. tests efficiently. These functions return either `nil' for false or
  2500. `(nil,nil)' for true, and are useful for example as a predicate `P' in
  2501. programs of the form `conditional(P,(F,G))' among other things. In this
  2502. example, the predicate is applied to the argument, a result of
  2503. `(nil,nil)' causes `F' to be applied to it, and a result of `nil'
  2504. causes `G' to be applied to it.
  2505. * Menu:
  2506. * Compare::
  2507. * Member::
  2508. 
  2509. File: avram.info, Node: Compare, Next: Member, Prev: Predicates, Up: Predicates
  2510. 2.7.11.1 Compare
  2511. ................
  2512. A function that performs comparison has a the following very simple
  2513. virtual code representation.
  2514. _T19_
  2515. [[`compare']] = `(nil,nil)'
  2516. The proof of theorem _T19_ is that the standard `silly' prelude
  2517. contains the declaration `compare = (nil,nil)'. Code in this form has
  2518. the following semantics.
  2519. _P17_
  2520. For distinct trees `X' and `Y', [[`compare']] `(X,Y)' = `nil'
  2521. _P18_
  2522. [[`compare']] `(X,X)' = `(nil,nil)'
  2523. In other words, the virtual code `(nil,nil)' implements a function that
  2524. takes a pair of trees and returns true if and only if they are equal.
  2525. It would be fairly simple to write an equivalent virtual code
  2526. application that implements this function if it were not realizable in
  2527. this form by definition of the operator. However, this method is
  2528. preferable because it saves space in virtual code and has a highly
  2529. optimized implementation in C.
  2530. 
  2531. File: avram.info, Node: Member, Prev: Compare, Up: Predicates
  2532. 2.7.11.2 Member
  2533. ...............
  2534. Another built in predicate function has the virtual code shown below.
  2535. _T20_
  2536. [[`member']] = `((nil,nil),((nil,nil),nil))'
  2537. As the mnemonic suggests, the function implemented by this code detects
  2538. whether a given item is a member of a list. The left side of its
  2539. argument is the item to be detected, and the right side is the list that
  2540. may or may not contain it. Lists are represented as explained in *note
  2541. Representation of Numeric and Textual Data::.
  2542. The virtual code semantics can be specified by these three
  2543. properties of the operator.
  2544. _P19_
  2545. [[`member']] `(X,nil)' = `nil'
  2546. _P20_
  2547. [[`member']] `(X,(X,Y))' = `(nil,nil)'
  2548. _P21_
  2549. For distinct trees `X' and `Y', [[`member']] `(X,(Y,Z))' =
  2550. [[`member']] `(X,`z')'
  2551. As in the case of `compare', the implementation of `member' is well
  2552. optimized in C, so this form is to be preferred over an ad hoc
  2553. construction of a membership testing function in virtual code.
  2554. 
  2555. File: avram.info, Node: Iteration, Next: List Combinators, Prev: Predicates, Up: Virtual Code Semantics
  2556. 2.7.12 Iteration
  2557. ----------------
  2558. One of many alternatives to recursion provided by the virtual machine is
  2559. iteration, which allows an operation to be repeated until a condition is
  2560. met. If the source language is imperative, this feature provides an easy
  2561. means of translating loop statements to virtual code. In languages that
  2562. allow functions to be treated as data, iteration can be regarded as a
  2563. function that takes a predicate and a function as arguments, and
  2564. returns a function that applies the given function repeatedly to its
  2565. argument until the predicate is refuted.
  2566. Iterative applications are expressed in virtual code by the pattern
  2567. shown below.
  2568. _T21_
  2569. [[`iterate']] `(P,F)' = `((nil,nil),(nil,(P,F)))'
  2570. In the `silly' language, the `iterate' mnemonic plays the role of the
  2571. function that takes the virtual code for a predicate `P' and a function
  2572. `F' as arguments, and returns the virtual code for an iterating
  2573. function.
  2574. The code for an iterating function is recognized as such by the
  2575. virtual machine emulator only if the subtrees `F' and `P' are other
  2576. than `nil'. The resulting function tests the argument `X' with `P' and
  2577. returns `X' if the predicate is false.
  2578. _P22_
  2579. ([[`iterate']] `(P,F)') `X' = `X' if `P X' = `nil'
  2580. If the predicate turns out to be true, `F' is applied to `X', and then
  2581. another iteration is performed.
  2582. _P23_
  2583. ([[`iterate']] `(P,F)') `X' = ([[`iterate']] `(P,F)') `F X' if `P
  2584. X' is a non-`nil' tree
  2585. 
  2586. File: avram.info, Node: List Combinators, Next: List Functions, Prev: Iteration, Up: Virtual Code Semantics
  2587. 2.7.13 List Combinators
  2588. -----------------------
  2589. There is extensive support for operations on lists in the virtual code
  2590. format. Use of these features is encouraged because they are conducive
  2591. to tight code with explicit concurrency. Within an imperative
  2592. programming paradigm, these features might perhaps have to be understood
  2593. as design patterns or algorithmic skeletons. The present exposition
  2594. takes a functional view, describing them in terms of operators that take
  2595. functions as their arguments and return functions as their result.
  2596. * Menu:
  2597. * Map::
  2598. * Filter::
  2599. * Reduce::
  2600. * Sort::
  2601. * Transfer::
  2602. * Mapcur::
  2603. 
  2604. File: avram.info, Node: Map, Next: Filter, Prev: List Combinators, Up: List Combinators
  2605. 2.7.13.1 Map
  2606. ............
  2607. A virtual code application in the following form causes a function with
  2608. non-`nil' virtual code `F' to be applied to every item in a list.
  2609. _T22_
  2610. [[`map']] `F' = `((nil,nil),((nil,F),nil))'
  2611. The `map' mnemonic is used in `silly' to express applications in this
  2612. form as `map F'. This operation is also well known to lisp users and
  2613. functional programmers. The semantics is determined by these two
  2614. operator properties (for non-`nil' `F').
  2615. _P24_
  2616. ([[`map']] `F') `nil' = `nil'
  2617. _P25_
  2618. ([[`map']] `F') `(X,Y)' = `(F X,('[[`map']] `F) Y)'
  2619. Note that the representation of lists described in *note Representation
  2620. of Numeric and Textual Data::, is assumed.
  2621. 
  2622. File: avram.info, Node: Filter, Next: Reduce, Prev: Map, Up: List Combinators
  2623. 2.7.13.2 Filter
  2624. ...............
  2625. Another well known list operation is that which applies a predicate to
  2626. every item of a list, and deletes those for which the predicate is
  2627. false. For a predicate with virtual code `P', such an application can
  2628. be coded conveniently in this form,
  2629. _T23_
  2630. [[`filter']] `P' = `((nil,nil),(nil,(P,nil)))'
  2631. which is to say that writing `((nil,nil),(nil,(P,nil)))' in `silly' is
  2632. the same as writing `filter P'.
  2633. The virtual machine detects code of this form provided that `P' is
  2634. other than `nil', and evaluates it consistently with the following
  2635. properties, causing it to have the meaning that it does.
  2636. _P26_
  2637. ([[`filter']] `P') `nil' = `nil'
  2638. _P27_
  2639. ([[`filter']] `P') `(X,Y)' = ([[`filter']] `P') `Y' if `P `X'' =
  2640. `nil'
  2641. _P28_
  2642. ([[`filter']] `P') `(X,Y)' = `(X,'([[`filter']] `P') `Y)' if `P X'
  2643. is a non-`nil' tree
  2644. 
  2645. File: avram.info, Node: Reduce, Next: Sort, Prev: Filter, Up: List Combinators
  2646. 2.7.13.3 Reduce
  2647. ...............
  2648. In the virtual code fragment shown below, `F' should be regarded as the
  2649. virtual code for a binary operator, and `K' is a constant.
  2650. _T24_
  2651. [[`reduce']] `(F,K)' = `((nil,nil),((F,K),nil))'
  2652. By constructing a tree in the form shown, the `silly' mnemonic `reduce'
  2653. effectively constructs the code for a function operating on lists in a
  2654. particular way.
  2655. The effect of evaluating an application in this form with an argument
  2656. representing a list can be broken down into several cases.
  2657. * If the list is empty, then the value of `K' is the result.
  2658. * If the list has only one item, then that item is the result.
  2659. * if the list has exactly two items, the first being `X' and the
  2660. second being `Y', then the result is `F (X,Y)'.
  2661. * If the list has more than two items and an even number of them, the
  2662. result is that of evaluating the application with the list of
  2663. values obtained by partitioning the list into pairs of adjacent
  2664. items, and evaluating `F' with each pair.
  2665. * If the list has more than two items and an odd number of them, the
  2666. result is that of evaluating the application with the list of
  2667. values obtained by partitioning the list into pairs of adjacent
  2668. items excluding the last one, evaluating `F' with each pair, and
  2669. then appending the last item to the list of values.
  2670. In the last two cases, evaluation of the application is not necessarily
  2671. finished after just one traversal of the list, because it has to carry
  2672. on until the list is reduced to a single item.
  2673. Some care has been taken to describe this operation in detail
  2674. because it differs from comparable operations common to functional
  2675. programming languages, variously known as fold or reduce. All of these
  2676. operations could be used, for example, to compute the summation of a
  2677. list of numbers. The crucial differences are as follows.
  2678. * Whereas a fold or a reduce is conventionally either of the left or
  2679. right variety, this `reduce' is neither.
  2680. * The vacuous case result `K' is never used at all unless the
  2681. argument is the empty list.
  2682. * This `reduce' is not very useful if the operator `F' is not
  2683. associative.
  2684. The reason for defining the semantics of `reduce' in this way
  2685. instead of the normal way is that a distributed implementation of this one
  2686. could work in logarithmic time, so it's worth making it easy for a
  2687. language processor to detect the pattern in case the virtual code is
  2688. ever going to be targeted to such an implementation. Of course, nothing
  2689. prevents the conventional left or right reduction semantics from being
  2690. translated to virtual code by explicit recursion.
  2691. The precise semantics of this operation are as follows, where `F' is
  2692. not `nil', `K' is unconstrained, and `pairwise' represents a function
  2693. to be explained presently.
  2694. _P29_
  2695. ([[`reduce']] `(F,K)') `nil' = `K'
  2696. _P30_
  2697. ([[`reduce']] `(F,K)') `(X,Y)' =
  2698. [[`left']] ([[`bu(iterate,right)']] [[`pairwise']] `F') `(X,Y)'
  2699. The latter property leverages off some virtual machine features and
  2700. `silly' code that has been defined already. The function implemented by
  2701. [[`pairwise']] `F' is the one that partitions its argument into pairs
  2702. and evaluates `F' with each pair as described above. The combination of
  2703. that with `bu(iterate,right)' results in an application that repeatedly
  2704. performs [[`pairwise']] `F' while the resulting list still has a tail
  2705. (i.e., a `right' side), and stops when the list has only a single item
  2706. (i.e., when `right' is false). The `left' operation then extracts the
  2707. item.
  2708. For the sake of completeness, it is tedious but straightforward to
  2709. give an exact definition for `pairwise'. The short version is that it
  2710. can be anything that satisfies these three equations.
  2711. _E1_
  2712. ([[`pairwise']] `F') `nil' = `nil'
  2713. _E2_
  2714. ([[`pairwise']] `F') `(X,nil)' = `(X,nil)'
  2715. _E3_
  2716. ([[`pairwise']] `F') `(X,(Y,Z))' = `(F (X,Y),'([[`pairwise']] `F')
  2717. `Z)'
  2718. For the long version, see *note Pairwise::.
  2719. 
  2720. File: avram.info, Node: Sort, Next: Transfer, Prev: Reduce, Up: List Combinators
  2721. 2.7.13.4 Sort
  2722. .............
  2723. Sorting is a frequently used operation that has the following standard
  2724. representation in virtual code.
  2725. _T25_
  2726. [[`sort']] `P' = `((nil,nil),((P,nil),(nil,nil)))'
  2727. When an application in this form is evaluated with an operand
  2728. representing a list, the result is a sorted version of the list.
  2729. The way a list is sorted depends on what order is of interest. For
  2730. example, numbers could be sorted in ascending or descending order,
  2731. character strings could be sorted lexically or by length, etc.. The
  2732. value of `P' is therefore needed in sorting applications to specify the
  2733. order. It contains the virtual code for a partial order relational
  2734. operator. This operator can be evaluated with any pair of items
  2735. selected from a list, and should have a value of true if the left one
  2736. should precede the right under the ordering. For example, if numbers
  2737. were to be sorted in ascending order, then `P' would compute the "less
  2738. or equal" relation, returning true if its operand were a pair of
  2739. numbers in which the left is less or equal to the right.
  2740. The virtual code semantics for sorting applications is given by these
  2741. two properties, wherein `P' is a non-`nil' tree, and `insert' is a
  2742. `silly' mnemonic to be defined next.
  2743. _P31_
  2744. ([[`sort']] `P') `nil' = `nil'
  2745. _P32_
  2746. ([[`sort']] `P') `(X,Y)' = ([[`insert']] `P') `(X,'([[`sort']]
  2747. `P') `Y)'
  2748. These properties say that the empty list is already sorted, and a
  2749. non-empty list is sorted if its tail is sorted and the head is then
  2750. inserted in the proper place. This specification of sorting has nothing
  2751. to do with the sorting algorithm that `avram' really uses.
  2752. The meaning of insertion is convenient to specify in virtual code
  2753. itself. It should satisfy these three equations.
  2754. _E4_
  2755. ([[`insert']] `P') `(X,nil)' = `(X,nil)'
  2756. _E5_
  2757. ([[`insert']] `P') `(X,(Y,Z))' = `(Y,'([[`insert']] `P') `(X,Z))'
  2758. if `P(X,Y)' = `nil'
  2759. _E6_
  2760. ([[`insert']] `P') `(X,(Y,Z)') = `(X,(Y,Z))' if `P(X,Y)' is a
  2761. non-`nil' tree
  2762. Intuitively, the right argument, whether `nil' or `(Y,Z)', represents a
  2763. list that is already sorted. The left argument `X' therefore only needs
  2764. to be compared to the head element, `Y' to ascertain whether or not it
  2765. belongs at the beginning. If not, it should be inserted into the tail.
  2766. A possible implementation of `insert' in `silly' is given in *note
  2767. Insert::.
  2768. 
  2769. File: avram.info, Node: Transfer, Next: Mapcur, Prev: Sort, Up: List Combinators
  2770. 2.7.13.5 Transfer
  2771. .................
  2772. A particular interpretation is given to virtual code in the following
  2773. form.
  2774. _T26_
  2775. [[`transfer']] `F' = `((nil,nil),(nil,(nil,F)))'
  2776. When code in this form is evaluated with an argument, the tree `F' is
  2777. used as a state transition function, and the argument is used as a list
  2778. to be traversed. The traversal begins with `F' being evaluated on `nil'
  2779. to get the initial state and the initial output. Thereafter, each item
  2780. of the list is paired with the current state to be evaluated with `F',
  2781. resulting in a list of output and the next state. The output resulting
  2782. from the entire application is the cumulative concatenation of all
  2783. outputs obtained in the course of evaluating `F'. The computation
  2784. terminates when `F' yields a `nil' result. If the list of inputs runs
  2785. out before the computation terminates, `nil' values are used as inputs.
  2786. This behavior is specified more precisely in the following property
  2787. of the operator, which applies in the case of non-`nil' `F'.
  2788. _P33_
  2789. ([[`transfer']] `F') `X' = ([[`transition']] `F') `(nil,(F nil,X))'
  2790. Much of the `transfer' semantics is implicit in the meaning of
  2791. `transition'. For any given application `F', [[`transition']] `F' is
  2792. the virtual code for a function that takes the list traversal from one
  2793. configuration to the next. A configuration is represented as a tuple,
  2794. usually in the form `(PREVIOUS OUTPUTS,((STATE,OUTPUT),(NEXT
  2795. INPUT,SUBSEQUENT INPUTS)))'. A terminal configuration has the form
  2796. `(PREVIOUS OUTPUTS,(nil,(NEXT INPUT,SUBSEQUENT INPUTS)))'. A
  2797. configuration may also have `nil' in place of the pair `(NEXT
  2798. INPUT,SUBSEQUENT INPUTS)' if no more input remains.
  2799. In the non-degenerate case, the meaning of [[`transition']] `F' is
  2800. consistent with the following equation.
  2801. _E7_
  2802. ([[`transition']] `F') `(Y,((S,O),(I,X)))' =
  2803. ([[`transition']] `F') `((O,Y),(F (S,I),X))'
  2804. That is, the current output `O' is stored with previous outputs `Y',
  2805. the next input `I' is removed from the configuration, and the next
  2806. state and output are obtained from the evaluation of `F' with the state
  2807. `S' and the next input `I'.
  2808. In the case where no input remains, the transition function is
  2809. consistent with the following equation.
  2810. _E8_
  2811. ([[`transition']] `F') `(Y,((S,O),nil))' =
  2812. ([[`transition']] `F') `((O,Y),(F (S,nil),nil))'
  2813. This case is similar to the previous one except that the `nil' value is
  2814. used in place of the next input. Note that in either case, nothing
  2815. about `F' depends on the particular way configurations are represented,
  2816. except that it should have a state as its left argument and an input as
  2817. its right argument.
  2818. Finally, in the case of a terminal configuration, the transition
  2819. function returns the cumulative output.
  2820. _E9_
  2821. ([[`transition']] `F') `(Y,(nil,X))' = [[`reduce(cat,nil)']]
  2822. [[`reverse']] `Y'
  2823. The `silly' code `reduce(cat,nil)' has the effect of flattening a list
  2824. of lists into one long list, which is necessary insofar as the
  2825. transition function will have generated not necessarily a single output
  2826. but a list of outputs on each iteration. The `cat' mnemonic stands for
  2827. list concatenation and is explained in *note Cat::. The reversal is
  2828. necessary to cause the first generated output to be at the head of the
  2829. list. List reversal is a built in operation of the virtual machine and
  2830. is described in *note Reverse::.
  2831. If such a function as `transition' seems implausible, its
  2832. implementation in `silly' can be found in *note Transition::.
  2833. It is usually more awkward to express a function in terms of a
  2834. `transfer' than to code it directly using recursion or other list
  2835. operations. However, this feature is provided by the virtual machine for
  2836. several reasons.
  2837. * Functions in this form may be an easier translation target if the
  2838. source is an imperative language.
  2839. * Translating from virtual code to asynchronous circuits or process networks
  2840. has been a research interest of the author, and code in this form
  2841. lends itself to easy recognition and mapping onto discrete
  2842. components.
  2843. * The `--byte-transducer' and `--interactive' command line options
  2844. to `avram' cause an application to be invoked in a similar manner
  2845. to the transition function in a `transfer' function, so this
  2846. feature allows for easy simulation and troubleshooting of these
  2847. applications without actually deploying them.
  2848. 
  2849. File: avram.info, Node: Mapcur, Prev: Transfer, Up: List Combinators
  2850. 2.7.13.6 Mapcur
  2851. ...............
  2852. An alternative form of recursive definition is the following.
  2853. _T27_
  2854. [[`mapcur']] `P' = `((nil,nil),((nil,nil),(P,nil)))'
  2855. This form is convenient for applications that cause themselves to be applied
  2856. recursively to a list of arguments. It has this semantics.
  2857. _P34_
  2858. ([[`mapcur']] `P') `X' = [[`map meta']] [[`distribute']]
  2859. ([[`field']] `P') `X'
  2860. 
  2861. File: avram.info, Node: List Functions, Next: Exception Handling, Prev: List Combinators, Up: Virtual Code Semantics
  2862. 2.7.14 List Functions
  2863. ---------------------
  2864. In addition to the foregoing list operations, the virtual machine provides
  2865. a number of canned functions operating on lists, namely concatenation,
  2866. reversal, distribution, and transposition. These functions could be
  2867. coded by other means if they were not built in, but the built in
  2868. versions are faster and smaller.
  2869. * Menu:
  2870. * Cat::
  2871. * Reverse::
  2872. * Distribute::
  2873. * Transpose::
  2874. 
  2875. File: avram.info, Node: Cat, Next: Reverse, Prev: List Functions, Up: List Functions
  2876. 2.7.14.1 Cat
  2877. ............
  2878. The list concatenation operation has this representation in virtual
  2879. code.
  2880. _T28_
  2881. [[`cat']] = `((nil,nil),(nil,nil))'
  2882. This function takes a pair of lists as an argument, an returns the list
  2883. obtained by appending the right one to the left. The semantics of
  2884. concatenation is what one would expect.
  2885. _P35_
  2886. [[`cat']] `(nil,Z)' = `Z'
  2887. _P36_
  2888. [[`cat']] `((X,Y),Z)' = `(X,'[[`cat']] `(Y,`z'))'
  2889. 
  2890. File: avram.info, Node: Reverse, Next: Distribute, Prev: Cat, Up: List Functions
  2891. 2.7.14.2 Reverse
  2892. ................
  2893. The function that reverses a list has the following representation in
  2894. virtual code.
  2895. _T29_
  2896. [[`reverse']] = `((nil,nil),(nil,(nil,nil)))'
  2897. This function takes a list as an argument, and returns a the list
  2898. consisting of the same items in the reverse order. The semantics is
  2899. given by the following properties.
  2900. _P37_
  2901. [[`reverse']] `nil' = `nil'
  2902. _P38_
  2903. [[`reverse']] `(X,Y)' = [[`cat']] ([[`reverse']] `Y,(X,nil)')
  2904. 
  2905. File: avram.info, Node: Distribute, Next: Transpose, Prev: Reverse, Up: List Functions
  2906. 2.7.14.3 Distribute
  2907. ...................
  2908. The function with the following virtual code representation is
  2909. frequently useful for manipulating lists.
  2910. _T30_
  2911. `distribute' = `(((nil,nil),nil),nil)'
  2912. This function takes a pair whose right side represents a list, and
  2913. returns a list of pairs, with one pair for each item in the list. The
  2914. left side of each pair is the left side of the original argument, and
  2915. the right side is the corresponding item of the list. A semantics for
  2916. this operation is specified by the following properties.
  2917. _P39_
  2918. [[`distribute']] `(X,nil)' = `nil'
  2919. _P40_
  2920. [[`distribute']] `(X,(Y,Z))' = `((X,Y),'[[`distribute']] `(X,Z))'
  2921. 
  2922. File: avram.info, Node: Transpose, Prev: Distribute, Up: List Functions
  2923. 2.7.14.4 Transpose
  2924. ..................
  2925. The `transpose' operation has the following representation in virtual
  2926. code.
  2927. _T31_
  2928. [[`transpose']] = `((nil,nil),((nil,nil),(nil,nil)))'
  2929. This function takes a list of equal length lists as an argument, and
  2930. returns a list of lists as a result. In the resulting list, the first
  2931. item is the list of all first items of lists in the argument. The next
  2932. item is the list of all second items, and so on.
  2933. In the specification of the semantics, the `silly' mnemonic `flat'
  2934. is defined by `flat = reduce(cat,nil)' in the standard `silly' prelude,
  2935. which means that it flattens a list of lists into one long list.
  2936. _P41_
  2937. [[`transpose']] `X' = `nil' if [[`flat']] `X' = `nil'
  2938. _P42_
  2939. [[`transpose']] `X' = `('[[`map left']] `X,'[[`transpose']] [[`map
  2940. right']] `X)'
  2941. if [[`flat']] `X' is a non-`nil' tree
  2942. 
  2943. File: avram.info, Node: Exception Handling, Next: Interfaces to External Code, Prev: List Functions, Up: Virtual Code Semantics
  2944. 2.7.15 Exception Handling
  2945. -------------------------
  2946. In quite a few cases, the properties given for the operator up to this
  2947. point do not imply any particular result. A good example would be an
  2948. expression such as [[`left']] `nil', which appears to represent the
  2949. left side of an empty pair. It can be argued that expressions like this
  2950. have no sensible interpretation and should never be used, so it would
  2951. be appropriate to leave them undefined. On the other hand, attempts to
  2952. evaluate such expressions occur frequently by mistake, and in any case,
  2953. the virtual machine emulator should be designed to do something
  2954. reasonable about them if only for the sake of reporting the error. The
  2955. chosen remedy for this situation addresses the need for error
  2956. reporting, and also turns out to be useful in other ways.
  2957. * Menu:
  2958. * A Hierarchy of Sets::
  2959. * Operator Generalization::
  2960. * Error Messages::
  2961. * Expedient Error Messages::
  2962. * Computable Error Messages::
  2963. * Exception Handler Usage::
  2964. 
  2965. File: avram.info, Node: A Hierarchy of Sets, Next: Operator Generalization, Prev: Exception Handling, Up: Exception Handling
  2966. 2.7.15.1 A Hierarchy of Sets
  2967. ............................
  2968. As indicated already, the virtual machine represents all functions and
  2969. data as members of a set satisfying the properties in *note Raw
  2970. Material::, namely a `nil' element and a `cons' operator for
  2971. constructing trees or nested pairs of `nil'. However, it will be
  2972. necessary to distinguish the results of computations that go wrong for
  2973. exceptional reasons from normal results. Because any tree in the set
  2974. could conceivably represent a normal result, we need to go outside the
  2975. set to find an unambiguous representation of exceptional results.
  2976. Because there may be many possible exceptional conditions, it will
  2977. be helpful to have a large set of possible ways to encode them, and in
  2978. fact there is no need to refrain from choosing a countably infinite
  2979. set. Furthermore, it will be useful to distinguish between different
  2980. levels of severity among exceptional conditions, so for this purpose a
  2981. countably infinite hierarchy of mutually disjoint sets is used.
  2982. In order to build on the theory already developed, the set that has
  2983. been used up to this point will form the bottom level of the hierarchy,
  2984. and its members will represent normal computational results. The
  2985. members of sets on the higher levels in the hierarchy represent
  2986. exceptional results. To avoid ambiguity, the term "trees" is reserved
  2987. for members of the bottom set, as in "for any tree `X' ...". Unless
  2988. otherwise stated, variables like `X' and `Y' are universally quantified
  2989. over the bottom set only.
  2990. Because each set in the hierarchy is countably infinite, it is
  2991. isomorphic to the bottom set. With respect to an arbitrary but fixed
  2992. bijection between them, let `X_N' denote the image in the `N'th level
  2993. set of a tree `X' in the bottom set. The level numbers in this notation
  2994. start with zero, and we take `X_0' to be synonymous with `X'. For good
  2995. measure, let `(X_N)_M' = `X_(N+M)'.
  2996. 
  2997. File: avram.info, Node: Operator Generalization, Next: Error Messages, Prev: A Hierarchy of Sets, Up: Exception Handling
  2998. 2.7.15.2 Operator Generalization
  2999. ................................
  3000. Each set in the hierarchy induces a structure preserving `cons' operator,
  3001. denoted `cons_N' for the `N'th level set, and satisfying this equation.
  3002. _E10_
  3003. `cons_N(X_N,Y_N)' = `(cons(X,Y))_N'
  3004. It will be convenient to generalize all of these `cons' operators to be
  3005. defined on members of other sets than their own.
  3006. _E11_
  3007. For `M' greater than `N', `cons_N(X_M,Y_P)' = `X_M'
  3008. In this equation, `P' is unrestricted. The intuition is that if the
  3009. left operand of a `cons' is the result of a computation that went wrong
  3010. due to an exceptional condition (more exceptional than `N', the level
  3011. already in effect), then the exceptional result becomes the whole
  3012. result.
  3013. It is tempting to hazard a slightly stronger statement, which is that
  3014. this equation holds even if `Y_P' is equal to some expression `F X'
  3015. that is undefined according to the operator semantics. This stipulation
  3016. would correspond to an implementation in which the right operand isn't
  3017. evaluated after an error is detected in the left, but there are two
  3018. problems with it.
  3019. * This semantics might unreasonably complicate a concurrent
  3020. implementation of the virtual machine. If evaluation leads to
  3021. non-termination in some cases where the result is undefined (as it
  3022. certainly would in any possible implementation consistent with
  3023. cases where it's defined), then the mechanism that evaluates the
  3024. right side of a pair must be interruptible in case an exception is
  3025. detected in the left.
  3026. * It is beyond the expressive power of the present mathematical
  3027. framework to make such a statement, because it entails universal
  3028. quantification over non-members of the constructed sets, which
  3029. includes almost everything.
  3030. Nevertheless, the implementation in `avram' is sequential and does
  3031. indeed behave as proposed, with no practical difficulty. As for any
  3032. deficiency in the theory, it could be banished by recasting the
  3033. semantics in terms of a calculus of expressions with formal rules of
  3034. manipulation. An operand to the `cons' operator would be identified not
  3035. with a member of a semantic domain, but with the expression used to
  3036. write it down, and then even "undefinedness" could be defined. However,
  3037. the present author's preference in computing as in life is to let some
  3038. things remain a mystery rather than to abandon the quest for meaning
  3039. entirely.
  3040. A comparable condition applies in cases where the right side of a
  3041. pair represents an exceptional result.
  3042. _E12_
  3043. For `M' greater than `N', `cons_N(X_N,Y_M)' = `Y_M'
  3044. Whereas an infinitude of `cons' operators has been needed, it will be
  3045. possible to get by with only one invisible operator, as before, by
  3046. generalizing it in the following way to operands on any level of the
  3047. hierarchy.
  3048. _P43_
  3049. `F_N X_N' = `(F X)_N'
  3050. _P44_
  3051. For distinct `N' and `M', `F_N X_M' = `X_M'
  3052. That is, the result of evaluating two operands on the same level is the
  3053. image relative to that level of the result of their respective images on
  3054. the bottom level, but the result of evaluating two operands on different
  3055. levels is the same as the right operand.
  3056. 
  3057. File: avram.info, Node: Error Messages, Next: Expedient Error Messages, Prev: Operator Generalization, Up: Exception Handling
  3058. 2.7.15.3 Error Messages
  3059. .......................
  3060. The basic strategy for representing the results of exceptional
  3061. conditions arising from the evaluation of operands on a given level of
  3062. the hierarchy will be to use an error message corresponding to the image
  3063. of a list of character strings on the level above.
  3064. Unfortunately, the official `silly' standard does not define
  3065. character constants, but they are available as a vendor specific
  3066. extension in `silly-me' (millennium edition), where character strings may
  3067. be enclosed in single quotes. The value of the semantic function
  3068. [[...]] in the case of a character string is the list of
  3069. representations of the characters, based on *note Character Table:: and
  3070. *note Representation of Numeric and Textual Data::.
  3071. For the sake of consistency, each standard error message is a list of
  3072. character strings, even though the list has only one string in it. If
  3073. any exceptional condition is the result of a computation, it is written
  3074. to standard error by `avram' as the list of character strings it
  3075. represents.
  3076. _P45_
  3077. ([[`compare']] `nil')`_N' = [[`('invalid
  3078. comparison',nil)']]`_(N+1)'
  3079. _P46_
  3080. ([[`left']] `nil')`_N' = [[`('invalid
  3081. deconstruction',nil)']]`_(N+1)'
  3082. _P47_
  3083. ([[`right']] `nil')`_N' = [[`('invalid
  3084. deconstruction',nil)']]`_(N+1)'
  3085. _P48_
  3086. (([[`fan']] `F') `nil')`_N' = [[`('invalid
  3087. deconstruction',nil)']]`_(N+1)'
  3088. _P49_
  3089. ([[`member']] `nil')`_N' = [[`('invalid
  3090. membership',nil)']]`_(N+1)'
  3091. _P50_
  3092. ([[`distribute']] `nil')`_N' = [[`('invalid
  3093. distribution',nil)']]`_(N+1)'
  3094. _P51_
  3095. ([[`cat']] `nil')`_N' = [[`('invalid
  3096. concatenation',nil)']]`_(N+1)'
  3097. _P52_
  3098. ([[`meta']] `nil')`_N' = [[`('invalid recursion',nil)']]`_(N+1)'
  3099. Note that by virtue of property _P44_, there is no need for an
  3100. application to make explicit checks for exceptional results at any
  3101. point, because the exceptional result propagates through to the output
  3102. of any function composed with the one that incurred it. For example, an
  3103. application of the form `h = compose(f,right)', which will cause an
  3104. invalid deconstruction error if applied in filter mode to an empty file,
  3105. imposes no requirement that `f' be written to accommodate that
  3106. possibility (i.e., by checking for it) in order for the error to be
  3107. reported properly. The following proof demonstrates that the meaning of
  3108. `f' is irrelevant to the result.
  3109. [[`compose(f,right)']]`_0' `nil_0'
  3110. = [[`f']]`_0' [[`right']]`_0' `nil'`_0'
  3111. = [[`f']]`_0' [[`('invalid deconstruction',nil)']]`_1'
  3112. = [[`('invalid deconstruction',nil)']]`_1'
  3113. In an application `h = compose(f,g)', the input validation therefore
  3114. may be confined to the "front end", `g'.
  3115. It will be recalled from the discussions of `recur' (*note Recur::) and
  3116. `transpose' (*note Transpose::) that the semantics of virtual code
  3117. involving these forms is defined in terms of the `field' format for
  3118. deconstruction functions (*note Field::), which depends implicitly on
  3119. the semantics of `left' and `right', being a generalization of them. An
  3120. invalid deconstruction message could therefore result from applications
  3121. incorporating any of the forms of `recur', `transpose', or `field'.
  3122. Invalid deconstructions could also arise from the `replace' operation (*note
  3123. Replace::), which is used for assignment (*note Assignment::), because
  3124. `replace' is defined by virtual code, except as noted next.
  3125. 
  3126. File: avram.info, Node: Expedient Error Messages, Next: Computable Error Messages, Prev: Error Messages, Up: Exception Handling
  3127. 2.7.15.4 Expedient Error Messages
  3128. .................................
  3129. Because there are so many ways to cause an invalid deconstruction, this
  3130. message is the most common in practice and therefore the least
  3131. informative. As a matter of convenience, `avram' takes the liberty of a
  3132. slight departure from the virtual machine specification as written
  3133. hitherto, and employs the following messages when invalid
  3134. deconstructions occur respectively in the cases of recursion,
  3135. transposition, and assignment.
  3136. * `invalid recursion'
  3137. * `invalid transpose'
  3138. * `invalid assignment'
  3139. That is, this section contradicts and supersedes what is stated at the
  3140. end of *note Error Messages:: and implied by the operator properties
  3141. _P14_, _P16_, and _P42_. It is also possible that user applications may
  3142. modify the error messages by methods described in *note Computable
  3143. Error Messages::.
  3144. Whereas these three cases constitute an expedient variation on the
  3145. semantics, there is another sense in which no possible implementation
  3146. could conform faithfully to the specification. When an evaluation can
  3147. not be carried out because of insufficient space on the host machine,
  3148. one of the following error messages may be the result.
  3149. * `memory overflow'
  3150. * `counter overflow'
  3151. These messages are treated in the same way as those that are caused by
  3152. programming errors, and propagate to the final result written to
  3153. standard error without any specific consideration by the application
  3154. developer. The latter occurs only in connection with the built in weight
  3155. function (*note Weight::). Other messages listed in *note Application
  3156. Programming Errors:: are also of this ilk.
  3157. 
  3158. File: avram.info, Node: Computable Error Messages, Next: Exception Handler Usage, Prev: Expedient Error Messages, Up: Exception Handling
  3159. 2.7.15.5 Computable Error Messages
  3160. ..................................
  3161. The automatic generation and reporting of error messages provides a
  3162. reasonable default behavior for applications that do not consider
  3163. exceptional conditions. All applications and their input data are
  3164. ordinarily members of the bottom level set in the hierarchy (*note A
  3165. Hierarchy of Sets::). The error messages caused by invalid operations
  3166. on this level are on the first level above the bottom, which are
  3167. recognized as such and written to standard error without intervention
  3168. from the application. However, there are two drawbacks to this style of
  3169. dealing with exceptions.
  3170. * An application developer may wish to translate error messages into
  3171. terms that are meaningful to the user, not only by literally
  3172. translating them from English to the local vernacular, but perhaps
  3173. by relating the particular exceptional condition to application
  3174. specific causes. While it is convenient for the "back end" code
  3175. not to be required to intervene in the error reporting, it would
  3176. be most inconvenient for it not to be able to do so.
  3177. * Some application specific errors might not correspond directly to
  3178. any of the particular conditions detected automatically due to
  3179. invalid operations, for example a semantic error in a
  3180. syntactically correct input file. It might be convenient in such
  3181. cases for an application to be able to define its own error
  3182. messages but still have them reported automatically like the built
  3183. in messages.
  3184. These situations suggest a need for some ability on the part of an
  3185. application to operate on error messages themselves. Based on the
  3186. operator semantics given so far, such an application is inexpressible,
  3187. because for any application `F_0' and error message `X_1', property
  3188. _P44_ stipulates `F_0 X_1' = `X_1', meaning that the resulting error
  3189. message is unchanged. Therefore, we need to define another basic
  3190. property of the operator.
  3191. The following form of virtual code is used in applications that may
  3192. need to operate on error messages.
  3193. _T32_
  3194. [[`guard']] `(F,G)' = `((nil,F),G)'
  3195. Code in this form has the following semantics.
  3196. _P53_
  3197. ([[`guard']] `(F,G)')`_N' `X_P' = `G_(N+1) F_N X_P'
  3198. The intuitive explanation is that `F' is the main part of the
  3199. application, and `G' is the part of the application that operates on
  3200. the error message that comes from `F' if an exception occurs while it
  3201. is being evaluated (i.e., the "exception handler"). Typically the
  3202. exception handler code implements a function that takes an error
  3203. message as an argument and returns an error message as a result.
  3204. Where there is no exception, the exception handler `G_(N+1)' is
  3205. never used, because its argument will be on level `N', and therefore
  3206. unaffected by an application on level `N+1'.
  3207. Exception handlers may have their own exception handlers, which will
  3208. be invoked if the evaluation of the exception handler causes a further
  3209. exception. Such an exception corresponds semantically to a value on the
  3210. next level of the hierarchy of sets.
  3211. 
  3212. File: avram.info, Node: Exception Handler Usage, Prev: Computable Error Messages, Up: Exception Handling
  3213. 2.7.15.6 Exception Handler Usage
  3214. ................................
  3215. One way for this feature of the virtual machine to be used is to
  3216. intercept and translate error messages to a more meaningful form. An
  3217. application guarded as shown below causes messages of invalid
  3218. deconstruction to be changed to `'syntax error''.
  3219. `main = guard(
  3220. application,
  3221. conditional(
  3222. bu(compare,('invalid deconstruction',nil)),
  3223. (constant ('syntax error',nil),identity)))'
  3224. The conditional compares its argument to the error message for an invalid
  3225. deconstruction, and if it matches, the syntax error message is
  3226. returned, but otherwise the original message is returned. Note that an
  3227. error message must be in the form of a list of character strings, so
  3228. that it can be printed. Although the message of `'syntax error'' might
  3229. not be very informative, at least it looks less like a crash. A real
  3230. application should of course strive to do better than that.
  3231. Exception handling features of the virtual machine can also be
  3232. adapted by applications to raise their own exceptions with customized
  3233. messages.
  3234. error_messenger =
  3235. guard(compose(compare,constant nil),constant ('syntax error',nil))
  3236. This code fragment implements a function that causes a message of
  3237. `'syntax error'' to be reported for any possible input. This code
  3238. works by first causing an invalid comparison and then substituting its
  3239. own error message. A function that always causes an error is not useful
  3240. in itself, but might be used as part of an application in the following
  3241. form.
  3242. main = conditional(validation,(application,error_messenger))
  3243. In this case, the application checks the validity of the input with a
  3244. predicate, and invokes the error messenger if it is invalid.
  3245. Although the previous examples return a fixed error message for each
  3246. possible kind of error, it is also possible to have error messages that
  3247. depend on the input data, as the next example shows.
  3248. main = (hired apply)(
  3249. compose(
  3250. bu(guard,some_application),
  3251. (hired constant)(constant 'invalid input was:',identity)),
  3252. identity)
  3253. If the application causes an exception for any reason, the error message
  3254. returned will include a complete listing of the input, prefaced by the
  3255. words `'invalid input was:''. This particular example works only if the
  3256. input is a list of character strings, but could be adapted for other
  3257. types of data by substituting an appropriate formatting function for the
  3258. first identity. The formatting function would take the relevant data
  3259. type to a list of character strings. Another possible variation would
  3260. be to concatenate the invalid input listing with the error message that
  3261. was generated, rather than just replacing it.
  3262. As the last example may suggest, exception handlers turn out to be an essential
  3263. debugging tool for functional programs, making them as easy to debug as
  3264. imperative programs if not more so. This example forms the basis for a
  3265. higher order function that wraps any given function with an exception
  3266. handler that prints the argument causing it to crash. For arguments not
  3267. causing a crash, the behavior is unchanged. Alternatively, code
  3268. implementing a function that unconditionally reports its argument in an
  3269. error message can be inserted at a strategic point in the application
  3270. code similarly to a print statement. Finally, inspired use of exception
  3271. handlers that concatenate their messages with previously generated
  3272. messages can show something like a parameter stack dump when a
  3273. recursively defined function crashes. These are all matters for a
  3274. language designer and are not pursued further in this document.
  3275. 
  3276. File: avram.info, Node: Interfaces to External Code, Next: Vacant Address Space, Prev: Exception Handling, Up: Virtual Code Semantics
  3277. 2.7.16 Interfaces to External Code
  3278. ----------------------------------
  3279. A few other combinators have been incorporated into the virtual machine
  3280. as alternatives to the style of interactive applications described in
  3281. *note Output From Interactive Applications::. These make it possible to
  3282. interface with external libraries and applications either by a simple
  3283. function call, or by executing a run-time generated transducer as
  3284. described previously. In either case, there is no need for any
  3285. particular command line options to specify interactive invocation, nor
  3286. for the application to be designed that way from the outset. Existing
  3287. virtual code applications may therefore be enhanced to make use of
  3288. these features without radical changes.
  3289. To account for these additional capabilities, it is not entirely
  3290. adequate to continue defining the virtual machine semantics in terms of
  3291. a mathematical function, but it is done nevertheless due to the lack of
  3292. any appealing alternative. Although most library functions are in fact
  3293. functions in the sense that their outputs are determined by their
  3294. arguments, they defy a concise specification within the present
  3295. mathematical framework, especially insofar as they may involve finite
  3296. precision floating point numbers. More problematically, the effect of
  3297. interaction with a shell is neither well defined nor deterministic.
  3298. The descriptions that follow presuppose a computational procedure
  3299. associated with the following definitions but leave its exact nature
  3300. unspecified.
  3301. * Menu:
  3302. * Library combinator::
  3303. * Have combinator::
  3304. * Interaction combinator::
  3305. 
  3306. File: avram.info, Node: Library combinator, Next: Have combinator, Prev: Interfaces to External Code, Up: Interfaces to External Code
  3307. 2.7.16.1 Library combinator
  3308. ...........................
  3309. The simplest and fastest method of interfacing to an external library
  3310. is by way of a virtual machine combinator called `library'. It takes
  3311. two non-empty character strings as arguments to a virtual code program
  3312. of the form implied by the following property.
  3313. _T33_
  3314. [[`library']] (`X',`Y') = `((nil,nil),((X,Y),(nil,nil)))'
  3315. Intuitively, X is the name of a library and Y is the name of a function
  3316. within the library. For example, if X is `'math'' and Y is `'sqrt'',
  3317. then `library'(X,Y) represents the function that computes the square
  3318. root of a floating point number as defined by the host machine's native
  3319. C implementation, normally in IEEE double precision format. Different
  3320. functions and libraries may involve other argument and result types,
  3321. such as complex numbers, arrays, sparse matrices, or arbitrary
  3322. precision numbers. A list of currently supported external library names
  3323. with their functions and calling conventions is given in *note External
  3324. Libraries::.
  3325. On the virtual code side, all function arguments and results
  3326. regardless of their types are encoded as nested pairs of `nil', as
  3327. always, and may be manipulated or stored as any other data, including
  3328. storage and retrieval from files in `.avm' virtual code format (*note
  3329. File Format::). However, on the C side, various memory management and
  3330. caching techniques are employed to maintain this facade while allowing
  3331. the libraries to operate on data in their native format. The details
  3332. are given more fully in the API documentation, particularly in *note
  3333. Type Conversions:: and *note External Library Maintenance::.
  3334. While this style is fast and convenient, it is limited either to
  3335. libraries that have already been built into the virtual machine, or to
  3336. those for which the user is prepared to implement a new interface
  3337. module in C as described in *note Implementing new library functions::.
  3338. 
  3339. File: avram.info, Node: Have combinator, Next: Interaction combinator, Prev: Library combinator, Up: Interfaces to External Code
  3340. 2.7.16.2 Have combinator
  3341. ........................
  3342. As virtual machine interfaces to external libraries accumulate faster
  3343. than they can be documented and may vary from one installation to
  3344. another, it is helpful to have a way of interrogating the virtual
  3345. machine for an up to date list of the installed libraries and
  3346. functions. A combinator called `have' can be used to test for the
  3347. availability of a library function. It takes the form
  3348. _T34_
  3349. [[`have']] (`X',`Y') = `((nil,nil),((nil,X),(nil,Y)))'
  3350. where X is the name of a library and Y is the name of a function within
  3351. the library encoded as character strings. For example, if X is
  3352. `'mtwist'' and Y is `'u_disc'' (for the natural random number generator
  3353. function in the Mersenne twistor library) then `have(X,Y)' is a
  3354. function that returns a non-empty value if an only if that library is
  3355. installed and that function is available within it. The actual argument
  3356. to the function is ignored as the result depends only on the installed
  3357. virtual machine configuration. In this sense, it acts like a `constant'
  3358. combinator.
  3359. One way for this combinator to be used is in code of the form
  3360. portable_rng =
  3361. conditional(
  3362. have('mtwist','u_disc'),
  3363. library('mtwist','u_disc'),
  3364. some_replacement_function)
  3365. which will use the library function if available but otherwise use a
  3366. replacement function. Code in this form makes the decision at run time,
  3367. but it is also possible to express the function such that the check for
  3368. library presence is made at compile time, as the following example
  3369. shows, which will imply a slight improvement in performance.
  3370. non_portable_rng =
  3371. apply(
  3372. conditional(
  3373. have('mtwist','u_disc'),
  3374. constant library('mtwist','u_disc'),
  3375. constant some_replacement_function),
  3376. 0)
  3377. This program would be non-portable in the sense that it would need to
  3378. be recompiled for each installation if there were a chance that some of
  3379. them might have the `mtwist' library and some might not, whereas the
  3380. previous example would be binary compatible across all of them. (1)
  3381. The actual value returned by a function `have(foo,bar)' is the list
  3382. of pairs of strings `<(foo,bar)>' if the function is available, or the
  3383. empty list otherwise. A non-empty list is represented as a pair
  3384. `(head,tail)', and an empty list as `nil'. The angle bracket notation
  3385. `<a,b,c...>' used here is an abbreviation for `(a,(b,(c...nil)))'.
  3386. Either or both arguments to the `have' combinator can be a wildcard,
  3387. which is the string containing a single asterisk, `'*''. In that case,
  3388. the list of all available matching library names and function names
  3389. will be returned. This feature can be used to find out what library
  3390. functions are available without already knowing their names.
  3391. If a library had a function named `'*'', which clashes with the wild
  3392. card string, the interpretation as a wild card would take precedence.
  3393. ---------- Footnotes ----------
  3394. (1) In practice both examples are equally portable because the
  3395. `mtwist' source is distributed with `avram' so all installations will
  3396. have it. Most libraries are distributed separately.
  3397. 
  3398. File: avram.info, Node: Interaction combinator, Prev: Have combinator, Up: Interfaces to External Code
  3399. 2.7.16.3 Interaction combinator
  3400. ...............................
  3401. A further combinator allows virtual code applications to interact
  3402. directly with any interactive console application using the `expect'
  3403. library. The mechanism is similar to that of interactive applications
  3404. documented in the *note Output From Interactive Applications::, but
  3405. attempts to be more convenient. Instead of being designed as an
  3406. interactive application, any virtual code application may use this
  3407. combinator to spawn a shell and interact with it in order to compute
  3408. some desired result.
  3409. The advantage of this combinator over the `library' combinator is
  3410. that it requires no modification of the virtual machine to support new
  3411. applications. It can also interact with applications that may reside on
  3412. remote servers, that are implemented languages other than C, or whose
  3413. source code is unavailable. For example, the GNU R statistical package
  3414. provides an interactive command to evaluate multivariate normal
  3415. distribution functions with an arbitrary covariance matrix, but the
  3416. corresponding function is not provided by the `Rmath' C library (or any
  3417. other free library, to the author's knowledge) because it is
  3418. implemented in interpreted code. This combinator makes it callable by
  3419. an `avram' virtual code application nevertheless. The disadvantage
  3420. compared to the `library' combinator is that there is more overhead in
  3421. spawning a process than simply making a call to a built in function,
  3422. and the programming interface is more complicated.
  3423. The combinator takes the form
  3424. _T35_
  3425. [[`interact']] F = `((nil,nil),(((nil,nil),nil),((nil,F),nil)))'
  3426. where F is the virtual code for a function that follows the same
  3427. protocol described in *note Output From Interactive Applications::,
  3428. except that it does not allow file output as described in *note Mixed
  3429. Modes of Interaction::. The argument `x' is ignored when the expression
  3430. `(interact f) x' is evaluated, similarly to the way the argument is
  3431. ignored in an expression like `(constant k) x'. The result returned is
  3432. a transcript of the dialogue that took place between `f' and the
  3433. externally spawned shell, represented as a list of lists of strings for
  3434. line oriented interaction, or a list of characters alternating with
  3435. lists of strings in the case of character oriented interaction.
  3436. The following example demonstrates a trivial use of the `interact'
  3437. combinator to spawn an `ftp' client, do an `ls' command, and then terminate
  3438. the session.
  3439. eof = <(nil,(nil,(((nil,nil),nil),(nil,nil))))>
  3440. demo =
  3441. interact conditional(
  3442. conditional(identity,constant false,constant true),
  3443. constant(0,<'ftp'>,<'ftp> '>),
  3444. conditional(
  3445. conditional(left,constant false,constant true),
  3446. constant(1,<'ls',''>,<'','ftp> '>),
  3447. conditional(
  3448. compose(compare,couple(left,constant 1)),
  3449. constant(2,<'bye',''>,<eof>),
  3450. constant nil)))
  3451. Some liberties are taken with `silly' syntax in this example, in the
  3452. way of using angle brackets to denote lists, and numbers to represent
  3453. states.
  3454. * The interacting transducer works by checking whether its argument
  3455. is empty (via the `identity' function used as a predicate in the
  3456. `conditional', which is then negated). In that case it returns the
  3457. triple containing the initial state of 0, the `ftp' shell command
  3458. to spawn the client, and the `'ftp> '' prompt expected when the
  3459. client has been spawned, both of the latter being lists of strings.
  3460. * If the argument is non-empty, then next it checks whether it is in
  3461. the initial state of 0, (via the `left' function used as a
  3462. predicate, referring to the state variable expected on the left of
  3463. any given `(state,input)' pair, also negated). If so, it returns
  3464. the triple containing the next state of 1, the `ls' command
  3465. followed by an empty string to indicate a line break, and the
  3466. expected prompt preceded by an empty string to match it only at
  3467. the beginning of a line.
  3468. * Finally, it checks for state 1, in which case it issues the `bye'
  3469. command to close the session, `eof' rather than a prompt to wait
  3470. for termination of the client, and a state of 2.
  3471. * In the remaining state of 2, which needn't be explicitly tested
  3472. because it is the only remaining possibility, the program returns a
  3473. `nil' value to indicate that the computation has terminated.
  3474. Deadlock would be possible at any point if either party did not
  3475. follow this protocol, but for this example it is not an issue. If an
  3476. expression of the form `demo x' were to be evaluated, then regardless
  3477. of the value of `x', the value of the result would be as shown below.
  3478. <
  3479. <'ftp'>,
  3480. <'ftp> '>,
  3481. <'ls',''>,
  3482. <'ls','Not connected.','ftp> '>,
  3483. <'bye',''>,
  3484. <'bye',''>>
  3485. That is, it would be a list of lists of strings, alternating between the
  3486. output of the interactor and the output of the `ftp' client. If the
  3487. spawned application had been something non-trivial such as a computer
  3488. algebra system or a command line web search utility, then it is easy to
  3489. see how functions using this combinator can leverage off a wealth of
  3490. available resources.
  3491. 
  3492. File: avram.info, Node: Vacant Address Space, Prev: Interfaces to External Code, Up: Virtual Code Semantics
  3493. 2.7.17 Vacant Address Space
  3494. ---------------------------
  3495. Not every possible pattern has been used by the virtual machine as a way
  3496. of encoding a function. The following patterns, where `A', `B', and `C'
  3497. are non-`nil' trees, do not represent anything useful.
  3498. unary forms
  3499. `((nil,nil),((nil,nil),(nil,((nil,A),nil))))'
  3500. `((nil,nil),((nil,nil),(nil,(nil,(nil,A)))))'
  3501. binary forms
  3502. `((nil,nil),((nil,nil),(A,B)))'
  3503. `((nil,nil),((A,nil),(B,nil)))'
  3504. `((nil,nil),((A,nil),(nil,B)))'
  3505. ternary forms
  3506. `((nil,nil),((A,B),(C,nil)))'
  3507. `((nil,nil),((A,B),(nil,C)))'
  3508. `((nil,nil),((A,nil),(B,C)))'
  3509. `((nil,nil),((nil,A),(B,C)))'
  3510. These patterns are detected by the virtual machine simply to avoid
  3511. blowing it up, but they always cause an error message to be reported.
  3512. _P55_
  3513. For `F' matching any of the first three trees in the above list,
  3514. `F_N X_N' = [[`('unsupported hook',nil)']]`_(N+1)'
  3515. _P56_
  3516. For the remaining trees `F' in the above list,
  3517. `F_N X_N' = [[`('unrecognized combinator (code
  3518. M)',nil)']]`_(N+1)'
  3519. Here, `M' is a numeric constant dependent on which tree `F' was used.
  3520. The unsupported hook message is meant to be more informative than the
  3521. unrecognized combinator message, suggesting that a feature intended for
  3522. future use is not yet available.
  3523. This list has been assembled for the benefit of readers considering
  3524. the addition of backward compatible extensions to the virtual code
  3525. semantics, who are undeterred by the facts that
  3526. * the computational model is already universal
  3527. * virtual code applications are already interoperable with all kinds
  3528. of high performance software having a text based or console
  3529. interface by way of the `interact' combinator
  3530. * an unlimited number of built in library functions can be added by
  3531. way of the `library' combinator as described in *note Implementing
  3532. new library functions::
  3533. * the C code in `avram' makes fairly intricate use of pointers with
  3534. a careful policy of reference counting and storage reclamation
  3535. * there is also a performance penalty incurred by further extensions
  3536. to the semantics, even for applications that don't use them,
  3537. because a pattern recognition algorithm in the interpreter has
  3538. more cases to consider.
  3539. Nevertheless, a new functional form combining a pair of functions to
  3540. be interpreted in a new way by the virtual machine could be defined
  3541. using any of the binary forms above, for example, with `A' as the
  3542. virtual code for one of the functions and `B' as that of the other.
  3543. Such a form would not conflict with any existing applications, provided
  3544. that both `A' and `B' are not `nil', which is true of any valid
  3545. representation for a function.
  3546. Virtual machine architects, take note. There are infinitely many
  3547. trees fitting these patterns, but it would be possible to use them up by
  3548. assigning them without adequate foresight. For example, if
  3549. interpretations were assigned to the four ternary forms, the three
  3550. binary forms, and one of the remaining unary forms, then the only
  3551. unassigned pattern could be of the form
  3552. `((nil,nil),((nil,nil),(nil,(nil,(nil,A)))))'
  3553. Assigning an interpretation to it would leave no further room for
  3554. backward compatible expansion. On the other hand, any tree of the
  3555. following form also fits the above pattern,
  3556. `((nil,nil),((nil,nil),(nil,(nil,(nil,(B,C))))))'
  3557. with any values for `B' and `C'. Different meanings could be chosen for
  3558. the case where both are `nil', both are non-`nil', or one is `nil' and
  3559. the other non-`nil', allowing two unary forms, one binary, and one
  3560. constant. If at least one of these patterns is reserved for future
  3561. enhancements, then a potentially inexhaustible supply of address space
  3562. remains and there will be no need for incompatible changes later.
  3563. 
  3564. File: avram.info, Node: Library Reference, Next: Character Table, Prev: Virtual Machine Specification, Up: Top
  3565. 3 Library Reference
  3566. *******************
  3567. Much of the code developed for `avram' may be reusable in other
  3568. projects, so it has been packaged into a library and documented in this
  3569. chapter. For ease of reference, this chapter is organized with a
  3570. separate section for each source file. For the most part, each source
  3571. file encapsulates an abstract type and a number of related functions,
  3572. except for a few cases where C makes such a design awkward. An attempt
  3573. has been made to present the sections in a readable order as far as
  3574. possible.
  3575. The documentation in this chapter is confined to the application
  3576. program interface (API), and does not delve unnecessarily into any
  3577. details of the implementation. A reader wishing to extend, modify, or
  3578. troubleshoot the library itself can find additional information in the
  3579. source code comments. These are more likely to be in sync with the code
  3580. than this document may be, and are more readily accessible to someone
  3581. working with the code.
  3582. Some general points pertaining to the library are the following.
  3583. * Unlike the previous chapter, this chapter uses the word "function"
  3584. in the C sense rather than the mathematical sense of the word.
  3585. * Internal errors are internal from the user's point of view, not
  3586. the developer's (*note Internal Errors::). Invoking these
  3587. functions in ways that are contrary to their specifications can
  3588. certainly cause internal errors (not to mention segfaults).
  3589. * The library is definitely not thread safe, and thread safety is not
  3590. a planned enhancement. The amount of locking required to make it
  3591. thread safe would probably incur an objectionable performance
  3592. penalty due to the complexity of the shared data structures
  3593. involved, in addition to being very difficult to get right. If you
  3594. need these facilities in a concurrent application, consider
  3595. spawning a process for each client of the library so as to keep
  3596. their address spaces separate.
  3597. * The library files are built from the standard source distribution
  3598. using GNU `libtool'. In the default directory hierarchy, they will
  3599. be found either in `/usr/lib/libavram.*' or in
  3600. `/usr/local/lib/libavram.*'. These directories will differ in a
  3601. non-standard installation.
  3602. * The header files will probably be located in either
  3603. `/usr/include/avm/*.h' or `/usr/local/include/avm/*.h' for a
  3604. standard installation.
  3605. * All exported functions, macros and constants are preceded with
  3606. `avm_', so as to reduce the chance of name clashes with other
  3607. libraries. Not all type declarations or field identifiers follow
  3608. this convention, because that would be far too tedious.
  3609. * The library header files are designed to be compatible with C++ but
  3610. have been tested only with C. Please refer to platform specific
  3611. documentation for further information on how to link library
  3612. modules with your own code.
  3613. * Menu:
  3614. * Lists::
  3615. * Characters and Strings::
  3616. * File Manipulation::
  3617. * Invocation::
  3618. * Version Management::
  3619. * Error Reporting::
  3620. * Profiling::
  3621. * Emulation Primitives::
  3622. * External Library Maintenance::
  3623. 
  3624. File: avram.info, Node: Lists, Next: Characters and Strings, Prev: Library Reference, Up: Library Reference
  3625. 3.1 Lists
  3626. =========
  3627. The basic data structure used for representing virtual code and data in
  3628. the `avram' library is declared as a `list'. The `list' type is a
  3629. pointer to a structure having a `head' field and a `tail' field, which
  3630. are also lists. The empty tree, `nil', is represented by the C constant
  3631. `NULL'. A tree of the form `cons(A,B)' is represented in C as a list
  3632. whose `head' is the representation of `A' and whose `tail' is the
  3633. representation of `B'.
  3634. A number of other fields in the structure are maintained
  3635. automatically and should not be touched. For that matter, even the
  3636. `head' and `tail' fields should be considered read-only. Because of
  3637. sharing, it is almost never valid to modify a list "in place", except
  3638. for cases that are already covered by library functions.
  3639. * Menu:
  3640. * Simple Operations::
  3641. * Recoverable Operations::
  3642. * List Transformations::
  3643. * Type Conversions::
  3644. * Comparison::
  3645. * Deconstruction Functions::
  3646. * Indirection::
  3647. * The Universal Function::
  3648. 
  3649. File: avram.info, Node: Simple Operations, Next: Recoverable Operations, Prev: Lists, Up: Lists
  3650. 3.1.1 Simple Operations
  3651. -----------------------
  3652. These functions are declared in the header file `lists.h', which should
  3653. be included in any C source file that uses them with a directive such
  3654. as `#include <avm/lists.h>'. All of these functions except the first
  3655. three have the potential cause a memory overflow. In that event, a
  3656. brief message is written to standard error and the process is killed
  3657. rather than returning to the caller. It is possible for client programs
  3658. requiring more robust behavior to do their own error handling by using
  3659. the alternative versions of these operations described in the next
  3660. section.
  3661. -- Function: void avm_initialize_lists ()
  3662. The function `avm_initialize_lists' should be called before any of
  3663. the other ones in this section is called, because it sets up some
  3664. internal data structures. Otherwise, the behavior of the other
  3665. functions is undefined.
  3666. -- Function: void avm_dispose (list FRONT)
  3667. This function deallocates the memory associated with a given list,
  3668. either by consigning it to a cache maintained internally by the
  3669. library, or by the standard `free' function if the cache is full.
  3670. Shared lists are taken into account and handled properly according
  3671. to a reference counting scheme. Lists should be freed only by this
  3672. function, not by using `free' directly.
  3673. -- Function: void avm_count_lists ()
  3674. If a client program aims to do its own storage reclamation, this
  3675. function can be called optionally at the end of a run when it is
  3676. believed that all lists have been freed. If any allocated lists
  3677. remain at large, a warning will be printed to standard error. This
  3678. function therefore provides a useful check for memory leaks.
  3679. Overhead is small enough that it is not infeasible to leave this
  3680. check in the production code.
  3681. -- Function: list avm_copied (list OPERAND)
  3682. A copy of the argument list is returned by this function. The copy
  3683. remains intact after the original is reclaimed. A typical use
  3684. might be for retaining part of a list after the rest of it is no
  3685. longer needed. In this example, a list `x' is traversed by a
  3686. hypothetical `visit' function to each item, which is then
  3687. immediately reclaimed.
  3688. while(x){
  3689. visit(x->head);
  3690. old_x = x;
  3691. x = avm_copied(x->tail); /* the right way */
  3692. avm_dispose(old_x);
  3693. }
  3694. This example allows each item in the list to be visited even as
  3695. previously visited items are reclaimed, because `x' is copied at
  3696. each iteration. This example contrasts with the next one, which
  3697. will probably cause a segmentation fault.
  3698. while(x){
  3699. visit(x->head);
  3700. old_x = x;
  3701. x = x->tail; /* the wrong way */
  3702. avm_dispose(old_x);
  3703. }
  3704. In the second example, a reference is made to a part of a list
  3705. which no longer exists because it has been deallocated.
  3706. In fact, the `avm_copied' function does nothing but increment a
  3707. reference count, so it is a fast, constant time operation that
  3708. requires no additional memory allocation. Semantically this action
  3709. is equivalent to creating a fresh copy of the list, because all
  3710. list operations in the library deal with reference counts properly.
  3711. -- Function: list avm_join (list LEFT, list RIGHT)
  3712. This function takes a pair of lists to a list in which the left is
  3713. the head and the right is the tail. It may need to use `malloc' to
  3714. allocate additional memory. If there is insufficient memory, an
  3715. error message is written to standard error and the program exits.
  3716. When the list returned by `avm_join' is eventually deallocated, the
  3717. lists from which it was built are taken with it and must not be
  3718. referenced again. For example, the following code is an error.
  3719. z = avm_join(x,y);
  3720. ...
  3721. avm_dispose(z);
  3722. avm_print_list(x); /* error here */
  3723. To accomplish something similar to this without an error, a copy of
  3724. `x' should be made, as in the next example.
  3725. z = avm_join(avm_copied(x),y);
  3726. ...
  3727. avm_dispose(z);
  3728. avm_print_list(x); /* original x still intact */
  3729. -- Function: void avm_enqueue (list *FRONT, list *BACK, list OPERAND)
  3730. A fast simple way of building a list head first is provided by the
  3731. `enqueue' function. The `front' is a pointer to the beginning of
  3732. the list being built, and the `back' is a pointer to the last
  3733. item. The recommended way to use it would be something like this.
  3734. front = back = NULL;
  3735. avm_enqueue(&front,&back,item);
  3736. avm_enqueue(&front,&back,next_item);
  3737. avm_enqueue(&front,&back,another_item);
  3738. ...
  3739. It might be more typical for the calls to `avm_enqueue' to appear
  3740. within a loop. In any case, after the above code is executed, the
  3741. following postconditions will hold.
  3742. front->head == item
  3743. front->tail->head == next_item
  3744. front->tail->tail->head == another_item
  3745. back->head == another_item
  3746. back->tail == NULL
  3747. The `avm_enqueue' function must never be used on a shared list,
  3748. because it modifies its arguments in place. The only practical way
  3749. to guarantee that a list is not shared is to initialize the
  3750. `front' and `back' to `NULL' as shown before the first call to
  3751. `avm_enqueue', and to make no copies of `front' or `back' until
  3752. after the last call to `avm_enqueue'.
  3753. Because a list built with `avm_enqueue' is not shared, it is one
  3754. of the few instances of a list that can have something harmlessly
  3755. appended to it in place. For example, if the next line of code were
  3756. back->tail = rest_of_list;
  3757. that would be acceptable assuming `rest_of_list' is not shared and
  3758. does not conceal a dangling or cyclic reference, and if nothing
  3759. further were enqueued.
  3760. The items that are enqueued into a list are not copied and will be
  3761. deallocated when the list is deallocated, so they must not be
  3762. referenced thereafter. A non-obvious violation of this convention
  3763. is implicit in the following code.
  3764. ...
  3765. avm_enqueue(&front,&back,x->head);
  3766. ...
  3767. avm_dispose(front);
  3768. avm_print_list(x); /* error here */
  3769. This code might cause a segmentation fault because of the
  3770. reference to `x' after its head has been deallocated. The
  3771. following code is subject to the same problem,
  3772. ...
  3773. avm_enqueue(&front,&back,x->head);
  3774. ...
  3775. avm_dispose(x);
  3776. avm_print_list(front); /* error here */
  3777. as is the following.
  3778. ...
  3779. avm_enqueue(&front,&back,x->head);
  3780. ...
  3781. avm_dispose(x); /* front is now impossible to reclaim */
  3782. avm_dispose(front);
  3783. The problem with the last example is that it is not valid even to
  3784. dispose of the same list more than once, albeit indirectly.
  3785. If part of a list is intended to be enqueued temporarily or
  3786. independently of its parent, the list should be copied explicitly,
  3787. as the following code demonstrates.
  3788. ...
  3789. avm_enqueue(&front,&back,avm_copied(x->head)); /* correct */
  3790. ...
  3791. avm_dispose(front);
  3792. avm_print_list(x);
  3793. -- Function: counter avm_length (list OPERAND)
  3794. A `counter' is meant to be the longest unsigned integer available on
  3795. the host machine, and is defined in `common.h', which is
  3796. automatically included whenever `lists.h' is included. The
  3797. `avm_length' function returns the number of items in a list. If a
  3798. list is `NULL', a value of zero is returned. There is a possibility
  3799. of a counter overflow error from this function (*note Overflow
  3800. Errors::), but only on a platform where the `counter' type is
  3801. shorter than the address length.
  3802. -- Function: counter avm_area (list OPERAND)
  3803. This function is similar to `avm_length', but it treats its
  3804. argument as a list of lists and returns the summation of their
  3805. lengths.
  3806. -- Function: list avm_natural (counter NUMBER)
  3807. This function takes a `counter' to its representation as a list, as
  3808. described in *note Representation of Numeric and Textual Data::.
  3809. That is, the number is represented as a list of bits, least
  3810. significant bit first, with each zero bit represented by `NULL'
  3811. and each one bit represented by a list whose `head' and `tail' are
  3812. `NULL'.
  3813. -- Function: void avm_print_list (list OPERAND)
  3814. The `avm_print_list' function is not used in any production code
  3815. but retained in the library for debugging purposes. It prints a
  3816. list to standard output using an expression involving only commas
  3817. and parentheses, as per the `silly' syntax (*note A Simple Lisp
  3818. Like Language::). The results quickly become unintelligible for
  3819. lists of any significant size. The function is recursively
  3820. defined and will crash in the event of a stack overflow, which
  3821. will occur in the case of very large or cyclic lists.
  3822. -- Function: list avm_position (list KEY, list TABLE, int *FAULT)
  3823. This function searches for a KEY in a short TABLE where each item
  3824. is a possible key.
  3825. If it's not found, a `NULL' value is returned. If it's found, a
  3826. list representing a character encoding according to *note
  3827. Character Table:: is returned.
  3828. The ascii code of the character corresponding to the returned list
  3829. is the position of the KEY in the TABLE, assuming position numbers
  3830. start with 1.
  3831. The table should have a length of 255 or less. If it's longer and
  3832. the KEY is found beyond that range, the higher order bits of the
  3833. position number are ignored.
  3834. The integer referenced by FAULT is set to a non-zero value in the
  3835. event of a memory overflow, which could happen in the course of
  3836. the list comparisons necessary for the search.
  3837. 
  3838. File: avram.info, Node: Recoverable Operations, Next: List Transformations, Prev: Simple Operations, Up: Lists
  3839. 3.1.2 Recoverable Operations
  3840. ----------------------------
  3841. The functions in this section are similar to the ones in the previous
  3842. section except with regard to error handling. Whereas the other ones
  3843. cause an error message to be printed and the process to exit in the
  3844. event of an overflow, these return to the caller, whose responsibility
  3845. it is to take appropriate action. The functions in both sections are
  3846. declared in `lists.h', and should be preceded by a call to
  3847. `avm_initialize_lists'.
  3848. -- Function: list avm_recoverable_join (list LEFT, list RIGHT)
  3849. This function is similar to `avm_join', but will return a `NULL'
  3850. pointer if memory that was needed can not be allocated. A `NULL'
  3851. pointer would never be the result of a join under normal
  3852. circumstances, so the overflow can be detected by the caller.
  3853. Regardless of whether overflow occurs, the arguments are
  3854. deallocated by this function and should not be referenced
  3855. thereafter.
  3856. -- Function: void avm_recoverable_enqueue (list *FRONT, list *BACK,
  3857. list OPERAND, int *FAULT)
  3858. This version of the enqueue function will dispose of the `OPERAND'
  3859. if there isn't room to append another item and set `*FAULT' to a
  3860. non-zero value. Other than that, it does the same as `avm_enqueue'.
  3861. -- Function: counter avm_recoverable_length (list OPERAND)
  3862. This function checks for arithmetic overflow when calculating the
  3863. length of a list, and returns a zero value if overflow occurs. The
  3864. caller can detect the error by noting that zero is not the length
  3865. of any list other than `NULL'. This kind of overflow is impossible
  3866. unless the host does not have long enough integers for its address
  3867. space.
  3868. -- Function: counter avm_recoverable_area (list OPERAND, int *FAULT)
  3869. This function is similar to `avm_area', except that it reacts
  3870. differently to arithmetic overflow. The `fault' parameter should be
  3871. the address of an integer known to the caller, which will be set
  3872. to a non-zero value if overflow occurs. In that event, the value
  3873. of zero will also be returned for the area. Note that it is
  3874. possible for non-empty lists to have an area of zero, so this
  3875. condition alone is not indicative of an error.
  3876. -- Function: list avm_recoverable_natural (counter NUMBER)
  3877. This function returns the `list' representation of a native
  3878. unsigned long integer, provided that there is enough memory,
  3879. similarly to the `avm_natural' function. Unlike that function,
  3880. this one will return a value of `NULL' rather than exiting the
  3881. program in the event of a memory overflow. The overflow can be
  3882. detected by the caller insofar as a `NULL' `list' does not
  3883. represent any number other than zero.
  3884. 
  3885. File: avram.info, Node: List Transformations, Next: Type Conversions, Prev: Recoverable Operations, Up: Lists
  3886. 3.1.3 List Transformations
  3887. --------------------------
  3888. Some functions declared in `listfuns.h' are used to implement the
  3889. operations described in *note List Functions::. These functions are able
  3890. to report error messages in the event of overflow or other exceptional conditions,
  3891. as described in *note Error Messages::. The error messages are
  3892. represented as lists and returned to the caller. The occurrence of an
  3893. error can be detected by the `*FAULT' flag being set to a non-zero
  3894. value. None of these functions ever causes a program exit except in the
  3895. event of an internal error.
  3896. -- Function: void avm_initialize_listfuns ()
  3897. This has to be called before any of the other functions in this
  3898. section is called. It initializes the error message lists, among
  3899. other things.
  3900. -- Function: void avm_count_listfuns ()
  3901. At the end of a run, a call to this function can verify that no
  3902. unreclaimed storage attributable to these functions persists. If it
  3903. does, a warning is printed to standard error. If `avm_count_lists'
  3904. is also used, it must be called after this function.
  3905. -- Function: list avm_reversal (list OPERAND, int *FAULT)
  3906. The reversal of the list is returned by this function if no
  3907. overflow occurs. A non-zero `*FAULT' and an error message are
  3908. returned otherwise. The original `OPERAND' still exists in its
  3909. original order after this function is called. The amount of
  3910. additional storage allocated is proportional only to the length of
  3911. the list, not the size of its contents.
  3912. -- Function: list avm_distribution (list OPERAND, int *FAULT)
  3913. This function performs the operation described in *note
  3914. Distribute::. The invalid distribution message is returned in the
  3915. event of a `NULL' operand. Otherwise, the returned value is the
  3916. distributed list. In any event, the `OPERAND' is unaffected.
  3917. -- Function: list avm_concatenation (list OPERAND, int *FAULT)
  3918. The `OPERAND' is treated as a pair of lists to be concatenated,
  3919. with the left one in the `head' field and the right one in the
  3920. `tail' field. The invalid concatenation message is returned in the
  3921. event of a `NULL' `OPERAND'. The result returned otherwise is the
  3922. concatenation of the lists, but the given `OPERAND' still exists
  3923. unchanged.
  3924. -- Function: list avm_transposition (list OPERAND, int *FAULT)
  3925. The operation performed by this function corresponds to that of
  3926. *note Transpose::. Unlike other functions in this section, the
  3927. operand passed to this function is deallocated, and must not be
  3928. referenced thereafter. The transposed list is accessible as the
  3929. returned value of this function. If the original `OPERAND' is
  3930. still needed after a call to `avm_transposition', only a copy of
  3931. it should be passed to it, obtained from `avm_copied'. The invalid
  3932. transpose error message is the result if the operand does not
  3933. represent a list of equal length lists.
  3934. -- Function: list avm_membership (list OPERAND, int *FAULT)
  3935. This function computes the membership predicate described in *note
  3936. Member::. The operand is a list in which the `tail' field is a
  3937. list that will be searched for the item in the `head'. If the item
  3938. is not found, a `NULL' list is returned, but otherwise a list with
  3939. `NULL' `head' and `tail' fields is returned. If the operand is
  3940. `NULL', an error message of invalid membership is returned and
  3941. `*FAULT' is set to a non-zero value.
  3942. The `avm_membership' function calls `avm_binary_comparison' in
  3943. order to compare lists, so the same efficiency and side-effect
  3944. considerations are relevant to both (*note Comparison::). It is not
  3945. necessary to `#include' the header file `compare.h' or to call
  3946. `avm_initialize_compare' in order to use `avm_membership', because
  3947. they will be done automatically.
  3948. -- Function: list avm_binary_membership (list OPERAND, list MEMBERS,
  3949. int *FAULT);
  3950. This function is the same as `avm_membership' except that it
  3951. allows the element and the set of members to be passed as separate
  3952. lists instead of being the head and the tail of the same list.
  3953. -- Function: list avm_measurement (list OPERAND, int *FAULT)
  3954. This function implements the operation described in *note
  3955. Weight::, which pertains to the weight of a tree. The returned
  3956. value of this function is a list encoding the weight as a binary
  3957. number, unless a counter overflow occurs, in which case it's an
  3958. error message. As noted previously, the weight of a tree can
  3959. easily be exponentially larger than the amount of memory it
  3960. occupies, but this function uses native integer arithmetic for
  3961. performance reasons. Hence, a counter overflow is a real
  3962. possibility.
  3963. 
  3964. File: avram.info, Node: Type Conversions, Next: Comparison, Prev: List Transformations, Up: Lists
  3965. 3.1.4 Type Conversions
  3966. ----------------------
  3967. External library functions accessed by the `library' combinator as
  3968. explained in *note Library combinator:: may operate on data other than
  3969. the `list' type usually used by `avram', such as floating point numbers
  3970. and arrays, but a virtual code application must be able to represent
  3971. the arguments and results of these functions in order to use them. As a
  3972. matter of convention, a data structure occupying SIZE bytes of
  3973. contiguous storage on the host machine appears as a list of length SIZE
  3974. to a virtual code application, in which each item corresponds to a
  3975. byte, and is represented according to *note Character Table::.
  3976. In principle, a virtual code application invoking a library function
  3977. to operate on a contiguous block of data, such as an IEEE double
  3978. precision number, for example, would construct a list of eight
  3979. character representations (one for each byte in a double precision
  3980. number), and pass this list as an argument to the library function. The
  3981. virtual machine would transparently convert this representation to the
  3982. native floating point format, evaluate the function, and convert the
  3983. result back to a list. In practice, high level language features
  3984. beyond the scope of this document would insulate the programmer from
  3985. some of the details on the application side as well.
  3986. To save the time of repeatedly converting between the list
  3987. representation and the contiguous native binary representation, the
  3988. structure referenced by a `list' pointer contains a `value' field which
  3989. is a `void' pointer to a block of memory of unspecified type, and
  3990. serves as a persistent cache of the value represented by the list. This
  3991. field normally should be managed by the API rather than being accessed
  3992. directly by client modules, but see the code in `mpfr.c' for an example
  3993. of a situation in which it's appropriate to break this rule. (Generally
  3994. these situations involve library functions operating on non-contiguous
  3995. data.)
  3996. * Menu:
  3997. * Primitive types::
  3998. * One dimensional arrays::
  3999. * Two dimensional arrays::
  4000. * Related utility functions::
  4001. 
  4002. File: avram.info, Node: Primitive types, Next: One dimensional arrays, Prev: Type Conversions, Up: Type Conversions
  4003. 3.1.4.1 Primitive types
  4004. .......................
  4005. A pair of functions in support of this abstraction is prototyped in
  4006. `listfuns.h'. These functions will be of interest mainly to developers
  4007. wishing to implement an interface to a new library module and make it
  4008. accessible on the virtual side by way of the `library' combinator
  4009. (*note Library combinator::).
  4010. -- Function: void *avm_value_of_list (list OPERAND, list *MESSAGE, int
  4011. *FAULT)
  4012. This function takes an OPERAND representing a value used by a
  4013. library function in the format described above (*note Type
  4014. Conversions::) and returns a pointer to the value.
  4015. The `value' field in the OPERAND normally will point to the block
  4016. of memory holding the value, and the OPERAND itself will be a list
  4017. of character representations whose binary encodings spell out the
  4018. value as explained above.
  4019. The `value' field need not be initialized on entry but it will be
  4020. initialized as a side effect of being computed by this function.
  4021. If it has been initialized due to a previous call with the same
  4022. OPERAND, this function is a fast constant time operation.
  4023. The caller should not free the pointer returned by this function
  4024. because a reference to its value will remain in the OPERAND. When
  4025. the OPERAND itself is freed by `avm_dispose' (*note Simple
  4026. Operations::), the value will go with it.
  4027. If an error occurs during the evaluation of this function, the
  4028. integer referenced by FAULT will be set to a non-zero value, and
  4029. the list referenced by MESSAGE will be assigned a representation of
  4030. a list of strings describing the error. The MESSAGE is freshly
  4031. created and should be freed by the caller with `avm_dispose' when
  4032. no longer needed.
  4033. Possible error messages are `<'missing value'>', in the case of an
  4034. empty OPERAND, `<'invalid value'>' in the case of an OPERAND that
  4035. is not a list of character representations, and `<'memory
  4036. overflow'>' if there was insufficient space to allocate the result.
  4037. -- Function: list avm_list_of_value (void *CONTENTS, size_t SIZE, int
  4038. *FAULT)
  4039. This function performs the inverse operation of
  4040. `avm_value_of_list', taking the address of an area of contiguously
  4041. stored data and its SIZE in bytes to a list representation. The
  4042. length of the list returned is equal to the number of bytes of
  4043. data, SIZE, and each item of the list is a character
  4044. representation for the corresponding byte as given by *note
  4045. Character Table::.
  4046. A copy of the memory area is made so that the original is no longer
  4047. needed and may be freed by the caller. A pointer to this copy is
  4048. returned by subsequent calls to `avm_value_of_list' when the
  4049. result returned by this function is used as the OPERAND parameter.
  4050. If there is insufficient memory to allocate the result, the integer
  4051. referenced by FAULT is set to a non-zero value, and a copy of the
  4052. message `<'memory overflow'>' represented as a list is returned.
  4053. This function could also cause a segmentation fault if it is passed
  4054. an invalid pointer or a SIZE that overruns the storage area.
  4055. However, it is acceptable to specify a SIZE that is less than the
  4056. actual size of the given memory area to construct a list
  4057. representing only the first part of it. The SIZE must always be
  4058. greater than zero.
  4059. 
  4060. File: avram.info, Node: One dimensional arrays, Next: Two dimensional arrays, Prev: Primitive types, Up: Type Conversions
  4061. 3.1.4.2 One dimensional arrays
  4062. ..............................
  4063. A couple of functions declared in `matcon.h' are concerned mainly with
  4064. one dimensional arrays or vectors. They have been used for vectors of
  4065. double precision and complex numbers, but are applicable to any base
  4066. type that is contiguous and of a fixed size.
  4067. The motivation for these functions is to enable a developer to
  4068. present an API to virtual code applications wherein external library
  4069. functions operating natively on one dimensional arrays of numbers are
  4070. seen from the virtual side to operate on lists of numbers. Lists are the
  4071. preferred container for interoperability with virtual code applications.
  4072. -- Function: void *avm_vector_of_list (list OPERAND, size_t ITEM_SIZE,
  4073. list *MESSAGE, int *FAULT)
  4074. This function calls `avm_value_of_list' (*note Primitive types::)
  4075. for each item of the OPERAND and puts all the values together into
  4076. one contiguous block, whose address is returned.
  4077. The given ITEM_SIZE is required to be the lengths of the items,
  4078. all necessarily equal, and is required only for validation. For
  4079. example, ITEM_SIZE is 8 for a list of double precision numbers,
  4080. because they occupy 8 bytes each and are represented as lists of
  4081. length 8.
  4082. The total number of bytes allocated is the product of ITEM_SIZE
  4083. and the length of the OPERAND. Unlike the case of
  4084. `avm_value_of_list' (*note Primitive types::), the result returned
  4085. by this function should be explicitly freed by the caller when no
  4086. longer needed.
  4087. Any errors such as insufficient memory cause the integer
  4088. referenced by FAULT to be assigned a non-zero value and the
  4089. MESSAGE to be assigned an error message represented as a list of
  4090. strings. An error message of `<'bad vector specification'>' is
  4091. possible in the case of an empty OPERAND or one whose item lengths
  4092. don't match the given ITEM_SIZE. Error messages caused by
  4093. `avm_value_of_list' can also be generated by this function. Any
  4094. non-empty error message should be reclaimed by the caller using
  4095. `avm_dispose' (*note Simple Operations::). If an error occurs, a
  4096. `NULL' pointer is returned.
  4097. -- Function: list avm_list_of_vector (void *VECTOR, int NUM_ITEMS,
  4098. size_t ITEM_SIZE, int *FAULT)
  4099. This function takes it on faith that an array of dimension
  4100. NUM_ITEMS in which each item occupies ITEM_SIZE bytes begins at
  4101. the address given by VECTOR. A list representation of each item in
  4102. the array is constructed by the function `avm_list_of_value'
  4103. (*note Primitive types::), and a list of all of the lists thus
  4104. obtained in order of their position in the array is returned.
  4105. In the event of any errors caused by `avm_list_of_value' or errors
  4106. due to insufficient memory, the error message is returned as the
  4107. function result, and the integer referenced by FAULT is assigned a
  4108. non-zero value. The error message is in the form of a list of
  4109. character string representations. A segmentation fault is possible if
  4110. VECTOR is not a valid pointer or if the array size implied by
  4111. misspecified values of NUM_ITEMS and ITEM_SIZE exceeds its actual
  4112. size.
  4113. 
  4114. File: avram.info, Node: Two dimensional arrays, Next: Related utility functions, Prev: One dimensional arrays, Up: Type Conversions
  4115. 3.1.4.3 Two dimensional arrays
  4116. ..............................
  4117. Several other functions in `matcon.h' are meant to support conversions
  4118. between matrices represented as lists of lists and arrays in a variety
  4119. of representations. Dense matrices either square or rectangular are
  4120. accommodated, and symmetric square matrices can be stored with
  4121. redundant entries omitted in either upper trangular or lower triangular
  4122. format.
  4123. Similarly to the vector operations (*note One dimensional arrays::)
  4124. these functions are intended to allow a developer to present an
  4125. interface to external libraries based on lists rather than arrays.
  4126. The preferred convention for virtual code applications is to
  4127. represent a matrix as a list of lists of entities (typically numbers),
  4128. with one list for each row of the matrix. For example, a 3 by 3 matrix
  4129. containing a value of `aij' in the `i'-th row and the `j'-th column
  4130. would be represented by this list of three lists.
  4131. <
  4132. <a11,a12,a13>,
  4133. <a21,a22,a23>,
  4134. <a31,a32,a33>>
  4135. Such a representation is convenient for manipulation by virtual machine
  4136. combinators, for example `transpose' (*note Transpose::), and is
  4137. readily identified with the matrix it represents.
  4138. If a matrix is symmetric (that is, with `aij' equal to `aji' for all
  4139. values of `i' and `j'), only the lower triangular portion needs to be
  4140. stored because the other entries are redundant. The list
  4141. representatation would be something like this.
  4142. <
  4143. <a11>,
  4144. <a21,a22>,
  4145. <a31,a32,a33>>
  4146. Another alternative for representing a symmetric matrix is to store
  4147. only the upper triangular portion. In this case, a list such as the
  4148. following would be used.
  4149. <
  4150. <a11,a12,a13>,
  4151. <a22,a23>,
  4152. <a33>>
  4153. The upper and lower triangular representations are distinguishable by
  4154. whether or not the row lengths form an increasing sequence.
  4155. In addition to representing symmetric matrices, these upper and lower triangular
  4156. forms are also appropriate for representing matrices whose remaining
  4157. entries are zero, such as the factors in an LU decomposition.
  4158. -- Function: void *avm_matrix_of_list (int SQUARE, int
  4159. UPPER_TRIANGULAR, int LOWER_TRIANGULAR, int COLUMN_MAJOR,
  4160. list OPERAND, size_t ITEM_SIZE, list *MESSAGE, int *FAULT)
  4161. This function converts a matrix in one of the list representations
  4162. above to a contiguous array according to the given specifications.
  4163. The array can contain elements of any fixed sized type of size
  4164. ITEM_SIZE. The memory for it is allocated by this function and it
  4165. should be freed by the caller when no longer needed.
  4166. The input matrix is given by the list parameter, OPERAND, and its
  4167. format is described by the integer parameters SQUARE,
  4168. UPPER_TRIANGULAR, and LOWER_TRIANGULAR. The number of bytes
  4169. occupied by each entry is given by ITEM_SIZE.
  4170. To the extent these specifications are redundant, they are used for
  4171. validation. If any of the following conditions is not met, the
  4172. integer referenced by FAULT is assigned a non-zero value and a copy
  4173. of the message `<'bad matrix specification'>' represented as a
  4174. list is assigned to the list referenced by MESSAGE. Errors are
  4175. also possible due to insufficient memory.
  4176. * The OPERAND must be a list of lists of lists such that each
  4177. item of each item is has a length of ITEM_SIZE, and its items
  4178. consist of character representations as required by
  4179. `avm_value_of_list' (*note Primitive types::).
  4180. * If the lengths of the top level lists in the OPERAND form an
  4181. increasing sequence, the lower triangular representation is
  4182. assumed and the LOWER_TRIANGULAR parameter must have a
  4183. non-zero value.
  4184. * If the lengths of the top level lists in the OPERAND form a
  4185. decreasing sequence, the upper triangular representation is
  4186. assumed and the UPPER_TRIANGULAR parameter must have a
  4187. non-zero value.
  4188. * At least one of UPPER_TRIANGULAR or LOWER_TRIANGULAR must be
  4189. zero.
  4190. * If SQUARE has a non-zero value, then either all items of the
  4191. OPERAND must have the same length as the operand, or if it's
  4192. triangular, then the longest one must have the same length as
  4193. the operand.
  4194. * If the OPERAND is neither square nor a triangular form, all
  4195. items of it are required to have the same length.
  4196. The parameters UPPER_TRIANGULAR or LOWER_TRIANGULAR may be set to
  4197. non-zero values even if the OPERAND is not in one of the upper or
  4198. lower triangular forms discussed above. In this case, the OPERAND
  4199. must be square or rectangular (i.e., with all items the same
  4200. length), and the following interpretations apply.
  4201. * If UPPER_TRIANGULAR is non-zero, the diagonal elements and the
  4202. upper triangular portion of the input matrix are copied to
  4203. the output. The lower triangle of the input is ignored and
  4204. the lower triangle of the output is left uninitialized.
  4205. * If LOWER_TRIANGULAR is non-zero, the diagonal elements and the
  4206. lower triangular portion of the input matrix are copied to
  4207. the output. The upper triangle of the input is ignored and
  4208. the upper triangle of the output is left uninitialized.
  4209. The COLUMN_MAJOR parameter affects the form of the output array.
  4210. If it is zero, then each row of the input matrix is stored in a
  4211. contiguous block of memory in the output array, and if it is
  4212. non-zero, each column is stored contiguously. The latter
  4213. representation is also known as Fortran order and may be required
  4214. by library functions written in Fortran.
  4215. In all cases when a triangular form is specified, part of the
  4216. output matrix is left uninitialized. The redundant entries may be
  4217. assigned if required by the `avm_reflect_matrix' function (*note
  4218. Related utility functions::).
  4219. -- Function: list avm_list_of_matrix (void *MATRIX, int ROWS, int
  4220. COLS, size_t ITEM_SIZE, int *FAULT)
  4221. This function performs an inverse operation to
  4222. `avm_matrix_of_list' by taking the address of a matrix stored as a
  4223. contiguous array in the parameter MATRIX and constructing the list
  4224. representation as discussed above. Only square and rectangular
  4225. matrices in row major order are supported, but see
  4226. `avm_matrix_transposition' for a way to convert between row major and
  4227. column major order (*note Related utility functions::).
  4228. The parameters ROWS, COLS, and ITEM_SIZE describe the form of the
  4229. matrix. The list returned as a result will have a length of ROWS,
  4230. and each item will be a list of length COLS. Each item of the
  4231. result corresponds to a row of the matrix, and each item of the
  4232. items represents the an entry of the matrix as a list of length
  4233. ITEM_SIZE. These items could be passed to `avm_value_of_list', for
  4234. example, to obtain their values (*note Primitive types::).
  4235. Memory is allocated by this function to create the list, which can
  4236. be reclaimed by `avm_dispose' (*note Simple Operations::). If
  4237. there is insufficient memory, the integer referenced by FAULT is
  4238. assigned a non-zero value and the result returned is a list
  4239. representation of the message `<'memory overflow'>'. The error
  4240. message be reclaimed by the caller as well using `avm_dispose'.
  4241. A packed storage representation for symmetric square matrices and triangular
  4242. matrices is of interest because it is used by some library functions,
  4243. notably those in `LAPACK', to save memory and thereby accommodate
  4244. larger problems. In this representation, column major order is assumed,
  4245. and either the lower or the upper triangle of the matrix is not
  4246. explicitly stored. For example, a lower triangular matrix whose list
  4247. representation corresponds to
  4248. <
  4249. <a11>,
  4250. <a21,a22>,
  4251. <a31,a32,a33>,
  4252. <a41,a42,a43,a44>>
  4253. would be stored according to the memory map
  4254. [a11 a21 a31 a41 a22 a32 a42 a33 a43 a44]
  4255. with `a11' at the beginning address. An upper triangular matrix
  4256. <
  4257. <a11,a12,a13,a14>,
  4258. <a22,a23,a24>,
  4259. <a33,a34>,
  4260. <a44>>
  4261. would be stored according to the memory map
  4262. [a11 a12 a22 a13 a23 a33 a14 a24 a34 a44].
  4263. A couple of functions converting between list representations and
  4264. packed array format are provided as described below.
  4265. -- Function: void *avm_packed_matrix_of_list (int UPPER_TRIANGULAR,
  4266. list OPERAND, int N, size_t ITEM_SIZE, list *MESSAGE, int
  4267. *FAULT)
  4268. If the OPERAND is a list in one of the triangular forms explained
  4269. above, then the UPPER_TRIANGULAR parameter must be consisitent
  4270. with it, being non-zero if the OPERAND is upper triangular and
  4271. zero otherwise.
  4272. If the OPERAND is not in a triangular form, then each item of the
  4273. operand must be a list of length N. In this case, the
  4274. UPPER_TRIANGULAR parameter indicates which triangle of the operand
  4275. should be copied to the result, and the other triangle is ignored.
  4276. In either case, the operand must have a length of N, and the items
  4277. of its items must be lists of length ITEM_SIZE containing
  4278. character representations as required by `avm_value_of_list'
  4279. (*note Primitive types::).
  4280. If the input parameters are inconsistent or if there is
  4281. insufficient memory to allocate the result, the integer referenced
  4282. by FAULT is assigned a non-zero value, and the list referenced by
  4283. MESSAGE is assigned a copy of the list representation of `<'bad
  4284. matrix specification'>' or `<'memory overflow'>', respectively. A
  4285. non-empty message must be reclaimed by the caller using
  4286. `avm_dispose' (*note Simple Operations::).
  4287. If there are no errors, the result is a pointer to a packed array
  4288. representation of the OPERAND as explained above. The memory for
  4289. this result is allocated by this function and should be freed by
  4290. the caller when no longer required. The number of bytes allocated
  4291. will be ITEM_SIZE * (N * (N + 1))/2.
  4292. -- Function: list avm_list_of_packed_matrix (int UPPER_TRIANGULER,void
  4293. *OPERAND, int N, size_t ITEM_SIZE, int *FAULT)
  4294. This function performs an inverse operation to that of
  4295. `avm_packed_matrix_of_list' given the address of a packed matrix
  4296. stored according to one of the memory maps discussed above. The
  4297. OPERAND parameter holds the address, the parameter N gives the
  4298. number of rows, and the UPPER_TRIANGULAR parameter specifies which
  4299. of the two possible memory maps to assume.
  4300. If there is sufficient memory, the result returned is a list in
  4301. one of the triangular forms described above, being upper
  4302. triangular if the UPPER_TRIANGULAR parameter is non-zero, with
  4303. values of length ITEM_SIZE taken from the array.
  4304. In the event of a memory overflow, the integer referenced by FAULT
  4305. is assigned a non-zero value and the result is a copy of the
  4306. message `<'memory overflow'>' represented as a list. A segmentation
  4307. fault is possible if this function is passed an invalid pointer or
  4308. dimension.
  4309. 
  4310. File: avram.info, Node: Related utility functions, Prev: Two dimensional arrays, Up: Type Conversions
  4311. 3.1.4.4 Related utility functions
  4312. .................................
  4313. A small selection of additional functions that are likely to be of use
  4314. to developers concerned with matrix operations has been incorporated
  4315. into the API to save the trouble of reinventing them, although doing so
  4316. would be straightforward. They are described in this section without
  4317. further motivation.
  4318. -- Function: void *avm_matrix_transposition (void *MATRIX, int ROWS,
  4319. int COLS, size_t ITEM_SIZE)
  4320. This function takes the address of an arbitrary rectangular MATRIX
  4321. represented as a contiguous array (not a list) and transposes it
  4322. in place. That is, this function transforms an M by N matrix to an
  4323. N by M matrix by exchanging the I,Jth element with the J,Ith
  4324. element for all values of I and J.
  4325. The numbers of rows and columns in the MATRIX are given by the
  4326. parameters ROWS and COLS, respectively, and the size of the
  4327. entries in bytes is given by ITEM_SIZE.
  4328. The MATRIX is assumed to be in row major order, but this function
  4329. is applicable to matrices in column major order if the caller passes
  4330. the number of columns in ROWS and the number of rows in COLS.
  4331. Alternatively, this function can be seen as a conversion between
  4332. the row major and the column major representation of a matrix. An M
  4333. by N matrix in row major order will be transformed to the same M
  4334. by N matrix in column order, or from column order to row order.
  4335. A notable feature of this function is that it allocates no memory
  4336. so there is no possibility of a memory overflow even for very large
  4337. matrices, unlike a naive implementation which would involve making
  4338. a temporary copy of the matrix. There is a possibility of a
  4339. segmentation fault if invalid pointers or dimensions are given.
  4340. -- Function: void *avm_matrix_reflection (int UPPER_TRIANGULAR, void
  4341. *MATRIX, int N, size_t ITEM_SIZE)
  4342. This function takes a symmetric square MATRIX of dimension N
  4343. containing entries of ITEM_SIZE bytes each and fills in the
  4344. redundant entries.
  4345. If UPPER_TRIANGULAR is non-zero, the upper triangle of the MATRIX
  4346. is copied to the lower triangle. If UPPER_TRIANGULAR is zero, the
  4347. lower triangular entries are copied to the upper triangle.
  4348. These conventions assume row major order. If the MATRIX is in column
  4349. major order, then the caller can either transpose it in place before
  4350. and after this function by `avm_matrix_transposition', or can
  4351. complement the value of UPPER_TRIANGULAR.
  4352. Note that this function may be unnecessary for `LAPACK' library
  4353. functions that ignore the redundant entries in a symmetric matrix,
  4354. because they can be left uninitialized, but it is included for the
  4355. sake of completeness.
  4356. -- Function: list *avm_row_number_array (counter M, int *FAULT)
  4357. A fast, memory efficient finite map from natural numbers to their
  4358. list representations can be obtained by using this function as an
  4359. alternative to `avm_natural' or `avm_recoverable_natural' when
  4360. repeated evaluations of numbers within a known range are required
  4361. (*note Simple Operations:: and *note Recoverable Operations::).
  4362. Given a positive integer M, this function allocates and returns an
  4363. array of M lists whose Ith entry is the list representation of the
  4364. number I as explained in *note Representation of Numeric and
  4365. Textual Data::.
  4366. An amount of memory proportional to M is used for the array and
  4367. its contents. If there is insufficient memory, a `NULL' value is
  4368. returned and the integer referenced by FAULT is set to a non-zero
  4369. value.
  4370. -- Function: void avm_dispose_rows (counter M, list *ROW_NUMBER)
  4371. This function reclaims an array ROW_NUMBER of size M returned by
  4372. `avm_row_number_array', and its contents if any. A `NULL' pointer
  4373. is allowed as the ROW_NUMBER parameter and will have no effect,
  4374. but an uninitialized pointer will cause a segmentation fault.
  4375. -- Function: void avm_initialize_matcon ();
  4376. This function initializes some static variables used by the
  4377. functions declared in `matcon.h' and should be called before any
  4378. of them is called or they might not perform according to
  4379. specifications.
  4380. -- Function: void avm_count_matcon ();
  4381. This function frees the static variables allocated by
  4382. `avm_initialize_matcon' and is used to verify the absence of
  4383. memory leaks. It should be called after the last call to any
  4384. functions in `matcon.h' but before `avm_count_lists' if the latter
  4385. is being used (*note Simple Operations::).
  4386. 
  4387. File: avram.info, Node: Comparison, Next: Deconstruction Functions, Prev: Type Conversions, Up: Lists
  4388. 3.1.5 Comparison
  4389. ----------------
  4390. The file `compare.h' contains a few function declarations pertaining to
  4391. the computation of the comparison predicate described in *note
  4392. Compare::. Some of the work is done by static functions in `compare.c'
  4393. that are not recommended entry points to the library.
  4394. -- Function: void avm_initialize_compare ()
  4395. This function should be called once before the first call to
  4396. `avm_comparison', as it initializes some necessary internal data
  4397. structures.
  4398. -- Function: void avm_count_compare ()
  4399. This function can be used to check for memory leaks, by detecting
  4400. unreclaimed storage at the end of a run. The data structures
  4401. relevant to comparison that could be reported as unreclaimed are
  4402. known as "decision" nodes, but these should always be handled
  4403. properly by the library without intervention. If `avm_count_lists'
  4404. is also being used, the call to this function must precede it.
  4405. -- Function: list avm_comparison (list OPERAND, int *FAULT)
  4406. This function takes a list operand representing a pair of trees and
  4407. returns a list representing the logical value of their equality.
  4408. If the operand is `NULL', a message of invalid comparison is
  4409. returned and the `*FAULT' is set to a non-zero value. If the
  4410. `head' of the operand is unequal to the `tail', a `NULL' value is
  4411. returned. If they are equal, a list is returned whose `head' and
  4412. `tail' are both `NULL'. The equality in question is structural rather
  4413. than pointer equality.
  4414. The list operand to this function may be modified by this
  4415. function, but not in a way that should make any difference to a
  4416. client program. If two lists are found to be equal, or if even two
  4417. sublists are found to be equal in the course of the comparison,
  4418. one of them is deallocated and made to point to the other. This
  4419. action saves memory and may make subsequent comparisons faster.
  4420. However, it could disrupt client programs that happen to be
  4421. holding stale list pointers.
  4422. As of `avram' version 0.6.0, a logical field called
  4423. `discontiguous' has been added to the `node' record type declared
  4424. in `lists.h', which is checked by the comparison function. If a
  4425. list node has its `discontiguous' field set to a non-zero value,
  4426. and if it also has a non-null `value' field, then it won't be
  4427. deallocated in the course of comparison even if it is found to be
  4428. equal to something else. This feature can be used by client
  4429. modules to create lists in which value fields refer to data
  4430. structures that are meant to exist independently of them. See
  4431. `mpfr.c' for an example.
  4432. This function is likely to have better performance and memory
  4433. usage than a naive implementation of comparison, for the above
  4434. reasons and also because of optimizations pertaining to comparison
  4435. of lists representing characters. Moreover, it is not subject to
  4436. stack overflow exceptions because it is not written in a recursive
  4437. style.
  4438. -- Function: list avm_binary_comparison (list LEFT_OPERAND, list
  4439. RIGHT_OPERAND, int *FAULT);
  4440. This function is the same as `avm_comparison' except that it
  4441. allows the left and right operands to be passed as separate lists
  4442. rather than taking them from the `head' and the `tail' of a single
  4443. list.
  4444. 
  4445. File: avram.info, Node: Deconstruction Functions, Next: Indirection, Prev: Comparison, Up: Lists
  4446. 3.1.6 Deconstruction Functions
  4447. ------------------------------
  4448. A fast native implementation of the deconstruction operation is provided by
  4449. the functions declared in `decons.h'.
  4450. -- Function: void avm_initialize_decons ()
  4451. This should be called prior to the first call to
  4452. `avm_deconstruction', so as to initialize some necessary internal
  4453. data structures. Results will be undefined if it is not.
  4454. -- Function: void avm_count_decons ()
  4455. For ecologically sound memory management, this function should be
  4456. called at the end of a run to verify that there have been no leaks
  4457. due to the deconstruction functions, which there won't be unless
  4458. the code in `decons.c' has been ineptly modified. An error message
  4459. to the effect of unreclaimed "points" could be the result
  4460. otherwise.
  4461. -- Function: list avm_deconstruction (list POINTER, list OPERAND, int
  4462. *FAULT)
  4463. Deconstructions are performed by this function, as described in
  4464. *note Field::. In the `silly' program notation (*note A Simple
  4465. Lisp Like Language::), this function computes the value of
  4466. ([[`field']] `POINTER') `OPERAND'.
  4467. For example, using the fixed list `avm_join(NULL,NULL)' as the
  4468. `POINTER' parameter will cause a copy of the operand itself to be
  4469. returned as the result. A `POINTER' equal to
  4470. `avm_join(NULL,avm_join(NULL,NULL))' will cause a copy of
  4471. `operand->tail' to be returned, and so on. A `NULL' `POINTER'
  4472. causes an internal error.
  4473. If the deconstruction is invalid, as in the case of the tail of an
  4474. empty list, the invalid deconstruction error message is returned
  4475. as the result, and the `*FAULT' parameter is set to a non-zero
  4476. value. The `*FAULT' parameter is also set to a non-zero value in
  4477. the event of a memory overflow, and the memory overflow message is
  4478. returned.
  4479. 
  4480. File: avram.info, Node: Indirection, Next: The Universal Function, Prev: Deconstruction Functions, Up: Lists
  4481. 3.1.7 Indirection
  4482. -----------------
  4483. In some cases it is necessary to build a tree from the top down rather than
  4484. from the bottom up, when it is not known in advance what's on the
  4485. bottom. Although the `list' type is a pointer itself, these situations
  4486. call for a type of pointers to lists, which are declared as the
  4487. `branch' type in `branches.h'. For example, if `b' is declared as a
  4488. `branch' and `l' is declared as a `list', it would be possible to write
  4489. `b = &l'.
  4490. Facilities are also provided for maintaining queues of branches,
  4491. which are declared as the `branch_queue' type. This type is a pointer to
  4492. a structure with two fields, `above' and `following', where `above' is
  4493. a `branch' and `following' is a `branch_queue'.
  4494. These functions are used internally elsewhere in the library and
  4495. might not be necessary for most client programs to use directly.
  4496. -- Function: void avm_initialize_branches ()
  4497. This must be done once before any of the other branch related
  4498. functions is used, and creates some internal data structures.
  4499. Results of the other functions are undefined if this one isn't
  4500. called first.
  4501. -- Function: void avm_count_branches ()
  4502. This function can be used at the end of a run to detect unreclaimed
  4503. storage used for branches or branch queues. If any storage remains
  4504. unreclaimed, a message about unreclaimed branches is written to
  4505. standard error.
  4506. -- Function: void avm_anticipate (branch_queue *FRONT, branch_queue
  4507. *BACK, branch OPERAND)
  4508. This function provides a simple queueing facility for branches.
  4509. Similarly to the case with `avm_enqueue', `front' and `back'
  4510. should be initialized to `NULL' before the first call. Each call
  4511. to this function will enqueue one item to the back, assuming
  4512. enough memory is available, as the following example shows.
  4513. front = NULL;
  4514. back = NULL;
  4515. l = avm_join(NULL,NULL);
  4516. anticipate(&front,&back,&(l->head));
  4517. anticipate(&front,&back,&(l->tail));
  4518. After the above code is executed, these postconditions will hold.
  4519. front->above == &(l->head)
  4520. front->following->above == &(l->tail)
  4521. front->following == back
  4522. back->following == NULL
  4523. The name "anticipate" is used because ordinarily the queue contains
  4524. positions in a tree to be filled in later. As usual, only unshared
  4525. trees should be modified in place.
  4526. -- Function: void avm_recoverable_anticipate (branch_queue *FRONT,
  4527. branch_queue *BACK, branch OPERAND, int *FAULT)
  4528. This function is similar to `avm_anticipate', except that it will
  4529. not exit with an error message in the event of an overflow error,
  4530. but will simply set `*FAULT' to a non-zero value and return to the
  4531. caller. If an overflow occurs, nothing about the queue is changed.
  4532. -- Function: void avm_enqueue_branch (branch_queue *FRONT,
  4533. branch_queue *BACK, int RECEIVED_BIT)
  4534. A slightly higher level interface to the `avm_anticipate' function
  4535. is provided by this function, which is useful for building a tree
  4536. from a string of input bits in a format similar to the one
  4537. described in *note Concrete Syntax::.
  4538. This function should be called the first time with `front' and
  4539. `back' having been initialized to represent a queue containing a single
  4540. branch pointing to a list known to the caller. The list itself
  4541. need not be allocated or initialized. An easy way of doing so
  4542. would be the following.
  4543. front = NULL;
  4544. back = NULL;
  4545. avm_anticipate(&front,&back,&my_list);
  4546. On each call to `avm_enqueue_branch', the `RECEIVED_BIT' parameter
  4547. is examined. If it is zero, nothing will be added to the queue,
  4548. the list referenced by the front branch will be assigned `NULL',
  4549. and the front branch will be removed from the queue. If
  4550. `RECEIVED_BIT' is a non-zero value, the list referenced by the
  4551. front branch will be assigned to point to a newly created unshared
  4552. list node, and two more branches will be appended to the queue. The
  4553. first branch to be appended will point to the head of the newly
  4554. created list node, and the second branch to be appended will point
  4555. to the tail.
  4556. If the sequence of bits conforms to the required concrete syntax,
  4557. this function can be called for each of them in turn, and at the
  4558. end of the sequence, the queue will be empty and the list
  4559. referenced by the initial branch (i.e., `my_list') will be the one
  4560. specified by the bit string. If the sequence of bits does not
  4561. conform to the required concrete syntax, the error can be detected
  4562. insofar as the emptying of the queue will not coincide exactly
  4563. with the last bit.
  4564. The caller should check for the queue becoming prematurely empty
  4565. due to syntax errors, because no message is reported by
  4566. `avm_enqueue_branch' in that event, and subsequent attempts to
  4567. enqueue anything are ignored. However, in the event of a memory
  4568. overflow, an error message is reported and the process is
  4569. terminated.
  4570. -- Function: void avm_recoverable_enqueue_branch (branch_queue *FRONT,
  4571. branch_queue *BACK, int RECEIVED_BIT, int *FAULT)
  4572. This function is similar to `avm_enqueue_branch' but will leave
  4573. error handling to the caller in the event of insufficient memory to
  4574. enqueue another branch. Instead of printing an error message and
  4575. exiting, it will dispose of the queue, set the `FAULT' flag to a
  4576. non-zero value, and return. Although the queue will be reclaimed,
  4577. the lists referenced by the branches in it will persist. The list
  4578. nodes themselves can be reclaimed by disposing of the list whose
  4579. address was stored originally in the front branch.
  4580. -- Function: void avm_dispose_branch_queue (branch_queue FRONT)
  4581. This function deallocates a branch queue by chasing the `following'
  4582. fields in each one. It does nothing to the lists referenced by the
  4583. branches in the queue.
  4584. Rather than using `free' directly, client programs should use this
  4585. function for deallocating branch queues, because it allows better
  4586. performance by interacting with a local internal cache of free
  4587. memory, and because it performs necessary bookkeeping for
  4588. `avm_count_branches'.
  4589. -- Function: void avm_dispose_branch (branch_queue OLD)
  4590. This disposes of a single branch queue node rather than a whole
  4591. queue. Otherwise, the same comments as those above apply.
  4592. 
  4593. File: avram.info, Node: The Universal Function, Prev: Indirection, Up: Lists
  4594. 3.1.8 The Universal Function
  4595. ----------------------------
  4596. A function computing the result of the invisible operator used to
  4597. specify the virtual code semantics in *note Virtual Code Semantics::, is
  4598. easily available by way of a declaration in `apply.h'.
  4599. -- Function: void avm_initialize_apply ()
  4600. This function should be called by the client program at least once
  4601. prior to the first call to `avm_apply' or `avm_recoverable_apply'.
  4602. It causes certain internal data structures and error message texts
  4603. to be initialized.
  4604. -- Function: void avm_count_apply ()
  4605. This function should be used at the end of a run for the purpose of
  4606. detecting and reporting any unreclaimed storage associated with
  4607. functions in this section. If the function `avm_count_lists()' is
  4608. also being used, it should be called after this one.
  4609. -- Function: list avm_apply (list OPERATOR, list OPERAND)
  4610. This is the function that evaluates the operator used to describe
  4611. the virtual code semantics. For example, the value of `F X' can be
  4612. obtained as the result returned by `avm_apply(F,X)'.
  4613. Both parameters to this function are deallocated unconditionally
  4614. and should not be referenced again by the caller. If the
  4615. parameters are needed subsequently, then only copies of them
  4616. should be passed to `avm_apply' using `avm_copied'.
  4617. This function is not guaranteed to terminate, and may cause a
  4618. memory overflow error. In the event of an exceptional condition,
  4619. the error message is written to standard error and the program is
  4620. halted. There is no externally visible distinction between
  4621. different levels of error conditions.
  4622. -- Function: list avm_recoverable_apply (list OPERATOR, list OPERAND,
  4623. int *FAULT)
  4624. This function is similar to `avm_apply' but leaves the
  4625. responsibility of error handling with the caller. If any overflow
  4626. or exceptional condition occurs, the result returned is a list
  4627. representing the error message, and the `FAULT' flag is set to a
  4628. non-zero value. This behavior contrasts with that of `avm_apply',
  4629. which will display the message to standard error and kill the
  4630. process.
  4631. 
  4632. File: avram.info, Node: Characters and Strings, Next: File Manipulation, Prev: Lists, Up: Library Reference
  4633. 3.2 Characters and Strings
  4634. ==========================
  4635. If a C program is to interact with a virtual code application by
  4636. exchanging text, it uses the representation for characters described in
  4637. *note Character Table::. This convention would be inconvenient without
  4638. a suitable API, so the functions in this section address the need. These
  4639. functions are declared in the header file `chrcodes.h'.
  4640. Some of these functions have two forms, with one of them having the
  4641. word `standard' as part of its name. The reason is to cope with
  4642. multiple character encodings. Versions of `avram' prior to 0.1.0 used a
  4643. different character encoding than the one documented in *note Character
  4644. Table::. The functions described in *note Version Management:: can be
  4645. used to select backward compatible operation with the older character
  4646. encoding. The normal forms of the functions in this section will use
  4647. the older character set if a backward compatibility mode is indicated,
  4648. whereas the standard forms will use the character encoding documented
  4649. in *note Character Table:: regardless.
  4650. Standard encodings should always be assumed for library and function names
  4651. associated with the `library' combinator (*note Calling existing
  4652. library functions::), and for values of lists defined by
  4653. `avm_list_of_value' (*note Primitive types::), but version dependent
  4654. encodings should be used for all other purposes such as error messages.
  4655. Alternatively, the normal version dependent forms of the functions
  4656. below can be used safely in any case if backward compatibility is not
  4657. an issue. This distinction is viewed as a transitional feature of the
  4658. API that will be discontinued eventually when support for the old
  4659. character set is withdrawn and the `standard' forms are be removed.
  4660. -- Function: list avm_character_representation (int CHARACTER)
  4661. -- Function: list avm_standard_character_representation (int CHARACTER)
  4662. This function takes an integer character code and returns a copy of
  4663. the list representing it, as per the table in *note Character
  4664. Table::. Because the copy is shared, no memory is allocated by this
  4665. function so there is no possibility of overflow. Nevertheless, it
  4666. is the responsibility of the caller dispose of the list when it is
  4667. no longer needed by `avm_dispose', just as if the copy were not
  4668. shared (*note Simple Operations::). For performance reasons, this
  4669. function is implemented as a macro. If the argument is outside the
  4670. range of zero to 255, it is masked into that range.
  4671. -- Function: int avm_character_code (list OPERAND)
  4672. -- Function: int avm_standard_character_code (list OPERAND)
  4673. This function takes a list as an argument and returns the
  4674. corresponding character code, as per *note Character Table::. If
  4675. the argument does not represent any character, a value of `-1' is
  4676. returned.
  4677. -- Function: list avm_strung (char *STRING)
  4678. -- Function: list avm_standard_strung (char *STRING)
  4679. This function takes a pointer to a null terminated character
  4680. string and returns the list obtained by translating each character
  4681. into its list representation and enqueuing them together. Memory
  4682. needs to be allocated for the result, and if there isn't enough
  4683. available, an error message is written to standard error and the
  4684. process is terminated. This function is useful to initialize lists
  4685. from hard coded strings at the beginning of a run, as in this
  4686. example.
  4687. hello_string = avm_strung("hello");
  4688. This form initializes a single string, but to initialize a one line
  4689. message suitable for writing to a file, it would have to be a list
  4690. of strings, as in this example.
  4691. hello_message = avm_join(avm_strung("hello"),NULL);
  4692. The latter form is used internally by the library for initializing
  4693. most of the various error messages that can be returned by other
  4694. functions.
  4695. -- Function: list avm_recoverable_strung (char *STRING, int *FAULT);
  4696. -- Function: list avm_recoverable_standard_strung (char *STRING, int
  4697. *FAULT);
  4698. This function is like `avm_strung' except that if it runs out of
  4699. memory it sets the integer referenced by FAULT to a non-zero value
  4700. and returns instead of terminating the process.
  4701. -- Function: char *avm_unstrung (list STRING, list *MESSAGE, int
  4702. *FAULT)
  4703. -- Function: char *avm_standard_unstrung (list STRING, list *MESSAGE,
  4704. int *FAULT)
  4705. This function performs an inverse operation to
  4706. `avm_recoverable_strung', taking a list representing a character
  4707. string to the character string in ASCII null terminated form as per
  4708. the standard C representation. Memory is allocated for the result
  4709. by this function which should be freed by the caller.
  4710. In the event of an exception, the integer referenced by `fault' is
  4711. assigned a non-zero value and an error message represented as a
  4712. list is assigned to the list referenced by `message'. The error
  4713. message should be reclaimed by the caller with `avm_dispose'
  4714. (*note Simple Operations:: if it is non-empty. Possible error
  4715. messages are `<'memory overflow'>', `<'counter overflow'>', and
  4716. `<'invalid text format'>'.
  4717. -- Function: list avm_scanned_list (char *STRING)
  4718. An application that makes use of virtual code snippets or data
  4719. that are known at compile time can use this function to initialize
  4720. them. The argument is a string in the format described in *note
  4721. Concrete Syntax::, and the result is the list representing it. For
  4722. example, the program discussed in *note Example Script:: could be
  4723. hard coded into a C program by pasting the data from its virtual
  4724. code file into an expression of this form.
  4725. cat_program = avm_scanned_list("sKYQNTP\\");
  4726. Note that the backslash character in the original data has to be
  4727. preceded by an extra backslash in the C source, because backslashes
  4728. usually mean something in C character constants.
  4729. The `avm_scanned_list' function needs to allocate memory. If there
  4730. isn't enough memory available, it writes a message to standard
  4731. error and causes the process to exit.
  4732. -- Function: list avm_multiscanned (char **STRINGS)
  4733. Sometimes it may be useful to initialize very large lists from
  4734. strings, but some C compilers impose limitations on the maximum
  4735. length of a string constant, and the ISO standard for C requires
  4736. only 512 bytes. This function serves a similar purpose to
  4737. `avm_scanned_list', but allows the argument to be a pointer to a
  4738. null terminated array of strings instead of one long string,
  4739. thereby circumventing this limitation in the compiler.
  4740. char *code[] = {"sKYQ","NTP\\",NULL};
  4741. ...
  4742. cat_program = avm_multiscanned(code);
  4743. If there is insufficient memory to allocate the list this function
  4744. needs to create, it causes an error message to be written to
  4745. standard error, and then kills the process.
  4746. -- Function: char* avm_prompt (list PROMPT_STRINGS)
  4747. This function takes a list representing a list of character
  4748. strings, and returns its translation to a character string with
  4749. the sequence 13 10 used as a separator. For example, given a tree
  4750. of this form
  4751. some_message = avm_join(
  4752. avm_strung("hay"),
  4753. avm_join(
  4754. avm_strung("you"),
  4755. NULL));
  4756. the result returned by `prompt_strings(some_message)' would be a
  4757. pointer to a null terminated character string equivalent to the C
  4758. constant `"hay\13\10you"'.
  4759. Error messages are printed and the process terminated in the event
  4760. of either a memory overflow or an invalid character representation.
  4761. This function is used by `avram' in the evaluation of interactive virtual
  4762. code applications, whose output has to be compared to the output
  4763. from a shell command in this format. The separator is chosen to be
  4764. compatible with the `expect' library.
  4765. -- Function: char* avm_recoverable_prompt (list PROMPT_STRINGS, list
  4766. *MESSAGE, int *FAULT)
  4767. This function performs the same operation as `avm_prompt' but
  4768. allows the caller to handle exceptional conditions. If an exception
  4769. such as a memory overflow occurs, the integer referenced by
  4770. `fault' is assigned a non-zero value and a representation of the
  4771. error message as a list of strings is assigned to the list
  4772. referenced by `message'.
  4773. This function is used to by `avram' to evaluate the `interact'
  4774. combinator (*note Interaction combinator::), when terminating in
  4775. the event of an error would be inappropriate.
  4776. -- Function: void avm_initialize_chrcodes ()
  4777. This function has to be called before any of the other character
  4778. conversion functions in this section, or else their results are
  4779. undefined. It performs the initialization of various internal data
  4780. structures.
  4781. -- Function: void avm_count_chrcodes ()
  4782. This function can be called at the end of a run, after the last
  4783. call to any of the other functions in this section, but before
  4784. `avm_count_lists' if that function is also being used. The purpose
  4785. of this function is to detect and report memory leaks. If any
  4786. memory associated with any of these functions has not been
  4787. reclaimed by the client program, a message giving the number of
  4788. unreclaimed lists will be written to standard error.
  4789. 
  4790. File: avram.info, Node: File Manipulation, Next: Invocation, Prev: Characters and Strings, Up: Library Reference
  4791. 3.3 File Manipulation
  4792. =====================
  4793. The functions described in this section provide an interface between
  4794. virtual code applications and the host file system by converting
  4795. between files or file names and their representations as lists. These
  4796. conversions are necessary when passing a file to a virtual code
  4797. application, or when writing a file received in the result of one.
  4798. * Menu:
  4799. * File Names::
  4800. * Raw Files::
  4801. * Formatted Input::
  4802. * Formatted Output::
  4803. 
  4804. File: avram.info, Node: File Names, Next: Raw Files, Prev: File Manipulation, Up: File Manipulation
  4805. 3.3.1 File Names
  4806. ----------------
  4807. A standard representation is used by virtual code applications for the path
  4808. names of files, following the description in *note Input Data
  4809. Structure::. The functions and constants declared in `fnames.h' provide
  4810. an API for operating on file names in this form.
  4811. -- Function: list avm_path_representation (char *PATH)
  4812. If a C program is to invoke a virtual code application and pass a
  4813. path name to it as a parameter, this function can be used to
  4814. generate the appropriate representation from a given character
  4815. string.
  4816. conf_path = avm_path_representation("/etc/resolve.conf");
  4817. In this example, `conf_path' is a `list'. For potentially better
  4818. portability, a C program can use the character constant
  4819. `avm_path_separator_character' in place of the slashes in hard
  4820. coded path names.
  4821. Other useful constants are `avm_current_directory_prefix' as a portable
  4822. replacement for `"./"', as well as `avm_parent_directory_prefix'
  4823. instead of `"../"'. There is also `avm_root_directory_prefix' for
  4824. `"/"'. These three constants are null terminated strings, unlike
  4825. `avm_path_separator_character', which is a character.
  4826. If a `NULL' pointer is passed as the `PATH', a `NULL' list is
  4827. returned, which is the path representation for standard input or
  4828. standard output. If the address of an empty string is passed to
  4829. this function as the `PATH', the list of the empty string will be
  4830. returned, which is the path representation for the root directory.
  4831. Trailing path separators are ignored, so `"/"' is the same as the
  4832. empty string.
  4833. Some memory needs to be allocated for the result of this function.
  4834. If the memory is not available, an error message is written to
  4835. standard error and the process is terminated.
  4836. -- Function: list avm_date_representation (char *PATH)
  4837. This function is essentially a wrapper around the standard
  4838. `ctime_r' function that not only gets the time stamp for a file at
  4839. a given path, but transforms it to a list representation according
  4840. to *note Character Table::. It needs to allocate memory for the
  4841. result and will cause the program to exit with an error message if
  4842. there is not enough memory available.
  4843. The time stamp will usually be in a format like `Sun Mar 4 10:56:40
  4844. GMT 2001'. If for some reason the time stamp can not be obtained,
  4845. the result will be a representation of the string `unknown date'.
  4846. -- Function: char* avm_path_name (list PATH)
  4847. This function is the inverse of `avm_path_representation', in that
  4848. it takes a list representing a path to the path name expressed as
  4849. a character string. This function can be used in C programs that
  4850. invoke virtual code applications returning paths as part of their
  4851. results, so that the C program can get the path into a character
  4852. string in order to open the file.
  4853. If the `PATH' parameter is `NULL', a `NULL' pointer is returned as
  4854. the result. The calling program should check for a `NULL' result
  4855. and interpret it as the path to standard input or standard output.
  4856. The memory needed for the character string whose address is
  4857. returned is allocated by this function if possible. The given
  4858. `PATH' is not required to be consistent with the host file system,
  4859. but it is required to consist of representations of non-null
  4860. printable characters or spaces as lists per *note Character
  4861. Table::. In the event of any error or overflow, control does not
  4862. return to the caller, but an error message is printed and the
  4863. program is aborted. The possible error messages from this function
  4864. are the following.
  4865. * `PROGRAM-NAME: counter overflow (code NN)'
  4866. * `PROGRAM-NAME: memory overflow (code NN)'
  4867. * `PROGRAM-NAME: null character in file name'
  4868. * `PROGRAM-NAME: bad character in file name'
  4869. * `PROGRAM-NAME: invalid file name (code NN)'
  4870. -- Function: void avm_initialize_fnames ()
  4871. A few housekeeping operations relevant to internal data structures
  4872. are performed by this function, making it necessary to be called
  4873. by the client program prior to using any of the other ones.
  4874. -- Function: void avm_count_fnames ()
  4875. This function can be used after the the last call to any of the
  4876. other functions in this section during a run, and it will detect
  4877. memory leaks that may be attributable to code in these functions
  4878. or misuse thereof. If any unreclaimed storage remains when this
  4879. function is called, a warning message will be written to standard
  4880. error. If the function `avm_count_lists' is also being used by the
  4881. client, it should be called after this one.
  4882. 
  4883. File: avram.info, Node: Raw Files, Next: Formatted Input, Prev: File Names, Up: File Manipulation
  4884. 3.3.2 Raw Files
  4885. ---------------
  4886. Some low level operations involving lists and data files are provided by
  4887. these functions, which are declared in the header file `rawio.h'.
  4888. -- Function: list avm_received_list (FILE *OBJECT, char *FILENAME)
  4889. This function is a convenient way of transferring data directly
  4890. from a raw format file into a list in memory. It might typically
  4891. be used to load the virtual code for an application that has been
  4892. written to a file by a compiler.
  4893. `OBJECT'
  4894. is the address of a file which should already be open for
  4895. reading before this function is called, and will be read from
  4896. its current position.
  4897. `FILENAME'
  4898. should be set by the caller to the address of a null
  4899. terminated string containing the name of the file, but is not
  4900. used unless it needs to be printed as part of an error
  4901. message. If it is a null pointer, standard input is assumed.
  4902. The result returned is a list containing data read from the file.
  4903. The file format is described in *note File Format::. The preamble
  4904. section of the file, if any, is ignored. If the file ends
  4905. prematurely or otherwise conflicts with the format, the program is
  4906. aborted with a message of
  4907. `PROGRAM-NAME: invalid raw file format in FILENAME'
  4908. written to standard error. The program will also be aborted by this
  4909. function in the event of a memory overflow.
  4910. The file is left open when this function returns, and could
  4911. therefore be used to store other data after the end of the list.
  4912. The end of a list is detected automatically by this function, and
  4913. it reads no further, leaving the file position on the next
  4914. character, if any.
  4915. -- Function: void avm_send_list (FILE *REPOSITORY, list OPERAND, char
  4916. *FILENAME)
  4917. This function can be used to transfer data from a list in memory
  4918. to a file, essentially by implementing the printing algorithm
  4919. described in *note Bit String Encoding::.
  4920. `REPOSITORY'
  4921. is the address of a file already open for writing, to which
  4922. the data are written starting from the current position.
  4923. `OPERAND'
  4924. is the list containing the data to be written
  4925. `FILENAME'
  4926. is the address of a null terminated string containing the
  4927. name of the file that will be reported in an error message if
  4928. necessary.
  4929. No preamble section is written by this function, but one could be written
  4930. to the file by the caller prior to calling it. Error messages are
  4931. possible either because of i/o errors or because of insufficient
  4932. memory. I/o errors are not fatal and will result only in a warning
  4933. message being printed to standard error, but a memory overflow will
  4934. cause the process to abort. An i/o error message reported by this
  4935. function would be of the form
  4936. `PROGRAM-NAME: can't write to FILENAME'
  4937. followed by the diagnostic obtained from the standard `strerror' function
  4938. if it exists on the host platform. The file is left open when this
  4939. function returns.
  4940. -- Function: void avm_initialize_rawio ()
  4941. This function initializes some necessary data structures for the
  4942. functions in this section, and should be called prior to them at
  4943. the beginning of a run.
  4944. -- Function: void avm_count_rawio ()
  4945. This function does nothing in the present version of the library,
  4946. but should be called after the last call to all of the other
  4947. functions in this section in order to maintain compatibility with
  4948. future versions of the library. Future versions may decide to use
  4949. this function to do some cleaning up of local data structures.
  4950. 
  4951. File: avram.info, Node: Formatted Input, Next: Formatted Output, Prev: Raw Files, Up: File Manipulation
  4952. 3.3.3 Formatted Input
  4953. ---------------------
  4954. Some functions relating to the input of text files or data files with
  4955. preambles are declared in the header file `formin.h'. The usage of
  4956. these functions is as follows.
  4957. -- Function: list avm_preamble_and_contents (FILE *SOURCE, char
  4958. *FILENAME)
  4959. This function loads a file of either text or data format into
  4960. memory.
  4961. `SOURCE'
  4962. should be initialized by the caller as the address of a file
  4963. already open for reading that will be read from its current
  4964. position.
  4965. `FILENAME'
  4966. should be set by the caller to the address of a null
  4967. terminated character string giving the name of the file that
  4968. will be used if an i/o error message needs to be written
  4969. about it. If it is a `NULL' pointer, standard input is
  4970. assumed.
  4971. The result returned by the function will be a list whose `head' represents
  4972. the preamble of the file and whose `tail' represents the contents.
  4973. As a side effect, the input file will be closed, unless the
  4974. `FILENAME' parameter is `NULL'.
  4975. If the file conforms to the format described in *note File
  4976. Format::, the preamble is a list of character strings. In the
  4977. result returned by the function, the `head' field will be a list
  4978. with one item for each line in the file, and each item will be a
  4979. list of character representations as in *note Character Table::,
  4980. but with the leading hashes stripped. The `tail' will be the list
  4981. specified by remainder of the file according to *note Concrete
  4982. Syntax::. If the file has an empty preamble but is nevertheless a
  4983. data file, the `head' will be a list whose `head' and `tail' are
  4984. both `NULL'.
  4985. If the file does not conform to the format in *note File Format::,
  4986. then the `head' of the result will be `NULL', and the `tail' will
  4987. be a list of lists of character representations, with one for each
  4988. line.
  4989. Whether or not the file conforms to the format is determined on
  4990. the fly, so this function is useful for situations in which the
  4991. format is not known in advance. The conventions regarding the
  4992. preamble and contents maintained by this function are the same as
  4993. those used by virtual code applications as described in *note
  4994. Standard Output Representation:: and *note Input Data Structure::.
  4995. The characters used for line breaks are not explicitly represented
  4996. in the result. Depending on the host system, line breaks in text
  4997. files may be represented either by the character code 10, or by
  4998. the sequence 13 10. However, in order for the library to deal with
  4999. binary files in a portable way, a line break always corresponds to
  5000. a 10 as far as this function is concerned regardless of the host,
  5001. and a 13 is treated like any other character. Hence, if this
  5002. function were used on binary files that happened to have some 10s
  5003. in them, the exact contents of a file could be reconstructed
  5004. easily by appending a 10 to all but the last line and flattening
  5005. the list.
  5006. A considerable amount of memory may need to be allocated by this
  5007. function in order to store the file as a list. If not enough
  5008. memory is available, the function prints an error message to
  5009. standard error and aborts rather than returning to the caller.
  5010. However, i/o errors are not fatal, and will cause the function to
  5011. print a warning but attempt to continue.
  5012. -- Function: list avm_load (FILE *SOURCE, char *FILENAME, int RAW)
  5013. Similarly to `avm_preamble_and_contents', this function also loads
  5014. a file into memory, but the format is specified in advance.
  5015. `SOURCE'
  5016. should be set by the caller to the address of an already open
  5017. file for reading, which will be read from its current
  5018. position.
  5019. `FILENAME'
  5020. should be initialized by the caller as a pointer to a null
  5021. terminated string containing the name of the file that will
  5022. be reported to the user in the event of an error reading from
  5023. it. If it is a `NULL' pointer, standard input is assumed.
  5024. `RAW'
  5025. is set to a non-zero value by the caller to indicate that the
  5026. file is expected to conform to the format in *note File
  5027. Format::. If the file is an ordinary text file, then it
  5028. should be set to zero.
  5029. In the case of a data file, which is when `RAW' is non-zero, the
  5030. result returned by this function will be a list representing the
  5031. data section of the file and ignoring the preamble. In the case of
  5032. a text file, the result will be a list of lists of character
  5033. representations as per *note Character Table::, with one such list
  5034. for each line in the file. Similar comments about line breaks to
  5035. those mentioned under `avm_preamble_and_contents' are applicable.
  5036. As a side effect of this function, the `SOURCE' file will be
  5037. closed, unless the `FILENAME' is a `NULL' pointer.
  5038. This function is useful when the type of file is known in advance.
  5039. If a data file is indicated by the `RAW' parameter but the format
  5040. is incorrect, an error message is reported and the process
  5041. terminates. The error message will be of the form
  5042. `PROGRAM-NAME: invalid raw file format in FILENAME'
  5043. Alternatively, if a text file is indicated by the `RAW' parameter,
  5044. then no attempt is made to test whether it could be interpreted as
  5045. data, even if it could be. This behavior differs from that of
  5046. `avm_preamble_and_contents', where a bad data file format causes
  5047. the file to be treated as text, and a valid data file format, even
  5048. in a "text" file, causes it to be treated as data.
  5049. Memory requirements for this function are significant and will
  5050. cause the process to abort with an error message in the event of
  5051. insufficient free memory. Messages pertaining to i/o errors are
  5052. also possible and are not fatal.
  5053. -- Function: void avm_initialize_formin ()
  5054. This function should be called before either of the other
  5055. functions in this section is called, as it initializes some
  5056. necessary static data structures. Results of the other functions
  5057. are undefined if this one is not called first.
  5058. -- Function: void avm_count_formin ()
  5059. This function should be called after the last call to any of the
  5060. other functions in this section, as it is necessary for cleaning
  5061. up and reclaiming some internal data. If any storage remains
  5062. unreclaimed due to memory leaks in these functions or to misuse of
  5063. them, a warning message is written to standard error. If the
  5064. function `avm_count_lists' is also being used by the client
  5065. program, it should be called after this one.
  5066. 
  5067. File: avram.info, Node: Formatted Output, Prev: Formatted Input, Up: File Manipulation
  5068. 3.3.4 Formatted Output
  5069. ----------------------
  5070. The following functions pertaining to the output of text files or data
  5071. files with preambles are declared in the header file `formout.h'.
  5072. -- Function: void avm_output (FILE *REPOSITORY, char *FILENAME, list
  5073. PREAMBLE, list CONTENTS, int TRACE_MODE)
  5074. This function writes a either a text file or a data file in the
  5075. format described in *note File Format::. The parameters have these
  5076. interpretations.
  5077. `REPOSITORY'
  5078. is the address of a file opened for writing by the caller,
  5079. that will be written from its current position.
  5080. `FILENAME'
  5081. is the address of a null terminated character string set by
  5082. the caller to be the name of the file that will be reported
  5083. to the user in the event of an i/o error.
  5084. `PREAMBLE'
  5085. is `NULL' in the case of a text file, but a list of character
  5086. string representations as per *note Character Table::, in the
  5087. case of a data file. If a data file has is to be written with
  5088. an empty preamble, then this list should have a `NULL' `head'
  5089. and a `NULL' `tail'.
  5090. `CONTENTS'
  5091. is either a list of character string representations in the
  5092. case of a text file, or is an unconstrained list in the case
  5093. of a data file.
  5094. `TRACE_MODE'
  5095. may be set to a non-zero value by the caller to request that
  5096. everything written to a text file should be echoed to
  5097. standard output. It is ignored in the case of a data file.
  5098. The effect of calling this function is to write the preamble and
  5099. contents to the file in the format indicated by the preamble. The
  5100. file is left open when this function returns.
  5101. Line breaks are always written as character code 10, not as 13 10, regardless
  5102. of the convention on the host system, so that files written by
  5103. this function can be reliably read by other functions in the
  5104. library.
  5105. Leading hashes are automatically added to the beginning of the
  5106. lines in the preamble, except where they are unnecessary due to a
  5107. continuation character on the previous line. This action enforces
  5108. consistency with the file format, ensuring that anything written
  5109. as a data file can be read back as one. The hashes are stripped
  5110. automatically when the file is read by `avm_preamble_and_contents'.
  5111. Another feature of this function is that it will mark any output
  5112. file as executable if it is a data format file with a prelude
  5113. whose first character in the first line is an exclamation point.
  5114. This feature makes it easier for a compiler implemented in virtual
  5115. code to generate executable shell scripts directly.
  5116. A fatal error is reported if any of the data required to be a
  5117. character representation is not listed in the *note Character
  5118. Table::. A fatal error can also be caused by a memory overflow.
  5119. Possible error messages are the following.
  5120. * `PROGRAM-NAME: invalid output preamble format'
  5121. * `PROGRAM-NAME: invalid text format'
  5122. * `PROGRAM-NAME: can't write to FILENAME'
  5123. In the last case, the error message will be followed by an
  5124. explanation furnished by the standard `strerror' function if
  5125. available.
  5126. -- Function: void avm_output_as_directed (list DATA, int
  5127. ASK_TO_OVERWRITE_MODE, int VERBOSE_MODE)
  5128. This function writes an ensemble of files at specified paths in
  5129. either text or data format, optionally interacting with the user
  5130. through standard input and output. The parameters have these
  5131. interpretations.
  5132. `DATA'
  5133. is a list in which each item specifies a file to be written.
  5134. `ASK_TO_OVERWRITE_MODE'
  5135. may be set to a non-zero value by the calling program in
  5136. order to have this function ask the user for permission to
  5137. overwrite existing files.
  5138. `VERBOSE_MODE'
  5139. may be set to a non-zero value by the calling program to have
  5140. this function print to standard output a list of the names of
  5141. the files it writes.
  5142. A high level interface between virtual code applications and the
  5143. file system is provided by this function. The `DATA' parameter
  5144. format is compatible with the the data structure returned by an
  5145. application complying with the conventions in *note Output From
  5146. Non-interactive Applications::.
  5147. Each item in the `DATA' list should be a non-empty list whose
  5148. `head' and `tail' are also non-empty. The fields in each item have
  5149. the following relevance to the file it specifies.
  5150. * The `head' of the `head' is `NULL' if the file is to be
  5151. opened for appending, and non-`NULL' if it is to be
  5152. overwritten.
  5153. * The `tail' of the `head' represents a path as a list of
  5154. character string representations, in a form suitable as an
  5155. argument to `avm_path_name'.
  5156. * The `head' of the `tail' represents the preamble of the file,
  5157. as either `NULL' for a text file or a non-empty list of
  5158. character string representations for a data file.
  5159. * The `tail' of the `tail' represents the contents of the file,
  5160. either as a list of character string representations for a
  5161. text file or as a list in an unconstrained format for a data
  5162. file.
  5163. For each item in the list, the function performs the following
  5164. steps.
  5165. 1. It decides whether to open a file for overwriting or
  5166. appending based on the `head' of the `head'.
  5167. 2. It uses the `tail' of the `head' to find out the file name
  5168. from `avm_path_name', in order to open it.
  5169. 3. If the `ASK_TO_OVERWRITE_MODE' flag is set and the file is
  5170. found to exist already, the function will print one of the
  5171. following messages to standard output, depending on whether
  5172. the file is to be overwritten or appended.
  5173. * `PROGRAM-NAME: overwrite FILENAME? (y/n)'
  5174. * `PROGRAM-NAME: append to FILENAME? (y/n)'
  5175. It will then insist on either `y' or `n' as an answer before
  5176. continuing.
  5177. 4. If the `ASK_TO_OVERWRITE' flag has not been set, or the file
  5178. did not previously exist, or the answer of `y' was given, the
  5179. preamble and contents of the file are then written with
  5180. `avm_output'.
  5181. 5. If permission to write or append was denied, one of the
  5182. following messages is reported to standard output, and the
  5183. data that were to be written are lost.
  5184. * `PROGRAM-NAME: not writing FILENAME'
  5185. * `PROGRAM-NAME: not appending FILENAME'
  5186. 6. If permission was granted to write or append to the file or
  5187. the `VERBOSE_MODE' flag is set, one of the messages
  5188. * `PROGRAM-NAME: writing FILENAME'
  5189. * `PROGRAM-NAME: appending FILENAME'
  5190. is sent to standard output.
  5191. If any files are to be written to standard output, which would be
  5192. indicated by a `NULL' path, they are not written until all other
  5193. files in the list are written. This feature is in the interest of security,
  5194. as it makes it more difficult for malicious or inept virtual code
  5195. to alter the appearance of the console through standard output
  5196. until after the interactive dialogue has taken place. Permission
  5197. is not solicited for writing to standard output, and it will not
  5198. be closed.
  5199. Any of the fatal errors or i/o errors possible with `avm_output' or
  5200. `avm_path_name' are also possible with this function, as well as
  5201. the following additional ones.
  5202. * `PROGRAM-NAME: invalid file specification'
  5203. * `PROGRAM-NAME: can't close FILENAME'
  5204. * `PROGRAM-NAME: can't write FILENAME'
  5205. The last two are non-fatal i/o errors that will be accompanied by
  5206. an explanation from the `strerror' function if the host supports
  5207. it. The other message is the result of a badly formatted `DATA'
  5208. parameter.
  5209. -- Function: void avm_put_bytes (list BYTES)
  5210. This function takes a list of character representations, converts
  5211. them to characters, and sends them to standard output. There is no
  5212. chance of a memory overflow, but the following other errors are
  5213. possible.
  5214. * `PROGRAM-NAME: invalid text format (code NN)'
  5215. * `PROGRAM-NAME: can't write to standard output'
  5216. The latter is non-fatal, but the former causes the program to
  5217. abort. It is caused when any member of the list `BYTES' is not a
  5218. character representation appearing in *note Character Table::.
  5219. -- Function: void avm_initialize_formout ()
  5220. This function initializes some data structures used locally by the
  5221. other functions in this section, and should be called at the
  5222. beginning of a run before any of them is called.
  5223. -- Function: void avm_count_formout ()
  5224. This function doesn't do anything in the current version of the
  5225. library, but should be called after the last call to any of the
  5226. other functions in this section. Future versions of the library
  5227. might use this function for cleaning up some internal data
  5228. structures, and client programs that call it will maintain
  5229. compatibility with them.
  5230. 
  5231. File: avram.info, Node: Invocation, Next: Version Management, Prev: File Manipulation, Up: Library Reference
  5232. 3.4 Invocation
  5233. ==============
  5234. The functions documented in this section can be used to incorporate the
  5235. capabilities of a virtual machine emulator into other C programs with a
  5236. minimal concern for the details of the required data structures and
  5237. virtual code invocation conventions.
  5238. * Menu:
  5239. * Command Line Parsing::
  5240. * Execution Modes::