.. include:: ../../global.rst ***************** Pre-Compilation I ***************** .. aside:: Software Engineering-wise, a complete disaster. This has been a huge investment of time and effort with a pragmatic but incomplete result. I've conflated ideas, tried to pursue too much at once, missed swathes. A hopeless mess. Let's get into it. The basic plan is to replace the live evaluation of source code when :lname:`Idio` starts up with a system that reads in the pre-compiled byte code saving time and effort. .. aside:: We'll come back to this... Many :lname:`Lisp`\ y languages support the notion of *undumping* where, having parsed all of the startup source code into its internal (byte code) form, the system saves that internal form back out ready to roll for next time. The Problem =========== The crux of the problem with reading in some byte code it that that byte code cannot know anything about the running :lname:`Idio` engine when it starts. Well, it can make some presumptions about the set of *primitives* in the :program:`idio` executable, probably, but it can't know whether, say, the symbol ``load``, refers to a primitive or a closure. (In fact, ``load`` will refer to both, and be used in the wrong order!) This is individually true for every file loaded in as none of the loaded files can predict what a previously loaded file has done to modify the state of the :lname:`Idio` engine. The fix to not knowing anything is to have to look everything up. Easy! Strategy ======== I ran around some thoughts previously in :ref:`pre-compilation` and hopefully have brought it all together here. A couple of things from that discussion didn't sink in at the time. The first is the x86_64's use of relative instruction pointers (RIP) into the GOT to access variables for the loaded object which might be extremely convenient there but does seem to my eye to be a touch confusing. A naïve ``x + x`` would use two different RIPs to access the same variable. How would you keep track? The second was the global offset table (GOT) where all the doubly-dereferenced entities live. I couldn't quite see where the GOT was kept as it is loaded-object specific. It turns out that it is in a register. Which register? Why, *any* register, of course, it's up to you to maintain it and apply the appropriate register to your instructions. .. aside:: Yeah, it's obvious *now*\ ! This then led me to the idea that the *VM*'s threads should have another "register" maintaining a reference to whichever set of tables needed double dereferencing for this block of code. For the VM ---------- That leads us to the idea of a table, or series of tables, where the byte code (restricted to small positive integers) will embed a *symbol index*, :samp:`{si}`, which will be mapped through (some other tables) to a *global value index*, :samp:`{gvi}`, containing the value the *VM* wants to operate on. We need a common global value index because my symbol index to the global variable ``A`` is different to your symbol index for ``A``. If I then used my own value index for ``A`` then how are you supposed to know what to access/update? In between symbol index and global value index we want to: * do a cached lookup of the :samp:`{gvi}` immediately * otherwise, a one-time: * recover a symbol from the loaded object's tables * do a symbol lookup of the symbol through the current and imported modules (which will contain the global value index or be auto-vivified as a symbol -- think ``ls``) * set the cached :samp:`{gvi}` for next time * use the :samp:`{gvi}` as above This creates the idea of an *execution environment*, or *xenv*, one per loaded file, with an associated :samp:`{xi}`, the index into the table of all xenvs. I settled on a basic three tables for the *VM*: #. a symbol table mapping the symbol index, :samp:`{si}`, to an index into the constants table, :samp:`{ci}` You only need symbol indices for things you need to lookup (as though they were variables). *symbol* symbols are just constants (if that makes any sense!): .. code-block:: idio-console Idio> printf "%s is %s\n" 'load load load is # Here, we quote the first use of ``load``, as in ``'load``, to access the *symbol* ``load`` which will result in a lookup of some constant index, :samp:`{ci}` in the byte code. The second use of ``load`` is as a variable which means we must do a ``SYM-REF`` of a symbol index, :samp:`{si}`, to force the *VM* to resolve a :samp:`{gvi}`. In particular, the :samp:`{si}` for ``load`` (the second) will map to the same :samp:`{ci}` for ``'load`` as they reference the same *symbol* and there can be only one! #. a constants table containing all of the, uh, constants used in the loaded file Here, constants are, essentially, anything that isn't a small positive integer that can be embedded into the byte code -- think symbols, strings, bignums etc.. In the example from above, there will be constants for the symbol ``printf``, the string ``"%s is %s\n"`` and the symbol ``load`` (only once, there can be only one... *[Ed - stop it!]* ) There is an associated constants hash table which reverses the constants table lookup -- very handy. #. a values table which maps loaded file value indices, :samp:`{vi}`, to the global value indices, :samp:`{gvi}` used by the *VM* A :samp:`{gvi}` of ``0`` (zero) is a sentinel value indicating that the global value index hasn't been looked up yet. Again, for our example, there will be value indices for the symbol indices for ``printf`` and ``load``. Here, it is *critical* that the :samp:`{si}` and :samp:`{vi}` align. That's because the very first thing we do with a :samp:`{si}` is query the values table at slot :samp:`{si}` to see if the corresponding index is non-zero. If non-zero we can access the global values table straight away. If it is zero then we'll have to: #. use the symbol table to map the :samp:`{si}` to some :samp:`{ci}` #. use the constants table to map the :samp:`{ci}` to a symbol #. lookup the symbol information tuple in the current environment (or create a new symbol tuple for the auto-vivified symbol) #. extract the :samp:`{gvi}` from the symbol information tuple #. set the values table at index :samp:`{si}` (the :samp:`{si}` and notional :samp:`{vi}` align) to :samp:`{gvi}` ready for next time #. use the :samp:`{gvi}` index into the global values table as before Note, however, that for operational reasons, you will have more value indices (values you need to use) than you will have symbol indices (symbols you want to lookup) leaving the symbol table with some gaps. A prominent example, here, is the :ref:`previous work ` to separate the creation of closures into two parts: a one-time ``CREATE-FUNCTION`` and a per-use ``CREATE-CLOSURE``. The ``CREATE-CLOSURE`` needs to refer to the value resulting from the ``CREATE-FUNCTION`` and a value index is reserved for that purpose leaving the corresponding symbol table index unused. There are some knock-on effects of xenvs in that, say, any closure defined in an xenv will have embedded in its byte code references to that xenv's tables. As a consequence, the definition of a closure must now include the xenv or, rather, the :samp:`{xi}`, it was defined in such that those lookups will be correct. Similarly, the *arguments constant index*, :samp:`{aci}`, for a *frame* refers to a constant in the current xenv and so to decode it, the frame needs to include the :samp:`{xi}` for the :samp:`{aci}`. .. aside:: Noting that ``load`` is defined in the startup then somehow we've managed to defined fourteen other xenvs before we started! You can see that in the printed form of the closure ``load``, above: ``#`` where the ``[14]`` is indicating that ``load`` was defined in :samp:`{xi}` 14. The *VM* also has some other tables of interest: #. a source code expressions table This is an array of source code expressions. This exists whilst the source code is being parsed but is not saved as it is not possible to recreate this data sensibly without re-parsing the source code. Consider, ``((1 + 2) * (3 + 4))`` where the *reader* can successfully maintain references to the internal sub-lists of the main expression but when written out becomes three distinct lists, ``((1 + 2) * (3 + 4))``, ``(1 + 2)`` and ``(3 + 4)``, with the larger one repeating the smaller two. This duplication of lists balloons the memory requirements enormously. It may be reported in the disassembled code if the ``--vm-tables`` flag is passed to :program:`idio` but otherwise is lost. #. a source code properties table (This is one-for-one with the source code expressions table, above.) This is an array of :samp:`({fi} {line})` tuples where :samp:`{fi}` is the index into the constants table for the source code's file name. The source code properties table is what is indexed by the ``SRC-EXPR`` opcode. Modules ------- The symbol/variable lookup works much as before in that there is a module-specific tables of known (and possibly) exported symbols which contain some internal symbol information data. That symbol information is a tuple of: * the kind of symbol (``'predef`` (primitive), ``'toplevel``, ``'dynamic``, ``'environ``, etc.) * an :samp:`{si}` Up to r0.2 this was an unreliably module-specific :samp:`{ci}`. * a :samp:`{ci}` Up to r0.2 this was a global :samp:`{ci}`. * a :samp:`{gvi}` * the module the symbol was defined in This is useful for debugging as, in order to get this tuple, you must have known the correct module to ask the question of in the first place. In other words, a module is already known. However, as modules import names from other modules, some import history is observable. * a descriptive string This is useful for debugging as it usually hints at the last modifier of this data. Although this isn't reliable. Evaluator --------- In addition, the *evaluator* wants to maintain slightly more information, essentially an association list of the symbol and the symbol information tuple. It'll keep track of things in an *evaluation environment*, or *eenv*. Three Problems ============== I tried to do three things at once which, you might guess, didn't end well. Maintain the Existing Mechanism ------------------------------- As I already had a perfectly good working system, despite it not implementing this new strategy, I thought I could maintain it. The plan was to pre-compile the known startup files (using :file:`lib/evaluate.idio` rather than :file:`src/evaluate.c`). Here, several things went wrong: * a lack of surety that :file:`lib/evaluate.idio` actually does the same thing as :file:`src/evaluate.c` The problem here being that, whilst :file:`lib/evaluate.idio` can run the test suite quite happily, the test suite doesn't exercise the evaluator quite as much as the startup code does. * the pre-compilation was flawed (see below) * :file:`lib/evaluate.idio`, via :file:`lib/expander.idio`, implements a different template expansion mechanism than :file:`src/evaluate.c`, via :file:`src/expander.c` This means the resultant (disassembled) code is hard to compare (easily). * when you load the pre-compiled code, using the new strategy, back in, the :lname:`Idio` engine behaves differently to the way it behaves when evaluating the startup code "live." This makes debugging very difficult as the engine is following different paths through the code base. In the end I junked the idea of a special "first pass", reworking it, and therefore everything, to use the new strategy. Pre-Compile ----------- .. aside:: Read: much at all. I hadn't thought pre-compilation through properly. Here the problem is that the behaviour of the code is dependent on its current state. If one expression would have modified that state then the next expression may react differently. Most of the time this isn't an issue as the *VM* would "just" look a value up or, at least, just use a value as modified by the previous expression changing it. However, we made the judicious decision to allow predefs (primitives) to be shadowed by (hopefully) improved versions in the form of closures. The problem here is that the calling conventions for primitives and closures are not the same. Internally, once the arguments have been computed you can simply call the primitive's underlying :lname:`C` function with the various arguments. For a closure, we construct a frame and copy the argument values into the frame, change the *PC* to the start of the function and have the closure code extract arguments from the frame. The creation and filling of the frame is completely unnecessary for primitives. For pre-compilation we have to pretend to modify the program state so as to know whether the use of a symbol will be for a predef or a closure. As it stands, though, the code modifies the *real* program state not a pretend one. That's sort of OK for source code you've not seen before (although, in effect, you are *running* the source code, not performing a standoff-ish pre-compilation) but is close to disastrous for source code you *have* seen before such as the startup code. For the startup code we are trying to distinguish between running the code for the first time and any subsequent runs which we don't want to modify the program state because it affects the running program... A great example, here, is the use of operators by the reader. Having started up we now have a full set of operators and we want to run :file:`lib/evaluate.idio` over the startup code. However, as the startup code is being read in, which would normally have no operators present (as they haven't been defined yet) operators get applied because we're a fully operational\ *[sic]* system. I did try an entertaining method of removing the user-defined operators then carefully re-adding them as the user-definition appeared. That (surprisingly) worked but didn't help with the other modifications of real program state. Of course it took a while to realise that in modifying the extant state I wasn't pre-compiling but merely sowing confusion. Implement the Strategy ---------------------- Actually implementing the strategy is a thing in itself, of course, with its own set of problems and nuances. We'll cover this in detail in the next sections. In the meanwhile... Undump ====== .. aside:: A *lot* of time. I spent a lot of time debugging and requiring more and more detailed dumping of state, the sort you can see with the ``--vm-tables`` flag, before realising that the "live" state I was dumping was, in fact, the very thing I should be using. All I needed to do was reformat the output into the chosen cache file format and... *Boom!* Job done. Of course, this is exactly the *undumping* described previously. *Ho hum.* Implementation ============== It's a little bit complicated with a symbiotic relationship between xenvs and eenvs. Here, it transpires, as the evaluator reads in source code it must maintain the newly extended source, constant and value tables as well as generating the byte code for the *VM*. The reason for this is that any interactive session must cycle between reading in source code and running the compiled byte code. The evaluator sets everything up ready for the *VM* to roll. The *VM* also needs to be able to refer back to the original source file used by the evaluator so it maintains a reference to the eenv through the xenv. Evaluator Changes ----------------- Much of the evaluator's new behaviour has been abstracted behind some :samp:`idio_meaning_find_{scope}_symbol()` (and :samp:`meaning-find-{scope}-symbol`) functions which, if the symbol is not already known to the evaluator will call ``idio_meaning_extend_tables()`` (``meaning-extend-tables``) which looks to ensure that the symbol and values tables are aligned. The symbol itself can be added to the constants table (and hash), if necessary, and that :samp:`{ci}` used in the symbol table. The values table is extended to keep it aligned but note that, at this point, no :samp:`{gvi}` can be known and the value table entry is set to zero. It will grow the (association) list of known symbols (in this eenv) and, optionally, set the module's symbol information tuple with the same information (including the zero :samp:`{gvi}`). xi 0 ---- :samp:`{xi}` 0 (zero) is the xenv for the global tables, notably the global values table which everything else indirectly seeks to refer to. C add_primitives ---------------- Most of the :lname:`C` modules have an ``idio_X_add_primitives()`` function to bootstrap the system with several hundred primitives. There's nothing to stop module ``A`` defining the same named primitive as module ``B`` -- it does happen: ``number?`` is either the regular :ref:`number? ` (for fixnums and bignums) or the :lname:`C`-specific :ref:`C/number? ` that identifies :lname:`C` numeric types. So the code in ``idio_meaning_predef_extend()`` needs to select a module-specific eenv to add the primitive into. This is the primary reason there are so many xenvs "prior to" starting up: they're the result of the evaluator creating a separate eenv (and therefore xenv) for each :lname:`Idio` module that has had a primitive defined for it. These are really just holding spaces with no associated code. This addition of a primitive is one of the rare occasions that a value is known in the here and now for the evaluator. Accordingly, it can request a new global value index and retro-fit that :samp:`{gvi}` back into the symbol information tuple created by the extend tables code. Evaluation ---------- Evaluation proceeds much as it did before except that name lookups are performed through the eenv-specific list of known symbols rather than resorting to any recursive lookup through imported modules. The resultant symbol information tuple now contains the all important :samp:`{si}` for the symbol which is going to be embedded in the byte code. This means that the first time the evaluator sees the symbol, say, ``printf``, in your code it will fail to find it in its list of previously seen symbols and so extend its tables for this eenv to add ``printf`` as a new constant, an :samp:`{si}` mapping to that new :samp:`{ci}` and a new values table entry of zero (meaning, the *VM* needs to look this up). This is deferring that resolution of ``printf`` to runtime. It only needs to be resolved once (as the resolving code will have updated the values table index with the real live :samp:`{gvi}`) but it is delayed to runtime. This is the correct behaviour for a "pre-compiled" file being loaded in. It doesn't know how this :program:`idio` executable has laid everything out so it must look it up at runtime. Shadowing Predefs ^^^^^^^^^^^^^^^^^ This doesn't happen much but it does happen with a rather critical function, ``load``, so we need to sort it out. The primitive ``load`` opens the file and tries to evaluate and run every expression. That's great until there's an error or even where the loaded file redefines the current module. Who resets everything? That's when our closure ``load`` (in :file:`lib/bootstrap/module.idio`) comes into its own. It utilizes :ref:`unwind-protect ` to ensure that the current module is reset to whatever it was before ``load`` was invoked. Of course, otherwise, this ``load`` simply calls the primitive ``load``. As noted previously, the calling convention for primitives and closures is different therefore we need to ensure that we know when which variant of ``load`` is being called. In practice, the closure ``load`` isn't a definition, it is an assignment to the symbol ``load`` because the closure is being defined inside a block to retain access to the original primitive ``load``, something like: .. code-block:: idio { orig-load := load load = function (filename) { ... orig-load filename ... } } We can't *define* ``load`` inside the block as it will become a block local variable (who's scope ends at the end of the block). So we need to capture assignments to a symbol which is currently flagged as a predef. We can't just override the value because somewhere there will be byte code (you imagine the call to ``orig-load filename`` being one of them) which is expecting a primitive and it's calling convention and, at the same time, we need to prep for code that is expecting our new shiny closure and *its* calling convention. So we need to extend the eenv's symbols again with the same symbol but now different information. Future lookups of ``load`` will get a symbol information tuple for a regular toplevel definition and do the right thing. xi Predefs ^^^^^^^^^^ Part of the problem, here, for ``load`` is that the assignment happens before the primitive is called. Technically, the primitive ``load`` is invoked in order to load :file:`bootstrap.idio` but that happens in a non-xenv-specific way. When the bootstrap code is run, ``load`` (closure or primitive) is not invoked until after we've performed this assignment which makes you wonder who set up the original association between some :samp:`{si}` and the primitive ``load`` in this xenv in order that the assignment to ``orig-load`` at the start of the block finds anything at all. .. aside:: Er, oops. Answer: no-one! So we need to record the set of predefs that this xenv uses in order that, having loaded the compiled code file in we can run round the set of used predefs setting up the value table mapping from :samp:`{si}` to the :samp:`{gvi}` that the :lname:`C` "add primitives" code set up. xi Operators ^^^^^^^^^^^^ In a similar way, we might define an operator in this eenv/xenv but operators always exist in the ``operator`` module, not the current module. That's because their usage is effectively global in that they must be visible through the reader which looks for them in a consistent place, the ``operator`` module. That said, the definition of the operator closure *is* in this eenv and all of its internal symbol references will use this eenv's symbol table etc.. So, operators are handled slightly differently to other toplevel definitions in that they exist in this eenv but their symbol information tuples are set in the ``operator`` module. This distinct set of known operators is saved and restored for no good reason other than general tidiness. Stack Values ^^^^^^^^^^^^ The *VM* will push some "values" onto the stack as it is processing. For example, the condition a trap is for, dynamic and environment variables. What we *can't* do is push values that are xenv-specific as we won't know which xenv we are referring to. We need to push values that are globally recognised. That's not a big deal but it is a deal we need to do. For all of them, we'll pass an :samp:`{si}` in the usual way and have the runtime code discover the appropriate current global value. Traps """"" For a trap we want to push the global equivalent of the :samp:`{si}`'s :samp:`{ci}`. So that's an extra ``constants_lookup_or_extend`` in the global constants table (in :samp:`{xi}` zero). Dynamics and Environs """"""""""""""""""""" For dynamic and environment variables we'll (continue to) push the :samp:`{gvi}` of the variable. That allows separate bodies of code to temporarily create a shadowing dynamic/environment variable, which will use a new :samp:`{gvi}`. Invoking ======== ``--save-xenvs`` Operational Effects =================== It's a mixed bag, I'll be honest. .. sidebox:: :file:`lib/bootstrap/delim-control.idio` being a particularly egregious example where the ratio of template expansion to resultant working code is something like 50:1. In the first instance, loading the compiled startup files, there is a huge runtime saving. When evaluating the code, the first time, the evaluator has to expand all the templates which is very expensive. This results in some 7.5 million opcodes being run, all in, by the *VM*. The resultant code is at least an order of magnitude smaller requiring around 208 thousand opcodes to be run. However, the constant double dereferencing far outweighs that benefit and the overall effect is that the test suite, say, takes about 15-20% longer. It's not just the double de-reference *per se*, it is the jumping about in memory and the complete lack of cache coherence in the code that is the problem. On reflection, it might be necessary to revisit this with a view to a leaner "first pass" with the startup code (and only the startup code?). Here we'd be going back to the idea of using :samp:`{xi}` zero directly, ie. the :samp:`{si}`, and therefore nominal :samp:`{vi}`, *is* the :samp:`{gvi}` and needn't be dereferenced again. Problems Problems ----------------- We have a problem if :lname:`C` code wants to look a variable up as no xenv specifically represents a module. We have a problem if a loaded file redefines a value already set in this file. We have a problem in that *gensyms* are saved. .. include:: ../../commit.rst