.. include:: ../../global.rst

**********
Evaluating
**********

The evaluator is how we find meaning in the source code.

Knowing that the reader has supplied us with lists of lists then it shouldn't come as a huge surprise that the evaluator is quite recursive in its nature but is relatively straightforward all the same.

There are plenty of complications, of course.  For example, there are several situations where the source code is normalised by re-writing it.  This normalisation is a form of implicit syntax transformation (as in, the :lname:`Idio` language makes the transformation according to its built-in rules, which we're about to discuss).  The use of syntax expanders (via templates aka macros) allows users to make explicit syntax transformations.

Context
=======

We're not deviating a great deal from the technique outlined in :ref-title:`LiSP` (:cite:`LiSP`) based on which our search for meaning is going to involve a few basic repeating variables:

* ``e`` -- the expression we're currently evaluating (not a huge surprise)

  As we recursively evaluate the elements of a list, say, then the expression to be evaluated will become the, say, head of the list.  When the evaluation recursion unwinds, the expression "to hand" will revert as expected.

  We'll likely have ``eh`` and ``et`` as the head and tail of a pair/list and further derivatives.

* ``src`` -- an :lname:`Idio` addition is to maintain the original source expression in order that we can pass on any source code properties (namely the lexical object defined by the reader) to any derived expression we might generate

* ``nametree`` -- a "name tree"

  As we walk through the lists of lists and determine that new *lexical* variables are introduced then they push in front of previous lexical variables giving us a hierarchy of known names.  This name tree is then available for us to check down when a variable is referenced.  As the lists of lists recursion unwinds then the nametree unwinds with it.

  It is a list of lists where the inner lists are of the variables introduced by a given variable-introducing statement.

  A name tree is slightly more obvious in :lname:`Scheme` where multiple variables can be introduced in a single ``let`` (or variant) statement but the effect is still true in :lname:`Idio` where an assignment operator introduces variables one ``let*`` at a time.

  .. code-block:: idio

     {
       a := 1
       ;; nametree ~ ((a))
       ;; => a is first list, first slot ~> SHALLOW-ARGUMENT-REF0
       b := 2
       ;; nametree ~ ((b) (a))
       ;; => a is second list, first slot ~> DEEP-ARGUMENT-REF 1 0
     }

  Hmm, not the clearest example but our list of names in the name tree has:

  * after the first variable assignment a variable ``a`` in scope

  * after the second variable assignment it then has a ``b`` in scope and then the ``a``, now a level out

  The reason this is important goes back to the :ref-title:`LiSP` mechanism for accessing lexical variables through a linked list of *frames*.  The opcodes look in the nearest frame, :samp:`SHALLOW-ARGUMENT-REF{i}`, or multiple levels back, :samp:`DEEP-ARGUMENT-REF {d} {i}` (for some depth :samp:`{d}` and index within the frame :samp:`{i}`).

  The nested frame mechanism is required because when we call a closure we'll create a frame here for the arguments to go into *then* invoke the closure.  The first thing the closure mechanism does is reset the frame hierarchy to that which the closure had when it was created.  The frame we just created is linked into that "historic" frame hierarchy and the closure runs.  From the closure's perspective, it sees the arguments to itself in front of the original set of lexical variables when the closure was created.

  ``let`` is still legal syntax so we can make it a bit more obvious:

  .. code-block:: idio

     {
       let ((a 1) (b 2) (c 3)) {
         ;; nametree ~ ((a b c))
         ;; => c is first list, third slot ~> SHALLOW-ARGUMENT-REF2
         x := 2
         ;; nametree ~ ((x) (a b c))
         ;; => c is second list, third slot ~> DEEP-ARGUMENT-REF 1 2
       }
     }

  Here, now, after the ``x`` assignment, we have an ``x`` in scope and then all three of ``a``, ``b`` and ``c`` are known names another level out.  All three were created with the same variable-introducing statement, ``let``.

* ``flags`` -- we'll need some flags to indicate whether:

  - the expression is in tail position

    This is very important -- and surprisingly easy to maintain -- to give us the power of tail call optimisation.

  - an :lname:`Idio` addition is the nature of variables being created

    Here, we're looking at whether the variable is being created:

    * lexically, because we found it in the name tree

    * at top-level, because we couldn't find it in the lexical name tree

    * in a dynamic or environmental or computed context -- which is effectively top-level but managed in a different way

* ``cs`` -- a set of known constants

  Nominally, this can be used as a "known top-level names" list (amongst other things) but in :lname:`Idio` it is used to map a constant of any kind (symbol, list, array, etc.) into a unique integer for embedding in the byte code.

* ``cm`` -- an :lname:`Idio` addition is the current module

  As the source code switches between modules the expectation is that the evaluator can find the correct variable (ie. *my* ``v`` not the other guy's ``v``) and to effect that we need to track any changes to the sense of the current module by latching onto any module changing statements in the source code.

All of these are :lname:`C` lexical variables used throughout :file:`src/evaluate.c` (and :lname:`Idio` lexical variables in the :lname:`Idio` variant :file:`lib/evaluate.idio`, the :term:`metacircular evaluator`).

In effect, all of the above become formal parameters to almost every function in the evaluator.
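
To make the name tree lookup described above a little more concrete, here is a minimal sketch in :lname:`C`.  It is an illustration only and not the code in :file:`src/evaluate.c`: the names ``struct frame``, ``struct lexical_ref`` and ``nametree_lookup()`` are invented for the example and plain strings stand in for symbols.  It walks the frames from the innermost outwards and reports the depth and index that a shallow (depth zero) or deep reference would need.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <string.h>

   /* Hypothetical, simplified name tree: a linked list of frames, innermost first. */
   struct frame {
       const char *const *names;   /* names introduced by one statement */
       size_t n_names;
       const struct frame *outer;  /* enclosing frame or NULL */
   };

   struct lexical_ref {
       int found;      /* 0 => not lexical, a top-level candidate */
       size_t depth;   /* d: number of frames walked outwards */
       size_t index;   /* i: slot within that frame */
   };

   struct lexical_ref nametree_lookup (const struct frame *nametree, const char *name)
   {
       struct lexical_ref r = { 0, 0, 0 };

       assert (name != NULL);

       for (const struct frame *f = nametree; f != NULL; f = f->outer, r.depth++) {
           for (size_t i = 0; i < f->n_names; i++) {
               if (strcmp (f->names[i], name) == 0) {
                   r.found = 1;
                   r.index = i;
                   return r;
               }
           }
       }

       /* not found in any frame */
       r.depth = 0;
       return r;
   }

With a name tree equivalent to ``((x) (a b c))``, looking up ``c`` gives depth 1 and index 2, in line with the ``DEEP-ARGUMENT-REF 1 2`` comment in the ``let`` example.  A name that is not found in any frame would be treated as a top-level candidate.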

In case anyone is still reading, the ``s`` in ``cs`` for the constants is for "star" as in a more :lname:`Lisp`\ y or EBNF-y ``c*`` meaning zero or more.  There is also use of the likes of ``ep`` with the ``p`` for "plus" as the :lname:`C` equivalent of ``e+`` meaning one or more.

evaluate
--------

Kicking it all off is ``idio_evaluate()`` which looks like this:

.. code-block:: c

   IDIO idio_evaluate (IDIO src, IDIO cs)
   {
       ...

       IDIO m = idio_meaning (src,
                              src,
                              idio_S_nil,  /* name tree */
                              IDIO_MEANING_FLAG_NONE,
                              cs,
                              idio_thread_current_module ());

       ...

       return IDIO_LIST2 (..., m);
   }

It'll take some source code, ``src``, and a list of known constants, ``cs``.  The source code bit is obvious and most invocations will pass the virtual machine's constants array for constants.

Fundamental Meaning
===================

As noted elsewhere, we rely on the evaluator distinguishing between *special forms* and *templates*, with anything left over being a *derived form* or a constant.

It is *hugely* tempting to add to the list of special forms.  Of course, the magic works but it will become a bind that is hard to extract yourself from.

However, even :lname:`Scheme` has a minimal set of special forms to let you bootstrap everything else:

* ``define`` and ``set!`` allow you to bind a name to a value and to change that binding

* ``quote`` prevents the evaluator evaluating an expression

* ``if`` provides the conditional *consequent* / *alternative* **without** the evaluator evaluating either

* ``lambda`` (or ``function`` in :lname:`Idio`) lets you define abstractions which you can subsequently invoke -- these are the derived forms

* ``define-macro`` (or ``define-template`` in :lname:`Idio`) lets you define your own "special form" -- special in that the arguments are not evaluated -- albeit all you can do is return more code for the evaluator to evaluate.

There are other special forms which have a genuine need to be handled specially -- think of anything that needs to manipulate internal :lname:`C` state -- and, of course, some that have snuck in because it is convenient etc..

So, the premise of the main evaluation loop is simply to look at the expression to hand and determine if it is special, a template or otherwise treat it as a derived form or constant.

:lname:`Lisp`\ y languages always have the functional part in the first position of a list so, if the expression is a list we simply need to look at what the first element is.

``idio_meaning()`` in :file:`src/evaluate.c` (a debatably poor name as it *is* the evaluator but almost everything is called :samp:`idio_meaning_{something}`!) has a big test:

.. code-block:: c

   IDIO idio_meaning (IDIO src, IDIO e, IDIO nametree, int flags, IDIO cs, IDIO cm)
   {
       if (idio_isa_pair (e)) {
           IDIO eh = IDIO_PAIR_H (e);
           IDIO et = IDIO_PAIR_T (e);

           /* e is (eh ...) */
           if (...) {
               ...
           } else if (idio_S_quote == eh) {
               ...
           } else if (idio_S_function == eh) {
               ...
           } else if (idio_S_if == eh) {
               ...
           } else if (idio_S_set == eh) {
               ...
           } else if (idio_S_define_template == eh) {
               ...
           } else if (idio_S_define == eh) {
               ...
           } else {
               /* could be a template */
               if (idio_isa_symbol (eh)) {
                   if (idio_expanderp (eh)) {
                       return idio_meaning_expander (e, e, nametree, flags, cs, cm);
                   }
               }

               /* default is a function call */
               return idio_meaning_application (src, eh, et, nametree, flags, cs, cm);
           }
       } else {
           /*
            * do something with:
            *
            * symbols: (de-)reference them
            *
            * constants: quote them -- evaluate (12) -> 12
            */
       }
   }

and, without suggesting that that is everything, in fact, that single (large) conditional clause is the guts of a :lname:`Lisp` evaluator.

The ``idio_meaning()`` function is physically large because it also embeds the initial syntactic checking.  For example, ``quote`` takes a single argument to be quoted.
Which means that no argument, ``(quote)``, or more than one argument, ``(quote 1 2)``, must be caught and flagged as errors.  These are slightly obscure and might not happen in practice -- as most use of ``quote`` is through :samp:`'{expr}` where the reader ensures that there is only one expression passed to ``quote`` -- but we should flag up the error to catch wayward typing.

This testing could be devolved to the specific special form handler, ``idio_meaning_quotation()``, in this case.  Yeah, maybe, but I've started so I'll finish.

The Result of the Meaning
-------------------------

Not yet defined is what ``idio_meaning()`` is meant to return.  What is it meant to *do*\ ?

Our goal, from inferring some meaning from the lists of lists the reader gave us, will be to head off to the code generator so we probably want something amenable to that.

In our case, we're going to have the evaluator generate some "intermediate code."  By this we mean to have reduced the source code expressions down to some high level statements of intent with a vague eye on how the virtual machine works.

I confess, that's not a terribly clear description as, for me, it's a bit hard to describe without showing examples (coming in the next section).

You can imagine, though, in our highfalutin source language we *bind* variables to values whereas in the grubby world of machine code we're going to "set" something.

The intermediate language has a group of constants, :samp:`IDIO_I_{some_thing}` -- with the ``_I_`` for intermediate -- which, when we're finished doing whatever we intend to do with intermediate code, will be translated reasonably straightforwardly into our virtual machine's machine code, another group of constants, :samp:`IDIO_A_{some_thing}` -- with the ``_A_`` for assembler.

Often, though, I'll refer to :samp:`SOME-THING {arg}` meaning the corresponding assembly code written in a more :lname:`Idio`-sympathetic way.

The structure of the intermediate code is...
you guessed it, a list of lists of lists.  The code generator is expecting that, of course, but as it descends the tree of intermediate code statements it will eventually reach the point where it has to emit a stream of byte code, one intermediate instruction at a time.  In that sense the list of lists of lists becomes a depth-first sequence of instructions for the virtual machine.

Specific Meaning
================

I don't want to go through all of the special forms but we can look at a few to get the general gist.

quote
-----

``idio_meaning()`` invokes the handler with a slightly truncated argument list:

.. code-block:: c

   return idio_meaning_quotation (IDIO_PAIR_H (et),
                                  IDIO_PAIR_H (et),
                                  nametree,
                                  flags);

which, on reflection, could be even shorter still as ``idio_meaning_quotation()`` is the straightforward:

.. code-block:: c

   static IDIO idio_meaning_quotation (IDIO src, IDIO v, IDIO nametree, int flags)
   {
       ...

       return IDIO_LIST2 (IDIO_I_CONSTANT_SYM_REF, v);
   }

in other words, only the argument to ``quote``, the head-of-the-tail of the original ``e``, is used.

What we're doing is returning an intermediate instruction to create a "symbolic reference" to a constant from :samp:`{v}`.  *We* haven't created the constant -- the code generator will do that -- but that is our intent.

What we imagine, then, is that the code generator will add :samp:`{v}` to the virtual machine's array of constants and get back the integer index into the array.  The code generator will then encode a corresponding ``IDIO_A_CONSTANT_SYM_REF`` and then the integer into the byte code.

When the VM runs it'll hit the ``IDIO_A_CONSTANT_SYM_REF`` instruction which will prompt it to read an integer from the byte code and then set the *val* register to the element in the constants array (indexed by the integer it just read).  So, slightly indirectly, the current value being processed will be :samp:`{v}`.
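
The indirection through the constants array can be sketched as follows.  This is a simplified illustration under stated assumptions, not the real code generator or VM: ``struct constants``, ``constants_lookup_or_extend()`` and ``vm_constant_ref()`` are hypothetical names, strings stand in for :lname:`Idio` values and the table has a fixed size.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>
   #include <string.h>

   #define MAX_CONSTANTS 64

   /* Hypothetical constants table: strings stand in for Idio values. */
   struct constants {
       const char *items[MAX_CONSTANTS];
       size_t n;
   };

   /* Return the index of v, appending it first if it is not already known. */
   size_t constants_lookup_or_extend (struct constants *cs, const char *v)
   {
       for (size_t ci = 0; ci < cs->n; ci++) {
           if (strcmp (cs->items[ci], v) == 0) {
               return ci;
           }
       }

       assert (cs->n < MAX_CONSTANTS);
       cs->items[cs->n] = v;
       return cs->n++;
   }

   /* What a VM would do for a constant reference: index back into the table. */
   const char *vm_constant_ref (const struct constants *cs, size_t ci)
   {
       assert (ci < cs->n);
       return cs->items[ci];
   }

The first function plays the part of the code generator, whose integer result is what would be embedded in the byte code.  The second plays the part of the VM, which only needs the integer to recover the value.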

The code generator is much more complicated as it tries to make a few educated guesses about how to speed things up.  For example, the integer 1 is used "a bit" so it might make some sense to have a special ``IDIO_A_CONSTANT_1`` opcode that simply deposits 1 in the *val* register and avoids a lengthy indirection via the constants array.

.. _if:

if
--

``if`` is the canonical special form in the sense that it **must not** have its arguments evaluated before calling the "function" ``if`` -- there is no function ``if``, of course, its behaviour is encoded in the byte code generated from the special form's behaviour.

The other :lname:`Scheme`\ ly aspect to ``if`` is that *everything* is "true" except ``#f``.

.. sidebox:: And we use it as the standard "not ``#f``" value.

As a side-effect, that means that ``#t``'s existence is very nearly pointless as *any* value other than ``#f`` is true.  However, people like a solid two values to choose from in a boolean set so we need to keep ``#t`` around.

First, of course, there's a bit of argument checking.  ``if`` takes two or three arguments: :samp:`(if {condition} {consequent} {alternate})` and a variant for when there's no "else" clause, :samp:`(if {condition} {consequent})`.

The latter causes us a problem when some wise-guy rumbles: :samp:`(if #f {consequent})`.  Um, ``if`` **must** return a value -- *everything* returns a value -- yet there is no :samp:`{alternate}` clause... what gives?

The :lname:`Scheme` answer appears to be: "void".  A special value suggesting "no computed answer."

The "void" value has no printed representation -- well, it'll come out as ``#<void>`` which the reader will reject -- although you can test for it with the primitive predicate ``void?``.

For the most part, you suspect it is used in situations where the result from the ``if`` clause is thrown away anyway.

In the meanwhile, we have a shoo-in value for a non-existent :samp:`{alternate}` clause, ``idio_S_void`` -- another magic constant-symbol.
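
The argument checking for ``if`` might be sketched like this.  It is a hypothetical illustration, not the code in ``idio_meaning()``: ``if_args_parse()``, ``struct if_args`` and ``sketch_void`` are made-up names, the arguments are passed as an array of strings rather than a list and the "void" value is modelled as a sentinel string.

.. code-block:: c

   #include <assert.h>
   #include <stddef.h>

   /* Hypothetical stand-in for the "void" value. */
   const char *const sketch_void = "void";

   struct if_args {
       const char *condition;
       const char *consequent;
       const char *alternate;
   };

   /* Return 0 on success, -1 if there are not two or three arguments. */
   int if_args_parse (const char *const *et, size_t n, struct if_args *out)
   {
       assert (out != NULL);

       if (n < 2 || n > 3) {
           return -1;
       }

       out->condition = et[0];
       out->consequent = et[1];
       /* no "else" clause: fall back to the void sentinel */
       out->alternate = (3 == n) ? et[2] : sketch_void;

       return 0;
   }

The point of the sketch is that, after checking, there are always three expressions to work with, the third being the "void" stand-in when the source only supplied two.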

``idio_meaning()`` invokes:

.. code-block:: c

   return idio_meaning_alternative (src,
                                    IDIO_PAIR_H (et),   /* condition */
                                    IDIO_PAIR_H (ett),  /* consequent */
                                    ehttt,              /* alternate -- could be */
                                    nametree,
                                    flags,
                                    cs,
                                    cm);

In other words the full set of lexical state.  This is because *any* of :samp:`{condition}`, :samp:`{consequent}` or :samp:`{alternate}` could be of arbitrary complexity.

``idio_meaning_alternative()`` is the surprisingly concise:

.. code-block:: c

   static IDIO idio_meaning_alternative (IDIO src, IDIO e1, IDIO e2, IDIO e3, ...)
   {
       ...

       IDIO m1 = idio_meaning (e1, e1, nametree, IDIO_MEANING_NOT_TAILP (flags), cs, cm);
       IDIO m2 = idio_meaning (e2, e2, nametree, flags, cs, cm);
       IDIO m3 = idio_meaning (e3, e3, nametree, flags, cs, cm);

       return IDIO_LIST4 (IDIO_I_ALTERNATIVE, m1, m2, m3);
   }

where we recursively figure out the meanings of the three arguments and return them in a list with the ``IDIO_I_ALTERNATIVE`` intermediate code.

So, nothing interesting at all.  The code generator for ``if`` is quite cunning, mind.

.. _tailp:

tailp
^^^^^

The only thing that will catch your eye is the use of :samp:`IDIO_MEANING_NOT_TAILP ({flags})` which unsets the "in tail position" bit in :samp:`{flags}`.  What's going on here?

Let's have a quick think about things in tail position.  If your ``if`` expression is in the middle of a sequence:

.. parsed-literal::

   define (foo) {
     *this*
     if *condition* *consequent* *alternate*
     *that*
   }

then you assume that whatever is processing the sequence will have handled that this ``if`` is not in tail position so us unsetting the "tailp" flag is neither here nor there.

What if we *are* in tail position?

.. parsed-literal::

   define (foo) {
     *this*
     *that*
     if *condition* *consequent* *alternate*
   }

We know that one of two possible code sequences will apply: either the evaluation of the :samp:`{condition}` results in "true" and then we'll run the code for the :samp:`{consequent}`:

.. parsed-literal::

   define (foo) {
     *this*
     *that*
     *condition*
     *consequent*
   }

or the evaluation of the :samp:`{condition}` results in "false" and then we'll run the code for the :samp:`{alternate}`:

.. parsed-literal::

   define (foo) {
     *this*
     *that*
     *condition*
     *alternate*
   }

In *both cases*, though, the evaluation of the :samp:`{condition}` is **not** the last thing to be run.  It is *never* in tail position hence we can scrub the flag when processing it.

Either of the :samp:`{consequent}` or :samp:`{alternate}` *could* be in tail position so we'll leave the flag alone.

But notice that we don't *set* the flag.  We only ever disable it.

:socrates:`How does it ever get set, then?`

Well, it's only ever set for the body clause of a function *definition*.

The reason is slightly back-to-front.  The whole reason to have tail call optimisation is to avoid "blowing up the stack" by making too many nested function calls.  Every function call tacks a bit more *stuff* on the stack -- we save a bit of state in case the thing we call overwrites it -- and that accumulated *stuff* will, eventually, add up.

If we know that we're *in* a function call and the *last* thing we do in this function call is make a function call to someone else then we can skip any state preservation nonsense because whatever the guy we're about to call is going to return is what *we* would be returning ourselves in turn.  So this guy might as well return direct to our caller.

The details for returning to our caller are on the stack ready for us to use so instead of the full function invocation palaver we effect a sort of function "goto."  This next guy *replaces* me and, instead of returning a value to me, will, none the wiser, be returning the value to *my* caller.

So, this "tailp" trickery **must** require that we're *in* a function call -- otherwise the replacement and expectation about a function return won't be on the stack -- for us to enact it.
Hence the "tailp" flag is only set during the evaluation of a function definition.  A function's body, however, is usually a sequence -- as in a block -- in which case the "tailp" flag is suppressed for all but the last statement in the sequence.

Thereafter, whenever a function is invoked, when it reaches the last statement in the body, "tailp" would have been enabled during the evaluation of the meaning of that statement and if that statement resulted, ultimately, in a function call at the end, then the function call will be a function "goto."

.. _define:

define
------

``define`` introduces a variable at "top level" and then assigns a value to it, or, more properly, *binds* it to a value.

The English language expression, "assign to", suggests that the variable might be a container for the value.

In practice, most :lname:`Idio` values are allocated on the :lname:`C` heap and the underlying :lname:`C` ``IDIO`` values refer/point to the allocated heap memory -- unless it's a constant or fixnum in which case we squeeze it into the upper bits of the ``IDIO`` "pointer".

So, correctly (most of the time), the :lname:`C` ``IDIO`` variable refers/points to some splodge of memory and, by extension, the :lname:`Idio` variable is *bound* to that splodge of memory (value).

If we subsequently "assign" a different value to the variable then in practice we are simply changing the reference in the ``IDIO`` entity to point at a different splodge of memory and the :lname:`Idio` variable is now bound to a different value.

The phrase "assigning a value to a variable" is endemic and mostly incorrect.  However, it's what we say.

"Top level" could mean a global table of known names or, as in the case for :lname:`Idio`, a module-specific table of known names.

This "top level" is usually described as the *environment* during :lname:`Lisp` language processing.  Of course "environment" has an alternative meaning to us shell-people so I'm slightly loath to use it.
The virtual machine's register is still *env*, though, as a throwback to our :lname:`Scheme`-ly origins.

You might ask why we want to *define* things rather than simply assign to them, auto-creating the name in the top level as we go?  Well, I suppose we could (and, indeed, we can) but there's an air of organisation and clarity if we're defining things.  In addition, if a variable is defined before it is (otherwise) used -- ie. there are no forward lookups of variables -- then we don't have to employ extra checks to ensure a variable was eventually defined and we've not just been left hanging in the wind, here.

``define`` itself has a couple of forms it can be used in:

#. :samp:`define {name} {expression}` -- for the straightforward assignment/binding of :samp:`{name}`, a symbol, to some value resulting from the evaluation of :samp:`{expression}`

#. :samp:`define ({name} {formals*}) {expression}` -- for the definition of a function with the resultant function value assigned to :samp:`{name}`

   :samp:`{expression}` will most likely be a block:

   .. parsed-literal::

      define (*name* *formals\**) {
        ...
      }

This second form is the equivalent of:

.. parsed-literal::

   define *name* (function (*formals\**) {
     ...
   })

and this rewrite is exactly what the evaluator does.  You'll note the extra parentheses around the function definition which, in the first instance, mean that ``define`` isn't given an arbitrary number of arguments but just two, the *name* and *expression*, and secondly give the impression (realised in practice) that like any other argument, say, ``(+ 1 2)``, the anonymous function definition is instantiated into a function value and it is the function value that is passed to ``define``.

We'll see this rewrite in a second.

.. sidebox:: Though maybe not as lazy as that *other* guy...

I'm as lazy as the next guy so the ``:=`` operator has been co-opted into use as a synonym for the first form of ``define``: :samp:`{name} := {expression}`.

Of course, if it's the second form, ie. the second argument is a list, and we're implicitly constructing a function from it then we need to re-tag the newly created function with the source code properties of the original.

``idio_meaning()`` invokes:

.. code-block:: c

   idio_meaning_define (src,
                        IDIO_PAIR_H (et),
                        ett,
                        nametree,
                        flags,
                        cs,
                        cm);

where ``idio_meaning_define()`` looks like:

.. code-block:: c

   static IDIO idio_meaning_define (IDIO src, IDIO name, IDIO e, ...)

Here, :samp:`{name}` might be a symbol or a list -- depending on which form of ``define`` was in use.

If :samp:`{name}` is a list then we know it is :samp:`({name} {formals*})` so we can extract both :samp:`{name}` and :samp:`{formals*}` (the head and tail of the incoming :samp:`{name}`) to construct a new function, rewriting both :samp:`{name}` and :samp:`{e}` in the process:

.. code-block:: c

   if (idio_isa_pair (name)) {
       /*
        * (define (func arg) ...) => (define func (function (arg) ...))
        *
        * NB e is already a list
        */
       e = idio_list_append2 (IDIO_LIST2 (idio_S_function,
                                          IDIO_PAIR_T (name)),
                              e);
       name = IDIO_PAIR_H (name);

       idio_meaning_copy_src_properties (src, e);
   }

If :samp:`{name}` *wasn't* a list then this is a simple assignment/binding and we can do a quick check on :samp:`{e}` as *that* should just be a simple expression.

.. code-block:: c

   if (idio_isa_pair (name)) {
       ...
   } else {
       if (idio_isa_pair (e)) {
           e = IDIO_PAIR_H (e);
           idio_meaning_copy_src_properties (src, e);
       }
   }

This means that :samp:`define {name} {expr1} {expr2} ...` is quietly reduced to just :samp:`define {name} {expr1}`.  Perhaps we should complain more?

Next we need to look :samp:`{name}` up.  It *might* already exist.  In fact, it might be a *lexical* variable!  In both of those cases, we'll simply be reverting to assignment of the existing variable -- **not** creating a new one.

.. code-block:: c

   IDIO sk = idio_meaning_variable_kind (src,
                                         nametree,
                                         name,
                                         IDIO_MEANING_TOPLEVEL_SCOPE (flags),
                                         cs,
                                         cm);

   /* some top level variable creation hocus-pocus if required */

:samp:`IDIO_MEANING_TOPLEVEL_SCOPE ({flags})` is used to indicate what sort of variable should be created if an existing variable is not found (hint: a toplevel variable).

The "hocus-pocus" is important -- though the details aren't as it's a bit bespoke -- in that if the result of the variable lookup does not have a VM variable array index associated with it then we generate one right now.  We *are* defining the variable, it definitely exists.

Almost done.  We now have an existing or new (top level) variable in our hands so we can do the real action, the assignment.  Given that assignment, ``=`` or the :lname:`Scheme`-ish ``set!``, needs to be handled in its own right, we can simply jump on the back of it:

.. code-block:: c

   return idio_meaning_assignment (src,
                                   name,
                                   e,
                                   nametree,
                                   IDIO_MEANING_DEFINE (IDIO_MEANING_TOPLEVEL_SCOPE (flags)),
                                   cs,
                                   cm);

We pass in a "define" flag with :samp:`IDIO_MEANING_DEFINE ({flags})` which adds a prefix to what the assignment function will generate.

We could pull the prefix code the assignment function adds back here but two other places (defining dynamic and environment variables) also do the same.  So, put the prefix code in three places or one?

Assignment
----------

Assignment *is* a lot more interesting.  Remember it's called directly as well as from :ref:`define`.

A quick recap on the various ways we might stumble over the assignment of, in particular, a free variable.

If we have previously defined a variable (or are in the act of defining one) then we should have an index into the VM's variable array to hand, :samp:`{vi}`, and can perform the assignment directly with a :samp:`GLOBAL-VAL-SET {vi}` instruction.

On the other hand, if we're mid-function assigning to a variable we haven't seen defined yet, ie.
a forward reference, then we ought to complain if, come the time of assignment when the code is run, the variable had never been defined.  That's poor form on the part of the coder (*bad user!*).

This is where it gets a little tricky.  We know the variable is used -- we're about to assign to it -- but we need to know *separately* whether the variable was defined.  So the variable lookup also returns the extra information -- in particular it returns 0 (zero) for the VM variable array index.

Under these circumstances we need to have the VM perform a check, which means a different opcode, :samp:`GLOBAL-SYM-SET {ci}`, where we need to pass in an index into the VM's constants array in order that we can dig out the symbol and perform the necessary lookups (through the module's top-level and the exports of its imported modules) to find out if it's been defined yet.

Clearly, this isn't as lean a process as simply assigning to a known variable.  What is worse is that we cannot change the opcodes (it's been a while since you've been able to modify assembler mid-run -- think: read-only ``.TEXT`` segments -- and we should not be bucking any trends here) so this assignment will *always* have to perform this relatively convoluted lookup to get the variable array index it ultimately needs to do the real assignment.

When I get round to :ref:`pre-compilation` which will require a double dereference for pre-compiled byte code brought in "from the cold" then I suspect that *all* the generated byte code will fall into line -- for consistency if nothing else.  The only things that will lose out are any known direct variable assignments, :samp:`GLOBAL-VAL-SET {vi}`, which would be replaced with a double dereference.  Unless it's left in as an option.

Anyway, back to assignment in ``idio_meaning_assignment()``.

We'll skip the bit about :ref:`setters` (too advanced) and syntax checking (too dull).

We'll figure out the meaning of the expression passed in:

.. code-block:: c

   IDIO m = idio_meaning (e,
                          e,
                          nametree,
                          IDIO_MEANING_NO_DEFINE (IDIO_MEANING_NOT_TAILP (flags)),
                          cs,
                          cm);

which handles two things:

#. the expression is not being evaluated in tail position

   This is the expression on the right hand side of an assignment.  It will be evaluated before the assignment itself and therefore cannot be in tail position.

#. we turn off the "define" flag (if it was turned on)

We'll then look up what *kind* of a variable :samp:`{name}` is.  If the variable didn't exist previously then it will now, as a top level one, except it'll have no value index associated with it.

The kind of variable is now important as it affects the code we want generated:

* if it is a lexical variable then we can generate :samp:`SHALLOW-ARGUMENT-SET{i}` or :samp:`DEEP-ARGUMENT-SET {d} {i}` code as appropriate where the variable lookup will have informed us of the relevant values for :samp:`{d}` and :samp:`{i}` (and it's a "shallow" reference if :samp:`{d}` is zero)

  These can be returned immediately, there's nothing more to do.

* if it is a top-level variable then:

  * if we haven't seen a definition yet then we can generate a :samp:`GLOBAL-SYM-SET {ci}` assignment

    .. code-block:: c

       assign = IDIO_LIST3 (IDIO_I_GLOBAL_SYM_SET, fmci, m);

  * otherwise we can generate a :samp:`GLOBAL-VAL-SET {vi}` assignment

    .. code-block:: c

       assign = IDIO_LIST3 (IDIO_I_GLOBAL_VAL_SET, fgvi, m);

* if it is a dynamic or environment variable we generate a :samp:`GLOBAL-SYM-SET {ci}` assignment

* if it is a computed variable we generate a :samp:`COMPUTED-SYM-SET {ci}` with or without a definition tag and return immediately

* if it is a predefined variable -- ie. a primitive -- then there's a bit of a dance regarding *templates* which might get run between now (when we've just created a new toplevel variable overriding the predefined variable) and when the byte code is run to (re-)define this new toplevel variable.

  So we temporarily set the new toplevel to the old predefined value.
  In this sense, there is a general assumption that if you intend to redefine ``map``, say, then your intention is to create a new function to iterate over lists, applying a function and collecting a result and not, say, go off on some cartographic odyssey.  Maintaining the old functionality until the new functionality is defined seems sensible enough.

Finally, then, we can return either the assignment or the assignment with a "define" prefix:

.. code-block:: c

   if (IDIO_MEANING_IS_DEFINE (flags)) {
       return IDIO_LIST2 (IDIO_LIST4 (IDIO_I_GLOBAL_SYM_DEF, name, kind, fmci),
                          assign);
   } else {
       return assign;
   }

sequence
--------

For a sequence of statements it is quite important to squash the :ref:`tailp` flag for all but the final statement.

Otherwise the three sequence functions, ``begin``, ``and`` and ``or``, only really differ by:

#. their default value if they are not passed any arguments:

   * ``(begin)`` is "void" (see :ref:`if <if>`)

   * ``(and)`` is ``#t``

   * ``(or)`` is ``#f``

#. how they decide to stop processing the sequence and what value to return

   * ``begin`` -- stop when it gets to the end of the sequence and return the value of the last expression

   * ``and`` -- stop if any value is ``#f`` and return the last value computed

   * ``or`` -- stop when any value is not ``#f`` and return the last value computed

Remember, these are the sequence functions not the ``and`` and ``or`` *operators*.  They are processed identically, though, at this stage.

Assuming they *do* have some arguments ``idio_meaning()`` calls:

.. code-block:: c

   return idio_meaning_sequence (et, et, nametree, flags, eh, cs, cm);

where ``eh`` will be ``begin``, ``and`` or ``or`` and ``et`` will be the argument expressions.

``idio_meaning_sequence()`` does a quick test:

* if the arguments are, in fact, a single argument then we call ``idio_meanings_single_sequence()`` which (recursively) returns the meaning of the head of the list of argument expressions.
* otherwise we *would* have followed in the footsteps of :ref-title:`LiSP` in calling a function ``idio_meanings_multiple_sequence()`` except :ref-title:`LiSP`, using the underlying :lname:`Scheme` implementation, can recurse to its heart's content whereas we will eventually blow up our :lname:`C` stack if the sequence is too large.

  The exemplar "large sequence" is that sequence of statements in a large source file.

  In practice, then, we convert the :lname:`Scheme`\ ly recursion into a :lname:`C`-friendly iterative loop and walk down the list of arguments converting each one in turn into some meaning and tacking it onto a list. Technically, we push it onto the front of a now reversed list of meanings which, come the end of the loop, we reverse.

However we have managed it, we have a correctly ordered list of meanings onto the front of which we tack the :samp:`IDIO_I_{sequence}` intermediate code -- ``IDIO_I_BEGIN`` etc..

In other words:

* :samp:`(and {e1})` becomes just :samp:`{m1}` (converting an expression into a meaning)

* :samp:`(and {e1} {e2} {e3})` becomes :samp:`(IDIO_I_AND {m1} {m2} {m3})`

.. _module:

module
------

As mentioned previously, the evaluator cares about the current module and the virtual machine... not so much. The virtual machine does retain the value for the current module if only to have a value to return for ``(current-module)``.

The evaluator, of course, needs to keep track of the current module so it can figure out which ``v`` you are referring to.

Today it all just works but back when I was loading files a little differently, ``module`` and friends required some evaluator support. This section gives a little history you might learn from.

First, a quick diversion.

.. aside::

   I thought that *concomitant* was along the lines of "co-committed" -- albeit with a funny spelling (so what's new in English?)

   However, it is derived from *con* (together with) and *comes* (companion) ultimately giving you the meaning "accompanying."
   Here it is used to mean "must be defined with reference to each other."

In the source code you'll be using :samp:`module {m}` to change module. ``module`` is a template, though, partly because it needs to be *concomitant* with ``load``.

We have a semantic problem in that if you load in a file which, at the top, says ``module foo`` then when do you stop being in module ``foo``? Naturally, you will say, *at the end of the file*. When is that, given that you are reading a sequence of statements from the file? There needs to be a hook into ``load`` to handle this -- but not the hook you necessarily expect.

``load`` could also fail and quit early because of any kind of error when reading and evaluating the file. You would expect it to "unwind" the ``module`` statement then too.

For handling modules I've taken an idea from STklos_, that of a module "stack" which lets you nest :samp:`(define-module {name} & {body})` statements. ``define-module`` will catch any conditions and unwind the module stack.

I don't actually use ``define-module`` but rather have a simple :samp:`module {name}` statement which flips the rest of the file (or to the next ``module`` statement) into module :samp:`{name}`.

I did add an :samp:`(in-module {name} & {body})` which functions identically to ``define-module`` but just feels better purposed. Not that I use *it* much but it can come in handy.

OK, when we run ``load`` it needs to be module-aware -- and condition aware! -- and reset the current module back to whatever it was when ``load`` started. And remember to return the result of the actual (original) ``load`` call -- not that many people will look at it.

Back to the evaluator. In fact, back to when I was entertaining myself with the idea of reading all the expressions in from the source, evaluating them all then running the generated code from them all. (Rather than, read, evaluate and run one expression, read, evaluate and run the next expression, etc..)
The ``module`` statement -- as the evaluator sees it -- isn't going to change the sense of the current module until we actually get round to running the code, which is going to be ages away, after we've evaluated the rest of the statements in the file. The very statements that want to know they're in a different module. Hmm.

The above diversion tells us that ``module`` is a template -- which ultimately calls the primitive :samp:`%set-current-module! {name}`.

It seems we have a choice: we could replace the primitive ``%set-current-module!`` with a special form (which makes a single function call to set the *mod* register in the VM) or we could have the evaluator spot ``module`` as a special form and then run the expander code for ``module`` anyway.

For some reason I did the latter. I think it's because ``%set-current-module!`` can be given a parameter rather than a symbol and therefore the evaluator won't know the value of it until the code is run. ``module``, on the other hand, is forced to be passed a symbol (because it's a template).

Anyway, for the evaluator, when we see the ``module`` statement, we'll steal the argument (which must be a symbol because ``module`` is a template and so won't have had any arguments evaluated) and set the current module directly here and now. This immediately affects all future variable lookups, which will use the current module as their starting point.

This *feels* slightly wrong. We're changing the state of the currently running process whilst evaluating and therefore before any code is run. However, it does mean that the evaluator has the correct sense of the current module and subsequent variable lookups do the right thing.

Also note that nothing has set the module back to its original value. We *rely* on the improved concomitant ``load`` to do that work for us.
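
The division of labour described above can be sketched in a few lines of :lname:`C`. This is only an illustration under stated assumptions, not the actual :lname:`Idio` code: ``toy_stmt``, ``toy_evaluate()``, ``toy_load()``, ``current_module`` and ``last_lookup_module`` are all hypothetical names invented for the sketch. The toy evaluator switches module as soon as it sees a ``module`` statement and the toy ``load`` restores the saved module on every exit path, including the error path.

.. code-block:: c

   #include <stddef.h>

   /* Hypothetical stand-in for the evaluator's notion of the current module. */
   const char *current_module = "Idio";

   /* Records which module the most recent non-module statement was evaluated in. */
   const char *last_lookup_module = NULL;

   /* A toy statement: either `module NAME` or "anything else". */
   typedef struct toy_stmt {
       int is_module;      /* non-zero for a `module NAME` statement */
       const char *name;   /* the module name, otherwise unused */
       int fails;          /* non-zero simulates an error during evaluation */
   } toy_stmt;

   /* Evaluate one statement: 0 on success, -1 on (simulated) error. */
   int toy_evaluate (const toy_stmt *s)
   {
       if (s->fails) {
           return -1;
       }

       if (s->is_module) {
           /* change module here and now, before any code is run */
           current_module = s->name;
       } else {
           /* later lookups start from whatever module is current */
           last_lookup_module = current_module;
       }

       return 0;
   }

   /* Load a "file" of statements, restoring the module however we leave. */
   int toy_load (const toy_stmt *stmts, size_t n)
   {
       const char *saved = current_module;
       int r = 0;

       for (size_t i = 0; i < n && 0 == r; i++) {
           r = toy_evaluate (&stmts[i]);
       }

       /* reached on both the success and the early-exit error path */
       current_module = saved;

       return r;
   }

Loading a two statement "file" of ``module foo`` followed by anything else leaves ``last_lookup_module`` as ``foo`` and ``current_module`` back at its starting value, whether or not the second statement fails. The sketch uses a return code where the text talks about conditions; that substitution is purely to keep the example small.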
There is a similar knock-on effect on module imports and, arguably, exports, as, in particular, module imports need updating immediately in order that the rest of the statements can successfully use variables exported from other modules. We can't wait until the code is run before knowing what we've imported from other modules.

So, the problem here is *entirely* the "all in one" loading method. If we read, evaluate and run a statement at a time then everything just falls into place.

.. include:: ../../commit.rst