Evaluating¶
The evaluator is how we find meaning in the source code.
Knowing that the reader has supplied us with lists of lists, it shouldn’t come as a huge surprise that the evaluator is quite recursive in its nature, but it is relatively straightforward all the same.
There are plenty of complications, of course. For example, there are several situations where the source code is normalised by rewriting it.
This normalisation is a form of implicit syntax transformation (as in, the Idio language makes the transformation according to its built-in rules, which we’re about to discuss). The use of syntax expanders (via templates aka macros) allows users to make explicit syntax transformations.
Context¶
We’re not deviating a great deal from the technique outlined in LiSP ([Que94]). Following it, our search for meaning is going to involve a few basic recurring variables:
e – the expression we’re currently evaluating (not a huge surprise)
As we recursively evaluate the elements of a list, say, then the expression to be evaluated will become the, say, head of the list. When the evaluation recursion unwinds, the expression “to hand” will revert as expected.
We’ll likely have eh and et as the head and tail of a pair/list and further derivatives.
src – an Idio addition is to maintain the original source expression in order that we can pass on any source code properties (namely the lexical object defined by the reader) to any derived expression we might generate
nametree – a “name tree”
As we walk through the lists of lists and determine that new lexical variables are introduced then they push in front of previous lexical variables giving us a hierarchy of known names. This name tree is then available for us to check down when a variable is referenced.
As the lists of lists recursion unwinds then the nametree unwinds with it.
It is a list of lists where the inner lists are of the variables introduced by a given variable-introducing statement.
A name tree is slightly more obvious in Scheme where multiple variables can be introduced in a single let (or variant) statement but the effect is still true in Idio where an assignment operator introduces variables one at a time (in the manner of let*).
{
  a := 1
  ;; nametree ~ ((a))
  ;; => a is first list, first slot ~> SHALLOW-ARGUMENT-REF 0
  b := 2
  ;; nametree ~ ((b) (a))
  ;; => a is second list, first slot ~> DEEP-ARGUMENT-REF 1 0
}
Hmm, not the clearest example but our list of names in the name tree has:
after the first variable assignment, a variable a in scope
after the second variable assignment, a b in scope and then the a, now a level out
The reason this is important goes back to the LiSP mechanism for accessing lexical variables through a linked list of frames. The opcodes go one level back, SHALLOW-ARGUMENT-REF i, or multiple levels back, DEEP-ARGUMENT-REF d i (for some depth d and index within the frame i).
The nested frame mechanism is required because when we call a closure we’ll create a frame here for the arguments to go into then invoke the closure. The first thing the closure mechanism does is reset the frame hierarchy to that which the closure had when it was created. The frame we just created is linked into that “historic” frame hierarchy and the closure runs.
From the closure’s perspective, it sees the arguments to itself in front of the original set of lexical variables when the closure was created.
let is still legal syntax so we can make it a bit more obvious:
{
  let ((a 1)
       (b 2)
       (c 3)) {
    ;; nametree ~ ((a b c))
    ;; => c is first list, third slot ~> SHALLOW-ARGUMENT-REF 2
    x := 2
    ;; nametree ~ ((x) (a b c))
    ;; => c is second list, third slot ~> DEEP-ARGUMENT-REF 1 2
  }
}
Here, now, after the x assignment, we have an x in scope and then all three of a, b and c are known names another level out. All three were created with the same variable-introducing statement, let.
flags – we’ll need some flags to indicate whether:
the expression is in tail position
This is very important – and surprisingly easy to maintain – to give us the power of tail call optimisation.
an Idio addition is the nature of variables being created
Here, we’re looking at whether the variable is being created:
lexically, because we found it in the name tree
at top-level, because we couldn’t find it in the lexical name tree
in a dynamic or environmental or computed context – which is effectively top-level but managed in a different way
cs – a set of known constants
Nominally, this can be used as a “known top-level names” list (amongst other things) but in Idio it is used to map a constant of any kind (symbol, list, array, etc.) into a unique integer for embedding in the byte code.
cm – an Idio addition is the current module
As the source code switches between modules the expectation is that the evaluator can find the correct variable (ie. my v not the other guy’s v) and to effect that we need to track any changes to the sense of the current module by latching onto any module changing statements in the source code.
All of which are C lexical variables used throughout src/evaluate.c (and Idio lexical variables in the Idio variant lib/evaluate.idio, the metacircular evaluator).
In effect, all of the above become formal parameters to almost every function in the evaluator.
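The name tree lookup described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the name_frame and lexical_ref types and the nametree_lookup() function are hypothetical stand-ins (the real name tree is an Idio list of lists, not a C array), but the walk is the same idea: count frames outwards for the depth and slots along for the index.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the name tree: an array of frames, innermost
 * first, each listing the names one variable-introducing statement made. */
typedef struct {
    const char **names;
    size_t count;
} name_frame;

typedef struct {
    int found;      /* 0 => not lexical, so fall back to top level */
    size_t depth;   /* d: how many frames out */
    size_t index;   /* i: slot within that frame */
} lexical_ref;

lexical_ref nametree_lookup(const name_frame *tree, size_t nframes, const char *name)
{
    lexical_ref ref = { 0, 0, 0 };
    assert(name != NULL);
    for (size_t d = 0; d < nframes; d++) {
        for (size_t i = 0; i < tree[d].count; i++) {
            if (strcmp(tree[d].names[i], name) == 0) {
                ref.found = 1;
                ref.depth = d;
                ref.index = i;
                return ref;
            }
        }
    }
    return ref;
}
```

With a tree equivalent to ((x) (a b c)), looking up c gives depth 1 and index 2, which is the DEEP-ARGUMENT-REF 1 2 of the let example, and a name that isn't found is left for the top-level handling.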
In case anyone is still reading, the s in cs for the constants is for “star” as in a more Lispy or EBNF-y c* meaning zero or more.
There is also use of the likes of ep with the p for “plus” as the C equivalent of e+ meaning one or more.
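The run-time side of the name tree, the linked list of frames that a closure call extends, might be sketched like this. The frame struct and both functions are hypothetical, not the VM's actual data structures, and the values are plain int for brevity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame: a block of argument slots plus a link to the
 * enclosing ("historic") frame. */
typedef struct frame {
    struct frame *next;
    int *args;
    size_t nargs;
} frame;

/* Calling a closure: the freshly built argument frame is linked in front
 * of the frame hierarchy the closure captured when it was created. */
void link_call_frame(frame *call, frame *captured)
{
    call->next = captured;
}

/* Walk d frames outwards then pick slot i; d == 0 is the "shallow" case. */
int deep_argument_ref(const frame *f, size_t d, size_t i)
{
    while (d-- > 0) {
        assert(f != NULL);
        f = f->next;
    }
    assert(f != NULL && i < f->nargs);
    return f->args[i];
}
```

The depth and index pair the evaluator derived from the name tree is all the VM needs here: no names survive into the frames.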
evaluate¶
Kicking it all off is idio_evaluate() which looks like this:
IDIO idio_evaluate (IDIO src, IDIO cs)
{
...
IDIO m = idio_meaning (src,
src,
idio_S_nil, /* name tree */
IDIO_MEANING_FLAG_NONE,
cs,
idio_thread_current_module ());
...
return IDIO_LIST2 (..., m);
}
It’ll take some source code, src, and a list of known constants, cs. The source code bit is obvious and most invocations will pass the virtual machine’s constants array for constants.
Fundamental Meaning¶
As noted elsewhere, we rely on the evaluator distinguishing between special forms and templates; anything left over is a derived form or a constant.
It is hugely tempting to add to the list of special forms. Of course, the magic works but it will become a bind it is hard to extract yourself from.
However, even Scheme has a minimal set of special forms to let you bootstrap everything else:
define and set! allow you to bind a name to a value and to change that binding
quote prevents the evaluator evaluating an expression
if provides the conditional consequent / alternative without the evaluator evaluating either
lambda (or function in Idio) lets you define abstractions which you can subsequently invoke – these are the derived forms
define-macro (or define-template in Idio) lets you define your own “special form” – special in that the arguments are not evaluated – albeit all you can do is return more code for the evaluator to evaluate.
There are other special forms which have a genuine need to be handled specially – think of anything that needs to manipulate internal C state – and, of course, some that have snuck in because it is convenient etc..
So, the premise of the main evaluation loop is simply to look at the expression to hand and determine if it is special, a template or otherwise treat it as a derived form or constant.
Lispy languages always have the functional part in the first position of a list so, if the expression is a list we simply need to look at what the first element is.
idio_meaning() in src/evaluate.c (a debatably poor name as it is the evaluator but almost everything is called idio_meaning_something!) has a big test:
IDIO idio_meaning (IDIO src, IDIO e, IDIO nametree, int flags, IDIO cs, IDIO cm)
{
    if (idio_isa_pair (e)) {
        IDIO eh = IDIO_PAIR_H (e);
        IDIO et = IDIO_PAIR_T (e);
        /* e is (eh ...) */
        if (...) {
            ...
        } else if (idio_S_quote == eh) {
            ...
        } else if (idio_S_function == eh) {
            ...
        } else if (idio_S_if == eh) {
            ...
        } else if (idio_S_set == eh) {
            ...
        } else if (idio_S_define_template == eh) {
            ...
        } else if (idio_S_define == eh) {
            ...
        } else {
            /* could be a template */
            if (idio_isa_symbol (eh)) {
                if (idio_expanderp (eh)) {
                    return idio_meaning_expander (e, e, nametree, flags, cs, cm);
                }
            }
            /* default is a function call */
            return idio_meaning_application (src, eh, et, nametree, flags, cs, cm);
        }
    } else {
        /*
         * do something with:
         *
         * symbols: (de-)reference them
         *
         * constants: quote them -- evaluate (12) -> 12
         */
    }
}
and, without suggesting that that is everything, in fact, that single (large) conditional clause is the guts of a Lisp evaluator.
The idio_meaning() function is physically large because it also embeds the initial syntactic checking.
For example, quote takes a single argument to be quoted. Which means that no argument, (quote), or more than one argument, (quote 1 2), must be caught and flagged as errors.
These are slightly obscure and might not happen in practice – as most use of quote is through 'expr where the reader ensures that there is only one expression passed to quote – but we should flag up the error to catch wayward typing.
This testing could be devolved to the specific special form handler, idio_meaning_quotation(), in this case. Yeah, maybe, but I’ve started so I’ll finish.
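The kind of arity check being described for quote might look something like the following. It is a sketch only: the pair struct is a hypothetical cons cell standing in for an Idio pair (with NULL playing nil), and check_quote_form() and its result codes are invented for illustration rather than taken from src/evaluate.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cons cell; NULL plays the role of nil. */
typedef struct pair {
    const char *head;
    struct pair *tail;
} pair;

typedef enum { QUOTE_OK, QUOTE_NO_ARGS, QUOTE_EXTRA_ARGS } quote_check;

/* e is the whole form, (quote ...); the caller has already established
 * that it is a pair whose head is the symbol quote. */
quote_check check_quote_form(const pair *e)
{
    assert(e != NULL);
    const pair *et = e->tail;          /* the argument list */
    if (et == NULL) {
        return QUOTE_NO_ARGS;          /* (quote) */
    }
    if (et->tail != NULL) {
        return QUOTE_EXTRA_ARGS;       /* (quote 1 2) */
    }
    return QUOTE_OK;                   /* (quote 1) */
}
```

Only in the QUOTE_OK case is it safe to take the head of the tail as the thing to be quoted.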
The Result of the Meaning¶
Not yet defined is what idio_meaning() is meant to return. What is it meant to do?
Our goal, from inferring some meaning from the lists of lists the reader gave us, will be to head off to the code generator so we probably want something amenable to that.
In our case, we’re going to have the evaluator generate some “intermediate code.” By this we mean to have reduced the source code expressions down to some high level statements of intent with a vague eye on how the virtual machine works. I confess, that’s not a terribly clear description as, for me, it’s a bit hard to describe without showing examples (coming in the next section).
You can imagine, though, in our highfalutin source language we bind variables to values whereas in the grubby world of machine code we’re going to “set” something.
The intermediate language has a group of constants, IDIO_I_some_thing – with the _I_ for intermediate, which, when we’re finished doing whatever we intend to do with intermediate code, will be translated reasonably straightforwardly into our virtual machine’s machine code, another group of constants, IDIO_A_some_thing – with the _A_ for assembler.
Often, though, I’ll refer to SOME-THING arg meaning the corresponding assembly code written in a more Idio-sympathetic way.
The structure of the intermediate code is… you guessed it, a list of lists of lists. The code generator is expecting that, of course, but as it descends the tree of intermediate code statements it will eventually reach the point where it has to emit a stream of byte code, one intermediate instruction at a time.
In that sense the list of lists of lists becomes a depth-first sequence of instructions for the virtual machine.
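As a rough picture of that depth-first walk, here is a toy flattener. It assumes a much simpler intermediate form than the real one: every hypothetical inode is either a leaf (an opcode with one integer argument) or a group of nested nodes, and emit() writes leaves into an int array in the order it meets them. The real code generator does rather more than this, for if in particular.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical intermediate node: a leaf instruction or a nested group. */
typedef struct inode {
    int opcode;                      /* ignored for groups */
    int arg;                         /* ignored for groups */
    const struct inode **children;   /* non-NULL => a group */
    size_t nchildren;
} inode;

/* Emit node n into out starting at pos; returns the next free position. */
size_t emit(const inode *n, int *out, size_t pos, size_t cap)
{
    if (n->children != NULL) {
        for (size_t i = 0; i < n->nchildren; i++) {
            pos = emit(n->children[i], out, pos, cap);
        }
        return pos;
    }
    assert(pos + 2 <= cap);
    out[pos++] = n->opcode;
    out[pos++] = n->arg;
    return pos;
}
```

However deeply the groups nest, the output is a flat run of opcode and argument pairs in source order.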
Specific Meaning¶
I don’t want to go through all of the special forms but we can look at a few to get the general gist.
quote¶
idio_meaning() invokes the handler with a slightly truncated argument list:
return idio_meaning_quotation (IDIO_PAIR_H (et),
                               IDIO_PAIR_H (et),
                               nametree,
                               flags);
which, on reflection, could be even shorter still as idio_meaning_quotation() is the straightforward:
static IDIO idio_meaning_quotation (IDIO src, IDIO v, IDIO nametree, int flags)
{
    ...
    return IDIO_LIST2 (IDIO_I_CONSTANT_SYM_REF, v);
}
In other words, only the argument to quote, the head-of-the-tail of the original e, is used.
What we’re doing is returning an intermediate instruction to create a “symbolic reference” to a constant from v.
We haven’t created the constant – the code generator will do that – but that is our intent.
What we imagine, then, is that the code generator will add v to the virtual machine’s array of constants and get back the integer index into the array. The code generator will then encode a corresponding IDIO_A_CONSTANT_SYM_REF and then the integer into the byte code.
When the VM runs it’ll hit the IDIO_A_CONSTANT_SYM_REF instruction which will prompt it to read an integer from the byte code and then set the val register to the element in the constants array (indexed by the integer it just read).
So, slightly indirectly, the current value being processed will be v.
The code generator is much more complicated as it tries to make a few educated guesses about how to speed things up. For example, the integer 1 is used “a bit” so it might make some sense to have a special IDIO_A_CONSTANT_1 opcode that simply deposits 1 in the val register and avoids a lengthy indirection via the constants array.
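The round trip through the constants array can be sketched as follows. Everything here is a simplification made for illustration: constants are C strings, the table is a small fixed array, the opcode number and the one-byte index encoding are made up, and the three functions are hypothetical rather than the code generator's or VM's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CONSTANTS 16
enum { OP_CONSTANT_REF = 1 };       /* hypothetical opcode number */

typedef struct {
    const char *items[MAX_CONSTANTS];
    size_t count;
} constants;

/* Return the index of v, adding it if it is not already present. */
size_t constants_intern(constants *cs, const char *v)
{
    for (size_t i = 0; i < cs->count; i++) {
        if (strcmp(cs->items[i], v) == 0) {
            return i;
        }
    }
    assert(cs->count < MAX_CONSTANTS);
    cs->items[cs->count] = v;
    return cs->count++;
}

/* Code generator side: encode the opcode then the index. */
size_t emit_constant_ref(constants *cs, const char *v, unsigned char *code, size_t pc)
{
    code[pc++] = OP_CONSTANT_REF;
    code[pc++] = (unsigned char) constants_intern(cs, v);
    return pc;
}

/* VM side: decode one instruction and return the new "val" register. */
const char *vm_step(const constants *cs, const unsigned char *code, size_t *pc)
{
    unsigned char op = code[(*pc)++];
    assert(op == OP_CONSTANT_REF);
    (void) op;                      /* keep NDEBUG builds warning-free */
    size_t ci = code[(*pc)++];
    assert(ci < cs->count);
    return cs->items[ci];
}
```

Quoting the same constant twice reuses the same index, so the byte code carries two small integers rather than two copies of the value.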
if¶
if is the canonical special form in the sense that it must not have its arguments evaluated before calling the “function” if – there is no function if, of course, its behaviour is encoded in the byte code generated from the special form’s behaviour.
The other Schemely aspect to if is that everything is “true” except #f. We use #t as the standard “not #f” value.
As a side-effect, that means that #t’s existence is very nearly pointless as any value other than #f is true. However, people like a solid two values to choose from in a boolean set so we need to keep #t around.
First, of course, there’s a bit of argument checking. if takes two or three arguments: (if condition consequent alternate) and a variant for when there’s no “else” clause, (if condition consequent).
The latter causes us a problem when some wise-guy rumbles: (if #f consequent). Um, if must return a value – everything returns a value – yet there is no alternate clause… what gives? The Scheme answer appears to be: “void”. A special value suggesting “no computed answer.” The “void” value has no printed representation – well, it’ll come out as #<void> which the reader will reject – although you can test for it with the primitive predicate void?.
For the most part, you suspect it is used in situations where the result from the if clause is thrown away anyway. In the meanwhile, we have a shoo-in value for a non-existent alternate clause, idio_S_void – another magic constant-symbol.
idio_meaning() invokes:
return idio_meaning_alternative (src,
                                 IDIO_PAIR_H (et),  /* condition */
                                 IDIO_PAIR_H (ett), /* consequent */
                                 ehttt,             /* alternate -- could be <void> */
                                 nametree,
                                 flags,
                                 cs,
                                 cm);
In other words the full set of lexical state. This is because any of condition, consequent or alternate could be of arbitrary complexity.
idio_meaning_alternative() is the surprisingly concise:
static IDIO idio_meaning_alternative (IDIO src, IDIO e1, IDIO e2, IDIO e3, ...)
{
    ...
    IDIO m1 = idio_meaning (e1, e1, nametree, IDIO_MEANING_NOT_TAILP (flags), cs, cm);
    IDIO m2 = idio_meaning (e2, e2, nametree, flags, cs, cm);
    IDIO m3 = idio_meaning (e3, e3, nametree, flags, cs, cm);
    return IDIO_LIST4 (IDIO_I_ALTERNATIVE, m1, m2, m3);
}
where we recursively figure out the meanings of the three arguments and return them in a list with the IDIO_I_ALTERNATIVE intermediate code.
So, nothing interesting at all. The code generator for if is quite cunning, mind.
tailp¶
The only thing that will catch your eye is the use of IDIO_MEANING_NOT_TAILP (flags) which unsets the “in tail position” bit in flags. What’s going on here?
Let’s have a quick think about things in tail position. If your if expression is in the middle of a sequence:
define (foo) {
  this
  if condition consequent alternate
  that
}
then you assume that whatever is processing the sequence will have handled that this if is not in tail position so us unsetting the “tailp” flag is neither here nor there.
What if we are in tail position?
define (foo) {
  this
  that
  if condition consequent alternate
}
We know that one of two possible code sequences will apply: either the evaluation of the condition results in “true” and then we’ll run the code for the consequent:
define (foo) {
  this
  that
  condition
  consequent
}
or the evaluation of the condition results in “false” and then we’ll run the code for the alternate:
define (foo) {
  this
  that
  condition
  alternate
}
In both cases, though, the evaluation of the condition is not the last thing to be run. It is never in tail position hence we can scrub the flag when processing it.
Either of the consequent or alternate could be in tail position so we’ll leave the flag alone.
But notice that we don’t set the flag. We only ever disable it.
How does it ever get set, then? Well, it’s only ever set for the body clause of a function definition. The reason is slightly back-to-front.
The whole reason to have tail call optimisation is to avoid “blowing up the stack” by making too many nested function calls. Every function call tacks a bit more stuff on the stack – we save a bit of state in case the thing we call overwrites it – and that accumulated stuff will, eventually, add up.
If we know that we’re in a function call and the last thing we do in this function call is make a function call to someone else then we can skip any state preservation nonsense because whatever the guy we’re about to call is going to return is what we would be returning ourselves in turn. So this guy might as well return direct to our caller.
The details for returning to our caller are on the stack ready for us to use so instead of the full function invocation palaver we effect a sort of function “goto.” This next guy replaces me and, instead of returning a value to me, will, none the wiser, be returning the value to my caller.
So, this “tailp” trickery must require that we’re in a function call – otherwise the replacement and expectation about a function return won’t be on the stack – for us to enact it. Hence the “tailp” flag is only set during the evaluation of a function definition.
A function’s body, however, is usually a sequence – as in a block – in which case the “tailp” flag is suppressed for all but the last statement in the sequence.
Thereafter, whenever a function is invoked, when it reaches the last statement in the body, “tailp” would have been enabled during the evaluation of the meaning of that statement and if that statement resulted, ultimately, in a function call at the end, then the function call will be a function “goto.”
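The flag handling itself is just bit twiddling. A minimal sketch, assuming a single made-up bit for “tailp” (the FLAG_TAILP value and these macro names are hypothetical, not the evaluator's IDIO_MEANING_* definitions), with a helper showing how a sequence suppresses the bit for all but its last statement:

```c
#include <assert.h>
#include <stddef.h>

#define FLAG_TAILP   0x1u                         /* hypothetical bit value */
#define SET_TAILP(f) ((f) | FLAG_TAILP)           /* only for a function body */
#define NOT_TAILP(f) ((f) & ~FLAG_TAILP)          /* scrub the bit */
#define IS_TAILP(f)  (((f) & FLAG_TAILP) != 0)

/* Flags to use when evaluating statement i of an n-statement sequence:
 * only the final statement inherits whatever the sequence was given. */
unsigned sequence_flags(unsigned flags, size_t i, size_t n)
{
    assert(i < n);
    return (i + 1 == n) ? flags : NOT_TAILP(flags);
}
```

Note that sequence_flags() never sets the bit: if the sequence wasn't in tail position to start with then neither is its last statement.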
define¶
define introduces a variable at “top level” and then assigns a value to it, or, more properly, binds it to a value.
The English language expression, “assign to”, suggests that the variable might be a container for the value. In practice, most Idio values are allocated on the C heap and the underlying C IDIO values refer/point to the allocated heap memory – unless it’s a constant or fixnum in which case we squeeze it into the upper bits of the IDIO “pointer”.
So, correctly (most of the time), the C IDIO variable refers/points to some splodge of memory and, by extension, the Idio variable is bound to that splodge of memory (value).
If we subsequently “assign” a different value to the variable then in practice we are simply changing the reference in the IDIO entity to point at a different splodge of memory and the Idio variable is now bound to a different value.
The phrase “assigning a value to a variable” is endemic and mostly incorrect. However, it’s what we say.
“Top level” could mean a global table of known names or, as in the case for Idio, a module-specific table of known names.
This “top level” is usually described as the environment during Lisp language processing. Of course “environment” has an alternative meaning to us shell-people so I’m slightly loath to use it. The virtual machine’s register is still env, though, as a throwback to our Scheme-ly origins.
You might ask why we want to define things rather than simply assign to them, auto-creating the name in the top level as we go? Well, I suppose we could (and, indeed, we can) but there’s an air of organisation and clarity if we’re defining things.
In addition, if a variable is defined before it is (otherwise) used – ie. there are no forward lookups of variables – then we don’t have to employ extra checks to ensure a variable was eventually defined and we’ve not just been left hanging in the wind, here.
define itself has a couple of forms it can be used in:
define name expression – for the straightforward assignment/binding of name, a symbol, to some value resulting from the evaluation of expression
define (name formals*) expression – for the definition of a function with the resultant function value assigned to name
expression will most likely be a block:
define (name formals*) { ... }
This second form is the equivalent of:
define name (function (formals*) { ... })
and this rewrite is exactly what the evaluator does.
You’ll note the extra parentheses around the function definition which, in the first instance, mean that define isn’t given an arbitrary number of arguments but just two, the name and expression, and secondly give the impression (realised in practice) that like any other argument, say, (+ 1 2), the anonymous function definition is instantiated into a function value and it is the function value that is passed to define.
We’ll see this rewrite in a second.
I’m as lazy as the next guy (though maybe not as lazy as that other guy…) so the := operator has been co-opted into use as a synonym for the first form of define: name := expression.
Of course, if it’s the second form, ie. the second argument is a list, and we’re implicitly constructing a function from it then we need to re-tag the newly created function with the source code properties of the original.
idio_meaning() invokes:
idio_meaning_define (src, IDIO_PAIR_H (et), ett, nametree, flags, cs, cm);
where idio_meaning_define() looks like:
static IDIO idio_meaning_define (IDIO src, IDIO name, IDIO e, ...)
Here, name might be a symbol or a list – depending on which form of define was in use.
If name is a list then we know it is (name formals*) so we can extract both name and formals* (the head and tail of the incoming name) to construct a new function, rewriting both name and e in the process:
if (idio_isa_pair (name)) {
    /*
     * (define (func arg) ...) => (define func (function (arg) ...))
     *
     * NB e is already a list
     */
    e = idio_list_append2 (IDIO_LIST2 (idio_S_function,
                                       IDIO_PAIR_T (name)),
                           e);
    name = IDIO_PAIR_H (name);
    idio_meaning_copy_src_properties (src, e);
}
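To see the shape of that rewrite outside of the IDIO macros, here is a toy version using a hypothetical node type for symbols and pairs. None of these helpers are Idio's; they only mimic the list surgery, including the simple-binding case where e is reduced to its head. The allocations are never freed as this is a sketch only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical S-expression node: a symbol if sym is set, otherwise a
 * pair; NULL plays the role of nil. */
typedef struct node {
    const char *sym;
    struct node *head, *tail;
} node;

node *sym(const char *s)
{
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->sym = s;
    return n;
}

node *cons(node *h, node *t)
{
    node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    n->head = h;
    n->tail = t;
    return n;
}

int is_pair(const node *n) { return n != NULL && n->sym == NULL; }
int is_sym(const node *n, const char *s)
{
    return n != NULL && n->sym != NULL && strcmp(n->sym, s) == 0;
}

/* On entry *name and *e are the two arguments to define. For
 * (define (f a b) body...) they become f and (function (a b) body...);
 * for (define f expr ...) e becomes just expr. */
void rewrite_define(node **name, node **e)
{
    if (is_pair(*name)) {
        *e = cons(sym("function"), cons((*name)->tail, *e));
        *name = (*name)->head;
    } else if (is_pair(*e)) {
        *e = (*e)->head;
    }
}
```

The formals list and the body are shared with the original expression rather than copied, which is also why the source properties have to be re-attached to the newly consed function form.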
If name wasn’t a list then this is a simple assignment/binding and we can do a quick check on e as that should just be a simple expression.
if (idio_isa_pair (name)) {
    ...
} else {
    if (idio_isa_pair (e)) {
        e = IDIO_PAIR_H (e);
        idio_meaning_copy_src_properties (src, e);
    }
}
This means that define name expr1 expr2 ... is quietly reduced to just define name expr1. Perhaps we should complain more?
Next we need to look name up. It might already exist. In fact, it might be a lexical variable! In both of those cases, we’ll simply be reverting to assignment of the existing variable – not creating a new one.
IDIO sk = idio_meaning_variable_kind (src,
                                      nametree,
                                      name,
                                      IDIO_MEANING_TOPLEVEL_SCOPE (flags),
                                      cs,
                                      cm);
/* some top level variable creation hocus-pocus if required */
IDIO_MEANING_TOPLEVEL_SCOPE (flags) is used to indicate what sort of variable should be created if an existing variable is not found (hint: a toplevel variable).
The “hocus-pocus” is important – though the details aren’t as it’s a bit bespoke – in that if the result of the variable lookup does not have a VM variable array index associated with it then we generate one right now. We are defining the variable, it definitely exists.
Almost done. We now have an existing or new (top level) variable in our hands so we can do the real action, the assignment. Given that assignment, = or the Scheme-ish set!, needs to be handled in its own right, we can simply jump on the back of it:
return idio_meaning_assignment (src,
                                name,
                                e,
                                nametree,
                                IDIO_MEANING_DEFINE (IDIO_MEANING_TOPLEVEL_SCOPE (flags)),
                                cs,
                                cm);
We pass in a “define” flag with IDIO_MEANING_DEFINE (flags) which adds a prefix to what the assignment function will generate.
We could pull the prefix code the assignment function adds back here but two other places (defining dynamic and environment variables) also do the same. So, put the prefix code in three places or one?
Assignment¶
Assignment is a lot more interesting. Remember it’s called directly as well as from define.
A quick recap on the various ways we might stumble over the assignment of, in particular, a free variable. If we have previously defined a variable (or are in the act of defining one) then we should have an index into the VM’s variable array to hand, vi, and can perform the assignment directly with a GLOBAL-VAL-SET vi instruction.
On the other hand, if we’re mid-function assigning to a variable we haven’t seen defined yet, ie. a forward reference, then we ought to complain if, come the time of assignment when the code is run, the variable had never been defined. That’s poor form on the part of the coder (bad user!).
This is where it gets a little tricky. We know the variable is used – we’re about to assign to it – but we need to know separately whether the variable was defined. So the variable lookup also returns the extra information – in particular it returns 0 (zero) for the VM variable array index.
Under these circumstances we need to have the VM perform a check, which means a different opcode, GLOBAL-SYM-SET ci, where we are required to pass in an index into the VM’s constants array in order that we can dig out the symbol and perform the necessary lookups (through the module’s top-level and the exports of its imported modules) to find out if it’s been defined yet.
Clearly, this isn’t as lean a process as simply assigning to a known variable. What is worse is that we cannot change the opcodes (it’s been a while since you’ve been able to modify assembler mid-run – think: read-only .TEXT segments – and we should not be bucking any trends here) so this assignment will always have to perform this relatively convoluted lookup to get the variable array index it ultimately needs to do the real assignment.
When I get round to Pre-Compiled Modules which will require a double dereference for pre-compiled byte code brought in “from the cold” then I suspect that all the generated byte code will fall into line – for consistency if nothing else.
The only thing that will lose out are any known direct variable assignments, GLOBAL-VAL-SET vi, which would be replaced with a double dereference.
Unless it’s left in as an option.
Anyway, back to assignment in idio_meaning_assignment().
We’ll skip the bit about Setters (too advanced) and syntax checking (too dull).
We’ll figure out the meaning of the expression passed in:
IDIO m = idio_meaning (e,
                       e,
                       nametree,
                       IDIO_MEANING_NO_DEFINE (IDIO_MEANING_NOT_TAILP (flags)),
                       cs,
                       cm);
which handles two things:
the expression is not being evaluated in tail position
This is the expression on the right hand side of an assignment. It will be evaluated before the assignment itself and therefore cannot be in tail position.
we turn off the “define” flag (if it was turned on)
We’ll then look up what kind of a variable name is. If the variable didn’t exist previously then it will now, as a top level one, except it’ll have no value index associated with it.
The kind of variable is now important as it affects the code we want generated:
if it is a lexical variable then we can generate SHALLOW-ARGUMENT-SET i or DEEP-ARGUMENT-SET d i code as appropriate where the variable lookup will have informed us of the relevant values for d and i (and it’s a “shallow” reference if d is zero)
  These can be returned immediately, there’s nothing more to do.
if it is a top-level variable then:
  if we haven’t seen a definition yet then we can generate a GLOBAL-SYM-SET ci assignment
  assign = IDIO_LIST3 (IDIO_I_GLOBAL_SYM_SET, fmci, m);
  otherwise we can generate a GLOBAL-VAL-SET vi assignment
  assign = IDIO_LIST3 (IDIO_I_GLOBAL_VAL_SET, fgvi, m);
if it is a dynamic or environment variable we generate a GLOBAL-SYM-SET ci assignment
if it is a computed variable we generate a COMPUTED-SYM-SET ci with or without a definition tag and return immediately
if it is a predefined variable – ie. a primitive – then there’s a bit of a dance regarding templates which might get run between now (when we’ve just created a new toplevel variable overriding the predefined variable) and when the byte code is run to (re-)define this new toplevel variable.
  So we temporarily set the new toplevel to the old predefined value.
  In this sense, there is a general assumption that if you intend to redefine map, say, then your intention is to create a new function to iterate over lists, applying a function and collecting a result and not, say, go off on some cartographic odyssey.
  Maintaining the old functionality until the new functionality is defined seems sensible enough.
Finally, then, we can return either the assignment or the assignment with a “define” prefix:
if (IDIO_MEANING_IS_DEFINE (flags)) {
return IDIO_LIST2 (IDIO_LIST4 (IDIO_I_GLOBAL_SYM_DEF, name, kind, fmci),
assign);
} else {
return assign;
}
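Stripped of the list building, the choice of assignment instruction by variable kind boils down to a small decision table. The sketch below assumes made-up enums for the kinds and intermediate codes (var_kind, icode and assignment_icode() are hypothetical, not the evaluator's own definitions) and leaves out the predefined-variable dance entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    KIND_LEXICAL, KIND_TOPLEVEL, KIND_DYNAMIC, KIND_ENVIRON, KIND_COMPUTED
} var_kind;

typedef enum {
    I_SHALLOW_ARGUMENT_SET, I_DEEP_ARGUMENT_SET,
    I_GLOBAL_SYM_SET, I_GLOBAL_VAL_SET, I_COMPUTED_SYM_SET
} icode;

/* depth is only meaningful for lexical variables; vi is the VM variable
 * array index, with 0 meaning "no definition seen yet". */
icode assignment_icode(var_kind kind, size_t depth, size_t vi)
{
    switch (kind) {
    case KIND_LEXICAL:
        return depth == 0 ? I_SHALLOW_ARGUMENT_SET : I_DEEP_ARGUMENT_SET;
    case KIND_TOPLEVEL:
        return vi == 0 ? I_GLOBAL_SYM_SET : I_GLOBAL_VAL_SET;
    case KIND_DYNAMIC:
    case KIND_ENVIRON:
        return I_GLOBAL_SYM_SET;
    case KIND_COMPUTED:
        return I_COMPUTED_SYM_SET;
    }
    assert(0 && "unknown variable kind");
    return I_GLOBAL_SYM_SET;
}
```

The interesting row is the top-level one: the same source statement produces the cheap GLOBAL-VAL-SET or the lookup-at-run-time GLOBAL-SYM-SET purely on whether a definition had been seen by the time the evaluator got there.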
sequence¶
For a sequence of statements it is quite important to squash the tailp flag for all but the final statement.
Otherwise the three sequence functions, begin, and and or, only really differ by:
their default value if they are not passed any arguments:
  (begin) is “void” (see if)
  (and) is #t
  (or) is #f
how they decide to stop processing the sequence and what value to return
  begin – stop when it gets to the end of the sequence and return the value of the last expression
  and – stop if any value is #f and return the last value computed
  or – stop when any value is not #f and return the last value computed
Remember, these are the sequence functions not the and and or operators.
They are processed identically, though, at this stage.
Assuming they do have some arguments idio_meaning() calls:
return idio_meaning_sequence (et, et, nametree, flags, eh, cs, cm);
where eh will be begin, and or or and et will be the argument expressions.
idio_meaning_sequence() does a quick test:
if the arguments are, in fact, a single argument then we call idio_meanings_single_sequence() which (recursively) returns the meaning of the head of the list of argument expressions.
otherwise we would have followed in the footsteps of LiSP in calling a function idio_meanings_multiple_sequence() except LiSP, using the underlying Scheme implementation, can recurse to its heart’s content whereas we will eventually blow up our C stack if the sequence is too large.
  The exemplar “large sequence” is that sequence of statements in a large source file.
  In practice, then, we convert the Schemely recursion into a C-friendly iterative loop and walk down the list of arguments converting each one in turn into some meaning and tacking it onto a list.
  Technically, we push it onto the front of a now reversed list of meanings which, come the end of the loop, we reverse.
  However we have managed it, we have a correctly ordered list of meanings onto the front of which we tack the IDIO_I_sequence intermediate code – IDIO_I_BEGIN etc.
In other words:
(and e1) becomes just m1 (converting an expression into a meaning)
(and e1 e2 e3) becomes (IDIO_I_AND m1 m2 m3)
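The “push onto the front then reverse” loop used for multiple-statement sequences is a common way of avoiding deep recursion. A minimal sketch, assuming integers in place of expressions and a hypothetical toy_meaning() in place of a call to idio_meaning() (the cells are never freed, for brevity):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct cell {
    int value;
    struct cell *next;
} cell;

cell *push(cell *list, int value)
{
    cell *c = malloc(sizeof *c);
    assert(c != NULL);
    c->value = value;
    c->next = list;
    return c;
}

cell *reverse_in_place(cell *list)
{
    cell *prev = NULL;
    while (list != NULL) {
        cell *next = list->next;
        list->next = prev;
        prev = list;
        list = next;
    }
    return prev;
}

/* Hypothetical stand-in for finding the meaning of one expression. */
int toy_meaning(int e) { return e * 10; }

/* Iterate, rather than recurse, over the expressions: each meaning goes
 * on the front of the accumulator which is reversed once at the end. */
cell *meanings_of_sequence(const int *exprs, size_t n)
{
    cell *acc = NULL;
    for (size_t i = 0; i < n; i++) {
        acc = push(acc, toy_meaning(exprs[i]));
    }
    return reverse_in_place(acc);
}
```

The C stack depth stays constant however long the sequence is, which is the point when the “sequence” is every statement in a large source file.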
module¶
As mentioned previously, the evaluator cares about the current module and the virtual machine… not so much. The virtual machine does retain the value for the current module if only to have a value to return for (current-module).
The evaluator, of course, needs to keep track of the current module so it can figure out which v you are referring to.
Today it all just works but back when I was loading files a little differently, module and friends required some evaluator support. This section gives a little history you might learn from.
First, a quick diversion.
In the source code you’ll be using module m to change module. module is a template, though, partly because it needs to be concomitant with load.
We have a semantic problem in that if you load in a file which, at the top, says module foo then when do you stop being in module foo? Naturally, you will say, at the end of the file. When is that, given that you are reading a sequence of statements from the file?
There needs to be a hook into load to handle this – but not the hook you necessarily expect. load could also fail and quit early because of any kind of error when reading and evaluating the file. You would expect it to “unwind” the module statement then too.
For handling modules I’ve taken an idea from STklos, that of a module “stack”, which lets you nest (define-module name & body) statements. define-module will catch any conditions and unwind the module stack.
I don’t actually use define-module but rather have a simple module name statement which flips the rest of the file (or to the next module statement) into module name.
I did add an (in-module name & body) which functions identically to define-module but just feels better purposed. Not that I use it much but it can come in handy.
OK, when we run load it needs to be module-aware – and condition aware! – and reset the current module back to whatever it was when load started. And remember to return the result of the actual (original) load call – not that many people will look at it.
Back to the evaluator. In fact, back to when I was entertaining myself with the idea of reading all the expressions in from the source, evaluating them all then running the generated code from them all. (Rather than, read, evaluate and run one expression, read, evaluate and run the next expression, etc..)
The module statement – as the evaluator sees it – isn’t going to change the sense of the current module until we actually get round to running the code which is going to be ages away after we’ve evaluated the rest of the statements in the file. The very statements that want to know they’re in a different module. Hmm.
The above diversion tells us that module is a template – which ultimately calls the primitive %set-current-module! name.
It seems we have a choice, we could replace the primitive %set-current-module! with a special form (which makes a single function call to set the mod register in the VM) or we could have the evaluator spot module as a special form and then run the expander code for module anyway.
For some reason I did the latter. I think it’s because %set-current-module! can be given a parameter rather than a symbol and therefore the evaluator won’t know the value of it until the code is run. module, on the other hand, is forced to be passed a symbol (because it’s a template).
Anyway, for the evaluator, when we see the module statement, we’ll steal the argument (which must be a symbol because module is a template and so won’t have had any arguments evaluated) and set the current module directly here and now. This immediately affects all future variable lookups which will use the current module as their starting point.
This feels slightly wrong. We’re changing the state of the currently running process whilst evaluating and therefore before any code is run. However, it does mean that the evaluator has the correct sense of the current module and subsequent variable lookups do the right thing.
Also note that nothing has set the module back to its original value.
We rely on the improved concomitant load to do that work for us.
There is a similar knock-on effect on module imports and, arguably, exports, as, in particular, module imports need updating immediately in order that the rest of the statements can successfully use variables exported from other modules. We can’t wait until the code is run before knowing what we’ve imported from other modules.
So, the problem here is entirely the “all in one” loading method. If we read, evaluate and run a statement at a time then everything just falls into place.
Last built at 2024-12-21T07:11:00Z+0000 from 463152b (dev)