Escapes¶
Escapes can be thought of as a limited ability to truncate further processing and continue from a given point.
An obvious example is C‘s return
which says a function
should stop further processing and return a value now.
Loops in many languages have break
and continue
(or
equivalent) to stop processing of the loop altogether or stop further
processing of this iteration and go on to the next iteration of the
loop.
Elsewhere you might throw
a value back to a preceding catch
that can be in a different function.
return
, break
and continue
have lexical scope – you don’t
really expect them to jump function boundaries – whereas catch
and throw
have dynamic scope.
Another interesting titbit is
that Common Lisp’s lexical escape block label ...
and return-from label ...
use label
unevaluated
which suggests at least a macro, if not a special form. The dynamic
escape, catch label ...
and throw label ...
have
label
evaluated which suggests they are derived forms,
ie. functions.
Lexical Escapes¶
Lexical escapes, whilst visually easier to see and reason about are surprisingly hard to implement.
Let’s take return
. return
should return from a function but
wait, we do a lot of transformations from incidental things into
functions. let
, or other assignments, is at risk of becoming a
function:
{
return 0
}
should report an error as there’s no function to return from but:
{
a := 1
return 0
}
won’t because it is transformed into the application of a closed
function taking a single parameter, a
, to the argument
1
:
{
(function (a) {
return 0
}) 1)
}
and return
now has a function to return from.
That’s probably not what we meant to return from – and we would probably still expect to get an error – and lots of transformations occur. Not just in core rewriting but most templates will use functions with the original source text embedded in them.
*
We could differentiate between an anonymous function and a named
function. Common Lisp supports something along these lines
by associating every defined function with an implied block
named
after the function and most looping operations have a block named
nil
that can be returned from.
The idea being that for the named function we can wrap the function body with the implied escape block before asking the evaluator to do its thing.
In our case, though, we have started to lose the distinction between named and unnamed functions because we transform:
define (foo a) {
...
}
into:
set! foo (function (a) {
...
})
and the actual function value is anonymous again.
We could float the idea of having the evaluator’s define
code
recognise where the value is a function and do something but slightly
worse is the rewriting of a nested function:
define (foo a) {
define (bar b) {
...
}
define (baz c) {
...
}
...
}
into:
define (foo a) {
set! bar #f
set! baz #f
set! bar (function (b) {
...
})
set! baz (function (c) {
...
})
...
}
which is a transform where we don’t have a handle on the individual
define
s and we can’t handle it as easily.
That directs us into the code for handling assignment and having the function be identified there. Which is OK, not great, but at least we can capture that association between a function name and the underlying function value.
Nested functions introduce another dilemma. Does it make sense for a
nested function to return-from
(in Common Lisp parlance)
the outer function?
define (foo a) {
define (handler c) {
...
return-from foo #t
}
...
trap some-condition handler {
risky-processing ...
}
}
Actually, I think I’m OK with that. The return-from
label,
foo
, is in scope for handler
and it makes sense for a handler
to patch up whatever mess has occurred and exit cleanly.
A slightly clearer example might be a bespoke SIGINT
handler for
risky-processing
that could break out of a time- or
resource-consuming loop.
We could have a system-wide SIGINT
handler but that wouldn’t know
it was in risky-processing
and wouldn’t know what resources needed
to be cleaned up.
A slightly more dubious case might be:
while #t {
define (handler c) {
...
break
}
...
trap some-condition handler {
risky-processing ...
}
}
where we’re achieving the same effect although it seems less elegant to have a handler function break from an enclosing loop.
I think it really is much the same, though, and the uncomfortable feeling is probably because we’re not used to nested functions.
Escapees¶
In C, return
may or may not take an argument – notably
not if the function returns void
. break
and continue
never take arguments, in fact, they are bare words.
Implementations¶
Let’s consider that, somehow, we have an escape-block label
body
from within which we can escape-from label [val]
with an optional value.
A LiSP Escaper¶
This feels like a curious hybrid as the use of the stack to store the continuation to use is surely dynamic behaviour yet the validation is done lexically by the evaluator.
In LiSP, suggests a simple stack
oriented escaper which we can rework into something along the lines
of the Common Lisp block
and return-from
.
We’ll use the names escape-block
and escape-from
partly as I
already use a (largely redundant) block
expression and partly
because I prefer it when unfamiliar things have a name related to
their function.
We can handle both with special forms – although special forms will
cause problems with unwind-protect
.
In the first instance we need to build a set of lexical escapes as we
pass through the escape-block
s starting with #n
at the top
level. The label
is not evaluated and should be a symbol –
although as it isn’t evaluated then anything passed will be used as a
constant. We’ll need to pass our list of escapers through to all the
functions deriving meaning (think of them as another argument, adjunct
to nametree
) but the list is just that, a list, and we can simply
prepend new labels onto the front of the list when we see an
escape-block
.
If we have successfully captured the association between a name and a function then we are in the position of clearing the slate of existing escapes when we enter the definition of a new function. I had varying degrees of success with this while experimenting but it’s a question of semantics as to whether a nested function should be able to see its enclosing functions’ escapes.
The meaning of the body
can be wrappered with
PUSH-ESCAPER ci
and POP-ESCAPER
instructions.
The escaper functions handle the extra values on the stack:
the
ci
of thelabel
the saved frame
the PC to continue with – the one after
POP-ESCAPER
A slight annoyance is that the PC must be absolute. We can’t use a relative PC here as a sort of jump as we don’t know what the current PC is at the time the escaper is invoked! We could have gotten anywhere in the code since this PC was stored on the stack so our only hope is an absolute PC.
It’s no worse than storing the absolute PC that
RETURN
uses but it feels wrong.
The special form escape-from
then only needs to verify that
label
exists in the current escape list and it can invoke
the ESCAPER-LABEL-REF ci
instruction.
At runtime, as we lexically verified that label
was in scope
during evaluation, we can feel reasonably confident that
label
‘s corresponding ci
will be on the stack.
We can search down (up?) for it and:
squelch the stack above this point
restore the saved frame
set val to any supplied value
jump to the absolute PC
And on we go. Seems to work.
break
and continue
¶
In one sense these are quite easy as they are explicit
escape-block
labels.
Essentially, Scheme’s do
wants to return a value
whereas the C or Bash for
doesn’t (care to).
We can rework Scheme’s do
into something more like a
C or Bash for
loop where the interesting thing
to us about the template is:
#T{
{
$loop :+ function $v {
(escape-block break
(if $test {
(escape-block continue {
$@body
})
($loop $@s)
}))
}
$loop $@i
}
}
Here, you can see we’ve just inserted the relevant escape-block
s
in the loop such that escape-from break
will do the right thing.
We want break
not escape-from break
(although it will work) so
we’ll need some syntax sugar to translate the one into the other.
Of course, we now need escape-from
, the special form, to be
willing to accept no value and use, say, (void)
in its stead (more
syntax sugar).
return
¶
Here, I think, we want return
to mean return from the current
function. That’s a bit tricky as we don’t know what the current
function’s name is!
The Common Lisp-y return-from label
at least knows
explicitly what it wants but plain old return
is a bit vague.
I think we can play a similar trick to break
and continue
,
though, in that when we pick up on the association between a name and
a function in assignment and insert an escape block for the function
name we can also insert an escape block for the generic label
return
which will allow us to escape-from return
much like
escape-from break
, above. We return from the nearest enclosing
(named) function body.
The net effect of the two escape blocks would be something like:
define (name args) { (escape-block name (escape-block return ... )) }
Usage comes up in a couple of places. return
might be used as a
name in function position or as a bare word:
return #t
return
As a name in functional position we’ll need to handle return
like
break
and continue
, above. Bare words are more tricky.
Bare Words¶
What if we want to have just plain old:
break
continue
return
Bare words are handled at the bottom of the
meaning function (idio_meaning()
or meaning
) where,
normally, if the bare word is a symbol then we look it up as a
reference.
We can make a check there to see if the bare word is return
, say,
and execute the code to invoke the escape.
That works fine for bare words on their own as above.
However, the same code is used for the expansion of any symbol in an argument position. That means we can write:
define (foo) {
printf "A return looks like %s\n" return
}
and we won’t see a thing because the action of evaluating the third
element of the list, return
, replaces it with escape-from
return
which is compiled into ESCAPER-LABEL-REF return
.
In other words, when we run through this code in the VM, we stop
processing at that point (in the middle of the arguments to
printf
) and leave the function immediately (with a value of
(void)
as no argument was passed).
Tricky. I suppose the argument would be that return
is a reserved
word and you get (exactly) what you asked for.
unwind-protect¶
We have a problem, though, in the shape of the interaction between
escapers and unwind-protect
.
unwind-protect
is a good thing (although there are plenty of
people who are less keen) but we don’t get to use it here.
define (foo x) {
resource := #f
unwind-protect {
resource = some precious thing
trap some-conditions (function (c) {
return
}) {
risky-processing
}
} {
freeup resource
}
}
Here, if the trap handler is invoked and calls return
to leave
foo
then we haven’t run the cleanup clause from unwind-protect
and the resource is locked away.
When we apply the escape-label-ref label
special form,
ultimately, ESCAPER-LABEL-REF ci
, in the byte code, we’ll
find the corresponding escaper on the stack and unwind the stack and
frame back to there. The point being, that we are unaware of
unwind-protect
and its cleanup clause.
Special Forms¶
There is no function to be re-written for special forms – no matter how hard we try. We can’t capture it in a name:
original-escape-block := escape-block
because the evaluator will see the right-hand side escape-block
in
functional position in an expression and complain that there are no
arguments.
We can’t wrap it in a template:
define-template (template-escape-block label & args) {
val := (void)
if (pair? args) {
val = ph args
}
#T{
escape-block '$label $val
}
}
because the evaluator will see escape-block
in functional position
in an expression and then use the unevaluated word '$label
as the
label. Erm, but we want '$label
evaluated to give us the
not-to-be-evaluated label…
Drat!
As an observation, the
unwind-protect
cleanup clause that we skipped with our special form lexical escape will (often but not always) be run albeit some time after it is expected. The reason is thatload
has done the honours for us.
load
wants to ensure that the current module is restored when a loaded file is finished with (either normally or abnormally) and so restores the current module in aunwind-protect
cleanup clause.So when
load
has itsunwind-protect
cleanup clause run then the recursive nature of the unwinding mechanism will mean that any other staged cleanup clauses from otherunwind-protect
calls will get run (in correct order).However, you can’t force that mechanism, it is a side-effect. So if your
unwind-protect
call is holding critical resource, you can’t rely on a timely cleanup.
Go Deeper!¶
For special forms, we could say, OK, let’s rewrite the
dynamic-wind
function (and attendant values) in C and
make the special forms use it.
That’s possible. However, as things stood at the time I had
dynamic-wind
implemented in lib/call-cc.idio
using reroot!
and *here*
which were private variables inside the block of code
that uses them.
Private variables are a very neat trick to encapsulate local variables
and we can carry on in that vein here using C functions.
Until someone, say, call/cc
, wants to be re-written in
Idio-land and we now have to expose the
(wanting-to-be-private) C function and variable to arbitrary
Idio functions.
That sounds like trouble waiting to happen.
Go Functional¶
We could implement escape-block
and escape-from
as functions.
There’s several problems here.
In the first instance, as functions, all their arguments are
evaluated, notably, the label
. That’s not the end of the
world, you’d just have to quote
the label
to be safe.
However, a function having label
in its hands isn’t the same
as the evaluator having label
in its hands. As the
evaluator is processing the, now arbitrary, function escape-block
it would need to identity it and pluck the label
argument
from it and add that to the list of lexical escapes in order that it
can verify the presence of the escape from it identifies the, now
arbitrary, function escape-from
and plucks its label
argument from it.
There’s a little too much in the way of having the evaluator know what the (arbitrary) functions are meant to be doing in that approach.
It doesn’t sound like it will scale.
It certainly doesn’t look like Art.
Lexical Verification¶
In LiSP, gives us a good gnarly example of the lexical checking:
((escape-block foo (function (x) (escape-from foo x))) 37)
Here, the application of a closed function to 37, ((...) 37)
,
results in a function value whose body, when invoked will try to
escape from foo
.
But, wait. The escape block for foo
has been unwound from our
list of lexical escapes after evaluating the expression (...)
–
it is no longer in scope. This means that when the function value is
run it will try to access a named escape block that no longer exists.
In Idio we’ll get an ^rt-variable-unbound-error
.
This is clearly unlike the closure over (local) variables in scope that a function provides. The point being that you cannot capture lexical escapes, they exist only for the lifetime of the block that creates them.
Here, the block creating the escape, escape-block foo ...
, has
completed it’s run when the function has been created and returned.
It therefore will have run POP-ESCAPER
in the VM and the
label
is no longer available for the escape-from
clause
to return to.
*
Technically, you could have had another escape-block foo
further
out which is still in scope in which case the escape-from foo
clause can execute successfully although the code will now return from
an escape block much further out that the user possibly anticipated.
Here, now, we get into some debate about whether having such absolute
label
s are what we really mean. Would it be better if,
instead of the absolute label foo
we really meant a foo
-ish
label, constructed by gensym
in a template:
real-label := gensym $label
#T{
((escape-block $real-label (function (x) (escape-from $real-label x))) 37)
}
and then we can be sure that our foo
is different to the other
guy’s foo
(splitter!) should we both have used foo
as a
label.
Another example might be nested loops where next
could be picked
up much further out because the escaper is blindly unwinding the
stack.
Observations¶
It looks like we have a set of competing issues here.
For label
s to be unevaluated they need to be at least used
by a template (although there become quoting issues) or, more likely,
a special form.
For label
s to be lexically checked they need to be seen by
the evaluator which means a special form.
Special forms don’t “play nicely” with anything else.
Special forms cannot interact with private variables and functions defined in Idio-land. Similar functions defined in C cannot be conditionally exported into Idio.
Perhaps the lexical escapes, return
, break
and continue
are as simple as they are and users shouldn’t use them when they are
using unwind-protect
?
That doesn’t seem to sit very well, though. Users will make unintended mistakes and tracking down lost resources is very hard.
Continuations¶
Can continuations help us out, here? Presumably, as they are touted as being able to create all control flow constructs.
In the very first instance, we can create our escapees rather easily.
We can (implicitly) wrap the body of functions with a suitably
argument-named continuation capture. Here, rather than call the
continuation passed in k
, we’ll call it something more meaningful,
like return
:
define (foo x) {
(call/cc (function (return) {
...
}))
}
And now, in the body of foo
we can invoke the continuation
return
like a unary function, return value
.
As it stands, continuations expect to be invoked with a single value
so maybe we can revisit that decision to allow us to pass no argument
and have the engine supply (void)
instead.
return
is also lexical in scope – it can only be used in the body
of the function declaring it, like any parameter. That doesn’t take
it into the realms of a lexical escape as the evaluator is unaware of
return
being special in any way, it’s just another parameter.
eq?
,
eqv?
or equal?
?
return
is not in the realm of Common Lisp’s catch
and
throw
’s label
where the label
is evaluated and
that value is associated with a continuation (of some kind).
As a full blown continuation, though, return
can be captured –
saved in a variable in wider scope – and then you can appear to
return from foo
multiple times. That might be what we want a
generator to do when it yields multiple results but that isn’t
really what we want return
to be doing.
We’ll get into the same problems with corresponding break
and
continue
continuations. Resuming a break multiple times might
make some possible sense, resuming a continue might lead to disaster
depending on how the loop’s test is implemented. Were the loop to
stop on a specific value then a resumption of a continue will have had
the loop’s variables’ steps run and the test won’t fire in the
same way.
Last built at 2024-12-22T07:10:59Z+0000 from 463152b (dev)