Pairs and Lists

Pairs are the fundamental compound type of any Lisp language.

In Scheme a . U+002E (FULL STOP) is used to separate the elements of a pair creating a visual distinction between a dotted-pair and a list:

  • (1 . 2) – a dotted-pair

  • (1 2) – a list which is technically two dotted-pairs, (1 . (2 . nil))

I wanted to use . for a value-index operator which meant I needed an alternative character for dotted-pairs. I chose & U+0026 (AMPERSAND), possibly unwisely, but its sentiment is about right: a b & c is a, b and (the rest in) c – was the use-case I was worrying about albeit that I intend to rework that as a b c* at some point.

Of course, using an ampersand in a dotted-pair is completely at odds with our shell-dogma that & means put the pipeline in the background.

Given that a dotted-pair (effectively) mandates whitespace around the . and that the value-index operator should not, then I might reverse that decision.

Should &, the instruction to background, always have to appear after the pipeline? That’s an interesting question. I can understand that as a combined end-of-statement and backgrounding sigil it works very well.

At the time of writing there isn’t a statement separator at all in Idio – barring newline – so maybe this is another shell-ism that can be reconsidered.

Much like invoking the shell builtin, time, which is followed by the command you intend to get timing metrics for, perhaps the bg builtin should be indicating that you want the following pipeline to be backgrounded:

bg zcat file | tar xf -

I’m OK with that. It fits in better with a general sense of command args where it so happens that args is itself command' args' and with the general Schemely sense of command args within a form.

Implementation

The implementation is as simple as you would expect and there are the same set of funky IDIO_PAIR_HTT() macros defined as appear to be needed in the C code base, much like the phtt functions in Idio.

gc.h
typedef struct idio_pair_s {
    struct idio_s *grey;
    struct idio_s *h;
    struct idio_s *t;
} idio_pair_t;

#define IDIO_PAIR_GREY(P)    ((P)->u.pair.grey)
#define IDIO_PAIR_H(P)       ((P)->u.pair.h)
#define IDIO_PAIR_T(P)       ((P)->u.pair.t)

#define IDIO_PAIR_HH(P)      IDIO_PAIR_H (IDIO_PAIR_H (P))
#define IDIO_PAIR_HT(P)      IDIO_PAIR_H (IDIO_PAIR_T (P))
#define IDIO_PAIR_TH(P)      IDIO_PAIR_T (IDIO_PAIR_H (P))
#define IDIO_PAIR_TT(P)      IDIO_PAIR_T (IDIO_PAIR_T (P))

...

In addition we have some C macros to help with the construction of lists:

pair.h
#define IDIO_LIST1(e1)               idio_pair (e1, idio_S_nil)
#define IDIO_LIST2(e1,e2)            idio_pair (e1, idio_pair (e2, idio_S_nil))
...
#define IDIO_LIST5(e1,e2,e3,e4,e5)   idio_pair (e1, ...)

Defining in C

idio_pair() is the creator of pairs. Easy enough.

There’s no list creator per se but a list is a chain of pairs and there’s a series of C helper macros to do the honours: IDIO_LIST1(e1), IDIO_LIST2(e1, e2), etc..

Reading

Normally pairs and lists are constructed with pair and list with pair taking precisely two arguments and list taking zero or more. A zero-element list is equivalent to #n.

Alternatively, both can be quoted expressions:

  • dotted pairs are '(value1 & value2)

  • lists are '(value ...)

although, obviously, being quoted, the values cannot be variables – or, rather, any symbols will remain so (and not evaluated) which is the very point of quoting.

Writing

No particular surprise with dotted pairs being printed as (value1 & value2) and lists as (value ...).

Pairs and lists occupy a slightly strange place because of homoiconicity in that they are they building blocks of the language itself as well as being data structures within it.

This is similar to the way that Symbols can be values to be passed around as well as variable names. Which role they play is context dependent.

Operations

Several list-oriented functions have the benefit of not being named after the value-type they manipulate. The benefits of being the primary data structure, I guess.

function pair h t

construct a pair from h and t

function pair? value

is value a pair

function ph p

return the head of pair p

function pt p

return the tail of pair p

function set-ph! p value

set the head of pair p to value

function set-pt! p value

set the tail of pair p to value

function reverse list

return the list list reversed

This is very commonly used when a list is constructed in reverse order when walking down an argument list and the corrected reversed list is required.

There is a special case for when processing improper lists, eg. (1 2 & 3), to accommodate which the underlying C function for list reversal, idio_list_reverse(), indirects through idio_improper_list_reverse().

You cannot reverse an improper list in Idio.

function length list

return the number of pairs in list list

This does not accommodate improper lists.

function list [args]

construct a list from args

function append a b

construct a list from list a and list b such that the resultant list is a copy of list a except the final pair’s tail is a reference to list b.

Debatably, this should be a copy of list b too.

function list->array list

return an array constructed from the elements of list list

See also array->list.

function list->string list

return a string constructed from the Unicode code points in list

See also string->list.

function nth list n [default]

return the nth element of list list starting at one

If there is no such element then default is returned if supplied otherwise #n.

Last built at 2024-09-07T06:11:19Z+0000 from 463152b (dev)