C Types

Overview

In principle we don’t want to deal with C data types as they don’t come across as well-defined. For example, an int could be 16, 32 or 64 bits wide.

On top of that are the numeric promotion rules meaning I can pass a short where a long is expected and the right thing will happen. I can compare almost anything to 0.

Someone obviously knows what’s going on there but it isn’t me.

However, internally, Idio needs to poke about with the system, even just for reading and writing but much more comprehensively for job control, and it needs to transport those results around.

Initially, I wrote all of those system interfaces by hand although I became increasingly annoyed (with myself) for not handling types correctly. Particularly for structures.

Eventually I started the C API work which required an overhaul of the handling of C types.

Of course, once that’s done it is available for use elsewhere.

C Types

There are fourteen base types, int, etc., plus pointers and void.

void is a little unusual in that it, correctly, is the absence of a type. I’ve stumbled across one void in a structure (in a FILE) although I avoided that becoming an issue by stopping using FILE as the underlying type for handling files when I added pipe-handles.

More commonly, you’ll see references to pointers to void (which, arguably, still doesn’t make any sense), in practice meaning a pointer to an unknown type.

C Base Types

The C base types have a fluid set of possible names which I’ve normalised to:

C types

C                     Idio
char                  char
signed char           schar
unsigned char         uchar
short                 short
unsigned short        ushort
int                   int
unsigned int          uint
long                  long
unsigned long         ulong
long long             longlong
unsigned long long    ulonglong
float                 float
double                double
long double           longdouble

There’s a separate Idio type for each C type which we store in a union (of C base types) in the struct idio_s/IDIO value type:

#define IDIO_TYPE_C_CHAR             29
#define IDIO_TYPE_C_SCHAR            30
#define IDIO_TYPE_C_UCHAR            31
#define IDIO_TYPE_C_SHORT            32
#define IDIO_TYPE_C_USHORT           33
#define IDIO_TYPE_C_INT              34
#define IDIO_TYPE_C_UINT             35
#define IDIO_TYPE_C_LONG             36
#define IDIO_TYPE_C_ULONG            37
#define IDIO_TYPE_C_LONGLONG         38
#define IDIO_TYPE_C_ULONGLONG        39
#define IDIO_TYPE_C_FLOAT            40
#define IDIO_TYPE_C_DOUBLE           41
#define IDIO_TYPE_C_LONGDOUBLE       42
#define IDIO_TYPE_C_POINTER          43

typedef struct idio_C_type_s {
    union {
        char                 C_char;
        signed char          C_schar;
        unsigned char        C_uchar;
        ...
        float                C_float;
        double               C_double;
        long double          C_longdouble;
        idio_C_pointer_t    *C_pointer;
    } u;
} idio_C_type_t;

struct idio_s {
    ...
    union idio_s_u {
        ...
        idio_C_type_t          C_type;
        ...
    } u;
};

So, accessing C types involves an extra indirection (more for a pointer) but otherwise all good. It all could have been dragged up a level but no-one’s looking closely.


Based on these we can define some basic:

  • constructors, eg. idio_C_int

  • accessors, eg. IDIO_C_TYPE_int

  • predicates, eg. C/int?

    The predicates exist in a C namespace which is not importable. You simply have to use the direct name.
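As a rough illustration of how a constructor, an accessor and a predicate hang off the tagged union, here is a cut-down sketch. The demo_ names, the two-member union and returning values by copy are all assumptions made for the example; the real Idio definitions carry many more members and allocate through the garbage collector.

```c
#include <assert.h>

/* Hypothetical stand-ins for two of the IDIO_TYPE_C_* tags. */
enum demo_type { DEMO_TYPE_C_INT = 34, DEMO_TYPE_C_DOUBLE = 41 };

typedef struct demo_C_type_s {
    union {
        int    C_int;
        double C_double;
    } u;
} demo_C_type_t;

typedef struct demo_s {
    enum demo_type type;
    union {
        demo_C_type_t C_type;
    } u;
} demo_t;

/* accessor: note the extra .u.C_type.u hop to reach the C value */
#define DEMO_C_TYPE_int(v)  ((v).u.C_type.u.C_int)

/* constructor */
static demo_t demo_C_int(int i)
{
    demo_t v;
    v.type = DEMO_TYPE_C_INT;
    DEMO_C_TYPE_int(v) = i;
    return v;
}

/* predicate */
static int demo_isa_C_int(demo_t v)
{
    return v.type == DEMO_TYPE_C_INT;
}

/* checked accessor: refuse to read the wrong union member */
static int demo_C_int_value(demo_t v)
{
    assert(demo_isa_C_int(v));
    return DEMO_C_TYPE_int(v);
}
```

The tag check in the last function is the point of the exercise: the union itself has no idea which member is live.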

Numeric Operations

With the best will in the world we can’t escape needing to provide some numeric operations for C types.

All of these pose some problems for us. By and large we can do stuff on things of the same C type. Stepping away from identical types takes us into a combinatorial explosion of possibilities that the C compiler hides from us.

So I haven’t bothered.

C Equality

Integral equality is straightforward but floating point equality, it turns out, is quite hard. There is a trick we can use for float and double types: because they are fixed-format 32-bit and 64-bit values, we can reinterpret their bits as an integer of the same width and compare the integers. This is referred to as a Units in the Last Place (ULP) comparison and you can read much more about it.
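A minimal sketch of a ULP comparison for float follows. It assumes a 32-bit IEEE 754 float, and the function names and the max_ulps tolerance parameter are hypothetical, not Idio’s own. It is one common way of doing this rather than necessarily the exact code Idio uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof (float) == sizeof (int32_t), "expects a 32-bit float");

/* Map a float's bit pattern onto an integer that sorts like the float. */
static int32_t demo_float_ordered(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);   /* reinterpret the bits without aliasing issues */
    if (i < 0) {
        i = INT32_MIN - i;      /* sign-magnitude order -> two's complement order */
    }
    return i;
}

/* Non-zero if a and b are within max_ulps representable floats of each other. */
static int demo_float_ulp_equal(float a, float b, int32_t max_ulps)
{
    assert(max_ulps >= 0);
    if (a != a || b != b) {
        return 0;               /* a NaN is never equal to anything */
    }
    /* widen before subtracting so the difference cannot overflow */
    int64_t d = (int64_t) demo_float_ordered(a) - (int64_t) demo_float_ordered(b);
    if (d < 0) {
        d = -d;
    }
    return d <= max_ulps;
}
```

With this mapping 0.0 and -0.0 compare equal at a tolerance of zero. One caveat: the largest finite float and infinity sit one ULP apart, so a caller may want to test for infinities separately.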

On the other hand, long doubles are not clearly defined. They might be 80-bit, 96-bit or 128-bit implementations or even just an alias for double.
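To see which flavour of long double a given platform has, the macros in <float.h> are enough. The sketch below is only a probe with hypothetical function names; it is not part of Idio.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Returns 1 if long double has the same precision and exponent range as
 * double on this platform, ie. it is effectively an alias for double. */
static int demo_long_double_is_double(void)
{
    return LDBL_MANT_DIG == DBL_MANT_DIG && LDBL_MAX_EXP == DBL_MAX_EXP;
}

/* Typical LDBL_MANT_DIG values: 53 (alias for double), 64 (x87 extended),
 * 106 (double-double) and 113 (IEEE 754 binary128). */
static void demo_describe_long_double(FILE *fh)
{
    assert(fh != NULL);
    fprintf(fh, "sizeof (long double) = %zu bytes\n", sizeof (long double));
    fprintf(fh, "LDBL_MANT_DIG        = %d bits\n", LDBL_MANT_DIG);
    fprintf(fh, "same as double?      = %s\n",
            demo_long_double_is_double() ? "yes" : "no");
}
```

Note that sizeof reports storage including padding, so an 80-bit value commonly shows up as 12 or 16 bytes.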

I gave up and throw an error if you’re trying to compare long doubles from Idio.

Arithmetic

I’ve added +, -, * and / variants which get invoked if the first argument in a binary arithmetic operation is a C type and then throw a condition if the other argument is not the same C type.
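The same-type rule can be sketched as below. The demo_ names and the two-member union are assumptions for illustration, and the -1 return stands in for the condition Idio would raise. Overflow is deliberately left unhandled to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

enum demo_type { DEMO_C_INT, DEMO_C_LONG };

typedef struct {
    enum demo_type type;
    union {
        int  C_int;
        long C_long;
    } u;
} demo_C_t;

/* Add two C values of the same C type.
 * Returns 0 on success, -1 on a type mismatch. No overflow checks. */
static int demo_C_add(const demo_C_t *a, const demo_C_t *b, demo_C_t *r)
{
    assert(a != NULL && b != NULL && r != NULL);

    if (a->type != b->type) {
        return -1;              /* no implicit promotion: refuse mixed types */
    }

    r->type = a->type;
    switch (a->type) {
    case DEMO_C_INT:
        r->u.C_int = a->u.C_int + b->u.C_int;
        break;
    case DEMO_C_LONG:
        r->u.C_long = a->u.C_long + b->u.C_long;
        break;
    }
    return 0;
}
```

Refusing mixed types means each operator needs one case per C type rather than one per pair of types, which is the combinatorial explosion being avoided.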

Similarly, there are the usual comparators, <, <=, ==, >= and > for the C domain.

These, clearly, differ from the nominal Idio numeric comparison functions (lt, le, eq, ge and gt) which exist to avoid clashing with the shell-like < and > IO redirection operators. I’ve stuck with the “minimal changes from C” principle for the C comparators.

Names for things, eh? Who’d have thought it would be hard?

Conversions

In C/integer->, as we have now been given the precise C type of the result we can also perform some range tests on the supplied integer.
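The kind of range test meant here might look like the following. This is a sketch with hypothetical helper names, covering just two target types and taking the incoming integer as an intmax_t; what Idio actually does with its fixnums and bignums is not shown.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Store v as a short only if it fits. Returns 0 on success, -1 if out of range. */
static int demo_integer_to_short(intmax_t v, short *out)
{
    assert(out != NULL);
    if (v < SHRT_MIN || v > SHRT_MAX) {
        return -1;
    }
    *out = (short) v;
    return 0;
}

/* Unsigned targets additionally reject negative values. */
static int demo_integer_to_uchar(intmax_t v, unsigned char *out)
{
    assert(out != NULL);
    if (v < 0 || v > UCHAR_MAX) {
        return -1;
    }
    *out = (unsigned char) v;
    return 0;
}
```

Because the caller names the target type, the limits to test against come straight from <limits.h> for that type.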

C/->integer does what you’d expect except that it won’t work for floating point C types, for which you want C/->number, which is ultimately a superset.

You cannot convert a C long double into an Idio floating point type (ie. a bignum) this way because of the same uncertainty about its encoding.

The problem is less the conversion itself, as all of them are implemented “lazily” by having sprintf(3) print the value out and the reader read it back in again, but rather capturing the special forms (NaN etc.) without having to pull in the entirety of the maths library, libm.

How “lazy” you consider that conversion is moot. Any radix conversion routine will have to loop, dividing by the new radix and storing the remainder, followed by some reworking of the exponent. The people writing sprintf(3) have had a considerable head start in making their algorithms efficient. Feel free to have a gander at print_fp.c if you are interested.
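The print-and-read-back idea can be sketched in plain C. Here strtod(3) stands in for the Idio reader, the "%.17g" format is my choice for a lossless round trip of an IEEE 754 double, and the demo_ names are hypothetical. The special forms are detected with self-comparison tricks so nothing from libm is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* NaN is the only value that is not equal to itself. */
static int demo_is_nan(double d)
{
    return d != d;
}

/* inf - inf is NaN whereas finite - finite is 0. */
static int demo_is_inf(double d)
{
    return !demo_is_nan(d) && demo_is_nan(d - d);
}

/* Print d into buf. Returns 0 on success, -1 for the special forms or
 * if buf is too small. */
static int demo_double_to_text(double d, char *buf, size_t len)
{
    assert(buf != NULL);
    if (demo_is_nan(d) || demo_is_inf(d)) {
        return -1;
    }
    int n = snprintf(buf, len, "%.17g", d);
    return (n > 0 && (size_t) n < len) ? 0 : -1;
}

/* Stand-in for the reader: parse the text back into a number. */
static double demo_text_to_double(const char *s)
{
    assert(s != NULL);
    return strtod(s, NULL);
}
```

These comparison tricks assume the compiler is not running in a fast-math mode that optimises NaN handling away.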

Printing

There’s a small amount of flexibility for printing a C type:

C Type Printing

Idio          format specifiers
char          c
schar         d
uchar         X o u* x b
short         d
ushort        X o u* x b
int           d
uint          X o u* x b
long          d
ulong         X o u* x b
longlong      d
ulonglong     X o u* x b
float         e f g*
double        e f g*
longdouble    e f g*

* denotes the default where more than one format is possible

b is a binary output format

printf(3)-style conversion precisions are supported, eg. "%.1f". This only really affects floating point numbers, for which the precision sets the number of significant digits or decimal places. For other types, any conversion precision is applied in format to the string returned by simply printing the value.
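The b format has no counterpart in printf(3) prior to C23, so a binary printer has to be written by hand. A minimal sketch for unsigned int, with a hypothetical function name and no leading zeros, might be:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Write v in binary, most significant digit first, into buf.
 * Returns buf, or NULL if buf cannot hold the digits plus the NUL. */
static char *demo_uint_to_binary(unsigned int v, char *buf, size_t len)
{
    assert(buf != NULL);

    char tmp[sizeof v * CHAR_BIT];      /* one slot per possible bit */
    size_t n = 0;

    /* collect digits least significant first; do-while so 0 prints as "0" */
    do {
        tmp[n++] = (char) ('0' + (v & 1u));
        v >>= 1;
    } while (v != 0);

    if (n + 1 > len) {
        return NULL;
    }

    /* reverse into the caller's buffer */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1 - i];
    }
    buf[n] = '\0';
    return buf;
}
```

The same loop works for the wider unsigned types by changing the argument type.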

C Pointers

C pointers are a little more interesting. In the first instance we need to store a “free me” flag as a few C pointers we pass around are not ours to free(3).
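The “free me” flag amounts to something like the sketch below. The struct layout and the demo_ names are assumptions for illustration, not the real idio_C_pointer_t, and the finalizer here is called by hand rather than by a garbage collector.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct demo_C_pointer_s {
    void *p;
    int   freep;    /* non-zero: we own p and must free(3) it */
} demo_C_pointer_t;

static demo_C_pointer_t demo_C_pointer(void *p, int freep)
{
    demo_C_pointer_t cp = { p, freep };
    return cp;
}

/* Finalizer: release the memory only if it is ours to release.
 * Returns 1 if free(3) was called, 0 otherwise. */
static int demo_C_pointer_finalize(demo_C_pointer_t *cp)
{
    assert(cp != NULL);

    int freed = 0;
    if (cp->freep && cp->p != NULL) {
        free(cp->p);
        freed = 1;
    }
    cp->p = NULL;   /* either way, stop referring to the memory */
    return freed;
}
```

A pointer returned by something like getenv(3) would be wrapped with the flag clear, whereas memory we malloc(3)’d for a struct would have it set.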

In writing the C API I needed a mechanism to associate some arbitrary blob of memory allocated for some struct with a primitive that knew how to access the members of the struct.

This C Struct Identification replaces a limited mechanism to print a C struct and provides considerably more functionality.

An idio_CSI_module_struct_something is a simple list that holds the struct’s name, "struct something", and the primitive for accessing the members of that struct, probably module/struct-something-ref.

The struct modifier, probably module/struct-something-set!, can be invoked through the Setters mechanisms and we can associate a printer for the struct through the add-as-string mechanism.

Last built at 2024-12-21T07:11:04Z+0000 from 463152b (dev)