C Types¶
Overview¶
In principle we don’t want to deal with C data types as they don’t come across as well-defined: for example, an int could be 16, 32 or 64 bits wide.
On top of that are the numeric promotion rules, meaning I can pass a short where a long is expected and the right thing will happen. I can compare almost anything to 0.
Someone obviously knows what’s going on there but it isn’t me.
However, internally, Idio needs to poke about with the system, even just for reading and writing but much more comprehensively for job control, and it needs to transport those results around.
Initially, I wrote all of those system interfaces by hand although I became increasingly annoyed (with myself) for not handling types correctly, particularly for structures.
Eventually I started the C API work which required an overhaul of the handling of C types.
Of course, once that’s done it is available for use elsewhere.
C Types¶
There are fourteen base types, int, etc., plus pointers and void.
void is a little unusual in that it, correctly, is the absence of a type. I’ve stumbled across one void in a structure (in a FILE) although I avoided that becoming an issue by no longer using FILE as the underlying type for handling files when I added pipe-handles.
More commonly, you’ll see references to pointers to void (which, arguably, still doesn’t make any sense), in practice meaning a pointer to an unknown type.
C Base Types¶
The C base types have a fluid set of possible names which I’ve normalised to:
At this point I finally realised after all these years that a C char is neither signed nor unsigned but is the:
Smallest addressable unit of the machine that can contain basic character set.
—Wikipedia
and is therefore, for our purposes, a character rather than a number (although the C standard does still class char as an integer type).
Who knew?
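C | Idio |
---|---|
char | char |
signed char | schar |
unsigned char | uchar |
short | short |
unsigned short | ushort |
int | int |
unsigned int | uint |
long | long |
unsigned long | ulong |
long long | longlong |
unsigned long long | ulonglong |
float | float |
double | double |
long double | longdouble |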
There’s a separate Idio type for each C type which we store in a union (of C base types) in the struct idio_s/IDIO value type:
#define IDIO_TYPE_C_CHAR        29
#define IDIO_TYPE_C_SCHAR       30
#define IDIO_TYPE_C_UCHAR       31
#define IDIO_TYPE_C_SHORT       32
#define IDIO_TYPE_C_USHORT      33
#define IDIO_TYPE_C_INT         34
#define IDIO_TYPE_C_UINT        35
#define IDIO_TYPE_C_LONG        36
#define IDIO_TYPE_C_ULONG       37
#define IDIO_TYPE_C_LONGLONG    38
#define IDIO_TYPE_C_ULONGLONG   39
#define IDIO_TYPE_C_FLOAT       40
#define IDIO_TYPE_C_DOUBLE      41
#define IDIO_TYPE_C_LONGDOUBLE  42
#define IDIO_TYPE_C_POINTER     43

typedef struct idio_C_type_s {
    union {
        char               C_char;
        signed char        C_schar;
        unsigned char      C_uchar;
        ...
        float              C_float;
        double             C_double;
        long double        C_longdouble;
        idio_C_pointer_t  *C_pointer;
    } u;
} idio_C_type_t;

struct idio_s {
    ...
    union idio_s_u {
        ...
        idio_C_type_t C_type;
        ...
    } u;
};
So, accessing C types involves an extra indirection (more for a pointer) but otherwise all good. It all could have been dragged up a level but no-one’s looking closely.
Based on these we can define some basic constructors (eg. idio_C_int), accessors (eg. IDIO_C_TYPE_int) and predicates (eg. C/int?).
The predicates exist in a C namespace which is not importable. You simply have to use the direct name.
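To make the constructor, accessor and predicate trio concrete, here is a minimal sketch on a cut-down tagged value. Everything prefixed sketch_ or SKETCH_ is hypothetical and exists only for illustration; it is not Idio's actual code, which (among other things) allocates through its garbage collector rather than calling malloc(3) directly.

```c
#include <stdlib.h>

/* Hypothetical tags standing in for the IDIO_TYPE_C_* constants. */
enum sketch_type { SKETCH_C_INT = 34, SKETCH_C_DOUBLE = 41 };

/* A cut-down value: a type tag plus a union of C base types. */
typedef struct sketch_value_s {
    enum sketch_type type;
    union {
        int    C_int;
        double C_double;
    } u;
} sketch_value_t;

/* Accessor: a macro over the union member, usable for reading or assigning. */
#define SKETCH_C_TYPE_int(v) ((v)->u.C_int)

/* Constructor: allocate a value, tag it and store the C int. */
sketch_value_t *sketch_C_int(int i)
{
    sketch_value_t *v = malloc(sizeof *v);
    if (v == NULL) {
        return NULL;
    }
    v->type = SKETCH_C_INT;
    SKETCH_C_TYPE_int(v) = i;
    return v;
}

/* Predicate: true only for values tagged as a C int. */
int sketch_isa_C_int(const sketch_value_t *v)
{
    return v != NULL && v->type == SKETCH_C_INT;
}
```

The accessor is a macro rather than a function so that the same name works on either side of an assignment, which is presumably why the real accessor names are upper case.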
Numeric Operations¶
With the best will in the world we can’t escape needing to provide some numeric operations for C types.
All of these pose some problems for us. By and large we can do stuff on things of the same C type. Stepping away from identical types takes us into a combinatorial explosion of possibilities that the C compiler hides from us.
So I haven’t bothered.
C Equality¶
Integral equality is straightforward but floating point equality, it turns out, is quite hard. There is a trick we can use for float and double types: because they are fixed-format 32-bit and 64-bit values, we can reinterpret each one as an integer of the same width and compare those integers. This is referred to as a Units in the Last Place (ULP) comparison and you can read much more about it.
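The trick can be sketched for float as follows. This is an illustration of the general technique, not Idio's implementation: the function names and the max_ulps parameter are made up, and it assumes float is an IEEE 754 binary32 value.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is a 32-bit IEEE 754 value. */
_Static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Reinterpret the bits of a float as a 32-bit integer (memcpy avoids aliasing problems). */
static int32_t sketch_float_bits(float f)
{
    int32_t i;
    memcpy(&i, &f, sizeof i);
    return i;
}

/* Return 1 if a and b are within max_ulps representable floats of each other. */
int sketch_float_equal_ulps(float a, float b, int32_t max_ulps)
{
    /* NaN compares unequal to everything, including itself. */
    if (a != a || b != b) {
        return 0;
    }

    int32_t ia = sketch_float_bits(a);
    int32_t ib = sketch_float_bits(b);

    /* Differing signs: only +0 and -0 are considered equal. */
    if ((ia < 0) != (ib < 0)) {
        return a == b;
    }

    /* Same sign: adjacent floats have adjacent integer representations. */
    int32_t diff = (ia > ib) ? ia - ib : ib - ia;
    return diff <= max_ulps;
}
```

The same shape works for double with int64_t. Note that this naive version treats the largest finite value and infinity as one ULP apart, which may or may not be what you want.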
On the other hand, long doubles are not clearly defined. They might be 80-bit, 96-bit or 128-bit implementations or even just an alias for double.
I gave up and throw an error if you’re trying to compare long doubles from Idio.
Arithmetic¶
I’ve added +, -, * and / variants which get invoked if the first argument in a binary arithmetic operation is a C type and then throw a condition if the other argument is not the same C type.
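The "same type or nothing" rule might look like the sketch below. All names are hypothetical, only three types are shown, and a -1 return stands in for raising an Idio condition. Overflow is not checked.

```c
/* Hypothetical tags for a few C base types. */
enum arith_type { ARITH_C_INT, ARITH_C_LONG, ARITH_C_DOUBLE };

/* A cut-down tagged value. */
typedef struct arith_value_s {
    enum arith_type type;
    union {
        int    C_int;
        long   C_long;
        double C_double;
    } u;
} arith_value_t;

/*
 * Add two values of the same C type into *result.
 * Returns 0 on success or -1 if the types differ, which is where
 * the real thing would raise a condition.
 */
int sketch_C_add(const arith_value_t *a, const arith_value_t *b,
                 arith_value_t *result)
{
    /* No promotion rules: mismatched types are simply rejected. */
    if (a->type != b->type) {
        return -1;
    }

    result->type = a->type;
    switch (a->type) {
    case ARITH_C_INT:
        result->u.C_int = a->u.C_int + b->u.C_int;
        break;
    case ARITH_C_LONG:
        result->u.C_long = a->u.C_long + b->u.C_long;
        break;
    case ARITH_C_DOUBLE:
        result->u.C_double = a->u.C_double + b->u.C_double;
        break;
    }
    return 0;
}
```

One switch per operator over fourteen types is tedious but linear; allowing mixed types would need a case for every pair, which is the combinatorial explosion being avoided.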
Similarly, there are the usual comparators, <, <=, ==, >= and > for the C domain.
These, clearly, differ from the nominal Idio numeric comparison functions (lt, le, eq, ge and gt) which exist to avoid clashing with the shell-like < and > IO redirection operators. I’ve stuck with the “minimal changes from C” principle for the C comparators.
Names for things, eh? Who’d have thought it would be hard?
Conversions¶
Semantically, C/integer-> in the C namespace converts an (Idio) integer into a C one.
It started life just creating C ints, the limitation of my interest at the time.
In C/integer->, as we have now been given the precise C type of the result we can also perform some range tests on the supplied integer.
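A range test of this kind can lean on the limits in limits.h. The sketch below is an assumption about the general shape rather than Idio's code: the enum, the function name and the use of intmax_t for the incoming integer are all invented for illustration.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical selector for the requested C result type. */
enum sketch_target {
    TARGET_SCHAR,
    TARGET_SHORT,
    TARGET_INT,
    TARGET_UINT
};

/* Return 1 if n is representable in the requested C type, 0 otherwise. */
int sketch_fits(intmax_t n, enum sketch_target target)
{
    switch (target) {
    case TARGET_SCHAR:
        return n >= SCHAR_MIN && n <= SCHAR_MAX;
    case TARGET_SHORT:
        return n >= SHRT_MIN && n <= SHRT_MAX;
    case TARGET_INT:
        return n >= INT_MIN && n <= INT_MAX;
    case TARGET_UINT:
        /* Check the sign first so the unsigned comparison is safe. */
        return n >= 0 && (uintmax_t) n <= UINT_MAX;
    }
    return 0;
}
```

A caller would test sketch_fits() before narrowing the value and report an error otherwise, instead of letting C silently truncate.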
C/->integer does what you’d expect except that it won’t work for floating point C types, for which you want C/->number, which is ultimately a superset.
You cannot convert a C long double type into an Idio floating point type (ie. a bignum) this way because of similar uncertainty over its encoding.
The problem is less the conversion, as all of them are implemented “lazily” by having sprintf(3) print them out and the reader read them back in again, but rather capturing the special forms (NaN etc.) without having to include the entirety of the maths library, libm.
How “lazy” you consider that conversion is moot. Any radix conversion routine will have to loop performing division by the new radix and storing the remainder, followed by some reworking of the exponent. The people writing sprintf(3) have had a considerable head start in making their algorithms efficient. Feel free to have a gander at print_fp.c if you are interested.
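The print-then-read round trip can be sketched in plain C. Here strtod(3) stands in for the Idio reader, which is an assumption made purely for illustration, and the function names are invented. Seventeen significant digits are enough to round-trip an IEEE 754 double.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Print a double with 17 significant digits.
 * Returns 0 on success, -1 if the buffer is too small or printing fails.
 */
int sketch_double_to_text(double d, char *buf, size_t size)
{
    int n = snprintf(buf, size, "%.17g", d);
    return (n >= 0 && (size_t) n < size) ? 0 : -1;
}

/* Read the text back in; strtod(3) plays the part of the reader here. */
double sketch_text_to_double(const char *text)
{
    return strtod(text, NULL);
}
```

The special forms are where this gets awkward: snprintf(3) will print something like "nan" or "inf" and whatever reads the text back has to recognise those spellings itself.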
Printing¶
There’s a small amount of flexibility for printing a C type:
Idio | format specifiers |
---|---|
char | c |
schar | d |
uchar | X o u* x b |
short | d |
ushort | X o u* x b |
int | d |
uint | X o u* x b |
long | d |
ulong | X o u* x b |
longlong | d |
ulonglong | X o u* x b |
float | e f g* |
double | e f g* |
longdouble | e f g* |
* denotes the default where more than one format is possible
b is a binary output format
printf(3)-style conversion precisions are supported, eg. "%.1f". This only really affects floating point numbers, for which the precision sets the number of significant digits or decimal places. For other types, the value is simply printed to a string and format then applies any conversion precision to that string.
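The b format is the odd one out in the table above because printf(3) had no binary conversion before C23, so it needs a helper of some kind. A minimal sketch, with an invented function name and no claim to match Idio's own routine:

```c
#include <limits.h>
#include <stddef.h>

/*
 * Write the binary digits of value into buf, most significant first,
 * with no leading zeros. Returns 0 on success, -1 if buf is too small.
 */
int sketch_format_binary(unsigned long value, char *buf, size_t size)
{
    char tmp[sizeof value * CHAR_BIT];   /* one digit per bit at most */
    size_t n = 0;

    /* Collect digits least significant first; do/while so 0 prints as "0". */
    do {
        tmp[n++] = (char) ('0' + (value & 1UL));
        value >>= 1;
    } while (value != 0);

    /* Need room for the digits plus the terminating NUL. */
    if (n + 1 > size) {
        return -1;
    }

    /* Reverse into the caller's buffer. */
    for (size_t i = 0; i < n; i++) {
        buf[i] = tmp[n - 1 - i];
    }
    buf[n] = '\0';
    return 0;
}
```

This is the same divide-and-store-the-remainder loop that any radix conversion needs, just with shifts and masks because the radix is two.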
C Pointers¶
C pointers are a little more interesting. In the first instance we need to store a “free me” flag as a few C pointers we pass around are not ours to free(3).
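A "free me" flag amounts to recording ownership next to the pointer. The sketch below uses invented names (sketch_C_pointer_t, freep) and is only meant to show the idea, not the layout of Idio's idio_C_pointer_t.

```c
#include <stdlib.h>

/* A wrapped C pointer plus a note of whether we own the memory. */
typedef struct sketch_C_pointer_s {
    void *p;
    int   freep;    /* non-zero if p is ours to free(3) */
} sketch_C_pointer_t;

/* Wrap a pointer, recording whether it should be freed on release. */
sketch_C_pointer_t sketch_C_pointer(void *p, int freep)
{
    sketch_C_pointer_t cp;
    cp.p = p;
    cp.freep = freep;
    return cp;
}

/*
 * Release the wrapper, as a garbage collector might when the value dies.
 * Returns 1 if the underlying memory was freed, 0 if it was left alone.
 */
int sketch_C_pointer_release(sketch_C_pointer_t *cp)
{
    int freed = 0;

    if (cp->freep && cp->p != NULL) {
        free(cp->p);
        freed = 1;
    }
    cp->p = NULL;   /* never leave a dangling pointer in the wrapper */
    return freed;
}
```

Pointers into static storage or memory owned by a library would be wrapped with the flag clear, and memory we allocated ourselves with it set.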
In writing the C API I needed a mechanism to associate some arbitrary blob of memory allocated for some struct with a primitive that knew how to access the members of the struct.
This C Struct Identification replaces a limited mechanism to print a C struct and provides considerably more functionality.
Actually, we only need a unique totem as everything else can be in lookup tables. However, whilst a pair is (always) a unique totem, we might as well stick something in it and save two of those lookup tables!
An idio_CSI_module_struct_something is a simple list that has the struct’s name, "struct something", and the primitive for accessing the members of that struct, probably module/struct-something-ref.
The struct modifier, probably module/struct-something-set!, can be invoked through the Setters mechanisms and we can associate a printer for the struct through the add-as-string mechanism.
Last built at 2024-12-21T07:11:04Z+0000 from 463152b (dev)