Unicode Code Points
The “characters” in strings (and as standalone values) are Unicode
code points, normally written #U+... with as many hexadecimal digits
as are needed to represent the code point. Leading zeroes are not
required (but may be necessary, see below).
You can input a specific character with #\X where X is the character
itself, written as a UTF-8 encoded code point.
There are a limited number of named characters, such as #\{newline}.
;; ħ is U+0127 LATIN SMALL LETTER H WITH STROKE
c1 := #U+127
c2 := #\ħ
;; the unicode type is much like fixnum and can be compared with eqv?
printf "Does <<%s>> eqv? <<%s>>? %s\n" c1 c2 (eqv? c1 c2)
;; SPACE
c1 = #U+20
;; or using a named character
c2 = #\{space}
printf "Does <<%s>> eqv? <<%s>>? %s\n" c1 c2 (eqv? c1 c2)
$ idio code-points
Does <<ħ>> eqv? <<ħ>>? #t
Does << >> eqv? << >>? #t
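The difference between a code point (a number such as U+0127) and its UTF-8 encoding (the bytes that appear in a source file) can be cross-checked outside Idio. The following is a minimal sketch in Python, not Idio; the variable names are illustrative only and nothing here is part of the Idio API.

```python
# Cross-check outside Idio: a code point is a number,
# UTF-8 is one way of encoding that number as bytes.
ch = "\u0127"                     # ħ, LATIN SMALL LETTER H WITH STROKE
code_point = ord(ch)              # integer value of the code point
utf8_bytes = ch.encode("utf-8")   # bytes a UTF-8 source file would contain

print(f"U+{code_point:04X}")      # U+0127
print(utf8_bytes.hex(" "))        # c4 a7

# The numeric form and the literal form name the same character,
# which mirrors the eqv? comparisons in the Idio example above.
same_h = chr(0x127) == ch
same_space = chr(0x20) == " "
print(same_h, same_space)         # True True
```

The two-byte sequence c4 a7 is what an editor stores when ħ is typed after #\, while #U+127 spells out the code point number directly.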
There are a number of Unicode-derived Category and Property predicates and a very limited set of conversion functions.
c1 := #U+127
if (Lowercase? c1) {
printf "%s ->Uppercase %s\n" c1 (->Uppercase c1)
}
;; tell me what you know!
unicode/describe c1
$ idio unicode-functions
ħ ->Uppercase Ħ
0127;;Ll;;;;;;;;;;0126;;0126 # Letter Lowercase Alphabetic Uppercase=0126 Titlecase=0126
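The fields in the description above (general category Ll, uppercase and titlecase mappings to 0126) come from the Unicode Character Database, so they can be compared against any other implementation of it. The following is a hedged sketch using Python's standard unicodedata module; the describe function is a hypothetical helper written for this illustration and is not how Idio's unicode/describe is implemented.

```python
import unicodedata


def hexes(s: str) -> str:
    """Format every code point in s as four or more upper-case hex digits."""
    return " ".join(f"{ord(c):04X}" for c in s)


def describe(ch: str) -> dict:
    """Collect a few Unicode Character Database facts about one character.

    Hypothetical helper for illustration only.
    """
    return {
        "code_point": hexes(ch),
        "name": unicodedata.name(ch),
        "category": unicodedata.category(ch),  # e.g. Ll = Letter, lowercase
        "lowercase": ch.islower(),
        "uppercase": hexes(ch.upper()),        # may be several code points
        "titlecase": hexes(ch.title()),
    }


info = describe(chr(0x127))
for key, value in info.items():
    print(f"{key}: {value}")
```

For U+0127 this reports category Ll and 0126 for both the uppercase and titlecase mappings, matching the fields shown in the Idio output.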
Last built at 2024-11-10T07:11:46Z+0000 from 77077af (dev) for Idio 0.3