.. include:: ../global.rst

.. idio:currentmodule:: unicode

unicode Functions
-----------------

The *predicates* in this list each assert some Unicode *Category* or
*Property*.  There are at least 65 Categories and Properties; the ones
here are those required for :lname:`Idio` to do what it needs to do.

There are three conversion functions between cases.

.. _`unicode/ASCII_Hex_Digit?`:

.. idio:function:: unicode/ASCII_Hex_Digit? o

   Does `o` have the Unicode Property ``ASCII_Hex_Digit``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Alphabetic?`:

.. idio:function:: unicode/Alphabetic? o

   Does `o` have the Unicode Property ``Alphabetic``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Control?`:

.. idio:function:: unicode/Control? o

   Does `o` have the Unicode Property ``Control``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Decimal_Number?`:

.. idio:function:: unicode/Decimal_Number? o

   Is `o` in the Unicode Category ``Nd``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Extend?`:

.. idio:function:: unicode/Extend? o

   Does `o` have the Unicode Property ``Extend``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Fractional_Number?`:

.. idio:function:: unicode/Fractional_Number? o

   Is the Unicode Property ``Numeric_Value`` of `o` a fraction?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/L?`:

.. idio:function:: unicode/L? o

   Does `o` have the Unicode Property ``L``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/LV?`:

.. idio:function:: unicode/LV? o

   Does `o` have the Unicode Property ``LV``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/LVT?`:

.. idio:function:: unicode/LVT? o

   Does `o` have the Unicode Property ``LVT``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Letter?`:

.. idio:function:: unicode/Letter? o

   Is `o` in any of the Unicode Categories ``L*``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Lowercase?`:

.. idio:function:: unicode/Lowercase? o

   Does `o` have the Unicode Property ``Lowercase``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Mark?`:

.. idio:function:: unicode/Mark? o

   Is `o` in any of the Unicode Categories ``M*``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Number?`:

.. idio:function:: unicode/Number? o

   Is `o` in any of the Unicode Categories ``N*``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Punctuation?`:

.. idio:function:: unicode/Punctuation? o

   Does `o` have the Unicode Property ``Punctuation``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Regional_Indicator?`:

.. idio:function:: unicode/Regional_Indicator? o

   Does `o` have the Unicode Property ``Regional_Indicator``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Separator?`:

.. idio:function:: unicode/Separator? o

   Is `o` in any of the Unicode Categories ``Z*``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/SpacingMark?`:

.. idio:function:: unicode/SpacingMark? o

   Does `o` have the Unicode Property ``SpacingMark``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Symbol?`:

.. idio:function:: unicode/Symbol? o

   Is `o` in any of the Unicode Categories ``S*``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/T?`:

.. idio:function:: unicode/T? o

   Does `o` have the Unicode Property ``T``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Titlecase_Letter?`:

.. idio:function:: unicode/Titlecase_Letter? o

   Does `o` have the Unicode Property ``Titlecase_Letter``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/Uppercase?`:

.. idio:function:: unicode/Uppercase? o

   Does `o` have the Unicode Property ``Uppercase``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/V?`:

.. idio:function:: unicode/V? o

   Does `o` have the Unicode Property ``V``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/White_Space?`:

.. idio:function:: unicode/White_Space? o

   Does `o` have the Unicode Property ``White_Space``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/ZWJ?`:

.. idio:function:: unicode/ZWJ? o

   Does `o` have the Unicode Property ``ZWJ``?

   :param o: object to test
   :type o: unicode|string
   :return: boolean

.. _`unicode/ASCII-Decimal_Number?`:

.. idio:function:: unicode/ASCII-Decimal_Number? cp

   Is `cp` in the Unicode Category ``Nd`` and less than 0x80?

   :param cp: code point to test
   :type cp: unicode
   :return: boolean

   This function overrides the :ref:`ASCII-Decimal_Number? ` version
   found in :file:`lib/bootstrap/common.idio`.

.. _`unicode/->Lowercase`:

.. idio:function:: unicode/->Lowercase cp

   Return the Unicode ``Simple_Lowercase_Mapping`` of `cp`

   :param cp: value to convert
   :type cp: unicode
   :return: unicode

   Note that the default lower-case mapping is `cp` itself.

.. _`unicode/->Titlecase`:

.. idio:function:: unicode/->Titlecase cp

   Return the Unicode ``Simple_Titlecase_Mapping`` of `cp`

   :param cp: value to convert
   :type cp: unicode
   :return: unicode

   Note that the default Title-case mapping is `cp` itself.

.. _`unicode/->Uppercase`:

.. idio:function:: unicode/->Uppercase cp

   Return the Unicode ``Simple_Uppercase_Mapping`` of `cp`

   :param cp: value to convert
   :type cp: unicode
   :return: unicode

   Note that the default upper-case mapping is `cp` itself.

.. _`unicode/numeric-value`:

.. idio:function:: unicode/numeric-value cp

   Return the Unicode ``Numeric_Value`` of `cp`

   :param cp: code point
   :type cp: unicode
   :return: integer or string
   :raises ^rt-param-value-error: if `cp` is not ``Numeric?``

   The Unicode ``Numeric_Value`` can be a decimal integer or a
   rational.  A rational is returned as a string.

   .. seealso:: :ref:`Fractional_Number? `

.. _`unicode/describe`:

.. idio:function:: unicode/describe o

   Print the Unicode attributes of `o`

   :param o: value to describe
   :type o: unicode or string
   :return: ``#``

   The :ref:`unicode/describe ` function reports a pseudo *Unicode
   Character Database* entry plus the Categories and Properties
   associated with the code point, indications of any Lowercase,
   Uppercase or Titlecase variants, and any possible *Numeric_Value*.

   It will do the same for each code point in a string (of which there
   may, of course, be more than the number of "characters").

   .. code-block:: idio

      Idio> unicode/describe "é"
      00E9;;Ll;;;;;;;;;;00C9;;00C9 # Letter Lowercase Alphabetic Uppercase=00C9 Titlecase=00C9
      Idio> unicode/describe "é"
      0065;;Ll;;;;;;;;;;0045;;0045 # Letter Lowercase Alphabetic ASCII_Hex_Digit Uppercase=0045 Titlecase=0045
      0301;;Mn;;;;;;;;;;;; # Mark Extend
      Idio> describe "🏴󠁧󠁢󠁳󠁣󠁴󠁿"
      1F3F4;;So;;;;;;;;;;;; # Symbol
      E0067;;Cf;;;;;;;;;;;; # Extend
      E0062;;Cf;;;;;;;;;;;; # Extend
      E0073;;Cf;;;;;;;;;;;; # Extend
      E0063;;Cf;;;;;;;;;;;; # Extend
      E0074;;Cf;;;;;;;;;;;; # Extend
      E007F;;Cf;;;;;;;;;;;; # Extend

   The third string is an example of an emoji, in this case,
   ``flag: Scotland``, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, in the *Subdivision Flags* Category.
   Don't worry if you see just a black flag rather than a Saltire:
   desktop browser support is poor, although mobile phone support is
   better.

   The point is that a single (double-width) "character" is, in this
   case, constructed from seven Unicode code points:

   .. csv-table::
      :widths: auto
      :align: left

      1F3F4, WAVING BLACK FLAG
      E0067, TAG LATIN SMALL LETTER G
      E0062, TAG LATIN SMALL LETTER B
      E0073, TAG LATIN SMALL LETTER S
      E0063, TAG LATIN SMALL LETTER C
      E0074, TAG LATIN SMALL LETTER T
      E007F, CANCEL TAG

   revealing the GB then SCT elements.  There are corresponding WLS
   and ENG variants for the flags for Wales and England, respectively.
   These abbreviations appear to be derived from `ISO 3166-2:GB `_.

Utility Functions
-----------------

Some utility functions for dealing with :ref:`SRFI-14 module`
char-sets.

.. _`unicode/unicode->plane`:

.. idio:function:: unicode/unicode->plane cp

   Return the Unicode plane of `cp`

   :param cp: unicode to analyse
   :return: Unicode plane of `cp`
   :rtype: fixnum

.. _`unicode/unicode->plane-codepoint`:

.. idio:function:: unicode/unicode->plane-codepoint cp

   Return the lower 16 bits of `cp`

   :param cp: unicode to convert
   :return: lower 16 bits of `cp`
   :rtype: fixnum

.. include:: ../commit.rst
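
The arithmetic behind the two plane utility functions can be sketched as follows. This is a minimal illustration in Python rather than :lname:`Idio`, and it assumes the usual Unicode convention that a plane is a block of 0x10000 code points, so the plane is whatever lies above the low 16 bits. The names ``unicode_plane`` and ``plane_codepoint`` are hypothetical stand-ins, not the module's implementation. The sample data is the seven code points of the ``flag: Scotland`` sequence.

.. code-block:: python

   def unicode_plane(cp: int) -> int:
       """Plane number of code point cp (assumed: the bits above the low 16)."""
       return cp >> 16


   def plane_codepoint(cp: int) -> int:
       """The lower 16 bits of cp, i.e. its offset within its plane."""
       return cp & 0xFFFF


   # The seven code points of the "flag: Scotland" sequence.
   FLAG_SCOTLAND = [0x1F3F4, 0xE0067, 0xE0062, 0xE0073, 0xE0063, 0xE0074, 0xE007F]

   for cp in FLAG_SCOTLAND:
       # Show each code point with its plane and its offset in that plane.
       print(f"U+{cp:04X}: plane {unicode_plane(cp)}, offset {plane_codepoint(cp):04X}")

Under that assumption the base flag, U+1F3F4, sits in plane 1 and the six tag characters sit in plane 14, which is why none of them fits in a single 16-bit value.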