.. include:: ../global.rst

.. idio:currentmodule:: SRFI-115

Regular Expressions Functions
-----------------------------

.. _`SRFI-115/regexp`:

.. idio:function:: SRFI-115/regexp sre [flags]

   Compiles a regexp if given an object whose structure matches the
   SRE syntax.  Returns `sre` unmodified if it is already a regexp.

   :param sre: SRE
   :type sre: list
   :param flags: flags
   :type flags: list, optional
   :return: regexp
   :rtype: struct-instance
   :raises ^rt-parameter-value-error:

.. _`SRFI-115/regexp?`:

.. idio:function:: SRFI-115/regexp? o

   Is `o` a regexp?

   :param o: value to test
   :type o: any
   :return: boolean

.. _`SRFI-115/valid-sre?`:

.. idio:function:: SRFI-115/valid-sre? x

   Is `x` a valid SRE?

   :param x: value to test
   :type x: any
   :return: boolean

.. _`SRFI-115/rx`:

.. idio:function:: SRFI-115/rx sre ...

   define-template: (rx & args) (x e)

   This is a syntax transformer compiling `sre ...` into a *sequence*
   regular expression :samp:`(regexp #T\\{ (: {sre ...}) \\})`.

.. _`SRFI-115/regexp->sre`:

.. idio:function:: SRFI-115/regexp->sre re

   Returns an SRE corresponding to the given regexp `re`.

   :param re: regexp
   :type re: struct-instance
   :return: SRE
   :rtype: list

.. _`SRFI-115/char-set->sre`:

.. idio:function:: SRFI-115/char-set->sre cset

   Returns an SRE corresponding to the given SRFI 14 character set
   `cset`.

   :param cset: SRFI-14 character set
   :type cset: struct-instance
   :return: SRE
   :rtype: list

.. _`SRFI-115/regexp-matches`:

.. idio:function:: SRFI-115/regexp-matches re str [start [end]]

   Returns a regexp-match object if `re` successfully matches the
   entire string `str` from `start` (inclusive) to `end` (exclusive),
   or ``#f`` if the match fails.  The regexp-match object will contain
   the information needed to extract any submatches.


   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer, optional
   :return: regexp-match object or ``#f``
   :rtype: struct-instance or ``#f``

.. _`SRFI-115/regexp-matches?`:

.. idio:function:: SRFI-115/regexp-matches? rx str [o]

   Returns ``#t`` if `rx` matches `str` as in
   :ref:`regexp-matches <SRFI-115/regexp-matches>`, or ``#f``
   otherwise.

   :param rx: SRE or regexp
   :type rx: list or struct-instance
   :param str: string to match
   :type str: string
   :return: boolean

.. _`SRFI-115/regexp-search`:

.. idio:function:: SRFI-115/regexp-search re str [start [end]]

   Returns a regexp-match object if `re` successfully matches a
   substring of `str` from `start` (inclusive) to `end` (exclusive),
   or ``#f`` if the match fails.  The regexp-match object will contain
   the information needed to extract any submatches.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer, optional
   :return: regexp-match object or ``#f``
   :rtype: struct-instance or ``#f``

.. _`SRFI-115/regexp-replace`:

.. idio:function:: SRFI-115/regexp-replace re str subst [start [end [count]]]

   Returns a new string replacing the `count` :sup:`th` match of `re`
   in `str` with `subst`, where the zero-indexed `count` defaults to
   zero (i.e. the first match).  If there is no `count` :sup:`th`
   match, returns the selected substring unmodified.

   `subst` can be a string, an integer or symbol indicating the
   contents of a numbered or named submatch of `re`, ``'pre`` for the
   substring to the left of the match, or ``'post`` for the substring
   to the right of the match.


   The optional parameters `start` and `end` restrict both the
   matching and the substitution to the given indices, such that the
   result is equivalent to omitting these parameters and replacing on
   :samp:`(substring {str} {start} {end})`.  As a convenience, a value
   of ``#f`` for `end` is equivalent to :samp:`(string-length {str})`.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param subst: replacement
   :type subst: see above
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer or ``#f``, optional
   :param count: zero-based index of the match to replace, defaults to zero
   :type count: integer, optional
   :return: (possibly) modified string
   :rtype: string

.. _`SRFI-115/regexp-replace-all`:

.. idio:function:: SRFI-115/regexp-replace-all re str subst [start [end]]

   Equivalent to :ref:`regexp-replace <SRFI-115/regexp-replace>` but
   replaces all occurrences of `re` in `str`.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param subst: replacement
   :type subst: see above
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer or ``#f``, optional
   :return: (possibly) modified string
   :rtype: string

.. _`SRFI-115/regexp-fold`:

.. idio:function:: SRFI-115/regexp-fold re kons knil str [finish [start [end]]]

   The fundamental regexp matching iterator.  Repeatedly searches
   `str` for the regexp `re` so long as a match can be found.  On each
   successful match, applies
   :samp:`({kons} {i} {regexp-match} {str} {acc})` where `i` is the
   index since the last match (beginning with `start`), `regexp-match`
   is the resulting match, and `acc` is the result of the previous
   `kons` application, beginning with `knil`.  When no more matches
   can be found, calls `finish` with the same arguments, except that
   `regexp-match` is ``#f``.


   By default `finish` just returns `acc`.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param kons: accumulation function
   :type kons: function
   :param knil: initial accumulated value
   :type knil: any
   :param str: string to match
   :type str: string
   :param finish: result function, defaults to a function returning `acc`
   :type finish: function, optional
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer or ``#f``, optional
   :return: according to `finish`
   :rtype: any

.. _`SRFI-115/regexp-extract`:

.. idio:function:: SRFI-115/regexp-extract re str [start [end]]

   Extracts all non-empty substrings of `str` which match `re` between
   `start` and `end` as a list of strings.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer, optional
   :return: list of matching strings
   :rtype: list

.. _`SRFI-115/regexp-split`:

.. idio:function:: SRFI-115/regexp-split re str [start [end]]

   Splits `str` into a list of (possibly empty) strings separated by
   non-empty matches of `re`.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer, optional
   :return: list of the strings between matches
   :rtype: list

.. _`SRFI-115/regexp-partition`:

.. idio:function:: SRFI-115/regexp-partition re str [start [end]]

   Partitions `str` into a list of non-empty strings matching `re`,
   interspersed with the unmatched portions of the string.  The first
   and every odd element is an unmatched substring, which will be the
   empty string if `re` matches at the beginning of the string or end
   of the previous match.
   The second and every even element will be a substring matching
   `re`.  If the final match ends at the end of the string, no
   trailing empty string will be included.  Thus, in the degenerate
   case where `str` is the empty string, the result is ``("")``.

   Note that ``regexp-partition`` is equivalent to interleaving the
   results of :ref:`regexp-split <SRFI-115/regexp-split>` and
   :ref:`regexp-extract <SRFI-115/regexp-extract>`, starting with the
   former.

   :param re: SRE or regexp
   :type re: list or struct-instance
   :param str: string to match
   :type str: string
   :param start: starting index, defaults to 0
   :type start: integer, optional
   :param end: ending index, defaults to string length
   :type end: integer, optional
   :return: list of alternating unmatched and matching strings
   :rtype: list

.. _`SRFI-115/regexp-match?`:

.. idio:function:: SRFI-115/regexp-match? o

   Is `o` a successful match from
   :ref:`regexp-matches <SRFI-115/regexp-matches>` or
   :ref:`regexp-search <SRFI-115/regexp-search>`?

   :param o: value to test
   :type o: any
   :return: boolean

.. _`SRFI-115/regexp-match-count`:

.. idio:function:: SRFI-115/regexp-match-count regexp-match

   Returns the number of submatches of `regexp-match`, regardless of
   whether they matched or not.  Does not include the implicit zero
   full match in the count.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :return: number of submatches
   :rtype: integer

.. _`SRFI-115/regexp-match-submatch`:

.. idio:function:: SRFI-115/regexp-match-submatch regexp-match field

   Returns the substring matched in `regexp-match` corresponding to
   `field`, either an integer or a symbol for a named submatch.  Index
   0 refers to the entire match, index 1 to the first lexicographic
   submatch, and so on.  If there are multiple submatches with the
   same name, the first which matched is returned.  It is an error to
   pass an integer outside the range of matches, or a symbol which
   does not correspond to a named submatch of the pattern.  If the
   corresponding submatch did not match, returns ``#f``.

   The result of extracting a submatch after the original matched
   string has been mutated is unspecified.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :param field: field identifier
   :type field: integer or symbol
   :return: submatch string or ``#f``
   :rtype: string or ``#f``

.. SRFI-115/regexp-match-submatch/list - XXX not found

.. _`SRFI-115/regexp-match-submatch-start`:

.. idio:function:: SRFI-115/regexp-match-submatch-start regexp-match field

   Returns the start index for `regexp-match` corresponding to `field`
   as per :ref:`regexp-match-submatch <SRFI-115/regexp-match-submatch>`.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :param field: field identifier
   :type field: integer or symbol
   :return: submatch starting index
   :rtype: integer

.. _`SRFI-115/regexp-match-submatch-end`:

.. idio:function:: SRFI-115/regexp-match-submatch-end regexp-match field

   Returns the end index for `regexp-match` corresponding to `field`
   as per :ref:`regexp-match-submatch <SRFI-115/regexp-match-submatch>`.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :param field: field identifier
   :type field: integer or symbol
   :return: submatch ending index
   :rtype: integer

.. _`SRFI-115/regexp-match->list`:

.. idio:function:: SRFI-115/regexp-match->list regexp-match

   Returns a list of all submatches in `regexp-match` as strings or
   ``#f``, beginning with the entire match 0.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :return: list of submatches
   :rtype: list

.. _`SRFI-115/regexp-match->sexp`:

.. idio:function:: SRFI-115/regexp-match->sexp regexp-match

   Converts `regexp-match` to a forest of submatches, beginning with
   the full match, using ``#f`` for unmatched submatches.

   :param regexp-match: regexp-match
   :type regexp-match: struct-instance
   :return: sexp
   :rtype: list

.. include:: ../commit.rst
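
The calling convention described for ``regexp-fold`` above can be hard to picture from prose alone, so here is a rough model of it in Python. This is not Idio code and not the SRFI-115 implementation: Python's ``re`` patterns stand in for SREs, the names ``regexp_fold``, ``regexp_extract`` and ``fold_indices`` are hypothetical, and skipping empty matches by stepping one character forward is an assumption made for this sketch.

```python
import re


def regexp_fold(pattern, kons, knil, s, finish=None, start=0, end=None):
    """Model of the fold contract: kons(i, match, s, acc) per match, then finish."""
    if end is None:
        end = len(s)
    if finish is None:
        # Default finish: return the accumulator unchanged.
        def finish(i, match, s, acc):
            return acc
    rx = re.compile(pattern)
    pos = start        # where the next search begins
    last_end = start   # index just past the previous match (the `i` given to kons)
    acc = knil
    while pos < end:
        match = rx.search(s, pos, end)
        if match is None:
            break
        if match.end() == match.start():
            # Assumption of this sketch: empty matches are skipped.
            pos = match.end() + 1
            continue
        acc = kons(last_end, match, s, acc)
        pos = last_end = match.end()
    # No more matches: finish receives the same arguments, with no match object.
    return finish(last_end, None, s, acc)


def regexp_extract(pattern, s):
    """Collect every non-empty match, expressed as a fold."""
    def kons(i, match, s, acc):
        return acc + [match.group(0)]
    return regexp_fold(pattern, kons, [], s)


def fold_indices(pattern, s):
    """Record the index `i` handed to kons for each match."""
    def kons(i, match, s, acc):
        return acc + [i]
    return regexp_fold(pattern, kons, [], s)


if __name__ == "__main__":
    print(regexp_extract(r"\d+", "192.168.0.1"))  # ['192', '168', '0', '1']
    print(fold_indices(r"\d+", "192.168.0.1"))    # [0, 3, 7, 9]
```

In this model each ``i`` is the end of the previous match (or ``start`` for the first one), so the text between ``i`` and the start of the current match is exactly the unmatched portion that a split or partition built on the fold would collect.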