regex
| Definition: |
include builtins\regex.e
sequence m = regex(sequence re, string target, integer strtndx=1) |
| Description: |
Applies the regular expression re (in string or pre-compiled format) to target and returns an array of group
indexes or {} if no match could be found.
re: a regular expression such as "a(b*)", or the result of applying regex_compile() to such a string.
target: a string to be matched against the regular expression.
strtndx: the index in target at which matching starts. A value of n other than 1 skips the first n-1 characters of target, which can be significantly faster than repeatedly passing ever-shorter slices of target and then having to add an offset to any results.
Returns: an even-length sequence of group indexes, or {} if no match could be found. The results are start/end pairs corresponding to \0, \1, \2, etc. in regular expression aficionado parlance. |
| pwa/p2js: | Supported. |
| Example 1: |
?regex(`(a)(bc)`,"abc") -- yields {1,3, 1,1, 2,3}
The result is 3 pairs of start/end indexes, which can be read as {target[1..3],target[1..1],target[2..3]}, which in this case is {"abc","a","bc"}. The first pair, aka `\0`, is the entire match; the second, aka `\1`, from `(a)` is "a"; and the third, aka `\2`, from `(bc)` is "bc". |
| Example 2: |
string target = "abc"
sequence res = regex(`(a)(bc)`,target) -- res is {1,3, 1,1, 2,3}
if res!={} then
    sequence capture_groups = {}
    for i=1 to length(res) by 2 do
        integer {s,e} = res[i..i+1]
        capture_groups = append(capture_groups,target[s..e])
    end for
    -- capture_groups is now {"abc","a","bc"}
end if
In most cases only one capture group exists, or only one is of any interest, and such a loop would be overkill. |
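| Example 3: |
A minimal sketch of how the strtndx parameter and a single capture group might be used together. The helper all_first_groups() is hypothetical (it is not part of regex.e). The sketch assumes that the indexes returned are positions in the whole of target even when strtndx is greater than 1, as the description of strtndx implies, and that the expression never produces an empty match.

```phix
include builtins\regex.e

-- Hypothetical helper, for illustration only: collect the text of capture
-- group \1 for each successive match of re in target.
function all_first_groups(sequence re, string target)
    sequence res = {}
    integer strtndx = 1
    while strtndx<=length(target) do
        sequence m = regex(re, target, strtndx)
        if m={} then exit end if
        -- m[3..4] is the start/end pair for \1, so it can be sliced directly,
        -- without a loop over all the groups
        res = append(res, target[m[3]..m[4]])
        -- resume just past the end of the whole match (\0); this assumes a
        -- non-empty match, otherwise strtndx would never advance
        strtndx = m[2]+1
    end while
    return res
end function

sequence re = regex_compile(`(a)(bc)`)  -- compile once, reuse for every call
?all_first_groups(re, "abc abc")
```

Passing strtndx avoids slicing target on each iteration, and pre-compiling the expression avoids recompiling it on every call. An expression that can match the empty string would need an extra guard so that strtndx always moves forward. |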
| See Also: | regex syntax |