Regular Expressions
Regular expressions are sequences of characters that define a search pattern. They can be used to identify or format data. Use the Pattern Matching activity to test a regular expression.
Character Classes
Matches any one of a set of characters.
Regular Expression | Description |
---|---|
. | Matches any character except "\n". For example, the pattern "a.e" matches "ave" in "have" and "ate" in "water". To match a period character ".", precede the period with the escape character "\" to produce "\.". |
[] | Matches any character in a set of characters. The characters are case-sensitive. For example, the pattern "[as]" matches "a" and "s" in "Laserfiche". |
[^] | Matches any character not in a set of characters. The characters are case-sensitive. For example, the pattern "[^as]" matches "L", "r", "f", "i", "c", and "h" once each and "e" twice in "Laserfiche". |
- | Used inside brackets to match any single character in the range from the left character to the right character. For example, the pattern "[A-X]" matches "X" in "XY". |
\p{} | Matches any single character in the Unicode category. For example, the pattern "\p{IsCyrillic}" matches "Д" in "ДA". |
\P{} | Matches any single character not in the Unicode category. For example, the pattern "\P{IsCyrillic}" matches "A" in "ДA". |
\w | Matches any word character. For example, the pattern "\w" matches "A", "B", "1", and "2" in "AB 1.2". |
\W | Matches any non-word character. For example, the pattern "\W" matches " " (blank space) and "." in "AB 1.2". |
\s | Matches any white-space character. For example, the pattern "\s" matches " " (blank space) in "AB 1.2". |
\S | Matches any non-white-space character. For example, the pattern "\S" matches "A", "B", "1", ".", and "2" in "AB 1.2". |
\d | Matches any decimal digit. For example, the pattern "\d" matches "1", "2", "3", and "4" in "ab 1.234". |
\D | Matches any character that is not a decimal digit. For example, the pattern "\D" matches "a", "b", " " (blank space), and "." in "ab 1.234". |
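The table's examples can be checked with a short script. The sketch below uses Python's `re` module purely for illustration; it is an assumption that the behavior carries over to the Pattern Matching activity, and Python's `re` does not support the `\p{}` and `\P{}` Unicode category classes, so those rows are omitted.

```python
import re

# Sample strings are the ones used in the table above.
text = "AB 1.2"

word_chars = re.findall(r"\w", text)       # letters, digits, underscore
non_word_chars = re.findall(r"\W", text)   # everything that is not \w
spaces = re.findall(r"\s", text)           # white-space characters
non_spaces = re.findall(r"\S", text)       # everything that is not \s
digits = re.findall(r"\d", "ab 1.234")     # decimal digits
non_digits = re.findall(r"\D", "ab 1.234") # everything that is not \d

in_set = re.findall(r"[as]", "Laserfiche")       # any one of "a" or "s"
not_in_set = re.findall(r"[^as]", "Laserfiche")  # anything except "a" or "s"
in_range = re.findall(r"[A-X]", "XY")            # "Y" falls outside A-X

any_char = re.findall(r"a.e", "have water")      # "." matches any one character
literal_dot = re.findall(r"\.", text)            # "\." matches only a period

print(word_chars, non_word_chars, spaces, non_spaces)
print(digits, non_digits, in_set, not_in_set, in_range, any_char, literal_dot)
```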
Quantifiers
Matches a specified number of elements.
Regular Expression | Description |
---|---|
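* | Matches the previous element zero or more times. For example, the pattern "d*" matches "d" twice in "1dad". Because zero repetitions are allowed, the pattern also produces empty matches at positions where no "d" is present. |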
+ | Matches the previous element one or more times. For example, the pattern "to+" matches "to" in "tough" and "too" in "tooth". |
? | Matches the previous element zero or one time. For example, the pattern "card?" matches "card" in "cards" and "car" in "cars". |
{n} | Matches the previous element n times. For example, the pattern ",\d{3}" matches ",234" and ",567" in "1,234,567.890". |
{n,} | Matches the previous element at least n times. For example, the pattern "\d{2,}" matches "11" and "24" in "11.24". |
{n,m} | Matches the previous element at least n times, but no more than m times. For example, the pattern "\d{2,4}" matches "113" and "2444" in "113.2444". |
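The quantifier examples above can be reproduced as follows. This is a minimal sketch in Python's `re` module, chosen only for illustration; the sample strings come from the table, and empty matches produced by `*` are filtered out so that only the visible matches remain.

```python
import re

# "*" allows zero repetitions, so it also yields empty matches; drop them here.
star = [m for m in re.findall(r"d*", "1dad") if m]

plus = re.findall(r"to+", "tough tooth")          # "t" then one or more "o"
optional = re.findall(r"card?", "cards cars")     # the "d" may be absent
exactly_n = re.findall(r",\d{3}", "1,234,567.890")  # comma plus exactly 3 digits
at_least_n = re.findall(r"\d{2,}", "11.24")       # 2 or more digits
n_to_m = re.findall(r"\d{2,4}", "113.2444")       # between 2 and 4 digits

print(star, plus, optional, exactly_n, at_least_n, n_to_m)
```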
Grouping Constructs
Groups regular expressions to capture substrings of a string.
Regular Expression | Description |
---|---|
() | Matches the expression in the parentheses and captures the matched substring as a numbered group. For example, the pattern "(1-3)" matches "1-3" in "1-34", but nothing in "1". |
(?:) | Creates a group that will not capture the string matched by the group. |
(?<>) | Creates a named capture group for future use in the regular expression. |
\k<> | References a named capture group created in the expression. Matches the string captured by that capture group. |
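The difference between capturing, non-capturing, and named groups is easiest to see by inspecting what each match returns. The sketch below uses Python's `re` module as an illustration only. Note the assumption: Python spells named groups `(?P<name>...)` and named backreferences `(?P=name)`, whereas the table above uses `(?<name>...)` and `\k<name>`. The group name `ch` and the sample strings "abab7" and "tooth" are hypothetical.

```python
import re

# Capturing group: the substring matched inside () is stored as group 1.
captured = re.search(r"(1-3)", "1-34").group(1)
no_match = re.search(r"(1-3)", "1")   # None: "1-3" does not occur in "1"

# Non-capturing group: (?:ab) groups for the "+" quantifier but stores nothing,
# so the only captured group is the digit.
non_capturing_groups = re.search(r"(?:ab)+(\d)", "abab7").groups()

# Named group plus backreference: match a word character followed by itself.
doubled = re.search(r"(?P<ch>\w)(?P=ch)", "tooth")
doubled_text = doubled.group()        # whole match
doubled_char = doubled.group("ch")    # text captured by the named group

print(captured, no_match, non_capturing_groups, doubled_text, doubled_char)
```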
Character Escapes
Matches a special or literal character using a backslash character (\).
Regular Expression | Description |
---|---|
\a | Matches a bell character (\u0007). |
\t | Matches a tab (\u0009). |
\v | Matches a vertical tab (\u000B). |
\f | Matches a form feed (\u000C). |
\n | Matches a new line (\u000A). |
\r | Matches a carriage return (\u000D). |
\e | Matches an escape (\u001B). |
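\040 | Matches an ASCII character specified as an octal value. For example, "\040" matches a blank space. |
\x20 | Matches an ASCII character specified as a two-digit hexadecimal value. For example, "\x20" matches a blank space. |
\c | Matches an ASCII control character. For example, "\cC" matches Ctrl-C. |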
\\ | Matches a backslash. |
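Each escape in the table can be paired with the character it is expected to match. The sketch below does this in Python's `re` module, used only as an illustration; it is an assumption that the activity behaves the same way. Python's `re` does not accept `\e` or `\c`, so those two rows are left out.

```python
import re

# Pattern -> the single character it should match.
escapes = {
    r"\a": "\u0007",   # bell
    r"\t": "\u0009",   # tab
    r"\v": "\u000B",   # vertical tab
    r"\f": "\u000C",   # form feed
    r"\n": "\u000A",   # new line
    r"\r": "\u000D",   # carriage return
    r"\040": " ",      # octal 40 is decimal 32, a blank space
    r"\x20": " ",      # hexadecimal 20 is decimal 32, a blank space
    r"\\": "\\",       # a literal backslash
}

# fullmatch succeeds only if the pattern matches the entire one-character string.
results = {pattern: re.fullmatch(pattern, char) is not None
           for pattern, char in escapes.items()}

for pattern, matched in results.items():
    print(pattern, matched)
```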
Metacharacters
Matches or does not match depending on the position in the string.
Regular Expression | Description |
---|---|
^ | Requires the match to occur at the beginning of the string (or, in multiline mode, at the beginning of a line). For example, the pattern "^\d{2}" matches "12" in "12-34". |
$ | Requires the match to occur at the end of the string or before "\n" at the end of the string (or, in multiline mode, at the end of a line). For example, the pattern "\d{2}$" matches "34" in "12-34". |
\A | Requires the match to occur at the beginning of the string, regardless of multiline mode. For example, the pattern "\A\d{2}" matches "12" in "12-34". |
\Z | Requires the match to occur at the end of the string or before "\n" at the end of the string, regardless of multiline mode. For example, the pattern "\d{2}\Z" matches "34" in "12-34". |
\G | Matches only the item at the point where the previous match ended. For example, the pattern "\G\(\d\)" matches "(1)" and "(2)" in "(1)(2)[3](4)", but matches only "(1)" in "(1) (2)[3](4)". |
\b | Requires the match to occur on a boundary between a "\w" character and a "\W" character. The start and end of the string also count as boundaries when the adjacent character is a "\w" character. For example, the pattern "\b\w" matches "s" twice in "sea shells". The pattern "\w\b" matches "w" once in "workflow". |
\B | Requires the match to occur at a position that is not a "\b" boundary. For example, the pattern "\B\w" matches "a", "h", and "s" once each and "e" and "l" twice each in "sea shells". |
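The position-based examples above can be verified with a short script. This sketch uses Python's `re` module for illustration only, under two stated assumptions: Python has no `\G`, so the hypothetical helper `contiguous_matches` emulates it by anchoring each attempt where the previous match ended; and Python's `\Z` matches only at the very end of the string, so it is not shown here.

```python
import re

start = re.findall(r"^\d{2}", "12-34")            # must begin the string
end = re.findall(r"\d{2}$", "12-34")              # must end the string
end_before_newline = re.findall(r"\d{2}$", "12-34\n")  # "$" also matches before a final "\n"
start_a = re.findall(r"\A\d{2}", "12-34")

word_starts = re.findall(r"\b\w", "sea shells")   # first character of each word
word_ends = re.findall(r"\w\b", "workflow")       # last character of the word
inside_words = re.findall(r"\B\w", "sea shells")  # characters not at a word start


def contiguous_matches(pattern, text):
    """Emulate \\G: each match must start exactly where the previous one ended."""
    compiled = re.compile(pattern)
    position, found = 0, []
    while True:
        match = compiled.match(text, position)    # anchored at `position`
        if match is None or match.end() == position:
            return found
        found.append(match.group())
        position = match.end()


chained = contiguous_matches(r"\(\d\)", "(1)(2)[3](4)")
broken_chain = contiguous_matches(r"\(\d\)", "(1) (2)[3](4)")

print(start, end, end_before_newline, start_a)
print(word_starts, word_ends, inside_words, chained, broken_chain)
```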
Alternations
Modifies a regular expression to allow either-or matching.
Regular Expression | Description |
---|---|
| | Matches any single element in the items separated by the vertical bar "|" character. For example, the pattern "c(ar|haracter|all)" matches "car" and "character" in "This car has character". |
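The alternation example can be reproduced as shown below, using Python's `re` module as an illustration only. The second part of the sketch shows that alternatives are tried from left to right, so an earlier, shorter alternative can win over a later, longer one; the pattern "c(ar|arpet)" and the string "carpet" are hypothetical.

```python
import re

# finditer is used so that the whole match is returned, not just the group.
matches = [m.group() for m in
           re.finditer(r"c(ar|haracter|all)", "This car has character")]

# Alternatives are tried in order: "ar" succeeds first, so "arpet" is never tried.
first_wins = re.search(r"c(ar|arpet)", "carpet").group()

# Listing the longer alternative first changes the result.
longer_first = re.search(r"c(arpet|ar)", "carpet").group()

print(matches, first_wins, longer_first)
```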
Options
Sets or disables miscellaneous options.
Regular Expression | Description |
---|---|
(?i) | Disables case-sensitive matching. |
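(?m) | Enables multiline mode, in which "^" and "$" match at the beginning and end of each line instead of only at the beginning and end of the string. |
(?n) | Enables explicit capture, in which only named groups capture; unnamed parentheses act as non-capturing groups. |
(?s) | Enables single-line mode, in which "." matches every character, including "\n". |
(?x) | Enables ignoring unescaped whitespace in a regular expression, so that a pattern can be spaced out for readability. |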
(?-i) | Enables case-sensitive matching. |
(?-m) | Disables multiline mode. |
(?-n) | Disables explicit capture. |
(?-s) | Disables single-line mode. |
(?-x) | Disables ignoring whitespace in a regular expression. |
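The effect of the inline options can be seen by running the same pattern with and without each one. The sketch below uses Python's `re` module for illustration only, with these assumptions: Python has no `(?n)` option, and it can switch an option off only for a scoped group such as `(?-i:...)`, not for the rest of the pattern as `(?-i)` does in the table above. All sample strings are hypothetical.

```python
import re

# (?i): case-insensitive matching.
ignore_case = re.findall(r"(?i)laserfiche", "Laserfiche LASERFICHE")

# (?m): "^" matches at the start of every line, not just the start of the string.
without_multiline = re.findall(r"^\d+", "12\n34")
with_multiline = re.findall(r"(?m)^\d+", "12\n34")

# (?s): "." also matches the newline character.
without_singleline = re.findall(r"a.b", "a\nb")
with_singleline = re.findall(r"(?s)a.b", "a\nb")

# (?x): whitespace in the pattern is ignored and "#" starts a comment.
verbose = re.findall(r"(?x) \d{3} - \d{4}  # three digits, hyphen, four digits",
                     "555-1234")

# Scoped (?-i:...): the "b" must be lowercase even though (?i) is on.
scoped_off = re.findall(r"(?i)a(?-i:b)", "AB Ab ab")

print(ignore_case, without_multiline, with_multiline)
print(without_singleline, with_singleline, verbose, scoped_off)
```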