Regular Expression Reference

The following reference describes the parts of a regular expression that can be combined to form a character field constraint.

When specifying constraints, you may want certain characters to be automatically assigned to a field. To do this, type the characters where you want them to appear in the field. The exceptions are characters that have been reserved for use by regular expressions, which are listed in the Symbol column of the table below. A reserved character can be automatically assigned to a field by placing a backslash immediately before it.

For example, if your organization decides that phone numbers should be specified as (310) 555-1212, you would specify the following expression: \(\d\d\d\) \d\d\d-\d\d\d\d. Notice that the parentheses have been escaped with a backslash, while the dash has not been. Both parentheses and dashes are reserved characters. However, a dash has a special meaning only inside brackets. Outside of brackets it is treated as a regular character and does not require a backslash.

Assigning this constraint to a field creates visual indicators that show how field data should be formatted. In the example above, blank fields will look like "(   )    -    ". Users do not need to type the parentheses or the dash when specifying a phone number, because these symbols are shown automatically to indicate the expected format.
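The phone number constraint can be tried outside the product. The following is a minimal sketch that uses Python's re module as a stand-in for the product's own regular expression engine, which may behave differently. The function names is_valid_phone and blank_mask are hypothetical and exist only for illustration. The blank_mask helper understands nothing beyond plain characters, escaped literals, and \d.

```python
import re

# Expression from the text: escaped parentheses, unescaped dash.
PHONE_PATTERN = r"\(\d\d\d\) \d\d\d-\d\d\d\d"


def is_valid_phone(value):
    # Hypothetical helper: True when the entire value matches the pattern.
    return re.fullmatch(PHONE_PATTERN, value) is not None


def blank_mask(pattern):
    # Hypothetical helper: keep literal characters and show a blank
    # for every \d placeholder. Other regular expression syntax is
    # not supported by this sketch.
    mask = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \d becomes a blank; any other escaped character is a literal.
            mask.append(" " if nxt == "d" else nxt)
            i += 2
        else:
            mask.append(ch)
            i += 1
    return "".join(mask)


print(is_valid_phone("(310) 555-1212"))
print(is_valid_phone("310-555-1212"))
print(repr(blank_mask(PHONE_PATTERN)))
```

The first value is accepted and the second is rejected, because the second lacks the literal parentheses and space that the expression requires.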

The following table describes each regular expression symbol that can be used to establish a pattern that field data must match.

Name Symbol Description
Any Character . Any single character.
Character in Range [] Any single character inside the brackets. For example, the expression [abc123] allows only one of the following characters: "a," "b," "c," "1," "2," or "3."
Character Not in Range [^] Any single character except for those inside the brackets. For example, the expression [^abc123] allows any single character except for: "a," "b," "c," "1," "2," or "3."
Range Character [-] Any single character contained within the specified range. For example, the expression [0-9] allows only a single digit from 0 through 9.
Beginning of Input ^ Requires the expression that follows it at the beginning of the user-defined value. For example, the expression ^[abc123] allows only field data that starts with either "a," "b," "c," "1," "2," or "3."
End of Input $ Requires the expression that precedes it at the end of the user-defined value. For example, the expression [abc123]$ allows only field data that ends with either "a," "b," "c," "1," "2," or "3."
Not ! Requires that the expression following the symbol (!) not be found in the field data. For example, the expression a!b allows only field data containing an "a" when it is not immediately followed by "b."
Or | Requires one of two expressions. For example, the expression he|she allows only field data that is set to either "he" or "she."
0 or More * The preceding expression can occur zero or more times. For example, the expression [0-9]* allows any set of consecutive digits or no digits at all.
1 or More + The preceding expression can occur one or more times. For example, the expression [0-9]+ allows any set of one or more consecutive digits.
Previous Statement is Optional ? The preceding expression is optional. Data satisfying the expression may be specified as field data, or the user can choose not to enter it. For example, the expression [0-9][0-9]? allows only a one-digit or two-digit number.
Group () Groups an expression together. For example, the expression (t|T)he allows only field data that is set to either "the" or "The".
Escape Character \ Indicates either an abbreviation (see the Abbreviations table below) or that the next character is to be interpreted literally. For literal characters, the backslash is needed only before reserved characters, such as those listed under the Symbol column of this table. For example, \d+ allows only one or more digits, while \d\+ allows only a digit followed by a plus sign.
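Several of the examples in the table above can be checked with a short script. The following is a minimal sketch that uses Python's re module as a stand-in, on the assumption that the product compares the entire field value against the expression. The helper name allowed is hypothetical. The Not symbol (!) is omitted because Python's re module has no equivalent operator.

```python
import re


def allowed(pattern, value):
    # Hypothetical helper: True when the whole value matches the pattern.
    return re.fullmatch(pattern, value) is not None


# Or: the value must be one of the two alternatives.
print(allowed(r"he|she", "she"), allowed(r"he|she", "hes"))

# Group: the alternation applies only to the first letter.
print(allowed(r"(t|T)he", "The"), allowed(r"(t|T)he", "THE"))

# 0 or More accepts an empty value, 1 or More does not.
print(allowed(r"[0-9]*", ""), allowed(r"[0-9]+", ""))

# Optional: one or two digits, but not three.
print(allowed(r"[0-9][0-9]?", "7"), allowed(r"[0-9][0-9]?", "123"))

# Escape: \d+ is one or more digits, \d\+ is a digit and a literal plus sign.
print(allowed(r"\d+", "55"), allowed(r"\d\+", "5+"))

# Beginning and End of Input, checked with a search rather than a full match.
print(re.search(r"^[abc123]", "apple") is not None)
print(re.search(r"[abc123]$", "zebra") is not None)
```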

Character Classes

A character class can be used to restrict the characters that are allowed in a particular field.

Name Symbol Description
Alphanumeric [[:alnum:]] Any alphanumeric character.
Alphabetic [[:alpha:]] Any alphabetical character in the following ranges:  a-z and A-Z.
Space/Tab [[:blank:]] A space or a tab.
Digit [[:digit:]] Any digit. A valid character is a whole number from 0 to 9.
Lower-case [[:lower:]] Any lower-case character (i.e., a-z).
Printable [[:print:]] Any printable character.
Punctuation [[:punct:]] Any punctuation character.
Space [[:space:]] Any whitespace character.
Upper-case [[:upper:]] Any upper-case character (i.e., A-Z).
Hexadecimal [[:xdigit:]] Any hexadecimal digit (i.e., 0-9, a-f and A-F).
Word [[:word:]] Any word character. Valid characters are all alphanumeric characters and underscore.
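Not every regular expression engine understands the bracketed class names listed above. Python's re module, for example, does not. The following is a minimal sketch of how a subset of the classes could be rewritten as ordinary ranges for testing purposes. The mapping assumes ASCII characters only, and the names POSIX_CLASS_BODIES and expand_posix_classes are hypothetical. The Printable and Punctuation classes are left out of this sketch.

```python
import re

# Assumed ASCII-only equivalents for the text between the outer brackets.
POSIX_CLASS_BODIES = {
    "alnum": "a-zA-Z0-9",
    "alpha": "a-zA-Z",
    "blank": r" \t",
    "digit": "0-9",
    "lower": "a-z",
    "space": r" \t\n\r\f\v",
    "upper": "A-Z",
    "xdigit": "0-9a-fA-F",
    "word": "a-zA-Z0-9_",
}


def expand_posix_classes(pattern):
    # Hypothetical helper: replace each [:name:] with its range body,
    # so [[:digit:]] becomes [0-9] and [^[:digit:]] becomes [^0-9].
    # Class names missing from the mapping are left unchanged.
    names = "|".join(POSIX_CLASS_BODIES)

    def replace(match):
        return POSIX_CLASS_BODIES[match.group(1)]

    return re.sub(r"\[:(" + names + r"):\]", replace, pattern)


print(expand_posix_classes("[[:digit:]]"))
print(expand_posix_classes("[^[:digit:]]"))
print(expand_posix_classes("[[:alpha:]][[:alnum:]]*"))

# The expanded expression can then be used with Python's re module.
identifier = expand_posix_classes("[[:alpha:]][[:word:]]*")
print(re.fullmatch(identifier, "field_1") is not None)
```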

Abbreviations

The following table describes the various abbreviations for specifying a regular expression.

Name Symbol Description
Character . Any single character.
Decimal Digit \d Any single decimal digit. Corresponding syntax: [[:digit:]]
Non-Decimal Digit \D Any single character that is not a decimal digit. Corresponding syntax: [^[:digit:]]
Space \s Any single whitespace character. Corresponding syntax: [[:space:]]
Non-Space \S Any single character that is not a whitespace character. Corresponding syntax: [^[:space:]]
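The abbreviations can be compared against single characters as follows. This is a minimal sketch that uses Python's re module as a stand-in for the product's engine. The re.ASCII flag is an assumption made here so that \d covers only 0 through 9, and the helper name matches_one is hypothetical.

```python
import re


def matches_one(pattern, character):
    # Hypothetical helper: True when the single character satisfies the
    # pattern. re.ASCII restricts \d and \s to ASCII characters.
    return re.fullmatch(pattern, character, flags=re.ASCII) is not None


for symbol in (r"\d", r"\D", r"\s", r"\S"):
    results = {repr(ch): matches_one(symbol, ch) for ch in ("7", "x", " ", "\t")}
    print(symbol, results)
```

In Python, \s accepts a tab as well as a space, which is consistent with the [[:space:]] class that the table lists as its corresponding syntax.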