Appendix B - Regular Expressions


The Regular Expressions language can be considered a small programming language designed specifically for string processing. Its main purpose is to locate substrings that fit a pattern within a larger string, according to relative position, context, case and many other attributes.

 

To achieve this, the Regular Expressions language recognizes a set of special characters comparable in function to the wildcard characters * and ? in the DOS environment. Combining these special characters makes it possible to describe a very wide range of patterns to search for within a string. There is also a system for grouping parts of substrings and intermediate results during a search operation.

 

z/Scope takes advantage of the power and simplicity of the Regular Expressions language for defining Hotspots, one of z/Scope's key features. To create a Hotspot, the user specifies the criteria that a text string in the emulation display must fulfill in order to be recognized as a Hotspot and respond to mouse clicks. See Creating/Editing a Hotspot.

 

Most letters and characters will simply match themselves. For example, the regular expression "engine" will match the string "engine" exactly. However, there are some special characters (usually called metacharacters) that do not match themselves. Instead, they are used to define rules and patterns that will be looked for when analyzing the strings.
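
The difference between literal characters and metacharacters can be tried out in any regular expression engine. The sketch below uses Python's standard re module purely as an illustration; z/Scope's own engine may differ in details, and the helper name find_all and the sample strings are assumptions made for this example.

```python
import re
from typing import List


def find_all(pattern: str, text: str) -> List[str]:
    """Return every non-overlapping substring of text matched by pattern."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# A pattern made only of ordinary letters matches itself, wherever it occurs.
literal_hits = find_all("engine", "search engine, engineer")

# The metacharacter "." does not match a literal dot here: it stands for
# any single character, so "le." finds every "le" followed by one character.
dot_hits = find_all("le.", "leg let lend")

print(literal_hits)
print(dot_hits)
```

Note that the literal pattern also matches inside "engineer": a pattern with no anchors can match anywhere in the string.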

 

Here's a comprehensive list of all available metacharacters:

 

 

Character  Description                                                                     Example
---------  -----------                                                                     -------
^          Matches the position at the beginning of the string.                            ^B matches "B" but only if it is the first character in the string.
$          Matches the position at the end of the string.                                  p$ matches "p" but only if it is the last character in the string.
.          Matches any single character.                                                   le. matches "leg" and "let".
+          Matches the preceding character 1 or more times.                                ca+t matches "cat" and "caat" but not "ct".
*          Matches the preceding character 0 or more times.                                ca*t matches "ct", "cat", "caat" and so on.
?          Matches the preceding character 0 or 1 times.                                   si?t matches "st" and "sit" only.
[xyz]      Matches any one of the enclosed characters (character set).                     [gdp]ot matches "got", "dot" and "pot".
[^xyz]     Matches any character not enclosed (complementary set).                         [^aeiou] matches any character that is not a lowercase vowel.
[x|y]      Matches either x or y.                                                          wom[a|e]n matches either "woman" or "women".
[a-z]      Matches any character in the specified range (character range).                 [a-z] matches any lowercase letter of the alphabet.
[^a-z]     Matches any character not in the specified range (complementary range).         [^a-z] matches any character that is not a lowercase letter.
\b         Matches a word boundary (the position between a word and a space).              al\b matches the "al" in "general" but not the "al" in "fall".
\B         Matches a nonword boundary.                                                     al\B matches the "al" in "fall" but not the "al" in "general".
\s         Matches any white space character, including space, tab, form-feed, and so on.
\S         Matches any non-white space character.
\d         Matches a digit character. Equivalent to [0-9].
\D         Matches any non-digit character. Equivalent to [^0-9].
\w         Matches any word character, including underscore. Equivalent to [A-Za-z0-9_].
\W         Matches any non-word character. Equivalent to [^A-Za-z0-9_].
{n}        Matches the preceding character exactly n times.                                p{2} does not match the "p" in "peach" but matches the two p's in "apple".
{n,}       Matches the preceding character at least n times.                               p{2,} does not match the "p" in "peach" but matches all the p's in "apppp".
{n,m}      Matches the preceding character at least n and at most m times.                 p{1,3} matches the first three p's in "appppp".
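
Several of the examples in the table can be checked with a short script. The following is a minimal sketch that uses Python's standard re module as a stand-in, on the assumption that its behavior for these basic metacharacters is close to that of the engine used by z/Scope, which may differ in details. The CASES list and the test strings are made up for this illustration.

```python
import re

# (pattern, text to search, substrings expected to be found)
CASES = [
    (r"^B", "Bob", ["B"]),
    (r"p$", "stop", ["p"]),
    (r"ca+t", "ct cat caat", ["cat", "caat"]),
    (r"ca*t", "ct cat caat", ["ct", "cat", "caat"]),
    (r"si?t", "st sit siit", ["st", "sit"]),
    (r"[gdp]ot", "got dot pot hot", ["got", "dot", "pot"]),
    (r"al\b", "general fall", ["al"]),   # only the "al" ending "general"
    (r"al\B", "general fall", ["al"]),   # only the "al" inside "fall"
    (r"p{2}", "peach apple", ["pp"]),
    (r"p{2,}", "peach apppp", ["pppp"]),
    (r"p{1,3}", "appppp", ["ppp", "pp"]),  # first three p's, then the rest
]

# Collect every case whose actual matches differ from the expected ones.
mismatches = [
    (pattern, text)
    for pattern, text, expected in CASES
    if re.findall(pattern, text) != expected
]

# In Python's engine, "|" inside square brackets is an ordinary character,
# so this set also accepts a literal "|" in addition to "a" and "e".
bracket_hits = re.findall(r"wom[a|e]n", "woman women wom|n")

print(mismatches)
print(bracket_hits)
```

An empty mismatches list means that every pattern behaved as the table describes. Because engines differ on finer points such as the handling of "|" inside a set, it is worth testing a Hotspot pattern against real screen text before relying on it.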

 

 

If you need to search for one of the characters that are reserved as metacharacters, you can do so by placing a backslash (\) before the desired character. In this way, for example, \? will match a literal "?" instead of making the preceding character optional, and \$ will match a literal "$" instead of the position at the end of the string.
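
Escaping can be illustrated in the same way. This is again a sketch based on Python's standard re module rather than on z/Scope itself; the sample strings are invented, and re.escape is a Python convenience function with no counterpart implied in z/Scope.

```python
import re

# "\?" matches a literal question mark instead of acting as a quantifier.
question_hits = re.findall(r"\?", "Ready? Continue?")

# "\$" and "\." match a literal dollar sign and a literal dot.
price_hits = re.findall(r"\$\d+\.\d{2}", "Total: $19.99 (was $24.50)")

# Unescaped, "1+1=2?" is read as a pattern ("1" one or more times, and so on),
# so it does not match the text "1+1=2?" itself.
raw_matches = re.fullmatch("1+1=2?", "1+1=2?") is not None

# re.escape adds the backslashes needed to treat the whole text literally.
escaped_matches = re.fullmatch(re.escape("1+1=2?"), "1+1=2?") is not None

print(question_hits, price_hits, raw_matches, escaped_matches)
```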

 

 

Related Topics

 

Creating/Editing a Hotspot