Regular Expressions

Latest revision as of 10:02, 18 May 2026

Introduction

Regular expressions (often called regex or regexp) are sequences of characters that define a search pattern. They are used to find and match strings in text. A regular expression may contain literal characters as well as special characters with a predefined meaning.

Elements of a regular expression

Literal characters

Most characters simply match themselves. The letter a matches the letter "a" in the text.

Anchors (position markers)

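Anchors do not match a character; they match a position in the text.

^ (caret) — matches the start of the string (or the start of each line when the multiline flag is set).
$ (dollar sign) — matches the end of the string (or the end of each line when the multiline flag is set).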
\b — matches a word boundary (the position between a word character and a non-word character).
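
The anchors above can be tried out with a short sketch. It assumes Python's standard re module purely for illustration; the sample strings and variable names are made up for this example.

```python
import re

# ^ anchors the match to the start of the string, $ to the end.
starts_with_cat = re.search(r"^cat", "catalog") is not None  # True
not_at_start = re.search(r"^cat", "bobcat") is not None      # False
ends_with_cat = re.search(r"cat$", "bobcat") is not None     # True

# \b only matches at a word boundary, so the "cat" inside
# "catalog" and "bobcat" is skipped.
whole_words = re.findall(r"\bcat\b", "cat catalog bobcat cat.")

print(starts_with_cat, not_at_start, ends_with_cat)
print(whole_words)  # ['cat', 'cat']
```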

Character classes

Enclosed in square brackets [ ], a character class matches one character from a defined set.

[aeiou] — matches any single vowel.
[a-z] — matches any lowercase letter from a to z (range).
[0-9a-fA-F] — matches any hexadecimal digit.
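
As a minimal sketch of the classes listed above, again assuming Python's re module and invented sample strings:

```python
import re

# Each class matches exactly one character from its set.
vowels = re.findall(r"[aeiou]", "regular")   # ['e', 'u', 'a']
lowercase = re.findall(r"[a-z]", "Ab1cD")    # ['b', 'c']

# fullmatch requires the whole string to match the pattern.
is_hex_digit = re.fullmatch(r"[0-9a-fA-F]", "F") is not None   # True
not_hex_digit = re.fullmatch(r"[0-9a-fA-F]", "G") is not None  # False

print(vowels, lowercase, is_hex_digit, not_hex_digit)
```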

Negated character classes

A caret ^ placed immediately inside a character class negates it, matching any character not in the set.

[^aeiou] — matches any character that is not a vowel.
[^0-9] — matches any character that is not a digit.
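
A short illustration of negated classes, assuming Python's re module. Note that a negated class still matches spaces and punctuation, since those are also "not in the set". The sample strings are made up.

```python
import re

# Everything that is not a vowel, including the space and the digits.
non_vowels = re.findall(r"[^aeiou]", "regex 42")

# Removing every non-digit character from a formatted phone number.
digits_only = re.sub(r"[^0-9]", "", "+49 (30) 1234-567")

print(non_vowels)   # ['r', 'g', 'x', ' ', '4', '2']
print(digits_only)  # 49301234567
```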

Special (shorthand) character classes

Predefined shortcuts for common character sets.

. (dot) — matches any single character except a newline.
\d — matches any digit (same as [0-9]).
\D — matches any non-digit.
\w — matches any word character: letter, digit, or underscore (same as [a-zA-Z0-9_]).
\W — matches any non-word character.
\s — matches any whitespace character (space, tab, newline).
\S — matches any non-whitespace character.
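
The shorthand classes can be sketched as follows, assuming Python's re module. One caveat of that engine: for text strings, \d and \w are Unicode-aware by default, so the equivalences [0-9] and [a-zA-Z0-9_] hold exactly only with the re.ASCII flag. Other engines may behave differently.

```python
import re

text = "Room 12, floor_3\tEast"

digits = re.findall(r"\d", text)                # ['1', '2', '3']
words = re.findall(r"\w+", text)                # ['Room', '12', 'floor_3', 'East']
fields = re.split(r"\s", "a b\tc\nd")           # ['a', 'b', 'c', 'd']
dot_hits = re.findall(r"a.c", "abc a\nc a-c")   # ['abc', 'a-c'] (dot skips the newline)

# U+0663 is an Arabic-Indic digit: matched by \d unless re.ASCII is set.
unicode_digit = re.fullmatch(r"\d", "\u0663") is not None          # True
ascii_digit = re.fullmatch(r"\d", "\u0663", re.ASCII) is not None  # False

print(digits, words, fields, dot_hits, unicode_digit, ascii_digit)
```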

Quantifiers (repetition)

Specify how many times the preceding element must occur.

* — zero or more times.
+ — one or more times.
? — zero or one time (makes the element optional).
{n} — exactly n times.
{n,} — n or more times.
{n,m} — between n and m times (inclusive).
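
To make the quantifiers concrete, the sketch below applies each one to the letter b in the pattern a…c and lists which sample strings match in full. It assumes Python's re module; the helper name matching is hypothetical.

```python
import re

samples = ["ac", "abc", "abbc", "abbbc"]

def matching(pattern):
    """Return the samples that the pattern matches completely."""
    return [s for s in samples if re.fullmatch(pattern, s)]

print(matching(r"ab*c"))      # ['ac', 'abc', 'abbc', 'abbbc']
print(matching(r"ab+c"))      # ['abc', 'abbc', 'abbbc']
print(matching(r"ab?c"))      # ['ac', 'abc']
print(matching(r"ab{2}c"))    # ['abbc']
print(matching(r"ab{2,}c"))   # ['abbc', 'abbbc']
print(matching(r"ab{1,2}c"))  # ['abc', 'abbc']
```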

Greedy vs. lazy quantifiers

By default, quantifiers are greedy — they match as much text as possible. Adding a ? after a quantifier makes it lazy (matches as little as possible).

.* — greedy: matches as many characters as possible.
.*? — lazy: matches as few characters as possible.
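
The difference is easiest to see on text with repeated delimiters. A minimal sketch, assuming Python's re module and a made-up input string:

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the last </b>, so everything is one match.
greedy = re.findall(r"<b>.*</b>", html)

# Lazy: .*? stops at the first </b> it can, giving two matches.
lazy = re.findall(r"<b>.*?</b>", html)

print(greedy)  # ['<b>one</b> and <b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']
```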

Groups

Parentheses ( ) group multiple characters or sub-patterns into a single unit. Groups can be quantified, and their matched content can be referenced later.

Capturing group: (abc) — matches "abc" and remembers the match for later use (back-reference or replacement).
Non-capturing group: (?:abc) — groups the pattern without remembering the match (useful for performance or clarity).
Named group: (?P<name>abc) or (?<name>abc) — a capturing group accessible by name instead of number.
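
The three group types can be sketched as below, assuming Python's re module. That engine accepts only the (?P<name>...) spelling for named groups; the (?<name>...) form belongs to other engines. The sample text is invented.

```python
import re

text = "Date: 2026-05"

# Capturing groups are numbered from 1 in order of their opening parenthesis.
numbered = re.search(r"(\d{4})-(\d{2})", text)
year, month = numbered.group(1), numbered.group(2)

# Named groups can be read by name instead of by number.
named = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", text)
named_year = named.group("year")

# A non-capturing group can be quantified but is not remembered.
captured = re.search(r"(?:ab)+(c)", "ababc").groups()

print(year, month, named_year, captured)  # 2026 05 2026 ('c',)
```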

Back-references

Refer back to the content matched by a previous capturing group.

\1 — matches the same text that was matched by the first capturing group.
\2 — matches the same text as the second group, and so on.
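
A common use of back-references is finding a word that is repeated by mistake. The following is a small sketch under the assumption of Python's re module, with a made-up sentence:

```python
import re

sentence = "this is is a test test case"

# (\w+) captures a word; \1 then requires the same word again.
doubled = re.findall(r"\b(\w+) \1\b", sentence)

# In a replacement string, \1 inserts the captured word once.
cleaned = re.sub(r"\b(\w+) \1\b", r"\1", sentence)

print(doubled)  # ['is', 'test']
print(cleaned)  # this is a test case
```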

Alternation

The pipe | acts as an "or" operator, matching either the pattern on the left or the pattern on the right.

cat|dog — matches "cat" or "dog".
(red|blue) car — matches "red car" or "blue car".
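
The parentheses in the second example matter because alternation has the lowest precedence: without them, the pattern means "red" or "blue car". A brief sketch, assuming Python's re module:

```python
import re

animals = re.findall(r"cat|dog", "cat, dog, bird")  # ['cat', 'dog']

# With the group, both colours are followed by " car".
grouped = re.fullmatch(r"(red|blue) car", "red car") is not None   # True

# Without the group, the alternatives are "red" and "blue car".
ungrouped = re.fullmatch(r"red|blue car", "red car") is not None   # False

print(animals, grouped, ungrouped)
```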

Escape character

The backslash \ removes the special meaning of the following character, allowing it to be matched literally.

\. — matches a literal dot (instead of "any character").
\[ — matches a literal opening bracket.
\\ — matches a literal backslash.

The backslash is also used to write encoded characters:

\n — newline.
\t — tab.
\r — carriage return.
\x20 — the character with hexadecimal code 20 (a space).
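
The effect of escaping can be checked with a few one-liners. This sketch assumes Python's re module; re.escape is that module's helper for escaping a literal string, and other environments may not offer an equivalent.

```python
import re

# An unescaped dot matches any character, so "1.5" also matches "125".
loose = re.search(r"1.5", "125") is not None    # True
strict = re.search(r"1\.5", "125") is not None  # False

# \\ matches one literal backslash; \x20 matches a space.
has_backslash = re.search(r"\\", r"C:\temp") is not None  # True
space_by_code = re.search(r"a\x20b", "a b") is not None   # True

# re.escape adds the backslashes needed to match a string literally.
escaped = re.escape("pbx01.example.com")

print(loose, strict, has_backslash, space_by_code)
print(escaped)  # pbx01\.example\.com
```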

Lookahead and lookbehind (assertions)

These check whether a pattern exists before or after the current position, without consuming any characters.

(?=abc) — positive lookahead: succeeds if "abc" follows.
(?!abc) — negative lookahead: succeeds if "abc" does not follow.
(?<=abc) — positive lookbehind: succeeds if "abc" precedes.
(?<!abc) — negative lookbehind: succeeds if "abc" does not precede.
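
Because assertions consume nothing, the text they check is not part of the match. A minimal sketch of all four forms, assuming Python's re module and invented sample data:

```python
import re

# Positive lookahead: numbers followed by " EUR" (the unit is not captured).
eur_amounts = re.findall(r"\d+(?= EUR)", "10 EUR, 20 USD, 30 EUR")

# Negative lookahead: types starting with CP7 that do not continue with "10".
types = ["CP600", "CP700", "CP700X", "CP710"]
not_cp710 = [t for t in types if re.search(r"^CP7(?!10)", t)]

# Positive and negative lookbehind: numbers with and without a leading "$".
prices = "$5 and 7 and $12"
with_dollar = re.findall(r"(?<=\$)\d+", prices)
without_dollar = re.findall(r"(?<!\$)\b\d+", prices)

print(eur_amounts)     # ['10', '30']
print(not_cp710)       # ['CP700', 'CP700X']
print(with_dollar)     # ['5', '12']
print(without_dollar)  # ['7']
```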

Flags (modifiers)

Flags change how the entire expression behaves. They are typically placed after the closing delimiter (e.g. /pattern/gi).

i — case-insensitive matching.
g — global: find all matches, not just the first.
m — multiline: ^ and $ match the start/end of each line, not just the whole string.
s — single-line (dotall): . also matches newline characters.
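
How flags are written depends on the environment. As an illustration only, Python's re module passes them as arguments (re.IGNORECASE, re.MULTILINE, re.DOTALL) and has no g flag; re.findall plays that role by returning every match.

```python
import re

# i: case-insensitive matching.
ignore_case = re.search(r"cp700", "CP700", re.IGNORECASE) is not None  # True

# m: without it, ^ matches only at the start of the whole string.
lines = "10\n20\n30"
first_only = re.findall(r"^\d+", lines)                # ['10']
every_line = re.findall(r"^\d+", lines, re.MULTILINE)  # ['10', '20', '30']

# s: without it, the dot does not match a newline.
dot_default = re.search(r"a.b", "a\nb") is not None         # False
dot_all = re.search(r"a.b", "a\nb", re.DOTALL) is not None  # True

print(ignore_case, first_only, every_line, dot_default, dot_all)
```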

Examples

Some example regular expressions to be used within OpenScape Endpoint Management.

A group matches a device when all defined filter fields match. If a field is left empty, it is not evaluated and is treated as matching. The fields available are: IP address, Phone number (E.164), Device type and Server address.

IP address

The IP address filter is matched against the device's IP address. Only one IP address is evaluated per device.

Regular Expression	Description
192\.168\.1\.	All addresses starting with 192.168.1. (entire /24 subnet)
^10\.0\.0\.(1[0-9]|2[0-9]|30)$	10.0.0.10 through 10.0.0.30
^192\.168\.0\.(2[5-9]|3[0-9])$	192.168.0.25 through 192.168.0.39
^192\.168\.0\.([5-9][0-9]|[0-9]{3})$	192.168.0.50 through 192.168.0.255
^172\.16\.(0|1)\.	Two subnets: 172.16.0.0/24 and 172.16.1.0/24
^(192\.168\.1|192\.168\.2|10\.0\.0)\.	Multiple ranges: 192.168.1.x, 192.168.2.x and 10.0.0.x
^192\.168\.[0-9]{1,3}\.[0-9]{1,3}$	All addresses within 192.168.0.0/16
^10\.	All addresses starting with 10. (entire 10.0.0.0/8 network)

Tips for IP ranges:

Always escape dots with \. — an unescaped dot matches any character.
Use ^ and $ anchors for exact matching to prevent partial matches (e.g. 10\.0\.0\. without ^ would also match 110.0.0.x or 210.0.0.x, and ^192\.168\.1 without the trailing \. would also match 192.168.10.x or 192.168.11.x).
To match multiple subnets, use alternation with parentheses: ^(subnet1|subnet2)\.
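
Range patterns like the ones in the table are easy to get wrong, so it can help to check them against every possible last octet before using them. The sketch below does this with Python's re module. It assumes the filter behaves like a substring search (re.search); the engine the product actually uses is not stated here, and the helper name matching_last_octets is hypothetical.

```python
import re

def matching_last_octets(pattern, prefix):
    """Return every last octet 0-255 for which prefix + octet matches."""
    return [n for n in range(256) if re.search(pattern, f"{prefix}{n}")]

range_10_30 = matching_last_octets(
    r"^10\.0\.0\.(1[0-9]|2[0-9]|30)$", "10.0.0.")
range_50_255 = matching_last_octets(
    r"^192\.168\.0\.([5-9][0-9]|[0-9]{3})$", "192.168.0.")

print(range_10_30 == list(range(10, 31)))    # True
print(range_50_255 == list(range(50, 256)))  # True

# Without ^, the pattern can match in the middle of another address.
partial = re.search(r"10\.0\.0\.", "110.0.0.7") is not None    # True
anchored = re.search(r"^10\.0\.0\.", "110.0.0.7") is not None  # False

# An unescaped dot also matches a digit, so a neighbouring subnet slips in.
unescaped = re.search(r"^192\.168\.1.", "192.168.10.5") is not None  # True
escaped = re.search(r"^192\.168\.1\.", "192.168.10.5") is not None   # False

print(partial, anchored, unescaped, escaped)
```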

Phone number (E.164)

The phone number filter is matched against all known phone numbers of a device. This includes the E.164 number, the basic E.164, the HFA number, subscriber number, and registration phone number depending on the software type.

Regular Expression	Description
^1234	All numbers starting with 1234
^49301	All numbers with country code 49 and area code 301 (international format)
5[0-9]{3}$	All numbers ending in a 4-digit number starting with 5 (5000–5999)
^(100|101|102)$	Exactly 100, 101 or 102
^[1-3][0-9]{2}$	All 3-digit numbers from 100 to 399
^49	All numbers starting with country code 49 (Germany)
200[0-9]$	All numbers ending in 2000 through 2009
^(1[0-9]{3}|2[0-9]{3})$	All 4-digit numbers from 1000 to 2999
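
A few of the patterns from the table can be checked against sample numbers as follows. This is a sketch only: it assumes search-style matching as in Python's re.search, and both the number list and the helper name select are made up for the example.

```python
import re

numbers = ["100", "101", "399", "400", "1000", "2999", "3000",
           "49301234", "12345000"]

def select(pattern):
    """Return the sample numbers that the pattern matches."""
    return [n for n in numbers if re.search(pattern, n)]

print(select(r"^[1-3][0-9]{2}$"))          # ['100', '101', '399']
print(select(r"^(100|101|102)$"))          # ['100', '101']
print(select(r"^(1[0-9]{3}|2[0-9]{3})$"))  # ['1000', '2999']
print(select(r"^49"))                      # ['49301234']
print(select(r"5[0-9]{3}$"))               # ['12345000']
```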

Device type

The device type filter is matched against the hardware type identifier of the device (e.g. CP600, CP700, CP700X, CP710, HG3500, PE, FUSION).

Regular Expression	Description
CP	All device types containing CP (matches CP600, CP700, CP700X, CP710)
CP[67].*	Matches CP600, CP700, CP700X and CP710
^CP700$	Matches CP700 only (not CP700X)
^CP7	Matches CP700, CP700X and CP710
^(CP700|CP710)$	Matches exactly CP700 or CP710
^CP700(X)?$	Matches CP700 and CP700X but not CP710
^HG3500$	Matches only HG3500 gateway devices
^(PE|FUSION)$	Matches PE or FUSION devices
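
The effect of anchoring on the device type patterns can be reproduced with the type identifiers listed above. The sketch assumes search-style matching as in Python's re.search; the helper name select is hypothetical.

```python
import re

types = ["CP600", "CP700", "CP700X", "CP710", "HG3500", "PE", "FUSION"]

def select(pattern):
    """Return the device types that the pattern matches."""
    return [t for t in types if re.search(pattern, t)]

print(select(r"CP"))             # ['CP600', 'CP700', 'CP700X', 'CP710']
print(select(r"^CP700$"))        # ['CP700']
print(select(r"^CP7"))           # ['CP700', 'CP700X', 'CP710']
print(select(r"^CP700(X)?$"))    # ['CP700', 'CP700X']
print(select(r"^(PE|FUSION)$"))  # ['PE', 'FUSION']
```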

Server address

The server address filter is matched against the server addresses configured on the device. These include the registration address, registrar address, signaling gateway address and backup address.

Regular Expression	Description
^10\.1\.1\.100$	Server address is exactly 10.1.1.100
^pbx01\.example\.com$	Server is exactly pbx01.example.com
\.example\.com$	Any server ending in .example.com
^(pbx01|pbx02)\.example\.com$	Server is pbx01.example.com or pbx02.example.com
^10\.1\.(1|2)\.	Server in subnet 10.1.1.x or 10.1.2.x
example\.com	Any server address containing example.com

Combining filters

When multiple fields are defined on a group, a device must match all of them. For example:

IP	E.164	Device Type	Server	Result
`^192\.168\.1\.`	`^1[0-9]{3}$`	`^CP7`		All CP7xx devices in subnet 192.168.1.x with 4-digit numbers starting with 1
	`^49301`		`^pbx01\.`	All devices with numbers starting with 49301 registered to pbx01
`^10\.`		`^(PE|FUSION)$`		All PE and FUSION devices in the 10.x.x.x network
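
The "all defined fields must match, empty fields are skipped" rule can be modelled in a few lines. This is only a sketch of the logic as described, not the product's implementation: the Device class, the field names and the function matches_group are hypothetical, matching is assumed to behave like Python's re.search, and each field holds a single value, whereas a real device may have several phone numbers and server addresses.

```python
import re
from dataclasses import dataclass

@dataclass
class Device:
    ip: str
    e164: str
    device_type: str
    server: str

def matches_group(device, filters):
    """Return True if every non-empty filter matches its device field."""
    for field_name, pattern in filters.items():
        if not pattern:
            continue  # an empty filter is not evaluated
        if re.search(pattern, getattr(device, field_name)) is None:
            return False
    return True

# First row of the table: CP7xx devices in 192.168.1.x with numbers 1000-1999.
group = {"ip": r"^192\.168\.1\.", "e164": r"^1[0-9]{3}$",
         "device_type": r"^CP7", "server": ""}

phone_a = Device("192.168.1.20", "1234", "CP710", "pbx01.example.com")
phone_b = Device("192.168.1.21", "1235", "CP600", "pbx01.example.com")

print(matches_group(phone_a, group))  # True
print(matches_group(phone_b, group))  # False (device type does not match)
```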

Testing your regular expressions

If you want to test your regular expression, there are plenty of websites that allow you to do this online.
