Regular Expressions
The Wiki of Unify contains information on clients and devices, communications systems and unified communications. - Unify GmbH & Co. KG is a Trademark Licensee of Siemens AG.
Latest revision as of 10:02, 18 May 2026
Contents
- 1 Introduction
- 2 Elements of a regular expression
- 2.1 Literal characters
- 2.2 Anchors (position markers)
- 2.3 Character classes
- 2.4 Negated character classes
- 2.5 Special (shorthand) character classes
- 2.6 Quantifiers (repetition)
- 2.7 Greedy vs. lazy quantifiers
- 2.8 Groups
- 2.9 Back-references
- 2.10 Alternation
- 2.11 Escape character
- 2.12 Lookahead and lookbehind (assertions)
- 2.13 Flags (modifiers)
- 3 Examples
- 4 Testing your regular expressions
Introduction
Regular expressions (often called regex or regexp) are sequences of characters that define a search pattern. They are used to find and match strings in text. A regular expression may contain literal characters and special characters with a predefined meaning.
Elements of a regular expression
Literal characters
Most characters simply match themselves. The letter a matches the letter "a" in the text.
Anchors (position markers)
Anchors do not match a character; they match a position in the text.
- ^ (caret): matches the start of the text, or the start of each line in multiline mode.
- $ (dollar sign): matches the end of the text, or the end of each line in multiline mode.
- \b: matches a word boundary (the position between a word character and a non-word character).
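As a minimal sketch of how anchors behave, the following uses Python's re module. The choice of Python is an assumption made for illustration only; this page does not state which regex engine is used elsewhere, and the sample text is made up.

```python
import re

text = "cat concatenate cat"

# ^ anchors the match at the start of the text.
starts_with_cat = re.search(r"^cat", text) is not None
# $ anchors the match at the end of the text.
ends_with_cat = re.search(r"cat$", text) is not None
# \b requires a word boundary on both sides, so the "cat" inside
# "concatenate" is skipped.
whole_words = re.findall(r"\bcat\b", text)
# Without anchors, every occurrence is found, including the embedded one.
all_occurrences = re.findall(r"cat", text)
```

Here whole_words holds two matches while all_occurrences holds three, because the unanchored pattern also finds the "cat" inside "concatenate".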
Character classes
Enclosed in square brackets [ ], a character class matches one character from a defined set.
- [aeiou]: matches any single vowel.
- [a-z]: matches any lowercase letter from a to z (range).
- [0-9a-fA-F]: matches any hexadecimal digit.
Negated character classes
A caret ^ placed immediately inside a character class negates it, matching any character not in the set.
- [^aeiou]: matches any character that is not a vowel.
- [^0-9]: matches any character that is not a digit.
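Character classes and their negated forms can be tried out with a short sketch like the one below. It uses Python's re module as an assumed illustration language, and the sample string is invented.

```python
import re

text = "Order A7 shipped"

# [aeiou] matches lowercase vowels only; the capital "O" and "A" are skipped.
vowels = re.findall(r"[aeiou]", text)
# [0-9a-fA-F] matches every character that is a valid hexadecimal digit.
hex_digits = re.findall(r"[0-9a-fA-F]", text)
# [^0-9] matches everything that is NOT a digit; removing those
# characters leaves only the digits.
digits_only = re.sub(r"[^0-9]", "", text)
```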
Special (shorthand) character classes
Predefined shortcuts for common character sets.
- . (dot): matches any single character except a newline.
- \d: matches any digit (same as [0-9]).
- \D: matches any non-digit.
- \w: matches any word character: letter, digit, or underscore (same as [a-zA-Z0-9_]).
- \W: matches any non-word character.
- \s: matches any whitespace character (space, tab, newline).
- \S: matches any non-whitespace character.
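The shorthand classes can be illustrated with a small sketch in Python (an assumed illustration language; the sample line is made up). Note that in Python 3, \d and \w also match non-ASCII digits and letters by default; the ASCII-only sets listed above apply when the re.ASCII flag is set.

```python
import re

line = "Room 42, Floor 3"

# \d+ collects runs of digits.
numbers = re.findall(r"\d+", line)
# \w+ collects runs of word characters (letters, digits, underscore).
words = re.findall(r"\w+", line)
# \s+ splits on runs of whitespace; punctuation stays attached.
parts = re.split(r"\s+", line)
# The dot matches any character except a newline.
no_newline = re.findall(r".", "a\nb")
```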
Quantifiers (repetition)
Specify how many times the preceding element must occur.
- *: zero or more times.
- +: one or more times.
- ?: zero or one time (makes the element optional).
- {n}: exactly n times.
- {n,}: n or more times.
- {n,m}: between n and m times (inclusive).
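To compare the quantifiers side by side, the sketch below checks which of a few sample strings are matched in full by each pattern. It uses Python's re module as an assumed illustration language; the helper name matching is hypothetical.

```python
import re

samples = ["", "7", "77", "777", "7777"]

def matching(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

zero_or_more = matching(r"7*")      # every sample, including the empty string
one_or_more = matching(r"7+")       # everything except the empty string
optional = matching(r"7?")          # the empty string and "7"
exactly_two = matching(r"7{2}")     # only "77"
two_or_more = matching(r"7{2,}")    # "77", "777", "7777"
two_to_three = matching(r"7{2,3}")  # "77", "777"
```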
Greedy vs. lazy quantifiers
By default, quantifiers are greedy — they match as much text as possible. Adding a ? after a quantifier makes it lazy (matches as little as possible).
- .*: greedy; matches as many characters as possible.
- .*?: lazy; matches as few characters as possible.
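The difference becomes visible when a delimiter occurs more than once in the text. A minimal sketch in Python (assumed illustration language, made-up input):

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy: .* runs to the LAST </b>, so a single match spans both elements.
greedy = re.findall(r"<b>.*</b>", html)
# Lazy: .*? stops at the FIRST </b>, so each element is matched separately.
lazy = re.findall(r"<b>.*?</b>", html)
```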
Groups
Parentheses ( ) group multiple characters or sub-patterns into a single unit. Groups can be quantified, and their matched content can be referenced later.
- Capturing group: (abc) matches "abc" and remembers the match for later use (back-reference or replacement).
- Non-capturing group: (?:abc) groups the pattern without remembering the match (useful for performance or clarity).
- Named group: (?P<name>abc) or (?<name>abc) is a capturing group accessible by name instead of number.
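The three group types can be combined in one pattern, as in the sketch below. It assumes Python, whose re module accepts the (?P<name>...) form of named groups but not (?<name>...); the sample number and the group name area are made up.

```python
import re

# One named group, one non-capturing group, one plain capturing group.
pattern = re.compile(r"(?P<area>\d{3})-(?:\d{3})-(\d{4})")
m = pattern.fullmatch("555-123-4567")

by_name = m.group("area")   # named groups are accessible by name ...
by_number = m.group(1)      # ... and still count as numbered groups
last_block = m.group(2)     # the non-capturing group is skipped in numbering
captured = m.groups()

# A group can be quantified as a unit: (ab)+ matches "ab" repeated.
repeated = re.fullmatch(r"(ab)+", "ababab") is not None
```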
Back-references
Refer back to the content matched by a previous capturing group.
- \1: matches the same text that was matched by the first capturing group.
- \2: matches the same text as the second group, and so on.
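A common use of back-references is finding accidentally doubled words. The following is a sketch in Python (assumed illustration language, invented sample sentence):

```python
import re

# (\w+) captures a word; \1 then requires the same word again after a space.
doubled = re.compile(r"\b(\w+) \1\b")
text = "this is is a test test sentence"

repeats = [m.group(1) for m in doubled.finditer(text)]
# In the replacement string, \1 inserts the captured word once.
fixed = doubled.sub(r"\1", text)
```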
Alternation
The pipe | acts as an "or" operator, matching either the pattern on the left or the pattern on the right.
- cat|dog: matches "cat" or "dog".
- (red|blue) car: matches "red car" or "blue car".
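The parentheses in the second example matter because alternation applies to everything on either side of the pipe. A short sketch in Python (assumed illustration language) shows the difference:

```python
import re

ungrouped = r"red|blue car"    # means: "red" OR "blue car"
grouped = r"(red|blue) car"    # means: ("red" OR "blue") followed by " car"

ungrouped_red_car = re.fullmatch(ungrouped, "red car") is not None
ungrouped_red = re.fullmatch(ungrouped, "red") is not None
grouped_red_car = re.fullmatch(grouped, "red car") is not None
grouped_blue_car = re.fullmatch(grouped, "blue car") is not None
```

Without the group, "red car" is not matched in full, because the only alternatives are "red" and "blue car".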
Escape character
The backslash \ removes the special meaning of the following character, allowing it to be matched literally.
- \.: matches a literal dot (instead of "any character").
- \[: matches a literal opening bracket.
- \\: matches a literal backslash.
The backslash also introduces escape sequences for characters that are hard to type directly:
- \n: newline.
- \t: tab.
- \r: carriage return.
- \x20: the character with hexadecimal code 20 (a space).
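The effect of escaping can be checked with a sketch like the following. It assumes Python, where re.escape adds the backslashes automatically; the host name is an example value.

```python
import re

host = "pbx01.example.com"

# Unescaped, each dot matches ANY character, so this look-alike passes.
unescaped_hit = re.fullmatch(host, "pbx01xexample.com") is not None
# re.escape() puts a backslash before each dot.
escaped_pattern = re.escape(host)
escaped_hit = re.fullmatch(escaped_pattern, "pbx01xexample.com") is not None
exact_hit = re.fullmatch(escaped_pattern, host) is not None

# Escape sequences: \x20 is a space, \\ is one literal backslash.
space_hit = re.fullmatch(r"\x20", " ") is not None
backslash_hit = re.fullmatch(r"\\", "\\") is not None
```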
Lookahead and lookbehind (assertions)
These check whether a pattern exists before or after the current position, without consuming any characters.
- (?=abc): positive lookahead; succeeds if "abc" follows.
- (?!abc): negative lookahead; succeeds if "abc" does not follow.
- (?<=abc): positive lookbehind; succeeds if "abc" precedes.
- (?<!abc): negative lookbehind; succeeds if "abc" does not precede.
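The four assertions can be sketched as follows in Python (assumed illustration language; all sample data is made up). In each case the asserted text is checked but not included in the match.

```python
import re

# Positive lookahead: names that are followed by ".txt".
txt_names = re.findall(r"\w+(?=\.txt)", "a.txt b.pdf c.txt")

# Negative lookahead: addresses that do NOT start with "10.".
ips = ["10.0.0.1", "192.168.1.5", "100.1.1.1"]
not_in_10 = [ip for ip in ips if re.search(r"^(?!10\.)", ip)]

# Positive lookbehind: amounts that are preceded by "EUR".
eur_amounts = re.findall(r"(?<=EUR)\d+", "EUR100 USD200 EUR300")

# Negative lookbehind: numbers that are not preceded by a minus sign.
non_negative = re.findall(r"(?<!-)\b\d+\b", "5 -3 8")
```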
Flags (modifiers)
Flags change how the entire expression behaves. They are typically placed after the closing delimiter (e.g. /pattern/gi).
- i: case-insensitive matching.
- g: global; find all matches, not just the first.
- m: multiline; ^ and $ match the start/end of each line, not just the whole string.
- s: single-line (dotall); . also matches newline characters.
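Not every environment uses the /pattern/flags notation. As a sketch, Python (assumed here for illustration) passes flags as arguments and has no g flag; re.search returns the first match and re.findall returns all of them. The sample text is invented.

```python
import re

text = "Error: disk full\nerror: retry\nDone"

# i -> re.IGNORECASE
any_case = re.findall(r"error", text, flags=re.IGNORECASE)
# "g" has no flag in Python: search() finds the first match only.
first_only = re.search(r"error", text, flags=re.IGNORECASE).group()
# m -> re.MULTILINE: ^ matches at the start of every line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
text_start = re.findall(r"^\w+", text)
# s -> re.DOTALL: the dot also matches the newline between the lines.
across_lines_default = re.search(r"full.error", text) is not None
across_lines_dotall = re.search(r"full.error", text, flags=re.DOTALL) is not None
```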
Examples
The following example regular expressions can be used within OpenScape Endpoint Management.
A group matches a device when all defined filter fields match. If a field is left empty, it is not evaluated and is treated as matching. The fields available are: IP address, Phone number (E.164), Device type and Server address.
IP address
The IP address filter is matched against the device's IP address. Only one IP address is evaluated per device.
| Regular Expression | Description |
|---|---|
| 192\.168\.1\. | All addresses starting with 192.168.1. (entire /24 subnet) |
| ^10\.0\.0\.(1[0-9]|2[0-9]|30)$ | 10.0.0.10 through 10.0.0.30 |
| ^192\.168\.0\.(2[5-9]|3[0-9])$ | 192.168.0.25 through 192.168.0.39 |
| ^192\.168\.0\.([5-9][0-9]|[0-9]{3})$ | 192.168.0.50 through 192.168.0.255 |
| ^172\.16\.(0|1)\. | Two subnets: 172.16.0.0/24 and 172.16.1.0/24 |
| ^(192\.168\.1|192\.168\.2|10\.0\.0)\. | Multiple ranges: 192.168.1.x, 192.168.2.x and 10.0.0.x |
| ^192\.168\.[0-9]{1,3}\.[0-9]{1,3}$ | All addresses within 192.168.0.0/16 |
| ^10\. | All addresses starting with 10. (entire 10.0.0.0/8 network) |
Tips for IP ranges:
- Always escape dots with \. because an unescaped dot matches any character.
- Use ^ and $ anchors for exact matching to prevent partial matches (e.g. 10\.0\.0\. without a leading ^ would also match 110.0.0.x, and 192\.168\.1 without the trailing \. would also match 192.168.10.x or 192.168.11.x).
- To match multiple subnets, use alternation with parentheses: ^(subnet1|subnet2)\.
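Before deploying an IP filter, it can help to run it against a few addresses that should and should not match. The sketch below does this in Python. It assumes that an unanchored pattern may match anywhere in the address (search semantics), as the tips above suggest; the helper matches is hypothetical and is not part of OpenScape Endpoint Management.

```python
import re

def matches(pattern, address):
    """Assumed filter semantics: the pattern may match anywhere in the address."""
    return re.search(pattern, address) is not None

# Range check with both anchors.
range_10_to_30 = r"^10\.0\.0\.(1[0-9]|2[0-9]|30)$"
candidates = ["10.0.0.9", "10.0.0.10", "10.0.0.30", "10.0.0.31", "110.0.0.12"]
in_range = [a for a in candidates if matches(range_10_to_30, a)]

# Pitfall 1: unescaped dots match any character, including a digit.
escaped_hit = matches(r"^192\.168\.1\.", "192.168.10.5")
unescaped_hit = matches(r"^192.168.1.", "192.168.10.5")

# Pitfall 2: without ^ the pattern can match in the middle of an address.
anchored_hit = matches(r"^10\.0\.0\.", "110.0.0.12")
unanchored_hit = matches(r"10\.0\.0\.", "110.0.0.12")
```

Only 10.0.0.10 and 10.0.0.30 pass the range check, and both pitfall patterns match an address they were not meant to match.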
Phone number (E.164)
The phone number filter is matched against all known phone numbers of a device. This includes the E.164 number, the basic E.164, the HFA number, subscriber number, and registration phone number depending on the software type.
| Regular Expression | Description |
|---|---|
| ^1234 | All numbers starting with 1234 |
| ^49301 | All numbers with country code 49 and area code 301 (international format) |
| 5[0-9]{3}$ | All numbers ending in a 4-digit number starting with 5 (5000–5999) |
| ^(100|101|102)$ | Exactly 100, 101 or 102 |
| ^[1-3][0-9]{2}$ | All 3-digit numbers from 100 to 399 |
| ^49 | All numbers starting with country code 49 (Germany) |
| 200[0-9]$ | All numbers ending in 2000 through 2009 |
| ^(1[0-9]{3}|2[0-9]{3})$ | All 4-digit numbers from 1000 to 2999 |
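Because the filter is checked against several numbers per device, one possible reading is that the filter matches as soon as any of those numbers matches. The sketch below models that reading in Python. It is an assumption, not a description of the product's implementation; the function name and the sample numbers are hypothetical.

```python
import re

def phone_filter_matches(pattern, numbers):
    """Assumed semantics: the filter matches if ANY known number matches."""
    return any(re.search(pattern, number) for number in numbers)

# Hypothetical device with an E.164 number and a subscriber number.
device_numbers = ["4930112345", "12345"]

by_country_and_area = phone_filter_matches(r"^49301", device_numbers)
by_prefix = phone_filter_matches(r"^1234", device_numbers)
three_digit_only = phone_filter_matches(r"^[1-3][0-9]{2}$", device_numbers)
```

Under this reading, ^1234 matches the device through its subscriber number even though the E.164 number starts with 49.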
Device type
The device type filter is matched against the hardware type identifier of the device (e.g. CP600, CP700, CP700X, CP710, HG3500, PE, FUSION).
| Regular Expression | Description |
|---|---|
| CP | All device types containing CP (matches CP600, CP700, CP700X, CP710) |
| CP[67].* | Matches CP600, CP700, CP700X and CP710 |
| ^CP700$ | Matches CP700 only (not CP700X) |
| ^CP7 | Matches CP700, CP700X and CP710 |
| ^(CP700|CP710)$ | Matches exactly CP700 or CP710 |
| ^CP700(X)?$ | Matches CP700 and CP700X but not CP710 |
| ^HG3500$ | Matches only HG3500 gateway devices |
| ^(PE|FUSION)$ | Matches PE or FUSION devices |
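The table entries can be verified against the device types named above with a short sketch in Python. Search semantics for unanchored patterns are an assumption here, and the helper select is hypothetical.

```python
import re

device_types = ["CP600", "CP700", "CP700X", "CP710", "HG3500", "PE", "FUSION"]

def select(pattern):
    """Return the device types that the pattern matches (assumed search semantics)."""
    return [t for t in device_types if re.search(pattern, t)]

contains_cp = select(r"CP")
cp700_only = select(r"^CP700$")
cp700_family = select(r"^CP700(X)?$")
cp7_series = select(r"^CP7")
pe_or_fusion = select(r"^(PE|FUSION)$")
```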
Server address
The server address filter is matched against the server addresses configured on the device. These include the registration address, registrar address, signaling gateway address and backup address.
| Regular Expression | Description |
|---|---|
| ^10\.1\.1\.100$ | Server address is exactly 10.1.1.100 |
| ^pbx01\.example\.com$ | Server is exactly pbx01.example.com |
| \.example\.com$ | Any server ending in .example.com |
| ^(pbx01|pbx02)\.example\.com$ | Server is pbx01.example.com or pbx02.example.com |
| ^10\.1\.(1|2)\. | Server in subnet 10.1.1.x or 10.1.2.x |
| example\.com | Any server address containing example.com |
Combining filters
When multiple fields are defined on a group, a device must match all of them. For example:
| IP | E.164 | Device Type | Server | Result |
|---|---|---|---|---|
| ^192\.168\.1\. | ^1[0-9]{3}$ | ^CP7 | | All CP7xx devices in subnet 192.168.1.x with 4-digit numbers starting with 1 |
| | ^49301 | | ^pbx01\. | All devices with numbers starting with 49301 registered to pbx01 |
| ^10\. | | ^(PE|FUSION)$ | | All PE and FUSION devices in the 10.x.x.x network |
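The combination rule (all defined fields must match, empty fields count as matching) can be modeled with a small sketch. It is written in Python as an illustration only: the Device class, the field names and both functions are hypothetical, and matching any one of several numbers or server addresses is an assumption rather than documented behavior.

```python
import re
from dataclasses import dataclass, field

@dataclass
class Device:
    """Hypothetical device record used only for this illustration."""
    ip: str
    device_type: str
    numbers: list = field(default_factory=list)
    servers: list = field(default_factory=list)

def field_matches(pattern, values):
    # An empty or missing filter is not evaluated and counts as matching.
    if not pattern:
        return True
    # Assumption: one matching value is enough for the field to match.
    return any(re.search(pattern, value) for value in values)

def group_matches(filters, device):
    # A group matches only if every defined field matches.
    return all([
        field_matches(filters.get("ip"), [device.ip]),
        field_matches(filters.get("e164"), device.numbers),
        field_matches(filters.get("device_type"), [device.device_type]),
        field_matches(filters.get("server"), device.servers),
    ])

phone = Device(ip="192.168.1.20", device_type="CP710",
               numbers=["1234"], servers=["pbx01.example.com"])

row1 = {"ip": r"^192\.168\.1\.", "e164": r"^1[0-9]{3}$", "device_type": r"^CP7"}
row2 = {"e164": r"^49301", "server": r"^pbx01\."}
row3 = {"ip": r"^10\.", "device_type": r"^(PE|FUSION)$"}

results = [group_matches(row, phone) for row in (row1, row2, row3)]
```

For this sample device only the first row matches: the second row fails on the phone number, and the third fails on both the IP address and the device type.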
Testing your regular expressions
To test a regular expression before using it, you can use one of the many websites that offer online regex testing.



