Working with Regular Expressions in C# - Lots of Examples
(Page 4 of 5)
Let’s look at a fairly complete list of the regular expression syntax elements that you can use to build any kind of pattern, from the simplest to the most complex and cryptic-looking.
. | matches any single character, except new line |
* | matches the preceding element zero or more times |
+ | matches the preceding element one or more times |
? | matches the preceding element zero times or exactly once |
$ | matches the end of the input data string |
^ | matches the beginning of the input data string |
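\< | matches the empty string at the beginning of a word (start-of-the-word) in GNU-style engines; .NET treats \< as a literal <, so use \b there instead |
\> | matches the empty string at the end of a word (end-of-the-word) in GNU-style engines; .NET treats \> as a literal >, so use \b there instead |
\ | escapes the next character: a metacharacter is matched literally (\$ matches a dollar sign), while certain letters gain a special meaning (\n, \d) |
( ) | groups the pattern enclosed in the parentheses and captures its match: (pattern) |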
[ ] | matches any of the enclosed characters in the set |
[^] | matches any of the characters that are not enclosed in the set |
{n} | matches the preceding element exactly n times |
{n,} | matches the preceding element at least n times |
\n | matches a new line character |
\r | matches a carriage return character |
\f | matches a form-feed character |
\t | matches a tab character |
\v | matches a vertical tab character |
\s | matches any white space character; for ASCII input, equivalent to [ \n\r\f\t\v] |
\S | matches any non-white space character; for ASCII input, equivalent to [^ \n\r\f\t\v] |
\w | matches any word character: letters, digits, and the underscore |
\W | matches any non-word character, the negation of the above |
\b | matches a word boundary: the position between a word character and a non-word character |
\B | matches any position that is not a word boundary, the negation of the above |
\d | matches any digit character; for ASCII input, equivalent to [0-9] |
\D | matches any non-digit character; for ASCII input, equivalent to [^0-9] |
\e | matches the escape character (\x1B) |
\__ | matches the character with the octal value specified in place of __; 2 or 3 digits long in .NET, because a single digit is read as a back reference |
\x__ | matches the character with the given hexadecimal value; the hex value must be 2 digits long |
\digit | back reference operator; matches the same text that the digit-th grouping operator captured, so it always follows that grouping operator |
- | range operator; used when specifying ranges in sets such as [0-9] |
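A few of these elements are easier to grasp when combined. The following is a minimal sketch, not a definitive implementation: the class name `SyntaxElementsDemo` and its three helper methods are hypothetical names chosen for illustration. It uses only `Regex.Match` and `Regex.IsMatch` from `System.Text.RegularExpressions`, and it writes every pattern as a verbatim string (`@"..."`) so that the backslashes reach the regex engine instead of being consumed by the C# compiler.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SyntaxElementsDemo
{
    // \b(\w+)\s+\1\b : a whole word, some white space, then the same word again.
    // ( ) captures the first word and \1 is the back reference to that capture.
    public static string FindRepeatedWord(string input)
    {
        Match m = Regex.Match(input, @"\b(\w+)\s+\1\b");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    // ^\d{3}-\d{4}$ : exactly three digits, a hyphen, exactly four digits,
    // anchored to the beginning and the end of the input.
    public static bool IsThreeDashFourDigits(string input)
    {
        return Regex.IsMatch(input, @"^\d{3}-\d{4}$");
    }

    // \$ : the backslash turns the end-of-input anchor into a literal dollar sign.
    public static bool ContainsDollarSign(string input)
    {
        return Regex.IsMatch(input, @"\$");
    }

    public static void Main()
    {
        Console.WriteLine(FindRepeatedWord("this is is a test"));  // is
        Console.WriteLine(IsThreeDashFourDigits("555-1234"));      // True
        Console.WriteLine(IsThreeDashFourDigits("5555-1234"));     // False
        Console.WriteLine(ContainsDollarSign("costs $5"));         // True
    }
}
```

Note that the "is" inside "this" is not reported as a repeated word, because `\b` requires the match to start at a word boundary.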
A few of the elements above are worth illustrating with practical patterns. For each one, we’ll state the pattern and describe what it matches. What we need to understand is how the patterns work and what exactly each of them does. Once we know that, we can do a myriad of other things using the classes presented earlier.
"d.v" -> matches “dev”, “d5v”… the dot is a placeholder for any single character.
"dev*" -> matches “de” but also “dev”, “devv”, or “devvvvv”, etc.
"dev+" -> matches “dev”, or “devvvvv”, and so forth.
"de?v" -> matches “dv” and “dev”
"dev|shed" -> matches either of the two: dev OR shed.
"dev[0-9]" -> matches “dev0”, “dev1”, “dev2”, … “dev8”, “dev9”.
"dev[^0-9]" -> matches “deva”, “devb”, anything but a digit at the end.
"shed\b" -> matches any word ending in “shed”
"\sshed" -> matches “shed” preceded by a white space character: a tab, a new line, a space…
"\da*" -> matches a single digit optionally followed by any number of “a” characters
"^5[1-5][0-9]{14}$" -> matches 16-digit numbers that start with 51 through 55, the classic MASTERCARD format; it checks the format only, not whether the card number is actually valid.
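To try these patterns from C#, something along the following lines should work. Treat it as a sketch under stated assumptions: `PatternExamples`, `Test`, and `LooksLikeClassicMasterCard` are hypothetical names introduced here, and the card numbers are sample values picked only because they have the right shape. Patterns that contain a backslash are written as verbatim strings (`@"..."`); in a regular C# string literal, `"\b"` would become a backspace character before the regex engine ever saw it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternExamples
{
    // Returns true when the pattern matches anywhere in the input.
    public static bool Test(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern);
    }

    // Format check only: 16 digits starting with 51 through 55.
    // It does not verify that the number belongs to a real card.
    public static bool LooksLikeClassicMasterCard(string number)
    {
        return Regex.IsMatch(number, @"^5[1-5][0-9]{14}$");
    }

    public static void Main()
    {
        Console.WriteLine(Test(@"d.v", "d5v"));               // True
        Console.WriteLine(Test(@"de?v", "dv"));               // True
        Console.WriteLine(Test(@"dev|shed", "woodshed"));     // True
        Console.WriteLine(Test(@"dev[0-9]", "dev9"));         // True
        Console.WriteLine(Test(@"dev[^0-9]", "dev7"));        // False
        Console.WriteLine(Test(@"shed\b", "devshed rocks"));  // True
        Console.WriteLine(Test(@"shed\b", "sheds"));          // False
        Console.WriteLine(Test(@"\sshed", "dev\tshed"));      // True

        // Match exposes the text that was actually matched.
        Console.WriteLine(Regex.Match("x7aa9", @"\da*").Value);              // 7aa

        Console.WriteLine(LooksLikeClassicMasterCard("5105105105105100"));   // True
        Console.WriteLine(LooksLikeClassicMasterCard("4111111111111111"));   // False
    }
}
```

Only the card pattern is anchored with ^ and $, so the other patterns succeed whenever the match occurs anywhere inside the input string.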
Furthermore, I’d like to recommend downloading and trying out the following free application: Rad Software Regular Expression Designer. I am not affiliated with the author(s) of this nifty application, nor do I have any relation to them. All I know is that it is small, efficient, and very practical. And I mean it. It is an amazing utility that helps you practice and deepen your knowledge of regular expressions.
It lets you specify a regular expression, add text as input string(s), and simulate the pattern matching and/or replacing process. It also makes fixing your regular expressions easier, because after every change you can simply run the match again. It also has a tree view that contains all of the language elements, from which you can pick them one by one. Check it out!
More By Barzan "Tony" Antal