Metacharacters
Character | What does it do? |
---|---|
$ | Matches the end of the input. If in multiline mode, it also matches before a line break character, hence every end of line. |
(?:x) | Matches 'x' but does NOT remember the match. Also known as non-capturing parentheses. |
(x) | Matches 'x' and remembers the match. Also known as capturing parentheses. |
* | Matches the preceding item (a character, character set, or group) 0 or more times. |
+ | Matches the preceding item (a character, character set, or group) 1 or more times. |
. | Matches any single character except the newline character. |
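? | Matches the preceding item 0 or 1 time. When placed directly after a quantifier (*, +, ? or {}), it makes that quantifier non-greedy, so it matches as few characters as possible. |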
[\b] | Matches a backspace. |
[^abc] | Matches anything NOT enclosed by the brackets. Also known as a negative character set. |
[abc] | Matches any of the enclosed characters. Also known as a character set. You can create range of characters using the hyphen character such as A-Z (A to Z). Note that in character sets, special characters (., *, +) do not have any special meaning. |
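\ | Escapes the next character. Before a special character, it makes that character literal (\* matches '*'). Before some ordinary characters, it gives them a special meaning (\d matches a digit rather than 'd'). |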
\0 | Matches a NULL character. |
\b | Matches a word boundary: a position where a word character is NOT followed or NOT preceded by another word character. |
\B | Matches a NON-word boundary: a position where the characters on both sides are both word characters or both non-word characters. |
\cX | Matches a control character. X must be a letter from A to Z. |
\d | Matches a digit character. Same as [0-9] or [0123456789]. |
\D | Matches a NON-digit character. Same as [^0-9] or [^0123456789]. |
\f | Matches a form feed. |
\n | Matches a line feed. |
\r | Matches a carriage return. |
\s | Matches a single white space character. This includes space, tab, form feed, line feed, carriage return, vertical tab and other Unicode space characters. |
\S | Matches any single character OTHER than a white space character, that is, anything \s does not match. |
\t | Matches a tab. |
\uhhhh | Matches the character with the 4-digit hexadecimal code hhhh. |
\v | Matches a vertical tab. |
\w | Matches any alphanumeric character, including underscore. Equivalent to [A-Za-z0-9_]. |
\W | Matches any character that is NOT alphanumeric and NOT an underscore. Equivalent to [^A-Za-z0-9_]. |
\x | A back reference to the substring matched by the x-th capturing parenthetical expression, where x is a positive integer. For example, \1 refers to the first capturing group. |
\xhh | Matches the character with the 2-digit hexadecimal code hh. |
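^ | Matches the beginning of the input. If in multiline mode, it also matches immediately after a line break character, hence every start of line. |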
x(?!y) | Matches 'x' only if 'x' is NOT followed by 'y'. Also known as a negative lookahead. |
x(?=y) | Matches 'x' only if 'x' is followed by 'y'. Also known as a lookahead. |
x\|y | Matches 'x' OR 'y'. |
{n,m} | Matches the preceding item at least n times and at most m times. If m is omitted ({n,}), there is no upper limit. |
{n} | Matches the preceding item exactly n times. |
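
A few of the entries above are easier to grasp with a short example. The sketch below assumes the JavaScript regex flavour, which this table appears to describe. All sample strings and variable names are made up for illustration.

```javascript
// Capturing (x) vs. non-capturing (?:x): only capturing groups are stored.
const date = /(\d{4})-(?:\d{2})-(\d{2})/.exec("Released 2024-03-15");
const year = date[1]; // "2024"
const day = date[2];  // "15" (the month group is not remembered)

// $ without and with the multiline flag (m).
const text = "a1\nb2\nc3";
const lastDigitOnly = text.match(/\d$/g);  // ["3"]
const digitPerLine = text.match(/\d$/gm);  // ["1", "2", "3"]

// Lookahead x(?=y) and negative lookahead x(?!y) check 'y' without consuming it.
const amounts = "10 USD, 20 EUR, 30 USD";
const usd = amounts.match(/\d+(?= USD)/g);         // ["10", "30"]
const notUsd = amounts.match(/\b\d+\b(?! USD)/g);  // ["20"]

// Back reference: \1 repeats whatever the first capturing group matched.
const doubled = "hello".match(/(\w)\1/)[0]; // "ll"

// Word boundaries: "cat" inside "concat" is not a whole word.
const cats = "cat concat cat".match(/\bcat\b/g); // ["cat", "cat"]

// Bounded quantifier {n,m}: between 2 and 4 digits, nothing else.
const isShortNumber = (s) => /^\d{2,4}$/.test(s);
```

Note the `\b` on both sides of `\d+` in the negative lookahead. Without it, the engine can backtrack and match the `1` in `10 USD`, because that `1` is followed by `0` and not by ` USD`.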