TESTEVERYTHING

Wednesday 5 May 2021

XPath Axis Family Tree Analogy

 The major XPath axes follow family tree terminology:

  • self:: is you.


Downward:

  • child:: are your immediate children.
  • descendant:: are your children, and their children, recursively.
  • descendant-or-self:: (aka //): are you and your descendants.


Upward:

  • parent:: is your mother or father.
  • ancestor:: are your parent, and your parent's parent, recursively.
  • ancestor-or-self:: are you and your ancestors.


Sideways (consider elements earlier in the document to be younger):

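  • preceding-sibling:: are your younger siblings, in age order.
  • following-sibling:: are your older siblings, in age order.
  • preceding:: are your younger siblings, your ancestors' younger siblings, and all of their descendants, in age order. Your ancestors themselves are not included.
  • following:: are your older siblings, your ancestors' older siblings, and all of their descendants, in age order. Your own descendants are not included.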

---------------------------------------------------------------------------------------------------------------------

  • child:: selects the immediate children of the context node and goes no deeper, unlike descendant::.
  • following:: selects every node that comes after the context node in document order, together with those nodes' descendants, but not the context node's own descendants.
  • descendant:: selects every node along the child:: axis, as well as their children, their children's children, and so on.
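
The axes described above can be checked on a tiny document. The following is only a sketch: Python's standard xml.etree.ElementTree does not implement these axes itself, so the hypothetical helper xpath_axes() recomputes them by hand from document order, and it only looks at element nodes. The sample document and all names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the context node will be <b>.
DOC = "<family><a><a1/><a2/></a><b><b1><b1x/></b1><b2/></b><c><c1/></c></family>"

def xpath_axes(root, context):
    """Return the tag names found on the main XPath axes of `context`."""
    order = list(root.iter())  # every element, in document order
    parent_of = {child: parent for parent in order for child in parent}

    # Walk upward to collect ancestors, nearest first.
    ancestors = []
    node = context
    while node in parent_of:
        node = parent_of[node]
        ancestors.append(node)

    descendants = list(context.iter())[1:]  # iter() yields the context first
    siblings = list(parent_of[context]) if context in parent_of else [context]
    idx = siblings.index(context)
    pos = order.index(context)

    axes = {
        "child": list(context),
        "descendant": descendants,
        "parent": ancestors[:1],
        "ancestor": ancestors,
        "preceding-sibling": siblings[:idx],
        "following-sibling": siblings[idx + 1:],
        # Everything earlier in the document, minus the ancestors.
        "preceding": [n for n in order[:pos] if n not in ancestors],
        # Everything later in the document, minus the descendants.
        "following": [n for n in order[pos + 1:] if n not in descendants],
    }
    return {name: [n.tag for n in nodes] for name, nodes in axes.items()}

root = ET.fromstring(DOC)
result = xpath_axes(root, root.find("b"))
for axis, tags in result.items():
    label = axis + "::"
    print(f"{label:20} {tags}")
```

With b as the context node, child:: stops at b1 and b2 while descendant:: also reaches b1x, and following:: picks up c and c1 but none of b's own descendants.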

Sometimes, a picture is worth a thousand words:




Thursday 4 March 2021

How to validate credit cards using a regular expression?

 You should not rely on these regular expressions alone, since they only validate the format of the number. Use the Luhn algorithm to check that a card number is actually well formed.

VISA:
^4[0-9]{12}(?:[0-9]{3})?$
MasterCard (51-55 prefixes only; the newer 2221-2720 range is not covered):
^5[1-5][0-9]{14}$
American Express:
^3[47][0-9]{13}$
Diners Club:
^3(?:0[0-5]|[68][0-9])[0-9]{11}$
Discover:
^6(?:011|5[0-9]{2})[0-9]{12}$
JCB:
^(?:2131|1800|35\d{3})\d{11}$
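
As a rough illustration of combining the two checks, the sketch below first matches the format with the patterns listed above and then applies the Luhn checksum. The function names (luhn_ok, card_brand, is_valid_card) are made up for this example, and the card numbers used are widely published test numbers, not real accounts.

```python
import re

# Format patterns copied from the list above.
CARD_PATTERNS = {
    "VISA": r"^4[0-9]{12}(?:[0-9]{3})?$",
    "MasterCard": r"^5[1-5][0-9]{14}$",
    "American Express": r"^3[47][0-9]{13}$",
    "Diners Club": r"^3(?:0[0-5]|[68][0-9])[0-9]{11}$",
    "Discover": r"^6(?:011|5[0-9]{2})[0-9]{12}$",
    "JCB": r"^(?:2131|1800|35\d{3})\d{11}$",
}

def luhn_ok(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        digit = int(ch)
        if i % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0

def card_brand(number: str):
    """Return the first brand whose format matches, or None."""
    for brand, pattern in CARD_PATTERNS.items():
        if re.fullmatch(pattern, number):
            return brand
    return None

def is_valid_card(number: str) -> bool:
    """Format check first, then the checksum."""
    return card_brand(number) is not None and luhn_ok(number)

for candidate in ["4111111111111111", "4111111111111112", "378282246310005"]:
    print(candidate, card_brand(candidate), is_valid_card(candidate))
```

The second number has a valid VISA format but fails the checksum, which is exactly the case the regular expressions alone cannot catch.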

How to validate NUMBERS with a regular expression?

 It depends. What type of number? What precision? What length? What do you want as a decimal separator? Etc. The following examples should help you with the most common tasks.

Positive integers of undefined length:
^\d+$
Positive integers of maximum length (10 in our example):
^\d{1,10}$
Positive integers of fixed length (5 in our example):
^\d{5}$
Negative integers of undefined length:
^-\d+$
Negative integers of maximum length (10 in our example):
^-\d{1,10}$
Negative integers of fixed length (5 in our example):
^-\d{5}$
Integers of undefined length:
^-?\d+$
Integers of maximum length (10 in our example):
^-?\d{1,10}$
Integers of fixed length (5 in our example):
^-?\d{5}$
Numbers of undefined length with or without decimals (1234.1234):
^-?\d*\.{0,1}\d+$
Numbers with 2 decimals (.00):
^-?\d*\.\d{2}$
Currency numbers with optional dollar sign and thousand separators and optional 2 decimals ($1,000,000.00, 10000.12, 0.00):
^\$?\-?([1-9]{1}[0-9]{0,2}(\,\d{3})*(\.\d{0,2})?|[1-9]{1}\d{0,}(\.\d{0,2})?|0(\.\d{0,2})?|(\.\d{1,2}))$|^\-?\$?([1-9]{1}\d{0,2}(\,\d{3})*(\.\d{0,2})?|[1-9]{1}\d{0,}(\.\d{0,2})?|0(\.\d{0,2})?|(\.\d{1,2}))$|^\(\$?([1-9]{1}\d{0,2}(\,\d{3})*(\.\d{0,2})?|[1-9]{1}\d{0,}(\.\d{0,2})?|0(\.\d{0,2})?|(\.\d{1,2}))\)$
Percentage from 0 to 100 with optional 2 decimals and optional % sign at the end (0, 0.00, 100.00, 100%, 99.99%):
^-?[0-9]{0,2}(\.[0-9]{1,2})?%?$|^-?(100)(\.[0]{1,2})?%?$
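
To see how a few of these patterns behave on edge cases, here is a small sketch using Python's re module. The dictionary keys and the matches() helper are names invented for the example; re.ASCII is passed so that \d only means 0-9, which is an assumption about the intended behaviour.

```python
import re

# Patterns copied from the list above; re.ASCII keeps \d limited to 0-9.
PATTERNS = {
    "integer": re.compile(r"^-?\d+$", re.ASCII),
    "max 10 digits": re.compile(r"^-?\d{1,10}$", re.ASCII),
    "decimal": re.compile(r"^-?\d*\.{0,1}\d+$", re.ASCII),
    "two decimals": re.compile(r"^-?\d*\.\d{2}$", re.ASCII),
}

def matches(name: str, text: str) -> bool:
    """True when the whole of `text` matches the named pattern."""
    return PATTERNS[name].fullmatch(text) is not None

# Print which patterns accept each sample.
for sample in ["-42", "1234.1234", ".5", "5.", "10.00", "12345678901"]:
    accepted = [name for name in PATTERNS if matches(name, sample)]
    print(f"{sample!r:16} {accepted}")
```

Note that the decimal pattern accepts a leading dot (.5) but rejects a trailing one (5.), because at least one digit must end the number.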

How to validate a DATE with a regular expression?

 Never use a regular expression alone to validate a date. The regular expression is only useful to validate the format of the date as entered by a user. For the actual date validity, you should rely on the date functions of your programming language.

The following expressions will validate the number of days in a month but will NOT handle leap year validation; hence February can have 29 days every year, but not more.

ISO date format (yyyy-mm-dd):
^[0-9]{4}-(((0[13578]|(10|12))-(0[1-9]|[1-2][0-9]|3[0-1]))|(02-(0[1-9]|[1-2][0-9]))|((0[469]|11)-(0[1-9]|[1-2][0-9]|30)))$
ISO date format (yyyy-mm-dd) with separators '-' or '/' or '.' or ' '. Forces usage of the same separator across the date:
^[0-9]{4}([- /.])(((0[13578]|(10|12))\1(0[1-9]|[1-2][0-9]|3[0-1]))|(02\1(0[1-9]|[1-2][0-9]))|((0[469]|11)\1(0[1-9]|[1-2][0-9]|30)))$
United States date format (mm/dd/yyyy)
^(((0[13578]|(10|12))/(0[1-9]|[1-2][0-9]|3[0-1]))|(02/(0[1-9]|[1-2][0-9]))|((0[469]|11)/(0[1-9]|[1-2][0-9]|30)))/[0-9]{4}$
Hours and minutes, with optional seconds, 24-hour format (HH:MM or HH:MM:SS):
^(20|21|22|23|[01]\d|\d)((:[0-5]\d){1,2})$
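
The split between format and validity can be sketched as follows: the ISO pattern above screens the format, and Python's datetime.date decides whether the day really exists. The helper names has_iso_format and is_real_date are assumptions made for this example.

```python
import re
from datetime import date

# ISO pattern copied from the list above (format check only).
ISO_DATE = re.compile(
    r"^[0-9]{4}-(((0[13578]|(10|12))-(0[1-9]|[1-2][0-9]|3[0-1]))"
    r"|(02-(0[1-9]|[1-2][0-9]))"
    r"|((0[469]|11)-(0[1-9]|[1-2][0-9]|30)))$"
)

def has_iso_format(text: str) -> bool:
    """True when the text looks like yyyy-mm-dd with a plausible day count."""
    return ISO_DATE.fullmatch(text) is not None

def is_real_date(text: str) -> bool:
    """Format check followed by a real calendar check (handles leap years)."""
    if not has_iso_format(text):
        return False
    year, month, day = map(int, text.split("-"))
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True

for candidate in ["2020-02-29", "2021-02-29", "2021-04-31"]:
    print(candidate, has_iso_format(candidate), is_real_date(candidate))
```

2021-02-29 passes the regular expression but is rejected by the calendar check, since 2021 is not a leap year.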

How can I emulate DOTALL in JavaScript?

 DOTALL is a flag in most recent regex libraries that makes the . metacharacter match anything INCLUDING line breaks. By default, the . metacharacter in JavaScript matches anything BUT line breaks, and JavaScript engines older than ES2018 have no s (dotAll) flag to change that. To emulate this behavior, simply replace every . metacharacter with [\S\s]. This means: match anything that is a white space character OR anything that is not a white space character!

[\S\s]
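
The effect is easy to observe in any engine where . stops at line breaks. The sketch below uses Python's re module rather than JavaScript, on the assumption that the pattern syntax is the same for this case, and compares the emulation with Python's own re.DOTALL flag. The sample text is invented for the example.

```python
import re

text = "<p>first line\nsecond line</p>"

# Plain dot: cannot cross the line break, so nothing matches.
plain = re.search(r"<p>.*</p>", text)

# Emulation: [\S\s] matches any character, line breaks included.
emulated = re.search(r"<p>[\S\s]*</p>", text)

# Native flag, for comparison.
flagged = re.search(r"<p>.*</p>", text, re.DOTALL)

print(plain)
print(emulated.group(0) == text)
print(flagged.group(0) == text)
```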

Regular Expression - Documentation

 

Metacharacters

Character  What does it do?
$          Matches the end of the input. If in multiline mode, it also matches before a line break character, hence every end of line.
(?:x)      Matches 'x' but does NOT remember the match. Also known as NON-capturing parenthesis.
(x)        Matches 'x' and remembers the match. Also known as capturing parenthesis.
*          Matches the preceding character 0 or more times.
+          Matches the preceding character 1 or more times.
.          Matches any single character except the newline character.
?
  • Matches the preceding character 0 or 1 time.
  • When used after the quantifiers *, +, ? or {}, makes the quantifier non-greedy; it will match the minimum number of times as opposed to matching the maximum number of times.
[\b]       Matches a backspace.
[^abc]     Matches anything NOT enclosed by the brackets. Also known as a negative character set.
[abc]      Matches any of the enclosed characters. Also known as a character set. You can create a range of characters using the hyphen character, such as A-Z (A to Z). Note that in character sets, special characters (., *, +) do not have any special meaning.
\
  • Used to indicate that the next character should NOT be interpreted literally. For example, the character 'w' by itself will be interpreted as 'match the character w', but using '\w' signifies 'match an alphanumeric character including underscore'.
  • Used to indicate that a metacharacter is to be interpreted literally. For example, the '.' metacharacter means 'match any single character but a new line', but if we would rather match a dot character instead, we would use '\.'.
\0         Matches a NULL character.
\b         Matches a word boundary. A boundary occurs where a word character is NOT followed or NOT preceded by another word character.
\B         Matches a NON-word boundary. This occurs where two adjacent characters are both word characters OR both non-word characters.
\cX        Matches a control character. X must be a letter from A to Z inclusive.
\d         Matches a digit character. Same as [0-9] or [0123456789].
\D         Matches a NON-digit character. Same as [^0-9] or [^0123456789].
\f         Matches a form feed.
\n         Matches a line feed.
\r         Matches a carriage return.
\s         Matches a single white space character. This includes space, tab, form feed, line feed and carriage return.
\S         Matches anything OTHER than a single white space character. Anything other than space, tab, form feed, line feed and carriage return.
\t         Matches a tab.
\uhhhh     Matches a character with the 4-digit hexadecimal code hhhh.
\v         Matches a vertical tab.
\w         Matches any alphanumeric character including underscore. Equivalent to [A-Za-z0-9_].
\W         Matches anything OTHER than an alphanumeric character including underscore. Equivalent to [^A-Za-z0-9_].
\x         A back reference to the substring matched by the x-th parenthetical expression, where x is a positive integer (for example, \1).
\xhh       Matches a character with the 2-digit hexadecimal code hh.
^
  • Matches the beginning of the input. If in multiline mode, it also matches after a line break character, hence every new line.
  • When used in a set pattern ([^abc]), it negates the set; match anything not enclosed in the brackets.
x(?!y)     Matches 'x' only if 'x' is NOT followed by 'y'. Also known as a negative lookahead.
x(?=y)     Matches 'x' only if 'x' is followed by 'y'. Also known as a lookahead.
x|y        Matches 'x' OR 'y'.
{n,m}      Matches the preceding character at least n times and at most m times. Omitting m ({n,}) removes the upper limit. Some engines also accept {,m} for 0 to m times, but JavaScript does not.
{n}        Matches the preceding character exactly n times.
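
A few of the metacharacters above are easier to grasp in action. This is only a sketch in Python's re module, which shares the syntax for these constructs; the sample strings are made up for the example.

```python
import re

html = "<b>one</b> and <b>two</b>"

# Greedy * grabs as much as possible; *? stops at the first closing tag.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# (?:...) groups without remembering; (...) remembers the surname only.
title = re.match(r"(?:Mr|Ms)\. (\w+)", "Ms. Lovelace")

# Lookahead: digits only when followed by 'px' (the 'px' is not consumed).
pixels = re.findall(r"\d+(?=px)", "10px 20em 30px")

# Back reference \1 plus word boundaries \b: find a doubled word.
doubled = re.search(r"\b(\w+) \1\b", "it is is fine")

print(greedy)
print(lazy)
print(title.groups())
print(pixels)
print(doubled.group(0))
```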

Which one is right ?
