A regular expression (regex) describes a pattern and finds the text that matches it. It uses symbols to specify what should match. Writing a regex means constructing a pattern that matches a specific group of characters. Common uses for regex are validation and searching.

Regular expressions are the soul of text processing. [1]

Regexes are built from smaller patterns using concatenation and alternation, and are written between two / characters. Each pattern can itself be the concatenation or alternation of one or more smaller patterns. [2]
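
As a small illustration of concatenation and alternation, here is a sketch in Ruby. The constant names and sample strings are made up for this example.

```ruby
# Concatenation: /cat/ is the pattern c, followed by a, followed by t.
CAT = /cat/

# Alternation: /cat|dog/ matches either the pattern cat or the pattern dog.
CAT_OR_DOG = /cat|dog/

puts "concatenate".match?(CAT)                           # true
puts "I have a cat and a dog".scan(CAT_OR_DOG).inspect   # ["cat", "dog"]

# Alternation applies to everything on each side of |, so /gra|ey/ means
# "gra" or "ey". A group narrows the choice to a single position.
GRAY_OR_GREY = /gr(?:a|e)y/

puts "ey".match?(/gra|ey/)        # true
puts "ey".match?(GRAY_OR_GREY)    # false
puts "grey".match?(GRAY_OR_GREY)  # true
```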

$ ^ * + ? . ( ) [ ] { } | \ are special characters (meta-characters). They must be escaped with a leading \ when a regex pattern needs to match them literally.
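
The difference between an escaped and an unescaped meta-character can be sketched in Ruby as follows. The constants and the literal_regex helper are hypothetical names for this example; Regexp.escape is Ruby's built-in way to escape a whole string.

```ruby
# Unescaped, "." matches any character, so /5.00/ also matches "5x00".
LOOSE = /5.00/
# Escaped, "\." matches only a literal period.
STRICT = /5\.00/

puts LOOSE.match?("5x00")           # true
puts STRICT.match?("5x00")          # false
puts STRICT.match?("Total: $5.00")  # true

# Regexp.escape backslash-escapes the meta-characters in an arbitrary string,
# which is handy when the text to match is not known in advance.
def literal_regex(text)
  Regexp.new(Regexp.escape(text))
end

puts literal_regex("$5.00 (approx.)").match?("Total: $5.00 (approx.)")  # true
puts literal_regex("a+b").match?("aab")                                  # false
```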

NOTE: In the patterns below, unescaped a, b, and z denote regular characters, and unescaped p and q denote patterns.


Basic Matching

Metacharacters have special meaning inside a regex. A metacharacter has to be escaped with a leading \ to match it literally.

Parentheses () override how patterns are grouped together in a regex.

() provide a way to capture parts of a match for later reuse in the regex or in a replacement string; in such cases, they are referred to as capture groups. The captured values are accessed via backreferences.

Pattern Meaning
/a/ Match the character a
/\?/, /\./ Match a meta-character literally
/\n/, /\t/ Match a control character (newline, tab, etc)
/pq/ Concatenation (p followed by q)
/p|q/ Alternation (p or q)
/(p)/ Capturing Group
/(?:p)/ Non-capturing Group
/p/i Case-insensitive match (i is a modifier; modifiers are language specific)
/p/m Make dot match newlines (Ruby; JavaScript uses /p/s for this)
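
The grouping and modifier rows above can be tried out with a short Ruby sketch. The constants, the day_first helper, and the sample strings are assumptions made for illustration; the modifier behavior shown is Ruby's.

```ruby
# Capture groups: each (...) stores the text it matched.
ISO_DATE = /(\d{4})-(\d{2})-(\d{2})/

m = "2024-03-15".match(ISO_DATE)
puts m[1]  # 2024
puts m[3]  # 15

# Backreferences in a replacement string: \1, \2, \3 refer to the captures.
def day_first(iso_date)
  iso_date.sub(ISO_DATE, '\3/\2/\1')
end
puts day_first("2024-03-15")  # 15/03/2024

# Backreference inside the regex: \1 must repeat what group 1 matched.
REPEATED_WORD = /\b(\w+) \1\b/
puts "it is is here".match?(REPEATED_WORD)  # true
puts "it is here".match?(REPEATED_WORD)     # false

# A non-capturing group (?:...) groups without storing a capture.
puts "ababc".match(/(?:ab)+(c)/).captures.inspect  # ["c"]

# Modifiers (Ruby semantics).
puts "HELLO".match?(/hello/i)  # true: i ignores case
puts "a\nb".match?(/a.b/)      # false: . does not match a newline
puts "a\nb".match?(/a.b/m)     # true: m lets . match a newline
```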


Quantifiers

A quantifier allows a pattern to be matched multiple times. Quantifiers are greedy by default (they match as much as possible), but can be made lazy (matching as little as possible) by appending ?.

Pattern Meaning
/p*/ 0 or more occurrences of pattern
/p+/ 1 or more occurrences of pattern
/p?/ 0 or 1 occurrence of pattern
/p{m}/ Exactly m occurrences of pattern
/p{m,}/ m or more occurrences of pattern
/p{m,n}/ m through n occurrences of pattern
/p*?/ 0 or more occurrences (lazy)
/p+?/ 1 or more occurrences (lazy)
/p??/ 0 or 1 occurrence (lazy)
/p{m,}?/ m or more occurrences (lazy)
/p{m,n}?/ m through n occurrences (lazy)
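
A minimal Ruby sketch of the greedy versus lazy distinction; the sample string and constant names are invented for this example.

```ruby
SAMPLE_HTML = "<b>bold</b> and <i>italic</i>"

GREEDY_TAG = /<.+>/   # takes as much as possible
LAZY_TAG   = /<.+?>/  # takes as little as possible

puts SAMPLE_HTML[GREEDY_TAG]             # <b>bold</b> and <i>italic</i>
puts SAMPLE_HTML[LAZY_TAG]               # <b>
puts SAMPLE_HTML.scan(LAZY_TAG).inspect  # ["<b>", "</b>", "<i>", "</i>"]

# Counted quantifiers follow the same rule.
puts "aaaa"[/a{2,3}/]   # aaa (greedy: up to 3)
puts "aaaa"[/a{2,3}?/]  # aa  (lazy: stops at 2)
puts "aaaa"[/a{2}/]     # aa  (exactly 2)

# * accepts zero occurrences, + requires at least one.
puts "b"[/a*/].inspect  # ""
puts "b"[/a+/].inspect  # nil
```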


Character Classes and Shortcuts

A character class matches a single character from a specified set, from a range of characters, or from any combination of sets and ranges. Ranges only work inside a character class, and a character class always matches exactly one character.

Pattern Meaning
/[ab]/ a or b
/[a-z]/ a through z, inclusive
/[^ab]/ Not (a or b)
/[^a-z]/ Not (a through z)
/./ Any character except newline (wildcard)
/\s/, /[\s]/ Whitespace character (space, tab, newline, etc)
/\S/, /[\S]/ Not a whitespace character
/\d/, /[\d]/ Decimal digit (0-9)
/\D/, /[\D]/ Not a decimal digit
/\w/, /[\w]/ Word character (0-9, a-z, A-Z, _)
/\W/, /[\W]/ Not a word character
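
The following Ruby sketch exercises a few of the classes and shortcuts above. The hex_color? helper and the sample strings are hypothetical, chosen only to show sets and ranges combined in one class.

```ruby
# Sets, ranges, and negation.
puts "gray grey groy".scan(/gr[ae]y/).inspect  # ["gray", "grey"]
puts "abc123".gsub(/[^a-z]/, "")               # abc

# Shortcuts, alone and inside a class.
puts "a1b22c333".scan(/\d+/).inspect      # ["1", "22", "333"]
puts "hello, world!".scan(/\w+/).inspect  # ["hello", "world"]
puts "a b\tc\nd".split(/\s/).inspect      # ["a", "b", "c", "d"]
puts "3.14 pi".scan(/[\d.]+/).inspect     # ["3.14"] (. is literal inside [])

# Combining several ranges in one class: each [...] still matches one character.
def hex_color?(text)
  text.match?(/\A#[0-9a-fA-F]{6}\z/)
end

puts hex_color?("#1a2B3c")  # true
puts hex_color?("#12345g")  # false
```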


Anchors

Anchors force a regex to match only at a specified point.

Pattern Meaning
/^p/ Pattern at start of line
/p$/ Pattern at end of line
/\Ap/ Pattern at start of string
/p\z/ Pattern at the very end of string (after any trailing newline)
/p\Z/ Pattern at end of string, or just before a trailing newline
/\bp/ Pattern begins at word boundary
/p\b/ Pattern ends at word boundary
/\Bp/ Pattern begins at non-word boundary
/p\B/ Pattern ends at non-word boundary
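
The anchors above behave as follows in Ruby, where ^ and $ always work per line. This is a sketch with invented sample data; valid_username? is a hypothetical helper showing why \A and \z are the safer choice for validating a whole string.

```ruby
SAMPLE_TEXT = "first line\nsecond line\n"

# Line anchors versus string anchors.
puts SAMPLE_TEXT.scan(/^\w+/).inspect   # ["first", "second"]
puts SAMPLE_TEXT.scan(/\A\w+/).inspect  # ["first"]
puts SAMPLE_TEXT.match?(/line$/)        # true:  end of a line
puts SAMPLE_TEXT.match?(/line\z/)       # false: the string ends with "\n"
puts SAMPLE_TEXT.match?(/line\Z/)       # true:  just before the final "\n"

# Word boundaries.
SAMPLE_WORDS = "cat concat catalog"
puts SAMPLE_WORDS.scan(/\bcat\b/).length  # 1 (cat)
puts SAMPLE_WORDS.scan(/\bcat/).length    # 2 (cat, catalog)
puts SAMPLE_WORDS.scan(/cat\b/).length    # 2 (cat, concat)
puts SAMPLE_WORDS.scan(/\Bcat/).length    # 1 (concat)

# Validation should anchor to the whole string, not to a line.
def valid_username?(name)
  name.match?(/\A\w+\z/)
end

puts valid_username?("alice")          # true
puts valid_username?("alice\nDROP")    # false
puts "alice\nDROP".match?(/^\w+$/)     # true: only the first line is checked
```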


Common Ruby Methods for Regex

Use Rubular to test regex for Ruby.

Method Use
String#match Match regex against a string (returns MatchData or nil)
string =~ regex Match regex against a string (returns index of the match or nil)
String#split Split string by regex
String#sub Replace the first regex match
String#gsub Replace all regex matches
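
A brief sketch of the Ruby methods listed above in use. SAMPLE_LINE and the parse_pairs helper are hypothetical and exist only for this example.

```ruby
SAMPLE_LINE = "name=Ada; lang=Ruby; year=1995"

# String#match returns a MatchData object, or nil when nothing matches.
m = SAMPLE_LINE.match(/lang=(\w+)/)
puts m[1]                               # Ruby
p SAMPLE_LINE.match(/lang=Perl/)        # nil

# =~ returns the index where the match starts, or nil.
p(SAMPLE_LINE =~ /lang/)                # 10
p(SAMPLE_LINE =~ /perl/)                # nil

# String#split with a regex separator.
puts SAMPLE_LINE.split(/;\s*/).inspect  # ["name=Ada", "lang=Ruby", "year=1995"]

# String#sub replaces the first match, String#gsub replaces every match.
puts SAMPLE_LINE.sub(/=/, ": ")         # name: Ada; lang=Ruby; year=1995
puts SAMPLE_LINE.gsub(/=/, ": ")        # name: Ada; lang: Ruby; year: 1995

# Combining split calls to turn the line into a Hash.
def parse_pairs(line)
  line.split(/;\s*/).map { |pair| pair.split(/=/, 2) }.to_h
end

puts parse_pairs(SAMPLE_LINE)["year"]   # 1995
```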


Common JavaScript Functions for Regex

Use Scriptular to test regex for JavaScript.

Method Use
String.match Determine if regex matches a string
String.split Split string by regex
String.replace Replace the first regex match (all matches with the g flag)


To learn more about regular expressions, read the books listed below:

  1. https://www.safaribooksonline.com/library/view/learning-rails-5/9781491926185/app03.html#an_incredibly_brief_guide_to_regular_exp

  2. https://launchschool.com/books/regex/read/conclusion#cheatsheet