Recommended site for playing with regex → https://regex101.com/
Regular expressions are a way to represent a pattern of characters. These patterns are used to match character combinations in strings.
Regular Expression:
- A regular expression is an object.
- There are two ways to create a regular expression: the literal syntax and the RegExp constructor.
Using the constructor:
let regex1 = new RegExp("hello");
Using the literal syntax:
let regex2 = /world/;
/ → start of the regular expression
world → pattern
/ → end of the regular expression
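One practical difference between the two forms is that the constructor takes the pattern as a string, so a backslash has to be escaped. Here is a small sketch of that; the variable names and the searchWord value are made up for illustration:

```javascript
// Literal syntax: the backslash is written once.
const fromLiteral = /s\s/;
// Constructor: the pattern is a string, so the backslash itself must be escaped.
const fromConstructor = new RegExp("s\\s");

console.log(fromLiteral.source);      // s\s
console.log(fromConstructor.source);  // s\s

// The constructor is useful when the pattern is only known at runtime.
const searchWord = "world"; // hypothetical value, e.g. from user input
const dynamicRegex = new RegExp(searchWord);
console.log(dynamicRegex.test("hello world")); // true
```

If a runtime value like searchWord can contain metacharacters, they would need to be escaped before building the pattern.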
- Once you have a regex object, you can use it with the methods of RegExp (such as test()) or of the String object wrapper (such as match()).
let txt = "Lets start with a Hello"
let regex1 = new RegExp("Hello");
let regex2 = /world/;
console.log(regex1);
console.log(regex2);
console.log(regex1.test(txt));
console.log(regex2.test(txt));
Output
/Hello/
/world/
true
false
Regular Expression Flags / Modifiers
Modifiers are used to perform global, case-insensitive, and multiline searches:
Modifier | Description |
---|---|
g | Perform a global match (find all matches rather than stopping after the first match) |
i | Perform case-insensitive matching |
m | Perform multiline matching (^ and $ match at the start and end of each line) |
Syntax
/pattern/flags;
or
new RegExp("pattern", "flags");
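To make the syntax concrete, here is a minimal sketch of the i flag in both forms. The sample string and variable names are only for illustration:

```javascript
// Without the i flag, matching is case-sensitive.
const caseSensitive = /hello/.test("Hello World");
// With the i flag, letter case is ignored.
const caseInsensitive = /hello/i.test("Hello World");
console.log(caseSensitive);    // false
console.log(caseInsensitive);  // true

// The same idea with the constructor: flags go in the second argument.
const viaConstructor = new RegExp("hello", "gi");
console.log(viaConstructor.flags);   // gi
console.log(viaConstructor.global);  // true
```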
Let's match an s that comes before whitespace:
let txt = "Let's detect as many s as available in this sentence."
let regex1 = /s\s/;
console.log(txt.match(regex1));
output:
["s ", index: 4, input: "Let's detect as many s as available in this sentence.", groups: undefined]
0: "s "
groups: undefined
index: 4
input: "Let's detect as many s as available in this sentence."
length: 1
In the above example, the regular expression /s\s/ matched the first s that is followed by whitespace, at index 4 (L → 0, e → 1, t → 2, ' → 3, s → 4).
Here \s is a metacharacter that matches a whitespace character. We will read about metacharacters later in this article. We have also used the string method match(), which matches a string against a regular expression and returns an array. We will read about regex methods later as well.
Now let's try again with the global modifier g:
let txt = "Let's detect as many s as available in this sentence."
let regex1 = /s\s/g;
console.log(txt.match(regex1));
output:
0: "s "
1: "s "
2: "s "
3: "s "
4: "s "
length: 5
Now we get every s that comes before whitespace, which happens in 5 places.
⇒Let's detect as many s as available in this sentence.
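With the g flag, match() returns only the matched strings and drops the index. If you also want to know where each match sits, one option is the string method matchAll(), sketched here on the same sentence (the variable names are illustrative):

```javascript
const sentence = "Let's detect as many s as available in this sentence.";

// matchAll() requires the g flag and yields one match object per hit,
// each carrying its own index.
const positions = [...sentence.matchAll(/s\s/g)].map((m) => m.index);

console.log(positions); // [ 4, 14, 21, 24, 42 ]
```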
Metacharacters
Metacharacters are the building blocks of regular expressions.
Example: ^$.*+?=!:|\/()[]{}
Some common metacharacters
The backslash (\) is the escape character: it gives the next character a special meaning (as in \s) or makes a metacharacter literal (as in \.).
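A quick sketch of why escaping matters, using the dot as the metacharacter. The test strings are made up for illustration:

```javascript
// Unescaped, the dot matches any single character, so "125" passes.
const unescapedOnDigits = /1.5/.test("125");
// Escaped, \. only matches a literal dot.
const escapedOnDigits = /1\.5/.test("125");
const escapedOnDot = /1\.5/.test("1.5");

console.log(unescapedOnDigits); // true
console.log(escapedOnDigits);   // false
console.log(escapedOnDot);      // true
```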
Character classes
Metacharacter | Description |
---|---|
. | Match any single character except line terminators (such as newline) |
\w \d \s | Word character, digit, whitespace character |
\W \D \S | Anything that is not a word character, not a digit, or not whitespace, respectively |
example:
let text = "How was the day today"
let pattern = /d.y/g
text.match(pattern)
// output:
(2) ['day', 'day']
In the above example, . matches any single character between d and y, so the pattern finds day twice: once as the word day and once inside today.
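Two edge cases of the dot are easy to miss: it has to consume exactly one character, and it does not match a newline. A minimal sketch with made-up test strings:

```javascript
const matchesDay = /d.y/.test("day");
// A newline between d and y is a line terminator, which . does not match.
const matchesAcrossNewline = /d.y/.test("d\ny");
// With nothing between d and y there is no character for . to consume.
const matchesEmptyGap = /d.y/.test("dy");

console.log(matchesDay);            // true
console.log(matchesAcrossNewline);  // false
console.log(matchesEmptyGap);       // false
```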
let text = "Underscore (_) yields a single line, 1 pt thick, gap 0.75 mm."
let pattern = /\w/g
text.match(pattern)
// output:
(44) ['U', 'n', 'd', 'e', 'r', 's', 'c', 'o', 'r', 'e', '_', 'y', 'i', 'e', 'l',
'd', 's', 'a', 's', 'i', 'n', 'g', 'l', 'e', 'l', 'i', 'n', 'e', '1', 'p', 't',
't', 'h', 'i', 'c', 'k', 'g', 'a', 'p', '0', '7', '5', 'm', 'm']
In the above example, \w matches word characters, i.e. a-z, A-Z, 0-9, and _ (underscore).
let text = "How was the day today 15th March"
let pattern = /\d/g
text.match(pattern)
// output:
(2) ['1', '5']
In the above example, \d matches the digits 0 to 9.
let text = "How was the day today 15th March"
let pattern = /\s/g
text.match(pattern)
// output:
(6) [' ', ' ', ' ', ' ', ' ', ' ']
In the above example, \s matches whitespace characters, i.e. space, carriage return, new line, tab, vertical tab, and form feed.
Similarly, the capital forms \W, \D and \S match the opposite: any character that is not a word character (\w), not a digit (\d), or not whitespace (\s), respectively.
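The capital forms can be tried out the same way as the lowercase ones. This is a small sketch on a made-up sample string:

```javascript
const sample = "a1 b2!";

// \D matches everything that is not a digit; removing it keeps only the digits.
const digitsOnly = sample.replace(/\D/g, "");
// \W matches everything that is not a word character.
const nonWord = sample.match(/\W/g);
// \S matches everything that is not whitespace.
const nonSpace = sample.match(/\S/g);

console.log(digitsOnly); // 12
console.log(nonWord);    // [ ' ', '!' ]
console.log(nonSpace);   // [ 'a', '1', 'b', '2', '!' ]
```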
Anchors and Word Boundaries
Anchors do not match any character; they match a position before or after characters. ^ (caret) matches the start of the string, while $ (dollar) matches the end of the string.
Metacharacter | Description |
---|---|
^abc$ | ^ matches the start of the string and $ matches the end |
\b | word boundary. Find a match at the beginning of a word like this: \bWORD, or at the end of a word like this: WORD\b |
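By default, ^ and $ match only at the start and end of the whole string. To make them match at the start and end of each line, enable multiline mode with the m flag.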
let text = `This is first line
This is second line
This is third line`
let pattern = /^This/gm;
text.match(pattern)
// Output
(3) ['This', 'This', 'This']
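To see what the m flag changes, it helps to run the same patterns with and without it. A minimal sketch on the same three lines (the variable names are illustrative):

```javascript
const lines = "This is first line\nThis is second line\nThis is third line";

// Without m, ^ and $ only see the start and end of the whole string.
const startsWithoutM = lines.match(/^This/g).length;
const endsWithoutM = lines.match(/line$/g).length;

// With m, ^ and $ also match at every line break.
const startsWithM = lines.match(/^This/gm).length;
const endsWithM = lines.match(/line$/gm).length;

console.log(startsWithoutM, endsWithoutM); // 1 1
console.log(startsWithM, endsWithM);       // 3 3
```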
The following three positions qualify as word boundaries:
- Before the first character in a string if the first character is a word character.
- After the last character in a string if the last character is a word character.
- Between two characters in a string if one is a word character and the other is not.
let text = `REGEX IS GOOD!`
let pattern = /\b/g;
text.match(pattern)
// Output
(6) ['', '', '', '', '', '']
let text = `Hello,Regular expression is very powerful.`
let pattern = /\b/g;
text.match(pattern)
// Output
(12) ['', '', '', '', '', '', '', '', '', '', '', '']
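The empty strings in these outputs are the boundary positions themselves, since \b matches a position and not a character. Here is a sketch that prints those positions for the first example, then uses \b to match a whole word only (the variable names are illustrative):

```javascript
// Where exactly are the six boundaries in the first example?
const boundaries = [..."REGEX IS GOOD!".matchAll(/\b/g)].map((m) => m.index);
console.log(boundaries); // [ 0, 5, 6, 8, 9, 13 ]

// \b is mostly used to match whole words.
const question = "How was the day today";
const anyDay = question.match(/day/g);        // also finds day inside today
const wholeDay = question.match(/\bday\b/g);  // only the standalone word

console.log(anyDay);   // [ 'day', 'day' ]
console.log(wholeDay); // [ 'day' ]
```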