Regular expressions, while very powerful, are notoriously hard to master. Here are 6 useful features that can help you start using them in your JavaScript projects:
Capturing groups allow you to get specific parts of the matched string,
simply by wrapping part of the regular expression in parentheses (...):
const str = 'JavaScript is a programming language';
/(JavaScript) is a (.*)/.exec(str);
/*
[
0: 'JavaScript is a programming language',
1: 'JavaScript',
2: 'programming language'
]
*/
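As a minimal sketch of how the captured parts might be used, the example below pulls the pieces out of a date string with array destructuring. The input string and variable names are illustrative assumptions, not part of the original example.

```javascript
// Hypothetical input: an ISO-style date string.
const isoDate = '2024-03-15';

// Three capturing groups: year, month and day.
const dateMatch = /(\d{4})-(\d{2})-(\d{2})/.exec(isoDate);

// exec() returns null when nothing matches, so fall back to an empty array.
// Index 0 holds the full match, so the groups start at index 1.
const [, year, month, day] = dateMatch ?? [];

console.log(year, month, day); // 2024 03 15
```

Skipping the first array element with a leading comma is a common way to ignore the full match and keep only the groups.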
Non-capturing groups are used for matching something without capturing it,
like an either/or matching group that you do not really need. They are
defined similarly to capturing groups, but prefixed with ?: as in (?:...):
const str = 'JavaScript is a programming language';
/(?:JavaScript|Python) is a (.+)/.exec(str);
/*
[
0: 'JavaScript is a programming language',
1: 'programming language'
]
*/
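To make the difference concrete, here is a small side-by-side sketch of the same pattern with and without the ?: prefix. The variable names are made up for illustration.

```javascript
const sentence = 'Python is a programming language';

// With a capturing group, the alternation occupies index 1.
const capturing = /(JavaScript|Python) is a (.+)/.exec(sentence);

// With a non-capturing group, the description moves up to index 1.
const nonCapturing = /(?:JavaScript|Python) is a (.+)/.exec(sentence);

console.log(capturing.length);    // 3
console.log(capturing[1]);        // 'Python'
console.log(nonCapturing.length); // 2
console.log(nonCapturing[1]);     // 'programming language'
```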
Named capturing groups allow you to name a capturing group, by prefixing
its contents with ?<name>:
const str = 'JavaScript is a programming language';
/(?<subject>.+) is a (?<description>.+)/.exec(str);
/*
[
0: 'JavaScript is a programming language',
1: 'JavaScript',
2: 'programming language',
groups: {
subject: 'JavaScript',
description: 'programming language'
}
]
*/
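Named groups tend to pay off when the result is consumed elsewhere. The sketch below, using illustrative variable names, reads the groups object with destructuring and reuses the names in a replacement string via $<name>.

```javascript
const text = 'JavaScript is a programming language';
const pattern = /(?<subject>.+) is a (?<description>.+)/;

// The groups object can be destructured by name instead of by index.
const { subject, description } = pattern.exec(text).groups;
console.log(subject);     // 'JavaScript'
console.log(description); // 'programming language'

// In a replacement string, $<name> inserts the text of the named group.
const swapped = text.replace(pattern, '$<description>: $<subject>');
console.log(swapped); // 'programming language: JavaScript'
```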
Backreferences help you write shorter regular expressions, by matching
again the exact text that an earlier capturing group matched, using \1, \2
etc. Similarly, you can also refer back to named capturing groups using
\k<name>:
const str = 'JavaScript is a programming language - an awesome programming language JavaScript is';
/(.+) is a (?<description>.+) - an awesome \k<description> \1 is/.exec(str);
/*
[
0: 'JavaScript is a programming language - an awesome programming language JavaScript is',
1: 'JavaScript',
2: 'programming language',
groups: {
description: 'programming language'
}
]
*/
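A common practical use of backreferences is spotting accidentally repeated words. The following is a rough sketch under that assumption; the sample sentence and names are invented for illustration.

```javascript
const draft = 'This is is a test of the the parser';

// \b(\w+) captures a whole word; \s+\1\b requires that same word again.
const repeatedWord = /\b(\w+)\s+\1\b/g;

// Collect each word that appears twice in a row.
const duplicates = [...draft.matchAll(repeatedWord)].map((m) => m[1]);
console.log(duplicates); // ['is', 'the']

// $1 in the replacement keeps a single copy of the repeated word.
const cleaned = draft.replace(repeatedWord, '$1');
console.log(cleaned); // 'This is a test of the parser'
```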
Lookaheads allow you to check if something is followed by a certain
pattern, without consuming that pattern as part of the match. You can
create positive lookaheads using ?= and negative lookaheads using ?!:
const str = 'JavaScript is not the same as Java and you should remember that';
/Java(?=Script)(.*)/.exec(str);
/*
[
0: 'JavaScript is not the same as Java and you should remember that',
1: 'Script is not the same as Java and you should remember that'
]
*/
/Java(?!Script)(.*)/.exec(str);
/*
[
0: 'Java and you should remember that',
1: ' and you should remember that'
]
*/
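Because lookaheads consume nothing, they can also match positions rather than characters. The sketch below uses that to insert thousands separators into an integer; the helper name withSeparators is hypothetical, and for real number formatting something like Intl.NumberFormat is likely a better fit.

```javascript
// Hypothetical helper: works for non-negative integers only.
const withSeparators = (n) =>
  // \B skips the start of the number; the positive lookahead requires
  // groups of exactly three digits up to the end of the digit run,
  // which the negative lookahead (?!\d) enforces.
  String(n).replace(/\B(?=(\d{3})+(?!\d))/g, ',');

console.log(withSeparators(1234567)); // '1,234,567'
console.log(withSeparators(999));     // '999'
```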
Finally, you can match Unicode characters, using \p{...} and the u flag.
Examples include, but are not limited to, {Emoji}, {Math_Symbol} and
{Script=Greek}:
const str = 'Greek looks like this: γεια';
/\p{Script=Greek}+/u.exec(str);
/*
[
0: 'γεια'
]
*/
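As a further sketch, the general letter property \p{L} can stand in for \w when the text is not limited to ASCII. The sample string and variable names here are illustrative assumptions.

```javascript
const greeting = 'γεια σου, world!';

// \w only covers ASCII letters, digits and the underscore.
const asciiWords = greeting.match(/\w+/g);
console.log(asciiWords); // ['world']

// \p{L} matches any Unicode letter, but only when the u flag is set.
const unicodeWords = greeting.match(/\p{L}+/gu);
console.log(unicodeWords); // ['γεια', 'σου', 'world']
```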