Introduction

Regex (short for regular expressions) is a tool for finding, matching, and manipulating text using patterns. It is well suited to spotting patterns in text, validating input, and performing search-and-replace edits.

We can use this website to play with regex: https://regex101.com/

In JavaScript and several other languages, regex patterns are written between slashes:

/world/

Character sets

/f[ai]t/         // Matches: "fat", "fit"
/fat|fit/        // Matches: "fat" or "fit" (alternation, same result as above)
/f[a-z]t/        // Matches: "fat", "fbt", ..., "fzt" (lowercase letters a to z)
/f[a-z0-9]t/     // Matches: "fat", "fbt", ..., "f8t", "f9t" (lowercase letters and digits)
/f[A-Za-z0-9]t/  // Matches: "fAt", "fBt", ..., "fat", "fbt", ..., "f9t" (lowercase and uppercase letters, and digits)
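
As a quick check of the sets above, here is a minimal JavaScript sketch using RegExp.prototype.test(). The variable names and sample strings are illustrative only.

```javascript
// Character sets checked with test(), which returns true if the pattern
// matches anywhere in the string.
const simpleSet = /f[ai]t/;
const lowercaseRange = /f[a-z]t/;
const alphanumeric = /f[A-Za-z0-9]t/;

const setResults = {
  fit: simpleSet.test("fit"),       // "i" is in [ai]
  fot: simpleSet.test("fot"),       // "o" is not in [ai]
  fzt: lowercaseRange.test("fzt"),  // "z" is in [a-z]
  fZt: lowercaseRange.test("fZt"),  // uppercase "Z" is not in [a-z]
  f9t: alphanumeric.test("f9t"),    // digits are included in [A-Za-z0-9]
};
console.log(setResults);
```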

Special characters

/./ // Any character except newline
/\./ // A literal dot
/\d/ // Any digit (0–9)
/\D/ // Any non-digit
/\w/ // Word character (a-z, A-Z, 0-9, _)
/\W/ // Non-word character
/\s/ // Whitespace (space, tab, newline)
/\S/ // Non-whitespace

Anchors

/^abc/      // Starts with "abc"
/abc$/      // Ends with "abc"
/^$/        // Empty string
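
A small JavaScript sketch of how the anchors behave in practice. The sample strings are made up for illustration.

```javascript
// ^ pins the match to the start of the string, $ to the end.
const startsWithAbc = /^abc/;
const endsWithAbc = /abc$/;
const isEmpty = /^$/;

const anchorResults = [
  startsWithAbc.test("abcdef"), // true: "abc" is at the start
  startsWithAbc.test("xabc"),   // false: "abc" is not at the start
  endsWithAbc.test("xabc"),     // true: "abc" is at the end
  isEmpty.test(""),             // true: nothing between start and end
  isEmpty.test(" "),            // false: a space is still a character
];
console.log(anchorResults);
```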

Negation

/[^a-z]/ // Any character NOT a lowercase letter
/[^-]/ // Any character except "-"

Quantifiers

/a{3}/ // Exactly 3 "a"s
/a{1,3}/ // Between 1 and 3 "a"s
/a{1,}/ // At least 1 "a"
/a*/ // 0 or more "a"s
/a+/ // 1 or more "a"s
/a?/ // 0 or 1 "a"
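
The sketch below shows what each quantifier actually captures in JavaScript. Note that quantifiers are greedy by default, and that /a*/ can succeed with an empty match. The sample strings are illustrative.

```javascript
const run = "aaaa";

const exactlyThree = run.match(/a{3}/)[0];  // "aaa"
const oneToThree = run.match(/a{1,3}/)[0];  // "aaa": greedy, takes as many as allowed
const atLeastOne = run.match(/a{1,}/)[0];   // "aaaa"
const optional = run.match(/a?/)[0];        // "a"

// On "baaa", a* matches zero "a"s at index 0, so the match is empty,
// while a+ has to skip ahead to the first "a".
const starOnB = "baaa".match(/a*/)[0];      // ""
const plusOnB = "baaa".match(/a+/)[0];      // "aaa"
```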

Flags

/\d+/g // Global match (find all)
/hello/i // Case-insensitive
/hello/gi // Global + case-insensitive
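
To see the effect of the flags, here is a minimal JavaScript sketch using String.prototype.match(). The input string is invented for the example.

```javascript
const input = "Room 12, floor 3. Hello HELLO hello";

// Without g, match() returns only the first match (plus its details).
const firstNumber = input.match(/\d+/)[0];

// With g, match() returns an array of every match.
const allNumbers = input.match(/\d+/g);

// i ignores case; combined with g it finds every spelling.
const anyCase = input.match(/hello/gi);
const exactCase = input.match(/hello/g);

console.log(firstNumber, allNumbers, anyCase, exactCase);
```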

Groups & alternation

/do(g|ll)/ // Matches "dog" or "doll"
/(pa){2}/ // Matches "papa"
/(?:abc)/ // Non-capturing group
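
The difference between a capturing and a non-capturing group shows up in the match result. A short JavaScript sketch, with illustrative variable names:

```javascript
// A capturing group adds an extra entry to the match array.
const capturing = "doll".match(/do(g|ll)/);
// capturing[0] is the full match, capturing[1] is the group

// A non-capturing group groups the alternatives but stores nothing.
const nonCapturing = "doll".match(/do(?:g|ll)/);

// A quantifier after a group repeats the whole group.
const repeated = /(pa){2}/.test("papa");

console.log(capturing[0], capturing[1], nonCapturing.length, repeated);
```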

Capturing groups

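A capturing group is defined using parentheses () in a regex. The text it matched can be reused later in the same pattern with a backreference: \1 refers to the first group, \2 to the second, and so on. For example: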

/(abc)\1/
// Matches: "abcabc"
// Example: duplicate words
/\b(\w+)\s+\1\b/
// Matches: "hello hello", "test test"
// HTML Tag Matching
/<(\w+)>.*<\/\1>/
// Matches: "<div>content</div>", "<p>text</p>"
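
The backreference patterns above can be tried out like this in JavaScript. The test sentences are assumptions chosen for the demonstration.

```javascript
// \1 must repeat exactly what the first group captured.
const duplicateWord = /\b(\w+)\s+\1\b/;
const dup = "this is is a test".match(duplicateWord);
// dup[0] is the full match, dup[1] is the captured word

// The closing tag must carry the same name as the opening tag.
const pairedTag = /<(\w+)>.*<\/\1>/;
const sameTag = pairedTag.test("<div>content</div>");
const mismatchedTag = pairedTag.test("<div>content</p>");

console.log(dup[0], dup[1], sameTag, mismatchedTag);
```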

Tips & tricks

/(.+)/          // Greedy match (longest)
/(.+?)/         // Lazy match (shortest)
/\bword\b/      // Match whole word "word"
/(?=abc)/       // Positive lookahead (followed by "abc")
/(?!abc)/       // Negative lookahead (not followed by "abc")

Examples

Pattern | Matches example | Description
/^\d{5}$/ | 75001 | Validates a 5-digit postal code
/^[A-Z][a-z]+$/ | Paris | Capitalized word
/\b\w{4}\b/ | This test is fine → This, test, fine | Words with exactly 4 letters
/\d{2}\/\d{2}\/\d{4}/ | 18/06/2025 | Date format DD/MM/YYYY
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ | hello@gmail.com | Email address format
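
These patterns could be wrapped in small helper functions, as sketched below in JavaScript. The helper names are hypothetical. Keep in mind that the patterns check the shape of the text only: the date pattern, for instance, accepts "99/99/9999".

```javascript
// Hypothetical helpers built from the example patterns.
const isPostalCode = (s) => /^\d{5}$/.test(s);
const isCapitalized = (s) => /^[A-Z][a-z]+$/.test(s);
const fourLetterWords = (s) => s.match(/\b\w{4}\b/g) ?? [];
const hasDateShape = (s) => /\d{2}\/\d{2}\/\d{4}/.test(s);
const hasEmailShape = (s) =>
  /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(s);

console.log(isPostalCode("75001"));                // true
console.log(isCapitalized("Paris"));               // true
console.log(fourLetterWords("This test is fine")); // [ 'This', 'test', 'fine' ]
console.log(hasDateShape("18/06/2025"));           // true
console.log(hasEmailShape("hello@gmail.com"));     // true
```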

Replacement

The $1 syntax is used in replacement strings to refer to the first capturing group from your regex match — similar to how \1 is used in the matching pattern.

// JavaScript example
const sentence = "Hello, my name is John.";
const greeting = sentence.replace(/my name is (\w+)/, "I'm $1");
// greeting: "Hello, I'm John."

// Swap two captured groups
const fullName = "Doe, John";
const swapped = fullName.replace(/(\w+), (\w+)/, "$2 $1");
// swapped: "John Doe"

Lookahead & lookbehind

Lookarounds let you assert what comes before or after a pattern without capturing it.

Positive lookahead (?=...)

Matches a pattern only if it’s followed by something.

/\d+(?=px)/
// Matches one or more digits only if they are followed by "px"
// "10px" → matches "10" (the "px" is not part of the match)
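
A minimal JavaScript sketch of the positive lookahead, using a made-up CSS-like string. The "px" is checked but never included in the result.

```javascript
const css = "width: 10px; height: 2em; margin: 5px";

// Collect only the numbers that are directly followed by "px".
const pxValues = css.match(/\d+(?=px)/g);

console.log(pxValues); // [ '10', '5' ]
```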

Negative lookahead (?!...)

Matches a pattern only if it’s NOT followed by something.

/\d(?!px)/
// Matches a single digit only if it's NOT immediately followed by "px"
// "10px" → matches "1" (it is followed by "0p", not "px"); the "0" is rejected
// "10em" → matches "1" and "0" (with the g flag)
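
Because the lookahead is tested separately at each digit, results on multi-digit numbers can be surprising. The JavaScript sketch below runs the pattern and compares it with a stricter variant. The second pattern is an assumption added for illustration, not part of the original cheat sheet.

```javascript
// Per-digit check: each digit is tested on its own.
const perDigit = /\d(?!px)/g;

// Illustrative stricter variant: take the whole number, and refuse to stop
// before another digit or before "px".
const wholeNumber = /\b\d+(?!\d|px)/g;

const perDigitPx = "10px".match(perDigit);        // the "1" still matches
const perDigitEm = "10em".match(perDigit);
const wholeNumberPx = "10px".match(wholeNumber);  // no match at all
const wholeNumberEm = "10em".match(wholeNumber);

console.log(perDigitPx, perDigitEm, wholeNumberPx, wholeNumberEm);
```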

Positive lookbehind (?<=...)

Matches a pattern only if it’s preceded by something.

/(?<=\$)\d+/
// Matches digits only if preceded by "$"
// "$100" → matches "100"
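
A short JavaScript sketch of the positive lookbehind, assuming an engine that supports lookbehind (recent versions of Node.js and the major browsers do). The receipt text is invented.

```javascript
const receipt = "Total: $100 for 3 items";

// Only numbers directly preceded by "$" are returned; the "$" itself is not.
const prices = receipt.match(/(?<=\$)\d+/g);

console.log(prices); // [ '100' ]
```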

Negative lookbehind (?<!...)

Matches a pattern only if it’s NOT preceded by something.

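/(?<!\$)\d+/
// Matches digits only if NOT immediately preceded by "$"
// "100" → matches "100"
// "$100" → matches "00" (the "1" is rejected, but the first "0" is preceded by "1", not "$")
// To reject the whole number, also exclude a preceding digit: /(?<![$\d])\d+/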
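
The negative lookbehind is checked at each possible starting position, so a match can begin in the middle of a number. The JavaScript sketch below shows this, along with an illustrative stricter pattern that also rules out a preceding digit. It assumes an engine with lookbehind support.

```javascript
const notAfterDollar = /(?<!\$)\d+/g;

// Illustrative variant: a match may not start right after "$" or a digit.
const notPriced = /(?<![$\d])\d+/g;

const plain = "100".match(notAfterDollar);     // whole number matches
const partial = "$100".match(notAfterDollar);  // matching starts at the first "0"
const strictPlain = "100".match(notPriced);
const strictPriced = "$100".match(notPriced);  // nothing matches

console.log(plain, partial, strictPlain, strictPriced);
```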

Notes

  • Lookbehind is not supported in all environments (e.g., older JavaScript engines).
  • Lookarounds are zero-width assertions: they don’t consume characters in the match.

Performance tips

Regular expressions can become slow or inefficient if not written carefully. Here are some best practices and examples to help you avoid common pitfalls:

Avoid catastrophic backtracking

Some patterns can cause the regex engine to try too many combinations, especially with nested quantifiers.

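// BAD: prone to catastrophic backtracking
/(a+)+b/
// Input: "aaaaaaaaaaaaaaaaaaaaa" (no "b") → very slow: the engine tries every way of splitting the "a"s between the inner and outer +
// ✅ Fix: remove the nested quantifier; this matches the same strings:
/a+b/
// A non-capturing group, /(?:a+)+b/, does NOT help: the nested quantifier is still there.
// Atomic groups, /(?>a+)+b/, prevent the backtracking in engines that support them (e.g. PCRE, Java, .NET); JavaScript does not.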
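
As a sanity check, the JavaScript sketch below compares the nested pattern with a flattened one, /a+b/, on a few short inputs to show that they accept the same strings. The inputs are deliberately short so the nested version stays fast. The sample list is illustrative.

```javascript
const nested = /(a+)+b/; // nested quantifiers: slow on long runs of "a" with no "b"
const flat = /a+b/;      // same language, no nesting

const samples = ["ab", "aaab", "b", "aaa", "xaab"];

// Both patterns should give the same verdict on every sample.
const agree = samples.every((s) => nested.test(s) === flat.test(s));
const verdicts = samples.map((s) => flat.test(s));

console.log(agree, verdicts);
```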

Use anchors when possible

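Anchors like ^ and $ tell the engine where a match may start or end, so it can skip positions that cannot match. Keep in mind that an anchor also changes what the pattern matches.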

// Without anchor: checks every position
/\d{5}/
// With anchor: faster if you expect the number at the start
/^\d{5}/

Avoid greedy wildcards when not needed

// Greedy: matches everything until the last </div>
/<div>.*<\/div>/
// Lazy: stops at the first </div>
/<div>.*?<\/div>/
// ✅ Use *? or +? for lazy matching when appropriate.
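
The difference is easy to see on a string with two blocks. A minimal JavaScript sketch, with an invented HTML fragment:

```javascript
const html = "<div>one</div><div>two</div>";

// Greedy: .* runs to the end, then backs up to the LAST </div>.
const greedy = html.match(/<div>.*<\/div>/)[0];

// Lazy: .*? grows one character at a time and stops at the FIRST </div>.
const lazy = html.match(/<div>.*?<\/div>/)[0];

// Lazy plus the g flag returns each block separately.
const blocks = html.match(/<div>.*?<\/div>/g);

console.log(greedy, lazy, blocks);
```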
