✨ Introduction

Regex (short for regular expression) is a tool for finding, matching, or manipulating text using patterns. It’s well suited to:

  • 🔍 Spotting patterns
  • ✅ Validating input
  • 🔄 Modifying text with patterns

You can experiment with regex on this website: https://regex101.com/

In JavaScript and several other languages, regex patterns are written between slashes:

/world/
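As a rough illustration of how the slash-delimited form is used in JavaScript, the sketch below tests a made-up sample string (the variable names are only for illustration):

```javascript
// A regex literal and the equivalent RegExp constructor call.
const literal = /world/;
const constructed = new RegExp("world");

const sampleText = "hello world"; // hypothetical sample text

// test() returns true when the pattern occurs anywhere in the string.
const foundLiteral = literal.test(sampleText);
const foundConstructed = constructed.test(sampleText);

// match() returns details about the first occurrence, including its index.
const firstMatch = sampleText.match(literal);

console.log(foundLiteral, foundConstructed, firstMatch.index); // true true 6
```

The literal form is usually the more readable choice when the pattern is known in advance; the constructor is handy when the pattern is built from a string at runtime.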

🧮 Character sets

/f[ai]t/        // Matches "fat", "fit"
/fat|fit/       // Matches "fat" or "fit" (alternation, same result as above)
/f[a-z]t/       // Matches "fat", "fbt", ..., "fzt" (any lowercase letter)
/f[a-z0-9]t/    // Matches "fat", ..., "fzt", "f0t", ..., "f9t" (lowercase letter or digit)
/f[A-Za-z0-9]t/ // Matches "fAt", ..., "fat", ..., "f9t" (any letter or digit)
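To see these sets in action, here is a small sketch that filters a made-up word list. It wraps each pattern in ^ and $ (see the Anchors section) so the whole word has to match:

```javascript
// Hypothetical word list used only for this demonstration.
const words = ["fat", "fit", "fot", "f7t", "fAt"];

// Only "a" or "i" allowed in the middle.
const onlyAorI = words.filter((w) => /^f[ai]t$/.test(w));

// Any lowercase letter or digit in the middle.
const lowerOrDigit = words.filter((w) => /^f[a-z0-9]t$/.test(w));

// Any letter (either case) or digit in the middle.
const anyLetterOrDigit = words.filter((w) => /^f[A-Za-z0-9]t$/.test(w));

console.log(onlyAorI);         // ["fat", "fit"]
console.log(lowerOrDigit);     // ["fat", "fit", "fot", "f7t"]
console.log(anyLetterOrDigit); // ["fat", "fit", "fot", "f7t", "fAt"]
```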

✨ Special characters

/./ // Any character except newline
/\./ // A literal dot
/\d/ // Any digit (0–9)
/\D/ // Any non-digit
/\w/ // Word character (a-z, A-Z, 0-9, _)
/\W/ // Non-word character
/\s/ // Whitespace (space, tab, newline)
/\S/ // Non-whitespace
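A quick sketch of these shorthand classes in JavaScript, run against an invented sample string:

```javascript
// Hypothetical sample text.
const sample = "Room 42, v1.5_beta";

// Every digit, one match per character.
const digits = sample.match(/\d/g);

// Every non-word character (the underscore counts as a word character).
const nonWordChars = sample.match(/\W/g);

// A literal dot has to be escaped, otherwise "." means "any character".
const literalDots = sample.match(/\./g);

// Split on single whitespace characters.
const pieces = sample.split(/\s/);

console.log(digits);       // ["4", "2", "1", "5"]
console.log(nonWordChars); // [" ", ",", " ", "."]
console.log(literalDots);  // ["."]
console.log(pieces);       // ["Room", "42,", "v1.5_beta"]
```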

⚓ Anchors

/^abc/      // Starts with "abc"
/abc$/      // Ends with "abc"
/^$/        // Empty string
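The sketch below shows how the anchors behave with JavaScript's test(); the isEmpty helper is a made-up name for illustration:

```javascript
// ^ pins the match to the start of the string.
const startsWithAbc = /^abc/.test("abcdef");

// $ pins the match to the end of the string.
const endsWithAbc = /abc$/.test("xyzabc");

// "abc" is present here, but not at the start, so ^abc fails.
const notAtStart = /^abc/.test("xabc");

// Hypothetical helper: true only for the empty string.
const isEmpty = (s) => /^$/.test(s);

console.log(startsWithAbc, endsWithAbc, notAtStart); // true true false
console.log(isEmpty(""), isEmpty(" "));              // true false
```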

🚫 Negation

/[^a-z]/ // Any character NOT a lowercase letter
/[^-]/ // Any character except "-"
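As a small sketch of negated sets, the example below uses an invented slug string. The second pattern adds the + quantifier (covered in the next section) to grab whole runs of non-dash characters:

```javascript
// Hypothetical sample value.
const slug = "hello-World-2024";

// Every character that is not a lowercase letter.
const notLowercase = slug.match(/[^a-z]/g);

// Runs of characters that are not "-".
const nonDashRuns = slug.match(/[^-]+/g);

console.log(notLowercase); // ["-", "W", "-", "2", "0", "2", "4"]
console.log(nonDashRuns);  // ["hello", "World", "2024"]
```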

🔢 Quantifiers

/a{3}/ // Exactly 3 "a"s
/a{1,3}/ // Between 1 and 3 "a"s
/a{1,}/ // At least 1 "a"
/a*/ // 0 or more "a"s
/a+/ // 1 or more "a"s
/a?/ // 0 or 1 "a"
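To compare the quantifiers side by side, this sketch filters a made-up list of strings. Each pattern is anchored with ^ and $ so the count applies to the whole string:

```javascript
// Hypothetical inputs: zero to four "a"s.
const inputs = ["", "a", "aa", "aaa", "aaaa"];

const exactlyThree = inputs.filter((s) => /^a{3}$/.test(s));
const oneToThree = inputs.filter((s) => /^a{1,3}$/.test(s));
const oneOrMore = inputs.filter((s) => /^a+$/.test(s));
const zeroOrMore = inputs.filter((s) => /^a*$/.test(s));
const zeroOrOne = inputs.filter((s) => /^a?$/.test(s));

console.log(exactlyThree); // ["aaa"]
console.log(oneToThree);   // ["a", "aa", "aaa"]
console.log(oneOrMore);    // ["a", "aa", "aaa", "aaaa"]
console.log(zeroOrMore);   // ["", "a", "aa", "aaa", "aaaa"]
console.log(zeroOrOne);    // ["", "a"]
```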

🏳️ Flags

/\d+/g // Global match (find all)
/hello/i // Case-insensitive
/hello/gi // Global + case-insensitive
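Here is a short sketch of how the flags change the result of match() in JavaScript, using an invented sample string:

```javascript
// Hypothetical sample text.
const text = "Hello 12 hello 345";

// Without g: only the first match is returned (with extra details).
const firstNumber = text.match(/\d+/);

// With g: an array of every match.
const allNumbers = text.match(/\d+/g);

// g + i: every "hello", regardless of case.
const allHellos = text.match(/hello/gi);

console.log(firstNumber[0]); // "12"
console.log(allNumbers);     // ["12", "345"]
console.log(allHellos);      // ["Hello", "hello"]
```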

🧩 Groups & alternation

/do(g|ll)/ // Matches "dog" or "doll"
/(pa){2}/ // Matches "papa"
/(?:abc)/ // Non-capturing group

📦 Capturing groups

A capturing group is defined using parentheses () in a regex. The engine remembers the text each group matched, and \1 refers back to whatever the first group matched. For example:

// Repeated text
/(abc)\1/
// Matches: "abcabc"

// Duplicate words
/\b(\w+)\s+\1\b/
// Matches: "hello hello", "test test"

// HTML tag matching
/<(\w+)>.*<\/\1>/
// Matches: "<div>content</div>", "<p>text</p>"
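In JavaScript, the text captured by each group is available on the match result. A minimal sketch, with made-up input strings:

```javascript
// Duplicate-word detection: group 1 holds the repeated word.
const duplicate = "this is is a test".match(/\b(\w+)\s+\1\b/);

// Tag matching: group 1 holds the tag name, reused by \1 in the closing tag.
const tag = "<p>text</p>".match(/<(\w+)>.*<\/\1>/);

// Mismatched tags do not match, because \1 must equal the opening name.
const mismatched = "<p>text</div>".match(/<(\w+)>.*<\/\1>/);

console.log(duplicate[0], "|", duplicate[1]); // "is is | is"
console.log(tag[1]);                          // "p"
console.log(mismatched);                      // null
```

Index 0 of the result is the full match; indexes 1 and up are the capturing groups in order of their opening parenthesis.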

🧠 Tips & tricks

/(.+)/          // Greedy match (longest)
/(.+?)/         // Lazy match (shortest)
/\bword\b/      // Match whole word "word"
/(?=abc)/       // Positive lookahead (followed by "abc")
/(?!abc)/       // Negative lookahead (not followed by "abc")
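The difference between greedy and lazy matching, and the effect of \b, is easiest to see on a concrete string. A small sketch with invented input:

```javascript
// Hypothetical sample text with two quoted parts.
const quoted = 'say "one" and "two"';

// Greedy: .+ grabs as much as possible, so it runs to the LAST quote.
const greedy = quoted.match(/"(.+)"/)[1];

// Lazy: .+? grabs as little as possible, so it stops at the FIRST closing quote.
const lazy = quoted.match(/"(.+?)"/)[1];

// \b requires a word boundary on both sides.
const wholeWord = /\bword\b/.test("a word here");
const insideWord = /\bword\b/.test("swordfish");

console.log(greedy);                // one" and "two
console.log(lazy);                  // one
console.log(wholeWord, insideWord); // true false
```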

🧪 Examples

Pattern | Matches example | Description
/^\d{5}$/ | 75001 | Validates a 5-digit postal code
/^[A-Z][a-z]+$/ | Paris | Capitalized word
/\b\w{4}\b/ | This test is fine → This, test, fine | Words with exactly 4 letters
/\d{2}\/\d{2}\/\d{4}/ | 18/06/2025 | Date format DD/MM/YYYY
/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ | hello@gmail.com | Email address format
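These patterns could be wrapped in small helper functions, as in the sketch below. The function names are hypothetical, and the email pattern is only a format check, not proof that an address exists:

```javascript
// Hypothetical validation helpers built from the patterns above.
const isPostalCode = (s) => /^\d{5}$/.test(s);
const isCapitalizedWord = (s) => /^[A-Z][a-z]+$/.test(s);
const looksLikeEmail = (s) =>
  /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/.test(s);

// The g flag is needed to collect every 4-character word, not just the first.
const fourLetterWords = "This test is fine".match(/\b\w{4}\b/g);

console.log(isPostalCode("75001"), isPostalCode("7500"));    // true false
console.log(isCapitalizedWord("Paris"));                     // true
console.log(fourLetterWords);                                // ["This", "test", "fine"]
console.log(looksLikeEmail("hello@gmail.com"));              // true
console.log(looksLikeEmail("hello@gmail"));                  // false
```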

♻️ Replacement

The $1 syntax is used in replacement strings to refer to the first capturing group from your regex match — similar to how \1 is used in the matching pattern.

// JavaScript example
const sentence = "Hello, my name is John.";
const greeting = sentence.replace(/my name is (\w+)/, "I'm $1");
// greeting: "Hello, I'm John."

const fullName = "Doe, John";
const swapped = fullName.replace(/(\w+), (\w+)/, "$2 $1");
// swapped: "John Doe"
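Two further variations are worth sketching: combining group references with the g flag to rewrite every occurrence, and passing a function instead of a replacement string. The input strings are invented for illustration:

```javascript
// Reorder every DD/MM/YYYY date into YYYY-MM-DD using $1, $2, $3.
const euDates = "Due 18/06/2025, shipped 02/07/2025";
const isoDates = euDates.replace(/(\d{2})\/(\d{2})\/(\d{4})/g, "$3-$2-$1");

// A replacer function receives the full match, then each captured group.
const lowerName = "john doe";
const capitalized = lowerName.replace(/\b(\w)/g, (match, first) =>
  first.toUpperCase()
);

console.log(isoDates);    // "Due 2025-06-18, shipped 2025-07-02"
console.log(capitalized); // "John Doe"
```

A replacer function is useful when the new text cannot be expressed with $n references alone, such as when changing case.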

🔍 Lookahead & lookbehind

Lookarounds let you assert what comes before or after a pattern without capturing it.

Positive lookahead (?=...)

Matches a pattern only if it’s followed by something.

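/\d+(?=px)/
// Matches one or more digits only if they're followed by "px"
// "10px" → matches "10" (the "px" is not part of the match)
// "10em" → matches nothing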

Negative lookahead (?!...)

Matches a pattern only if it’s NOT followed by something.

/\d(?!px)/
// Matches a single digit only if it's NOT immediately followed by "px"
// "10px" → matches "1" (only the "0" is rejected, because "px" follows it)
// "10em" → matches "1" and "0" (with the g flag)

Positive lookbehind (?<=...)

Matches a pattern only if it’s preceded by something.

/(?<=\$)\d+/
// Matches digits only if preceded by "$"
// "$100" → matches "100"

Negative lookbehind (?<!...)

Matches a pattern only if it’s NOT preceded by something.

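/(?<!\$)\d+/
// Matches digits only if NOT immediately preceded by "$"
// "100" → matches "100"
// "$100" → matches "00" (the "1" is rejected, so the match starts at the next digit)
// To reject "$100" entirely, also exclude a preceding digit: /(?<![$\d])\d+/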

Notes

  • Lookbehind is not supported in all environments (e.g., older JavaScript engines).
  • Lookarounds are zero-width assertions: they don’t consume characters in the match.
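The four lookarounds can be tried out in JavaScript as sketched below, assuming an engine that supports lookbehind. The sample strings are made up, and the two negative patterns also exclude a neighboring digit so that a partial number is not matched:

```javascript
// Hypothetical sample strings.
const css = "width: 10px; margin: 2em; height: 300px";
const order = "cost $100, qty 25";

// Positive lookahead: numbers followed by "px" ("px" itself is not captured).
const pxValues = css.match(/\d+(?=px)/g);

// Negative lookahead: numbers NOT followed by "px" (or by another digit).
const nonPxValues = css.match(/\d+(?!\d|px)/g);

// Positive lookbehind: numbers preceded by "$".
const prices = order.match(/(?<=\$)\d+/g);

// Negative lookbehind: numbers NOT preceded by "$" (or by another digit).
const nonPrices = order.match(/(?<![$\d])\d+/g);

console.log(pxValues);    // ["10", "300"]
console.log(nonPxValues); // ["2"]
console.log(prices);      // ["100"]
console.log(nonPrices);   // ["25"]
```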

⚠️ Performance tips

Regular expressions can become slow or inefficient if not written carefully. Here are some best practices and examples to help you avoid common pitfalls:

Avoid catastrophic backtracking

Some patterns can cause the regex engine to try too many combinations, especially with nested quantifiers.

// BAD: prone to catastrophic backtracking
/(a+)+b/
// Input: "aaaaaaaaaaaaaaaaaaaaa" (no "b") → very slow!
// ✅ Fix: remove the nested quantifier, or use atomic groups where the engine supports them:
/a+b/  // Accepts the same strings with a single quantifier
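One way to address the nested quantifier is to collapse it into a single one. The sketch below checks, on a few short made-up inputs, that /a+b/ accepts the same strings as /(a+)+b/, and then runs only the simplified pattern on a longer input (the nested version is deliberately not run on it):

```javascript
const nested = /(a+)+b/; // nested quantifier: risky on long non-matching input
const simple = /a+b/;    // single quantifier: same accepted strings

// Short hypothetical inputs, safe for both patterns.
const samples = ["ab", "aaab", "b", "aaa", "xaab"];
const sameResults = samples.every((s) => nested.test(s) === simple.test(s));

// A long run of "a" with no "b": the simplified pattern fails quickly.
const longRun = "a".repeat(30);
const simpleOnLongRun = simple.test(longRun);

console.log(sameResults);     // true
console.log(simpleOnLongRun); // false
```

Note that switching a capturing group to a non-capturing one, as in /(?:a+)+b/, does not help by itself: the nested quantifier is what causes the backtracking.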

Use anchors when possible

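Anchors like ^ and $ tell the engine where a match may start or end, so it can skip positions that cannot match. Keep in mind that an anchor also changes what the pattern accepts, so add one only when the position really matters.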

// Without anchor: checks every position
/\d{5}/
// With anchor: faster if you expect the number at the start
/^\d{5}/
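Because an anchored pattern only accepts a match at the start of the string, the two patterns are not interchangeable. A brief sketch with invented inputs:

```javascript
// Hypothetical input where the 5-digit number is NOT at the start.
const logLine = "id=12345 zip=75001";

const foundAnywhere = /\d{5}/.test(logLine); // scans every position
const foundAtStart = /^\d{5}/.test(logLine); // only checks position 0

// Hypothetical input where the number IS at the start.
const address = "75001 Paris";
const addressAtStart = /^\d{5}/.test(address);

console.log(foundAnywhere, foundAtStart, addressAtStart); // true false true
```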

Avoid greedy wildcards when not needed

// Greedy: matches everything until the last </div>
/<div>.*<\/div>/
// Lazy: stops at the first </div>
/<div>.*?<\/div>/
// ✅ Use *? or +? for lazy matching when appropriate.
