Regular expressions (regex) are one of the most powerful tools in a developer's arsenal. They allow you to describe and match patterns in text with remarkable precision. From validating user input to extracting data from logs, from search-and-replace operations to URL routing, regex is used everywhere in software development. However, regex has a reputation for being difficult to learn and even harder to read. This cheat sheet aims to change that by providing clear explanations, practical examples, and a structured reference you can come back to again and again.
Whether you are a beginner writing your first pattern or an experienced developer looking for a quick reference, this guide covers everything you need to know about regular expressions. You can practice all the patterns below using the free DevBox Regex Tester, which provides real-time matching and detailed pattern breakdowns.
What Are Regular Expressions?
A regular expression is a sequence of characters that defines a search pattern. Think of it as a specialized language for describing text patterns. When you apply a regex to a string, the regex engine scans the string and finds all substrings that match the pattern you described.
For example, the regex \d{3}-\d{3}-\d{4} matches phone numbers in the format 555-123-4567. The \d matches any digit, {3} specifies exactly three occurrences, and the literal hyphens match themselves. Simple, right? Let's build on this foundation.
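To see that pattern run, here is a minimal JavaScript sketch. The sample sentence and the variable names are made up for illustration.

```javascript
// Pattern from the text: three digits, hyphen, three digits, hyphen, four digits.
const phonePattern = /\d{3}-\d{3}-\d{4}/;

// Hypothetical input string.
const sample = "Call 555-123-4567 before noon.";
const found = sample.match(phonePattern);

console.log(found[0]);    // "555-123-4567"
console.log(found.index); // 5 (the match starts after "Call ")

// Too few digits after the first hyphen, so there is no match.
console.log(phonePattern.test("555-1234")); // false
```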
Basic Syntax
Literal Characters
Most characters in a regex match themselves literally. The regex hello matches the exact character sequence "hello" wherever it appears in the input. Letters, digits, and many special characters are literal by default.
Metacharacters
Some characters have special meaning in regex. These metacharacters are the building blocks of patterns:
- .: Matches any character except a newline
- ^: Matches the beginning of a string (or line in multiline mode)
- $: Matches the end of a string (or line in multiline mode)
- *: Matches zero or more of the preceding element
- +: Matches one or more of the preceding element
- ?: Matches zero or one of the preceding element
- {n}: Matches exactly n occurrences
- {n,}: Matches n or more occurrences
- {n,m}: Matches between n and m occurrences
- [abc]: Character set: matches any one of the characters a, b, or c
- [^abc]: Negated set: matches any character except a, b, or c
- [a-z]: Range: matches any lowercase letter from a to z
- \d: Matches any digit (equivalent to [0-9])
- \D: Matches any non-digit
- \w: Matches any word character (letters, digits, underscore)
- \W: Matches any non-word character
- \s: Matches any whitespace character (space, tab, newline)
- \S: Matches any non-whitespace character
- \b: Word boundary
- \: Escapes a special character
Quantifiers
Quantifiers specify how many times a pattern element should match. The three basic quantifiers are * (zero or more), + (one or more), and ? (zero or one). Curly braces give you precise control: {3} matches exactly 3 times, {2,5} matches 2 to 5 times, and {3,} matches 3 or more times.
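A quick way to get a feel for quantifiers is to run each one against the same inputs. The sketch below wraps every pattern in ^ and $ so the whole string has to match. The helper name acceptedBy and the input list are illustrative only.

```javascript
// Hypothetical inputs: zero, one, two, three, and six copies of "a".
const inputs = ["", "a", "aa", "aaa", "aaaaaa"];

// Returns the inputs that the pattern accepts in full.
function acceptedBy(pattern) {
  return inputs.filter((s) => pattern.test(s));
}

console.log(acceptedBy(/^a*$/));     // ["", "a", "aa", "aaa", "aaaaaa"]
console.log(acceptedBy(/^a+$/));     // ["a", "aa", "aaa", "aaaaaa"]
console.log(acceptedBy(/^a?$/));     // ["", "a"]
console.log(acceptedBy(/^a{3}$/));   // ["aaa"]
console.log(acceptedBy(/^a{2,5}$/)); // ["aa", "aaa"]
console.log(acceptedBy(/^a{3,}$/));  // ["aaa", "aaaaaa"]
```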
Anchors
Anchors do not match characters — they match positions. ^ asserts that the match must occur at the start of the string, and $ asserts that the match must occur at the end. For example, ^hello matches "hello" only at the beginning of a string, and world$ matches "world" only at the end.
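The following sketch checks both anchors in JavaScript, and shows how the m flag changes what ^ means. The test strings are made up.

```javascript
const startsWithHello = /^hello/;
const endsWithWorld = /world$/;

console.log(startsWithHello.test("hello world")); // true
console.log(startsWithHello.test("say hello"));   // false: "hello" is not at the start
console.log(endsWithWorld.test("hello world"));   // true
console.log(endsWithWorld.test("world peace"));   // false: "world" is not at the end

// With the m (multiline) flag, ^ also matches right after a newline.
const twoLines = "first line\nhello again";
console.log(/^hello/.test(twoLines));  // false
console.log(/^hello/m.test(twoLines)); // true
```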
Groups and Capturing
Parentheses create groups. A capturing group (pattern) not only groups elements together but also captures the matched text for later reference. A non-capturing group (?:pattern) groups without capturing. Named groups (?<name>pattern) let you reference captures by name instead of index.
// Capturing group example
(\d{3})-(\d{3})-(\d{4})
// Matches "555-123-4567"
// Group 1: "555", Group 2: "123", Group 3: "4567"
// Non-capturing group example
(?:https?://)?(?:www\.)?example\.com
// Matches "example.com", "www.example.com", "https://example.com"
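Named groups are mentioned above but not shown, so here is a small JavaScript sketch. The group names area, prefix, and line are invented for this example. Inside a regex literal the forward slashes of the URL pattern need escaping.

```javascript
// Named capturing groups: same phone pattern, but each part gets a label.
const phone = /(?<area>\d{3})-(?<prefix>\d{3})-(?<line>\d{4})/;
const m = "555-123-4567".match(phone);

console.log(m.groups.area);   // "555"
console.log(m.groups.prefix); // "123"
console.log(m.groups.line);   // "4567"
console.log(m[1]);            // "555" (named groups are still numbered)

// Non-capturing groups: the result holds only the full match, no group entries.
const site = /(?:https?:\/\/)?(?:www\.)?example\.com/;
const siteMatch = "https://example.com".match(site);

console.log(siteMatch[0]);     // "https://example.com"
console.log(siteMatch.length); // 1
```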
Common Regex Patterns
Email Validation
^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$
This pattern matches most common email formats. It requires one or more valid characters before the @ symbol, a domain name with at least one dot, and a top-level domain of at least two letters. Note that fully RFC 5322 compliant email validation is far more complex; treat this pattern as a pragmatic check for common address shapes, not as proof that an address is valid or deliverable.
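As a rough sketch of how the pattern behaves in JavaScript, the snippet below runs it against a few made-up addresses. The helper name looksLikeEmail is hypothetical. The last case shows one known gap: the domain part accepts consecutive dots.

```javascript
const emailPattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

// Hypothetical helper: true when the input has the expected overall shape.
function looksLikeEmail(input) {
  return emailPattern.test(input);
}

console.log(looksLikeEmail("user@example.com"));              // true
console.log(looksLikeEmail("first.last+tag@sub.example.co")); // true
console.log(looksLikeEmail("user@example"));                  // false: no dot in the domain
console.log(looksLikeEmail("user@@example.com"));             // false: two @ symbols

// Limitation: "." is allowed anywhere in the domain class, so this passes.
console.log(looksLikeEmail("user@example..com"));             // true
```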
Phone Number (US Format)
^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$
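This pattern matches US phone numbers in various formats: (555) 123-4567, 555-123-4567, 555.123.4567, and 5551234567. The parentheses around the area code are optional, and each separator may be a hyphen, a dot, a whitespace character, or nothing at all. Because the two parentheses are optional independently, the pattern also accepts unbalanced input such as (555 123-4567.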
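Here is a short sketch that runs the pattern over the formats listed above, plus two extra inputs chosen for illustration. Stripping non-digits afterwards is one possible way to normalize an accepted number; it is an addition for this example, not part of the pattern.

```javascript
const usPhone = /^\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}$/;

console.log(usPhone.test("(555) 123-4567")); // true
console.log(usPhone.test("555-123-4567"));   // true
console.log(usPhone.test("555.123.4567"));   // true
console.log(usPhone.test("5551234567"));     // true
console.log(usPhone.test("555-1234"));       // false: only seven digits

// The opening parenthesis is optional on its own, so this also passes.
console.log(usPhone.test("(555 123-4567"));  // true

// One way to normalize an accepted number: drop everything that is not a digit.
const digitsOnly = "(555) 123-4567".replace(/\D/g, "");
console.log(digitsOnly); // "5551234567"
```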
URL Matching
https?://(?:www\.)?[a-zA-Z0-9-]+\.[a-zA-Z]{2,}(?:/[\w.-]*)?
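This pattern matches HTTP and HTTPS URLs with an optional www prefix, a domain name, and an optional single path segment. It handles the simplest URL structures only: it does not fully match additional subdomains, ports, multi-segment paths, or query strings, and it does not cover every case defined in RFC 3986.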
IPv4 Address
^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$
This pattern validates IPv4 addresses by ensuring each octet is between 0 and 255. It rejects invalid addresses like 256.1.1.1 or 1.1.1.999.
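A quick check of the pattern in JavaScript, using a few sample addresses picked for illustration. The last line shows a detail worth knowing: the [01]?\d\d? branch accepts octets with a leading zero.

```javascript
const ipv4 = /^((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)$/;

console.log(ipv4.test("192.168.0.1"));     // true
console.log(ipv4.test("255.255.255.255")); // true
console.log(ipv4.test("256.1.1.1"));       // false: first octet is above 255
console.log(ipv4.test("1.1.1.999"));       // false: last octet is above 255
console.log(ipv4.test("1.2.3"));           // false: only three octets

// Leading zeros are accepted by this pattern.
console.log(ipv4.test("01.2.3.4"));        // true
```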
Date Format (YYYY-MM-DD)
^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
This pattern matches dates in ISO 8601 format. It validates that the month is between 01 and 12, and the day is between 01 and 31. Note that it does not account for the varying number of days in different months — additional logic would be needed for complete date validation.
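One possible shape for that additional logic is sketched below: the regex checks the format, then a round trip through JavaScript's Date catches days that do not exist. The helper name isRealDate is hypothetical, and the year is wrapped in an extra capturing group so it can be read back. This sketch rejects years 0000 to 0099, because Date.UTC maps those to 1900 to 1999.

```javascript
// The pattern from above, with the year captured as group 1.
const isoDate = /^(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/;

// Hypothetical helper: format check via regex, calendar check via Date.
function isRealDate(text) {
  const m = isoDate.exec(text);
  if (m === null) return false;

  const [year, month, day] = m.slice(1).map(Number);

  // Date.UTC rolls an impossible day over into the next month,
  // so comparing the parts afterwards exposes it.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}

console.log(isoDate.test("2023-02-30")); // true: the regex alone accepts it
console.log(isRealDate("2023-02-30"));   // false
console.log(isRealDate("2024-02-29"));   // true: 2024 is a leap year
console.log(isRealDate("2023-02-29"));   // false
console.log(isRealDate("2024-04-31"));   // false: April has 30 days
```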
Strong Password
^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$
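This pattern enforces a strong password policy: at least 8 characters, with at least one lowercase letter, one uppercase letter, one digit, and one special character from the set @$!%*?&. The lookaheads (?=...) check for the presence of each character type without consuming characters. The final character class then restricts the whole password to letters, digits, and those seven special characters, so a password containing anything else (a space or #, for example) is rejected.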
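The single pattern only answers yes or no. As an illustration, the sketch below splits the same policy into separate checks so you can see which rule a password breaks. The rule names and the failedRules helper are invented for this example.

```javascript
const strongPassword =
  /^(?=.*[a-z])(?=.*[A-Z])(?=.*\d)(?=.*[@$!%*?&])[A-Za-z\d@$!%*?&]{8,}$/;

// The same policy, one rule per regex (names are illustrative).
const rules = {
  lowercase: /[a-z]/,
  uppercase: /[A-Z]/,
  digit: /\d/,
  special: /[@$!%*?&]/,
  allowedCharsAndLength: /^[A-Za-z\d@$!%*?&]{8,}$/,
};

// Returns the names of the rules the password does not satisfy.
function failedRules(password) {
  return Object.keys(rules).filter((name) => !rules[name].test(password));
}

console.log(strongPassword.test("Passw0rd!")); // true
console.log(failedRules("Passw0rd!"));         // []
console.log(failedRules("password1!"));        // ["uppercase"]
console.log(failedRules("Passw0rd#"));         // ["special", "allowedCharsAndLength"]
```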
Regex Flags Explained
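Flags modify the behavior of the regex engine. In JavaScript, they are appended after the closing delimiter of a regex literal (e.g., /pattern/g). The flags below are the JavaScript ones; other languages offer similar options under different names.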
- g (global): Finds all matches in the string, not just the first one. Without this flag, the regex engine stops after the first match.
- i (case-insensitive): Makes the match case-insensitive. The pattern /hello/i matches "hello", "Hello", "HELLO", and any other case variation.
- m (multiline): Makes ^ and $ match the beginning and end of each line, not just the entire string.
- s (dotall): Makes the dot . match newline characters as well. By default, the dot matches everything except newlines.
- u (unicode): Enables full Unicode support. This is important when working with international text that includes characters outside the ASCII range.
- y (sticky): Matches only from the index indicated by the lastIndex property. Useful for incremental parsing.
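The flags above can be tried out with a few one-liners. This is a small sketch with made-up input strings. The emoji is written as an escape so the snippet stays ASCII.

```javascript
const text = "Hello\nhello\nHELLO";

// g and i: all matches, ignoring case.
console.log(text.match(/hello/g));  // ["hello"]
console.log(text.match(/hello/gi)); // ["Hello", "hello", "HELLO"]

// m: ^ and $ apply to each line instead of the whole string.
console.log(text.match(/^hello$/gi));  // null
console.log(text.match(/^hello$/gim)); // ["Hello", "hello", "HELLO"]

// s: the dot also matches a newline.
console.log(/a.b/.test("a\nb"));  // false
console.log(/a.b/s.test("a\nb")); // true

// u: the dot treats a character outside the BMP as one unit.
const emoji = "\u{1F600}";
console.log(/^.$/.test(emoji));  // false
console.log(/^.$/u.test(emoji)); // true

// y: the match must start exactly at lastIndex.
const sticky = /\d+/y;
sticky.lastIndex = 4;
const atFour = sticky.exec("abc 123"); // matches "123" at index 4
sticky.lastIndex = 0;
const atZero = sticky.exec("abc 123"); // null: index 0 is "a", not a digit
console.log(atFour[0], atZero);
```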
Greedy vs Lazy Matching
By default, quantifiers are greedy — they match as much as possible. For example, the regex /<.*>/ applied to the string <p>Hello</p> matches the entire string <p>Hello</p>, not just <p>. This is because the .* greedily consumes everything up to the last >.
To make a quantifier lazy (match as little as possible), append a question mark ? after it. The regex /<.*?>/ applied to the same string matches <p> first, and then </p> on subsequent matches. Lazy matching is often what you want when extracting data between delimiters.
// Greedy: matches "<p>Hello</p>"
<.*>
// Lazy: matches "<p>" then "</p>"
<.*?>
Understanding the difference between greedy and lazy matching is one of the most important concepts in regex. Getting it wrong is a common source of bugs, especially when processing HTML, XML, or other structured text formats.
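Here is the same comparison as runnable JavaScript. The third pattern, a negated character class, is not from the text above; it is a common alternative that gives the same result here without relying on lazy matching.

```javascript
const html = "<p>Hello</p>";

// Greedy: .* runs to the last ">" in the string.
const greedy = html.match(/<.*>/g);
console.log(greedy); // ["<p>Hello</p>"]

// Lazy: .*? stops at the first ">" it can.
const lazy = html.match(/<.*?>/g);
console.log(lazy); // ["<p>", "</p>"]

// Alternative: [^>]* cannot cross a ">" at all.
const negated = html.match(/<[^>]*>/g);
console.log(negated); // ["<p>", "</p>"]
```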
Lookahead and Lookbehind
Lookahead and lookbehind are zero-width assertions — they check for a pattern without consuming characters. They are incredibly useful for conditional matching.
Positive Lookahead: (?=pattern)
Asserts that the pattern must follow the current position, but does not include it in the match. For example, \d+(?=px) matches a number only if it is immediately followed by "px". In the string "12px 34em", it matches "12" but not "34".
Negative Lookahead: (?!pattern)
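Asserts that the pattern must NOT follow the current position. For example, \b(?!test)\w+\b matches whole words that do not begin with "test", so it skips both "test" and "testing". To exclude only the exact word "test", put a word boundary inside the lookahead: \b(?!test\b)\w+\b. This is useful for excluding specific patterns from a match.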
Positive Lookbehind: (?<=pattern)
Asserts that the pattern must precede the current position. For example, (?<=\$)\d+ matches a number only if it is immediately preceded by a dollar sign. In "$100 and $200", it matches "100" and "200".
Negative Lookbehind: (?<!pattern)
Asserts that the pattern must NOT precede the current position. For example, (?<!@)\b\w+@\w+\.\w+\b matches email addresses that are not preceded by an @ symbol, helping avoid matching within longer email-like strings.
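All four assertions can be exercised in a few lines of JavaScript. The input strings below are made up for illustration. Note how the negative lookahead without an inner \b rejects every word that starts with "test", not just the word "test" itself.

```javascript
// Positive lookahead: digits followed by "px".
const pxValues = "12px 34em".match(/\d+(?=px)/g);
console.log(pxValues); // ["12"]

// Negative lookahead: words that do not start with "test".
const words = "test testing contest";
const notStartingWithTest = words.match(/\b(?!test)\w+\b/g);
console.log(notStartingWithTest); // ["contest"]

// With \b inside the lookahead, only the exact word "test" is excluded.
const notExactlyTest = words.match(/\b(?!test\b)\w+\b/g);
console.log(notExactlyTest); // ["testing", "contest"]

// Positive lookbehind: digits preceded by "$".
const amounts = "$100 and $200".match(/(?<=\$)\d+/g);
console.log(amounts); // ["100", "200"]

// Negative lookbehind: email-like text not preceded by "@".
const emails = "bob@site.com @eve@site.com".match(/(?<!@)\b\w+@\w+\.\w+\b/g);
console.log(emails); // ["bob@site.com"]
```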
Practical Examples
Extracting Numbers from Text
const text = "Order 1234 has 56 items totaling $789.00";
const numbers = text.match(/\d+\.?\d*/g);
// Result: ["1234", "56", "789.00"]
Replacing Sensitive Data
const text = "SSN: 123-45-6789 and CC: 4111-1111-1111-1111";
const redacted = text.replace(/\d{3}-\d{2}-\d{4}/g, "***-**-****");
// Result: "SSN: ***-**-**** and CC: 4111-1111-1111-1111"
Splitting a String While Keeping Delimiters
const csv = "red,green,blue,yellow";
const result = csv.split(/(?<=,)/);
// Result: ["red,", "green,", "blue,", "yellow"]
Validating a Hex Color Code
^#?([0-9a-fA-F]{3}|[0-9a-fA-F]{6})$
This pattern matches 3-digit and 6-digit hex color codes, with or without the # prefix. It matches #fff, #FFFFFF, fff, and FFFFFF.
Tips for Writing Better Regex
- Start simple and build up: Begin with the most basic pattern that works, then add complexity incrementally. Test at each step.
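- Avoid double-escaping in code: In JavaScript, prefer regex literals such as /\d+/. When you build a pattern with the RegExp constructor, either double every backslash in the string or wrap the pattern in String.raw. In Python, use raw strings (r'pattern').
- Comment complex patterns: Many regex engines support the x flag, which allows whitespace and comments within the pattern. JavaScript does not have this flag, so there you can build the pattern from commented string pieces instead.
- Test thoroughly: Use the DevBox Regex Tester to test your patterns against multiple input strings, including edge cases.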
- Know when not to use regex: For parsing HTML or XML, use a proper parser. Regex is powerful, but it is not the right tool for every job.
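To make the escaping tip concrete, here is a small JavaScript sketch comparing the ways of writing the same pattern, followed by one possible way to comment a pattern by assembling it from pieces. The pattern itself and the variable names are only examples.

```javascript
// Three equivalent ways to write the same pattern.
const literal = /\d{3}-\d{4}/;
const fromString = new RegExp("\\d{3}-\\d{4}");         // backslashes doubled
const fromRaw = new RegExp(String.raw`\d{3}-\d{4}`);    // no doubling needed

console.log(literal.source === fromString.source); // true
console.log(literal.source === fromRaw.source);    // true

// Common mistake: in an ordinary string, "\d" is just the letter "d".
const broken = new RegExp("\d{3}-\d{4}");
console.log(broken.source);           // "d{3}-d{4}"
console.log(broken.test("555-1234")); // false

// Commenting a pattern by building it from labeled pieces.
const parts = [
  String.raw`\d{3}`, // first three digits
  "-",               // literal separator
  String.raw`\d{4}`, // last four digits
];
const composed = new RegExp("^" + parts.join("") + "$");
console.log(composed.test("555-1234")); // true
```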
Regular expressions are a skill that rewards practice. The more you use them, the more intuitive they become. Bookmark this cheat sheet and the DevBox Regex Tester, and you will have everything you need to master pattern matching.