
RegExp Generator & Tester

Test regex in 8 engines: JavaScript, PCRE2, Python, Go, Rust, Java, .NET, C++. Paste examples to auto-generate optimized patterns with named groups, match highlighting, and live validation. Free, no signup.

Last updated: Sep

What is a Regular Expression?

A regular expression (regex or regexp) is a sequence of characters that defines a search pattern. Regular expressions are used across programming languages and tools for string matching, validation, extraction, and replacement. They power everything from form validation and URL routing to log parsing and data extraction.

The syntax includes literal characters, metacharacters like . (any character) and * (zero or more), character classes like \d (digits) and \w (word characters), anchors like ^ and $, and grouping constructs for capturing matched substrings.

This tool helps you generate regex patterns from example strings and test them interactively across 8 different regex engines. Instead of writing complex patterns from scratch, you provide sample inputs and the tool builds an optimized pattern with named capture groups automatically.

Supported Regex Engines

This tool supports 8 regex engines covering the dominant languages and libraries. Each engine has its own syntax quirks, flag set, and feature trade-offs.

ECMAScript (JavaScript)

The native RegExp in browsers and Node.js runs entirely in your browser on its built-in JavaScript engine (V8 in Chrome, Edge, and Node.js; SpiderMonkey in Firefox; JavaScriptCore in Safari). Supports (?<name>) named groups, the d flag for match indices (ES2022), the u flag for full Unicode, the y flag for sticky matching, and lookbehind (ES2018+). The spec sets no backtracking limit, so catastrophic patterns can hang the page; this tool runs ECMAScript inside a Web Worker with a 2-second termination timer for safety.

Docs: developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp

PCRE2 (PHP)

Perl Compatible Regular Expressions 2, the engine behind PHP's preg_match, preg_replace, and preg_match_all functions. Supports (?P<name>) named groups (and also accepts (?<name>)), lookbehind whose alternatives may differ in length (bounded variable-length lookbehind arrived in PCRE2 10.43, so availability depends on the PCRE2 version bundled with your PHP build), recursion via (?R) and (?n), the J flag for duplicate group names, the x flag for free-spacing comments, and Unicode property classes. Most feature-rich engine in the lineup. This tool runs PCRE2 via a sandboxed PHP CLI subprocess with a 3-second SIGKILL timeout.

Docs: www.pcre.org/current/doc/html/pcre2syntax.html

Python (re module)

Python's standard re module. Supports (?P<name>) named groups (the original named-group syntax, which PCRE later adopted), fixed-width lookbehind only (a variable-width lookbehind raises "look-behind requires fixed-width pattern"; the third-party regex package lifts that limit), Unicode-by-default for str patterns, the x flag for verbose mode with comments, and the a flag for ASCII-only matching. No g flag: use re.findall or re.finditer for global matching. Pattern flags can be inline like (?i)pattern or passed as constants such as re.IGNORECASE.
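
As a minimal sketch of the two flag styles and of global matching without a g flag, the snippet below runs the same case-insensitive, multiline search with flag constants and with inline flags. The sample log text is made up for illustration.

```python
import re

# Hypothetical sample input, three lines of log-like text.
text = "Error: disk full\nerror: retry\nWARN: ok"

# Flags as constants: IGNORECASE plus MULTILINE so ^ matches at each line start.
by_constant = re.findall(r"^error: (\w+)", text, re.IGNORECASE | re.MULTILINE)

# The same flags written inline at the start of the pattern.
by_inline = re.findall(r"(?im)^error: (\w+)", text)

# finditer yields match objects, which expose positions as well as groups.
spans = [m.span(1) for m in re.finditer(r"(?im)^error: (\w+)", text)]

print(by_constant)  # ['disk', 'retry']
print(spans)        # [(7, 11), (24, 29)]
```

findall returns the captured group for every match, which is the closest Python equivalent of a global match in JavaScript.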

Docs: docs.python.org/3/library/re.html

Go (regexp package)

Go's standard regexp package, a Go implementation of the RE2 syntax and matching approach. Linear-time matching guarantee: match time grows linearly with input size, so there is no catastrophic backtracking regardless of pattern. Supports (?P<name>) named groups and Unicode general-category and script classes via \p{L}, \p{Greek}, etc. Note that \d, \w, and \s are ASCII-only. Does NOT support lookbehind, lookahead, or backreferences, because these features break the linear-time guarantee. Use Compile for runtime patterns, MustCompile for constants. The trade-off: Go regex is fast and DoS-resistant, but not as expressive.

Docs: pkg.go.dev/regexp/syntax

Rust (regex crate)

The regex crate, which follows the same automata-based design as RE2 (it is inspired by RE2 rather than built on it) and adds some extra features. Linear-time matching, Unicode-by-default, (?P<name>) named groups, the x flag for verbose mode. Like Go, no lookbehind, lookahead, or backreferences. Use the regex::bytes module for matching on byte strings. The crate is generally regarded as one of the fastest regex engines in mainstream use. Compile with Regex::new(), match with is_match, find, captures, or find_iter.

Docs: docs.rs/regex/

Java (java.util.regex)

Java's java.util.regex.Pattern and Matcher. Supports (?<name>) named groups, lookbehind of bounded length (the engine must be able to work out a maximum width), Unicode scripts, blocks, and categories via forms such as \p{IsGreek}, the UNICODE_CHARACTER_CLASS flag for Unicode-aware \w/\d/\s, and the UNIX_LINES flag for Unix-style newline handling. Compile with Pattern.compile, match with Matcher.find or matches. Subject to ReDoS, so this tool runs Java patterns via a JVM subprocess with a 6-second cold-start budget.

Docs: docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/regex/Pattern.html

.NET (System.Text.RegularExpressions)

C#/.NET System.Text.RegularExpressions.Regex. Supports (?<name>) named groups, variable-length lookbehind, balanced groups for matched delimiters, the n flag for explicit-capture mode (only named groups), the r flag for right-to-left matching (RegexOptions.RightToLeft), and c for CultureInvariant. The .NET engine is backtracking-based but, from .NET 7, offers a non-backtracking mode via RegexOptions.NonBacktracking for ReDoS-safe operation; that mode does not accept lookarounds or backreferences. This tool ships .NET 7+.

Docs: learn.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference

C++ (std::regex)

C++ std::regex from <regex>. Defaults to a modified ECMAScript grammar but also supports POSIX basic, POSIX extended, awk, grep, and egrep grammars via a constructor flag. Supports lookahead ((?=...) and (?!...)) and backreferences in ECMAScript mode, but NOT lookbehind. Does NOT support \p{L} or other Unicode property classes; use explicit byte-level character classes instead. Match with std::regex_match, std::regex_search, or std::regex_iterator. The common standard library implementations tend to be the slowest of the eight engines on most workloads.

Docs: en.cppreference.com/w/cpp/regex

Regex Engines Compared — Feature Matrix

Regex engine comparison: feature matrix across the 8 engines supported by this tester

Engine | Library | Named Group | Lookbehind | Backrefs | Unicode | Flags | Notable
ECMAScript | JS RegExp | (?<name>) | Yes (ES2018+) | Yes | Yes (u flag) | g i m s u y d | d-flag indices, sticky
PCRE2 | PHP preg_* | (?P<name>) | Yes (bounded; variable-length from 10.43) | Yes | Yes | i m s u x U J | Recursion, J for duplicate names
Python | re module | (?P<name>) | Yes (fixed-width only) | Yes | Yes (default str) | i m s u x a | Verbose mode, ASCII flag
Go | regexp (RE2) | (?P<name>) | No | No | Yes (\p{Greek}) | i m s U | RE2 linear-time
Rust | regex crate | (?P<name>) | No | No | Yes | i m s U x | Linear-time, Unicode default
Java | java.util.regex | (?<name>) | Yes (bounded) | Yes | Yes | i m s u x d U | UNICODE_CHARACTER_CLASS
.NET | System.Text.RegularExpressions | (?<name>) | Yes | Yes | Yes | i m s x r c n | RightToLeft, ExplicitCapture
C++ | std::regex | No native | No (lookahead only) | Yes | No \p{} | i m | ECMAScript grammar default

Why Choose This Regex Tester — 8 Engines vs Alternatives

  • vs regex101.com: 8 engines including C++ std::regex, which is not among regex101's flavors at the time of writing.
  • vs regexr.com: 6 more engines, no ads, no signup required.
  • vs regextester.com: 4× as many engines (8 vs 2 advertised).
  • vs regex-vis.com: testing + generation in one tool, not just visualization.
  • vs Pythex: 7 more languages.

How to Generate a Regex Pattern

This free online regex generator creates patterns from your example strings automatically. Follow these steps:

  1. Enter test strings — Paste or type URLs or text, one per line. The tool analyzes common patterns in your inputs.
  2. Generate regex — Click Generate or let auto-generation create a pattern that matches all your input strings.
  3. Switch engines — Toggle between 8 engines (JavaScript, PCRE2, Python, Go, Rust, Java, .NET, C++) to get the right syntax for your platform.
  4. Refine the pattern — Edit the regex directly and see live validation. Toggle flags like case-insensitive or multiline.
  5. Review results — Check match highlighting, capture groups, and statistics. Copy the final pattern.

Regex Syntax Quick Reference

Character Classes

\d digits, \w word chars, \s whitespace, [a-z] ranges, [^abc] negation.

Quantifiers

* zero+, + one+, ? optional, {n} exact, {n,m} range. Add ? for lazy.

Groups & Captures

(...) capture, (?:...) non-capture, (?<name>...) named (JS/Java/.NET), (?P<name>...) named (PCRE2/Python/Go/Rust).

Anchors & Lookaround

^ start, $ end, \b word boundary, (?=...) lookahead, (?<=...) lookbehind.

Frequently Asked Questions

Is it free?

Yes, the entire tool is free. There is no signup, no ads, no usage limits, and no upgrade tier. ECMAScript regex testing runs entirely in your browser with zero server calls — usage is unlimited. The 7 server engines (PCRE2, Python, Go, Rust, Java, .NET, C++) execute via a hardened subprocess host with a soft rate limit of about 60 requests per minute per IP to prevent abuse, but typical interactive use never hits it. Each engine runs inside a sandboxed worker with strict input validation, ReDoS isolation, and a 2-3 second SIGKILL timeout. We do not log, store, or sell any user input.

Is my data safe?

Yes. ECMAScript mode runs entirely in your browser via a Web Worker — your patterns and test inputs never leave your machine. The 7 server engines (PCRE2, Python, Go, Rust, Java, .NET, C++) send the pattern, flags, and inputs to /api/test over HTTPS, where they execute inside a hardened subprocess host with a 256 KiB body cap, strict input validation, ReDoS protection via worker_threads termination, per-engine FIFO concurrency limits, and a SIGKILL timeout. Inputs are never written to disk and are not logged. The middleware rejects null bytes and ASCII control characters before any execution. No telemetry includes user content.

Does it work offline?

Partially. ECMAScript mode works fully offline after the initial page load — the regex executor runs in a Web Worker bundled into the page, and there are no server calls. You can test JavaScript regex patterns indefinitely without a connection. The 7 server engines (PCRE2, Python, Go, Rust, Java, .NET, C++) require an active connection to /api/test, because each engine runs in its native runtime on the server inside a subprocess sandbox. Switching engines while offline will show a friendly error for those 7. To use the tool entirely offline, choose ECMAScript. Auto-save to localStorage works in either mode.

How do I write a regex for an email address?

A practical email regex that covers the common cases: ^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}$. This matches the local part (letters, digits, underscore, dot, percent, plus, hyphen), an @, the domain (letters, digits, dot, hyphen), and a top-level domain of 2+ letters. Per-engine notes: in Python, \w matches Unicode letters by default — this is fine for most use, but if you need ASCII-only behavior, use the a flag or [A-Za-z0-9_] literally. RFC 5322 allows much more (quoted strings, comments, IDN), so for true validation use a server-side library — regex alone cannot fully validate email addresses.
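
A small Python sketch of the pattern above, including the ASCII-only variant via the a flag. The helper name looks_like_email is hypothetical, and fullmatch is used so a trailing newline is not silently accepted by $.

```python
import re

EMAIL = re.compile(r"^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}$")
# Same pattern with re.ASCII, so \w means [A-Za-z0-9_] only.
EMAIL_ASCII = re.compile(r"^[\w.%+-]+@[\w.-]+\.[A-Za-z]{2,}$", re.ASCII)


def looks_like_email(s, ascii_only=False):
    """Shape check only; this does not prove the address exists."""
    pattern = EMAIL_ASCII if ascii_only else EMAIL
    return pattern.fullmatch(s) is not None


print(looks_like_email("jane.doe+tag@example.co.uk"))               # True
print(looks_like_email("jos\u00e9@example.com"))                    # True
print(looks_like_email("jos\u00e9@example.com", ascii_only=True))   # False
```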

How do I match a URL with regex?

Use this pattern (works in all 8 engines): ^https?:\/\/[\w.-]+(?::\d+)?(?:\/[^\s?#]*)?(?:\?[^\s#]*)?(?:#\S*)?$. It matches the scheme (http:// or https://), the host (letters, digits, dot, hyphen), an optional :port, an optional path, an optional query string, and an optional fragment. The negated classes prevent the path/query from accidentally consuming the next field delimiter. For practical extraction from text, drop the ^ and $ anchors. For strict URL validation including IDN domains, use URL (JS) or urllib.parse (Python) instead — regex cannot enforce all RFC 3986 rules.
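
The sketch below applies the pattern in Python for a yes/no shape check, then uses urllib.parse.urlsplit to pull out the components. The helper name is_http_url is hypothetical; the slashes are left unescaped because Python patterns have no delimiter.

```python
import re
from urllib.parse import urlsplit

# Same pattern as above; "/" needs no backslash in a Python string pattern.
URL = re.compile(
    r"^https?://[\w.-]+(?::\d+)?(?:/[^\s?#]*)?(?:\?[^\s#]*)?(?:#\S*)?$"
)


def is_http_url(s):
    """Shape check for http(s) URLs; not a full RFC 3986 validator."""
    return URL.fullmatch(s) is not None


# For the individual pieces, a URL parser is less fragile than capture groups.
parts = urlsplit("https://example.com:8443/a/b?q=1#top")
print(parts.hostname, parts.port, parts.path, parts.query, parts.fragment)
```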

How do I match a phone number with regex?

A US phone number pattern: ^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$. This matches an optional (, three digits, an optional ), an optional separator (space, dot, or hyphen), three digits, an optional separator, and four digits, covering (415) 555-1234, 415-555-1234, 415.555.1234, and 4155551234. Because each parenthesis is optional on its own, it also accepts unbalanced input such as (415 555-1234. Per-engine notes: in Python (default str pattern) \d matches Unicode digits including non-ASCII numerals; use [0-9] for strict ASCII. International phone numbers vary widely. For E.164 (+15551234567) use ^\+[1-9]\d{1,14}$: a plus sign, a non-zero first digit, and at most 15 digits in total. For strict validation across countries, use Google libphonenumber.
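
A short Python sketch of the US pattern, compiled with re.ASCII so that \d only matches 0-9. The helper names are hypothetical, and the normalization step (stripping everything but digits) is an added convenience, not something the pattern itself does.

```python
import re

US_PHONE = re.compile(r"^\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$", re.ASCII)


def is_us_phone(s):
    return US_PHONE.fullmatch(s) is not None


def digits_only(s):
    """Strip separators so all accepted spellings compare equal."""
    return re.sub(r"[^0-9]", "", s)


samples = ["(415) 555-1234", "415-555-1234", "415.555.1234", "4155551234"]
print([is_us_phone(s) for s in samples])   # [True, True, True, True]
print({digits_only(s) for s in samples})   # {'4155551234'}
# Each parenthesis is optional on its own, so unbalanced input passes too.
print(is_us_phone("(415 555-1234"))        # True
```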

How do I match an IPv4 address?

Use this pattern: ^(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|[01]?\d?\d)$. Each octet matches 0-255 via three alternatives: 25[0-5] (250-255), 2[0-4]\d (200-249), or [01]?\d?\d (0-199, including zero-padded forms such as 007 or 00). The {3} repeats the octet three times with a literal dot, and the final octet uses the same pattern without a trailing dot. All 8 engines support this, since it uses no advanced features. To match IPv6, the pattern is much longer; consider using your language's built-in IP parser (ipaddress in Python, net.ParseIP in Go, IPAddress.Parse in C#) for production.
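
To illustrate the regex next to a built-in parser, here is a minimal Python sketch. The function names are hypothetical. Note that the regex accepts zero-padded octets; recent Python versions of ipaddress reject them, so the two checks can disagree on inputs like 010.0.0.1.

```python
import ipaddress
import re

IPV4 = re.compile(
    r"^(?:(?:25[0-5]|2[0-4]\d|[01]?\d?\d)\.){3}"
    r"(?:25[0-5]|2[0-4]\d|[01]?\d?\d)$",
    re.ASCII,
)


def regex_says_ipv4(s):
    return IPV4.fullmatch(s) is not None


def parser_says_ipv4(s):
    """Delegate to the standard library parser instead of a pattern."""
    try:
        ipaddress.IPv4Address(s)
        return True
    except ValueError:
        return False


print(regex_says_ipv4("192.168.0.1"), parser_says_ipv4("192.168.0.1"))  # True True
print(regex_says_ipv4("256.1.1.1"), parser_says_ipv4("256.1.1.1"))      # False False
print(regex_says_ipv4("010.0.0.1"))  # True: leading zeros pass the regex
```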

How do I escape special regex characters?

The metacharacters in most engines are: . ^ $ * + ? ( ) [ ] { } | \. The forward slash / is not a metacharacter, but it must be escaped wherever it doubles as the pattern delimiter, as in JavaScript regex literals and PHP patterns written as /.../. To match a metacharacter literally, prefix it with a backslash: \. matches a dot, \( matches an open paren. Per-language helpers: JavaScript has RegExp.escape(str) (standardized in ES2025, so check runtime support); Python has re.escape(str); Go has regexp.QuoteMeta(str); Rust has regex::escape(str); Java has Pattern.quote(str) (which wraps the string in \Q...\E); .NET has Regex.Escape(str); PHP has preg_quote(str, "/"). C++ std::regex has no built-in helper, so write a manual character-by-character escape function. Always use the helper instead of hand-escaping when interpolating user input into a regex.
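
A minimal Python sketch of why the helper matters when user input goes into a pattern. The function name find_literal and the sample text are made up for illustration.

```python
import re


def find_literal(needle, haystack):
    """Return the start offset of every literal occurrence of needle."""
    return [m.start() for m in re.finditer(re.escape(needle), haystack)]


price_list = "a.b costs $5 (approx), axb costs $7"

# Unescaped, the dot is a wildcard, so "axb" matches as well as "a.b".
print(re.findall("a.b", price_list))            # ['a.b', 'axb']

# Escaped, only the literal text matches.
print(find_literal("a.b", price_list))          # [0]
print(find_literal("$5 (approx)", price_list))  # [10]
```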

What is the difference between greedy and lazy quantifiers?

The quantifier .* is greedy: it matches as many characters as possible, then backtracks if the rest of the pattern fails to match. The quantifier .*? is lazy (non-greedy): it matches as few characters as possible, then expands one at a time. Example on <b>foo</b><b>bar</b>: the pattern <b>.*</b> matches the entire string (greedy: <b>foo</b><b>bar</b>), while <b>.*?</b> matches just <b>foo</b> (lazy: stops at the first </b>). Use lazy when you want the match to stop at the first possible closing delimiter, for example when pulling a single tag or quoted value out of HTML, XML, or JSON text. Go and Rust use linear-time, automata-based engines that do not backtrack, but they still honor lazy quantifiers when choosing which match to report.
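
The example above can be reproduced in a few lines of Python. The third pattern, a negated character class, is an added alternative that gives the same result here without relying on laziness.

```python
import re

html = "<b>foo</b><b>bar</b>"

greedy = re.findall(r"<b>.*</b>", html)      # runs to the last </b>
lazy = re.findall(r"<b>.*?</b>", html)       # stops at the first </b> each time
negated = re.findall(r"<b>[^<]*</b>", html)  # cannot cross a "<" at all

print(greedy)   # ['<b>foo</b><b>bar</b>']
print(lazy)     # ['<b>foo</b>', '<b>bar</b>']
print(negated)  # ['<b>foo</b>', '<b>bar</b>']
```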

How do I use lookbehind in regex?

Lookbehind asserts that a position is preceded by a pattern, without consuming characters. Syntax: (?<=pattern) for positive, (?<!pattern) for negative. Example: (?<=\$)\d+ matches digits preceded by a dollar sign; on $100, it matches 100 (not $100). Per-engine support: ECMAScript (ES2018+, variable-length), .NET (variable-length), Java (bounded length), PCRE2 (fixed length per alternative, with bounded variable-length from 10.43), and Python (fixed-width only) all support lookbehind. Go and Rust do NOT, because their automata-based engines deliberately omit lookaround to keep the linear-time guarantee, and C++ std::regex does not either, since its ECMAScript grammar predates ES2018. For Go, Rust, and C++, use a capture group and post-process: \$(\d+) then read group 1.
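
A Python sketch of the dollar-sign example, next to the capture-group workaround for engines without lookbehind. The last part shows that Python's re module rejects a lookbehind whose width is not fixed; the sample text is invented.

```python
import re

text = "Total: $100, deposit $25, 7 items"

# Lookbehind: the dollar sign is required but is not part of the match.
with_lookbehind = re.findall(r"(?<=\$)\d+", text)

# Portable alternative: match the dollar sign, then read capture group 1.
with_group = re.findall(r"\$(\d+)", text)

# \s* has no fixed width, so re refuses to compile this lookbehind.
try:
    re.compile(r"(?<=\$\s*)\d+")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

print(with_lookbehind)    # ['100', '25']
print(with_group)         # ['100', '25']
print(variable_width_ok)  # False
```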

How do I match all whitespace except newlines?

Use [^\S\n] — a negated character class containing \S (non-whitespace) and \n (newline). It reads as "any character that is NOT (non-whitespace OR newline)" which equals "any whitespace that is not a newline". Matches space, tab, carriage return, and other whitespace but NOT newlines. Alternative [ \t] matches just space and tab — narrower but more explicit. Useful when you want to collapse horizontal whitespace in source code without joining lines together. All 8 engines support this pattern. Note that on Windows files with \r\n line endings, [^\S\n] will match the \r — use [^\S\r\n] to exclude both.
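
As a short Python sketch, the snippet below collapses horizontal whitespace while keeping line breaks, and shows the difference the \r makes on CRLF input. The sample strings are invented.

```python
import re

src = "a  \t b\n\n  c\td"

# Collapse each run of non-newline whitespace to a single space.
collapsed = re.sub(r"[^\S\n]+", " ", src)
print(repr(collapsed))  # 'a b\n\n c d'

crlf = "x\t\r\ny"
# With only \n excluded, the \r is treated as collapsible whitespace.
print(repr(re.sub(r"[^\S\n]+", " ", crlf)))    # 'x \ny'
# Excluding \r as well keeps the CRLF pair intact.
print(repr(re.sub(r"[^\S\r\n]+", " ", crlf)))  # 'x \r\ny'
```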

How do I match a date in YYYY-MM-DD format?

Use this pattern: ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$. It matches 4 digits for the year, a hyphen, a month (01-12) using the alternation 01-09 or 10-12, a hyphen, and a day (01-31) using 01-09 or 10-29 or 30-31. All 8 engines support this. Note: this validates the format but not the calendar — it accepts 2025-02-31 (February has no 31st). For full calendar validation, use a date-parsing library: Date.parse() (JS), datetime.strptime() (Python), time.Parse (Go), chrono::NaiveDate::parse_from_str (Rust). Regex is the wrong tool for full date validation.
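
A minimal Python sketch that combines the two checks: the regex for the shape, then datetime.strptime for the calendar. The function names are hypothetical.

```python
import re
from datetime import datetime

ISO_DATE = re.compile(
    r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$", re.ASCII
)


def has_date_shape(s):
    """True if s looks like YYYY-MM-DD; says nothing about the calendar."""
    return ISO_DATE.fullmatch(s) is not None


def is_real_date(s):
    """Shape check first, then let the date parser reject impossible days."""
    if not has_date_shape(s):
        return False
    try:
        datetime.strptime(s, "%Y-%m-%d")
        return True
    except ValueError:
        return False


print(has_date_shape("2025-02-31"), is_real_date("2025-02-31"))  # True False
print(has_date_shape("2024-02-29"), is_real_date("2024-02-29"))  # True True
```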

How do I use named capture groups in regex?

Named groups let you reference matches by name instead of index. Syntax varies by engine: ECMAScript (JavaScript), Java, and .NET use (?<name>pattern). PCRE2 (PHP), Python, Go, and Rust use (?P<name>pattern); PCRE2, Go 1.22+, and recent releases of the Rust regex crate also accept the (?<name>pattern) form, while Python accepts only the P form. C++ std::regex has no native named groups, so reference captures by numeric index only. This tool's auto-generator emits the (?P<name>) form by default and converts to (?<name>) via the pcreToEcma helper when you switch to ECMAScript. Example: parsing a date with (?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2}) (JS/Java/.NET) or (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2}) (PCRE2/Python/Go/Rust). Reference by name: match.groups.year (JS), match.group("year") (Python).
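
The date example can be sketched in Python as follows, showing the ways to read a named group and how to reuse the names in a replacement string. The input text is invented.

```python
import re

DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

m = DATE.search("released 2025-09-14 (stable)")

print(m.group("year"))  # '2025'
print(m["month"])       # '09' (index syntax on the match object)
print(m.groupdict())    # {'year': '2025', 'month': '09', 'day': '14'}

# Named groups are referenced as \g<name> in replacement strings.
reordered = DATE.sub(r"\g<day>/\g<month>/\g<year>", "2025-09-14")
print(reordered)        # '14/09/2025'
```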

Why does my regex work in JavaScript but not Python?

The most common reasons: (1) Named group syntax differs. JavaScript uses (?<name>), Python uses (?P<name>), and the P is required. (2) Lookbehind width. JavaScript ES2018+ supports variable-length lookbehind; Python's re module only allows fixed-width lookbehind and raises an error otherwise. (3) Unicode by default. Python \w, \d, \s match Unicode characters by default for str patterns; JavaScript \w and \d stay ASCII-only even with the u flag, so Unicode-aware matching there needs property escapes such as \p{L} together with the u flag. (4) Possessive quantifiers and atomic groups. Python supports *+, ++, ?+ and (?>...) as of 3.11; JavaScript has neither, so this one bites when porting in the other direction. (5) Flag spelling. JavaScript uses gimsuyd letters; Python uses re.IGNORECASE constants or an inline (?i) modifier. This tool's pcreToEcma converter handles named groups automatically.
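
For the named-group difference, a rough Python sketch of a converter in the JavaScript-to-Python direction is shown below. The function ecma_to_python is hypothetical and is not the tool's pcreToEcma helper; it is a plain text substitution, so it does not handle edge cases such as an escaped \( or a "(?<" sequence inside a character class.

```python
import re

# "(?<" that starts a named group, not a lookbehind "(?<=" or "(?<!".
_NAMED_GROUP = re.compile(r"\(\?<(?![=!])")
# JavaScript named backreference \k<name>.
_NAMED_BACKREF = re.compile(r"\\k<(\w+)>")


def ecma_to_python(pattern):
    """Rewrite JS-style named groups and backreferences for Python's re."""
    pattern = _NAMED_GROUP.sub("(?P<", pattern)
    return _NAMED_BACKREF.sub(r"(?P=\1)", pattern)


print(ecma_to_python(r"(?<year>\d{4})-(?<month>\d{2})"))
# (?P<year>\d{4})-(?P<month>\d{2})
print(ecma_to_python(r"(?<=\$)\d+"))      # unchanged: (?<=\$)\d+
print(ecma_to_python(r"(?<c>.)\k<c>"))    # (?P<c>.)(?P=c)
```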

How do I validate a strong password with regex?

A common pattern: ^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$. This uses four lookaheads to assert the string contains at least one uppercase letter, one lowercase letter, one digit, and one special character, followed by .{8,} for minimum 8 characters total. The lookaheads are zero-width assertions. Because the pattern is anchored and each lookahead scans the string at most once, it runs in roughly linear time. Per-engine notes: Go and Rust do NOT support lookahead; their engines keep simple zero-width assertions such as ^, $, and \b but omit lookaround. For Go/Rust, run four separate regex checks plus a length check. Front-end regex validation is a UX hint, not a security control. Always re-check the rules server-side, and store passwords only as hashes produced by argon2id, scrypt, or bcrypt.
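
The "four separate checks plus a length check" approach can be sketched as below. It is written in Python for consistency with the other examples, but the same structure carries over to Go or Rust because no lookahead is involved. The names CHECKS and missing_requirements are hypothetical.

```python
import re

# One simple, lookahead-free pattern per rule.
CHECKS = {
    "uppercase": re.compile(r"[A-Z]"),
    "lowercase": re.compile(r"[a-z]"),
    "digit": re.compile(r"[0-9]"),
    "special": re.compile(r"[!@#$%^&*]"),
}

# The single-pattern version from the text, for comparison.
LOOKAHEAD = re.compile(r"^(?=.*[A-Z])(?=.*[a-z])(?=.*\d)(?=.*[!@#$%^&*]).{8,}$")


def missing_requirements(password, min_length=8):
    """Return the names of the rules the password fails (empty if none)."""
    missing = [name for name, rx in CHECKS.items() if not rx.search(password)]
    if len(password) < min_length:
        missing.append("length")
    return missing


print(missing_requirements("Tr0ub4dor&3"))  # []
print(missing_requirements("password"))     # ['uppercase', 'digit', 'special']
print(missing_requirements("Ab1!"))         # ['length']
```

Separate checks also make it easy to tell the user which rule failed, which a single combined pattern cannot do.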