Regular Expressions 101

Common Tokens

  • A single character of: a, b or c
    [abc]
  • A character except: a, b or c
    [^abc]
  • A character in the range: a-z
    [a-z]
  • A character not in the range: a-z
    [^a-z]
  • A character in the range: a-z or A-Z
    [a-zA-Z]
  • Any single character
    .
  • Alternate - match either a or b
    a|b
  • Any whitespace character
    \s
  • Any non-whitespace character
    \S
  • Any digit
    \d
  • Any non-digit
    \D
  • Any word character
    \w
  • Any non-word character
    \W
  • Non-capturing group
    (?:...)
  • Capturing group
    (...)
  • Zero or one of a
    a?
  • Zero or more of a
    a*
  • One or more of a
    a+
  • Exactly 3 of a
    a{3}
  • 3 or more of a
    a{3,}
  • Between 3 and 6 of a
    a{3,6}
  • Start of string
    ^
  • End of string
    $
  • A word boundary
    \b
  • Non-word boundary
    \B
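
A minimal Python sketch of a few of the tokens listed above, using the standard `re` module. The sample strings and variable names are made up for illustration and are not part of the reference itself.

```python
import re

# Character classes: [abc] matches one of a, b, c; [^a-z] matches anything outside a-z.
in_class = re.findall(r"[abc]", "cabbage")
not_lower = re.findall(r"[^a-z]", "ab1 C")

# Quantifier: a{3,6} is greedy and takes at most six a's per match.
runs = re.findall(r"a{3,6}", "aa aaa aaaaaaa")

# Word boundaries: \b keeps "cat" from matching inside "concat".
words = re.findall(r"\bcat\b", "cat concat cat.")

# Groups: findall returns only capturing groups; (?:...) groups without capturing.
keys = re.findall(r"(\w+)=(?:\d+)", "x=1 yy=22")

# Anchors: without the multiline flag, ^ and $ anchor to the whole string.
all_digits = bool(re.search(r"^\d+$", "2024"))
not_all_digits = bool(re.search(r"^\d+$", "20 24"))

print(in_class, not_lower, runs, words, keys, all_digits, not_all_digits)
```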

Regular Expression

Flags: gmx (global, multiline, extended)
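
As a rough guide to how the g, m and x flags carry over to Python: `re.finditer` (or `re.findall`) plays the role of g, `re.MULTILINE` corresponds to m, and `re.VERBOSE` corresponds to x. The text and pattern below are illustrative assumptions, not taken from the page.

```python
import re

text = "10 apples\n20 pears\n"

# With re.VERBOSE, whitespace and # comments inside the pattern are ignored,
# so whitespace in the input has to be matched explicitly (here with \s+).
pattern = r"""
    ^        # start of a line (because of re.MULTILINE)
    (\d+)    # capture the leading number
    \s+      # explicit whitespace
    (\w+)    # capture the following word
    $        # end of a line
"""

# finditer walks over every match, like the g flag.
pairs = [m.groups() for m in re.finditer(pattern, text, re.MULTILINE | re.VERBOSE)]

# Without re.MULTILINE, ^ only matches at the very start of the string.
without_m = re.findall(r"^\d+", text)

print(pairs)
print(without_m)
```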

Generated Code

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

# The pattern below uses (?<name>...) named groups and (?&name) subroutine calls.
# The standard-library re module supports neither, so this version assumes the
# third-party "regex" package (pip install regex), which accepts both.
import regex

pattern = r"""
^
(?<integer>
  (?<n1>
    (?<n1nz>[一二三四五六七八九壹贰叁肆伍陆柒捌玖])
    |(?<n1wz>[〇零])
  )
  |(?<n2>
    (?<h2nz>
      (?<n2nz>(?<n2z1>(?&n1nz)[十拾])(?&n1nz))
      |[十拾](?&n1nz)
    )
    |(?<h2wz>
      (?<n2wz>(?&n2z1))
      |[十拾]
    )
  )
  |(?<n3>
    (?<n3nz>(?<n3z2>(?<h1nz>(?&n1nz)|两)[百佰])((?&n2nz)|(?&n1wz)(?&n1nz)))
    |(?<n3wz>(?&n3z2)(?&n2wz)?)
  )
  |(?<n4>
    (?<n4nz>(?<n4z3>(?&h1nz)[千仟])((?&n3nz)|(?&n1wz)((?&n2nz)|(?&n1nz))))
    |(?<n4wz>
      (?&n4z3)(?&n3wz)?
      |(?&n4z3)(?&n1wz)(?&n2wz)
    )
  )
  |(?<n5_7>
    (?<n5_7nz>
      (?<n5_7z4>(?<h1_3nz>(?&h1nz)|(?&h2nz)|(?&n3nz))[万萬])((?&n4nz)|(?&n1wz)((?&n3nz)|(?&n2nz)|(?&n1nz)))
      |(?<n6_7z5_7>(?<h2_3wz>(?&h2wz)|(?&n3wz))[万萬])((?&n1wz)((?&n4nz)|(?&n3nz)|(?&n2nz)|(?&n1nz)))
    )
    |(?<n5_7wz>
      (?&n5_7z4)((?&n4wz)|(?&n1wz)((?&n3wz)|(?&n2wz)))?
      |(?&n6_7z5_7)((?&n1wz)((?&n4wz)|(?&n3wz)|(?&n2wz)))?
    )
  )
  |(?<n8>
    (?<n8nz>
      (?<n8z4>(?&n4nz)[万萬])((?&n4nz)|(?&n1wz)((?&n3nz)|(?&n2nz)|(?&n1nz)))
      |(?<n8z5_7>(?&n4wz)[万萬])(?&n1wz)((?&n4nz)|(?&n3nz)|(?&n2nz)|(?&n1nz))
    )
    |(?<n8wz>
      (?&n8z4)((?&n4wz)|(?&n1wz)((?&n3wz)|(?&n2wz)))?
      |(?&n8z5_7)((?&n1wz)((?&n4wz)|(?&n3wz)|(?&n2wz)))?
    )
  )
  |(?<n9_16>
    ((?&h1_3nz)(?&n4nz)|(?&n5_7nz)(?&n8nz))[亿億]((?&n8)|(?&n1wz)((?&n5_7)|(?&n4)|(?&n3)|(?&n2)|(?&n1)))?
    |((?&h2_3wz)(?&n4wz)|(?&n5_7wz)(?&n8wz))[亿億]((?&n1wz)((?&n8)|(?&n5_7)|(?&n4)|(?&n3)|(?&n2)|(?&n1)))?
  )
)
(?<decimal>点
  [〇一二三四五六七八九零壹贰叁肆伍陆柒捌玖]+
)?
$
"""

# Each candidate number sits on its own line; the "↑True" / "↑False" lines are the
# page author's annotations and are reproduced verbatim.
test_str = ("三十四\n" "↑True\n"
    "三十四万\n" "↑True\n"
    "三十四万三百二十\n" "↑False\n"
    "三百四十五万零三百二十\n" "↑True\n"
    "三十四万零三百\n" "↑True\n"
    "二十万零一千\n" "↑True\n"
    "二十万零一百\n" "↑True\n"
    "二十万零一\n" "二十万\n" "二万零一十\n" "二万零一百\n"
    "二万零一百一十\n" "二万一千一百一十一\n" "两万零一\n" "二百万零一\n"
    "两千万零一\n" "两千两百万零一\n" "两千万零一十\n" "两千万零一十二\n"
    "两万零二千\n" "两千万零两百\n" "两千万零二十\n" "两千万零二百二十\n"
    "二千零二十一\n" "二千零二十\n" "两千万零二千零二十一\n" "两千万零二十一\n"
    "两千万零二百\n" "两千万零两百二十一\n" "两千万零两千两百二十\n" "两千万零二千零二十\n"
    "两千万零二千零二\n" "两千万零二百零三\n" "两千三百四十万三千二百三十\n" "二百万\n"
    "两千二百二十一万\n" "两千两百万\n" "两千两百一十万\n" "两千三百四十一万\n"
    "两千零二十\n" "两千零二十万\n" "两千零一十五\n" "两千零一十五万\n"
    "两千零三十万\n" "两百一十万\n" "一十万\n" "十万\n"
    "两千两百\n" "两千两百万\n" "两千两百二十万\n" "两千三百四十五万\n"
    "两千零一十一万\n" "两百一十万\n" "两百四十二万\n" "两百零六万\n"
    "二十一万\n" "四百万\n\n" "两百一十万\n\n"
    "两百万零三百零一\n" "↑True\n"
    "三百四十万零一\n" "↑True\n"
    "四百五十六万零一十\n" "↑True\n"
    "五百零七万零三百\n" "↑True\n"
    "一千万零二十三\n" "↑True\n"
    "三百一十\n" "↑True\n"
    "一千两百万零三百一十\n" "↑True\n"
    "三百\n"
    "一千万零三百\n" "↑True\n"
    "一千零一万零三百零一\n" "↑True\n"
    "一千两百三十万零一十\n" "↑True\n"
    "一千两百四十四万零二十三\n" "↑True\n"
    "一千\n" "↑True\n"
    "一千零四十\n" "↑True\n"
    "一千零四十\n" "↑True\n"
    "一千四百\n" "↑True\n"
    "一千零四万\n" "↑True\n"
    "一千零四十\n"
    "一千零四十万\n" "↑True\n"
    "一千四百万\n" "↑True\n"
    "一千零四十五万零两百零四\n" "↑True\n"
    "一亿零两百万\n" "↑True\n"
    "一亿两千万\n" "↑True\n"
    "一亿零两千万\n" "↑False\n"
    "十亿零两千万\n" "↑True\n"
    "一亿零两百\n" "↑True\n"
    "一亿零二十\n" "↑True\n"
    "一亿零二十二\n" "↑True\n"
    "一万亿零二\n" "↑True\n"
    "九千八百七十六万五千四百三十二亿一千两百三十四万五千六百七十八\n" "↑True\n"
    "九千零七十六万零四百零二亿零两百零四万零六百零八\n" "↑True\n"
    "一千零\n" "↑False\n"
    "两千零零六\n" "↑False\n"
    "两千零六\n" "↑True\n"
    "九千零零七十六万零四百零二亿零两百零四万零六百零八\n" "↑False\n"
    "九千零六万零二亿零四万零八\n" "↑True\n"
    "九千零六万零二亿零\n" "↑False\n"
    "一千万亿零二点二\n" "↑True\n"
    "一千零二万亿零二\n" "↑True\n"
    "一千零二万億零二点两\n" "↑False\n"
    "一千零二万億零二点四\n" "↑True\n"
    "两万两千\n" "↑True\n"
    "两万零两千\n" "↑False\n"
    "两万零两百\n" "↑True\n"
    "两亿零五千万\n" "↑False\n"
    "两亿零五百万\n" "↑True\n"
    "一百\n" "↑True\n"
    "九千\n" "↑True\n"
    "两千零两百\n" "↑False\n"
    "五百万零四千三百\n" "↑True\n"
    "五百万四千三百\n" "↑False\n"
    "十万零一千亿四千万零四千\n" "↑False\n"
    "十万零一千亿零四千万零四千\n" "↑True\n"
    "十万零一千亿\n" "↑True\n"
    "十万零四千零四十\n"
    "两千两百\n" "↑True\n"
    "二百零六十\n" "↑False\n"
    "二百五十\n" "↑True\n"
    "二百五十七\n" "↑True\n"
    "二百零八\n" "↑True\n"
    "二\n" "↑True\n"
    "两\n" "↑False\n"
    "一十\n" "↑True\n"
    "二十\n" "↑True\n"
    "十五\n" "↑True\n"
    "一十五\n" "↑True\n"
    "二十五\n" "↑True\n"
    "三十四\n" "↑True\n"
    "两百\n" "↑True\n"
    "三百四十\n" "↑True\n"
    "四百五十六\n" "↑True\n"
    "五百零七\n" "↑True\n"
    "一千\n" "↑True\n"
    "一千零一\n" "↑True\n"
    "一千零四十\n" "↑True\n"
    "一千零四十五\n" "↑True\n"
    "一千两百\n" "↑True\n"
    "一千两百三十\n" "↑True\n"
    "一千两百四十四\n" "↑True\n"
    "两千六百八十万零一百\n" "↑True\n"
    "两千六百八十万零一千\n" "↑True\n"
    "两千六百八十万一千\n" "↑False\n"
    "两千六百零八万一千\n" "↑True\n"
    "二万\n" "↑True\n"
    "两万\n" "↑True\n"
    "二十万\n" "↑True\n"
    "十五万\n" "↑True\n"
    "三十四万\n" "↑True\n"
    "两百万\n" "↑True\n"
    "三百四十万\n" "↑True\n"
    "四百五十六万\n" "↑True\n"
    "五百零七万\n" "↑True\n"
    "一千万\n" "↑True\n"
    "一千两百万\n" "↑True\n"
    "一千零一万\n" "↑True\n"
    "一千两百三十万\n" "↑True\n"
    "一千两百四十四万\n" "↑True\n"
    "一千零四十\n" "↑Tru\n"
    "一千万\n" "↑Truee\n"
    "一千四百万\n" "↑Truee\n"
    "一千零四万\n" "↑Truee\n"
    "一千零四十\n" "↑True\n"
    "一千零四十万\n" "↑True\n"
    "一千零四十五\n" "↑True\n"
    "一千零四十五万\n" "↑True\n"
    "二万零一\n" "↑True\n"
    "两万零三百\n" "↑True\n"
    "二十\n" "↑True\n"
    "二十万\n" "↑True\n"
    "两百万\n" "两千万\n" "二万\n"
    "十五万零二十三\n" "↑True\n")

matches = regex.finditer(pattern, test_str, regex.MULTILINE | regex.VERBOSE)

for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(
        matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))

    # Group 0 is the whole match, so numbered groups start at 1.
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(
            groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum),
            group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for Python, please visit: https://docs.python.org/3/library/re.html
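
One caveat worth illustrating: Python's standard `re` module writes named groups as `(?P<name>...)` and has no `(?&name)` subroutine calls, so a pattern written in the PCRE style above will not compile with `re` unchanged. A possible workaround is to build the pattern from reusable string fragments. The sketch below covers only the numbers 1 to 99 in simplified characters; the names `DIGIT_NZ`, `TWO_DIGIT`, `NUMBER_1_99` and `classify` are hypothetical, and this is not a port of the full pattern.

```python
import re

# Hypothetical fragments, loosely mirroring the n1nz / n2 groups of the pattern above.
DIGIT_NZ = "[一二三四五六七八九]"  # a non-zero digit, 1-9
TEN = "十"

# 10-99: optional tens digit, then 十, then optional units digit
# (for example 十, 十五, 二十, 三十四).
TWO_DIGIT = f"{DIGIT_NZ}?{TEN}{DIGIT_NZ}?"

# Fragments are substituted as plain text, standing in for (?&name) calls.
NUMBER_1_99 = re.compile(f"(?P<n2>{TWO_DIGIT})|(?P<n1>{DIGIT_NZ})")


def classify(text):
    """Return the name of the group that matched the whole string, or None."""
    m = NUMBER_1_99.fullmatch(text)
    return m.lastgroup if m else None


for sample in ["三十四", "十五", "七", "两", "三四"]:
    print(sample, classify(sample))
```

String substitution repeats each fragment wherever it is used, so the compiled pattern grows with every reuse; for a grammar as deep as the one above, an engine with real subroutine calls is likely the more practical choice.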