Regular Expressions 101

Matches some number of a's followed by the same number of b's, usually written as a^nb^n for all natural numbers n. For regular expressions as formally defined in formal language theory, this is impossible, but modern day regex engines can do far more than match just regular languages. Note that even though this works with the Python flavor on regex101, this doesn't actually compile with Python's re library, since their regex engine doesn't allow you to:

Lazily evaluate conditional groups: (?(1)\1) requires \1 to exist, even though if it doesn't exist the pattern would just match nothing and keep parsing.
Reference open groups: This is kind of related to the previous one, since we need to reference the previous value of a group within itself, and we need to use a conditional for before the group first matches anything. As per 1, this requires the group to exist even if we don't use the value, hence it doesn't work.

Here is a short explanation on how the regex works: The key part of the regex is this lookahead (?=a+((?(1)\1)b)) wrapped by ((?=...)a)*. The lookahead skips through the a's, and then group one ((?(1)\1)b)) matches either b if group one doesn't exist (the first "iteration", where we iterate via the outer *), or group one and a b if it does exist. The effect of this is that we iterate the * group until either we run out of b's or we run out of a's. After that, we just need to check that we have group one followed by the end of the string at the end, since those are the b's. If we have too many a's, we would be matching against a...b...b, and if we have too many b's we wouldn't hit the end of the string. One small caveat is that if we have an empty string, group one doesn't exist, which we can remedy by either putting (?(1)\1) or just wrapping everything in an optional group.

Submitted by FieryIceStickie - 2 months ago

Order By

Filter by Flavor

Community Patterns

Repair unquoted HTML attribute values

Validate Base64 single-line string

Date: DD-MM-YYYY hh:mm:ss

DateTime: DD-MM-YYYY HH:mm:ss

dd-mm-YYYY HH:mm:ss (year range 1000-2999)

js function test

1st TSV

Table Purgatory

Strip out domain extensions except for Microsoft .net technologies.

For Auto RTFs

Strip multi-line (star) comments from JavaScript, while leaving any CDATA sections intact.

MIME type verification

PHP ReflectionParameter:__toString() parser

PHP variable name

Disorted Email Addresses

URL regexp

Community Library Entry

Regular ExpressionPython

Description

Regular Expression
Python