Regular Expressions 101

Community Library Entry

This is useful for parsing Australian street addresses.

It discards initial rubbish, then extracts:

  1. BND or CNR, which is useful for geolocating by boundaries
  2. Numbers. With the third-party regex library for Python, which keeps every capture of a repeated group, you can get the full list of numbers preceding a street name
  3. Name, e.g. Bourke
  4. Type, e.g. Road

It also discards trailing rubbish.

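Detailed explanation as a verbose regex below. The {0} placeholder is not explained in this entry; it appears to stand for an alternation of street types (e.g. ROAD|STREET) substituted in before the pattern is compiled.

(?ix)                                   # case insensitive and verbose flags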
(?:^[A-Z\W]*\W+(?!bnd|cnr)\W+)?         # discard initial rubbish (names not including BND/CNR) if present
(?:(?!\b(?:{0})\b)                      # not starting with road type
(?P<numbers>\d+-\d+|\d+[A-Z]?)\W+)*?    # capture numbers if present, including extension letter
(?:\W+(?:AND|&)\W+)*                    # do not capture AND/& if present
(?:(?:(?P<geo>BND|CNR)                  # capture BND/CNR if present (GEO var)
(?:\WOF|\WBY)?)\W+)*                    # do not capture OF/BY following BND/CNR
(?P<name>                               # capture street name (NAME var)
(?:(?!\b(?:{0})\b)                      # not starting with road type
[A-Z]+\W*)+)\W+                         # contains letters and non-words only
(?:(?P<type>{0}\W))+                    # capture street type (TYPE var) and ignore trailing rubbish
(?:\W|$)                                # non-word or end of string
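
A minimal usage sketch in Python, assuming {0} is meant to be filled with an alternation of street types through str.format. STREET_TYPES and parse_address are hypothetical names that do not come from the original entry, and the three-item type list is only illustrative.

```python
import re

# The verbose pattern from the entry, unchanged apart from shortened comments.
# {0} is filled in below; the comments must not contain braces, or str.format
# would try to interpret them.
TEMPLATE = r"""(?ix)                    # case insensitive and verbose flags
(?:^[A-Z\W]*\W+(?!bnd|cnr)\W+)?         # discard initial rubbish if present
(?:(?!\b(?:{0})\b)                      # not starting with road type
(?P<numbers>\d+-\d+|\d+[A-Z]?)\W+)*?    # capture numbers if present
(?:\W+(?:AND|&)\W+)*                    # do not capture AND/& if present
(?:(?:(?P<geo>BND|CNR)                  # capture BND/CNR if present
(?:\WOF|\WBY)?)\W+)*                    # do not capture OF/BY after BND/CNR
(?P<name>                               # capture street name
(?:(?!\b(?:{0})\b)                      # not starting with road type
[A-Z]+\W*)+)\W+                         # letters and non-words only
(?:(?P<type>{0}\W))+                    # capture street type
(?:\W|$)                                # non-word or end of string
"""

# Hypothetical, deliberately short list of street types (an assumption).
STREET_TYPES = ["ROAD", "STREET", "AVENUE"]

pattern = re.compile(TEMPLATE.format("|".join(STREET_TYPES)))


def parse_address(text):
    """Return the named groups as a dict, or None if nothing matches."""
    match = pattern.search(text)
    return match.groupdict() if match else None


print(parse_address("12 Bourke Road"))
print(parse_address("CNR Bourke Road"))
```

The standard re module keeps only the last repetition of a repeated group, so "12 14 Bourke Road" yields "14" for numbers; the third-party regex library exposes every repetition through its captures() method. Also note that because {0} is substituted textually, the \W inside the type group attaches only to the last alternative (AVENUE here). Whether that is intended is not clear from the entry.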
Submitted by Daniel Vianna - 7 years ago