Regular Expressions 101

Community Library Entry

Regular Expression
PCRE (PHP <7.3)

/
^ (?&address) $ (?(DEFINE) (?<ALPHANUMERICE> [A-Za-z0-9!\#$%&'*+\-\/=?^_`{|}~] ) (?<CFWS> (?: (?: (?&FWS)? (?&comment) )++ (?&FWS)? ) | (?&FWS) ) (?<FWS> (?&WSPCRLF)++ ) (?<UTF8_extra_char> (?=[\x{C2}-\x{DF}\x{E0}-\x{EF}\x{F0}-\x{F4}]) (?: (?&UTF8_2) | (?&UTF8_3) | (?&UTF8_4) ) ) (?<UTF8_tail> [\x{80}-\x{BF}] ) (?<UTF8_2> [\x{C2}-\x{DF}] (?&UTF8_tail) ) (?<UTF8_3> \x{E0} [\x{A0}-\x{BF}] (?&UTF8_tail) | [\x{E1}-\x{EC}] (?&UTF8_tail){2} | \x{ED} [\x{80}-\x{9F}] (?&UTF8_tail) | [\x{EE}\x{EF}] (?&UTF8_tail){2} ) (?<UTF8_4> \x{F0} [\x{90}-\x{BF}] (?&UTF8_tail){2} | [\x{F1}-\x{F3}] (?&UTF8_tail){3} | \x{F4} [\x{80}-\x{8F}] (?&UTF8_tail){2} ) (?<VCHARWSP> [\x{21}-\x{7E}\ \t] ) (?<WSPCRLF> [\ \t\x{0D}\x{0A}] ) (?<address> (?&mailbox_list) | (?&group) ) (?<addr_spec> (?&local_part) @ (?&domain) ) (?<utf8_addr_spec> (?&utf8_local_part) @ (?&utf8_domain) ) (?<alt_address> (?&FWS) < (?&addr_spec) > ) (?<angle_addr> (?&CFWS)? < (?: (?&addr_spec) | (?&utf8_addr_spec) (?&alt_address)? | (?&obs_route)? (?&addr_spec) ) > (?&CFWS)? ) (?<atext> (?&ALPHANUMERICE) ) (?<utf8_atext> (?&ALPHANUMERICE) | (?&UTF8_extra_char) ) (?<atom> (?&CFWS)? (?&atext)++ (?&CFWS)? ) (?<utf8_atom> (?&CFWS)? (?&utf8_atext)++ (?&CFWS)? ) (?<utf8_ccontent> (?&ctext) | (?&utf8_quoted_pair) | (?&comment) ) (?<comment> \( (?: (?&FWS)? (?&utf8_ccontent) )*+ (?&FWS)? \) ) (?<ctext> [!-'*-\[\]-~] | (?&obs_ctext) | (?&UTF8_extra_char) ) (?<display_name> (?&phrase) ) (?<domain> (?&dot_atom) | (?&domain_literal) | (?&obs_domain) ) (?<utf8_domain> (?&utf8_dot_atom) | (?&domain_literal) | (?&obs_domain) ) (?<domain_literal> (?&CFWS)? \[ (?: (?&FWS)? (?&dtext) )*+ (?&FWS)? \] (?&CFWS)? ) (?<dot_atom> (?&CFWS)? (?&dot_atom_text) (?&CFWS)? ) (?<utf8_dot_atom> (?&CFWS)? (?&utf8_dot_atom_text) (?&CFWS)? ) (?<dot_atom_text> (?&atext)++ (?: \. (?&atext)++ )*+ ) (?<utf8_dot_atom_text> (?&utf8_atext)++ (?: \. (?&utf8_atext)++ )*+ ) (?<dtext> [!-Z^-~] | (?&obs_dtext) ) (?<group> (?&display_name) : (?&group_list)? ; (?&CFWS)? 
) (?<group_list> (?&mailbox_list) | (?&CFWS) | (?&obs_group_list) ) (?<local_part> (?&dot_atom) | (?&quoted_string) | (?&obs_local_part) ) (?<utf8_local_part> (?&utf8_dot_atom) | (?&utf8_quoted_string) | (?&obs_local_part) ) (?<mailbox> (?&name_addr) | (?&addr_spec) | (?&utf8_addr_spec) ) (?<mailbox_list> (?: (?&mailbox) (?: , (?&mailbox) )*+ ) | (?&obs_mbox_list) ) (?<name_addr> (?&display_name)? (?&angle_addr) ) (?<obs_NO_WS_CTL> [\x{01}-\x{08}\x{0B}\x{0C}\x{0E}-\x{1F}\x{7F}] ) (?<obs_ctext> (?&obs_NO_WS_CTL) ) (?<obs_domain> (?&atom) (?: \. (?&atom) )*+ ) (?<obs_domain_list> (?: (?&CFWS) | , )*+ @ (?&domain) (?: , (?&CFWS)? (?: @ (?&domain) )? )*+ ) (?<obs_dtext> (?&obs_NO_WS_CTL) | (?&quoted_pair) ) (?<obs_group_list> (?: (?&CFWS)? , )++ (?&CFWS)? ) (?<obs_local_part> (?&word) (?: \. (?&word) )*+ ) (?<obs_mbox_list> (?: (?&CFWS)? , )*+ (?&mailbox) (?: , (?: (?&mailbox) | (?&CFWS))? )*+ ) (?<obs_phrase> (?&word) (?: (?&word) | \. | (?&CFWS) )*+ ) (?<obs_qp> \\ (?: \x{00} | (?&obs_NO_WS_CTL) | [\x{0D}\x{0A}] ) ) (?<obs_qtext> (?&obs_NO_WS_CTL) ) (?<obs_route> (?&obs_domain_list) : ) (?<phrase> (?&word)++ | (?&obs_phrase) ) (?<qcontent> (?&utf8_qcontent) ) (?<utf8_qcontent> (?&utf8_qtext) | (?&utf8_quoted_pair) ) (?<qtext> [!\#-\[\]-~] | (?&obs_qtext) ) (?<utf8_qtext> (?&qtext) | (?&UTF8_extra_char) ) (?<quoted_pair> \\ (?&VCHARWSP) | (?&obs_qp) ) (?<utf8_quoted_pair> \\ (?&utf8_text) | (?&obs_qp) ) (?<utf8_text> [\x{01}-\x{09}\x{0D}\x{0A}\x{0E}-\x{7F}] | (?&UTF8_extra_char) ) (?<quoted_string> (?&CFWS)? " (?: (?&FWS)? (?&qcontent) )*+ (?&FWS)? " (?&CFWS)? ) (?<utf8_quoted_string> (?&CFWS)? " (?: (?&FWS)? (?&utf8_qcontent) )*+ (?&FWS)? " (?&CFWS)? ) (?<word> (?&utf8_atom) | (?&utf8_quoted_string) ) )
/
x

Description

This regex validates an email address against RFC 5322 with RFC 6530 merged in, so it can also check internationalized email addresses. Note that the regex works on bytes, not code points: any non-ASCII characters must be encoded as UTF-8 before they will match.
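
To illustrate the byte-level behavior described above, here is a minimal Python sketch. It is not the full pattern: Python's standard `re` module has no (?(DEFINE) ...) groups or (?&name) subroutine calls, so only the UTF8_tail, UTF8_2, UTF8_3 and UTF8_4 subpatterns are transcribed and inlined by hand. The names `UTF8_EXTRA_CHAR` and `is_utf8_extra_char` are hypothetical helpers introduced for this example.

```python
import re

# Hand-inlined transcription of the UTF8_* subpatterns from the entry above.
# Assumption: this mirrors only that fragment, not the whole address grammar.
TAIL = rb"[\x80-\xBF]"
UTF8_2 = rb"[\xC2-\xDF]" + TAIL
UTF8_3 = (
    rb"\xE0[\xA0-\xBF]" + TAIL
    + rb"|[\xE1-\xEC]" + TAIL + rb"{2}"
    + rb"|\xED[\x80-\x9F]" + TAIL
    + rb"|[\xEE\xEF]" + TAIL + rb"{2}"
)
UTF8_4 = (
    rb"\xF0[\x90-\xBF]" + TAIL + rb"{2}"
    + rb"|[\xF1-\xF3]" + TAIL + rb"{3}"
    + rb"|\xF4[\x80-\x8F]" + TAIL + rb"{2}"
)

# A bytes pattern: it matches raw bytes, never decoded characters.
UTF8_EXTRA_CHAR = re.compile(rb"(?:" + UTF8_2 + rb"|" + UTF8_3 + rb"|" + UTF8_4 + rb")")


def is_utf8_extra_char(data: bytes) -> bool:
    """Return True if `data` is exactly one well-formed multi-byte UTF-8 sequence."""
    return UTF8_EXTRA_CHAR.fullmatch(data) is not None


if __name__ == "__main__":
    print(is_utf8_extra_char("é".encode("utf-8")))    # UTF-8 bytes C3 A9
    print(is_utf8_extra_char("é".encode("latin-1")))  # single byte E9, not UTF-8
```

The same character passes or fails depending on how it was encoded, which is why input must be converted to UTF-8 before it is tested against the full pattern.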

Submitted by anonymous - 6 years ago