Regular Expressions 101

Community Library Entry

Regular Expression
PCRE2 (PHP >=7.3)

/
^
# Uncomment to allow prefix continuation (second letter of a dangling match)
# ([abcdefghiklmnoprstuvy\ ])?
(
  \ |(?: # Single-letter elements
    [BCFHIKNOPSUVWY]
  )
  |(?: # Two-letter elements.
    (?:A[cglmrstu])|(?:B[aehikr])|(?:C[adeflmnorsu])|(?:D[bsy])
    | (?:E[rsu])|(?:F[elmr])|(?:G[ade])|(?:H[efgos])|(?:I[nr])
    | (?:K[r])|(?:L[airuv])|(?:M[cdgnot])|(?:N[abdehiop])
    | (?:O[gs])|(?:P[abdmortu])|(?:R[abefghnu])|(?:S[bcegimnr])
    | (?:T[abcehilms])|(?:X[e])|(?:Yb)|(?:Z[nr])
  )
  |(?: # Smushed trigrams. No need to go past trigrams because two twos make four
    (?:A(?:ga|gd|ge|la|lr|md|mg|mt|ra|re|rg|ta|te|tl|tm))
    |(?:E(?:ra|re|rg))
    |(?:G(?:ag|al|am|ar|at|er))
    |(?:L(?:ag|al|am|ar|at|ra|re|rg))
    |(?:M(?:ga|gd|ge|ta|te|tl|tm))
    |(?:R(?:ag|al|am|ar|at|er|ga|gd|ge))
    |(?:T(?:ag|al|am|ar|at|er|la|lr|md|mg|mt))
    |(?:X(?:er))
    |(?:Z(?:ra|re|rg))
  )
# repeated any number of times
)+
# Uncomment to allow last letter if it's a potential first letter of the next word
# ([ADEGLMRTXZ]\ *)?
$
/
gimx

Description

This matches only words that can be spelled entirely from chemical element symbols: Hogwarts (H + Og + W + Ar + Ts) matches, but Eton does not. It allows internal overlap, where two adjacent symbols share a letter: LaG = La + Ag. The two commented-out optional groups additionally let the first or last letter be the end or start of a symbol, so that matches can chain across words. You can easily remove any of these pieces. A single-line (non-"extended") version of the regex is at the bottom of the test string.
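
As a rough illustration of the behaviour described above, here is a sketch that ports the pattern to Python's `re` module. The entry itself targets PCRE2, so treating Python as equivalent for this pattern is an assumption; the names `ELEMENT_WORD` and `is_element_word` are hypothetical and not part of the entry. The two commented-out chaining groups are left out, and the literal space is written as `[ ]` so it survives verbose mode.

```python
import re

# Assumed-equivalent port of the entry's pattern (original flavour: PCRE2).
# Flags mirror the entry's "imx"; "g" has no direct Python counterpart.
ELEMENT_WORD = re.compile(
    r"""
    ^
    (?:
        [ ]                                   # a literal space
      | [BCFHIKNOPSUVWY]                      # single-letter elements
      # two-letter elements
      | A[cglmrstu] | B[aehikr] | C[adeflmnorsu] | D[bsy]
      | E[rsu] | F[elmr] | G[ade] | H[efgos] | I[nr]
      | Kr | L[airuv] | M[cdgnot] | N[abdehiop]
      | O[gs] | P[abdmortu] | R[abefghnu] | S[bcegimnr]
      | T[abcehilms] | Xe | Yb | Z[nr]
      # "smushed" trigrams: two symbols sharing their middle letter
      | A(?:ga|gd|ge|la|lr|md|mg|mt|ra|re|rg|ta|te|tl|tm)
      | E(?:ra|re|rg)
      | G(?:ag|al|am|ar|at|er)
      | L(?:ag|al|am|ar|at|ra|re|rg)
      | M(?:ga|gd|ge|ta|te|tl|tm)
      | R(?:ag|al|am|ar|at|er|ga|gd|ge)
      | T(?:ag|al|am|ar|at|er|la|lr|md|mg|mt)
      | Xer
      | Z(?:ra|re|rg)
    )+                                        # repeated any number of times
    $
    """,
    re.IGNORECASE | re.MULTILINE | re.VERBOSE,
)


def is_element_word(text: str) -> bool:
    """Return True if some line of `text` is built only from element symbols."""
    return ELEMENT_WORD.search(text) is not None


print(is_element_word("Hogwarts"))  # True: H + Og + W + Ar + Ts
print(is_element_word("Eton"))      # False: no symbol starts with "Et"
print(is_element_word("LaG"))       # True: La + Ag overlap on the "a"
```

Without the trigram alternatives, "LaG" would fail: after La, a lone G is not a symbol, so the shared-letter case has to be listed explicitly.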

Submitted by mrflip - 17 days ago (Last modified 17 days ago)