Regular Expressions 101

Community Library Entry

Regular Expression
ECMAScript (JavaScript)

/
(?:^|\b)(?<number>(?:\d+|\d{1,3}(?:,\d{3})+)(?:\.\d+)?)(?<replace> ?|&[^;]{2,7};)(?<dimensions>(?<mathjax>\$[^\n\$]+\$)|(?<unit>(?<si>(?!as)(?:[QRYZEPTGMkhdcmµnpfazyrq]|da)?(?:mol|cd|Hz|Pa|Wb|lm|lx|Bq|Gy|Sv|kat|[mgsAKNJWCVFΩSTHL]))|(?<imperial>mil|P|in|ft|yd|mi|le|ftm|nmi|oz|lbs?|t|tsp|tbsp|c|fl oz|pt|qt|gal|(?:sq|cu) ?(?:in|ft|yd|mi)|ft-lbf?|lbf?-ft|deg|Btu|psi|Ly)|(?<common>mph|kph|rad|sr)|(?<binary>(?:(?:[KMGTPEZY])i?)?B)))(?=\b|$| |[.,:;()\[\]{}=/*+—–-])
/
gm

Description

A more general match for numbers followed by units or dimensions in plain text. It is unoptimized, a bit rough, and leaves out plenty of units while including some oddities. Use it as a first step toward something better if you're linting Markdown files. It was good enough for my use case, which doesn't care much about match time but has to meet certain other restrictions on the regex. The replacement character is the narrow no-break space (U+202F), which keeps a value and its unit together when plain-text documents are rendered. Other document types may not treat it specially and may render it like a regular thin space, but may still respect its non-breaking behavior.
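
A minimal sketch of how the pattern might be applied in JavaScript. The entry itself does not give a substitution string, so two things here are assumptions: that whatever the replace group captured (an optional space or an HTML entity such as &nbsp;) should be swapped for U+202F, and that a missing space should be filled in the same way. The helper names joinValueAndUnit and findQuantities are hypothetical and only for illustration.

```javascript
// Pattern copied verbatim from the entry above, with its g and m flags.
const VALUE_WITH_UNIT = /(?:^|\b)(?<number>(?:\d+|\d{1,3}(?:,\d{3})+)(?:\.\d+)?)(?<replace> ?|&[^;]{2,7};)(?<dimensions>(?<mathjax>\$[^\n\$]+\$)|(?<unit>(?<si>(?!as)(?:[QRYZEPTGMkhdcmµnpfazyrq]|da)?(?:mol|cd|Hz|Pa|Wb|lm|lx|Bq|Gy|Sv|kat|[mgsAKNJWCVFΩSTHL]))|(?<imperial>mil|P|in|ft|yd|mi|le|ftm|nmi|oz|lbs?|t|tsp|tbsp|c|fl oz|pt|qt|gal|(?:sq|cu) ?(?:in|ft|yd|mi)|ft-lbf?|lbf?-ft|deg|Btu|psi|Ly)|(?<common>mph|kph|rad|sr)|(?<binary>(?:(?:[KMGTPEZY])i?)?B)))(?=\b|$| |[.,:;()\[\]{}=/*+—–-])/gm;

// U+202F NARROW NO-BREAK SPACE.
const NNBSP = "\u202F";

// Hypothetical helper (assumed usage, not part of the entry): rebuild each
// match as number + U+202F + dimensions, dropping whatever the "replace"
// group captured (a space, nothing, or an HTML entity).
function joinValueAndUnit(text) {
  return text.replace(VALUE_WITH_UNIT, "$<number>" + NNBSP + "$<dimensions>");
}

// Hypothetical helper: list every value/unit pair via the named groups.
// matchAll needs the g flag, which the pattern already carries.
function findQuantities(text) {
  return Array.from(text.matchAll(VALUE_WITH_UNIT), (m) => ({
    number: m.groups.number,
    unit: m.groups.dimensions,
  }));
}

console.log(findQuantities("Run 5 km at 12 kph"));
```

Because the pattern also matches when no space is present, this sketch turns "10kg" into "10", U+202F, "kg" as well. If you only want to touch existing separators, check that the replace group is non-empty in a replacer function before substituting.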

Submitted by ecfrechette - a year ago