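// Matches blog/news URLs under a supported locale; dots in the host are escaped so they match literally.
$re = '/^https:\/\/www\.test\.com\/(en|de|fr|es|pt-br)\/((blog|news|nouvelles|noticias))+([a-zA-Z0-9-\/]*)$/m';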
$str = 'Should be excluded - starting page:
https://www.test.com/en/
https://www.test.com/de/
https://www.test.com/fr/
https://www.test.com/es/
https://www.test.com/pt-br/
Should be excluded - pages and sub pages from blog and news:
https://www.test.com/en/blog
https://www.test.com/en/news
https://www.test.com/en/noticias
https://www.test.com/en/nouvelles
https://www.test.com/en/blog/hardware
https://www.test.com/en/news/hardware
Should be included - all other pages:
https://www.test.com/en/shop
https://www.test.com/en/shop/product-1/
';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
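The labels in the test string say that locale start pages and blog/news pages should be excluded, while the pattern above matches exactly the blog/news pages. One way to match "all other pages" instead is a negative lookahead placed after the locale segment. The following is a minimal sketch under that assumption, not the original author's pattern; the function name `filterOtherPages` and the `~` delimiter are choices made here for illustration.

```php
<?php
// Hypothetical helper (not part of the original snippet): returns the URLs
// that are neither a bare locale start page nor a blog/news section page.
function filterOtherPages(array $urls): array
{
    $re = '~^https://www\.test\.com/(?:en|de|fr|es|pt-br)/'
        // Reject a section name only when it is a whole path segment,
        // so a path such as "blogger" is still allowed.
        . '(?!(?:blog|news|nouvelles|noticias)(?:/|$))'
        // Require at least one path character, which drops ".../en/".
        . '[a-zA-Z0-9-][a-zA-Z0-9/-]*$~';

    // preg_grep keeps the matching elements; array_values reindexes them.
    return array_values(preg_grep($re, $urls));
}

$urls = [
    'https://www.test.com/en/',
    'https://www.test.com/en/blog',
    'https://www.test.com/en/news/hardware',
    'https://www.test.com/en/shop',
    'https://www.test.com/en/shop/product-1/',
];

print_r(filterOtherPages($urls));
```

With the sample list above, only the two `shop` URLs should remain. The lookahead is anchored to a segment boundary (`/` or end of string) so that unrelated paths that merely begin with a section name are not rejected.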