re = /^https:\/\/www\.test\.com\/(en|de|fr|es|pt-br)\/((blog|news|nouvelles|noticias))+([a-zA-Z0-9\/-]*)$/m
str = 'Should be excluded - starting page:
https://www.test.com/en/
https://www.test.com/de/
https://www.test.com/fr/
https://www.test.com/es/
https://www.test.com/pt-br/
Should be excluded - pages and sub pages from blog and news:
https://www.test.com/en/blog
https://www.test.com/en/news
https://www.test.com/en/noticias
https://www.test.com/en/nouvelles
https://www.test.com/en/blog/hardware
https://www.test.com/en/news/hardware
Should be included - all other pages:
https://www.test.com/en/shop
https://www.test.com/en/shop/product-1/
'
# Print the match result
str.scan(re) do |match|
  puts match.to_s
end
Please keep in mind that these code samples are automatically generated and are not guaranteed to work. If you find any syntax errors, feel free to submit a bug report. For a full regex reference for Ruby, please visit: http://ruby-doc.org/core-2.2.0/Regexp.html
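The labels in the test string ask for the opposite of what the pattern above does: it matches the blog and news URLs, whereas the stated goal is to exclude them (and the bare language start pages) and include everything else. One possible way to express that is a negative lookahead placed after the language segment. The following is a minimal sketch, not a definitive solution; the names `INCLUDE_RE` and `included_urls` are hypothetical, and it assumes path segments consist only of letters, digits, and hyphens, as in the original character class. It uses `String#match?`, which needs Ruby 2.4 or later.

```ruby
# Hypothetical pattern: matches URLs that should be INCLUDED.
INCLUDE_RE = %r{
  \Ahttps://www\.test\.com/
  (?:en|de|fr|es|pt-br)/                            # language prefix
  (?!(?:blog|news|nouvelles|noticias)(?:/|\z))      # reject blog/news sections
  [a-zA-Z0-9-]+                                     # first segment is required, so start pages fail
  (?:/[a-zA-Z0-9-]+)*                               # optional sub-pages
  /?\z                                              # optional trailing slash
}x

# Hypothetical helper: returns the lines of +text+ that are included URLs.
def included_urls(text)
  text.lines.map(&:strip).select { |line| line.match?(INCLUDE_RE) }
end

sample = <<~URLS
  https://www.test.com/en/
  https://www.test.com/en/blog
  https://www.test.com/en/news/hardware
  https://www.test.com/en/shop
  https://www.test.com/en/shop/product-1/
URLS

puts included_urls(sample)
```

With this sample, only the two `shop` URLs are printed. The `(?:/|\z)` inside the lookahead keeps the exclusion limited to whole segments, so a page such as `/en/blogger` would still be included; drop that part if any path starting with `blog` should be rejected.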