re = /(?############ Let's catch paths without "" or '' ############################
)(?<opening>(?# First, catch the starting path, the <opening> ###################
)\b(?<montage>[a-zA-Z]:[\/\\])(?# montage = 'C:\/'
)|[\/\\][\/\\](?<!http:\/\/)(?<!https:\/\/)(?>(?# check not 'http[s]:' prefix
)[?.][\/\\](?:[^\/\\<>:"|?\n\r ]+[\/\\])?(?# '\/\/[?or.]\/xxxxx' or '\/\/[?or.]\/server\/'
)\g<montage>?(?# '\/\/[?or.]\/c:\/' or '\/\/[?or.]\/server\/c:\/'
)|(?!\g<montage>))(?# '\/\/[addressIP\/ or serverName\/ but not C:\/]'
)|%\w+%[\/\\]?(?# '%EnvVariable%[\/]'
))(?# So, <opening> catches:
'C:\/' or
'\/\/[?or.]\/[UNC\/]C:\/' or
'\/\/[?or.]\/[UNC\/]' or
'\/\/[next characters must be something other than C:\/]' or
'%EnvironmentVariable%[\/]'
)(?:(?# now, we catch each directory name, which sits between [\/] ########################
)[^\/\\<>:"|?\n\r ,'](?# the first character must not be [ ,']
)[^\/\\<>:"|?\n\r]*(?# any pathFriendly character
)(?<![ ,'])(?# the last character of the directory name must not be [ ,']
)[\/\\](?# end of the directory name, which sits between '\/'
))*(?# catch as many 'directoryName\/' as possible
)(?:(?# Let's catch the end of the path. Is there a file, a directory, or just a useless '\/'?
)(?=[^\/\\<>:"'|?\n\r;, ])(?# if the next character is not pathFriendly, or is ' ' or [,'], we have reached the end of the path => we don't catch the last '\/' and the regex ends now.
We can't catch a fileName that begins with [,'] because these are probably delimiters between 2 paths, but '.' is allowed
)(?:(?# If we are here, there is a fileName or directoryName to catch
###### We will catch the last directoryName or the fileName without the extension ######
)(?:[^\/\\<>:"|?\n\r;, .](?# catch any pathFriendly character except ' ' or [,.]
)(?: (?=[\w\-]))?(?# If we find a ' ', we catch it if the next character is not a delimiter. A '-' after a ' ' is not treated as a delimiter.
)(?:\*(?!= ))?(?# If we find a '*', we stop the catch if the next character is a ' '
)(?!\g<montage>)(?# If we find a string that looks like 'C:\/', we stop the catch
))+(?# We catch as many of these words delimited by ' ' as possible
))?(?# the fileName may have no name, just an extension
)(?:\.\w+(?# #### an extension begins with '.' followed by at least one non-delimiter character
))*(?# we can add more extensions until the first non-'.' delimiter character. So, after the first '.' character inside a fileName, we cannot catch any ' ' character.
If we don't find an extension, the fileName is a directory name, and we stop the catch.
))(?# ############# END OF PATH CATCHING WITHOUT QUOTES "" and '' #######################
)|(?:(?# ######### Catching paths quoted with '' ###########################
A path quoted with '' is difficult because ['] is also a pathFriendly character
)'\g<opening>(?# We catch .* between quotes only if the string starts with an <opening>
)(?=.*'\W|.*'$)(?# We catch .* between quotes only if we are sure to find the end quote. The end quote must be ['] followed by a delimiter character or by the end of the string
)(?:[^\/\\<>:'"|?\n\r]+(?# We take any pathFriendly character except the quote [']
)(?:'(?=\w))?(?# we catch a quote ['] if the next character is not a delimiter
)[\/\\]?)*(?# A quoted path must follow this pattern until the end quote character [']
)')(?# end of '' quoted path
)|(?# ######### Catching paths quoted with "" ###########################
)"\g<opening>(?# We catch .* between quotes only if the string starts with an <opening>
)(?=.*")(?# We catch .* between quotes only if we are sure to find the end quote ["]
)(?:[^\/\\<>:"|?\n\r]+(?# We take any pathFriendly character
)[\/\\]?(?# pathFriendly characters can be delimited by '\'
))*(?# A quoted path must follow this pattern until the end quote character
)"(?# end of quoted path
)/
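
# Syntax note: PCRE writes a call to a named group as (?&name); Ruby's regex
# engine (Onigmo) writes the same call as \g<name>. The sketch below shows the
# Ruby form on a reduced drive-prefix pattern. DRIVE_PREFIX_RE and
# drive_prefixes are hypothetical names used only for illustration; they are
# not part of the pattern above.
DRIVE_PREFIX_RE = /(?<opening>\b(?<montage>[a-zA-Z]:[\/\\])|[\/\\]{2}[?.][\/\\]\g<montage>)/

# Returns the full text of every drive-style prefix found in +text+.
def drive_prefixes(text)
  found = []
  # Read the full match from Regexp.last_match, because scan yields only the
  # captures when the pattern contains groups.
  text.scan(DRIVE_PREFIX_RE) { found << Regexp.last_match(0) }
  found
end

# Usage: drive_prefixes('see C:/temp/a.txt') returns ["C:/"].
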
str = 'THIS IS COMMENTED VERSION !
to simple copy and use it, go https://regex101.com/r/zWGLMP
C:/testOk\\dot.Dirname/.nameFileBeginByDot first space after a dot in file name stop the match
C:/testOk\\_.._AsDirName/../file name.ext1.ext2 first space after a dot stop the match
start text don\'t match C:/testOk\\lastDir Or FileName WithDouble..dot stop the match
C:/testOk\\lastDir Or FileName dot ended. stop the match like an end sentence. So, a last name with a space after a dot is not catch
C:/testOk\\LastNameIs/DirName C:/testOk\\2Paths_ _separated/f.ext space after extention stop match
C:/testOK\\Last_/_isNotmatched/fgfj.gjjb/uhloext/ and [ ,\'] after \'\\\' stop match
\\\\127.0.0.1/this\', \'isOkInMidDirName\\butSimple\',\' stop match in last DirName or FileName
\\\\.\\c:/this exotic path begining work\\and\\ space after \'\\\' stop the match
\\\\?\\c:/this exotic path begining work too\\and \\space before \'\\\' stop the match too
\\\\testOk/this\' - \'is ok in dirName/and - in lastName.ext
i:/dir/fileName with a .space before dot stop the match
\\\\?\\server1\\e:\\utilities\\\\filecomparer\\ this double \\\\ is interpretated as new path
@"c:\\testOk\\double quote character is more permissive/ \'\' , ; .txt, .ext2",
@"\\\\127.0.0.1\\c$\\temp\\t\'est-file.txt, if end double quote is missing, we use unquote match
@"\\c:\\LOCALHOST\\c$\\ thisIsNotMatched" "temp\\test-file.txt", quoted path must have a right opening to be matched
@\'\\\\.\\c:\\temp\\te\'st-file.txt\' simple quoted is ok
\'c:\\simpleQuoteInsideStill\'Match\\but\' stopMatch if next is space character,
\'c:\\simpleQuoteInsideStill\'Match\\but\\\'stopMatch if is fisrt character after \\
\'c:\\simpleQuoteInsideStill\'Match\\but\'\'stopMatch if he is double
@"\\\\?\\c:\\te \' mp\\est-file.txt",
@"\\\\.\\UNC\\LOCALHOST\\c$\\temp\\test-file.txt",
@"\\\\127.0.0.1\\c$\\temp\\test-file.txt"
/\\serverName\\mix/and\\still match" double quote character stop match
\\\\\\IfMoreThan2_\\_we take only the 2 lasts.ext first space after ext stop the match
/testNotMatch/html
/testNotMatch.html
testNotMatch.html
// -> this simple // or \\\\ is not matched, but this //isMatched !
/ -> this simple / is not matched, and this /notMatchedToo
b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"
"b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"
"\\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"
error Message test:
---------------------------
Tentative d\'accès à C:\\Users\\tpgz4017\\App - Data\\Local\\Temp\\tempShapeFile_CrossWave Calibration Zones - Atoll CrossWave Model.shx après sa fin.
---------------------------
local url path :
file://C:/Users/Downloads/20220516_32289275_1049383.pdf
urlPath :
file://p-eco2.rd.fr/vol_H0037_01$/599/livraison/20220516_32289275_1049383.pdf
c:\\temp\\test-file.txt",
\\\\127.0.0.1\\c$\\temp\\test-file.txt",
\\\\LOCALHOST\\c$\\ temp\\test-file.txt",
\\\\LOCALHOST\\c$ \\temp\\test-file.txt",
\\\\.\\c:\\temp\\t\\est-file.txt",
\\\\?\\c:\\temp\\test-file.txt",
\\\\.\\UNC\\LOCALHOST\\c$\\temp\\test-file.txt",
\\\\?\\UNC\\ServerName\\ temp\\test-file.txt",
\\\\127.0.0.1\\c$\\temp\\test -file.txt"
error Message test:
Site0 / 3: - Warning . See log file \'C:\\ProgramData\\InfoVista\\Planet 7.4\\7.4\\RPE\\Log\\Plugins\\Universal_Model_masked\\log_Universal_Model.txt\' for details
C:/test\\gvk.hv/fgfj.gjjb/uhloext : some random text
\\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Polygon\\Haguenau\\Building\\Haguenau hgtfhyt "C:/te-st.html" "C:/te-st.html" gd"dhbcsk "C:/te/dsst.ikpo fdsf "C:\\test" "C:// test.html" gd
"//te s t/e, llo.html
C:/test\\f/uhlo/.
C://te?st.html
b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"
; dfsdf "\\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"
; dfsdf "\\\\
"\\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Custom Data"Haguenau_Building.tab : Data format of \\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Polygon\\Haguenau\\Building\\Haguenau Building.* C: is invalid
Haguenau_Building.tab : Data format of \\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo NetAct Atoll_Planet\\UR_Est\\Polygon\\Haguenau\\Building\\Haguenau Building.TAB, is invalid
Haguenau_Building.tab : Data format of \\\\b-renice\\sauvegardes\\B-HIER\\GEO\\Geo_NetAct_Atoll_Planet\\UR_Est\\Polygon\\Haguenau\\Building\\Haguenau Buildi*.*ng.*, is invalid
C:/test/../hjgbkl C:/test/../hjgbkl.gfgdfgrdgfdgr C:/test/../hjgbkl
C:/test.html
C://test/ .h/hel,lo.html//test/./hello.html
C:/test//hello.html
//test
//hello.html
/test
"%tmp%/fsdfs"
%tmp%/fsdfs
ERROR 8/31/2021 - 6:45:39 PM HighResClutter .RasterFile : \\\\b-ren ice\\sauv egardes\\B-HIER\\GEO%dsq%\\NewJersey_NewYork\\DTM\\DTM\\CENTRAL_JERSE..Y_New_York_2 m_Z18N_0_DTM_02_06.bil : Le fichier spécifié est introuvable.
\\\\b-ren ice\\sauv egardes\\..\\B-HIER\\GEO\\NewJersey_NewYork\\DTM\\DTM\\CENTRAL_JERSE..Y_New_York_2 m_Z18N_0_DTM_02_06.bil C:\\b-ren ice\\sauv egardes\\B-HIER\\GEO\\NewJersey_NewYork\\DTM\\DTM\\CENTRAL_JERSE..Y_New_York_2 m_Z18N_0_DTM_02_06.bil \\\\b-ren ice\\sauv egardes\\B-HIER\\GEO\\NewJersey_NewYork\\DTM\\DTM\\CENTRAL_JER SE.Y_New_York_2 m_Z18N_0_DTM_02_06.bil.
//test.html
\\\\10.1.1.107
//10.1.1.107/test.html
//10.1.1.107/te st/hello.html
//10.1.1.107/test/hello
//test/hello.txt
//test/hello.txt.
\\\\.\\UNC\\Server\\Share\\Test\\Foo.txt
\\\\?\\UNC\\Server\\Share\\Test\\Foo.txt
Pour les chemins UNC de périphérique, la partie serveur/partage forme le volume. Par exemple, dans \\\\?\\server1\\e:\\utilities\\filecomparer\\ , la partie serveur/partage est server1\\utilities . Ceci est important quand
\'\\\\127.0.0.1\\c$\\temp\\test-fi\'le.txt\''
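# Print each full match. The pattern contains named groups, so scan yields
# only the captures; the full match is read from Regexp.last_match instead.
str.scan(re) do
  puts Regexp.last_match(0)
end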
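
# String#scan yields only the captures when the pattern contains groups. A
# minimal sketch of a helper that keeps both the full match and its named
# captures follows; matches_with_groups is a hypothetical name, and any
# pattern with named groups can be passed in.
def matches_with_groups(text, pattern)
  results = []
  text.scan(pattern) do
    m = Regexp.last_match
    # Store plain values: the matched text and a Hash of group name => value.
    results << [m[0], m.named_captures]
  end
  results
end

# Usage: matches_with_groups(str, re).each { |full, groups| puts "#{full} #{groups}" }
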