Surprised I haven't encountered this before, but I haven't.
I am parsing a CLI terminal export file that uses a random number of spaces between columns, but zero spaces within column data. It would be great to arrayize() each line, using the space(s) as the delimiter.
To do that, I need to S/R all multiple-spaces with a single space, so that columns are exactly one space apart (recall that there are zero spaces in the actual data, so this will work fine).
I am guessing that this problem has been solved before, likely many times, and wondering if anyone has any advice on the best / fastest approach -- or a tried-and-true UDF they could share?
For the curious, here's an example of the input file. I only need the first two columns... the rest is chaff.
------------------------------------------------------------------------------------------------------------------
VLAN Name Status Reason Type Interfaces
------------------------------------------------------------------------------------------------------------------
1 DEFAULT_VLAN_1 up ok default 1/1/1-1/1/4,1/1/8-1/1/9,
1/1/11-1/1/15,1/1/17-1/1/19,
1/1/36-1/1/38,1/1/44,
1/1/51-1/1/52,2/1/1-2/1/24,
2/1/27-2/1/28
10 10-SERVER-CDE up ok static 1/1/1-1/1/3,1/1/7,1/1/16,
1/1/22,1/1/51-1/1/52,
2/1/1-2/1/24
11 11-VOICE-LEGACY up ok static 1/1/1-1/1/3,1/1/7,1/1/16,
1/1/22,1/1/51-1/1/52,
2/1/1-2/1/24
12 VLAN12 up ok static 1/1/1-1/1/3,2/1/1-2/1/24
If you are open to using a 3rdParty Extender and the columns begin in fixed positions, there is a snCut() function in the wbOmnibus extender that would allow you to chop out the two columns you want and place a delimiter between them.
If not, something like:
rcnt = 10
txt = FileGet(file)
While StrIndex(txt," ",......) > 0
txt = StrReplace(txt,StrFill(" ", rcnt)," ")
If rcnt > 2 Then rcnt = rcnt -1
EndWhile
FileWrite(newfile,txt)
Jim
Seems to be a fixed length for for data columns. Code below appears to work
file = "C:\temp\vlan.txt" ;your sample copied/pasted to file
file1 = "C:\temp\vlan_1.txt"
If FileExist(file1) Then FileDelete(file1)
h = FileOpen(file, "READ")
h1 = FileOpen(file1, "WRITE")
While @True
line = FileRead(h)
If line == "*EOF*" Then Break
If strsub(line,1,1)=="-" || strlen(strtrim(strsub(line,1,30)))==0 Then Continue
fld1 = strtrim(strsub(line,1,6))
fld2 = strtrim(strsub(line,7,30))
FileWrite(h1,fld1:" ":fld2)
Endwhile
FileClose(h)
FileClose(h1)
Thanks Stan and Jim !
Stan, looking at your solution, that will work perfectly and I'll implement that immediately for this task.
Jim, I've long been looking for a use case to play with your Omnibus extender - I will experiment with that here to see how it works. I've been meaning to kick the tires on wbOmnibus for a long time - much too long.
---
However, I frequently need to S/R random-length strings of multiple spaces with a single space. Typically, I just do this (using periods in place of spaces for visibility):
strReplace(strReplace(strReplace(strReplace(strReplace(_txt, `..`, `.`), `..`, `.`), `..`, `.`), `..`, `.`), `..`, `.`)
It works - but it's slow.
I would think one could use Regular Expressions to solve this, but I haven't a clue how to implement that for a Search/Replace... I've looked at the usual RegEx example in the Tech DB (https://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/nftechsupt.web+WinBatch/dotNet/System_Text/RegularExpressions+RegEx.txt), but it is merely for conducting a search and returning the found text -- how would one turn that into a reusable Search/Replace udf?
Thanks guys!
Assuming that I haven't misunderstood your needs, here is a simple regex example that appears to work:
OutFile = '\temp\Cleaned.txt'
if FileExist(OutFile ) then FileDelete(OutFile )
ObjectClrOption('useany','System.Text.RegularExpressions')
objRegEx = ObjectClrNew('System.Text.RegularExpressions.Regex', '123')
strOrig = FileGet(OutFile)
strCleaned = objRegEx.Replace(strOrig, '\s+', " ")
FilePut('\temp\Cleaned.txt', strCleaned)
RunApp('Notepad', OutFile )
If you wish to keep carriage returns, line feeds, and tabs, the following works:
strCleaned = objRegEx.Replace(strOrig, ' +', " ")
Nice, but seems it does a fileget() on a file that has been deleted
Quote from: cssyphus on April 10, 2024, 05:57:56 AMThanks Stan and Jim !
Stan, looking at your solution, that will work perfectly and I'll implement that immediately for this task.
Jim, I've long been looking for a use case to play with your Omnibus extender - I will experiment with that here to see how it works. I've been meaning to kick the tires on wbOmnibus for a long time - much too long.
Thanks guys!
Glad it worked. If the issue is parsing specific columns from a fixed length file and rewriting them a single-space delimited then I assumed mapping the columns start and length would be necessary. This could be accomplished with a config/ini file, a WB map, or a regex pattern. Parsing the file, assuming it could be huge [i.e. 100-200,000 rows] could be done with WB file functions, or .net [System.IO.StreamReader] / [System.IO.StreaWriter] / [System.Text.StringBuilder] and would be interesting to test performance on a 1gb file. Then, of course there are the lines that should not be parsed into final output. In your case you only wanted to parse 2 columns, but if those circumstances change or you need to parse other columns in other files, then a one of the mappings for all columns would in my opinion be provident.
Quote from: spl on April 10, 2024, 11:24:45 AMNice, but seems it does a fileget() on a file that has been deleted
You are correct. The script should be
OutFile = '\temp\Cleaned.txt'
if FileExist(OutFile ) then FileDelete(OutFile )
ObjectClrOption('useany','System.Text.RegularExpressions')
objRegEx = ObjectClrNew('System.Text.RegularExpressions.Regex', '123')
strOrig = FileGet('\temp\test.txt') ; Reminder: check before you post.
strCleaned = objRegEx.Replace(strOrig, ' +', " ")
FilePut(OutFile, strCleaned)
RunApp('Notepad', OutFile )
And thanks for pointing out the blunder.
Bad weather here, keep losing power on and off. It is quiet now so thought I would contribute one more quick script to the thread - a sample for creating a strsub() mapping for the sample
cols = "VLAN,Name,Status,Reason,Type,Interfaces"
line = "VLAN Name Status Reason Type Interfaces "
cnt = ItemCount(cols,",")
llen = Strlen(line)
i=1
k=2
map = ""
While k<=cnt+1
If k>i
col = ItemExtract(i,cols,",")
col2 = ItemExtract(k,cols,",")
pos = Strindex(line,col,0,@FWDSCAN)
If k>cnt
pos2 = llen
else
pos2 = Strindex(line,col2,0,@FWDSCAN)
endif
map=map:col:"=strtrim(strsub(line,":pos:",":pos2-pos:"))":@CRLF
i+=1
k+=1
Endif
Endwhile
Message("Mapping",map)
Exit