All:
(Using Winbatch 2008c)
Is there a way to quickly read how many lines an ASCII text file has?
I want to be able to do this so I can dynamically dimension arrays, set countdown counters, etc. and do this quickly.
Some of these files can contain millions of lines.
Any ideas?
As always, thanks in advance
Two approaches come readily to mind...
FileGet()/ItemCount() using @CR or @LF as a delimiter (whichever is appropriate).
BinaryStrCnt(). Searching for above delimiter.
Jim
Here's my code snippet:
F=FileGet("abc.txt")
Num=ItemCount(F,@LF)
MsgTxt=StrCat(Num," Records")
Message("There are:",MsgTxt)
exit
When I run it, I get:
"VMalloc error - VirtualAlloc failed" message.
(File has about 7,000,000 lines)
Any ideas? Thanks
That is why I also suggested the Binary approach. If the files are VERY large then that would be the better approach.
Jim
If your system lacks the resources to handle the data in that fashion you will probably need to split it into smaller chunks.
Another approach would be to load it into an array and use the ArrInfo() function. Again, this assumes your system can handle the load.
Jim
Quote from: JTaylor on August 01, 2015, 06:22:40 PM
That is why I also suggested the Binary approach. If the files are VERY large then that would be the better approach.
Jim
Might also consider the .NET StreamReader through the WB CLR. There are a couple of posts using that in the Tech DB. I believe it has a lines or linecount property that may give what the OP is after.
Since the OP is using a 2008 version of WinBatch, he does not have access to CLR hosting. FileGet is limited to a file of somewhere around 80-90 MB max, so the error is not surprising. The suggested binary buffer approach would be good up to around 350 MB file size, depending on the execution environment. If a single binary buffer is too small, then it will be necessary to take the divide-and-conquer approach. The topic concerning replacing NULLs in this board has an example that can be adapted to the OP's purposes.
Possibly relevant topic:
http://forum.winbatch.com/index.php?topic=1429.0
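For what it's worth, the divide-and-conquer idea could be sketched roughly as follows. This is an untested sketch, not the code from the linked topic: the function name CountLfChunked and the chunksize parameter are made up for illustration, and it assumes BinaryReadEx() is available in the OP's build. It counts @LF bytes only, so add 1 if the file does not end with a line feed, and search for @CR instead if the file is CR-terminated.

```winbatch
; Hypothetical helper: counts @LF bytes in a file, one chunk at a time.
; chunksize is an assumed tuning value in bytes, e.g. 10000000.
#DefineFunction CountLfChunked(fname, chunksize)
   total = FileSize(fname)
   if total == 0 then return 0
   buf = BinaryAlloc(chunksize)
   lfcount = 0
   offset = 0
   while offset < total
      ; Never ask for more bytes than remain in the file
      want = Min(chunksize, total - offset)
      got = BinaryReadEx(buf, 0, fname, offset, want)
      if got <= 0 then break
      ; @LF is a single byte, so it can never straddle a chunk boundary
      lfcount = lfcount + BinaryStrCnt(buf, 0, got - 1, @LF)
      offset = offset + got
   endwhile
   BinaryFree(buf)
   return lfcount
#EndFunction

Message("Line feeds found", CountLfChunked("abc.txt", 10000000))
```

Only one chunk is in memory at a time, so the buffer size no longer depends on the file size.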
I should have also mentioned that the ArrayFileGet approach is a good solution as long as your file isn't too big to get into memory all at once. The only downside is that the function has a bit more memory and CPU overhead than the binary buffer approach. The obvious upside is that your file is loaded and placed in an array in a single step.
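A minimal sketch of that approach, assuming ArrayFileGet() is available in the OP's build and the file fits in memory (the file name is just the one from the earlier snippet):

```winbatch
; Load every line of the file into a one-dimensional array
lines = ArrayFileGet("abc.txt")
; Request 1 asks ArrInfo for the number of elements in dimension 1
num = ArrInfo(lines, 1)
Message("There are:", StrCat(num, " Records"))
```

The array is then already dimensioned to the line count, which is what the OP wanted the number for in the first place.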
I just remembered WB works with PHP (even version 2008) and you can harness something as simple as:
<?php
$file = "somefile.txt";
$lines = count(file($file));
echo "There are $lines lines in $file";
?>
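In case this helps:
#definefunction NbLinesInFile(myfile)
   ; Route any runtime error to :wberrorhandler below
   intcontrol(73, 1, 0, 0, 0)
   bb = 0
   ; Check existence first so filesize is never called on a missing file
   if !fileexist(myfile) then return -1
   fs = filesize(myfile)
   if fs == 0 then return 0
   ; Load the whole file into a binary buffer
   bb = binaryalloc(fs)
   fs = binaryread(bb, myfile)
   ; Count each candidate line terminator
   nblf = binarystrcnt(bb, 0, fs - 1, @lf)
   nbcrlf = binarystrcnt(bb, 0, fs - 1, @crlf)
   nbcr = binarystrcnt(bb, 0, fs - 1, @cr)
   ; The most frequent terminator wins; a tie favors @crlf
   maxrc = max(nblf, nbcrlf, nbcr)
   if maxrc == nbcrlf
      sep = @crlf
   else
      if maxrc == nblf
         sep = @lf
      else
         sep = @cr
      endif
   endif
   lsep = strlen(sep)
   ; An unterminated last line still counts as a line
   if fs < lsep
      ; File is shorter than the terminator, so it cannot end with one
      maxrc = maxrc + 1
   else
      lastc = binarypeekstr(bb, fs - lsep, lsep)
      if lastc <> sep then maxrc = maxrc + 1
   endif
   binaryfree(bb)
   return maxrc

:wberrorhandler
   ; Release the buffer if it was allocated, then signal failure
   if bb <> 0 then binaryfree(bb)
   return -1
#endfunction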