File write speed over time

Started by cssyphus, April 21, 2014, 12:49:35 PM

Previous topic - Next topic

cssyphus

My TSR script is maintaining a running log of activities. It runs 24/7 on a user workstation.

Presently, the activity log is stored as a variable in memory, and it grows larger over time (max 300Kb). Every few seconds, to prevent potential data loss, it is saved to disk using FilePut().

Initially, this is speedy, but over time the save takes longer and longer to accomplish -- as much as 3 or 4 seconds per save. This makes other things the program does appear sluggish.

Would it be faster if I switch to using FileOpen / FileWrite, and just filewrite() each new line to the file instead of maintaining a growing variable in RAM?

Max log size is ~ 300Kb.

Script is running on Windows 8.1 and, although a paid-up compiler subscription, I am still using 2011B for this script.

Deana

I recommend using Binary Operations to store and write out to the file. See BinaryAlloc, BinaryPokeStr, BinaryWrite and BinaryFree. Generally Binary operations are much quicker than File operations.
Deana F.
Technical Support
Wilson WindowWare Inc.

td

In this case you might get the best results using FileOpen with APPEND mode and then just Write each line or some number of lines to the file as required.  This approach should keep the cost of maintaining your log file closer to constant and may improve performance by taking advantage of the system's file caching to save disk I/O. 

That said, there are several variables involved that can affect the performance of one method when compared to the other so it might be advisable to perform a little experimentation to determine the best approach for you.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

cssyphus

Thanks very much, looks like it is worth doing, as I had hoped. It's great to have an idea whether on the right track before undertaking the code changes.

Question: Is there much of a performance hit between opening/closing log with each filewrite versus leaving logfile open and closing when app terminated? (see below example).  I will frequently copy the logfile to another location for review and the file won't copy (file is locked) if log is open...  However, if method 02 is faster, I will figure it out.

METHOD 1
Code (winbatch) Select

While 1
//do stuff
goSub WriteLog
EndWhile

:WriteLog
hF = fileOpen("path/logfile.log", "append")
fileWrite(hF, "text for log")
fileClose(hF)
Return


METHOD 02
Code (winbatch) Select

hF = fileOpen("path/logfile.log", "append")

While 1
//do stuff
goSub WriteLog
EndWhile

fileClose(hF)
Exit

:WriteLog
fileWrite(hF, "text for log")
Return


Deana

QuoteInstead of opening and closing an often needed file every time you access it, open it only the first time you access it, and close it when you are finished using it. To close and reopen a disk file takes a variable time, but about the same time to read 15 to 20 KB from the disk cache. Therefore, if you need to access a file often, you can avoid this overhead by opening the file only one time before accessing it, keeping it open by hoisting its handle wrapper to an external scope, and closing it when you are done.

Reference: http://en.wikibooks.org/wiki/Optimizing_C%2B%2B/General_optimization_techniques/Input/Output
Deana F.
Technical Support
Wilson WindowWare Inc.

cssyphus

Thanks, Deana, that's a great resource!

I was already pretty sure it would be faster to fileopen at start and fileclose at exit. My request, though, is for expert-level comment regarding the degree of difference this will make.  I am moving from a continuous filePut() loop to fileOpen/fileWrite, and just need a gut-level estimate of the delay overhead of fileOpen/fileClosing with each fileWrite().

For example, would doing all three at once (fileOpen/Write/Close) basically negate the advantages of using fileOpen/fileWrite at all (compared to filePut), or is fileOpen/fileWrite("append") so much faster over time that it will still make a large difference, even if the three are clumped together?

Regarding binary operations, I understand they are significantly faster working in RAM (binarySearch, binaryReplace, binaryIndex, etc) -- but is binaryWrite significantly faster than filePut()?

Deana

For a definitive estimate, I recommend creating a benchmark test on your system using GetTickCount for each scenario and log the results.

Code (winbatch) Select
GoSub Tests

loopcnt = 100
testfile = DirScript():'testlog.txt'
ret1 = udfTest1(loopcnt,testfile)
ret2 = udfTest2(loopcnt,testfile)
ret3 = udfTest3(loopcnt,testfile)
ret4 = udfTest4(loopcnt,testfile)

Message("Results", 'Test1 - Optimized = ': ret1 : @CR :'Test2 - Open and close =': ret2 : @CR :'Test3 - FilePut = ': ret3 : @CR :'Test4 - Binary = ': ret4  )
Exit

:TESTS
#DefineFunction udfTest1(loopcnt, testfile) ; Optimized
   start = GetTickCount()

;Add code here
   h = FileOpen(testfile, "append")
   For x = 1 to loopcnt
       FileWrite(h, "text for log")     
   Next
   FileClose(h)
   FileDelete( testfile )

   endp = GetTickCount()
   elapsed = (endp - start)
   Return elapsed
#EndFunction

#DefineFunction udfTest2(loopcnt, testfile); Open and close
   start = GetTickCount()

   ;Add code here
   For x = 1 to loopcnt
       h = FileOpen(testfile, "append")
       FileWrite(h, "text for log")
       FileClose(h)
   Next
   FileDelete( testfile )
   

   endp = GetTickCount()
   elapsed = (endp - start)
   Return elapsed
#EndFunction

#DefineFunction udfTest3(loopcnt, testfile) ; FilePut
   start = GetTickCount()

   ;Add code here
   data = "text for log"
   For x = 1 to loopcnt
       FilePut( testfile, data )
       data = data:@CRLF:"text for log"
   Next
   FileDelete( testfile )

   endp = GetTickCount()
   elapsed = (endp - start)
   Return elapsed
#EndFunction


#DefineFunction udfTest4(loopcnt, testfile) ; Binary
   start = GetTickCount()

   ;Add code here
   data = "text for log"
   handle = BinaryAlloc((StrLen(data)+2)*100+100)
   offset = 0
   For x = 1 to loopcnt
       FilePut( testfile, data )
       BinaryPokeStr( handle, offset, "text for log":@CRLF )
       offset = offset +(StrLen(data)+2)
   Next
   BinaryWrite( handle, testfile )
   BinaryFree( handle )
   FileDelete( testfile )

   endp = GetTickCount()
   elapsed = (endp - start)
   Return elapsed
#EndFunction
RETURN


You will see that a single open/close with FileWrite in append mode seems to generate the best results.
Deana F.
Technical Support
Wilson WindowWare Inc.