Memory?

Started by bottomleypotts, June 15, 2024, 05:03:55 AM

Previous topic - Next topic

bottomleypotts

I have a 32-bit WinBatch app that is scraping a website. It scrapes a webpage and loops through 20,000 pieces of data to populate an MS Access database, and does this repeatedly. The app starts out using approximately 36 MB of memory, but after the 30th webpage - so 600,000 records added to the database - it is consuming 1.4 GB of memory. I am using functions, and dropping any variable that may contain data. And finally, after running for about 2 hours, the app crashes with undocumented errors.

I don't know what I can do to prevent the consumption of memory. I am dropping all variables except counters. Is there anything I can do to force the app to stop and release all the memory?

td

There isn't enough information here to get into specifics. Assuming your "app" is a compiled WinBatch script: if, for example, you use COM to connect to a website, do you release COM object resources in the COM object's prescribed way? If you are using an extender, do you release extender resources correctly? Are you cleaning up any Access resources as your script continues to execute?
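As a sketch only (the connection string and SQL here are placeholders, not taken from your script), the usual cleanup pattern for ADO resources in a WinBatch script looks something like this:

```winbatch
; Hypothetical cleanup sketch: connstr and sql are placeholder variables.
conn = CreateObject("ADODB.Connection")
conn.Open(connstr)
rs = conn.Execute(sql)
; ... work with the recordset ...
rs.Close()
conn.Close()
; Assign zero to release the COM references so the objects can be freed.
rs = 0
conn = 0
```

Skipping the Close calls or holding the object variables for the life of the script keeps those resources alive.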

If you have access to a 64-bit version of MS Office, pairing it with a 64-bit version of your WinBatch script may be sufficient to work around the problem.

There are several tools that can be used to explore memory usage and memory faulting in detail. Depending on how desperate you are, it is a rabbit hole that may be worth exploring.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

Again assuming a WinBatch compiled script, you could create one script that performs only Web scraping and another that performs only database building. That may narrow down the possibilities. Your description hints at a memory leak, but that is a very speculative explanation. If that can be shown to be the problem, then it can be addressed in some fashion.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

Thanks Tony, I tried compiling as a 64-bit app; that didn't make a difference. The app downloads approximately 60 MB of JSON in total and parses it, and there are 2 levels that I work through with jsConMap. The memory use goes up gradually, not all at once, as it parses the info. Will make the change you suggest and get back to you.

bottomleypotts

I deleted the database access, and I deleted the web object. I think it's parsing the JSON that is the problem. Dropping all the variables makes no difference.

Edit. I deleted everything but the JSON parsing. That is not the problem.

td

Quote from: bottomleypotts on June 15, 2024, 09:23:23 AM
Thanks Tony, I tried compiling as a 64-bit app, that didn't make a difference. The app is downloading JSON approximately 60mb in total, parsing it, then there are 2 levels that I get through jsConMap. The memory use goes up gradually, not all at once, as it's parsing info. Will make the change you suggest and get back to you.

Dropping variables can lead to dangling references. The JSON extender provides the jsConClose function to release all resources connected to a parsed JSON container. You might want to examine your script for proper usage of that function when you are done with a JSON container.

From the help file, "JsConClose accepts an extender JSON object or array container handle and removes the handle from the internal container handle table. The function also attempts to release all system resources associated with the handle. Once a handle is released it can no longer be used in extender functions."
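As a minimal sketch of the intended lifecycle (the file name is a placeholder): parse, use, then close.

```winbatch
AddExtender("ilcjs44i.dll", 0, "ilcjs64i.dll")
jsData = FileGet("data.json")  ; placeholder file name
jsCon = jsParse(jsData)
; ... read values from the container here ...
jsConClose(jsCon)  ; release all resources tied to the handle
exit
```

After the close, the handle is invalid and must not be passed to any other extender function.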

"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

I've found I can call jsConClose but continue to use anything created with jsConMap? So I was dropping those variables also, but no longer do.

td

The jsConMap function accepts a JSON container (that is, a JSON object or a JSON array) and converts its contents to a WIL map, with each key of the map being a JSON name and each map value being the JSON value of a JSON name/value pair. Since the created map contains copies, not references to the JSON container's content, you can safely close the container if you don't need to use it directly after creating the map.
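A minimal sketch of that point, assuming a simple JSON object with no nested containers:

```winbatch
AddExtender("ilcjs44i.dll", 0, "ilcjs64i.dll")
jsCon = jsParse('{ "name": "Boost", "value": 42.99 }')
jsMap = jsConMap(jsCon)
jsConClose(jsCon)  ; safe: the map holds copies, not references
; The map entries remain usable after the container is closed.
Message("name", jsMap["name"])
```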

There is one caveat, however: if the map contains nested JSON objects as values, those objects need to be closed before the map variable is cleared. To work around this potential problem, you can call jsConClose without a parameter to release all currently open JSON containers, or foreach through the WIL map, closing any JSON containers present as you spin through it. Since container handles are WIL string values that begin with the text "*arr" or "*obj", it is relatively easy to identify containers in a map. The alternative is to create a parallel map of JSON value types and use the two maps in combination to release the objects in the value map. My description makes it sound much more complicated than it is.

An enhancement note in the to-do list for the extender suggests adding a function to the extender that accepts a JSON based map as a parameter and releases all containers and binary buffers in the map.

Also, note that removing the JSON extender using IntControl 99 releases all JSON containers created by the extender.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

Need to correct the rambling screed above. The jsConMap function does not create binary buffers so you do not need to concern yourself with their presence in a map.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

Here is a simple map cleanup example if you can't use jsConClose() to close all objects and arrays for some reason:

AddExtender("ilcjs44i.dll", 0, "ilcjs64i.dll")
jsData = $"{
  "pi": 3.141,
  "happy": true,
  "name": "Boost",
  "nothing": null,
  "answer": {
    "everything": 42
  },
  "list": [1, 0, 2],
  "object": {
    "currency": "USD",
    "value": 42.99,
    "nested": {
      "value": "nested value"
    }
  },
  "value": "Top level value"
}$"

jsCon = jsParse(jsData)
jsMap = jsConMap(jsCon)
foreach Name in jsMap
   jsType = jsValueType(jsCon, Name)
   if jsType == @JsonArr || jsType == @JsonObj then jsConClose(jsMap[Name])
next
jsConClose(jsCon)  ; release the top-level container as well
exit
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

Tony, the simple map cleanup example seems to apply to the rambling post and not the correction post. It appears that the example is clearing the binary buffers that a jsConMap would have created, which your clarification said is not required. As I understand it, your example is one where I am not using jsConClose(); therefore I would have thought that after jsMap was created, it would be the only object I needed to jsConClose. Or am I getting this wrong?

td

A map is a WIL language associative array. You cannot pass a WIL map to jsConClose because the function only accepts extender-generated JSON containers. JSON objects and arrays can contain other JSON objects and arrays as member name/value pairs. These are stored as container handles and are independent of the parent object used to create the map. The handles in a map need to be released, just as the container handle used to create the map needs to be released, to free the resources they use. This is usually done by calling jsConClose without parameters, exiting the script, or releasing them individually as shown in the script above. The last option is only necessary if a script accesses many JSON objects and process resources are in danger of exhaustion.

The JSON extender does not represent any JSON values as binary buffers. That was an error on my part.

The bigger questions are: have you determined what is causing your memory problem, and has cleaning up JSON handles fixed it?
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

I wrote some code to just process a bunch of .json files into a .txt file using FileWrite calls.

I tried all sorts of combinations of cleaning up handles as per your example, and eventually I just coded a jsConClose() after processing each file.

Still have memory issues. However, the memory is not being consumed anywhere near the rate it was. This allowed us to get the data we wanted - so job done this time.

td

Closing handles when no longer needed is good programming practice. Preventing unnecessary resource consumption is the reason for the function in the first place.

It is likely your memory issues are the result of something other than the JSON extender.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

bottomleypotts

OK. I am starting to see where I may be having problems. I am reading from an MS Access database into an array using ADODB.Connection and GetRows(), then ArraySearch()ing that array, and I am seeing memory run away. Changing my code to do an ArrayFilePutCSV() followed by an ArrayFileGetCSV() has drastically reduced my memory issues. It appears that these kinds of variable errors may be the cause of my memory problems.
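For what it's worth, the workaround amounts to something like this sketch (the file name is a placeholder, and I'm assuming the simple forms of the CSV functions):

```winbatch
; RS is an open ADO recordset; temp.csv is a placeholder path.
arr = RS.GetRows()                    ; COM variant array from ADO
ArrayFilePutCSV("temp.csv", arr)      ; write it out...
arr = ArrayFileGetCSV("temp.csv", 0)  ; ...and read it back as a native WIL array
```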

JTaylor

I don't know if it would help, but WinBatch used to have problems with those arrays, and they did something to fix it. What I did before the fix was to make some alteration to the array using the WinBatch functions: add a row and delete it, etc. It seemed to convert the array in some fashion, and it worked as expected.

Might try that to see if you can avoid the file write/read, if that would be helpful. Don't know your situation, so it may not matter.

Jim

bottomleypotts

Thanks Jim, I was unable to view a GetRows() array while debugging, so I was using an ArrayRedim() with the same dimensions as the original array.
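In sketch form (assuming ArrInfo requests 1 and 2 return the sizes of the first two dimensions):

```winbatch
; RS is an open ADO recordset.
arr = RS.GetRows()
rows = ArrInfo(arr, 1)  ; elements in dimension 1
cols = ArrInfo(arr, 2)  ; elements in dimension 2
ArrayRedim(arr, rows, cols)  ; same size, but forces a native WIL array
```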

JTaylor

That might be what I did, actually.  Been a while and can't quite remember.

jim

spl

Quote from: JTaylor on June 22, 2024, 06:54:57 PM
That might be what I did, actually. Been a while and can't quite remember.

jim

Quite a while is correct. I had to go back to posts I made in 2004 regarding GetRows() from ADO recordsets. There were definitely issues with field types like memo, and ArrayFilePutCSV() would fail. I then tried
; RS is the recordset
; outfile = file to be created
array = RS.GetRows()
mybuf = BinaryAllocArray( array )
BinaryWrite( mybuf, outfile )
BinaryFree( mybuf )

which got results.
Stan - formerly stanl [ex-Pundit]