Is there a way to do a Checksum of an Array's contents in memory?

Started by ....IFICantBYTE, January 15, 2017, 05:01:18 PM


....IFICantBYTE

Does anyone know how I could do a checksum of a two dimensional array's contents in memory?
I know I could probably write the array to a file (ArrayFilePutCSV), load it back into a binary buffer, and use the BinaryChecksum command on it, but I would rather not save anything to disk if possible: this will run in a loop every 20 seconds, all day, 24x7.

Basically I want to see whether any data has changed between the current array contents and the previous array contents... an array compare.

The array data is the result of an ADODB / LDAP query of various Active Directory User objects and some associated attribute data. If anything has changed for any of the users since the previous query, I need to perform some actions.

Looping through every array element and comparing it to a copy would also be too slow and onerous. Basically, I just want the equivalent of the BinaryCheckSum command, but for Array contents.

Any help converting the Array directly to a binary buffer might be the answer, although, as mentioned before, it is the result of an "objRecordSet.GetRows()" rather than a natively ArrDimensioned WinBatch array, if that makes any difference.

Thanks in advance for any help...
 
Regards,
....IFICantBYTE

Nothing sucks more than that moment during an argument when you realize you're wrong. :)

JTaylor

Could you do a GetString() for the compare?   Not sure how much data there might be and how slow a compare of large text blocks might be.  Could plop that into a Buffer if too slow though.

Jim

....IFICantBYTE

Hi Jim,
A GetString() on the whole ADO recordset before I turn it into an Array? ... (And then write that string data into a binary buffer?)

Hmmm... that might work, I guess. I was looking at it from the other side, i.e. with the data already in an Array that I manipulate with WinBatch commands. I have a UDF that does the query and returns the results, with some other data added, in an array that I then use in several ways to display a reportview control. I guess I could break things up a bit in the UDF first.

I will see what I can do ... would still be nice to have a "native" WinBatch CheckSum/Hash command of WinBatch Array data though.

Thanks for the idea.. looking at it from the beginning rather than the end though.
Regards,
....IFICantBYTE


JTaylor

You can, of course, do both, a GetString() and a GetRows() when you open the recordset or, probably better, do the GetString(), compare it immediately with the previous GetString() while the RecordSet is open and then if needed do the GetRows() so you have an Array.  This, of course, assumes I am understanding what you are doing which may not be the case.   Probably no need to put it in a buffer unless the Compare is unreasonably slow for some reason.

Jim
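
The flow Jim describes (take the string snapshot, compare it with the previous one, and only build the array when something changed) can be sketched roughly as follows. This is an illustration in Python rather than WinBatch, and every name in it is hypothetical: `get_string` and `get_rows` merely stand in for the recordset's GetString() and GetRows() calls, and the sample data is made up.

```python
from typing import Callable, List, Optional, Tuple

Rows = List[List[str]]

def poll_once(get_string: Callable[[], str],
              get_rows: Callable[[], Rows],
              previous: Optional[str]) -> Tuple[str, Optional[Rows]]:
    """Return (snapshot, rows); rows is None when nothing has changed."""
    snapshot = get_string()              # stand-in for GetString()
    if snapshot == previous:
        return snapshot, None            # unchanged: skip building the array
    return snapshot, get_rows()          # stand-in for GetRows()

# Hypothetical in-memory stand-in for the query result.
data: Rows = [["jsmith", "Sales"], ["adoe", "IT"]]

def fake_get_string() -> str:
    # Fields joined by tab, rows by carriage return (an assumption here).
    return "\r".join("\t".join(row) for row in data)

def fake_get_rows() -> Rows:
    return [list(row) for row in data]

# First pass has no previous snapshot, so it always counts as "changed".
prev, rows = poll_once(fake_get_string, fake_get_rows, None)
# Second pass sees identical data and returns no rows.
prev, unchanged = poll_once(fake_get_string, fake_get_rows, prev)
# After an attribute changes, the next pass returns the fresh rows.
data[1][1] = "HR"
prev, changed = poll_once(fake_get_string, fake_get_rows, prev)
```

Only the previous snapshot string has to be kept between polls, so the array is rebuilt only on the passes where the data actually differs.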

JTaylor

If I am not understanding, explain what steps are required for the task and I might have other ideas. Data is sort of my "thing" :)

Jim

ChuckC

I'm seeing a slight problem with just making a big string out of the array content... Let's say the first instance of the array has logically adjacent cells with values of "ABCDEF" and "" in them, and then the next instance of the array has those same two cells with values of "ABC" and "DEF".  Unless you have delimiters between the various cell values as you concatenate them into one long string from which a checksum/hash is to be computed, the differences will be indistinguishable using the proposed methodology.
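
A minimal illustration of that collision, written in Python purely for demonstration (it is not WinBatch code). The two cell pairs are the ones from the example above, and the tab delimiter is just one possible choice.

```python
# Two "array instances" holding the same characters split across cells differently.
first = ["ABCDEF", ""]
second = ["ABC", "DEF"]

# Concatenating without a delimiter loses the cell boundaries.
plain_first = "".join(first)
plain_second = "".join(second)

# Joining with a delimiter keeps the boundaries visible.
delim_first = "\t".join(first)
delim_second = "\t".join(second)

print(plain_first == plain_second)   # True: the difference is invisible
print(delim_first == delim_second)   # False: the difference is detected
```

Any checksum or hash computed over the undelimited strings would be identical for both arrays, because the inputs themselves are identical.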

If you can have two instances of the array data in memory at the same time, why not just write a UDF that does a cell by cell comparison of the two arrays and returns a boolean value indicating the [in]equality of the two arrays?
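
The shape of such a comparison routine, sketched in Python for illustration only. A WinBatch UDF would follow the same nested-loop structure; the function name `arrays_equal` is hypothetical.

```python
from typing import Sequence

def arrays_equal(a: Sequence[Sequence[object]],
                 b: Sequence[Sequence[object]]) -> bool:
    """Cell-by-cell comparison of two 2-D arrays; stops at the first mismatch."""
    if len(a) != len(b):
        return False                      # different row counts
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            return False                  # different column counts
        for cell_a, cell_b in zip(row_a, row_b):
            if cell_a != cell_b:
                return False              # first differing cell ends the scan
    return True

previous = [["ABCDEF", ""], ["x", "y"]]
current = [["ABC", "DEF"], ["x", "y"]]
print(arrays_equal(previous, previous))   # True
print(arrays_equal(previous, current))    # False
```

Because the cells are compared individually, the boundary problem described above cannot occur, and the loop exits early as soon as one difference is found.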

td

Of course, these results are highly dependent on many variables, but I did a quick benchmark comparing each element of two rank-two arrays with 10,000 string elements each using nested loops. The comparison took .24 seconds (less than a quarter of a second) using 32-bit WinBatch and even less, .17 seconds, using 64-bit WinBatch.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

JTaylor

For what it is worth: I guess large strings compare okay, assuming I did this right and offering the same caveats as Tony. What I was suggesting took 15 ticks, so about .015 seconds. This was with a little over 13,500 elements, or 846 rows with 16 columns.

If you try it, do remember you must issue a MoveFirst after GetString before issuing GetRows.

Jim

td

As Chuck pointed out for checksums, unless fields are delimited, a string compare can report two different arrays as equal.

JTaylor

GetString returns the data from a RecordSet, by default, with the fields separated by @TAB and the rows by @CR.  Started to respond to Chuck but wasn't sure if he was speaking to what I posted or not.   Chose to assume he knew how GetString and GetRows worked.

Jim
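
To tie the delimiter point back to the original checksum question, here is a rough Python sketch (not WinBatch) that serializes a 2-D array with tab and carriage-return separators and checksums the result. `zlib.crc32` is only a stand-in for whatever checksum routine is actually used, `array_checksum` is a hypothetical name, and the sketch assumes no cell value itself contains a tab or carriage return.

```python
import zlib
from typing import Sequence

def array_checksum(rows: Sequence[Sequence[object]]) -> int:
    """CRC-32 over a tab/CR-delimited rendering of a 2-D array."""
    # Assumption: cell values never contain "\t" or "\r" themselves;
    # if they can, the delimiters no longer mark cell boundaries reliably.
    text = "\r".join("\t".join(str(cell) for cell in row) for row in rows)
    return zlib.crc32(text.encode("utf-8"))

before = [["ABCDEF", ""]]
after = [["ABC", "DEF"]]
print(array_checksum(before) == array_checksum(before))  # True
print(array_checksum(before) == array_checksum(after))   # False
```

A checksum only needs to be stored as a single number between polls, but unlike a direct string compare it can in principle collide for different inputs, so an equal checksum is strong evidence of no change rather than proof.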

td

Then a string compare should work.  I can't speak for Chuck but I have very little interest in using something like ADO when working with databases. 

JTaylor

Just out of curiosity...what method do you use for interacting with databases in WinBatch or similar situations?

Jim

ChuckC

The issue I brought up wasn't in reference to the string values you get back from ADO methods.  I was thinking in the context of the original post in this thread, where the stated problem was to perform a checksum of the contents of a WIL array.

td

Likewise, my post regarding array comparison speed was in response to the assertion that array comparisons would be "too slow and onerous". That is not to say it isn't the case, as it may well be, but it cannot be assumed either.

JTaylor

Likewise, I responded to the fact that he was getting his data for comparison from GetRows() and wanted to compare it to previous data (I think) :D

Jim

ChuckC

If I'm not mistaken, though, the stats on test cases for comparisons of all values across two arrays looked favorable enough to make that method feasible, with a very simple implementation in a UDF that performs whole array comparison of identically sized arrays.

JTaylor

I agree that either is quite feasible. I simply suggested what I did because it fit neatly in with what he is already doing and is much faster and more straightforward than the array compare... but when you are talking sub-second either way, the speed usually doesn't matter for an operation like this. Assuming I understand the situation, of course.

Jim