Determining Type of Binary Buffer contents

Started by KeithW, October 08, 2025, 05:01:00 PM

KeithW

Greetings,

I have looked through the archives and cannot locate some code I know I have seen. In a program, I am loading a binary buffer and, before I start processing its contents, I want to know whether it contains any binary data as opposed to strictly ASCII text. I believe the example I saw was by Tony, but I am not positive.

Does anyone have a reliable way of determining whether the buffer contains byte values above 7Fh?

Thanx,
Keith

spl

Not sure. I looked in my archives and found code that searches for @LF (char2num(10)) to detect text in binary data. Maybe look for char2num( < 127 ), although I'm sure Tony will reply.
Stan - formerly stanl [ex-Pundit]

td

The most reliable method is to check each byte value to ensure that it is in the range of an ASCII character.
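As a minimal sketch of that byte-by-byte check, here is the logic in Python rather than WinBatch, purely to illustrate the idea. The function names `is_seven_bit` and `first_high_byte` are hypothetical and are not part of any WinBatch API.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True when every byte is in the 7-bit ASCII range (00h-7Fh)."""
    return all(byte <= 0x7F for byte in data)


def first_high_byte(data: bytes) -> int:
    """Return the offset of the first byte above 7Fh, or -1 if there is none."""
    for offset, byte in enumerate(data):
        if byte > 0x7F:
            return offset
    return -1
```

Both functions stop at the first byte above 7Fh, so a buffer with binary data near the start is rejected quickly; only a clean buffer costs a full pass.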

An alternative would be to randomly sample a percentage of byte values. That approach is less reliable but more efficient if the buffer is large enough.
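A rough sketch of the sampling idea, again in Python for illustration only. The name `sample_looks_ascii`, the 5% default fraction, and the `seed` parameter are all assumptions made for this example.

```python
import random


def sample_looks_ascii(data: bytes, fraction: float = 0.05, seed=None) -> bool:
    """Check a random subset of bytes; True means no sampled byte exceeded 7Fh."""
    if not data:
        return True
    rng = random.Random(seed)
    # Sample at least one byte and never more than the buffer holds.
    count = min(len(data), max(1, int(len(data) * fraction)))
    offsets = rng.sample(range(len(data)), count)
    return all(data[offset] <= 0x7F for offset in offsets)
```

A False result is conclusive, since a byte above 7Fh was actually seen. A True result only says that none of the sampled bytes were high, so a few isolated binary bytes can slip through.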

A third approach would be to use the BinaryIndexEx function to repeatedly search for character values that are outside of expected text values.
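The repeated-search idea might look something like the sketch below. It is Python, not WinBatch, and it does not call BinaryIndexEx; a compiled byte pattern stands in for the search function. The names `HIGH_BYTE` and `find_offsets` are hypothetical, and the pattern assumes "outside of expected text values" means anything above 7Fh.

```python
import re

# Assumption: anything from 80h to FFh counts as non-text.
HIGH_BYTE = re.compile(rb"[\x80-\xff]")


def find_offsets(data: bytes, pattern=HIGH_BYTE) -> list:
    """Search repeatedly, resuming just past each hit, and collect hit offsets."""
    offsets = []
    position = 0
    while True:
        match = pattern.search(data, position)
        if match is None:
            break
        offsets.append(match.start())
        position = match.start() + 1
    return offsets
```

For example, `find_offsets(b"abc\x80def\xffg")` returns `[3, 7]`. Passing a different pattern, say one that also flags control characters, changes what counts as unexpected without touching the loop.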
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

spl

Quote from: td on October 09, 2025, 01:09:07 PM
A third approach would be to use the BinaryIndexEx function to repeatedly search for character values that are outside of expected text values.

That gets my vote, and I apologize - I meant >127 not <127 in my response. How that might be done would be interesting, but dependent on your original goal. Loading a binary buffer and...
  • Do not process further if high ascii value found
  • Iterate all instances of high ascii values and decide from there
  • Re-encode and continue

WB has a super supply of binary functions. Using CLR with .Net StreamReader() might be an option for re-coding, but has issues.

I think with a little more information about the source and goal, your ask could be accomplished.

[EDIT]
You can also consider loading the data with ADODB.Stream, with the charset set to "UTF-8" or "ISO-8859-1".
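To show what trying those charsets amounts to, here is a small Python sketch. It does not use ADODB.Stream; it only shows the fallback order, and the function name `decode_buffer` is hypothetical.

```python
def decode_buffer(data: bytes):
    """Try strict ASCII, then UTF-8, then fall back to ISO-8859-1.

    Returns a (text, charset) tuple naming the charset that succeeded.
    """
    for charset in ("ascii", "utf-8"):
        try:
            return data.decode(charset), charset
        except UnicodeDecodeError:
            continue
    # ISO-8859-1 maps every byte value to a character, so it cannot fail.
    return data.decode("iso-8859-1"), "iso-8859-1"
```

Because ISO-8859-1 accepts any byte sequence, landing on it proves nothing about the data being text; it only guarantees that decoding does not raise an error.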

td

Of course, if loading a binary buffer from a file, one can simply use the FileEncoding function to determine whether the file, and therefore the buffer, is ANSI text.

spl

Quote from: td on October 10, 2025, 08:17:28 AM
Of course, if loading a binary buffer from a file, one can simply use the FileEncoding function to determine whether the file, and therefore the buffer, is ANSI text.

And, of course, of course... these threads can often lead to a function not immediately in memory. The OP might not like the sideshow, but I appreciate you mentioning the function.

td

Yup. I didn't think of the function when I first read the OP's post. There is a certain amount of irony in that lapse.