Binary Tag

Started by JTaylor, March 14, 2014, 03:15:57 PM

JTaylor

One for the Wish List...assuming I'm not overlooking something.

It would be helpful if the Binary Tag Find/Extract functions had a flag that made the End Tag search work backward from the end of the buffer, so one could extract data from HTML/XML. Example: if I wanted to extract the text of an HTML table that contains an embedded table, I can't, because the search finds the <table> of the outer table and the </table> of the embedded table.
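To make the nested-table case concrete, here is a minimal Python sketch of the two search directions. It is not WinBatch code and says nothing about how the BinaryTag functions are implemented; `extract_between` and its `end_from_back` flag are hypothetical names used only for illustration.

```python
def extract_between(buffer, start_tag, end_tag, end_from_back=False):
    """Return the text between start_tag and end_tag, or None if not found.

    end_from_back is a hypothetical flag: when True, the end tag is
    located by searching backward from the end of the buffer.
    """
    start = buffer.find(start_tag)
    if start == -1:
        return None
    body_start = start + len(start_tag)
    if end_from_back:
        end = buffer.rfind(end_tag)              # last end tag in the buffer
    else:
        end = buffer.find(end_tag, body_start)   # first end tag after the start tag
    if end == -1 or end < body_start:
        return None
    return buffer[body_start:end]


html = "<table><tr><td><table><tr><td>inner</td></tr></table></td></tr></table>"

# Forward search stops at the embedded table's </table>.
forward = extract_between(html, "<table>", "</table>")

# Backward search pairs the outer <table> with the outer </table>.
backward = extract_between(html, "<table>", "</table>", end_from_back=True)
```

With the forward search the result is cut off at the inner closing tag; with the backward search the whole body of the outer table comes back. This only covers a single outermost pair, and sibling tables in the same buffer would still need something smarter.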

No need to post code on how to work around the problem, as I could work that out, but that can get very messy very quickly.

Thanks.

Jim

JTaylor

Should you tackle the previous one, maybe you could also provide an option for the Extract and Replace functions to affect the Begin/End tags as well. You provide options for Index and Length to take those into account but didn't follow through with all the BinaryTag functions.

Thanks again.

Jim

JTaylor

I think the BinaryTagRepl() function does replace the tags as well as the text. Hmmmmmmmmm....if the Extract is changed so one can pull the tags, this would probably be okay...but it might be useful to be able to replace only the text in between the tags as well. Hopefully you can see the problem with these two not being able to work the same way.
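A rough Python sketch of what "working the same" could look like: one shared flag decides whether the tags count as part of the span, for both extract and replace. This is not the BinaryTag API; `find_span`, `tag_extract`, `tag_replace` and `include_tags` are hypothetical names, and the `<<` / `>>` delimiters are just an example template convention.

```python
def find_span(buffer, start_tag, end_tag, include_tags):
    """Return (begin, end) offsets of the tagged region, or None if absent."""
    start = buffer.find(start_tag)
    if start == -1:
        return None
    end = buffer.find(end_tag, start + len(start_tag))
    if end == -1:
        return None
    if include_tags:
        return start, end + len(end_tag)      # span covers both tags
    return start + len(start_tag), end        # span covers only the inner text


def tag_extract(buffer, start_tag, end_tag, include_tags=False):
    """Extract the tagged region, with or without the tags themselves."""
    span = find_span(buffer, start_tag, end_tag, include_tags)
    return None if span is None else buffer[span[0]:span[1]]


def tag_replace(buffer, start_tag, end_tag, new_text, include_tags=True):
    """Replace the tagged region, with or without the tags themselves."""
    span = find_span(buffer, start_tag, end_tag, include_tags)
    if span is None:
        return buffer
    return buffer[:span[0]] + new_text + buffer[span[1]:]


template = "Dear <<NAME>>, welcome."

inner = tag_extract(template, "<<", ">>")
whole = tag_extract(template, "<<", ">>", include_tags=True)
replaced_all = tag_replace(template, "<<", ">>", "Jim")
replaced_inner = tag_replace(template, "<<", ">>", "Jim", include_tags=False)
```

Because both functions go through the same span calculation, whatever one can extract the other can replace, which is the consistency being asked for here.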

Jim

Deana

As I understand it, the Binary Tag functions were written and intended for template file processing. They can sometimes be used for simple HTML parsing, but more complicated HTML may require another method. For more about the usage of the Binary Tag functions, check out: http://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/tsleft.web+Tutorials+Template~File~Processing.txt.

Since it appears you are trying to parse HTML/XML, note that there are quite a few different ways to parse HTML using WinBatch: http://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/tsleft.web+WinBatch/How~To+Parse~HTML.txt.

I would probably choose either the DOM Extender (http://techsupt.winbatch.com/webcgi/webbatch.exe?techsupt/tsleft.web+WIL~Extenders/_Third~Party~Extenders/DOM+DOM~Extender.txt) or the MSIE COM interface, using the GetElementsByTagName method (http://msdn.microsoft.com/en-us/library/ie/ms536439(v=vs.85).aspx) to grab the table data.
Deana F.
Technical Support
Wilson WindowWare Inc.

JTaylor

I do a HUGE amount of XML and HTML parsing, so I am familiar with and use a number of options, but since you use HTML processing as an example in the Help file for the BinaryTag stuff, it seems like a reasonable request. Plus, I am doing template processing, not parsing HTML from web pages I'm downloading from somewhere; they just happen to be HTML templates. Also, most of your functions that involve searching in some form, such as SearchIndex(), allow us to specify a direction, so this isn't without precedent in your design.

The inconsistency between the TagExtract and the TagReplace has been an issue before, but I can't remember if I mentioned that problem. That isn't an issue for just HTML.

Not that it necessarily applies to this issue, but the DOM extender is rapidly becoming non-useful since it won't handle Unicode. I know Stan asked the author about releasing the source code so that, hopefully, someone could update it. I believe they agreed but said it would take some work to remove some other stuff, and nothing ever came of that  :(

Jim