Below is the current version of an XML parser for files whose structure is not known in advance. I attached a zip with several small XML files I used in testing, as well as a screenshot of the output for the bookstore XML file. The script focuses on parsing attributes and innerText but also includes a map of nodeTypes. The CSV output includes a parent-node column, which is useful for grouping data such as the bookstore file into an Excel table. However, if you parse the included Inventory XML, a similar transfer can be problematic: although the XML is well-formed, linking the privilege nodes to the name attribute is consistent in the output but not intuitive. It is also worth noting that the script does a fair job with .xsd schema files.
;Winbatch 2025A - iterate/parse xml nodelist
;now uses loop based on XDoc.SelectNodes("//*")
;Stan Littlefield (who to blame)
;3/21/2025 [updated]
;
;Purpose: extract attributes/node innerText from xml
; files w/out prior knowledge of structure.
;=====================================================
gosub udfs
IntControl(73,1,0,0,0)
;construct map for possible node type description
nodeTypes = $"0=None
1=ELEMENT_NODE
2=ATTRIBUTE_NODE
3=TEXT_NODE
4=CDATA_SECTION_NODE
5=ENTITY_REFERENCE_NODE
6=ENTITY_NODE
7=PROCESSING_INSTRUCTION_NODE
8=COMMENT_NODE
9=DOCUMENT_NODE
10=DOCUMENT_TYPE_NODE
11=DOCUMENT_FRAGMENT_NODE
12=NOTATION_NODE
13=WHITESPACE
14=SIGNIFICANTWHITESPACE
15=ENDELEMENT
16=ENDENTITY
17=XMLDECLARATION$"
nodeTypes= MapCreate(nodeTypes,'=',@lf)
;select file to process
types="XML Files|*.xml;*.xsd"
file=AskFilename("Select XML", dirscript(), types, "", 101)
if !fileexist(file) then Terminate(@TRUE, "Exiting", "File Not Found:":file)
;create DomDocument Object
XDoc = CreateObject("Msxml2.DOMDocument.6.0") ;or just Msxml2.DOMDocument
XDoc.async = @False
XDoc.validateOnParse = @False
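;Load returns false when the file cannot be parsed
if !XDoc.Load(file) then Terminate(@TRUE, "Exiting", "Unable to parse ":file:@LF:XDoc.parseError.reason)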
;initialize output variable - comma separated with 4 columns
output="Parent,Item,Value,Type":@CRLF
;check if file begins with xml declaration
;use the nodeTypes map to include the description of nodes
dtype = XDoc.ChildNodes.item(0)
;compare the numeric NodeType (1 = ELEMENT_NODE); the map holds only descriptions
if dtype.NodeType <> 1
   ;======== uncomment only if needed
   ;output := dtype.BaseName:",":"null":",":nodeTypes[dtype.NodeType]:",":"Node":@CRLF
   parent = dtype.parentNode.nodeName
   output := Get_Attributes(dtype,parent)
endif
;select/process all nodes from root node
nodes = XDoc.SelectNodes("//*")
for i=0 to nodes.length -1
   basename = nodes.item(i).BaseName
   ;======== uncomment only if needed
   ;output := basename:",":"null":",":nodeTypes[nodes.item(i).NodeType]:",":"Node":@CRLF
   parent = nodes.item(i).parentNode.nodeName
   parse = parent:",":basename:",":nodes.item(i).Text:",":"Text"
   ;skip nodes with empty text and nodes with more than one child
   if ! (strindex(parse,",,",0,@fwdscan) || nodes.item(i).ChildNodes.Length >1)
      output := parse:@CRLF
   endif
   output := Get_Attributes(nodes.item(i),parent)
Next
Message(file,output)
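;======== uncomment only if needed
;Optional sketch, not part of the original script: save the text shown by
;Message() so it can be opened in Excel. csvfile is an assumed name (same
;folder and root name as the source file); an existing file of that name
;would be replaced.
;csvfile = FilePath(file):FileRoot(file):".csv"
;FilePut(csvfile, output)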
Exit
:WBERRORHANDLER
XDoc = 0
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)
;=====================================================
:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine
#DefineFunction Get_Attributes(node,parent)
   IntControl(73,1,0,0,0)
   retval = ""
   atts = node.attributes
   length = atts.Length
   If length > 0
      For i=0 to length-1
         att = atts.Item(i)
         name = att.Name
         value = att.Value
         parse = parent:",":name:",":value:",Attribute":@CRLF
         retval := parse
      Next
   EndIf
   Return retval
   :WBERRORHANDLER
   XDoc = 0
   geterror()
   Terminate(@TRUE,"Error Encountered",errmsg)
#EndFunction
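;-----------------------------------------------------
;Optional sketch, not part of the original script. The output is built by
;joining values with commas, so a node text or attribute value that itself
;contains a comma, a double quote or a line break would shift the columns
;in Excel. Csv_Quote is a hypothetical helper that applies the usual CSV
;convention: wrap such a field in double quotes and double any embedded
;quotes. Other fields are returned unchanged.
;Possible use, e.g. in Get_Attributes:
;   parse = parent:",":name:",":Csv_Quote(value):",Attribute":@CRLF
#DefineFunction Csv_Quote(field)
   q = '"'
   needs = StrIndex(field,",",0,@FWDSCAN) || StrIndex(field,q,0,@FWDSCAN) || StrIndex(field,@LF,0,@FWDSCAN) || StrIndex(field,@CR,0,@FWDSCAN)
   If needs then field = q:StrReplace(field,q,q:q):q
   Return field
#EndFunction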
Return
;=====================================================