The code below is a simple test of returning names and values from text using a regex named-group pattern. The goal is to return a message like
Name: John Adams
Age: 204
Email: jqa@example.com
but the script, at this point, only returns the group names. I think I may have conflated the properties of the Match and Matches regex objects. Matches should return an array, but I was unable to parse that; Match returns groups (as does Matches), but I was unable to parse that either. The regex pattern is rigid, i.e. it works with John Adams but not John Quincey Adams (when parsed with PS or Python), but that is a different issue. For now, where did I mess up obtaining the group name values?
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
text = "John Adams, Age: 204, Email: jqa@example.com"
results = ""
pattern = '(?<Name>\w+\s\w+),\sAge:\s(?<Age>\d+),\sEmail:\s(?<Email>[^\s]+)'
regex(text,pattern,results)
Message("results",results)
Exit
:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)
;=====================================================
:udfs
#DefineSubRoutine geterror()
wberroradditionalinfo = wberrorarray[6]
lasterr = wberrorarray[0]
handlerline = wberrorarray[1]
textstring = wberrorarray[5]
linenumber = wberrorarray[8]
errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
Return(errmsg)
#EndSubRoutine
#DefineSubRoutine regex(text,pattern,results)
retval = 0
opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
oReg.CacheSize = ObjectType("ui2",30)
;couldn't go anywhere with .matches
;m = oReg.Matches(text) ;returns an array
;Message("Groups",m.Groups) ;will fail
m = oReg.Match(text) ;will work
if m<>0
Message("Groups Object",m.Groups) ;show object exists
names = oReg.GetGroupNames()
i=1 ;used to obtain value from group
foreach name in names
if name<>"0"
results := name:': <is>':@LF ;want <is> to be value
;not sure what works???
;results := m.Groups[i].Value ;tested/fails
;results := oReg.names[i].Value ;tested/fails
i += 1
endif
Next
else
results := "No Group Matches Found"
endif
oReg=0
Return(retval)
#EndSubRoutine
Return
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1
Quote from: td on April 25, 2025, 02:10:16 PM
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1
So use C# or PS.
I had to remember to use var.Item(index) instead of var[index]. I also had some help modifying the pattern to accept names with a variable number of elements. This works:
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
;choose one of the text lines below to check names of varying length
text = "John Adams, Age: 204, Email: jqa@example.com"
;text = "John Quincey Adams, Age: 204, Email: jqa@example.com"
;text = "The 4th Earl of Northumberland, Age: 19, Email: earl4@mykingdom.com"
results = ""
pattern = '^(?<Name>\w+\s\w+(?:\s\w+)*),\sAge:\s(?<Age>\d+),\sEmail:\s(?<Email>[^\s]+)'
regex(text,pattern,results)
Message("results",results)
Exit
:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)
;=====================================================
:udfs
#DefineSubRoutine geterror()
wberroradditionalinfo = wberrorarray[6]
lasterr = wberrorarray[0]
handlerline = wberrorarray[1]
textstring = wberrorarray[5]
linenumber = wberrorarray[8]
errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
Return(errmsg)
#EndSubRoutine
#DefineSubRoutine regex(text,pattern,results)
retval = 0
opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
oReg.CacheSize = ObjectType("ui2",30)
m = oReg.Match(text) ;will work
if m<>0
names = oReg.GetGroupNames()
i=1 ;used to obtain value from group
foreach name in names
if name<>"0"
results := name:': ':m.Groups.Item(i).Value:@LF
i += 1
endif
Next
else
results := "No Group Matches Found"
endif
oReg=0
Return(retval)
#EndSubRoutine
Return
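For comparison outside WinBatch, here is a minimal Python sketch of the same extended pattern run against the three sample texts from the script above. It is an illustration only, not part of the WIL solution: Python spells named groups `(?P<Name>...)` where .NET uses `(?<Name>...)`, and the `parse` helper and `SAMPLES` list are names invented for this example.

```python
import re

# Same pattern as the WIL script, with Python's (?P<...>) named-group syntax.
# The (?:\s\w+)* part lets Name hold two or more whitespace-separated words.
PATTERN = re.compile(
    r'^(?P<Name>\w+\s\w+(?:\s\w+)*),\sAge:\s(?P<Age>\d+),\sEmail:\s(?P<Email>[^\s]+)'
)

SAMPLES = [
    "John Adams, Age: 204, Email: jqa@example.com",
    "John Quincey Adams, Age: 204, Email: jqa@example.com",
    "The 4th Earl of Northumberland, Age: 19, Email: earl4@mykingdom.com",
]


def parse(text):
    """Return a dict of group name -> value, or None when the text does not match."""
    m = PATTERN.match(text)
    return m.groupdict() if m else None


for sample in SAMPLES:
    print(parse(sample))
```

All three samples should yield the full name, which is the behavior the `(?:\s\w+)*` addition was meant to give.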
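The udf was a bit sloppy. This is a little clearer:
#DefineSubRoutine regex(text,pattern,results)
   ;RegexOptions value 1 = IgnoreCase
   opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
   oReg.CacheSize = ObjectType("ui2",30)
   ;Match() always returns a Match object, so test its Success property
   matches = oReg.Match(text)
   if matches.Success
      names = oReg.GetGroupNames()
      foreach name in names
         ;skip the numbered groups ("0", "1", ...) and look up each named group by name
         if !IsInt(name) Then results := name:': ':matches.Groups.Item(name).Value:@LF
      Next
   else
      results := "No Group Matches Found"
   endif
   oReg=0
   Return(results)
#EndSubRoutine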
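The lookup-by-name idea in the udf above can be sketched in Python for comparison. This is a rough analogue under stated assumptions, not a translation of the WIL code: `named_group_report` is a hypothetical helper, `re.IGNORECASE` stands in for the RegexOptions value 1, and `groupindex` plays the role of GetGroupNames() with the numbered groups already filtered out.

```python
import re


def named_group_report(pattern, text):
    """Build 'name: value' lines for each named group, or a fallback message."""
    rx = re.compile(pattern, re.IGNORECASE)
    m = rx.search(text)
    if m is None:
        return "No Group Matches Found"
    # groupindex maps each group name to its number; unnamed groups are absent,
    # so no IsInt-style filter is needed. Sort by number to keep pattern order.
    names = sorted(rx.groupindex, key=rx.groupindex.get)
    # Look each value up by name rather than by a separately tracked index.
    return "".join(f"{name}: {m.group(name)}\n" for name in names)


pattern = r'^(?P<Name>\w+\s\w+(?:\s\w+)*),\sAge:\s(?P<Age>\d+),\sEmail:\s(?P<Email>[^\s]+)'
print(named_group_report(pattern, "John Adams, Age: 204, Email: jqa@example.com"))
```

Looking values up by name avoids keeping a counter in step with the order of the group names, which is what the earlier index-based version relied on.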
Quote from: spl on April 26, 2025, 02:59:28 AM
Quote from: td on April 25, 2025, 02:10:16 PM
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1
So use C# or PS.
The point was that your WIL script worked with the MSFT expression and input.
Quote from: td on April 28, 2025, 08:05:24 AM
The point was that your WIL script worked with the MSFT expression and input.
The point was that the initial script failed because it could not combine the group name with Groups.Value, something I figured out or should have known. And the C# reference code, while showing that my logic was correct, did not bring out the key points of iterating the results correctly.
I appreciate the code sample; I'd not figured out Groups with System.Text.RegularExpressions.Regex.
Thanks.
Quote from: kdmoyers on April 28, 2025, 11:51:55 AM
I appreciate the code sample; I'd not figured out Groups with System.Text.RegularExpressions.Regex.
Thanks.
Well, I appreciate that you appreciate it. I had actually been contacted by an old client I had written code for in 2009. Back then I was parsing CSR 'notes' from .mdb note fields using SQL 'like' or 'contains' clauses. He asked about similar text processing for plain-text notes rather than db fields. I suggested PS regex; he said not PS, but wondered about WB. Obviously .NET regex is superior to the older COM regex [in my opinion], so playing with multiple options in the CLR prompted some of my recent posts. My learning challenge has been the significance of '?' in patterns, in terms of 'greedy' matching, look-aheads, etc. I've been satisfied with what I have learned and will post more code, even if you are the only one to appreciate it.
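Since '?' does several different jobs in a pattern, a short Python sketch may help keep them apart. It is only an illustration with made-up sample strings; the constructs shown (lazy quantifier, optional token, look-ahead, non-capturing group) are written the same way in .NET regex, though this code has not been run against the CLR.

```python
import re

text = "<a><b>"

# Greedy: .+ takes as much as it can, so the match runs to the last '>'.
greedy = re.search(r"<.+>", text).group()
# Lazy: a '?' after the quantifier makes it stop at the first '>'.
lazy = re.search(r"<.+?>", text).group()

# Optional: a bare '?' after a token means zero or one occurrence.
optional = re.findall(r"colou?r", "color colour")

# Look-ahead: (?=...) asserts what follows without consuming it.
lookahead = re.findall(r"\w+(?=@)", "jqa@example.com earl4@mykingdom.com")

# Non-capturing group: (?:...) groups tokens without creating a numbered group,
# so only the last word is captured here.
last_word = re.match(r"(?:\w+\s)+(\w+)", "John Quincey Adams").groups()

print(greedy, lazy, optional, lookahead, last_word)
```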
Quote from: spl on April 28, 2025, 10:41:34 AM
Quote from: td on April 28, 2025, 08:05:24 AM
The point was that your WIL script worked with the MSFT expression and input.
The point was that the initial script failed because it could not combine the group name with Groups.Value, something I figured out or should have known. And the C# reference code, while showing that my logic was correct, did not bring out the key points of iterating the results correctly.
I see your point.