WinBatch® Technical Support Forum

All Things WinBatch => WinBatch => Topic started by: spl on April 25, 2025, 10:25:01 AM

Title: Regex Group Values
Post by: spl on April 25, 2025, 10:25:01 AM
The code below is a simple test of returning names/values from text based on a regex 'group' pattern. The goal is to return a message like
Name: John Adams
Age: 204
Email: jqa@example.com

but the script, at this point, only returns the group names. I think I may have conflated the properties of the match and matches regex objects. Matches should return an array, but I was unable to parse that; match returns groups (as does matches), but I was unable to parse that either. The regex pattern is specific, i.e. it works with John Adams but not John Quincey Adams (when parsed with PS or Python)... but that is a different issue. For now, where did I mess up obtaining the group values?
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
text = "John Adams, Age: 204, Email: jqa@example.com"
results = ""
pattern = '(?<Name>\w+\s\w+),\sAge:\s(?<Age>\d+),\sEmail:\s(?<Email>[^\s]+)'
regex(text,pattern,results)
Message("results",results)
Exit

:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)

;=====================================================
:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine

#DefineSubRoutine regex(text,pattern,results)
   retval = 0
   opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
   oReg.CacheSize = ObjectType("ui2",30)
   ;couldn't go anywhere with .matches
   ;m = oReg.Matches(text) ;returns an array
   ;Message("Groups",m.Groups)  ;will fail
   m = oReg.Match(text)  ;will work
   if m<>0
      Message("Groups Object",m.Groups) ;show object exists
      names = oReg.GetGroupNames()
      i=1  ;used to obtain value from group
      foreach name in names
         if name<>"0"
            results := name:': <is>':@LF ;want <is> to be value
            ;not sure what works???
            ;results := m.Groups[i].Value ;tested/fails
            ;results := oReg.names[i].Value ;tested/fails
            i += 1
         endif
      Next
   else
      results := "No Group Matches Found"
   endif
   oReg=0
   Return(retval)
#EndSubRoutine

Return
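For reference, in .NET Regex.Match returns a single Match, while Regex.Matches returns a MatchCollection whose items are Match objects. Groups belongs to each Match, not to the collection, which is why m.Groups fails on the result of Matches. The same split can be sketched in Python rather than WIL, with re.search standing in for Match and re.finditer for Matches. Python spells named groups (?P<Name>...) instead of (?<Name>...), and the two-record input below is made up purely for illustration.

```python
import re

# Python spelling of the pattern from the post: (?P<Name>...) instead of (?<Name>...)
PATTERN = re.compile(
    r"(?P<Name>\w+\s\w+),\sAge:\s(?P<Age>\d+),\sEmail:\s(?P<Email>[^\s]+)"
)

# Hypothetical two-record input, only to show the multi-match case
TEXT = (
    "John Adams, Age: 204, Email: jqa@example.com\n"
    "Jane Doe, Age: 30, Email: jd@example.com"
)

# One match (the analogue of Regex.Match): groups hang off the match object
first = PATTERN.search(TEXT)
first_groups = first.groupdict() if first else {}

# All matches (the analogue of Regex.Matches): iterate first,
# then read the groups of each individual match
all_groups = [m.groupdict() for m in PATTERN.finditer(TEXT)]

if __name__ == "__main__":
    print(first_groups)
    for record in all_groups:
        print(record)
```

The point of the sketch is only the shape of the objects: a collection of matches has to be iterated before any group value can be read.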
Title: Re: Regex Group Values
Post by: td on April 25, 2025, 02:10:16 PM
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.

https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1
Title: Re: Regex Group Values
Post by: spl on April 26, 2025, 02:59:28 AM
Quote from: td on April 25, 2025, 02:10:16 PM
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1

So use C# or PS.
Title: Re: Regex Group Values
Post by: spl on April 26, 2025, 05:13:14 AM
I had to remember to use var.Item(index) instead of var[index]. I also had some help modifying the pattern to accept names with a variable number of elements. This works:
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
;choose one of the text lines below to test names of varying length
text = "John Adams, Age: 204, Email: jqa@example.com"
;text = "John Quincey Adams, Age: 204, Email: jqa@example.com"
;text = "The 4th Earl of Northumberland, Age: 19, Email: earl4@mykingdom.com"
results = ""
pattern = '^(?<Name>\w+\s\w+(?:\s\w+)*),\sAge:\s(?<Age>\d+),\sEmail:\s(?<Email>[^\s]+)'
regex(text,pattern,results)
Message("results",results)
Exit

:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)

;=====================================================
:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine

#DefineSubRoutine regex(text,pattern,results)
   retval = 0
   opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
   oReg.CacheSize = ObjectType("ui2",30)
   m = oReg.Match(text)  ;Match always returns an object, so test Success
   if m.Success
      names = oReg.GetGroupNames()
      i=1  ;used to obtain value from group
      foreach name in names
         if name<>"0"
            results := name:': ':m.Groups.Item(i).Value:@LF
            i += 1
         endif
      Next
   else
      results := "No Group Matches Found"
   endif
   oReg=0
   Return(retval)
#EndSubRoutine

Return
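To see what the added (?:\s\w+)* and the leading ^ change, here is a small Python sketch (not WIL) that runs the original and the revised pattern against the three sample lines from the script. Python writes named groups as (?P<Name>...), and extract_name is a hypothetical helper introduced only for this comparison. In Python, the unanchored original still matches the longer names but captures only the last two words before the comma.

```python
import re

# Python spellings of the two patterns from the thread
ORIGINAL = r"(?P<Name>\w+\s\w+),\sAge:\s(?P<Age>\d+),\sEmail:\s(?P<Email>[^\s]+)"
REVISED = (
    r"^(?P<Name>\w+\s\w+(?:\s\w+)*),\sAge:\s(?P<Age>\d+),"
    r"\sEmail:\s(?P<Email>[^\s]+)"
)

# The three sample lines from the script
SAMPLES = [
    "John Adams, Age: 204, Email: jqa@example.com",
    "John Quincey Adams, Age: 204, Email: jqa@example.com",
    "The 4th Earl of Northumberland, Age: 19, Email: earl4@mykingdom.com",
]


def extract_name(pattern, text):
    """Hypothetical helper: return the Name group, or None if nothing matches."""
    m = re.search(pattern, text, re.IGNORECASE)
    return m.group("Name") if m else None


if __name__ == "__main__":
    for line in SAMPLES:
        print("original:", extract_name(ORIGINAL, line))
        print("revised: ", extract_name(REVISED, line))
```

The non-capturing group lets the name grow by any number of extra words, and the anchor stops the match from starting partway through the name.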
Title: Re: Regex Group Values
Post by: spl on April 28, 2025, 04:45:34 AM
The UDF was a bit sloppy. This is a little clearer:
#DefineSubRoutine regex(text,pattern,results)
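   opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1) ;1 = IgnoreCase
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
   oReg.CacheSize = ObjectType("ui2",30)
   matches = oReg.Match(text)
   if matches.Success ;Match always returns an object, so test Success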
      names = oReg.GetGroupNames()
      foreach name in names
         if !IsInt(name) Then results := name:': ':matches.Groups.Item(name).Value:@LF
      Next
   else
      results := "No Group Matches Found"
   endif
   oReg=0
   Return(results)
#EndSubRoutine
Title: Re: Regex Group Values
Post by: td on April 28, 2025, 08:05:24 AM
Quote from: spl on April 26, 2025, 02:59:28 AM
Quote from: td on April 25, 2025, 02:10:16 PM
If I use the expression and text in the example on the MSFT site, I get the same results MSFT does.
https://learn.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.getgroupnames?view=netframework-4.8.1

So use C# or PS.

The point was that your WIL script worked with the MSFT expression and input.
Title: Re: Regex Group Values
Post by: spl on April 28, 2025, 10:41:34 AM
Quote from: td on April 28, 2025, 08:05:24 AM
The point was that your WIL script worked with the MSFT expression and input.

The point was the initial script failed because it could not combine groupname with groups.value - something I figured out or should have known. And referencing C# code, while illustrating I was using correct logic, failed to distinguish key points in iterating results correctly.
Title: Re: Regex Group Values
Post by: kdmoyers on April 28, 2025, 11:51:55 AM
I appreciate the code sample; I'd not figured out Groups with System.Text.RegularExpressions.Regex.
Thanks.
Title: Re: Regex Group Values
Post by: spl on April 28, 2025, 12:41:35 PM
Quote from: kdmoyers on April 28, 2025, 11:51:55 AM
I appreciate the code sample, I'd not figured out Groups with System.Text.RegularExpressions.Regex
Thanks.

Well, I appreciate that you appreciate. I had actually been contacted by an old client I had written code for in 2009. Back then I was parsing CSR 'notes' from .mdb note fields using SQL 'like' or 'contains' clauses. He asked about similar text processing from text notes, not db fields. I suggested PS regex and he said not PS, but wondered about WB. Obviously .NET regex is superior to the older COM regex [in my opinion], so playing with multiple options with the CLR prompted some of my recent posts. My learning challenge has been the significance of '?' in patterns, in terms of 'greedy' matching, look-aheads, etc. I've been satisfied with what I have learned and will post more code, even if you are the only one to appreciate it.
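Since '?' plays several unrelated roles in a pattern, a short side-by-side may help. This is a Python sketch, not WIL; the constructs shown here (optional item, lazy quantifier, look-ahead, non-capturing group) are written the same way in .NET, except that Python spells named groups (?P<Age>...) where .NET uses (?<Age>...). The variable names are made up for illustration.

```python
import re

text = "Age: 204, Email: jqa@example.com"

# Greedy quantifier: \d+ takes as many digits as it can
greedy = re.search(r"\d+", text).group()

# Lazy quantifier: the trailing ? makes \d+ take as few digits as it can
lazy = re.search(r"\d+?", text).group()

# Optional item: ? after a single token means "zero or one of these"
optional = re.findall(r"colou?r", "color colour")

# Look-ahead: (?=@) requires an @ next but does not consume it
lookahead = re.search(r"\w+(?=@)", text).group()

# Non-capturing group (?:...) groups without creating a numbered group;
# the named group is the only capture
named = re.search(r"(?:Age|Years):\s(?P<Age>\d+)", text)
captures = named.groups()

if __name__ == "__main__":
    print(greedy, lazy, optional, lookahead, captures)
```

Here greedy is "204" and lazy is "2", the look-ahead returns "jqa" without the @, and captures holds a single value because (?:...) adds no group of its own.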
Title: Re: Regex Group Values
Post by: td on April 28, 2025, 01:12:19 PM
Quote from: spl on April 28, 2025, 10:41:34 AM
Quote from: td on April 28, 2025, 08:05:24 AM
The point was that your WIL script worked with the MSFT expression and input.

The point was the initial script failed because it could not combine groupname with groups.value - something I figured out or should have known. And referencing C# code, while illustrating I was using correct logic, failed to distinguish key points in iterating results correctly.

I see your point.