More fun with Regex: matches

Started by spl, April 29, 2025, 06:36:57 AM


spl

A previous post on Regex Named Groups illustrated a basic parsing UDF for patterns that contain named groups. The code below takes a more general approach: it returns every match of a pattern found in a block of text. It uses patterns for email addresses, phone numbers, and IP addresses. These often need special attention because characters that appear literally in the text (such as . and the parentheses) have a special meaning in a pattern, which is the whole basis of escape chars. Of special note is the phone pattern, which picks up numbers in 2 formats: (123) 456-7890 and 123-456-7891.
;Winbatch 2025A - Regex Multiple Match test
;Stan Littlefield, 4/29/2025
;========================================================================
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
;note the need for %% to escape WB's %
email = "\b[A-Za-z0-9._%%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
phone = "\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}"
ip = "\b\d{1,3}(\.\d{1,3}){3}\b"
;==== working on a collection of common patterns
;==== maybe in data table; fabricated rs; map 
text = $"
Welcome to our group,for your convenience use the following.
Email us at support@nicepeople.com or sales@nicepeople.com;
Phone us at (123) 456-7890 or 123-456-7891;
and when logged in use this IP 192.168.0.1. and remember to be nice 
because we are not meanpeople.com.
$"
Message("Text to Search",text)
results = "Email(s):":@LF
regex(text,email,results)
results := "Phone(s):":@LF
regex(text,phone,results)
results := "IP address(es):":@LF
regex(text,ip,results)
Message("Search results",results)
Exit
;========================================================================

:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)

;========================================================================
:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine

#DefineSubRoutine regex(text,pattern,results)
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern,opts)
   oReg.CacheSize = ObjectType("ui2",30)
   matches = oReg.Matches(text,pattern,opts) 
   foreach match in matches
      results := match.Value:@LF 
   Next
   oReg=0
   Return(results)
#EndSubRoutine

Return
;========================================================================
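
As a quick cross-check of what the three patterns pick up from the sample text, here is a minimal sketch in Python rather than WinBatch. It assumes Python's re module treats these patterns the same way the .NET engine does for this input. The helper name find_all is hypothetical, and the IP group is made non-capturing only to keep the pattern tidy. Note that Python needs a single % where WinBatch needs %%.

```python
import re

# Patterns from the post, written for Python's re module.
# WinBatch needs %% for a literal percent sign; Python needs only one %.
# The TLD class is [A-Za-z]; a | inside a character class would be a literal pipe.
EMAIL = r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
PHONE = r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}"
IP = r"\b\d{1,3}(?:\.\d{1,3}){3}\b"

TEXT = """
Welcome to our group,for your convenience use the following.
Email us at support@nicepeople.com or sales@nicepeople.com;
Phone us at (123) 456-7890 or 123-456-7891;
and when logged in use this IP 192.168.0.1. and remember to be nice
because we are not meanpeople.com.
"""


def find_all(pattern, text):
    """Return every non-overlapping match of pattern in text, ignoring case."""
    return [m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE)]


emails = find_all(EMAIL, TEXT)
phones = find_all(PHONE, TEXT)
ips = find_all(IP, TEXT)

print("Email(s):", emails)
print("Phone(s):", phones)
print("IP address(es):", ips)
```

The phone pattern returns both formats because the parentheses and the separator are each optional. The bare domain meanpeople.com is not reported as an email because the pattern requires an @.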
Stan - formerly stanl [ex-Pundit]

spl

This is a little cleaner.
;Winbatch 2025A - Regex Multiple Match test
;Stan Littlefield, 4/29/2025
;revised: 4/30/2025
;========================================================================
IntControl(73,1,0,0,0)
gosub udfs
ObjectClrOption( 'useany', 'System')
opts = ObjectClrType('System.Text.RegularExpressions.RegexOptions',1)
;note the need for %% to escape WB's %
email = "\b[A-Za-z0-9._%%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"
phone = "\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}"
ip = "\b\d{1,3}(\.\d{1,3}){3}\b"
;==== working on a collection of common patterns
;==== maybe in data table; fabricated rs; map 
text = $"
Welcome to our group,for your convenience use the following.
Email us at support@nicepeople.com or sales@nicepeople.com;
Phone us at (123) 456-7890 or 123-456-7891;
and when logged in use this IP 192.168.0.1. and remember to be nice 
because we are not meanpeople.com.
$"
Message("Text to Search",text)
results = "Email(s):":@LF
regex(email)
results := "Phone(s):":@LF
regex(phone)
results := "IP address(es):":@LF
regex(ip)
Message("Search results",results)
Exit
;========================================================================

:WBERRORHANDLER
geterror()
Terminate(@TRUE,"Error Encountered",errmsg)

;========================================================================
:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine

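#DefineSubRoutine regex(pattern)
   ;a subroutine shares the caller's variables, so text, opts and results come from the main script
   oReg = ObjectClrNew('System.Text.RegularExpressions.Regex')
   oReg.CacheSize = ObjectType("ui2",30)
   ;collect every match of pattern in text
   matches = oReg.Matches(text,pattern,opts)
   if matches.Count>0
      ;append each matched string on its own line
      foreach match in matches
         results := match.Value:@LF
      Next
   else
      results := "No Matches":@LF
   endif
   oReg=0
   Return(results)
#EndSubRoutine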

Return
;========================================================================
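
The main change in the revised routine is the "No Matches" fallback when a pattern finds nothing. The same idea can be sketched in Python as a rough illustration, not a translation of the WinBatch code. The function name report and the short sample string are hypothetical.

```python
import re


def report(label, pattern, text):
    """Build one labelled section: each match on its own line, or 'No Matches'."""
    found = [m.group(0) for m in re.finditer(pattern, text, re.IGNORECASE)]
    lines = found if found else ["No Matches"]
    return label + "\n" + "\n".join(lines) + "\n"


# Hypothetical sample containing phone numbers but no IP address.
text = "Phone us at (123) 456-7890 or 123-456-7891; no addresses here."
phone = r"\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}"
ip = r"\b\d{1,3}(?:\.\d{1,3}){3}\b"

results = report("Phone(s):", phone, text) + report("IP address(es):", ip, text)
print(results)
```

Returning the section as a string, rather than appending to a shared variable, keeps the helper free of side effects. That is a design choice of this sketch only.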
