wbOmnibus Extender - RegEx

Started by JTaylor, May 26, 2024, 07:46:03 AM

JTaylor

I have added an snRegEx() function to the Extender.  Let me know if you find any problems.  I know very little about RegEx and used it more while writing this function than in all my previous work combined.

I have also split the Extenders out into their own individual DLLs.  The combined one is still there, but it will probably not see much new functionality (I did add snRegEx() to it).  It was becoming extremely difficult to add new functionality due to conflicts, and the immense number of functions and constants was starting to become a problem as well.  There are benefits to both options, but I didn't feel like I had much choice on this one.

http://www.jtdata.com/anonymous/wbOmnibus.zip

Jim

cssyphus

Thanks Jim - I'll be looking at this with great interest. Perfect timing for a challenging project.

JD

JTaylor

I will be very interested in any feedback or suggestions you have to offer.

Jim

spl

Jim, not sure if this is a suggestion or a tangent. Your function is solely for 'replace'. One interest I have with regex is validating various date formats, i.e. mm-dd-yyyy, mm/dd/yyyy, dd-mm-yyyy... The snippet below is pretty useless, but given a regex source [could be a list, an array, a map] specific to dates [but could be anything] and an input source, could a regex function be created to validate the input against the regex source? I'm thinking of a switch, select, or case schema. As you can see from the snippet, the oReg .NET object is tied to the single expression held in pattern, although every date in the list is valid against one of the patterns. It could be modified to work, but only with very ugly code of little practical value outside of the snippet.
; Candidate date patterns. Only "pattern" is actually tested below.
pattern = '^\d{4}-\d{2}-\d{2}'
pattern1 = '^(3[01]|[12][0-9]|0?[1-9])(\/|-)(1[0-2]|0?[1-9])\2([0-9]{2})?[0-9]{2}'
pattern2 = '(0[1-9]|1[0-2])(\/|-)(0[1-9]|[12][0-9]|3[01])(\/|-)(19|20)\d{2}'
pattern3 = '(0[1-9]|[12][0-9]|3[01])(\/|-)(0[1-9]|1[0-2])(\/|-)(19|20)\d{2}'
pattern4 = '^\d{1}/\d{2}/\d{4}'
pattern5 = '^\d{2}\.\d{2}\.\d{4}'
pattern6 = '^(January|February|March|April|May|June|July|August|September|October|November|December)\s\d{1,2},\s\d{4}'

ObjectClrOption( 'useany', 'System')
oReg = ObjectClrNew('System.Text.RegularExpressions.Regex',pattern)
oReg.CacheSize = ObjectType("ui2",30)

; "|" is used as the list delimiter because the last date contains a comma.
dateStrings = '2024-05-24|5/24/2024|05/24/2024|05-24-2024|24/5/2024|05.24.2004|May 24, 2024'

; Test every item in the list, including the last one.
For i = 1 to ItemCount(dateStrings,"|")
   dt = ItemExtract(i,dateStrings,"|")
   m = oReg.IsMatch( dt )
   If m==0
      Display(2,"%dt%","DOES NOT MATCH ":pattern)
   Else
      Display(2,"%dt%","MATCHES ":pattern)
   Endif
Next

oReg = 0
Exit
Stan - formerly stanl [ex-Pundit]
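
A side note on validating with patterns like the ones above: a pattern that starts with ^ but has no closing $ only checks the beginning of the string, so trailing text still passes. The following is a minimal Python sketch (Python's re module rather than WinBatch or .NET, chosen only so it can be run anywhere) showing the difference between a prefix match and a whole-string match. The function names are made up for illustration.

```python
import re

# One of the date shapes from the post, written without anchors.
ISO_DATE = r"\d{4}-\d{2}-\d{2}"

def is_prefix_match(text, pattern):
    """True if the pattern matches at the start of text (like a leading ^)."""
    return re.match(pattern, text) is not None

def is_full_match(text, pattern):
    """True only if the pattern consumes the entire string."""
    return re.fullmatch(pattern, text) is not None

print(is_prefix_match("2024-05-24", ISO_DATE))            # True
print(is_prefix_match("2024-05-24 and more", ISO_DATE))   # True
print(is_full_match("2024-05-24 and more", ISO_DATE))     # False
```

For validation, the whole-string form (or an explicit $ at the end of each pattern) is usually what is wanted.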

JTaylor

I couldn't really think of anything to add apart from the Find and Replace options, other than Count.   Anything I could think of could be accomplished with those two functions.  Of course, I don't know much about RegEx so could be missing something obvious.

With regard to your question, assuming I understand it: do you want to submit the entire list of dates along with all the expression options, and then check every date against every expression?  Currently, you could easily loop through and check them.  If it found a match it would return that date and if not, it would return a blank.  It would look very much like your code (I didn't test):

; Assumes dateStrings is a "|"-delimited list of dates and
; patternStrings is a @CR-delimited list of expressions.
For i = 1 to ItemCount(dateStrings,"|")
  dt = ItemExtract(i,dateStrings,"|")
  For j = 1 to ItemCount(patternStrings,@CR)
    ex = ItemExtract(j,patternStrings,@CR)
    m = snRegEx(dt,ex)
    If m == ""
       Display(2,"%dt%","DOES NOT MATCH ":ex)
    Else
       Display(2,"%dt%","MATCHES ":ex)
    Endif
  Next
Next

Am I following or missing the point? 

Jim

JTaylor

Finished it and Tested it:


; Build a @CR-delimited list of expressions (several of them contain commas).
ps = '^\d{4}-\d{2}-\d{2}':@CR
ps = ps:'^(3[01]|[12][0-9]|0?[1-9])(\/|-)(1[0-2]|0?[1-9])\2([0-9]{2})?[0-9]{2}':@CR
ps = ps:'(0[1-9]|1[0-2])(\/|-)(0[1-9]|[12][0-9]|3[01])(\/|-)(19|20)\d{2}':@CR
ps = ps:'(0[1-9]|[12][0-9]|3[01])(\/|-)(0[1-9]|1[0-2])(\/|-)(19|20)\d{2}':@CR
ps = ps:'^\d{1}/\d{2}/\d{4}':@CR
ps = ps:'^\d{2}\.\d{2}\.\d{4}':@CR
ps = ps:'^(January|February|March|April|May|June|July|August|September|October|November|December)\s\d{1,2},\s\d{4}'

; "|" is used as the list delimiter because the last date contains a comma.
dateStrings = '2024-05-24|5/24/2024|05/24/2024|05-24-2024|24/5/2024|05.24.2004|May 24, 2024'

nm = "DID NOT MATCH: ":@LF
dm = "DID MATCH: ":@LF
; Check every date (D) against every pattern (P).
For i = 1 to ItemCount(dateStrings,"|")
  dt = ItemExtract(i,dateStrings,"|")
  For j = 1 to ItemCount(ps,@CR)
    ex = ItemExtract(j,ps,@CR)
    m = snRegEx(dt,ex)
    If m == ""
       nm = nm:"P%j% - D%i% - ":dt:@LF
    Else
       dm = dm:"P%j% - D%i% - ":dt:@LF
    Endif
  Next
Next
Message("DM",dm)
Message("NM",nm)

spl

I gave the documentation for the regex function only a brief look, but it appears that snRegEx(dt,ex) is equivalent to snRegEx(dt,ex,"F"), i.e. a basic match, so the function is really cool. I have to work a little more on the regex for dates: change the "month day, year" pattern to accept both full and abbreviated month names, and include datetime stamps, GMT datetimes...

Then set up the regex as either JSON or a map, and hopefully come up with a switch construct with a final default of "Not a Date" to avoid looping through the regex as a list. Your function removes the need to play with .NET or ObjectCreate("VBScript.RegExp"). The goal would be to validate a date string candidate from a SQL query, a CSV file, a REST query, etc.

and not just dates... more a regex lookup library for IP addresses, email address, phone # [ad infinitum]
Stan - formerly stanl [ex-Pundit]
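
The lookup-table idea described above could be sketched roughly as follows. This is Python, not WinBatch, and it is only an illustration under assumptions: the labels, the PATTERNS table, and the classify() helper are hypothetical and not part of the Extender. A table still gets scanned internally, but the caller sees a single switch-like call with a default.

```python
import re

# Hypothetical lookup table: label -> compiled pattern.
# Entries are tried in insertion order and the first hit wins.
PATTERNS = {
    "ISO date (yyyy-mm-dd)": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "US date (mm/dd/yyyy)": re.compile(
        r"(0?[1-9]|1[0-2])/(0?[1-9]|[12]\d|3[01])/\d{4}"
    ),
    "Long date (Month d, yyyy)": re.compile(
        r"(January|February|March|April|May|June|July|August|"
        r"September|October|November|December)\s\d{1,2},\s\d{4}"
    ),
}

def classify(text, patterns=PATTERNS, default="Not a Date"):
    """Return the label of the first pattern matching the whole string."""
    for label, regex in patterns.items():
        if regex.fullmatch(text):
            return label
    return default

for candidate in ["2024-05-24", "5/24/2024", "May 24, 2024", "hello"]:
    print(candidate, "->", classify(candidate))
```

The same shape would extend to other kinds of input (IP addresses, email addresses, phone numbers) by adding entries to the table or by passing a different table and default.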

JTaylor

Correct.  I made "Find" the default action.

To expand completely it would be equivalent to

        snRegEx(dt,ex,"F","",@TAB,-1)

Basic Match/Find, returning a @TAB delimited list of all Matches.

Jim
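
For readers unfamiliar with this kind of Find, a rough Python analogy of "return every match as a tab-delimited list" is shown below. It uses Python's re module and a made-up helper name. It is not how snRegEx() is implemented, and it does not model the other arguments in the call above, whose meanings are not spelled out in this thread.

```python
import re

def find_all_tab_delimited(text, pattern):
    """Return every non-overlapping match joined by tabs, or "" if none."""
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    return "\t".join(matches)

result = find_all_tab_delimited(
    "Due 2024-05-24, shipped 2024-06-01", r"\d{4}-\d{2}-\d{2}"
)
print(result.split("\t"))   # ['2024-05-24', '2024-06-01']

# No match gives an empty string, mirroring the "blank" result described earlier.
print(repr(find_all_tab_delimited("no dates here", r"\d{4}-\d{2}-\d{2}")))   # ''
```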