
Author Topic: Regex For Email Validation  (Read 117 times)

stanl

  • Pundit
  • *****
  • Posts: 668
Regex For Email Validation
« on: November 06, 2017, 10:39:50 am »
Attached is a test that uses the .NET/CLR regex engine with a pattern I grabbed from C# code. I have read a few comments saying that regex is not really the way to go for email address validation. I am going to be looking at a variety of regex patterns to extract emails, phone numbers, and sales IDs from user notes (queried into memo fields). I would appreciate anyone testing the pattern and posting any valid email address that the script calls invalid.
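
For the extraction side (pulling addresses out of free-text notes rather than validating one field), a rough Python sketch might look like the following. The pattern and the name extract_emails are assumptions made for illustration; this is not the pattern from the attached script, and it is deliberately loose.

```python
import re
from typing import List

# Illustrative extraction pattern (an assumption, not the attached script's pattern).
# Local part: letters, digits and a few common symbols, including "+".
# Domain: dot-separated labels ending in an alphabetic TLD of 2+ letters.
EMAIL_IN_TEXT = re.compile(
    r"[A-Za-z0-9._%+-]+@(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}"
)

def extract_emails(note: str) -> List[str]:
    """Return every substring of `note` that looks like an email address."""
    return EMAIL_IN_TEXT.findall(note)

note = "Called fred+sales@company.com, cc: jane.doe@mail.example.org. No reply yet."
print(extract_emails(note))
# ['fred+sales@company.com', 'jane.doe@mail.example.org']
```

An unanchored pattern like this is meant for finding candidates in a memo field; anything it returns would still need a stricter check before being trusted.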

kdmoyers

  • Full Member
  • ***
  • Posts: 183
Re: Regex For Email Validation
« Reply #1 on: November 09, 2017, 08:52:28 am »
Plus signs are valid in an email address, for example:

fred+sales@company.com

The suggested regex flags them as bad.
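
One way to hunt for this kind of false rejection is to run a list of known-valid addresses through the candidate pattern and print the ones it refuses. A minimal Python sketch follows; CANDIDATE is a hypothetical narrow pattern made up for the example (it is not the pattern from the attached script), and false_rejections is an assumed helper name.

```python
import re

# Hypothetical narrow pattern for illustration only; it is NOT the pattern
# from the attached script. Its local part omits "+", so plus-addressing fails.
CANDIDATE = re.compile(r"[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Addresses that are valid and should be accepted.
KNOWN_VALID = [
    "fred@company.com",
    "fred+sales@company.com",
    "first.last@mail.example.org",
]

def false_rejections(pattern, addresses):
    """Return the valid addresses that `pattern` fails to match in full."""
    return [a for a in addresses if pattern.fullmatch(a) is None]

print(false_rejections(CANDIDATE, KNOWN_VALID))
# ['fred+sales@company.com']
```

Swapping in the real pattern and a longer list of valid addresses would give a quick regression check whenever the pattern is changed.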
The mind is everything; What you think, you become.

stanl

  • Pundit
  • *****
  • Posts: 668
Re: Regex For Email Validation
« Reply #2 on: November 10, 2017, 06:06:19 am »
Thanks Kirby.

chrislegarth

  • Newbie
  • *
  • Posts: 19
Re: Regex For Email Validation
« Reply #3 on: November 14, 2017, 10:46:37 am »
Here is a pattern that I found on the Internet and use.

pattern = `^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%%&'\*\+/=\?\^`: '`\{\}\|~\w])*)(?<=[0-9a-z])@))(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$'

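It's concatenated because the pattern contains the ` character: the first part is delimited with backticks, the second part with single quotes, and the two are joined with the : operator.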

JTaylor

  • Pundit
  • *****
  • Posts: 806
    • Data & Stuff Inc.
Re: Regex For Email Validation
« Reply #4 on: November 14, 2017, 11:21:39 am »
This may be the same thing, but in any event it comes from:

https://html.spec.whatwg.org/multipage/input.html#e-mail-state-(type=email)

Jim

Code: Winbatch


A valid e-mail address is a string that matches the email production of the following ABNF, the character set for which is Unicode. This ABNF implements the extensions described in RFC 1123. [ABNF] [RFC5322] [RFC1034] [RFC1123]

email         = 1*( atext / "." ) "@" label *( "." label )
label         = let-dig [ [ ldh-str ] let-dig ]  ; limited to a length of 63 characters by RFC 1034 section 3.5
atext         = < as defined in RFC 5322 section 3.2.3 >
let-dig       = < as defined in RFC 1034 section 3.5 >
ldh-str       = < as defined in RFC 1034 section 3.5 >

This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.

The following JavaScript- and Perl-compatible regular expression is an implementation of the above definition.

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
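
For anyone wanting to try the quoted expression outside JavaScript or Perl, here is a small Python sketch of it. The names LABEL, WHATWG_EMAIL and is_valid_email are assumptions for the example; the pattern itself is the one quoted above, with the /.../ delimiters and the ^...$ anchors dropped because re.fullmatch anchors both ends on its own.

```python
import re

# One domain label: starts and ends with a letter or digit, hyphens allowed
# in between, at most 63 characters (1 + up to 61 + 1).
LABEL = r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"

# Local part (atext characters plus "."), "@", then one or more labels
# separated by dots. Same pattern as the quoted JavaScript expression.
WHATWG_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@" + LABEL + r"(?:\." + LABEL + r")*"
)

def is_valid_email(address):
    """Return True if the whole string matches the WHATWG email pattern."""
    return WHATWG_EMAIL.fullmatch(address) is not None

print(is_valid_email("fred+sales@company.com"))    # True
print(is_valid_email("fred@-company.com"))         # False
print(is_valid_email("a@" + "a" * 64 + ".com"))    # False
```

Note that this pattern accepts a single-label domain such as fred@localhost and does not reject consecutive dots in the local part, so it follows the ABNF above rather than RFC 5322.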