Regex For Email Validation

Started by stanl, November 06, 2017, 10:39:50 AM

stanl

Attached is a test that uses the .NET/CLR regex engine and a pattern I grabbed from C#. I have read a few comments saying that regex is not really the way to go for email address validation. I am going to be looking at a variety of regex patterns to extract emails, phone numbers, and sales IDs from user notes (queried into memo fields). I would appreciate anyone testing the pattern and reporting any valid email address that the script calls invalid.
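
For the extraction side of this (pulling addresses out of free-form notes rather than validating a single string), here is a minimal sketch in Python. It is not the attached WinBatch/.NET test. The pattern is an unanchored variant modeled on the WHATWG HTML email regex, changed to require at least one dot in the domain, which is my own assumption for notes data. The names `EMAIL_IN_TEXT` and `extract_emails` and the sample note are hypothetical.

```python
import re
from typing import List

# Unanchored pattern modeled on the WHATWG HTML email regex.
# Assumption: the domain must contain at least one "." (trailing "+" instead
# of "*"), so bare words followed by "@word" are not picked up from notes.
EMAIL_IN_TEXT = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)+"
)

def extract_emails(note: str) -> List[str]:
    """Return every substring of the note that looks like an email address."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return EMAIL_IN_TEXT.findall(note)

# Hypothetical memo-field content, for illustration only.
note = "Spoke with fred+sales@company.com; cc jane.doe@example.org. Call 555-1234."
print(extract_emails(note))
# → ['fred+sales@company.com', 'jane.doe@example.org']
```

One caveat with this approach: characters such as `'` or `!` are legal in the local part, so if one of them sits directly in front of an address in the note, it will be captured as part of the match.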

kdmoyers

I think that plus signs are valid in an email address, like

fred+sales@company.com

The suggested regex flags them as bad.

chrislegarth

Here is a pattern that I use; I found it on the Internet.

pattern = `^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%%&'\*\+/=\?\^`: '`\{\}\|~\w])*)(?<=[0-9a-z])@))(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$'

The string is concatenated from two pieces because the pattern itself contains the ` character, which is also used as the string delimiter.

JTaylor

This may be the same thing, but in any event it comes from:

https://html.spec.whatwg.org/multipage/input.html#e-mail-state-(type=email)

Jim

A valid e-mail address is a string that matches the email production of the following ABNF, the character set for which is Unicode. This ABNF implements the extensions described in RFC 1123. [ABNF] [RFC5322] [RFC1034] [RFC1123]

email         = 1*( atext / "." ) "@" label *( "." label )
label         = let-dig [ [ ldh-str ] let-dig ]  ; limited to a length of 63 characters by RFC 1034 section 3.5
atext         = < as defined in RFC 5322 section 3.2.3 >
let-dig       = < as defined in RFC 1034 section 3.5 >
ldh-str       = < as defined in RFC 1034 section 3.5 >

This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.

The following JavaScript- and Perl-compatible regular expression is an implementation of the above definition.

/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/
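
To try the pattern outside WinBatch, here is a minimal Python sketch of the same regular expression. The function name `is_valid_email` is hypothetical. The `^` and `$` anchors are replaced by `fullmatch`, and the `/` needs no escaping because Python has no regex-literal delimiters.

```python
import re

# Python rendering of the WHATWG pattern quoted above, split over three
# lines for readability: local part, first domain label, remaining labels.
WHATWG_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)

def is_valid_email(address: str) -> bool:
    """Return True if the whole string matches the WHATWG email pattern."""
    # fullmatch anchors at both ends, standing in for ^ and $ in the original.
    return WHATWG_EMAIL.fullmatch(address) is not None

print(is_valid_email("fred+sales@company.com"))  # → True
print(is_valid_email("fred@company..com"))       # → False
print(is_valid_email("fred sales@company.com"))  # → False
```

Note that this pattern accepts the plus-sign address from earlier in the thread. It also accepts a domain with no dot at all, such as `fred@localhost`, so add a separate check if your data needs a dotted domain.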