Regular Expression match everything but two names and <email address> after particular word

I have a bunch of names and email addresses inside these aggregated emails, and I'd like to get rid of everything except the First Last <email@domain.com> entries throughout the document. Basically I have...

From: Name Wood <email@gmail.com>
Subject: Yelp entries for iPod contest
Date: April 20, 2012 12:51:07 PM EDT
To: email@domain.cc

Have had a great experience with .... My Son ... is currently almost a year into treatment. Dr. ... is great! Very informative and always updates us on progress and we have our regular visits. The ... buck program is a great incentive which they've implemented to help kids take care of their teeth/braces. They also offer payment programs which help for those of us that need a structured payment option. Wouldn't take my kids anywhere else. Thanks Dr. ... and staff
Text for 1, 2, and 3 entries to Yelp
Hope ... wins!!
Begin forwarded message:

From: Name Wood <email@gmail.com>
Subject: reviews 2 and 3
Date: April 20, 2012 12:44:26 PM EDT
To: email@domain.cc

Have had a great experience with ... Orthodontics. My Son ... is currently almost a year into treatment. Dr. ... is great! Very informative and always updates us on progress and we have our regular visits. The ... buck program is a great incentive which they've implemented to help kids take care of their teeth/braces. They also offer payment programs which help for those of us that need a structured payment option. Wouldn't take my kids anywhere else. Thanks Dr. ... and staff
Have had a great experience with...

I want to only match the...

Name Wood <email@gmail.com>
Name Wood <email@gmail.com>

from this text. So basically I want to match the next two words after the word "From: ", plus "<" + email address + ">", excluding the word "From: " itself. I've gleaned from researching that this is a negative lookahead (I think) searching for two whole words (somehow using {0,2}) and then an email address between a < character and a > character.

Answers


You could just do this, and read the name and email from capture group 1 (the rest of the line after "From: "):

/(?:From: )(.*)/g
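
As a rough illustration of reading that capture group back, here is a minimal Python sketch. The answer does not name a language, so Python's `re` module and the `sample` text are assumptions made for illustration; `re.findall` plays the role of the `/g` flag and returns only group 1 when the pattern has a single capturing group.

```python
import re

# Hypothetical sample modeled on the headers in the question.
sample = (
    "From: Name Wood <email@gmail.com>\n"
    "Subject: Yelp entries for iPod contest\n"
    "To: email@domain.cc\n"
    "\n"
    "Begin forwarded message:\n"
    "\n"
    "From: Name Wood <email@gmail.com>\n"
    "Subject: reviews 2 and 3\n"
)

# Same pattern as above; findall scans the whole string like the /g flag.
senders = re.findall(r"(?:From: )(.*)", sample)
print(senders)
```

Because `.` stops at a newline by default, each match ends at the end of its header line. A body line that happens to contain "From: " would be picked up as well.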

This regular expression will find what you're looking for:

(?<=From:)\s*[^<]+<[^>]+>

What you want to do with the match is a little unclear from your question. The matched text should probably be put into one or more groups so you can extract the parts you want (name in one group, email in another, or both together?), so you'll need to provide more information. As written, the match also includes any whitespace after "From:", and the regex engine must support lookbehind. The above is the simplest case.

Explanation:

(?<=From:)   # positive lookbehind to find "From:"
\s*          # optional whitespace
[^<]+<       # everything up to the first '<' (the name)
[^>]+>       # everything up to the '>' (the email)
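
One possible way to add the groups mentioned above, sketched in Python (an assumed language; the group names `name` and `email` and the `sample` string are hypothetical). The lookbehind is kept as in the answer, and the name part is made lazy so the whitespace before `<` is not captured.

```python
import re

# Lookbehind pattern from the answer with two named groups added.
# The group names are illustrative, not part of the original answer.
SENDER = re.compile(r"(?<=From:)\s*(?P<name>[^<]+?)\s*<(?P<email>[^>]+)>")

# Hypothetical sample modeled on the question's headers.
sample = "From: Name Wood <email@gmail.com>\nSubject: reviews 2 and 3\n"

# Collect (name, email) pairs for every "From:" header found.
pairs = [(m.group("name"), m.group("email")) for m in SENDER.finditer(sample)]
print(pairs)
```

Splitting name and email this way is only one option; a single group around the whole `Name <email>` text would serve equally well if the two parts are never needed separately.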

To strip everything but the name and email, use the 's' modifier (dot matches newline) with a global find and replace. The replacement for both regexes is $1\n

This is faster but will leave an extra newline on successes.

Find .*?From:[^\S\n]*([^<\n]+<[^>\n]*\@[^>\n]*>)|.*$

This is slower (uses lookahead) but won't leave the extra newline.

Find .*?From:[^\S\n]*([^<\n]+<[^>\n]*\@[^>\n]*>)(?:(?!From:[^\S\n]*[^<\n]+<[^>\n]*\@[^>\n]*>).)*
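
A sketch of the slower (lookahead) find and replace in Python, which is an assumed language since the answer only describes the 's' modifier and a `$1\n` replacement. `re.DOTALL` stands in for the 's' modifier, `re.sub` replaces globally by default, and `\1` is Python's spelling of `$1`. The names `HEADER`, `STRIP`, and `keep_senders` are hypothetical.

```python
import re

# Shape of a "From: Name <email>" header, reused inside the negative lookahead.
HEADER = r"From:[^\S\n]*[^<\n]+<[^>\n]*@[^>\n]*>"

# Lookahead pattern from the answer: skip to a header, capture "Name <email>",
# then consume every character up to (but not including) the next header.
STRIP = re.compile(
    r".*?From:[^\S\n]*([^<\n]+<[^>\n]*@[^>\n]*>)(?:(?!" + HEADER + r").)*",
    re.DOTALL,  # equivalent of the 's' modifier: dot also matches newlines
)

def keep_senders(text: str) -> str:
    """Replace each header-plus-body chunk with just 'Name <email>' and a newline."""
    return STRIP.sub(r"\1\n", text)
```

Text that contains no matching header is returned unchanged, since the pattern never matches. The faster alternation pattern is not shown here because how its `.*$` branch and an unset `$1` behave can differ between regex engines.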
