What Is Email Validation or Scrubbing?
There are basically 2 main steps to really clean your email lists.
The first is email validation, or scrubbing (sometimes called cleaning; there are many names for it). Essentially, what you do here is remove as much KNOWN crap as possible. Verification is the much costlier part of your overall data hygiene process, so email validation removes as much as possible before you need to verify, which saves you time and money. I'll get into verification in my next posting.
When I say 'known' crap I mean your suppression list: a list of known dead emails, spam traps, honey pots, complainers, etc. You basically 'scrub' your list against the suppression list and remove any of the bad emails that show up on both.
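To make the suppression scrub concrete, here is a minimal Python sketch. The function names and the sample addresses are made up for illustration, and it assumes both lists are small enough to hold in memory and that addresses should match regardless of upper or lower case.

```python
def normalize(email):
    """Trim whitespace and lowercase so 'Trap@X.net' matches 'trap@x.net'."""
    return email.strip().lower()


def scrub_against_suppression(emails, suppression):
    """Return the addresses from `emails` that are NOT on the suppression list."""
    blocked = {normalize(e) for e in suppression}  # a set makes each lookup fast
    return [e for e in emails if normalize(e) not in blocked]


# Hypothetical sample data (reserved example domains, not real addresses).
mailing_list = ["joe@example.com", "Trap@example.net", "amy@example.org"]
suppression_list = ["trap@example.net", "dead@example.com"]

clean = scrub_against_suppression(mailing_list, suppression_list)
print(clean)  # ['joe@example.com', 'amy@example.org']
```

Loading the suppression list into a set is the design choice that matters here: a real suppression list can run to millions of entries, and checking each address against a plain list would get very slow.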
There are many steps in email validation, all aimed at removing the harmful, bad emails on your list that you DO NOT want to mail to!
Some of the email scrubbing actions should be:
De-duping – or removing duplicates
Role Accounts – removing emails like "info@", "sales@", "webmaster@", etc.
Note: Some people might want to email these addresses because they are contacting B2B and these addresses might be exactly who they want to contact, but typically you do not want to mail to this data.
Key Words & Profanity – removing any email address that contains specific words like: spam, www, shit, admin, etc. Note, however, that this filter matches any part of the email address. So an address like spam at domain.com would be removed, but an address like joelovespam at hotmail.com would be removed as well. So be careful with this filter if you're using it.
Bad Domains – like it says, there are known bad domains out there that are associated with spam traps or honey pots, so you would want to scrub against this list and remove any emails you have with these domains.
Domain Extensions – removing emails by extension, like .org, .mil, .ru, .uk, etc. Basically, if you do not want to mail to Russia or the UK, you remove any address ending in .ru or .uk, and likewise .mil for emails to the military. These are not necessarily 'bad', but if you're looking to mail ONLY to, say, US customers, then a .uk address would be a waste of time.
Numerical emails – an email whose name part is ONLY numbers is typically a bad email, and for the most part so is one on an all-numeric domain. Examples: 1234 at 1234.com or joe at 232.com. Typically you would not want to mail to these.
Length of email – sometimes you'll get a ridiculously long email address. Typically most email addresses are fairly short, on average not much more than, say, 15-25 characters in total. Ex: ireallylikecheese at yahoo.com is 27 characters in total once you put the @ back, which is already longer than average. So when I scrub a list I typically allow up to, say, 40 characters max; anything longer is deleted.
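The scrubbing actions above can be sketched in Python roughly as follows. This is only an illustration under some assumptions: the role names, key words, extensions and the 40-character limit come from the examples above, while the bad domain, the sample addresses, the function names and the extra "malformed" check (for addresses with no @) are made up for the sketch. A real scrub would load these lists from your own data.

```python
ROLE_ACCOUNTS = {"info", "sales", "webmaster"}
KEYWORDS = ("spam", "www", "admin")
BAD_DOMAINS = {"known-trap.example"}      # hypothetical bad domain
BLOCKED_EXTENSIONS = (".ru", ".uk", ".mil")
MAX_LENGTH = 40


def rejection_reason(email):
    """Return why an address would be scrubbed, or None to keep it."""
    address = email.strip().lower()
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return "malformed"                # extra guard, not one of the steps above
    if len(address) > MAX_LENGTH:
        return "too long"
    if local in ROLE_ACCOUNTS:
        return "role account"
    if domain in BAD_DOMAINS:
        return "bad domain"
    if domain.endswith(BLOCKED_EXTENSIONS):
        return "blocked extension"
    # All-digit name part, or all-digit domain name before the first dot.
    if local.isdigit() or domain.split(".")[0].isdigit():
        return "numeric"
    # Matches ANY part of the address, so this also catches innocent ones.
    if any(word in address for word in KEYWORDS):
        return "keyword"
    return None


def scrub(emails):
    """De-dupe, then apply the filters. Returns (kept, removed_with_reasons)."""
    kept, removed, seen = [], [], set()
    for email in emails:
        address = email.strip().lower()
        if address in seen:
            removed.append((email, "duplicate"))
            continue
        seen.add(address)
        reason = rejection_reason(email)
        if reason:
            removed.append((email, reason))
        else:
            kept.append(address)
    return kept, removed


sample = [
    "joe@example.com", "JOE@example.com", "info@example.com",
    "spam@example.com", "joelovespam@example.com", "amy@known-trap.example",
    "ivan@example.ru", "1234@1234.com", "joe@232.com",
    "ireallylikecheese@example.com",
]
kept, removed = scrub(sample)
print(kept)  # ['joe@example.com', 'ireallylikecheese@example.com']
for email, reason in removed:
    print(email, "->", reason)
```

Note that joelovespam@example.com is removed for the key word "spam" even though it is probably a real person, which is exactly the over-matching to watch out for. Keeping the reason next to each removed address makes it easy to review what a filter threw away before you delete anything for good.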
Now understand, the whole idea of scrubbing your list is to remove as many known bad emails as possible, but it's very possible you'll end up removing some good ones too! If you really want to be safe, though, losing 1%-2% of your emails is worth it to avoid hitting a spam trap or losing your server.
On that same note, it's a known fact that AOL, for example, puts out upwards of 500,000 honey pots a day from abandoned email accounts, so it's impossible for ANYONE to remove 100% of spam traps or honey pots: way too many are added daily, and there is no way for us to know which ones they are. The large ISPs obviously do not advertise or pass around this information, so if someone claims they can remove 100% of all spam traps and honey pots, do not trust them any further than you can throw them!
Last thing: typically, if you're scraping lists or buying lists and have no idea where they are coming from, then you'll probably see all of these examples. If you have your own opt-in list, or your lists are generally business lists, then you will not see half of these things, but it's better to be safe than sorry, and email validation is the way to go!