Email Harvesting and Counter Measures

Email harvesting means collecting email addresses by crawling web pages with scripts or other software. The harvested addresses are then used to send bulk marketing emails, newsletters, or spam.

Email Harvesting

Plenty of software is available that crawls a website and extracts the email addresses found on its pages. Such programs are generally known as “harvesting bots”. Lists of email addresses are also bought and sold for sending spam or bulk marketing email, although many countries have laws in place to fight spam.

List of countries that have anti-spam email legislation in place.
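To see what the counter measures are up against, here is a rough sketch of the core of a harvesting bot: a pattern match over the raw HTML of a page. The function name harvestEmails, the sample page, and the deliberately simple pattern are my own assumptions for illustration; real harvesters are likely more elaborate.

```javascript
// Hypothetical helper: collect anything that looks like an email address in raw HTML.
function harvestEmails(html) {
  // Deliberately simple pattern; valid address syntax is broader than this.
  const pattern = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
  const matches = html.match(pattern) || [];
  // Lower-case and de-duplicate, keeping first-seen order.
  return [...new Set(matches.map((m) => m.toLowerCase()))];
}

// Made-up sample markup, not taken from any real site.
const samplePage =
  '<p>Contact <a href="mailto:info@example.com">info@example.com</a> ' +
  'or sales@example.com.</p>';

console.log(harvestEmails(samplePage));
```

Anything that keeps a complete, valid-looking address out of the raw markup will slip past a matcher this simple, which is the idea behind the methods that follow.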

Counter Measures

There are quite a few counter measures to stop email harvesting on our websites. Wikipedia.org has a list that includes most of them. Still, I would suggest not publishing an email address unless it is really necessary, because nowadays there are much more powerful bots available that can fetch email addresses despite the counter measures.

Effective Methods

There are many ways to prevent bots from fetching email addresses from our websites.

One is to write “at” instead of “@” and “dot” instead of “.”. We can also use JavaScript to hide the email address and insert it only when some event occurs, but this method lacks accessibility.
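As a small illustration of the “at”/“dot” substitution, the sketch below rewrites an address and then shows how little work a bot needs to reverse it. The helper names obfuscate and deobfuscate are hypothetical, and the reversal assumes the words are separated by spaces.

```javascript
// Hypothetical helper: replace "@" and "." with the words "at" and "dot".
function obfuscate(email) {
  return email.replace('@', ' at ').replace(/\./g, ' dot ');
}

// What a slightly smarter harvester could do to undo the substitution.
// Assumes the words are surrounded by whitespace, as produced above.
function deobfuscate(text) {
  return text.replace(/\s+at\s+/gi, '@').replace(/\s+dot\s+/gi, '.');
}

const hidden = obfuscate('myemail@domain.com');
console.log(hidden);
console.log(deobfuscate(hidden));
```

Because the reversal is two replacements, this substitution on its own probably stops only the simplest bots.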

Below are the two most powerful ways I’ve come across to prevent bots from harvesting email addresses from web pages.

Using CSS & JavaScript

<a href="mailto:myemail@ignore-domain.com">myemail@<span style="display:none;">ignore-</span>domain.com</a>

With this we misguide crawlers into fetching the wrong email address, while a screen reader will still read out the actual one, because the hidden span is skipped. When the user clicks the link, we can use JavaScript to remove the “ignore-” part from the href.
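The JavaScript step could look roughly like the sketch below. The names stripDecoy and attachDecoyRemoval are my own, and the sketch assumes the decoy text “ignore-” never appears in the real address and that the script runs after the links exist in the page (for example, at the end of the body).

```javascript
// Hypothetical helper: drop every occurrence of the decoy text from a mailto: href.
function stripDecoy(href, decoy = 'ignore-') {
  return href.split(decoy).join('');
}

// Attach the fix to every mailto: link under the given root.
function attachDecoyRemoval(root) {
  root.querySelectorAll('a[href^="mailto:"]').forEach((link) => {
    const fix = () => {
      link.setAttribute('href', stripDecoy(link.getAttribute('href')));
    };
    // Rewrite the href on click, and on focus so keyboard users are covered too.
    link.addEventListener('click', fix);
    link.addEventListener('focus', fix);
  });
}

// Only touch the DOM when one is available (i.e. in a browser).
if (typeof document !== 'undefined') {
  attachDecoyRemoval(document);
}

console.log(stripDecoy('mailto:myemail@ignore-domain.com'));
```

Keeping the string handling in a separate function makes it easy to check outside the browser.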

Using JavaScript only

<a href="mailto:email@domain@@com" onmouseover="this.href=this.href.replace('@@','.')">Send email</a>

This method is really effective, but it is not a suitable solution for accessibility: the link text does not read out the email address, and the href is only corrected on mouse hover. It does, however, keep the email address away from bots that read the raw markup, because the address there contains no dot (.).
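One possible way to soften the accessibility problem is to move the replacement into a script and trigger it on keyboard focus and touch as well as on hover. This is only a sketch: restoreDot and attachDotRestore are hypothetical names, and it assumes the “@@” placeholder is used only where a dot belongs.

```javascript
// Hypothetical helper: turn every "@@" placeholder back into a dot.
// Unlike a single string replace, this also handles domains such as domain@@co@@uk.
function restoreDot(href, placeholder = '@@') {
  return href.split(placeholder).join('.');
}

// Fix links on hover, keyboard focus, and touch.
function attachDotRestore(root) {
  root.querySelectorAll('a[href*="@@"]').forEach((link) => {
    const fix = () => {
      link.setAttribute('href', restoreDot(link.getAttribute('href')));
    };
    ['mouseover', 'focus', 'touchstart'].forEach((type) => {
      link.addEventListener(type, fix);
    });
  });
}

// Only touch the DOM when one is available (i.e. in a browser).
if (typeof document !== 'undefined') {
  attachDotRestore(document);
}

console.log(restoreDot('mailto:email@domain@@com'));
```

The raw markup still never contains a complete address, so the protection against simple bots stays the same.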

I am sure there are other ways to protect email addresses from bots. Do let me know your innovative ideas in the comment box below.

Web/UI & front-end developer based in Ahmedabad, GJ, India. Here to help and discuss with the community to spread web awareness.
