Data Mining Research - Google mining the web for blog spam comments

Thursday, August 02, 2007

Google mining the web for blog spam comments

It is always a pleasure to come back from holidays and read the comments on your blog. However, not all comments are worth your time. An example of an undesirable comment can be found here. Even on a first read, it sounds strange. Expressions such as "Hey buddy!" make it feel like spam. If you look more closely at the text, you can see that it says nothing specific about my blog or the topics I cover. This is typical of spam comments.

However, deleting a legitimate comment by mistake would be annoying, especially for the person who posted it. If you want to avoid extreme measures such as word verification or comment moderation, a simple solution is to use Google. Just copy and paste the first line of the comment, and Google will mine the web for similar comments.

In the case described above, simply search for the first sentence (in quotes, to get an exact match) and Google will link you to this site. You can easily see that the first comment there is exactly the same, so the comment on my blog can safely be considered spam.
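The manual check described above can be sketched in a few lines of Python. This is only an illustration, not a tool used for this post: the helper names `first_sentence` and `exact_match_search_url` are hypothetical, the sample comment is made up, and the `https://www.google.com/search?q=` URL format is an assumption about Google's public search page. The script only builds the link; you still open it in a browser and judge the results yourself.

```python
import re
from urllib.parse import quote_plus


def first_sentence(comment: str) -> str:
    """Return the first sentence of a comment (text up to the first '.', '!' or '?')."""
    text = " ".join(comment.split())  # collapse newlines and repeated spaces
    match = re.search(r"(.+?[.!?])(\s|$)", text)
    return match.group(1) if match else text


def exact_match_search_url(comment: str) -> str:
    """Build a search URL that looks for the first sentence as an exact phrase."""
    # Drop embedded double quotes so they do not break the quoted phrase.
    sentence = first_sentence(comment).replace('"', "")
    # Wrapping the phrase in quotes asks the search engine for an exact match.
    return "https://www.google.com/search?q=" + quote_plus(f'"{sentence}"')


if __name__ == "__main__":
    suspicious = "Hey buddy! Great site, keep it up."  # made-up sample comment
    print(exact_match_search_url(suspicious))
```

If the same sentence shows up word for word on many unrelated blogs, the comment is very likely spam. A short first sentence such as a bare greeting will match far too many pages, so in practice a longer sentence from the comment gives a more reliable signal.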



Mohammad Taghi said...

Dear Sandro Saitta,
Hi, how are you?
I am a PhD student at Ankara University, Turkey. I want to apply data mining (decision trees) to the control and operation of water reservoirs. I read your blog and I hope that I can benefit from your experience in this topic.

Sandro Saitta said...


If you ask specific questions on the blog, there is a chance that one of the readers (or I) may be able to answer you.

Kind regards.

Mohammad Taghi said...

Hi, Mr Sandro
I have just started in data mining.
I would like some practical and simple examples of clustering and decision trees.

Sandro Saitta said...


In my personal opinion, Introduction to Data Mining (Tan et al., 2006) is the best introduction to data mining (especially if you're interested in clustering).

