More solutions for combating comment spam.

Blogs seem to be getting hit by more comment spam with each passing day, so the talk of the town in the Movable Type community has been how to stop as much of it as possible. There’s a thread devoted to this topic over at ScriptyGoddess that lists five possible solutions, including the MT-Blacklist plugin by Jay Allen that I talked about the other day (and which I now have running successfully here at SEB). A couple of the other interesting solutions include a variation on the Captcha Turing Test, written as an MT plugin by James Seng, who has since gone on to develop a new method using a Bayesian filtering process that’s even better. In fact, I think he has removed his first plugin from his site.

The Captcha Turing Test is familiar to anyone who has signed up for a Hotmail or Yahoo webmail account. It presents a randomly generated code inside a graphic image, which the user has to enter correctly before the account will be set up. The SEB Forums use this method of verification during registration. It works pretty well at defeating bots because of the way the code is displayed, but it makes it difficult for visually impaired users, or people who have graphics turned off in their browsers, to participate.
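To make the idea concrete, here is a minimal sketch in Python of the part that happens behind the image: generating the random code and checking what the visitor typed. The function names are made up for illustration and are not taken from James’ plugin or from the forum software, and the step that draws the code into a distorted graphic is left out entirely.

```python
import secrets
import string

# Characters the challenge code is drawn from (an assumption for this sketch).
ALPHABET = string.ascii_uppercase + string.digits


def new_challenge(length=6):
    """Return a random code that would then be rendered into an image."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def check_answer(expected, typed):
    """Accept the answer if it matches, ignoring case and stray spaces."""
    return typed.strip().upper() == expected.upper()
```

The server would keep the expected code on its side (in a session, for example) and only accept the comment or registration when `check_answer` returns true.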

So James went on to develop a new plugin that uses a Bayesian comment filter, an approach that is already gaining popularity as a means of fighting email spam. This method takes the idea of a blacklist, similar to what MT-Blacklist generates, and builds on it. Instead of just looking for known URL fragments, the Bayesian method looks at the entire content of the comment submission and scores it based on how likely the words used and URLs listed are to come from a spam comment. Working from those probabilities, the filter makes a guess at whether or not the comment is spam and blocks it if it thinks it is.
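For the curious, the scoring idea can be sketched in a few lines of Python. This is a bare-bones naive Bayes classifier with add-one smoothing, written only to illustrate the general technique; it is not James’ plugin, the class and method names are hypothetical, and a real filter would tokenize and weight things far more carefully.

```python
import math
import re
from collections import Counter


class BayesCommentFilter:
    """Toy naive Bayes filter; 'spam' and 'ham' (good comments) are the two labels."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = {"spam": 0, "ham": 0}

    @staticmethod
    def tokens(text):
        # Lowercase words and numbers; URLs are split into their pieces.
        return re.findall(r"[a-z0-9]+", text.lower())

    def train(self, text, label):
        """Record the words of a comment the blog owner marked as spam or ham."""
        self.words[label].update(self.tokens(text))
        self.docs[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that a comment is spam."""
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total_docs = self.docs["spam"] + self.docs["ham"]
        scores = {}
        for label in ("spam", "ham"):
            # Log prior, smoothed so an untrained filter does not divide by zero.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            total = sum(self.words[label].values())
            for tok in self.tokens(text):
                # Add-one smoothing so unseen words do not zero out the score.
                score += math.log((self.words[label][tok] + 1) / (total + vocab + 1))
            scores[label] = score
        # Turn the two log scores into a probability, clamped to avoid overflow.
        diff = max(min(scores["ham"] - scores["spam"], 700), -700)
        return 1 / (1 + math.exp(diff))
```

A blog would call `train` each time the owner marks a comment as spam or as legitimate, and hold back any new comment whose `spam_probability` is above some threshold such as 0.9 (the threshold is a guess here, not a recommended value).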

The disadvantage to this method is that you have to “train” the filter at first, as it will generate some false positives and miss some real spam at the start. The advantage, though, is that after a little training the filter will block new spam comments without having to be taught about each one, whereas MT-Blacklist can only block sites that already have their URL fragments in its blacklist. If it’s implemented well, the Bayesian method should take the least work to maintain for the benefits it offers. Those of you who use Thunderbird or Mozilla as your email client may already be familiar with the Bayesian method, as it’s implemented very well in those clients.
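The limitation of a pure blacklist is easy to see in a small Python sketch. The fragments and the function name below are invented for illustration and say nothing about how MT-Blacklist is actually written; the point is only that a fragment has to be on the list before anything gets blocked.

```python
# Made-up URL fragments standing in for a real blacklist.
BLACKLIST = {"cheap-pills.example", "casino-win.example"}


def blocked_by_blacklist(comment, blacklist=BLACKLIST):
    """Block a comment only if it contains a fragment already on the list."""
    text = comment.lower()
    return any(fragment in text for fragment in blacklist)


# A known spammer is caught, but the same pitch from a new domain is not.
known = blocked_by_blacklist("Visit http://www.cheap-pills.example/buy")
unseen = blocked_by_blacklist("Visit http://new-pills.example/buy")
```

Here `known` comes out true and `unseen` comes out false, so the second comment gets through until someone adds the new fragment by hand. A trained filter that scores the wording of the whole comment has a chance of catching it anyway.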

I like MT-Blacklist, but I’ll probably try out James’ Bayesian filter as well this weekend to see if it’s a worthy successor as I’m already sold on the Bayesian method from using Thunderbird.

Either way, it’s impressive and a testament to the MT community that there are already several different ways to try and combat this growing problem. If you’re running an MT based blog you’ll benefit from having several different methods to combat comment spam depending on your needs and preferences.

3 thoughts on “More solutions for combating comment spam.”

  1. Arrogance and greed fuck up yet another human endeavour.  Spammers should be disemboweled with a glowing-hot steel hook.

  2. I’ve had Jay Allen’s MT-Blacklist up and running for about 2 weeks now and it’s done a great job thus far.  He’s just released 1.5 so I’ll check out both his and the Bayesian filter as well.

  3. Man: You sit here, dear.
    Wife: All right.
    Man: Morning!
    Waitress: Morning!
    Man: Well, what’ve you got?
    Waitress: Well, there’s egg and bacon; egg sausage and bacon; egg and spam; egg bacon and spam; egg bacon sausage and spam; spam bacon sausage and spam; spam egg spam spam bacon and spam; spam sausage spam spam bacon spam tomato and spam;
    Vikings: Spam spam spam spam…
    Waitress: …spam spam spam egg and spam; spam spam spam spam spam spam baked beans spam spam spam…
    Vikings: Spam! Lovely spam! Lovely spam!
    Waitress: …or Lobster Thermidor a Crevette with a mornay sauce served in a Provencale manner with shallots and aubergines garnished with truffle pate, brandy and with a fried egg on top and spam.
    Wife: Have you got anything without spam?
    Waitress: Well, there’s spam egg sausage and spam, that’s not got much spam in it.
    Wife: I don’t want ANY spam!
    Man: Why can’t she have egg bacon spam and sausage?
    Wife: THAT’S got spam in it!
    Man: Hasn’t got as much spam in it as spam egg sausage and spam, has it?
    Vikings: Spam spam spam spam… (Crescendo through next few lines…)
    Wife: Could you do the egg bacon spam and sausage without the spam then?
    Waitress: Urgghh!
    Wife: What do you mean ‘Urgghh’? I don’t like spam!
    Vikings: Lovely spam! Wonderful spam!
    Waitress: Shut up!
    Vikings: Lovely spam! Wonderful spam!
    Waitress: Shut up! (Vikings stop) Bloody Vikings! You can’t have egg bacon spam and sausage without the spam.
    Wife: I don’t like spam!
    Man: Sshh, dear, don’t cause a fuss. I’ll have your spam. I love it. I’m having spam spam spam spam spam spam spam baked beans spam spam spam and spam!
    Vikings: Spam spam spam spam. Lovely spam! Wonderful spam!
    Waitress: Shut up!! Baked beans are off.
    Man: Well could I have her spam instead of the baked beans then?
    Waitress: You mean spam spam spam spam spam spam… (but it is too late and the Vikings drown her words)
    Vikings: (Singing elaborately…) Spam spam spam spam. Lovely spam! Wonderful spam! Spam spa-a-a-a-a-am spam spa-a-a-a-a-am spam. Lovely spam! Lovely spam! Lovely spam! Lovely spam! Lovely spam! Spam spam spam spam!
