The Lonewacko Blog is now 98% spam-free!

Surf with confidence, bloggees!

I used to get a few spam comments every once in a while, which I'd remove immediately. Recently, though, they've been arriving in waves of one or two dozen at a time, and many of them were still sitting on the site. So, something had to be done.

I implemented the "secret input item in form" trick. That worked to some extent, but this one real jerk spammer apparently found a way around it, either manually or programmatically. That's the guy or group who posts links to sites which look like personal sites, but which are actually link farms for casino sites and spyware/dialers.
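
A minimal sketch of the idea behind that trick, in Java for illustration only (MT itself is Perl, so this is not the actual change made to the comment form). The form carries an extra input that is hidden from human visitors; a bot that fills in every field gives itself away. The class name and the field name "website2" are made up.

```java
import java.util.Map;

// Sketch of the server-side half of a hidden-field check.
// TRAP_FIELD is a hypothetical name; any input hidden from humans would do.
class HoneypotCheck {
    static final String TRAP_FIELD = "website2";

    // Returns true when the hidden field came back with something in it.
    // A missing or blank field is treated as a normal submission.
    static boolean looksLikeSpam(Map<String, String> formParams) {
        String trap = formParams.get(TRAP_FIELD);
        return trap != null && !trap.trim().isEmpty();
    }

    public static void main(String[] args) {
        // A human never sees the hidden field, so it arrives empty.
        System.out.println(looksLikeSpam(Map.of("text", "Nice post!", TRAP_FIELD, "")));
        // A bot that fills in every input trips the check.
        System.out.println(looksLikeSpam(Map.of("text", "Buy now", TRAP_FIELD, "http://spam.example")));
    }
}
```

This also shows why the trick only works to some extent: anyone posting by hand, or any script that leaves the hidden field blank, passes the check.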

So, on to the changes.

I was using Berkeley DB with Movable Type.

I changed it to use MySQL. The change was easy and is covered in MT's help files. MT includes a migration script.

Then I could use MySQL's client, which I'm familiar with. I ran 'select distinct comment_url from mt_comment' to get a list of all of the distinct URLs people have left in their comments. Then, I removed the good URLs from that list, leaving only the bad ones.

Then, I wrote a small Java program which moved every comment with a 'bad' URL into a new table, bad_comments.
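
Here is a rough sketch of what a program like that might look like, not the original code. It assumes bad_comments was created with exactly the same columns as mt_comment, and that the list of bad URLs has already been loaded into a List. The class and method names are made up. The transaction handling only helps if the tables use a transactional engine such as InnoDB; on MyISAM the rollback does nothing.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

// Sketch: move every comment whose URL is on the bad list into bad_comments.
// Assumes bad_comments has the same column layout as mt_comment.
class BadCommentMover {

    // Builds "?, ?, ?" with one placeholder per URL.
    static String placeholders(int count) {
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    static String copySql(int urlCount) {
        return "INSERT INTO bad_comments SELECT * FROM mt_comment WHERE comment_url IN ("
                + placeholders(urlCount) + ")";
    }

    static String deleteSql(int urlCount) {
        return "DELETE FROM mt_comment WHERE comment_url IN (" + placeholders(urlCount) + ")";
    }

    // Copies the matching rows, then deletes them from mt_comment.
    // Returns the number of rows removed from mt_comment.
    static int move(Connection conn, List<String> badUrls) throws SQLException {
        if (badUrls.isEmpty()) {
            return 0; // "IN ()" is not valid SQL, so there is nothing to run
        }
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement copy = conn.prepareStatement(copySql(badUrls.size()));
             PreparedStatement delete = conn.prepareStatement(deleteSql(badUrls.size()))) {
            for (int i = 0; i < badUrls.size(); i++) {
                copy.setString(i + 1, badUrls.get(i));   // JDBC parameters are 1-based
                delete.setString(i + 1, badUrls.get(i));
            }
            copy.executeUpdate();
            int removed = delete.executeUpdate();
            conn.commit();
            return removed;
        } catch (SQLException e) {
            conn.rollback(); // effective only on a transactional storage engine
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

Copying before deleting means that a comment moved by mistake can be restored by copying its row back from bad_comments.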

Then, I wrote another small Java program to print out excerpts from the remaining comments to make sure I hadn't missed any. Surprisingly, there were only a couple of spammers who hadn't provided spam URLs in their comments, so it seems to have worked.
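
The excerpt part of such a program could be as simple as the sketch below. It is illustrative rather than the original code: the comments here are hard-coded samples, whereas the real ones would be read from mt_comment (the text column is presumably comment_text, but that name is an assumption). The class name and the 40-character cutoff are arbitrary.

```java
import java.util.List;

// Sketch: print a short one-line excerpt of each comment for eyeballing.
class CommentExcerpts {

    // Collapses runs of whitespace into single spaces and cuts the text to
    // at most maxLen characters, appending "..." when something was cut.
    static String excerpt(String text, int maxLen) {
        String flat = text.trim().replaceAll("\\s+", " ");
        if (flat.length() <= maxLen) {
            return flat;
        }
        return flat.substring(0, maxLen) + "...";
    }

    public static void main(String[] args) {
        // Sample data standing in for rows read from the database.
        List<String> comments = List.of(
                "Great post,\n  I agree completely.",
                "Visit our wonderful online casino for the best bonuses and free spins today");
        for (String comment : comments) {
            System.out.println(excerpt(comment, 40));
        }
    }
}
```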

One exception: I removed a comment from someone who left a link to a site that sells Cuban cigars. The comment appears legitimate, though, so when I get a chance I'll put it back.

I also forgot to take Howard Owens's URL off the list of bad URLs, so one of his comments got moved into the bad table and thus removed from the site.

He's not blogging anymore, so who cares?

I kid.

I'll also add the code that inserts an 'edit this comment' link in the email I receive when someone adds a comment. That should make deleting spam comments easier in the future. Also, MT's mt_comment table has a creation date field, so I will only have to look at comments made after this date.