Fighting Spam — All Kinds

How I deal with comment and pingback spam.

I start each morning pretty much the same way. I make myself a cup of coffee, make a scrambled egg for my parrot, and then sit down at the kitchen table and check the comments that came into my blog overnight.

About Spam

The main thing I’m checking for each morning is comment and pingback spam. These are similar but different.

  • Comment spam is a comment that exists solely to provide one or more links to another Web site, usually to promote that site or its services, but possibly just to get links to that site to improve its Google rankings. Comment spam adds nothing to the site’s value. It is sometimes disguised as a guest book entry or general positive comment (for example, “Great blog! I’ll be back!” accompanied by a link or two), but it simply isn’t something the average blogger should want on his or her site.
  • Pingback spam is a comment that appears as a result of a link on another blog pinging your blog. Although many pingbacks are legitimate (as many comments are legitimate), there appears to be a rise in pingbacks as a result of feed scraping, which I’ve discussed here and here. Pingback spam is usually pretty easy to spot; the software that scrapes the feeds isn’t very creative, so the excerpt is usually an exact quote from what’s been scraped. Sometimes, oddly enough, the quote is from the copyright notice that appears at the bottom of every feed item originating from this site. Pingbacks automate the linking of your site to someone else’s; in the case of pingback spam, that someone is likely to be a splogger.

Lucky me: I get both.
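
The pingback-spam tell described above (an excerpt copied word for word from the scraped post) can be checked mechanically. What follows is only a minimal Python sketch, not how Bad Behavior, Spam Karma, or WordPress actually work. The function names `normalize` and `looks_scraped`, the `min_chars` threshold, and the assumption that excerpts arrive wrapped in "[...]" markers are all made up for illustration.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences don't matter."""
    return re.sub(r"\s+", " ", text).strip().lower()


def looks_scraped(excerpt: str, own_texts: list[str], min_chars: int = 40) -> bool:
    """Return True if the pingback excerpt is a verbatim chunk of our own text.

    own_texts is whatever we publish in the feed: post bodies and, if we
    want to catch that case too, the copyright notice added to feed items.
    min_chars is an arbitrary cutoff (an assumption) so that very short
    excerpts are never flagged.
    """
    # Assumed convention: excerpts are wrapped in "[...]" or "[…]" markers.
    core = normalize(excerpt.replace("[...]", " ").replace("[…]", " "))
    if len(core) < min_chars:
        return False
    return any(core in normalize(text) for text in own_texts)


if __name__ == "__main__":
    post = (
        "I start each morning pretty much the same way. I make myself a cup "
        "of coffee and check the comments that came into my blog overnight."
    )
    excerpt = "[...] I make myself a cup of coffee and check the comments [...]"
    print(looks_scraped(excerpt, [post]))
```

A match is a hint, not a verdict. A legitimate blogger who quotes a full sentence would match too, so the pinging site still needs a look before anything is deleted.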

Tools to Fight Comment Spam

Fortunately, I use both Bad Behavior and Spam Karma 2 (many thanks again to Miraz for suggesting both of these), so only a few spam comments get through their filters and are actually posted to the site. On a typical day, I might have just 3 to 5 of them. Compare that to the 3,400 potential spam messages stopped by Bad Behavior in the past week and the 51,000 spam messages Spam Karma has deleted after posting in the year since I installed it. Without these two forms of protection, I’d be spending all day cleaning up spam.

Anyone who doesn’t use some kind of spam protection on a blog with open comments is, well, an idiot.

Neither program is very effective against pingback spam, although Spam Karma seems to be catching a few of them these days. Although I’m pretty sure I can set up WordPress to reject pingbacks, I like the idea of getting legitimate links from other blogs. It helps form a community. And it provides a service to my readers. For example, if I wrote an article about something and another blogger quoted my work and added his insight to it, his article might interest my readers. Having a link in my comments right to his related post is a good thing.

My Routine

So my morning routine consists of checking Spam Karma’s “Approved Comments” and marking the comments that are spam as spam. Then I go into WordPress’s Comments screen (Dashboard > Manage > Comments), mark pingback spam as spam, and delete it.

Why do it both ways? Well, I’m concerned that if I keep telling Spam Karma that pingback spam is spam, it’ll think all pingbacks are spam. I don’t want it to do that. So I manually delete them. It only takes a minute or two, so it isn’t a big deal. If I had hundreds of these a day, I might do things differently.

The other reason I delete the pingbacks manually is that I want to check each site that’s pinging mine. I collect URLs of splogging sites and submit them periodically to Google. These sites violate Google’s Terms of Service and I’m hoping Google will either cancel their AdSense accounts or remove them from Google’s search indexing (or, preferably, both). So I send the links to Google and Google supposedly looks at them.

I’m working on a project to make creating a DMCA notice easier — almost automated — and would love to hear from anyone working on a project like that.

This morning was quiet. Only three spams to kill: one comment spam and two pingback spams. I’ll get a few more spams during the day and kill them as they arrive; WordPress notifies me via e-mail of all comments and pingbacks as they are received. (I don’t check my e-mail at the breakfast table anymore.)

Do you have a special way to deal with comment or pingback spam? Don’t keep it a secret. Leave a Comment below.

14 thoughts on “Fighting Spam — All Kinds”

  1. So then the first two comments listed on your sideboard as of 7:44 am MST are pingbacks?

    I hate this junk! Talk about a waste of all our time and bandwidth!

    And I like to waste time and bandwidth.

  2. Yes, those are pingbacks, but they’re from another site I manage, so they’re legit. Any posts on this site that are of interest to WordPress users are also posted to that site. That site then pings this site if there are any links to this site in the post. There are two of them so you see two pingbacks.

    I’ll probably leave them, since they do provide readers of the pinged posts with another source of information.

    Normally, my posts on this site would ping other posts on this site. But I installed software to disable that, since I often link to related posts here.

    Sounds confusing, but it’s not once you get an idea of the relationship between the posts.

  3. I’ve just started my first blog called Wonkie (great now you’ve made me paranoid about leaving my blog link anywhere!) – in any case I would like to understand how pingback spam affects your site ranking on google for example.

    Probably a newbie question but I thought it’s good to have other sites link to your blog?

    And would a spam pingback ever be visible to users of my site? (I can totally understand why I wouldn’t want comment spam because that would make for horrible user experience but I’m not sure about the demerits of pingbacks) – thanks

  4. Pratish, I’m not sure if I’m qualified to answer your question about Google and pingback spam. Frankly, I don’t bother with all that SEO stuff. I just blog and earn page rank based on my content.

    I should make something clear here. If you delete a pingback that you think is spam, you are NOT removing the link on the other site to yours. You’re just removing the reciprocating link in your comments list back to that site. So there’s definitely no Google penalty for deleting pingback spam.

    I don’t tolerate pingback spam here because of two reasons:

    (1) Most of the pingbacks are an illegal use of my own content in another blog (splogging). I can’t stop the sploggers, but I can certainly stop pingbacks on my own site from linking back to theirs.

    (2) Other pingbacks are often a thinly veiled attempt to get a link on my site to another site. They’re spam, plain and simple.

    Hope this helps. Good luck with your new blog!

  5. Thanks for the tip – it took all of 1 week with my new venture to discover what splogging means which is pretty sad actually :( Found some random site had basically scraped the content from my site and posted it on theirs (along with a few dozen ads on either side of it!)

    I don’t have much time to spend on SEO related stuff either -actually would much rather spend the time making cartoons for my site! Do you advertise your blog anywhere or have you just built up a loyal following over the years?

  6. Interesting article. I launched a couple of blogs a few months ago and now it seems I’m getting around 20 of these pingbacks (and the odd trackback) daily for each. I check out both to make sure they are legitimate but I’ve noticed the few I do approve that the link disappears off the wordpress dashboard the next day.

    I’m wondering if this is some kind of link spam whereby they make a pingback and when you accept it they immediately delete it so they have a one way link back to their site. Of course that improves their google position and does nothing for ours.

    I can’t seem to find anything about it on wordpress and of course just being a beginning blogger I don’t know enough about the system to make an informed decision.

    Any ideas?


  7. The pingbacks are likely just what you think: someone’s attempt to pump up their Google juice. Do you check the URLs for these pingbacks? Do they have anything at all to do with your blog post? Or are they simply spam sites sucking content off your site with a link back? If they’re not legit, don’t approve them.

  8. Gordon.. it’s generally pretty easy to spot a spam blog. If you check out the referring site and it’s got your post replicated on it (i.e. duplicated, not simply referring to it or quoting part of it) it’s likely a splog so don’t approve the pingback.

    Maria, I’ve checked with google since your last message and splogging can influence your google pagerank in a number of ways besides just the link juice element. Search engines do not like duplicate content and these splogs duplicating your content can cause issues with indexing your original work properly with SEs. If you find the same blogs pinging you on certain keywords and duplicating your entire post as I’ve spotted on my cartoon blog, I suggest you write to the blog owner and tell him not to do it. Failing that, and I have done this a couple of times already too, report them to google with details (Google have a DMCA section that addresses copyright infringement issues particularly for this kind of thing).



  9. Thanks for the prompt reply!

    I just finished tracking 3 of them back. Two were on-topic blogs but the 3rd was definitely just scraping feeds. I have the blogs on feedburner, technorati and feedage as well as a couple of ping services. So they are taking our content which I don’t mind so much as long as the article links are left intact.

    But I am wondering about the wisdom of doing these feeds as it seems to me content posts could easily get spidered on another site sooner than ours and we lose credit for the content??? Although we get stuff spidered pretty fast on our site (I’ve seen as low as 20 minutes).

    Anyway I’ve done some reading and come to the conclusion that this is spam and installed spam free on one site to see how it handles it.

    But it also occurred to me that it isn’t all bad. The pingbacks are all moderated. That means a link exists to us UNTIL I APPROVE IT. Assuming they have any google position at all that gives a link in our favour. After a week or so I just delete it. Temporary perhaps but at least we get the advantage of what they’re doing :-)

    But then again as I said I’m new to this so perhaps I’m missing something.


  10. Thanks for the great info. Received my first pingback spam today and had no idea what it was. I thought it was cool to have another site linking to mine, but now I’ve read your comments on splogging, I can see it’s not really that cool at all!

  11. spams are really really annoying!

    let me just share you guys this story, the first month of my blog was plagued with spam comments EVERY SINGLE DAY. I installed captcha and it quieted down things..

    then the trackbacks began. didn’t know back then that you could spam trackbacks (these things are like pingbacks but apparently in a trackback spam, it is possible that the spammer only leaves a trackback on your site but your site is nowhere to be found on their site) so I installed a plugin that checks trackbacks if they are legit (i.e if my link is really where it should be) after the plugin, trackbacks stopped.

    now I think it’s pingbacks as this morning I have checked and there’s 3 to moderate. it’s starting to get on my nerves!

    I don’t want to be forced to close comments, trackbacks and pingbacks as I know these are ways to reward my readers and also for them to get in touch with me. it should encourage healthy interaction between the author and the reader. I just hate it when I feel taken advantage of.

    • Spam is part of life. Akismet (a spam plugin for WordPress) catches 95% of the spam that comes through here. Because I moderate ALL comments, I catch the rest. It’s not bad enough to shut down comments and I do wish I could allow comments to be automatically approved. Manual moderation is the answer I’ve chosen. It takes about 5 minutes each morning to go through them for all of my sites.
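
The trackback check described in comment 11 (confirming that the page sending the trackback really does link to the post) could look something like the sketch below. It is not the plugin the commenter installed, which isn't named here. The names `LinkCollector` and `links_to` are hypothetical, the example URL is a placeholder, and the sketch assumes the remote page's HTML has already been downloaded into a string.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_to(page_html: str, target_url: str) -> bool:
    """Return True if page_html contains an <a href> pointing at target_url.

    Only a trailing slash is ignored when comparing; redirects, URL
    shorteners, and relative links are out of scope for this sketch.
    """
    collector = LinkCollector()
    collector.feed(page_html)
    wanted = target_url.rstrip("/")
    return any(link.rstrip("/") == wanted for link in collector.links)


if __name__ == "__main__":
    page = '<p>See <a href="https://example.com/fighting-spam/">this post</a>.</p>'
    print(links_to(page, "https://example.com/fighting-spam"))
```

A trackback whose source page fails this check is the case the commenter describes: the spammer leaves a trackback on your site while your link appears nowhere on theirs.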
