Re: [linux-audio-user] Fwd: [Fwd: Wiki Spam Report]


Subject: Re: [linux-audio-user] Fwd: [Fwd: Wiki Spam Report]
From: Ruth A. Kramer (rhkramer_AT_fast.net)
Date: Fri Dec 10 2004 - 05:09:19 EET


Hans Fugal wrote:
> There have been some differing opinions on whether a wiki will attract
> spam and what to do about it. Here's a message about what the
> RubyGarden wiki has experienced and done. Some of you may be familiar
> with Ruby, and know that it is an extremely cool language but not
> (yet) as popular as other languages like perl, python, Java, etc. If
> you haven't heard of it, well that just attests to its not being a
> major player in the language market (yet). Yet they struggle with wiki
> spam.

Hans,

Thanks! Very interesting approach!

I'd like to find out how much time Jim spends dealing with the tarpit.
I may write to him someday, unless you find it convenient to do so.

regards,
Randy Kramer

PS: Unless his efforts take zero time, I'd rather wait till a spam
problem exists on WikiLearn before implementing such an approach. In
the meantime, WikiLearn has, for example, the registration requirement.

>
> ---------- Forwarded message ----------
> From: "Jim Weirich" <jim_AT_weirichhouse.org>
> To: comp.lang.ruby
> Date: Tue, 14 Dec 2004 03:21:02 +0900
> Subject: Wiki Spam Report
> Wiki Spam Report
> ----------------
>
> I thought I would take some time and report on the wiki spam situation
> on RubyGarden. As I hope you have noticed, the wiki has been
> remarkably spam free. This email will tell you what measures we have
> taken to get to this point.
>
> But first ...
>
> Some Numbers
> ------------
>
> Over the past 10 days, we have had:
>
> 93 updates to the wiki page, all (AFAICT) spam free.
> (although I might have missed spotting some).
>
> 46 updates to the wiki tarpit. Of those, we had ...
> 3 innocent updates
> 2 questionable updates
> 1 update by me
> 40 spams
>
> The Mechanism
> -------------
>
> Spammers are automatically routed to a wiki tarpit. The tarpit is an
> (almost) exact copy of the real RubyGarden wiki. Making changes to
> the tarpit looks as if you are making changes to the real wiki. And
> since spammers get their pages from the wiki, it looks (to them) as if
> they have successfully spammed our site.
>
> However, no one else ever gets to see the spam.
>
> By tricking the spammers into thinking they are successful, they don't
> put any additional effort into bypassing our spam detection criteria.
> This is important! When we explicitly denied them access to the wiki,
> they went to great lengths to figure out how to get around the
> restrictions. I haven't seen any of that kind of probing with the
> tarpit.
>
> Detecting Spammers
> ------------------
>
> The current spammer detection logic is based on two observations:
>
> (1) Spammers almost never use an IP address that has reverse lookup
> enabled. This effectively means that it appears (to the wiki
> software) that your host name looks like a numeric IP address.
>
> (2) Spammers almost never set user preferences on the wiki.
>
> So if both of these conditions are true, we treat the access as a spammer
> and send it to the tarpit.
>
> Now this isn't perfect, but that's OK. We also have an explicit ban
> list for spammers who pass one of (1) or (2) above. And we have an
> explicit allow list that overrides the automatic spammer detection.
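
(The rule described above can be sketched roughly as follows. This is only an
illustration in Python, not Jim's actual code: the names route, looks_like_ip,
allow and ban are hypothetical, and checking the allow list before the ban
list is an assumption, since the message only says the allow list overrides
the automatic detection.)

```python
import ipaddress


def looks_like_ip(host):
    """Return True when the reported host name is just a numeric IP address,
    i.e. reverse lookup produced no real host name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def route(ip, host, has_prefs, allow=frozenset(), ban=frozenset()):
    """Decide whether an edit goes to the real wiki or to the tarpit."""
    if ip in allow:        # explicit allow list overrides automatic detection
        return "wiki"
    if ip in ban:          # explicit ban list catches spammers who pass (1) or (2)
        return "tarpit"
    # Automatic rule: both observations (1) and (2) must hold.
    if looks_like_ip(host) and not has_prefs:
        return "tarpit"
    return "wiki"
```

(With this sketch, a visitor whose host shows up as a bare IP address and who
has not set preferences lands in the tarpit; setting preferences or being on
the allow list is enough to reach the real wiki.)
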
>
> Innocent Users
> --------------
>
> Can innocent users get trapped by the tarpit? The short answer is yes.
> However, we are monitoring the tarpit and will attempt to rescue such
> users.
>
> In the past 10 days, there were at least 3 page updates that were from
> innocent users. One guy (bless his heart) even removed some spam from
> the tarpit for us.
>
> When I see innocents trapped in the tarpit, I add their IP address to
> the allow list and manually update the wiki with their changes (if
> they are significant).
>
> Detecting the Tarpit?
> ---------------------
>
> The tarpit is deliberately designed to look like the original wiki, so
> it is sometimes difficult to tell when you are trapped. Here are some
> suggestions.
>
> You are probably in the Tarpit when:
>
> * there are a lot of recent updates made with numeric IP addresses
> rather than host names.
>
> * a lot of the pages have spam.
>
> Neither of these suggestions is foolproof, though. I refresh the tarpit
> from the real wiki occasionally (to keep it looking realistic).
> Immediately after a refresh it is /very/ difficult to tell the difference.
>
> If you think you are trapped by the tarpit, send me
> (jim_AT_weirichhouse.org) an email with your IP address and I will check
> the logs. If you are trapped, we can add your IP address to the allow
> list.
>
> If you are worried about getting caught in the tarpit, just make sure you
> have your user preferences set when accessing the wiki (click on the
> preferences link from any wiki page).
>
> Summary
> -------
>
> I am pretty happy with the current wiki situation. In fact, the
> tarpit has been so successful that I am considering lifting the ban
> on lower case http. The ban currently isn't buying us any benefits
> and is rather annoying (I'll make it so both upper and lower case
> work).
>
> Thanks for your time.
>
> --
> -- Jim Weirich jim_AT_weirichhouse.org http://onestepback.org
> -----------------------------------------------------------------
> "Beware of bugs in the above code; I have only proved it correct,
> not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)



This archive was generated by hypermail 2b28 : Mon Dec 13 2004 - 23:50:12 EET