Posted to dev@allura.apache.org by Cory Johns <cj...@slashdotmedia.com> on 2013/07/16 19:00:44 UTC

Anti-Spam form-field renaming (without EasyWidgets?)

I know Allura has an anti-spam middleware in place that renames form fields
automagically to non-human readable, but I have a couple of questions
regarding how it works.

Firstly, I'm adding a new form and eschewing EasyWidgets for various
reasons.  I noticed that the form fields were not getting renamed, so I'm
wondering if there's something I need to do to avail my form of the
anti-spam magic, or if it's something that's tied to EW?

Secondly, I noticed that the field ID values aren't changed, which makes
sense as it would make writing javascript difficult, but I wonder how much
protection renaming the field names but not the IDs actually provides.  I
guess the point is to block bots that just replay the form submission, but
is it really that much of an obstacle to request the form and extract the
field names by ID first?
 Is this a case of "any hurdle we can throw up helps, no matter how small?"
 Is it perhaps a hold-over of earlier attempts at spam prevention and is
maybe less relevant now?

Actually, I noticed that changing the field name to the original,
un-magicked, field name via the debugger and then submitting the form
actually works fine.  Since Allura is open-source, the original field names
are easy to discover and it seems that the field renaming is entirely moot.

Should I just not worry about ensuring that the field renaming magic works
on my new form?

Re: Anti-Spam form-field renaming (without EasyWidgets?)

Posted by Dave Brondsema <db...@slashdotmedia.com>.
On Wed, Jul 17, 2013 at 11:20 AM, Cory Johns <cj...@slashdotmedia.com> wrote:

> On Tue, Jul 16, 2013 at 4:19 PM, Dave Brondsema <
> dbrondsema@slashdotmedia.com> wrote:
>
> >
> > That does seem like an issue and something we should fix.  Perhaps it's
> > because there's an option to turn off this antispam field renaming, so
> > the controllers handle it either way.  But would be better to not allow
> > the original names if antispam is on for the form.
> >
>
> I think this is just because allura.lib.utils.AntiSpam.validate() adds the
> un-obfuscated fields to the params but doesn't remove them if they're
> already there and the obfuscated fields are not present.  Should be an easy
> fix, but will require quite a bit of testing to make sure we're not
> inadvertently relying on this somewhere.
>
> Also, what do you mean about the option to turn off the field renaming?  I
> don't see anything about that in the AntiSpam class.
>
>
>

 I misremembered.  I was thinking of the disable_csrf_protection flag,
which is unrelated.


-- 
Dave Brondsema
Principal Software Engineer - sourceforge.net
Dice Holdings, Inc.

Re: Anti-Spam form-field renaming (without EasyWidgets?)

Posted by Cory Johns <cj...@slashdotmedia.com>.
On Tue, Jul 16, 2013 at 4:19 PM, Dave Brondsema <
dbrondsema@slashdotmedia.com> wrote:


> In app_globals.py, g.antispam is set up as the main class that does the
> work.  And it looks like the ForgeForm EW class
> in Allura/allura/lib/widgets/forms.py hooks into that.  Thus everything
> that inherits from ForgeForm (most if not all of our forms, I'd think) gets
> that functionality.
>

g.antispam looks easy enough to use; just use
"{{g.antispam.enc('field_name')}}" instead of "field_name" in your
templates and iterate over g.antispam.extra_fields() for the honeypot
fields, then add @allura.lib.utils.AntiSpam.validate('Spam protection') to
your post handler.  This is exactly what I was looking for, thanks.
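
For illustration, here is a minimal standalone sketch of the rendering side
of that recipe: field names derived from a per-request spinner, plus
honeypot inputs.  It does not import Allura and is not Allura's
implementation.  The function names enc and extra_fields only echo the
names mentioned above, while SECRET, make_spinner, render_form, and the
hashing scheme are assumptions made up for the example.

```python
import hashlib
import hmac
import time

SECRET = b"example-secret"  # hypothetical; a real site would load this from config


def make_spinner(timestamp: int, client_ip: str) -> str:
    """Per-request token tying field names to one rendering of the form."""
    msg = f"{timestamp}:{client_ip}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def enc(plain_name: str, spinner: str) -> str:
    """Obfuscate a field name; the result changes whenever the spinner does."""
    msg = f"{spinner}:{plain_name}".encode()
    # Prefix with a letter so the value always starts like a normal identifier.
    return "f" + hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]


def extra_fields(spinner: str, count: int = 2) -> list[str]:
    """Names for honeypot inputs that humans never see and must leave empty."""
    return [enc(f"honey-{i}", spinner) for i in range(count)]


def render_form(field_names: list[str], client_ip: str) -> str:
    """Emit hidden timestamp/spinner inputs, the real fields, then honeypots."""
    timestamp = int(time.time())
    spinner = make_spinner(timestamp, client_ip)
    lines = [
        f'<input type="hidden" name="timestamp" value="{timestamp}">',
        f'<input type="hidden" name="spinner" value="{spinner}">',
    ]
    lines += [f'<input type="text" name="{enc(n, spinner)}">' for n in field_names]
    lines += [
        f'<input type="text" name="{h}" style="display:none">'
        for h in extra_fields(spinner)
    ]
    return "\n".join(lines)
```

Calling render_form(["project_name"], "203.0.113.7") produces markup in
which the literal name "project_name" never appears; the post handler would
have to recompute the same names from the submitted spinner to decode them.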

>
> > Actually, I noticed that changing the field name to the original,
> > un-magicked, field name via the debugger and then submitting the form
> > actually works fine.  Since Allura is open-source, the original field
> > names are easy to discover and it seems that the field renaming is
> > entirely moot.
> >
>
> That does seem like an issue and something we should fix.  Perhaps it's
> because there's an option to turn off this antispam field renaming, so the
> controllers handle it either way.  But would be better to not allow the
> original names if antispam is on for the form.
>

I think this is just because allura.lib.utils.AntiSpam.validate() adds the
un-obfuscated fields to the params but doesn't remove them if they're
already there and the obfuscated fields are not present.  Should be an easy
fix, but will require quite a bit of testing to make sure we're not
inadvertently relying on this somewhere.
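
A rough sketch of what a stricter decode step could look like, assuming
the behaviour described above.  This is not the actual validate() code or
a tested patch for it; decode_params, SpamCheckFailed, SECRET, and the
hashing in enc are all hypothetical stand-ins.

```python
import hashlib
import hmac

SECRET = b"example-secret"  # hypothetical; stands in for the site's real secret


class SpamCheckFailed(ValueError):
    """Raised when a submission does not look like it came from our form."""


def enc(plain_name: str, spinner: str) -> str:
    """Hypothetical obfuscation: a keyed hash of the spinner and the plain name."""
    msg = f"{spinner}:{plain_name}".encode()
    return "f" + hmac.new(SECRET, msg, hashlib.sha256).hexdigest()[:16]


def decode_params(params: dict, spinner: str, expected: list[str]) -> dict:
    """Map obfuscated names back to plain ones, refusing plain names outright."""
    decoded = {}
    for plain in expected:
        # The lenient behaviour lets a pre-existing plain field through;
        # here it is treated as a sign that the form was bypassed.
        if plain in params:
            raise SpamCheckFailed(f"plain field name submitted: {plain}")
        obfuscated = enc(plain, spinner)
        if obfuscated not in params:
            raise SpamCheckFailed(f"missing obfuscated field for: {plain}")
        decoded[plain] = params[obfuscated]
    return decoded
```

With this shape, a request carrying {"summary": "..."} directly is rejected,
while one carrying {enc("summary", spinner): "..."} decodes normally.  Any
caller that currently relies on plain names getting through would start
failing, which is where the testing effort would go.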

Also, what do you mean about the option to turn off the field renaming?  I
don't see anything about that in the AntiSpam class.



Now, I'd like to preface the rest of my reply by making it clear that I'm
not actually advocating discontinuing the use of allura.lib.utils.AntiSpam.
 It does get the low-hanging fruit, with regard to spam bots, and it's
easy enough to use that I do think it's worth it.  You answered my real
question above, the rest of this is just me ruminating on its actual
effectiveness in practice.

> Note that everywhere I say "antispam" I mean the anti-bot mechanisms, which
> are unrelated to the spam filtering plugin support for Akismet, etc. that
> we also have in Allura (which is hooked up to just a few forms).
>

Of course, Akismet (or any content-based filtering) will only work on
content-based forms, such as discussion posts, tickets, wiki pages, etc.
 Places like the add_project form will never benefit from Akismet
filtering, unfortunately.

> http://nedbatchelder.com/text/stopbots.html is the basis for our
> implementation.  A spam script that was aware of all our prevention
> mechanisms certainly could account for them and succeed.  But there are
> lots of bots that are simplistic, so these barriers block them, and it's
> worth it.
>

There are a couple of issues with that, however.

First, that post is over 6 years old at this point.  Even generic spam bots
have had a lot of time to improve, and something like basing the field
value on the label element's text (which is clearly and consistently tied
to the field, as it should be for accessibility) is not a difficult
innovation, and is one that would be useful on many sites.  The bots still
have to fetch the form once per post to get the spinner field, but the
consistent ID and coupled label render the honeypot and field name
obfuscation significantly less effective.

Second, that post is clearly addressing the problem of un-targeted spam, as
evinced by the quote, "Spammers don't make software that can post to any
form, they make software that can post to many forms."  But with the Allura
platform being open source and SourceForge being a well-known site that
uses it, it unfortunately doesn't really fall into the "un-targeted"
category.  While it might not be worth a spammer's while to put specific
effort in to spam Ned's site(s), it certainly seems like it would be to
target sites powered by Allura.

Re: Anti-Spam form-field renaming (without EasyWidgets?)

Posted by Dave Brondsema <db...@slashdotmedia.com>.
Note that everywhere I say "antispam" I mean the anti-bot mechanisms, which
are unrelated to the spam filtering plugin support for Akismet, etc. that we
also have in Allura (which is hooked up to just a few forms).

On Tue, Jul 16, 2013 at 1:00 PM, Cory Johns <cj...@slashdotmedia.com> wrote:

> I know Allura has an anti-spam middleware in place that renames form fields
> automagically to non-human readable, but I have a couple of questions
> regarding how it works.
>
> Firstly, I'm adding a new form and eschewing EasyWidgets for various
> reasons.  I noticed that the form fields were not getting renamed, so I'm
> wondering if there's something I need to do to avail my form of the
> anti-spam magic, or if it's something that's tied to EW?
>

In app_globals.py, g.antispam is set up as the main class that does the
work.  And it looks like the ForgeForm EW class
in Allura/allura/lib/widgets/forms.py hooks into that.  Thus everything
that inherits from ForgeForm (most if not all of our forms, I'd think) gets
that functionality.


>
> Secondly, I noticed that the field ID values aren't changed, which makes
> sense as it would make writing javascript difficult, but I wonder how much
> protection renaming the field names but not the IDs actually provides.  I
> guess the point is to block bots that just replay the form submission, but
> is it really that much of an obstacle to request the form and extract the
> field names by ID first?
>  Is this a case of "any hurdle we can throw up helps, no matter how small?"
>  Is it perhaps a hold-over of earlier attempts at spam prevention and is
> maybe less relevant now?
>

http://nedbatchelder.com/text/stopbots.html is the basis for our
implementation.  A spam script that was aware of all our prevention
mechanisms certainly could account for them and succeed.  But there are
lots of bots that are simplistic, so these barriers block them, and it's
worth it.
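
As a hedged illustration of the kind of barriers involved, the sketch below
checks a submission against a timestamp, a per-request spinner, and honeypot
fields, which is how the approach in the linked article is usually
summarized.  It is a standalone example rather than Allura's code;
MAX_AGE_SECONDS, SECRET, the "timestamp" and "spinner" form keys, and both
function names are assumptions.

```python
import hashlib
import hmac

SECRET = b"example-secret"   # hypothetical server-side secret
MAX_AGE_SECONDS = 3600       # hypothetical limit on how old a rendered form may be


def make_spinner(timestamp: int, client_ip: str) -> str:
    """Keyed hash binding a rendered form to a time and a client address."""
    msg = f"{timestamp}:{client_ip}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def looks_like_bot(form: dict, client_ip: str,
                   honeypot_names: list[str], now: int) -> bool:
    """Return True if the submission fails any of the simple anti-bot checks."""
    try:
        timestamp = int(form.get("timestamp", ""))
    except (TypeError, ValueError):
        return True  # missing or mangled timestamp
    if not 0 <= now - timestamp <= MAX_AGE_SECONDS:
        return True  # replayed long after the form was fetched, or from the future
    expected = make_spinner(timestamp, client_ip)
    submitted = str(form.get("spinner", ""))
    if not hmac.compare_digest(expected.encode(), submitted.encode()):
        return True  # spinner forged or copied from another client
    # Honeypots are hidden from humans, so any value means a script filled them.
    return any(form.get(name) for name in honeypot_names)
```

None of these checks stops a script written specifically against the site,
but each one filters out a class of simplistic bot: blind replays, form
fillers that populate every input, and submissions that never fetched the
form at all.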


>
> Actually, I noticed that changing the field name to the original,
> un-magicked, field name via the debugger and then submitting the form
> actually works fine.  Since Allura is open-source, the original field names
> are easy to discover and it seems that the field renaming is entirely moot.
>

That does seem like an issue and something we should fix.  Perhaps it's
because there's an option to turn off this antispam field renaming, so the
controllers handle it either way.  But would be better to not allow the
original names if antispam is on for the form.


>
> Should I just not worry about ensuring that the field renaming magic works
> on my new form?
>

If we want to prevent spam bots from abusing that form, we should use the
g.antispam mechanisms.  More broadly, if we're starting to implement some
forms not using EW (I know there is some general dislike for EW, so seems
like a reasonable thing to start doing), then it'd be good to
establish some patterns for doing that, including making it easy to use the
antispam functionality.


-- 
Dave Brondsema
Principal Software Engineer - sourceforge.net
Dice Holdings, Inc.