Posted to dev@jspwiki.apache.org by Andrew Jaquith <an...@gmail.com> on 2010/01/05 16:46:29 UTC

CAPTCHA handling -- quick update

Hi all --

Just thought I'd send a quick update on CAPTCHA. Janne and I have had
some back-channel conversations about enhancements that I needed to
make.

Functionally, here's how the revised system will work:

- CAPTCHAs will be rendered on the same page as the submitting form,
but by default if the previous post contains spam (this is in line
with Janne's comments)
- CAPTCHA-rendering will be the responsibility of the wiki:SpamProtect
tag (as before)
- wiki:SpamProtect must be added as a child of a form or stripes:form
element (as before)
- If the JSP author wishes, they may require a CAPTCHA by adding an
attribute challenge="captcha" to the SpamProtect tag (new)
- In addition, a form can require password confirmation by adding
attribute challenge="password" to the SpamProtect tag (new)
- All of the back-end processing will be done by SpamInterceptor, in
collaboration with the content-inspection system (as before)
- Stripes ActionBeans that require spam protection need only add a
@SpamProtect annotation to the target event methods (as before)
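
To make the last bullet concrete, here is a rough sketch of an event
method marked for spam protection. It is not the JSPWiki or Stripes
source: the SpamProtect annotation is a stand-in declared locally so the
example compiles on its own, and EditSketchBean, save, cancel and
isProtected are hypothetical names. A real interceptor would hook into
the Stripes lifecycle; a reflective lookup stands in for that here.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

class SpamProtectSketch {

    // Stand-in for the real annotation (assumption: method-level, kept at runtime).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface SpamProtect {}

    // Hypothetical action bean: only the "save" event is spam-protected.
    static class EditSketchBean {
        @SpamProtect
        public String save() { return "saved"; }

        public String cancel() { return "cancelled"; }
    }

    // The question an interceptor might ask before letting an event method run.
    static boolean isProtected(Class<?> beanClass, String eventName) {
        for (Method m : beanClass.getMethods()) {
            if (m.getName().equals(eventName)
                    && m.isAnnotationPresent(SpamProtect.class)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("save protected:   " + isProtected(EditSketchBean.class, "save"));
        System.out.println("cancel protected: " + isProtected(EditSketchBean.class, "cancel"));
    }
}
```

The point of the annotation approach is that the bean author opts in per
event method and writes no spam-handling code of their own.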

We will add the SpamProtect tag to the page-edit form, comment form,
new user registration form, and user profile form. For new user
registration, a CAPTCHA will likely be required (challenge=captcha).
For user profile changes and post-install wiki configuration (coming
soon!), the user's password will be required to confirm
(challenge=password).

So, that's the functional design -- nice and simple. And we knock out
some JIRA bugs while we're at it (e.g., confirm password for account
changes)...

Andrew

Re: CAPTCHA handling -- quick update

Posted by Janne Jalkanen <ja...@ecyrd.com>.

Ah, this is excellent. Low impact, yet flexible.

The technique you mention is quite useful for two reasons:

1) It works well when someone eats your cookies.  Quite a few gateways do this.
2) Since state is transported in all requests, it makes the HTTP requests stateless, which means that this approach scales really, really well. Though of course it needs to be applied consistently everywhere...

/Janne

On 7 Jan 2010, at 16:53, Andrew Jaquith wrote:

> Oh, and one more thing. This basic technique -- encrypt some sort of
> state object, write it out as a hidden parameter to the form, then
> extract/decrypt on POST -- is something I gleaned from looking through
> the Stripes code. They do a lot of "state smuggling" as an alternative
> to storing server-side session attributes. I think it's a nice,
> low-overhead technique for situations like forms, which are
> essentially stateful. I use this technique also for smuggling the
> parameter names used for the spam tokens, for example.

Re: CAPTCHA handling -- quick update

Posted by Andrew Jaquith <an...@gmail.com>.

Janne,

I picked the really nice option.  :) The solution is that when a post
contains spam, we redirect to the editor page, but request a CAPTCHA
be displayed. Re-editing is allowed.

Here is how it works. There are two collaborating parts: the
SpamProtectTag and the SpamInterceptor. This is where we do a little
magic. :)

Let's say you've loaded the editor for the first time (i.e., you
haven't submitted). What we do is write out a special parameter, a
"challenge request," when SpamProtectTag executes. The contents, for
the FIRST GET, contain the string value of the enum
Challenge.Request.CHALLENGE_ON_DEMAND. This means "no CAPTCHA is
required, but when we interpret the post, get ready to generate one
after redirect if there's spam in it." Then, we encrypt the parameter
using CryptoUtil.

When SpamInterceptor intercepts the POST, we then look for the special
challenge-request parameter. Two things can happen: a normal user
submits (in which case the challenge-request parameter will be there),
or a spammer submits (in which case it will not be).

In the normal case, we extract the challenge-request parameter,
decrypt the contents and figure out that its value was
CHALLENGE_ON_DEMAND. Because it has this value, we do NOT run the
Captcha validator. We always run the content Inspection. If it
contains spam, we add a ValidationError. If not, we return a null
Resolution, the "save" event method executes further down the chain,
and we are done.

Now, let's look at the spammer case.

If the challenge-request parameter is not present in the request, we
KNOW that the user has been naughty, or that it is a spammer. So we
add a ValidationError and redirect to the editor again.

On the second GET (i.e., after the POST and redirect back to the
editor page), the SpamProtectTag executes again. This time, it knows
there was spam because of the ValidationError, and this time will
write out the enum Challenge.Request.CAPTCHA, which means "I just
rendered a CAPTCHA, and when SpamInterceptor intercepts the post,
validate it." Thus, when SpamInterceptor handles the post next time
around, when it sees the CAPTCHA value it knows that it should do the
CAPTCHA check.

(and then we lather, rinse, repeat until the user submits a correct
CAPTCHA value)

That might sound complicated, but it's not -- the code is dead simple.
The key is that the SpamProtectTag writes the current state out to the
challenge-request parameter: CAPTCHA_ON_DEMAND is written out for the
first-time GET, and on subsequent GETs, CAPTCHA will be written out if
the contents are spam. All SpamInterceptor needs to do is obtain what
the state was by retrieving and decrypting the challenge-request
param.
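
As a rough illustration of the interceptor side described above, the
decision can be written as one small function. This is a sketch, not the
actual SpamInterceptor: the class, the decide method and the Outcome enum
are hypothetical names. The thread spells the first-GET value both
CHALLENGE_ON_DEMAND and CAPTCHA_ON_DEMAND; the sketch arbitrarily uses
the latter. It also assumes that once a challenge has been rendered,
passing it is enough for the post to go through, which the thread
implies but does not state outright.

```java
class InterceptorSketch {

    // The state the tag wrote into the challenge-request parameter.
    enum State { CAPTCHA_ON_DEMAND, CAPTCHA, PASSWORD }

    // SAVE: no validation error, the event method runs.
    // REDISPLAY_WITH_ERROR: add a validation error and go back to the form.
    enum Outcome { SAVE, REDISPLAY_WITH_ERROR }

    /**
     * @param state           decrypted challenge-request value, or null if the
     *                        parameter was missing or unreadable
     * @param contentIsSpam   result of content inspection
     * @param challengePassed whether the submitted CAPTCHA or password checked out
     */
    static Outcome decide(State state, boolean contentIsSpam, boolean challengePassed) {
        if (state == null) {
            // No parameter at all: the form was not produced by the tag.
            return Outcome.REDISPLAY_WITH_ERROR;
        }
        switch (state) {
            case CAPTCHA_ON_DEMAND:
                // No challenge was shown, so only the content decides.
                return contentIsSpam ? Outcome.REDISPLAY_WITH_ERROR : Outcome.SAVE;
            case CAPTCHA:
            case PASSWORD:
                // A challenge was shown, so it must be answered correctly.
                return challengePassed ? Outcome.SAVE : Outcome.REDISPLAY_WITH_ERROR;
            default:
                throw new IllegalStateException("unknown state " + state);
        }
    }

    public static void main(String[] args) {
        System.out.println(decide(State.CAPTCHA_ON_DEMAND, false, false));
        System.out.println(decide(State.CAPTCHA_ON_DEMAND, true, false));
        System.out.println(decide(State.CAPTCHA, true, true));
        System.out.println(decide(null, false, false));
    }
}
```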

There is one other wrinkle here: the JSP author can set the
SpamProtectTag attribute "challenge" to force a password check or a
CAPTCHA in all cases. In that case, we will write out the value
Challenge.Request.CAPTCHA or Challenge.Request.PASSWORD and render the
Challenge right away, even on that first post.
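
The tag side can be sketched the same way: given the optional
"challenge" attribute and whether the previous post produced a spam
validation error, pick the state to write out. Again, this is only an
illustration under assumptions. TagStateSketch and stateToWrite are
hypothetical names, and rejecting unknown attribute values is a choice
made for this sketch rather than something the thread specifies.

```java
class TagStateSketch {

    enum State { CAPTCHA_ON_DEMAND, CAPTCHA, PASSWORD }

    /**
     * @param challengeAttr value of the tag's "challenge" attribute, or null if absent
     * @param hasSpamError  true if the previous post left a spam validation error
     */
    static State stateToWrite(String challengeAttr, boolean hasSpamError) {
        if ("captcha".equals(challengeAttr)) {
            return State.CAPTCHA;            // forced by the JSP author
        }
        if ("password".equals(challengeAttr)) {
            return State.PASSWORD;           // forced by the JSP author
        }
        if (challengeAttr != null) {
            throw new IllegalArgumentException("unsupported challenge: " + challengeAttr);
        }
        // Default behaviour: challenge only after a post that contained spam.
        return hasSpamError ? State.CAPTCHA : State.CAPTCHA_ON_DEMAND;
    }

    public static void main(String[] args) {
        System.out.println(stateToWrite(null, false));
        System.out.println(stateToWrite(null, true));
        System.out.println(stateToWrite("password", false));
    }
}
```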

Naming-wise, I've gone back and forth about what the right names for
everything should be. At the moment, I think Challenge.Request might
better be called Challenge.State. :) Maybe CAPTCHA_ON_DEMAND becomes
CHALLENGE_NOT_RENDERED, CAPTCHA becomes CAPTCHA_RENDERED, PASSWORD
becomes PASSWORD_RENDERED? Not sure.

Oh, and one more thing. This basic technique -- encrypt some sort of
state object, write it out as a hidden parameter to the form, then
extract/decrypt on POST -- is something I gleaned from looking through
the Stripes code. They do a lot of "state smuggling" as an alternative
to storing server-side session attributes. I think it's a nice,
low-overhead technique for situations like forms, which are
essentially stateful. I use this technique also for smuggling the
parameter names used for the spam tokens, for example.
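
For readers who have not seen the pattern, here is a minimal sketch of
"state smuggling" using only JDK classes. It does not show CryptoUtil,
whose API is not described in this thread; AES-GCM is swapped in as one
reasonable way to get both secrecy and tamper detection. The class name
and the newKey, seal and open methods are hypothetical, and key
management and replay protection are left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

class StateSmugglingSketch {

    private static final int IV_BYTES = 12;   // usual IV size for GCM
    private static final int TAG_BITS = 128;  // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Assumption: a server-side key that never leaves the server.
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypt the state; the result is what goes into the hidden form field.
    static String seal(String state, SecretKey key) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(state.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
            return Base64.getUrlEncoder().withoutPadding().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Decrypt on POST; returns null if the value is missing, malformed or tampered with.
    static String open(String param, SecretKey key) {
        if (param == null) {
            return null;
        }
        try {
            byte[] in = Base64.getUrlDecoder().decode(param);
            if (in.length <= IV_BYTES) {
                return null;
            }
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        String hidden = seal("CAPTCHA_ON_DEMAND", key);
        System.out.println("hidden field value: " + hidden);
        System.out.println("state read on POST: " + open(hidden, key));
    }
}
```

Because the server holds the key, a client cannot forge or edit the
state, and the server does not need a session to remember it.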

Long post! Hope it made sense.

Andrew

On Thu, Jan 7, 2010 at 3:26 AM, Janne Jalkanen <Ja...@ecyrd.com> wrote:
>
> Errr... How do we determine what is a previous post? Spambots tend to make
> each request from a  different address and ignore cookies. Or is it so that
> if the post is determined to contain spam, you get a redirect to the editor
> page, but this time with a captcha? 'cos that would be really nice, since it
> allows you to re-edit the content.
>
> /Janne

Re: CAPTCHA handling -- quick update

Posted by Janne Jalkanen <Ja...@ecyrd.com>.

Errr... How do we determine what is a previous post? Spambots tend to
make each request from a different address and ignore cookies. Or is
it so that if the post is determined to contain spam, you get a
redirect to the editor page, but this time with a captcha? 'cos that
would be really nice, since it allows you to re-edit the content.

/Janne

On Jan 5, 2010, at 18:10 , Andrew Jaquith wrote:

> Small correction (this is what happens when you type too quickly) --
>
> CAPTCHAs are rendered, by default, ONLY if the previous post contains
> spam. The missing "only" makes all the difference. :)
>
> The important point is that we are treating spam, essentially, as a
> form validation error.
>
> If you don't submit spam, it won't produce a validation error, so you
> won't see a CAPTCHA. (Unless the JSP requires it, for example, when
> creating a user account).
>
> Andrew

Re: CAPTCHA handling -- quick update

Posted by Harry Metske <ha...@gmail.com>.

sounds good to me

thanks,
Harry


Re: CAPTCHA handling -- quick update

Posted by Andrew Jaquith <an...@gmail.com>.

Small correction (this is what happens when you type too quickly) --

CAPTCHAs are rendered, by default, ONLY if the previous post contains
spam. The missing "only" makes all the difference. :)

The important point is that we are treating spam, essentially, as a
form validation error.

If you don't submit spam, it won't produce a validation error, so you
won't see a CAPTCHA. (Unless the JSP requires it, for example, when
creating a user account).

Andrew

On Tue, Jan 5, 2010 at 10:46 AM, Andrew Jaquith
<an...@gmail.com> wrote:
> - CAPTCHAs will be rendered on the same page as the submitting form,
> but by default if the previous post contains spam (this is in line
> with Janne's comments)