Posted to dev@roller.apache.org by Allen Gilliland <Al...@Sun.COM> on 2006/01/03 21:18:23 UTC

Proposal: Asynchronous Referrer Processing

This is already linked on the Roller 2.2 proposal page, but I thought 
I'd send it out directly as well.

http://rollerweblogger.org/wiki/Wiki.jsp?page=AsynchronousReferrerProcessing

This will allow Roller admins to optionally process referrers in an 
asynchronous manner, i.e. not tied to the http request/response cycle.
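
A minimal sketch of the general idea, for readers who have not followed the wiki link. This is not the code from the proposal: the class AsyncReferrerRecorder and its methods are hypothetical names, and an in-memory map of counters stands in for the database. The point is only that the request thread hands the referrer off and returns immediately.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: referrers are recorded on a background thread so the
// request thread never waits on storage.
public class AsyncReferrerRecorder {
    // Stand-in for the referrer table (assumption); real code would write to the DB.
    private final Map<String, Integer> hits = new ConcurrentHashMap<>();
    // A single worker thread processes queued referrers in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Called from the request thread; queues the work and returns immediately.
    public void record(String weblogHandle, String referrerUrl) {
        worker.submit(() -> hits.merge(key(weblogHandle, referrerUrl), 1, Integer::sum));
    }

    // Stops accepting work and waits for queued referrers to be processed.
    // Returns false if the timeout elapsed or the wait was interrupted.
    public boolean shutdown(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public int hitCount(String weblogHandle, String referrerUrl) {
        return hits.getOrDefault(key(weblogHandle, referrerUrl), 0);
    }

    private static String key(String weblogHandle, String referrerUrl) {
        return weblogHandle + " " + referrerUrl;
    }
}
```

Because the work is queued, counts are only guaranteed to be visible once the worker has caught up, which is why the sketch exposes shutdown() rather than promising immediate results.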

Thoughts/comments always welcome.

-- Allen

Re: Proposal: Asynchronous Referrer Processing

Posted by Allen Gilliland <Al...@Sun.COM>.
I've gone ahead and moved the referrer spam checking logic into the RefererFilter to make sure that we can send an appropriate response to spammers.  The drawback is that the RefererFilter now requires a lookup of a WebsiteData object per request, and that can be time consuming.

I *very* strongly recommend that we add an L2 Hibernate cache for the WebsiteData object to help counteract this.  Without it, the asynchronous referrer processing is much less effective.

-- Allen


On Wed, 2006-01-04 at 14:40, Allen Gilliland wrote:
> On Tue, 2006-01-03 at 13:51, David M Johnson wrote:
> > On Jan 3, 2006, at 4:01 PM, Matthew Schmidt wrote:
> > > Definitely useful, but I question how we plan on blocking requests from
> > > referrers that are bad?  If everything is pushed into the queue, wouldn't
> > > the request just continue as normal with the blacklist processing happening
> > > later?
> > 
> > Yes, that appears to be a shortcoming of this proposal.
> > 
> > If we want to answer referrer spammers with a 403 access denied, as  
> > we do now, then I guess we could do something like this: when the  
> > request comes in, check it against the blacklist, which is in memory.  
> > If it matches, then pitch it out with a 403. Otherwise, put it in the  
> > queue for storage in the DB.
> > 
> > With that approach, we'd still do some work for each referrer but we  
> > wouldn't have to hit the DB.
> 
> I don't mind doing that, but currently the spam checker stuff wants a
> full WebsiteData object passed in to do the spam check, and that means a
> trip to the db.  So we would need a way to check the blacklist without
> requiring any objects from the db.
> 
> I don't see anywhere that caches a weblog-specific blacklist, so
> I'm not sure how to make that work.  However we hack this, it
> couldn't check a weblog-specific blacklist.  Maybe it's good
> enough even if we don't check the weblog's custom blacklist?
> 
> Another idea is to create a special SpamFilter which would check the
> spam itself and return 403 responses.  The problem is still the same
> though, we wouldn't want to put that in front of the cache filters
> because then you are hitting the db on every request just to check for
> referrer spam.  So that wouldn't work unless it was specifically
> designed to cache the weblog customized blacklists.  If the custom
> blacklists are cached then it would probably be okay to put it as one of
> the first filters in line.  I don't know how big those blacklists could
> get though.
> 
> -- Allen
> 
> 
> > 
> > - Dave
> > 
> > 
> > >
> > > -Matt
> > >
> > > -----Original Message-----
> > > From: Allen Gilliland [mailto:Allen.T.Gilliland@Sun.COM]
> > > Sent: Tuesday, January 03, 2006 3:18 PM
> > > To: roller-dev@incubator.apache.org
> > > Subject: Proposal: Asynchronous Referrer Processing
> > >
> > > This is already linked on the Roller 2.2 proposal page, but I thought
> > > I'd send it out directly as well.
> > >
> > > http://rollerweblogger.org/wiki/Wiki.jsp?page=AsynchronousReferrerProcessing
> > >
> > > This will allow Roller admins to optionally process referrers in an
> > > asynchronous manner, i.e. not tied to the http request/response cycle.
> > >
> > > Thoughts/comments always welcome.
> > >
> > > -- Allen
> > 
> 


Re: Proposal: Asynchronous Referrer Processing

Posted by Allen Gilliland <Al...@Sun.COM>.
On Tue, 2006-01-03 at 13:51, David M Johnson wrote:
> On Jan 3, 2006, at 4:01 PM, Matthew Schmidt wrote:
> > Definitely useful, but I question how we plan on blocking requests from
> > referrers that are bad?  If everything is pushed into the queue, wouldn't
> > the request just continue as normal with the blacklist processing happening
> > later?
> 
> Yes, that appears to be a shortcoming of this proposal.
> 
> If we want to answer referrer spammers with a 403 access denied, as  
> we do now, then I guess we could do something like this: when the  
> request comes in, check it against the blacklist, which is in memory.  
> If it matches, then pitch it out with a 403. Otherwise, put it in the  
> queue for storage in the DB.
> 
> With that approach, we'd still do some work for each referrer but we  
> wouldn't have to hit the DB.

I don't mind doing that, but currently the spam checker stuff wants a
full WebsiteData object passed in to do the spam check, and that means a
trip to the db.  So we would need a way to check the blacklist without
requiring any objects from the db.

I don't see anywhere that caches a weblog-specific blacklist, so
I'm not sure how to make that work.  However we hack this, it
couldn't check a weblog-specific blacklist.  Maybe it's good
enough even if we don't check the weblog's custom blacklist?

Another idea is to create a special SpamFilter which would check the
spam itself and return 403 responses.  The problem is still the same
though, we wouldn't want to put that in front of the cache filters
because then you are hitting the db on every request just to check for
referrer spam.  So that wouldn't work unless it was specifically
designed to cache the weblog customized blacklists.  If the custom
blacklists are cached then it would probably be okay to put it as one of
the first filters in line.  I don't know how big those blacklists could
get though.
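
One way the cached custom blacklists could look, as a rough sketch only. Everything here is an assumption for illustration: WeblogBlacklistCache is a hypothetical class, the loader function stands in for whatever db lookup would fetch a weblog's blacklist, and matching is plain substring matching. The cache is bounded (least recently used weblog evicted first) precisely because it is unclear how big the blacklists could get.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: a bounded, per-weblog blacklist cache so a filter can
// check referrers without a db trip on every request.
public class WeblogBlacklistCache {
    private final int maxWeblogs;
    private final Function<String, List<String>> loader; // stand-in for the db lookup
    private final Map<String, List<String>> cache;
    private int loads = 0; // how many times the loader was called

    public WeblogBlacklistCache(int maxWeblogs, Function<String, List<String>> loader) {
        this.maxWeblogs = maxWeblogs;
        this.loader = loader;
        // Access-ordered LinkedHashMap: the eldest entry is the least recently used.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > WeblogBlacklistCache.this.maxWeblogs;
            }
        };
    }

    // True if the referrer contains any term on the weblog's blacklist.
    public synchronized boolean isSpam(String weblogHandle, String referrerUrl) {
        List<String> terms = cache.get(weblogHandle);
        if (terms == null) {
            // Cache miss: load once, then serve later requests from memory.
            terms = new ArrayList<>(loader.apply(weblogHandle));
            cache.put(weblogHandle, terms);
            loads++;
        }
        for (String term : terms) {
            if (referrerUrl.contains(term)) {
                return true;
            }
        }
        return false;
    }

    // Call when a weblog's blacklist is edited so the next check reloads it.
    public synchronized void invalidate(String weblogHandle) {
        cache.remove(weblogHandle);
    }

    public synchronized int loadCount() {
        return loads;
    }
}
```

With something like this in place, only the first request for a given weblog (or the first after its blacklist is edited) would pay for the lookup.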

-- Allen


> 
> - Dave
> 
> 
> >
> > -Matt
> >
> > -----Original Message-----
> > From: Allen Gilliland [mailto:Allen.T.Gilliland@Sun.COM]
> > Sent: Tuesday, January 03, 2006 3:18 PM
> > To: roller-dev@incubator.apache.org
> > Subject: Proposal: Asynchronous Referrer Processing
> >
> > This is already linked on the Roller 2.2 proposal page, but I thought
> > I'd send it out directly as well.
> >
> > http://rollerweblogger.org/wiki/Wiki.jsp?page=AsynchronousReferrerProcessing
> >
> > This will allow Roller admins to optionally process referrers in an
> > asynchronous manner, i.e. not tied to the http request/response cycle.
> >
> > Thoughts/comments always welcome.
> >
> > -- Allen
> 


Re: Proposal: Asynchronous Referrer Processing

Posted by David M Johnson <Da...@Sun.COM>.
On Jan 3, 2006, at 4:01 PM, Matthew Schmidt wrote:
> Definitely useful, but I question how we plan on blocking requests from
> referrers that are bad?  If everything is pushed into the queue, wouldn't
> the request just continue as normal with the blacklist processing happening
> later?

Yes, that appears to be a shortcoming of this proposal.

If we want to answer referrer spammers with a 403 access denied, as  
we do now, then I guess we could do something like this: when the  
request comes in, check it against the blacklist, which is in memory.  
If it matches, then pitch it out with a 403. Otherwise, put it in the  
queue for storage in the DB.

With that approach, we'd still do some work for each referrer but we  
wouldn't have to hit the DB.
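
The flow described above could be sketched roughly like this. It is an illustration under assumptions, not Roller code: ReferrerGate is a hypothetical class, the blacklist is treated as a list of case-insensitive substrings, and a plain in-memory queue stands in for the queue that would later be drained into the DB.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical sketch: check the in-memory blacklist first, answer spam
// with a 403, and queue everything else for later storage.
public class ReferrerGate {
    public static final int SC_OK = 200;
    public static final int SC_FORBIDDEN = 403;

    private final List<String> blacklist = new ArrayList<>(); // lower-cased terms
    private final Queue<String> pending = new ConcurrentLinkedQueue<>();

    public ReferrerGate(List<String> blacklistTerms) {
        for (String term : blacklistTerms) {
            blacklist.add(term.toLowerCase(Locale.ROOT));
        }
    }

    // Returns the status the request should get; never touches the DB.
    public int handle(String referrerUrl) {
        if (referrerUrl == null || referrerUrl.isEmpty()) {
            return SC_OK; // nothing to check or record
        }
        String lower = referrerUrl.toLowerCase(Locale.ROOT);
        for (String term : blacklist) {
            if (lower.contains(term)) {
                return SC_FORBIDDEN; // pitch it out, do not queue it
            }
        }
        pending.add(referrerUrl); // stored later, outside the request cycle
        return SC_OK;
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

Note that spam referrers never reach the queue in this sketch, so the background processing only ever sees referrers that passed the check.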

- Dave


>
> -Matt
>
> -----Original Message-----
> From: Allen Gilliland [mailto:Allen.T.Gilliland@Sun.COM]
> Sent: Tuesday, January 03, 2006 3:18 PM
> To: roller-dev@incubator.apache.org
> Subject: Proposal: Asynchronous Referrer Processing
>
> This is already linked on the Roller 2.2 proposal page, but I thought
> I'd send it out directly as well.
>
> http://rollerweblogger.org/wiki/Wiki.jsp?page=AsynchronousReferrerProcessing
>
> This will allow Roller admins to optionally process referrers in an
> asynchronous manner, i.e. not tied to the http request/response cycle.
>
> Thoughts/comments always welcome.
>
> -- Allen


RE: Proposal: Asynchronous Referrer Processing

Posted by Matthew Schmidt <ma...@javalobby.org>.
Definitely useful, but I question how we plan on blocking requests from
referrers that are bad?  If everything is pushed into the queue, wouldn't
the request just continue as normal with the blacklist processing happening
later?

-Matt

-----Original Message-----
From: Allen Gilliland [mailto:Allen.T.Gilliland@Sun.COM] 
Sent: Tuesday, January 03, 2006 3:18 PM
To: roller-dev@incubator.apache.org
Subject: Proposal: Asynchronous Referrer Processing

This is already linked on the Roller 2.2 proposal page, but I thought 
I'd send it out directly as well.

http://rollerweblogger.org/wiki/Wiki.jsp?page=AsynchronousReferrerProcessing

This will allow Roller admins to optionally process referrers in an 
asynchronous manner, i.e. not tied to the http request/response cycle.

Thoughts/comments always welcome.

-- Allen