Posted to dev@apr.apache.org by Stas Bekman <st...@stason.org> on 2005/08/19 03:25:23 UTC

how to make apr_bucket_read() read only as much as I need?

When reading from most bucket types you want to read all the data. This is 
not the case with socket buckets, since the amount of data on the socket 
can be anything. At the moment the code hardcodes 8K as the amount of data 
to read. What do I do if I want to control how much data is read from the socket?

Before I figure out how to do that in the generic case, I was trying to 
solve the problem for the time being by making a copy of socket_bucket_read 
and adjusting it to accept the upper limit as an argument. The problem is 
that I can't get the copy to work the same way as the original, even before 
making any changes. E.g. if I had this code in core_input_filter:

   rv = apr_bucket_read(e, &str, &len, block);

and I replace it with:

   rv = apr_bucket_socket_read_stas(e, &str, &len, block);

which is an exact copy of socket_bucket_read, I don't get any data 
read. What am I missing here?

Thanks!

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

Re: how to make apr_bucket_read() read only as much as I need?

Posted by Colm MacCarthaigh <co...@stdlib.net>.
On Mon, Aug 22, 2005 at 01:56:27PM +0100, Joe Orton wrote:
> > Also, while we are at it, do you remember why 8000 was chosen as the size 
> > of the buffer? (I remember why it wasn't 8K, but I wonder why not 2k or 
> > 4k.) Is it because it perfectly fits the memory page? But how do you 
> > ensure that it always starts at the page boundary and doesn't hang over 
> > the end of the previous page?
> 
> IIRC 8000 was picked because you can fit about that much in a single jumbo 
> ethernet frame on a gigE network, something like that.

Hmmm, 8000 isn't a very good value if that was the reasoning. Most
jumboframe deployments are 9000 bytes end to end (based on Cisco's
choice of supporting 9128 bytes at transit level, and Broadcom's 9000
bytes at the lower end), giving payloads of 8944 bytes (assuming IPv4
and TCP). 
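
The payload figure depends on which headers are counted. As a rough sketch of
the arithmetic, assuming an IPv4 header without options (20 bytes) and a TCP
header of 20 bytes plus a variable number of option bytes (the function name
is purely illustrative):

```c
#include <assert.h>

/* Minimum header sizes without options, in bytes. */
enum { IPV4_HEADER_MIN = 20, TCP_HEADER_MIN = 20 };

/* TCP payload that fits in one IP packet of the given MTU.
 * tcp_options is the number of TCP option bytes in use (0 to 40). */
int tcp_payload(int mtu, int tcp_options)
{
    assert(tcp_options >= 0 && tcp_options <= 40);
    return mtu - IPV4_HEADER_MIN - TCP_HEADER_MIN - tcp_options;
}
```

With no TCP options a 9000-byte MTU leaves 8960 bytes of payload. The 8944
figure corresponds to 56 bytes of overhead, for example 16 bytes of TCP
options on top of the minimum headers; that breakdown is an inference, not
something stated in the message.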

Jumboframe deployments of 16k aren't uncommon either (most Intel kit supports
16k, for example), though not on the Internet.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: how to make apr_bucket_read() read only as much as I need?

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Aug 19, 2005 at 01:56:30PM -0700, Stas Bekman wrote:
> Joe Orton wrote:
> >On Fri, Aug 19, 2005 at 12:07:37AM -0700, Stas Bekman wrote:
> >>We do care, because we are writing a throttling filter for the SMTP module 
> >>and want to slow down the spammers. Reading 8K defeats the purpose of the 
> >>throttling filter (we read about 32 bytes and then sleep), since most spam 
> >>messages will be consumed whole by the server side. The filter will be 
> >>available on CPAN once completed.
> >
> >Hmmm, tricky.  I would suspect it would be better to achieve this using 
> >socket options, though I'm not sure how feasible that is; e.g. try 
> >setting SO_RCVBUF to 32.  You really want to prevent the TCP stack from 
> >buffering and ACKing the data as well if you want to throttle the sender 
> >from the word go.
> 
> But that still won't prevent the bucket socket read function from reading 
> more than 32 bytes if, while it drains the buffer, the OS keeps on reading 
> data in. Is that correct? Or is it certain that the smaller buffer size 
> will guarantee that a read from the socket never returns more than that?

I'd think so, yes, but I'm not really sure; it would be worth some 
experimentation.  It would be interesting to hear whether this works 
anyway.

> Also, while we are at it, do you remember why 8000 was chosen as the size 
> of the buffer? (I remember why it wasn't 8K, but I wonder why not 2k or 
> 4k.) Is it because it perfectly fits the memory page? But how do you 
> ensure that it always starts at the page boundary and doesn't hang over 
> the end of the previous page?

IIRC 8000 was picked because you can fit about that much in a single jumbo 
ethernet frame on a gigE network, something like that.  I don't think 
page alignment was considered to matter.

Regards,

joe

Re: how to make apr_bucket_read() read only as much as I need?

Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Fri, Aug 19, 2005 at 12:07:37AM -0700, Stas Bekman wrote:
> 
>>Joe Orton wrote:
>>
>>>On Thu, Aug 18, 2005 at 06:25:23PM -0700, Stas Bekman wrote:
>>>
>>>
>>>>When reading from most bucket types you want to read all the data. This 
>>>>is not the case with socket buckets, since the amount of data on the 
>>>>socket can be anything. At the moment the code hardcodes 8K as the amount 
>>>>of data to read. What do I do if I want to control how much data is read 
>>>>from the socket?
>>>
>>>
>>>Why do you need to do that?  You shouldn't really care.
>>
>>We do care, because we are writing a throttling filter for the SMTP module 
>>and want to slow down the spammers. Reading 8K defeats the purpose of the 
>>throttling filter (we read about 32 bytes and then sleep), since most spam 
>>messages will be consumed whole by the server side. The filter will be 
>>available on CPAN once completed.
> 
> 
> Hmmm, tricky.  I would suspect it would be better to achieve this using 
> socket options, though I'm not sure how feasible that is; e.g. try 
> setting SO_RCVBUF to 32.  You really want to prevent the TCP stack from 
> buffering and ACKing the data as well if you want to throttle the sender 
> from the word go.

But that still won't prevent the bucket socket read function from reading 
more than 32 bytes if, while it drains the buffer, the OS keeps on reading 
data in. Is that correct? Or is it certain that the smaller buffer size 
will guarantee that a read from the socket never returns more than that?

Also, while we are at it, do you remember why 8000 was chosen as the size 
of the buffer? (I remember why it wasn't 8K, but I wonder why not 2k or 
4k.) Is it because it perfectly fits the memory page? But how do you 
ensure that it always starts at the page boundary and doesn't hang over 
the end of the previous page?

> Otherwise, I guess what you could do here is implement a new bucket 
> type, which is a clone of SOCKET, except it stores a length parameter 
> which is used in the place of APR_BUCKET_BUFF_SIZE on each ->read() 
> invocation; allowing the caller to change the length as necessary using 
> an accessor function.

Right, I could do that. But I'll also need to replace the core_in filter, 
since it uses the current socket bucket functionality.

> You can't change the apr_bucket_read() interface itself so I'm not sure 
> exactly what you were trying to do with the changes you mention.

I wasn't trying to; I was just hoping for a quick hack before a clean 
solution is introduced. But never mind, I'll just work on the clean 
solution when we have time for that.


Re: how to make apr_bucket_read() read only as much as I need?

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Aug 19, 2005 at 12:07:37AM -0700, Stas Bekman wrote:
> Joe Orton wrote:
> >On Thu, Aug 18, 2005 at 06:25:23PM -0700, Stas Bekman wrote:
> >
> >>When reading from most bucket types you want to read all the data. This 
> >>is not the case with socket buckets, since the amount of data on the 
> >>socket can be anything. At the moment the code hardcodes 8K as the amount 
> >>of data to read. What do I do if I want to control how much data is read 
> >>from the socket?
> >
> >
> >Why do you need to do that?  You shouldn't really care.
> 
> We do care, because we are writing a throttling filter for the SMTP module 
> and want to slow down the spammers. Reading 8K defeats the purpose of the 
> throttling filter (we read about 32 bytes and then sleep), since most spam 
> messages will be consumed whole by the server side. The filter will be 
> available on CPAN once completed.

Hmmm, tricky.  I would suspect it would be better to achieve this using 
socket options, though I'm not sure how feasible that is; e.g. try 
setting SO_RCVBUF to 32.  You really want to prevent the TCP stack from 
buffering and ACKing the data as well if you want to throttle the sender 
from the word go.

Otherwise, I guess what you could do here is implement a new bucket 
type, which is a clone of SOCKET, except it stores a length parameter 
which is used in the place of APR_BUCKET_BUFF_SIZE on each ->read() 
invocation; allowing the caller to change the length as necessary using 
an accessor function.
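
To illustrate the shape of that idea without the bucket machinery, here is a
rough sketch over a plain file descriptor. It is not an apr_bucket_type_t
implementation; the limited_reader type and its functions are hypothetical
names, and a real version would mirror socket_bucket_read instead of calling
read() directly:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define LIMITED_READER_MAX 8000  /* same value as APR_BUCKET_BUFF_SIZE */

typedef struct {
    int fd;                        /* descriptor to read from */
    size_t max_len;                /* upper bound for a single read */
    char buf[LIMITED_READER_MAX];  /* storage handed back to the caller */
} limited_reader;

void limited_reader_init(limited_reader *r, int fd)
{
    assert(r != NULL);
    r->fd = fd;
    r->max_len = LIMITED_READER_MAX;  /* default: behave like SOCKET */
}

/* Accessor: change the cap for subsequent reads (clamped to the buffer). */
void limited_reader_set_len(limited_reader *r, size_t max_len)
{
    assert(r != NULL);
    r->max_len = max_len > LIMITED_READER_MAX ? LIMITED_READER_MAX : max_len;
}

/* Read at most r->max_len bytes. On return *str points into r->buf.
 * Returns the byte count, 0 on EOF, or -1 on error. */
ssize_t limited_reader_read(limited_reader *r, const char **str)
{
    assert(r != NULL && str != NULL);
    *str = r->buf;
    return read(r->fd, r->buf, r->max_len);
}
```

The caller keeps using one read call and adjusts the length out of band,
which is why the apr_bucket_read() signature would not need to change.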

You can't change the apr_bucket_read() interface itself so I'm not sure 
exactly what you were trying to do with the changes you mention.

joe

Re: how to make apr_bucket_read() read only as much as I need?

Posted by Stas Bekman <st...@stason.org>.
Joe Orton wrote:
> On Thu, Aug 18, 2005 at 06:25:23PM -0700, Stas Bekman wrote:
> 
>>When reading from most bucket types you want to read all the data. This is 
>>not the case with socket buckets, since the amount of data on the socket 
>>can be anything. At the moment the code hardcodes 8K as the amount of data 
>>to read. What do I do if I want to control how much data is read from the 
>>socket?
> 
> 
> Why do you need to do that?  You shouldn't really care.

We do care, because we are writing a throttling filter for the SMTP module 
and want to slow down the spammers. Reading 8K defeats the purpose of the 
throttling filter (we read about 32 bytes and then sleep), since most spam 
messages will be consumed whole by the server side. The filter will be 
available on CPAN once completed.
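
The read-a-little-then-sleep pattern looks roughly like this. It is a
simplified sketch over a plain file descriptor, not the actual filter; the
function name, the 256-byte buffer and the delay parameter are all
illustrative assumptions:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Read fd to EOF in pieces of at most `chunk` bytes, sleeping delay_ms
 * milliseconds after each piece. Returns the total number of bytes read,
 * or -1 on error; *reads receives the number of non-empty reads. */
long throttled_drain(int fd, size_t chunk, long delay_ms, int *reads)
{
    char buf[256];
    long total = 0;
    struct timespec delay;

    assert(chunk > 0 && chunk <= sizeof(buf) && reads != NULL);
    delay.tv_sec = delay_ms / 1000;
    delay.tv_nsec = (delay_ms % 1000) * 1000000L;
    *reads = 0;

    for (;;) {
        ssize_t n = read(fd, buf, chunk);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return total;       /* peer closed the connection */
        total += n;
        (*reads)++;
        /* A real filter would hand buf on before pausing. */
        nanosleep(&delay, NULL);
    }
}
```

Draining 100 bytes with a 32-byte chunk takes four reads (32, 32, 32, 4).
This only paces the application; the kernel may still accept and buffer more
data from the sender in the meantime.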


Re: how to make apr_bucket_read() read only as much as I need?

Posted by Joe Orton <jo...@redhat.com>.
On Thu, Aug 18, 2005 at 06:25:23PM -0700, Stas Bekman wrote:
> When reading from most bucket types you want to read all the data. This is 
> not the case with socket buckets, since the amount of data on the socket 
> can be anything. At the moment the code hardcodes 8K as the amount of data 
> to read. What do I do if I want to control how much data is read from the 
> socket?

Why do you need to do that?  You shouldn't really care.

joe