You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@apr.apache.org by rb...@apache.org on 2001/07/05 02:58:56 UTC

cvs commit: apr-util/buckets apr_buckets_file.c

rbb         01/07/04 17:58:56

  Modified:    buckets  apr_buckets_file.c
  Log:
  We need to ALWAYS do the seek if we are reading from the file.  This is
  unfortunate from a performance perspective, but right now, I can have an
  offset of 0 in the bucket, but be referring to a file that has been read
  from.  If we don't seek before reading from the bucket, we get invalid
  data.
  
  Revision  Changes    Path
  1.49      +4 -6      apr-util/buckets/apr_buckets_file.c
  
  Index: apr_buckets_file.c
  ===================================================================
  RCS file: /home/cvs/apr-util/buckets/apr_buckets_file.c,v
  retrieving revision 1.48
  retrieving revision 1.49
  diff -u -d -b -w -u -r1.48 -r1.49
  --- apr_buckets_file.c	2001/06/27 20:15:01	1.48
  +++ apr_buckets_file.c	2001/07/05 00:58:56	1.49
  @@ -163,12 +163,10 @@
       buf = malloc(*len);
   
       /* Handle offset ... */
  -    if (fileoffset) {
           rv = apr_file_seek(f, APR_SET, &fileoffset);
           if (rv != APR_SUCCESS) {
               free(buf);
               return rv;
  -        }
       }
       rv = apr_file_read(f, buf, len);
       if (rv != APR_SUCCESS && rv != APR_EOF) {
  
  
  

Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> > > Maybe there is something very fundamental that I a missing here.
> > > Each 8K chunk that is read causes the system to increment its file
> > > pointer for that open fd. You should not need to call seek() to do
> > > something the system is already doing for you under the covers.
> >
> >
> >    apr_file_t *f = ...;
> >    apr_bucket *a, *b, *c, *d;
> >
> >    /* split the bucket into hunks of 100 bytes each */
> >
> >    a = apr_bucket_file_create(f, 0, len, pool);
> >    b = apr_bucket_split(a, 100);
> >    c = apr_bucket_split(b, 100);
> >    d = apr_bucket_split(c, 100);
> >
> >    APR_BUCKET_INSERT_AFTER(a, c);
> >
> >    apr_bucket_destroy(b);
> >    apr_bucket_destroy(d);
> >
> > You can't guarantee that consecutive reads from file buckets read from the
> > "next" spot in the file.  In fact, they very frequently jump around.
>
> Give me a common example of reads from file buckets jumping around.

Any small byterange request.

Ryan

_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Cliff Woolley wrote:

> The chunk filter.

s/chunk/byterange/

Duh.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> > You can't guarantee that consecutive reads from file buckets read from the
> > "next" spot in the file.  In fact, they very frequently jump around.
>
> Give me a common example of reads from file buckets jumping around.

The chunk filter.  Or any filter for that matter.  Any time
apr_bucket_split() or apr_bucket_copy() [or even apr_bucket_delete()] are
used.

I've been trying to think of *any* possible way that the buckets code
could know whether the file pointer is at the "right spot" or not.
Possibly some kind of flag in the apr_bucket_file that indicates whether
any of the above operations have been called.  But no matter what approach
I come up with, I always come up with cases where it just won't work.

If you're really desperate to optimize the case where we read straight
through the file [and can't use sendfile and can't use mmap], I think the
best way to do it is to put another conditional in apr_file_seek that
checks to see if the file handle is already at the right place and if so
returns immediately, skipping the call to lseek() or whatever.  But
lseek() is probably doing the same thing anyhow...

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
> > Maybe there is something very fundamental that I a missing here.
> > Each 8K chunk that is read causes the system to increment its file
> > pointer for that open fd. You should not need to call seek() to do
> > something the system is already doing for you under the covers.
> 
> 
>    apr_file_t *f = ...;
>    apr_bucket *a, *b, *c, *d;
> 
>    /* split the bucket into hunks of 100 bytes each */
> 
>    a = apr_bucket_file_create(f, 0, len, pool);
>    b = apr_bucket_split(a, 100);
>    c = apr_bucket_split(b, 100);
>    d = apr_bucket_split(c, 100);
> 
>    APR_BUCKET_INSERT_AFTER(a, c);
> 
>    apr_bucket_destroy(b);
>    apr_bucket_destroy(d);
> 
> You can't guarantee that consecutive reads from file buckets read from the
> "next" spot in the file.  In fact, they very frequently jump around.

Give me a common example of reads from file buckets jumping around.

Bill



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> Maybe there is something very fundamental that I a missing here.
> Each 8K chunk that is read causes the system to increment its file
> pointer for that open fd. You should not need to call seek() to do
> something the system is already doing for you under the covers.


   apr_file_t *f = ...;
   apr_bucket *a, *b, *c, *d;

   /* split the bucket into hunks of 100 bytes each */

   a = apr_bucket_file_create(f, 0, len, pool);
   b = apr_bucket_split(a, 100);
   c = apr_bucket_split(b, 100);
   d = apr_bucket_split(c, 100);

   APR_BUCKET_INSERT_AFTER(a, c);

   apr_bucket_destroy(b);
   apr_bucket_destroy(d);

You can't guarantee that consecutive reads from file buckets read from the
"next" spot in the file.  In fact, they very frequently jump around.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
> PS: The common case for this segment of code is that the offset will be
> non-zero, since it reads in from a file in 8KB hunks, incrementing the
> offset each time.  So removing the conditional actually (very slightly)
> improves the performance of this section for the common case.  Only the
> case where offset == 0 might be negatively impacted.
>
> --Cliff

Maybe there is something very fundamental that I a missing here.  Each 8K chunk that is
read causes the system to increment its file pointer for that open fd. You should not need
to call seek() to do something the system is already doing for you under the covers.

Bill


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

>
> > On Thu, 5 Jul 2001, Bill Stoddard wrote:
> >
> > > I am teetering on a -1 for this patch. You are hacking around a more
> > > fundamental problem. If we cannot fix problems like this w/o impacting
> > > the performance of all applications that need to read files, then APR
> > > is seriously broken.
> >
> > Well, that's probably true.  But please don't -1 the patch.  If you're
> > going to -1 something, -1 APR's lack of an apr_file_read()-like function
> > that takes an offset as a parameter.  Until APR has such a beast (which
> > will just have to do the lock and seek on its own), this is the correct
> > patch to the buckets, which are broken without it.
> >
>
> We routinely veto short term hacks in favor of fixing the problem the right way (I've had
> a few of my own hacks vetoed and rightfully so :-). The problem with allowing hacks like
> this in is that they tend to accumulate and the proper fix never gets implemented.

This isn't the wrong solution.  I'm sorry, but it isn't.  Adding an
apr_file_read_with_offset is another possible solution, but all that does
is the exact same thing under one function.  The logic doesn't change,
just the location of the if.

Ryan

_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
> On Thu, 5 Jul 2001, Bill Stoddard wrote:
>
> > I am teetering on a -1 for this patch. You are hacking around a more
> > fundamental problem. If we cannot fix problems like this w/o impacting
> > the performance of all applications that need to read files, then APR
> > is seriously broken.
>
> Well, that's probably true.  But please don't -1 the patch.  If you're
> going to -1 something, -1 APR's lack of an apr_file_read()-like function
> that takes an offset as a parameter.  Until APR has such a beast (which
> will just have to do the lock and seek on its own), this is the correct
> patch to the buckets, which are broken without it.
>

We routinely veto short term hacks in favor of fixing the problem the right way (I've had
a few of my own hacks vetoed and rightfully so :-). The problem with allowing hacks like
this in is that they tend to accumulate and the proper fix never gets implemented.

Bill


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> I am teetering on a -1 for this patch. You are hacking around a more
> fundamental problem. If we cannot fix problems like this w/o impacting
> the performance of all applications that need to read files, then APR
> is seriously broken.

Well, that's probably true.  But please don't -1 the patch.  If you're
going to -1 something, -1 APR's lack of an apr_file_read()-like function
that takes an offset as a parameter.  Until APR has such a beast (which
will just have to do the lock and seek on its own), this is the correct
patch to the buckets, which are broken without it.

PS: The common case for this segment of code is that the offset will be
non-zero, since it reads in from a file in 8KB hunks, incrementing the
offset each time.  So removing the conditional actually (very slightly)
improves the performance of this section for the common case.  Only the
case where offset == 0 might be negatively impacted.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA




Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
> > > I am teetering on a -1 for this patch. You are hacking around a more fundamental
> problem.
> > > If we cannot fix problems like this w/o impacting the performance of all applications
> that
> > > need to read files, then APR is seriously broken.
> >
> > Huh?  If I have a file that I put into a bucket, and I set the offset to
> > 0, so that the bucket refers to the whole file, then the bucket code needs
> > to respect that, regardless of what else I do with the file.  This is
> > about making the code orthogonal, so that if I read from the file
> > someplace else, it doesn't effect my bucket logic.
> >
>
> If you perform a read on a file and don't specifiy an offset, then you should assume you

Bill, I DID specify an offset.  I specified the offset 0, the start of the
file.

> will be reading from the current file pointer maintained by the system (or by apr_file_t
> in the XTHREAD case).  If you have an apr_file_t open and you are reading from the file
> someplace else using the fd, then you are screwed. You shouldn't be mixing apr_file_*
> calls with non apr_file_* calls on the same fd. If you insist on doing this, then you need

I am only using apr_file_* calls on this file.  The problem is that I
opened the file, then I read from it to parse the whole file.  Finally,
after finding the information I needed, I decided to put it into a bucket
with an offset of 0, because that was where the information I needed
began.

> to ensure that your non apr_file_* calls leaves the file pointer in the proper state when
> they are done.  You definitely shouldn't be horking up apr_file* calls to defend against
> this case.

Ryan

_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Roy T. Fielding wrote:

> Either define two different file bucket types or add a variable that
> indicates relative vs fixed.  Fixed fds have to be seeked to the
> position of each read, whereas relative fds skip from the current
> position to the offset.  File buckets are therefore relative until
> split.

I thought about that.  But there'd be a *lot* of code duplication.  The
only thing that would be different would be whether or not apr_file_seek()
gets called and the logic to convert from one type to another.  Plus I'm
not convinced that there's any reliable way to know when to do the
relative->fixed swapover.  It's not just when you split() the bucket.
Problems can occur with copy, split, and delete, at least.  But just the
fact that you copy, split, or delete doesn't mean that you HAVE to seek,
it just means that you can no longer guarantee that you don't have to
seek.

It seems like it'd be a lot easier if we just stuck a conditional at the
top of apr_file_seek() that returns immediately if the offset stashed in
the apr_file_t (which is assumed to be the same as the one in-kernel,
which is a valid assumption if you don't mix apr_file_* and non-apr_file_*
calls on the same file handle) is the same as the offset passed into
apr_file_seek().  So we're talking about one functional call and one
conditional return.  It's a tradeoff between that very small performance
penalty and much cleaner and more reliable file buckets.

I'm dislike the idea of duplicating the file buckets in this way.  But I'm
willing to change my mind if you can show me a way to do it that doesn't
duplicate much code and that works in all cases.  <shrug>

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by "Roy T. Fielding" <fi...@ebuilt.com>.
Either define two different file bucket types or add a variable that indicates
relative vs fixed.  Fixed fds have to be seeked to the position of each read,
whereas relative fds skip from the current position to the offset.  File
buckets are therefore relative until split.

....Roy


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001 rbb@covalent.net wrote:

> I agree, but Bill is saying that in http, we can usually count on the OS
> file offset being correct, so he wants a performance enhancement for that
> case.  I have no problem adding an enhancement for that special case,
> assuming it is called out as a special case.

Okay.  I have no problem enhancing it if we can figure out some way to
make it work.  Ideally, there would be some way to know "I'm still reading
sequentially from this file, don't bother seeking," and to gracefully fall
back on seeking if needed.

> I just want to be sure that we don't back out the fix that went in
> yesterday, so I am offering options that might work for everybody.  :-)
> I do not agree that the special case will always work, sooner or later
> a filter will break the special case IMO, but I don't want to force my
> opinions on everybody.

Fair enough.  =-)

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
> > Perhaps just having a different offset, like -1 to specify that you should
> > always use the OS's file offset.
>
> Hmm... that *only* works if you can guarantee that no other code (eg, some
> filter) is going muck with the brigade.  IMO you're much better off
> keeping track of the offset than counting on reading from the file
> sequentially without backtracking and without skipping any.  Buckets
> should work regardless of which buckets come before or after them in the
> brigade.

I agree, but Bill is saying that in http, we can usually count on the OS
file offset being correct, so he wants a performance enhancement for that
case.  I have no problem adding an enhancement for that special case,
assuming it is called out as a special case.

I just want to be sure that we don't back out the fix that went in
yesterday, so I am offering options that might work for everybody.  :-)

I do not agree that the special case will always work, sooner or later a
filter will break the special case IMO, but I don't want to force my
opinions on everybody.

Ryan
_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001 rbb@covalent.net wrote:

> Perhaps just having a different offset, like -1 to specify that you should
> always use the OS's file offset.

Hmm... that *only* works if you can guarantee that no other code (eg, some
filter) is going muck with the brigade.  IMO you're much better off
keeping track of the offset than counting on reading from the file
sequentially without backtracking and without skipping any.  Buckets
should work regardless of which buckets come before or after them in the
brigade.

> I agree whole-heartedly that we need to care about performance.  However,
> the need to be performance aware can NEVER interfere with making the code
> work correctly.

Exactly.


--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> What specific problem are we trying  to fix?  I agree for the need for a seek() in the
> case you are describing below but it seems to be an unusual case, at least for the
> webserver. We need to focus on making APR work efficiently for the most common cases and
> resist the urge to create an overgeneralized solutions to every problem.  Apache httpd
> already has serious performance problems and patches like this are not helping us move in
> the right direction/

I agree whole-heartedly that we need to care about performance.  However,
the need to be performance aware can NEVER interfere with making the code
work correctly.  Also, APR has nothing to do with a web server.  The first
application was a web server, but that isn't the only application.  APR is
being used in Apache modules, Subversion, and I happen to know that Rob
McCool is using APR for some of his projects.

If you are going to veto this patch, please at least suggest a way to
get the same feature without it.  Not having the buckets respect the
offset that I passed in breaks MY code.

Perhaps just having a different offset, like -1 to specify that you should
always use the OS's file offset.

Ryan

_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
What specific problem are we trying  to fix?  I agree for the need for a seek() in the
case you are describing below but it seems to be an unusual case, at least for the
webserver. We need to focus on making APR work efficiently for the most common cases and
resist the urge to create an overgeneralized solutions to every problem.  Apache httpd
already has serious performance problems and patches like this are not helping us move in
the right direction/

Bill

> On Thu, 5 Jul 2001, Bill Stoddard wrote:
>
> > If you perform a read on a file and don't specifiy an offset, then you
> > should assume you will be reading from the current file pointer
> > maintained by the system (or by apr_file_t in the XTHREAD case).  If
> > you have an apr_file_t open and you are reading from the file
> > someplace else using the fd, then you are screwed. You shouldn't be
> > mixing apr_file_* calls with non apr_file_* calls on the same fd. If
> > you insist on doing this, then you need to ensure that your non
> > apr_file_* calls leaves the file pointer in the proper state when they
> > are done.  You definitely shouldn't be horking up apr_file* calls to
> > defend against this case.
>
> [scratching head]
>
> I don't think we're talking about mixing apr_file_* and non apr_file_*
> calls.  A file bucket *always* specifies an offset [although sometimes
> that offset might be 0].
>
> Consider this:
>
>     apr_file_t *f = ...
>     apr_bucket *a, *b;
>
>     a = apr_bucket_file_create(f, 0, len, pool);
>     b = apr_bucket_split(a, 500);
>
>     APR_BUCKET_INSERT_BEFORE(a, b);
>
> Now you have two file buckets referencing the same apr_file_t, one at
> offset 0 and one at offset 500.  The one at offset 500 is BEFORE the one
> at offset 0 in the brigade.  When you read from the buckets in the brigade
> (assume for a minute that !APR_HAS_MMAP), you get to the offset==500
> bucket first, seek to offset 500, and read from there.  Then you get to
> the offset==0 bucket, so you sure as hell better seek back to offset 0
> before you read!  Not doing so is a bug in the file buckets code, not in
> APR.  If you want to combine the apr_file_seek/apr_file_read calls, that's
> fine, but you'll STILL need to have removed this offset-test conditional.
> So Ryan's patch is a correct fix either way, not just a hack.
>
> --Cliff
>
>
> --------------------------------------------------------------
>    Cliff Woolley
>    cliffwoolley@yahoo.com
>    Charlottesville, VA
>
>
>


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> If you perform a read on a file and don't specifiy an offset, then you
> should assume you will be reading from the current file pointer
> maintained by the system (or by apr_file_t in the XTHREAD case).  If
> you have an apr_file_t open and you are reading from the file
> someplace else using the fd, then you are screwed. You shouldn't be
> mixing apr_file_* calls with non apr_file_* calls on the same fd. If
> you insist on doing this, then you need to ensure that your non
> apr_file_* calls leaves the file pointer in the proper state when they
> are done.  You definitely shouldn't be horking up apr_file* calls to
> defend against this case.

[scratching head]

I don't think we're talking about mixing apr_file_* and non apr_file_*
calls.  A file bucket *always* specifies an offset [although sometimes
that offset might be 0].

Consider this:

    apr_file_t *f = ...
    apr_bucket *a, *b;

    a = apr_bucket_file_create(f, 0, len, pool);
    b = apr_bucket_split(a, 500);

    APR_BUCKET_INSERT_BEFORE(a, b);

Now you have two file buckets referencing the same apr_file_t, one at
offset 0 and one at offset 500.  The one at offset 500 is BEFORE the one
at offset 0 in the brigade.  When you read from the buckets in the brigade
(assume for a minute that !APR_HAS_MMAP), you get to the offset==500
bucket first, seek to offset 500, and read from there.  Then you get to
the offset==0 bucket, so you sure as hell better seek back to offset 0
before you read!  Not doing so is a bug in the file buckets code, not in
APR.  If you want to combine the apr_file_seek/apr_file_read calls, that's
fine, but you'll STILL need to have removed this offset-test conditional.
So Ryan's patch is a correct fix either way, not just a hack.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA




Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
> On Thu, 5 Jul 2001, Bill Stoddard wrote:
>
> > I am teetering on a -1 for this patch. You are hacking around a more fundamental
problem.
> > If we cannot fix problems like this w/o impacting the performance of all applications
that
> > need to read files, then APR is seriously broken.
>
> Huh?  If I have a file that I put into a bucket, and I set the offset to
> 0, so that the bucket refers to the whole file, then the bucket code needs
> to respect that, regardless of what else I do with the file.  This is
> about making the code orthogonal, so that if I read from the file
> someplace else, it doesn't effect my bucket logic.
>

If you perform a read on a file and don't specifiy an offset, then you should assume you
will be reading from the current file pointer maintained by the system (or by apr_file_t
in the XTHREAD case).  If you have an apr_file_t open and you are reading from the file
someplace else using the fd, then you are screwed. You shouldn't be mixing apr_file_*
calls with non apr_file_* calls on the same fd. If you insist on doing this, then you need
to ensure that your non apr_file_* calls leaves the file pointer in the proper state when
they are done.  You definitely shouldn't be horking up apr_file* calls to defend against
this case.

Bill



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
On Thu, 5 Jul 2001, Bill Stoddard wrote:

> I am teetering on a -1 for this patch. You are hacking around a more fundamental problem.
> If we cannot fix problems like this w/o impacting the performance of all applications that
> need to read files, then APR is seriously broken.

Huh?  If I have a file that I put into a bucket, and I set the offset to
0, so that the bucket refers to the whole file, then the bucket code needs
to respect that, regardless of what else I do with the file.  This is
about making the code orthogonal, so that if I read from the file
someplace else, it doesn't effect my bucket logic.

Besides, we are already into the "the performance of this operation sucks"
segment of the code.

Also, I have a feeling that if the file pointer is already at the correct
location, then the C Run-Time will just return, without doing anything.

Ryan

> > rbb         01/07/04 17:58:56
> >
> >   Modified:    buckets  apr_buckets_file.c
> >   Log:
> >   We need to ALWAYS do the seek if we are reading from the file.  This is
> >   unfortunate from a performance perspective, but right now, I can have an
> >   offset of 0 in the bucket, but be referring to a file that has been read
> >   from.  If we don't seek before reading from the bucket, we get invalid
> >   data.


_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Bill Stoddard <bi...@wstoddard.com>.
I am teetering on a -1 for this patch. You are hacking around a more fundamental problem.
If we cannot fix problems like this w/o impacting the performance of all applications that
need to read files, then APR is seriously broken.

Bill

----- Original Message -----
From: <rb...@apache.org>
To: <ap...@apache.org>
Sent: Wednesday, July 04, 2001 8:58 PM
Subject: cvs commit: apr-util/buckets apr_buckets_file.c


> rbb         01/07/04 17:58:56
>
>   Modified:    buckets  apr_buckets_file.c
>   Log:
>   We need to ALWAYS do the seek if we are reading from the file.  This is
>   unfortunate from a performance perspective, but right now, I can have an
>   offset of 0 in the bucket, but be referring to a file that has been read
>   from.  If we don't seek before reading from the bucket, we get invalid
>   data.
>
>   Revision  Changes    Path
>   1.49      +4 -6      apr-util/buckets/apr_buckets_file.c
>
>   Index: apr_buckets_file.c
>   ===================================================================
>   RCS file: /home/cvs/apr-util/buckets/apr_buckets_file.c,v
>   retrieving revision 1.48
>   retrieving revision 1.49
>   diff -u -d -b -w -u -r1.48 -r1.49
>   --- apr_buckets_file.c 2001/06/27 20:15:01 1.48
>   +++ apr_buckets_file.c 2001/07/05 00:58:56 1.49
>   @@ -163,12 +163,10 @@
>        buf = malloc(*len);
>
>        /* Handle offset ... */
>   -    if (fileoffset) {
>            rv = apr_file_seek(f, APR_SET, &fileoffset);
>            if (rv != APR_SUCCESS) {
>                free(buf);
>                return rv;
>   -        }
>        }
>        rv = apr_file_read(f, buf, len);
>        if (rv != APR_SUCCESS && rv != APR_EOF) {
>
>
>
>


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On Wed, 4 Jul 2001 rbb@covalent.net wrote:

> > Hoooo... good catch.  I don't know why I never thought of that before.
> > <sigh>  You're right that it sucks for performance, but then again if we
> > get to this point we've given up on mmaping the file so we're in the
> > "worst case" performance scenario anyway.
>
> I only caught it because it exposed a bug in mod_pop3.  :-)

I'd imagine that we've never noticed it before because you'd only see it
if you took a file bucket, split and/or copied it, and read from a LATER
part of the file BEFORE reading from the beginning of the file, which
isn't common in current Apache modules.  Maybe we should come up with a
mod_scramble/mod_unscramble to test for weird stuff like this.  =-)

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by rb...@covalent.net.
> >   We need to ALWAYS do the seek if we are reading from the file.  This is
> >   unfortunate from a performance perspective, but right now, I can have an
> >   offset of 0 in the bucket, but be referring to a file that has been read
> >   from.  If we don't seek before reading from the bucket, we get invalid
> >   data.
> >
> >        /* Handle offset ... */
> >   -    if (fileoffset) {
> >            rv = apr_file_seek(f, APR_SET, &fileoffset);
> >            if (rv != APR_SUCCESS) {
> >                free(buf);
> >                return rv;
> >   -        }
> >        }
>
> Hoooo... good catch.  I don't know why I never thought of that before.
> <sigh>  You're right that it sucks for performance, but then again if we
> get to this point we've given up on mmaping the file so we're in the
> "worst case" performance scenario anyway.

I only caught it because it exposed a bug in mod_pop3.  :-)

Ryan

_____________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
Covalent Technologies			rbb@covalent.net
-----------------------------------------------------------------------------


Re: cvs commit: apr-util/buckets apr_buckets_file.c

Posted by Cliff Woolley <cl...@yahoo.com>.
On 5 Jul 2001 rbb@apache.org wrote:

>   We need to ALWAYS do the seek if we are reading from the file.  This is
>   unfortunate from a performance perspective, but right now, I can have an
>   offset of 0 in the bucket, but be referring to a file that has been read
>   from.  If we don't seek before reading from the bucket, we get invalid
>   data.
>
>        /* Handle offset ... */
>   -    if (fileoffset) {
>            rv = apr_file_seek(f, APR_SET, &fileoffset);
>            if (rv != APR_SUCCESS) {
>                free(buf);
>                return rv;
>   -        }
>        }

Hoooo... good catch.  I don't know why I never thought of that before.
<sigh>  You're right that it sucks for performance, but then again if we
get to this point we've given up on mmaping the file so we're in the
"worst case" performance scenario anyway.

--Cliff


--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA