Posted to dev@httpd.apache.org by Cliff Woolley <cl...@yahoo.com> on 2000/11/24 03:03:34 UTC

Fwd: file bucket native operations

Crap, I forgot that buckets aren't part of APR, so I guess this should have gone to
this list instead of the APR list.  (Most of my questions were APR-related anyway,
but whatever.)

Happy Thanksgiving!

--Cliff


--- Cliff Woolley <cl...@yahoo.com> wrote:
> From: Cliff Woolley <cl...@yahoo.com>
> Subject: file bucket native operations
> To: dev@apr.apache.org
> CC: gstein@lyra.org
> 
> 
> Why is it that we currently cannot split file buckets natively?  (Is it a sendfile
> limitation?)
> 
> Let me just say ahead of time that I'm not trying to start another massive
> debate... if somebody has a good reason that this cannot be done, I'll shut
> up about it.  It just seems that it's a lot easier to do with file buckets
> than with pipes and sockets, since the split can be done natively with no
> read required (as the requirement for split functions is currently
> defined... speaking of which, is anybody ever going to commit the
> ap_bucket_split_any() patch?)
> 
> In trying to implement such a function in conversation with OtherBill, the only
> big problem I ran into was that a seek/read sequence on an fd is not threadsafe
> if there are two file buckets in existence that point to the same fd, since
> it breaks the assumptions that file_read() makes about the current location
> of the file pointer.
> 
> OtherBill suggested that maybe the second file bucket should point to a dup'ed
> file handle rather than the same file handle.  At first glance, that'd be a
> great solution, but at least some OSes cause dup'ed file handles to share
> file pointers/flags/etc, so dup'ing doesn't necessarily help in this
> situation.  I don't know if that's a standard behavior of dup() or not
> (somebody please enlighten me); if it is, then it makes sense for
> apr_dupfile() to not change that behavior.  But if dup() would fix this
> problem on some systems, then it probably makes sense for apr_dupfile()
> to somehow guarantee that it's fixed on all systems.  In that case,
> file_split() would just dup the file handle (a future file_copy() would want
> to do the same thing).
> 
> As it stands though, it seems that the only way to make it possible to have
> multiple file buckets point to the same fd is to serialize reads, seeking to
> the right spot in the file before reading from it.  For every read.  That
> sucks.  (It's too bad that there's no version of apr_read() that takes as
> a parameter the offset into the file from which you want to begin reading...
> that'd be another way around this problem.  Of course, it would have to deal
> with the same problems I'm dealing with now, so that's probably not such a
> good idea.)
> 
> As I mentioned, a future file_copy operation would want to do the same thing. 
> That assumes that there is still interest in implementing a copy operation
> on all bucket types (which I think is a Good Thing).  If there is such an
> interest, I'd like to do it before APR goes to Beta, since it's an API change.
> Somebody just say the word, and I'll do the work.
> 
> 
> So I guess I'm asking two questions here:
> (1) Does anybody have any bright ideas that would help in implementing split() and
> a future copy() on file buckets, given the problems in reading non-sequentially
> from the file?
> (2) Is there still an interest in adding a copy operation to the buckets API?
> 
> 
> Thanks, and Happy Thanksgiving to all...
> 
> --Cliff


__________________________________________________
Do You Yahoo!?
Yahoo! Shopping - Thousands of Stores. Millions of Products.
http://shopping.yahoo.com/

Re: Fwd: file bucket native operations

Posted by rb...@covalent.net.
> So I guess I'm asking two questions here:
> (1) Does anybody have any bright ideas that would help in implementing split() and
> a future copy() on file buckets, given the problems in reading non-sequentially
> from the file?

Native file splits are a bit annoying, but not impossible.  The easiest
way to deal with them is to put an apr_lock_t inside the file type.  As
long as the refcount is 1, that lock should be NULL.  As soon as the
refcount is > 1, the lock needs to be created and used.  The buckets read
code already does the seek when necessary, so that shouldn't be an
issue.  As for the limitations of having two buckets refer to the same
file, and using dup to handle this, I wouldn't worry about that issue.

There are only a few cases we really need to worry about.

1)  The same file is opened in multiple threads of the same process.  This
is done with two calls to open, so it is a non-issue for us.

2)  We have one file that is split into two buckets.  This is handled with
the lock discussed above.

3)  We dup a file and it shows up in two buckets.  This isn't a case we
to worry about, because it can be treated as if we were just dealing with
files instead of buckets.

I don't think I missed any cases, but please let me know if I did.

> (2) Is there still an interest in adding a copy operation to the buckets API?

This is an absolute necessity.

Ryan





_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------