Posted to dev@apr.apache.org by Cliff Woolley <cl...@yahoo.com> on 2000/11/24 02:45:22 UTC

file bucket native operations

Why is it that we currently cannot split file buckets natively?  (Is it a sendfile
limitation?)

Let me just say ahead of time that I'm not trying to start another massive debate...
if somebody has a good reason that this cannot be done, I'll shut up about it.  It
just seems that it's a lot easier to do with file buckets than with pipes and
sockets, since the split can be done natively with no read required (as the
requirement for split functions is currently defined... speaking of which, is
anybody ever going to commit the ap_bucket_split_any() patch?)
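
To illustrate what I mean by a "native" split, here's a rough sketch.  The structure
and field names below (file_bucket_data, fd, offset, length) are made up for
illustration and are not the actual bucket private structure, but the point is that
a split is pure bookkeeping:

#include <sys/types.h>

/* Hypothetical file-bucket private data; names invented for illustration only. */
struct file_bucket_data {
    int    fd;        /* underlying file, shared by both halves of a split */
    off_t  offset;    /* where this bucket's data starts in the file       */
    size_t length;    /* how many bytes of the file this bucket covers     */
};

/* Split bucket 'a' at 'point' bytes, filling in 'b' as the second half.
 * No read, no copy of file data... just arithmetic on the ranges. */
static int file_split(struct file_bucket_data *a,
                      struct file_bucket_data *b,
                      size_t point)
{
    if (point > a->length)
        return -1;                          /* can't split past the end */

    b->fd     = a->fd;                      /* same fd... and there's the rub */
    b->offset = a->offset + (off_t)point;
    b->length = a->length - point;
    a->length = point;
    return 0;
}

The catch, of course, is that both halves end up pointing at the same fd, which is
where the problem below comes from.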

While trying to implement such a function (in conversation with OtherBill), the only big
problem I ran into was that a seek/read sequence on a fd is not threadsafe if there
are two file buckets in existence that point to the same fd, since it breaks the
assumptions that file_read() makes about the current location of the file pointer.
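
For reference, the pattern in question looks roughly like this in plain POSIX terms
(simplified... the bucket code does the equivalent through APR):

#include <unistd.h>
#include <sys/types.h>

/* Simplified seek-then-read.  lseek() and read() are two separate calls, so
 * another bucket using the same fd can move the file pointer in between
 * them... which is exactly the race. */
ssize_t bucket_style_read(int fd, off_t offset, void *buf, size_t len)
{
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;
    return read(fd, buf, len);   /* assumes nothing moved the pointer */
}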

OtherBill suggested that maybe the second file bucket should point to a dup'ed file
handle rather than the same file handle.  At first glance, that'd be a great
solution, but at least some OSes cause dup'ed file handles to share file
pointers/flags/etc, so dup'ing doesn't necessarily help in this situation.  I don't
know if that's a standard behavior of dup() or not (somebody please enlighten me);
if it is, then it makes sense for apr_dupfile() to not change that behavior.  But if
dup() would fix this problem on some systems, then it probably makes sense for
apr_dupfile() to somehow guarantee that it's fixed on all systems.  In that case,
file_split() would just dup the file handle (a future file_copy() would want to do
the same thing).
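
If anyone wants to check their own platform, here's a quick standalone test (plain
POSIX, not APR) of whether dup()'ed descriptors share the file offset:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(void)
{
    char buf[16];
    int fd1 = open("/etc/hosts", O_RDONLY);   /* any readable file will do */
    int fd2;

    if (fd1 < 0)
        return 1;
    fd2 = dup(fd1);

    read(fd2, buf, sizeof(buf));              /* advance the offset via the dup */
    printf("fd1 offset after reading via fd2: %ld\n",
           (long)lseek(fd1, 0, SEEK_CUR));    /* nonzero => shared file pointer */

    close(fd2);
    close(fd1);
    return 0;
}

If the printed offset is nonzero, reads through the dup move the original
descriptor's pointer too, and dup'ing alone doesn't buy us anything.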

As it stands though, it seems that the only way to make it possible to have multiple
file buckets point to the same fd is to serialize reads, seeking to the right spot
in the file before reading from it.  For every read.  That sucks.  (It's too bad
that there's no version of apr_read() that takes as a parameter the offset into the
file from which you want to begin reading... that'd be another way around this
problem.  Of course, it would have to deal with the same problems I'm dealing with
now, so that's probably not such a good idea.)
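
For what it's worth, the kind of call I'm wishing for is roughly what the Unix
pread() call does: the offset is a parameter and the shared file pointer is never
touched.  A sketch in plain POSIX, just to show the shape (whether every platform
APR cares about has anything like it is another question):

#include <unistd.h>
#include <sys/types.h>

/* Read 'len' bytes starting at 'offset' without moving the shared file
 * pointer, so two buckets over the same fd have nothing to race on. */
ssize_t read_at(int fd, void *buf, size_t len, off_t offset)
{
    return pread(fd, buf, len, offset);
}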

As I mentioned, a future file_copy operation would want to do the same thing.  That
assumes that there is still interest in implementing a copy operation on all bucket
types (which I think is a Good Thing).  If there is such an interest, I'd like to do
it before APR goes to Beta, since it's an API change.  Somebody just say the word,
and I'll do the work.


So I guess I'm asking two questions here:
(1) Does anybody have any bright ideas that would help in implementing split() and a
future copy() on file buckets, given the problems in reading non-sequentially from
the file?
(2) Is there still an interest in adding a copy operation to the buckets API?


Thanks, and Happy Thanksgiving to all...

--Cliff
