You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apr.apache.org by Peter Samuelson <pe...@p12n.org> on 2007/01/05 12:19:44 UTC

[PATCH] Linux 2.4 may or may not implement sendfile64()

apr incorrectly detects the sendfile64() syscall on Linux 2.4 on
non-i386 architectures.  The problem is that on Linux, libc is not
tightly coupled to your installed kernel, so libc exposes syscalls
which may or may not be implemented kernel-side, requiring runtime
detection of errno==ENOSYS.  I wrote this patch in response to
http://bugs.debian.org/396631 - the symptom is that Apache sends empty
files without realising that anything is wrong.  Multiple Debian users
(not me) have tested the patch, and it will be in Debian 4.0.

  (sendfile64 actually does exist on Linux 2.4, but unfortunately
  nobody bothered to add the entry points to the syscall tables on most
  non-i386 architectures!)

Unlike the epoll case [another situation where runtime detection might
be used], disabling sendfile64 where it actually is available would
sacrifice real functionality: ability to use apr_socket_sendfile for
large files.  So I think the case for runtime detection is stronger
here than for epoll.

Peter

--- network_io/unix/sendrecv.c
+++ network_io/unix/sendrecv.c
@@ -240,31 +240,77 @@
 
 #if defined(__linux__) && defined(HAVE_WRITEV)
 
+/* Helper function for apr_socket_sendfile.
+ * Takes care of sendfile vs. sendfile64 (must be detected at runtime),
+ * EINTR restarting, and other details.  NOTE: does not necessarily
+ * update 'off', as callers don't need this.
+ */
+static
+ssize_t do_sendfile(int out, int in, apr_off_t *off, apr_size_t len)
+{
+#if !APR_HAS_LARGE_FILES
+    ssize_t ret;
+    do
+	ret = sendfile(out, in, off, len);
+    while (ret == -1 && errno == EINTR);
+    return ret;
+#else
+
+#ifdef HAVE_SENDFILE64
+    static int sendfile64_enosys;  /* sendfile64() syscall not found */
+#endif
+    off_t offtmp;
+    ssize_t ret;
+
+    /* Multiple reports have shown sendfile failing with EINVAL if
+     * passed a >=2Gb count value on some 64-bit kernels.  It won't
+     * noticably hurt performance to limit each call to <2Gb at a time,
+     * so avoid that issue here.  (Round down to a common page size.) */
+    if (sizeof(off_t) == 8 && len > INT_MAX)
+        len = INT_MAX - 8191;
+
+    /* The simple and common case: we don't cross the LFS barrier */
+    if (sizeof(off_t) == 8 || (apr_int64_t)*off + len <= INT_MAX) {
+        offtmp = *off;
+        do
+            ret = sendfile(out, in, &offtmp, len);
+        while (ret == -1 && errno == EINTR);
+        return ret;
+    }
+
+    /* From here down we know it's a 32-bit runtime */
+#ifdef HAVE_SENDFILE64
+    if (!sendfile64_enosys) {
+        do
+            ret = sendfile64(out, in, off, len);
+        while (ret == -1 && errno == EINTR);
+
+        if (ret != -1 || errno != ENOSYS)
+            return ret;
+
+        sendfile64_enosys = 1;
+    }
+#endif
+    if (*off > INT_MAX) {
+        errno = EINVAL;
+        return -1;
+    }
+    offtmp = *off;
+    do
+	ret = sendfile(out, in, &offtmp, len);
+    while (ret == -1 && errno == EINTR);
+    return ret;
+#endif /* APR_HAS_LARGE_FILES */
+}
+
+
 apr_status_t apr_socket_sendfile(apr_socket_t *sock, apr_file_t *file,
                                  apr_hdtr_t *hdtr, apr_off_t *offset,
                                  apr_size_t *len, apr_int32_t flags)
 {
     int rv, nbytes = 0, total_hdrbytes, i;
     apr_status_t arv;
-
-#if APR_HAS_LARGE_FILES && defined(HAVE_SENDFILE64)
     apr_off_t off = *offset;
-#define sendfile sendfile64
-
-#elif APR_HAS_LARGE_FILES && SIZEOF_OFF_T == 4
-    /* 64-bit apr_off_t but no sendfile64(): fail if trying to send
-     * past the 2Gb limit. */
-    off_t off;
-    
-    if ((apr_int64_t)*offset + *len > INT_MAX) {
-        return EINVAL;
-    }
-    
-    off = *offset;
-
-#else
-    off_t off = *offset;
-#endif
 
     if (!hdtr) {
         hdtr = &no_hdtr;
@@ -310,12 +356,10 @@
         goto do_select;
     }
 
-    do {
-        rv = sendfile(sock->socketdes,    /* socket */
+    rv = do_sendfile(sock->socketdes,    /* socket */
                       file->filedes, /* open file descriptor of the file to be sent */
                       &off,    /* where in the file to start */
                       *len);   /* number of bytes to send */
-    } while (rv == -1 && errno == EINTR);
 
     while ((rv == -1) && (errno == EAGAIN || errno == EWOULDBLOCK) 
                       && (sock->timeout > 0)) {
@@ -326,12 +370,10 @@
             return arv;
         }
         else {
-            do {
-                rv = sendfile(sock->socketdes,    /* socket */
+	    rv = do_sendfile(sock->socketdes,    /* socket */
                               file->filedes, /* open file descriptor of the file to be sent */
                               &off,    /* where in the file to start */
                               *len);    /* number of bytes to send */
-            } while (rv == -1 && errno == EINTR);
         }
     }
 

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Peter Samuelson <pe...@p12n.org>.
[Joe Orton]
> I'd have done exactly as you did.  But the rest of the world does not
> exist under the specific constraints which Debian does; what's right
> for Debian is not necessarily right for upstream.

Fair enough.  I accept that Linux is in an unusual situation here, with
the loose coupling between libc and kernel, such that you can compile
on a new kernel yet run the binary on an old kernel.

I don't mind maintaining this as a Debian-specific patch, if that's
what you think is appropriate.  I only submitted it because I thought
it was a genuine improvement others would want as well.  Also, I
apologise if I came off sounding as though nobody except Debian cares
about the user experience - I was frustrated because I thought I'd been
misunderstood about why this patch matters to us.

Peter

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Jan 05, 2007 at 07:20:01AM -0600, Peter Samuelson wrote:
> 
> [Joe Orton]
> > Yes, APR could detect system features at run-time, and yes, doing so
> > would be a complete maintenance nightmare having a small net negative
> > run-time cost to 99% of users, and no, it's not worth doing.
> 
> So in your opinion, which of these two options should we have gone with
> instead?

I'd have done exactly as you did.  But the rest of the world does not 
exist under the specific constraints which Debian does; what's right for 
Debian is not necessarily right for upstream.

(I'm not trying to be anti-Debian here; s/Debian/Fedora/ above is 
equally true, along with anybody else distributing binaries)

joe

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Peter Samuelson <pe...@p12n.org>.
[Ruediger Pluem]
> How about letting these users simply turn off sendfile via
> 
>  EnableSendfile Off

Uh, first, because I didn't know Apache had that option.

Second ... well, I still would have written the patch.  If there's an
easy way to save sysadmins some troubleshooting and Googling and
editing a config file, I'm happy to do that.  I guess I have a bias
toward trying to make the system as painless as possible for the
sysadmins and users, as opposed to making it as painless as possible
for the developers.  At least as long as it's easy and straighforward,
as this patch was.

The complexity of the 90-line patch, by the way, should be weighed
against the complexity of the smarter autoconf test Joe proposed.  And
part of my patch is refactoring the function into smaller pieces, which
(at least IMO) is also a net win, despite added lines of source.

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Ruediger Pluem <rp...@apache.org>.

On 01/05/2007 02:20 PM, Peter Samuelson wrote:
> [Joe Orton]
> 
>>Yes, APR could detect system features at run-time, and yes, doing so
>>would be a complete maintenance nightmare having a small net negative
>>run-time cost to 99% of users, and no, it's not worth doing.
> 
> 
> So in your opinion, which of these two options should we have gone with
> instead?
> 
> - status quo: some users on Linux 2.4 simply cannot run Apache 2 at all

How about letting these users simply turn off sendfile via

 EnableSendfile Off

If users with these kernels really want to have the performance benefit of
sendfile in this case they need to upgrade their kernel, but at least
they can use large files.


Regards

RĂ¼diger

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Davi Arnaut <da...@haxent.com.br>.
Peter Samuelson wrote:
> [Joe Orton]
>> Yes, APR could detect system features at run-time, and yes, doing so
>> would be a complete maintenance nightmare having a small net negative
>> run-time cost to 99% of users, and no, it's not worth doing.
> 
> So in your opinion, which of these two options should we have gone with
> instead?
> 
> - status quo: some users on Linux 2.4 simply cannot run Apache 2 at all
> - NO Debian users can use Apache 2 to serve 4+GB files

- create Linux 2.4/2.6 Apache packages.

btw, a "portable runtime" that does not runs on both Linux 2.4/2.6 is
lame. I don't buy this "maintenance nightmare" argument, expect for
epoll and sendfile, are others runtime detections needed ? Those can be
done at apr_initialize() without any penalties. just my 2 cents.

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Peter Samuelson <pe...@p12n.org>.
[Joe Orton]
> Yes, APR could detect system features at run-time, and yes, doing so
> would be a complete maintenance nightmare having a small net negative
> run-time cost to 99% of users, and no, it's not worth doing.

So in your opinion, which of these two options should we have gone with
instead?

- status quo: some users on Linux 2.4 simply cannot run Apache 2 at all
- NO Debian users can use Apache 2 to serve 4+GB files

The first one is a non-starter, we can't just decide to stop supporting
those users.  As for the second, LFS was the main reason we wanted to
switch to Apache 2.2 in the first place.  In this day and age, some
people actually want to ship DVD images over HTTP.

I suppose there's also the Gentoo Option: tell users they should just
compile all the software they use, on every box they want to use it on.
Some would call that a valid strategy, but Debian users would never put
up with it.  They expect not to have to mess with stuff like that -
it's one reason they use Debian.

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Jan 05, 2007 at 06:07:18AM -0600, Peter Samuelson wrote:
> 
> [Joe Orton]
> > OK nice.  Why does this require run-time detection rather than simply
> > a smarter configure test (or apr_hints.m4 blacklist or whatever)?
> 
> Because we (Debian) ship binaries rather than source code.  Our users
> run these binaries on both kernel 2.4 and kernel 2.6.  Without runtime
> detection you get two bad choices.

Oh, that argument again :) That fact makes this feature no different to 
many of the other features which APR detects at configure-time.

Yes, APR could detect system features at run-time, and yes, doing so 
would be a complete maintenance nightmare having a small net negative 
run-time cost to 99% of users, and no, it's not worth doing.

joe

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Peter Samuelson wrote:
> [Joe Orton]
>> OK nice.  Why does this require run-time detection rather than simply
>> a smarter configure test (or apr_hints.m4 blacklist or whatever)?
> 
> Because we (Debian) ship binaries rather than source code.  Our users
> run these binaries on both kernel 2.4 and kernel 2.6.  Without runtime
> detection you get two bad choices.  Take your pick:
> 
> - Users still on kernel 2.4 can't run Apache at all - or
> - Users on kernel 2.6 can't use Apache to serve 2+GB files

Take this one step further... Users on a broken kernel2.6 can't use
sendfile64 until updating apr, after they have picked up the patches
to correct the kernel sendfile64 behavior.

And one step further... Users who patch-update such that sendfile64
is broken.  Or who find sendfile64 broken on a particular network stack
driver.  The list goes on :)

The advantage (and why I *especially* like this patch) is that the user
who has -at this moment- a buggy sendfile64 implementation will still be
able to take advantage of it once everything is fixed; but I like even
more the fact that the user who breaks sendfile64 won't noticeably suffer.


My 2c

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Peter Samuelson <pe...@p12n.org>.
[Joe Orton]
> OK nice.  Why does this require run-time detection rather than simply
> a smarter configure test (or apr_hints.m4 blacklist or whatever)?

Because we (Debian) ship binaries rather than source code.  Our users
run these binaries on both kernel 2.4 and kernel 2.6.  Without runtime
detection you get two bad choices.  Take your pick:

- Users still on kernel 2.4 can't run Apache at all - or
- Users on kernel 2.6 can't use Apache to serve 2+GB files

Runtime detection allows everyone to run Apache, and the majority
(people using kernel 2.6, or kernel 2.4.21+ on i386) can also serve
large files.

...And before you say "just make all Debian users upgrade their
kernels", that's not realistic.  Datacenters that sell shell accounts
on vservers, for example, often let you control the whole system
_except_ what kernel it runs.


Btw, use of the Linux epoll facility is similar (another kernel 2.4/2.6
difference), except that if you unconditionally disable it, no actual
functionality is lost, only a bit of efficiency.  So we disabled epoll.

Re: [PATCH] Linux 2.4 may or may not implement sendfile64()

Posted by Joe Orton <jo...@redhat.com>.
On Fri, Jan 05, 2007 at 05:19:44AM -0600, Peter Samuelson wrote:
> apr incorrectly detects the sendfile64() syscall on Linux 2.4 on
> non-i386 architectures.  The problem is that on Linux, libc is not
> tightly coupled to your installed kernel, so libc exposes syscalls
> which may or may not be implemented kernel-side, requiring runtime
> detection of errno==ENOSYS. 

OK nice.  Why does this require run-time detection rather than simply a 
smarter configure test (or apr_hints.m4 blacklist or whatever)?  It's a 
kernel bug/lack-of-feature which is either fixed or not in whatever 
kernel you build on, surely?

Regards,

joe