Posted to dev@httpd.apache.org by Dean Gaudet <dg...@arctic.org> on 1997/07/25 11:53:11 UTC
[PATCH] writev() combining for large bwrites
This is the patch I wanted to get in to anticipate mmap() development
later. If a BUFF already has some partially buffered content (i.e.
headers) and a large bwrite is performed (larger than the BUFF's buffer)
then no memory copy is done, and both buffers are combined and written
using writev().
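As a minimal standalone sketch of the idea (this is not the code in the
patch below; the names gather_write_all and send_headers_and_body are made
up for illustration, and a plain file descriptor stands in for the BUFF),
the already-buffered headers and the large body can be handed to the kernel
together, with a loop to resume after a partial writev():

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper, not part of buff.c: write every byte described by
 * vec[0..nvec-1] to fd, retrying on EINTR and resuming after partial
 * writes.  Modifies vec.  Returns 0 on success, -1 on error. */
static int gather_write_all(int fd, struct iovec *vec, int nvec)
{
    int i = 0;

    while (i < nvec) {
        ssize_t rv;

        /* skip entries that are already empty */
        if (vec[i].iov_len == 0) {
            ++i;
            continue;
        }

        do {
            rv = writev(fd, &vec[i], nvec - i);
        } while (rv == -1 && errno == EINTR);
        if (rv == -1)
            return -1;

        /* advance past whatever the kernel accepted */
        while (rv > 0 && i < nvec) {
            if ((size_t)rv < vec[i].iov_len) {
                vec[i].iov_base = (char *)vec[i].iov_base + rv;
                vec[i].iov_len -= (size_t)rv;
                rv = 0;
            } else {
                rv -= (ssize_t)vec[i].iov_len;
                ++i;
            }
        }
    }
    return 0;
}

/* Send buffered headers plus a large body without first copying the
 * body into the header buffer. */
static int send_headers_and_body(int fd, const char *hdr, size_t hdrlen,
                                 const char *body, size_t bodylen)
{
    struct iovec vec[2];

    vec[0].iov_base = (void *)hdr;
    vec[0].iov_len = hdrlen;
    vec[1].iov_base = (void *)body;
    vec[1].iov_len = bodylen;
    return gather_write_all(fd, vec, 2);
}
```

In the common case the whole response goes out in one system call; the
inner loop only matters when the kernel accepts part of the vector.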
I think I found a bug in the chunking code: if chunking is on and a small
write fills the buffer, an end_chunk is performed and then write_it_all is
called. If that fails, it's still possible for bwrite() to not set the
error flag on the BUFF. I'm not terribly happy with the fix I included in
this (it's the extra start_chunk call). I'll likely do another rev of this.
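For context on why a lost chunk header or footer matters: in HTTP/1.1
chunked encoding each block is framed as a hex length line, the data, and a
trailing CRLF, so dropping either piece desynchronizes the stream. A rough
sketch of that framing as an iovec follows; build_chunk_vec is a
hypothetical name, and standard snprintf() is used here in place of
Apache's ap_snprintf():

```c
#include <stddef.h>
#include <stdio.h>
#include <sys/uio.h>

/* Hypothetical helper: describe one chunk as three iovec entries
 * (size line, payload, trailing CRLF).  sizebuf is caller-provided and
 * must stay alive until the vector has been written.
 * Returns 3 on success, or -1 if nbyte is 0 (a zero-length chunk is the
 * end-of-body marker, so it is refused here) or the size line doesn't fit. */
static int build_chunk_vec(struct iovec vec[3], char *sizebuf,
                           size_t sizebuf_len, const void *data, size_t nbyte)
{
    int n;

    if (nbyte == 0)
        return -1;

    n = snprintf(sizebuf, sizebuf_len, "%zx\r\n", nbyte);
    if (n < 0 || (size_t)n >= sizebuf_len)
        return -1;

    vec[0].iov_base = sizebuf;          /* chunk header: hex length + CRLF */
    vec[0].iov_len = (size_t)n;
    vec[1].iov_base = (void *)data;     /* chunk payload, not copied */
    vec[1].iov_len = nbyte;
    vec[2].iov_base = (void *)"\r\n";   /* chunk footer */
    vec[2].iov_len = 2;
    return 3;
}
```

If a write of such a vector fails partway through, the connection cannot be
resynchronized, which is why the failure has to be recorded on the buffer.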
At any rate, on your typical 6k file, this patch reduces the number of
system calls by 1. Apache is still far off from the "optimal" number,
which is probably in the 20-30 range ;)
Dean
Index: CHANGES
===================================================================
RCS file: /export/home/cvs/apache/src/CHANGES,v
retrieving revision 1.364
diff -u -r1.364 CHANGES
--- CHANGES 1997/07/24 04:38:07 1.364
+++ CHANGES 1997/07/25 09:45:53
@@ -1,5 +1,15 @@
Changes with Apache 1.3a2
-
+
+ *) When a large bwrite() occurs (larger than the internal buffer size),
+ while there is already something in the buffer, apache will combine
+ the large write and the buffer into a single writev(). (This is
+ in anticipation of using mmap() for reading files.)
+ [Dean Gaudet]
+
+ *) In obscure cases where a partial socket write occurred while chunking,
+ Apache would omit the chunk header/footer on the next block.
+ [Dean Gaudet]
+
*) PORT: Various tweaks to eliminate pointer-int casting warnings on 64-bit
cpus like the alpha. Apache still stores ints in pointers, but that's
the relatively safe direction. [Dean Gaudet] PR#344
Index: buff.c
===================================================================
RCS file: /export/home/cvs/apache/src/buff.c,v
retrieving revision 1.38
diff -u -r1.38 buff.c
--- buff.c 1997/07/24 04:23:57 1.38
+++ buff.c 1997/07/25 09:45:55
@@ -840,6 +840,45 @@
}
+#ifndef NO_WRITEV
+/* similar to previous, but uses writev. Note that it modifies vec.
+ * return 0 if successful, -1 otherwise.
+ */
+static int writev_it_all (BUFF *fb, struct iovec *vec, int nvec)
+{
+ int i, rv;
+
+ /* while it's nice and easy to build the vector and crud, it's painful
+ * to deal with a partial writev()
+ */
+ for( i = 0; i < nvec; ) {
+ do rv = writev( fb->fd, &vec[i], nvec - i );
+ while (rv == -1 && errno == EINTR && !(fb->flags & B_EOUT));
+ if (rv == -1)
+ return -1;
+ /* recalculate vec to deal with partial writes */
+ while (rv > 0) {
+ if (rv < vec[i].iov_len) {
+ vec[i].iov_base = (char *)vec[i].iov_base + rv;
+ vec[i].iov_len -= rv;
+ rv = 0;
+ if (vec[i].iov_len == 0) {
+ ++i;
+ }
+ } else {
+ rv -= vec[i].iov_len;
+ ++i;
+ }
+ }
+ if (fb->flags & B_EOUT)
+ return -1;
+ }
+ /* if we got here, we wrote it all */
+ return 0;
+}
+#endif
+
+
/*
* A hook to write() that deals with chunking. This is really a protocol-
* level issue, but we deal with it here because it's simpler; this is
@@ -852,7 +891,6 @@
char chunksize[16]; /* Big enough for practically anything */
#ifndef NO_WRITEV
struct iovec vec[3];
- int i, rv;
#endif
if (fb->flags & (B_WRERR|B_EOUT))
@@ -874,9 +912,6 @@
return -1;
return nbyte;
#else
-
-#define NVEC (sizeof(vec)/sizeof(vec[0]))
-
vec[0].iov_base = chunksize;
vec[0].iov_len = ap_snprintf(chunksize, sizeof(chunksize), "%x\015\012",
nbyte);
@@ -884,38 +919,51 @@
vec[1].iov_len = nbyte;
vec[2].iov_base = "\r\n";
vec[2].iov_len = 2;
- /* while it's nice an easy to build the vector and crud, it's painful
- * to deal with a partial writev()
- */
- for( i = 0; i < NVEC; ) {
- do rv = writev( fb->fd, &vec[i], NVEC - i );
- while (rv == -1 && errno == EINTR && !(fb->flags & B_EOUT));
- if (rv == -1)
- return -1;
- /* recalculate vec to deal with partial writes */
- while (rv > 0) {
- if( rv <= vec[i].iov_len ) {
- vec[i].iov_base = (char *)vec[i].iov_base + rv;
- vec[i].iov_len -= rv;
- rv = 0;
- if( vec[i].iov_len == 0 ) {
- ++i;
- }
- } else {
- rv -= vec[i].iov_len;
- ++i;
- }
- }
- if (fb->flags & B_EOUT)
- return -1;
- }
- /* if we got here, we wrote it all */
- return nbyte;
-#undef NVEC
+
+ return writev_it_all (fb, vec, (sizeof(vec)/sizeof(vec[0]))) ? -1 : nbyte;
#endif
}
+#ifndef NO_WRITEV
+/*
+ * Used to combine the contents of the fb buffer, and a large buffer
+ * passed in.
+ */
+static int large_write (BUFF *fb, const void *buf, int nbyte)
+{
+ struct iovec vec[4];
+ int nvec;
+ char chunksize[16];
+
+ nvec = 0;
+ /* it's easiest to end the current chunk */
+ if (fb->flags & B_CHUNK) {
+ end_chunk(fb);
+ }
+ vec[0].iov_base = fb->outbase;
+ vec[0].iov_len = fb->outcnt;
+ if (fb->flags & B_CHUNK) {
+ vec[1].iov_base = chunksize;
+ vec[1].iov_len = ap_snprintf (chunksize, sizeof(chunksize),
+ "%x\015\012", nbyte);
+ vec[2].iov_base = (void *)buf;
+ vec[2].iov_len = nbyte;
+ vec[3].iov_base = "\r\n";
+ vec[3].iov_len = 2;
+ nvec = 4;
+ } else {
+ vec[1].iov_base = (void *)buf;
+ vec[1].iov_len = nbyte;
+ nvec = 2;
+ }
+
+ fb->outcnt = 0;
+ return writev_it_all (fb, vec, nvec) ? -1 : nbyte;
+}
+#endif
+
+
/*
* Write nbyte bytes.
* Only returns fewer than nbyte if an error ocurred.
@@ -951,6 +999,19 @@
else
return i;
}
+
+#ifndef NO_WRITEV
+/*
+ * Detect case where we're asked to write a large buffer, and combine our
+ * current buffer with it in a single writev()
+ */
+ if (fb->outcnt > 0 && nbyte >= fb->bufsiz) {
+ return large_write (fb, buf, nbyte);
+ }
+#endif
+
+ /* in case a chunk hasn't been started yet */
+ if( fb->flags & B_CHUNK ) start_chunk( fb );
/*
* Whilst there is data in the buffer, keep on adding to it and writing it
Re: [PATCH] writev() combining for large bwrites
Posted by Dean Gaudet <dg...@arctic.org>.
You use autoconf in mod_php to determine mmap() correctness, right?
At any rate I wasn't thinking about mod_include and mmap() in this case,
just the default handler. It would be nice if we could set a #define to
know when it's safe to use ... and I suppose a few mmap routines in
alloc.c to do resource protection are in order.
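One possible shape for this, purely as a sketch: USE_MMAP_FILES is a
hypothetical switch (nothing by that name exists in conf.h), send_file and
write_all are made-up names, and the munmap() on every exit path stands in
for whatever resource protection alloc.c would provide. The code falls back
to read() when mmap() is disabled or fails:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical per-platform switch saying mmap() is safe for file data. */
#ifndef USE_MMAP_FILES
#define USE_MMAP_FILES 1
#endif

/* Write all n bytes, retrying on EINTR.  Returns 0 or -1. */
static int write_all(int fd, const char *p, size_t n)
{
    while (n > 0) {
        ssize_t rv = write(fd, p, n);
        if (rv == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        p += rv;
        n -= (size_t)rv;
    }
    return 0;
}

/* Copy the file open on in_fd to out_fd.  Assumes in_fd is positioned at
 * offset 0 (only the read() fallback depends on that).  Returns 0 or -1. */
static int send_file(int out_fd, int in_fd)
{
    struct stat st;
    char buf[4096];
    ssize_t n;

    if (fstat(in_fd, &st) == -1)
        return -1;

#if USE_MMAP_FILES
    if (st.st_size > 0) {
        void *mm = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE,
                        in_fd, 0);
        if (mm != MAP_FAILED) {
            int rv = write_all(out_fd, mm, (size_t)st.st_size);
            munmap(mm, (size_t)st.st_size);   /* always release the mapping */
            return rv;
        }
        /* mapping failed: fall through to the read() path */
    }
#endif

    for (;;) {
        n = read(in_fd, buf, sizeof(buf));
        if (n == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            return 0;
        if (write_all(out_fd, buf, (size_t)n) == -1)
            return -1;
    }
}
```

Keeping the read() path means a platform with a broken mmap() only needs
the switch turned off; nothing else in the caller changes.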
I've been wondering about the usefulness of the core opening/mmapping the
file early on... but the only case I know of so far where a file is opened
twice is when mod_mime_magic is in use. Are there others?
Dean
On Fri, 25 Jul 1997, Rasmus Lerdorf wrote:
> > This is the patch I wanted to get in to anticipate mmap() development
> > later.
>
> We have to make sure not to enable mmap() on Alphas running OSF. mmap()
> is very broken on that OS. Keep that in mind if/when the Configure stuff
> is done to determine if mmap() should be used.
>
> Also, is it expected that content-parsing modules such as mod_include and
> mod_php would now be able to receive a caddr_t pointer to the mmap'ed
> file, or is the intention to only do the mmap() right in mod_include?
>
> -Rasmus
Re: [PATCH] writev() combining for large bwrites
Posted by Dean Gaudet <dg...@arctic.org>.
HAVE_MMAP is only defined when we want that architecture to use mmap for
the scoreboard; it doesn't tell you whether mmap works. For example, Linux
has a working mmap but HAVE_MMAP isn't defined.
Dean
On Fri, 25 Jul 1997, Alexei Kosut wrote:
> On Fri, 25 Jul 1997, Rasmus Lerdorf wrote:
>
> > > This is the patch I wanted to get in to anticipate mmap() development
> > > later.
> >
> > We have to make sure not to enable mmap() on Alphas running OSF. mmap()
> > is very broken on that OS. Keep that in mind if/when the Configure stuff
> > is done to determine if mmap() should be used.
>
> We already do Configure stuff to determine if mmap() should be used,
> since we use it (optionally) for the scoreboard. The HAVE_MMAP define in
> conf.h determines whether it is available for a given OS. I do see a
> #define HAVE_MMAP in the OSF1 section...
>
> -- Alexei Kosut <ak...@organic.com>
Re: [PATCH] writev() combining for large bwrites
Posted by Alexei Kosut <ak...@organic.com>.
On Fri, 25 Jul 1997, Rasmus Lerdorf wrote:
> > This is the patch I wanted to get in to anticipate mmap() development
> > later.
>
> We have to make sure not to enable mmap() on Alphas running OSF. mmap()
> is very broken on that OS. Keep that in mind if/when the Configure stuff
> is done to determine if mmap() should be used.
We already do Configure stuff to determine if mmap() should be used,
since we use it (optionally) for the scoreboard. The HAVE_MMAP define in
conf.h determines whether it is available for a given OS. I do see a
#define HAVE_MMAP in the OSF1 section...
-- Alexei Kosut <ak...@organic.com>
Re: [PATCH] writev() combining for large bwrites
Posted by Rasmus Lerdorf <ra...@lerdorf.on.ca>.
> This is the patch I wanted to get in to anticipate mmap() development
> later.
We have to make sure not to enable mmap() on Alphas running OSF. mmap()
is very broken on that OS. Keep that in mind if/when the Configure stuff
is done to determine if mmap() should be used.
Also, is it expected that content-parsing modules such as mod_include and
mod_php would now be able to receive a caddr_t pointer to the mmap'ed
file, or is the intention to only do the mmap() right in mod_include?
-Rasmus