You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Brian Pane <bp...@pacbell.net> on 2001/08/28 08:27:45 UTC

[PATCH] Re: chunking of content in mod_include?

Ryan Bloom wrote:

>On Monday 27 August 2001 19:25, Paul J. Reder wrote:
>
>>Ryan Bloom wrote:
>>
>>>On Monday 27 August 2001 16:05, Brian Pane wrote:
>>>
>>>>In mod_include's find_start_sequence function, there's some code that
>>>>splits the current bucket if "ctx->bytes_parsed >=
>>>>BYTE_COUNT_THRESHOLD."
>>>>
>>>>Can somebody explain the rationale for this?  It seems redundant to be
>>>>splitting the data into smaller chunks in a content filter; I'd expect
>>>>mod_include to defer network block sizing to downstream filters.  In
>>>>the profile data I'm looking at currently, this check accounts for 35%
>>>>of the total run time of find_start_sequence, so there's some
>>>>performance to be gained if the "ctx->bytes_parsed >=
>>>>BYTE_COUNT_THRESHOLD" check can be eliminated.
>>>>
>>>It is used to ensure that we don't buffer all the data in mod_include. 
>>>It isn't really done correctly though, because what we should be doing,
>>>is just continuing to read as much data as possible, but as soon as we
>>>can't read something, send what we have down the filter stack.
>>>
>>>This variable, basically ensures we don't keep reading all the data until
>>>we process the whole file, or reach the first tag.
>>>
>>In what manner do you mean "as soon as we can't read something"? It is my
>>understanding that the bucket code hides reading delays from the
>>mod_include code. If that is true how would the mod_include code know when
>>
>
>The bucket does not hide reading delays at all.  Basically, you call apr_bucket_read
>with APR_NONBLOCK_READ, and when you get a return code of APR_EAGAIN,
>you send what you have, and then call apr_bucket_read with APR_BLOCK_READ.
>
>>to send a chunk along? Are you saying the bucket code should do some magic
>>like send all buckets in the brigade up to the current one? This would
>>wreak havoc on code like mod_include that may be setting aside or tagging
>>buckets for replacement when the end of the tag is found.
>>
>
>Huh?  The bucket code doesn't ever send data down the filter stack unless you
>tell it to.  Take a look at the content-length filter to see what I mean.
>
>>This code was put in because we were seeing the mod_include code buffer up
>>the entire collection of buckets until an SSI tag was found. If you have a
>>200 MB file with an SSI tag footer at the end of the brigade, the whole
>>thing was being buffered. How do you propose that this be done differently?
>>
>
>I don't care if mod_include buffers 200 Megs, as long as it is constantly doing
>something with the data.  If we have a 200 Meg file that has no SSI tags in
>it, but we can get all 200 Meg at one time, then we shouldn't have any problem
>just scanning through the entire 200 Megs very quickly.  Worst case, we do what
>Brian suggested, and just check the bucket length once we have finished
>processing all of the data in that bucket.  The buffering only becomes a
>real problem when we sit waiting for data from a CGI or some other slow
>content generator.
>

How does this patch look?  It does the check only on bucket
boundaries, and it pushes out the buffered content if either:
  * the amount of buffered content is too large, or
  * the apr_bucket_read would block.

If this technique looks sound, I can create another patch that
does the same thing for find_end_sequence.

--Brian


Index: mod_include.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/modules/filters/mod_include.c,v
retrieving revision 1.136
diff -u -r1.136 mod_include.c
--- mod_include.c    2001/08/27 20:25:42    1.136
+++ mod_include.c    2001/08/28 06:16:38
@@ -143,9 +143,10 @@
  * first byte of the BEGINNING_SEQUENCE (after finding a complete 
match) or it
  * returns NULL if no match found.
  */
-static apr_bucket *find_start_sequence(apr_bucket *dptr, include_ctx_t 
*ctx,
-                                      apr_bucket_brigade *bb, int 
*do_cleanup)
+static apr_bucket *find_start_sequence(apr_bucket *dptr, ap_filter_t *f,
+                                      apr_bucket_brigade **bb, int 
*do_cleanup)
 {
+    include_ctx_t *ctx = f->ctx;
     apr_size_t len;
     const char *c;
     const char *buf;
@@ -156,39 +157,43 @@
     *do_cleanup = 0;
 
     do {
+        apr_status_t rv;
+
         if (APR_BUCKET_IS_EOS(dptr)) {
             break;
         }
-        apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
-        /* XXX handle retcodes */
-        if (len == 0) { /* end of pipe? */
-            break;
+
+        if ((ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) &&
+            !APR_BRIGADE_EMPTY(*bb)) {
+            apr_bucket_brigade *remainder = apr_brigade_split(*bb, dptr);
+            rv = ap_pass_brigade(f->next, *bb);
+            *bb = remainder;
+            if (rv != APR_SUCCESS) {
+                return NULL;
+            }
         }
-        c = buf;
-        while (c < buf + len) {
-            if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
-                apr_bucket *start_bucket;
 
-                if (ctx->head_start_index > 0) {
-                    start_index  = ctx->head_start_index;
-                    start_bucket = ctx->head_start_bucket;
+        rv = apr_bucket_read(dptr, &buf, &len, APR_NONBLOCK_READ);
+        if (rv == APR_EAGAIN) {
+            if (!APR_BRIGADE_EMPTY(*bb)) {
+                apr_bucket_brigade *remainder = apr_brigade_split(*bb, 
dptr);
+                rv = ap_pass_brigade(f->next, *bb);
+                *bb = remainder;
+                if (rv != APR_SUCCESS) {
+                    return NULL;
                 }
-                else {
-                    start_index  = (c - buf);
-                    start_bucket = dptr;
-                }
-                apr_bucket_split(start_bucket, start_index);
-                tmp_bkt = APR_BUCKET_NEXT(start_bucket);
-                if (ctx->head_start_index > 0) {
-                    ctx->head_start_index  = 0;
-                    ctx->head_start_bucket = tmp_bkt;
-                    ctx->parse_pos = 0;
-                    ctx->state = PRE_HEAD;
-                }
-
-                return tmp_bkt;
             }
+            rv = apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
+        }
+        if (rv != APR_SUCCESS) {
+            return NULL;
+        }
 
+        if (len == 0) { /* end of pipe? */
+            break;
+        }
+        c = buf;
+        while (c < buf + len) {
             if (*c == str[ctx->parse_pos]) {
                 if (ctx->state == PRE_HEAD) {
                     ctx->state             = PARSE_HEAD;
@@ -244,7 +249,7 @@
             ctx->bytes_parsed++;
         }
         dptr = APR_BUCKET_NEXT(dptr);
-    } while (dptr != APR_BRIGADE_SENTINEL(bb));
+    } while (dptr != APR_BRIGADE_SENTINEL(*bb));
     return NULL;
 }
 
@@ -2363,7 +2368,7 @@
             int do_cleanup = 0;
             apr_size_t cleanup_bytes = ctx->parse_pos;
 
-            tmp_dptr = find_start_sequence(dptr, ctx, *bb, &do_cleanup);
+            tmp_dptr = find_start_sequence(dptr, f, bb, &do_cleanup);
 
             /* The few bytes stored in the ssi_tag_brigade turned out 
not to
              * be a tag after all. This can only happen if the starting



Re: [PATCH] Re: chunking of content in mod_include?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Ryan Bloom" <rb...@covalent.net>
Sent: Tuesday, August 28, 2001 8:57 AM


> On Monday 27 August 2001 23:27, Brian Pane wrote:
> 
> > How does this patch look?  It does the check only on bucket
> > boundaries, and it pushes out the buffered content if either:
> >   * the amount of buffered content is too large, or
> >   * the apr_bucket_read would block.
> >
> > +        if (rv == APR_EAGAIN) {
> 
> This needs to be if (APR_STATUS_IS_EAGAIN(rv)) {
> 
> EAGAIN is one of those annoying error conditions, that is completely non-portable,
> so we have to use a macro to check for it.

For those of you coming on to apr or httpd development recently (I count at least
five active patch submitters that haven't been here all that long) ... let me tell
a fast story.

Most non-unixes don't map directly to errno.  We started with a lookup table and
some other bogus stuff to try mapping errors.  It's horribly slow and a poor solution.
Even some Unix errors have this problem with EAGAIN/EWOULDBLOCK.  But PLEASE don't
simply code correctly in this one case, code consistently!

Anyone writing (rv == APR_ESOMETHING) is asking for problems.  That's why we implement
and maintain that APR_STATUS_IS_ESOMETHING() set of macros.  They are fast (no time
difference on Unix at all) and legible in mainline code.

Unless you are looking at an APR-defined status please use APR_STATUS_IS_EFOO to avoid
some difficult to debug bugs from creeping into the code.

Bill




Re: [PATCH] Re: chunking of content in mod_include?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
From: "Ryan Bloom" <rb...@covalent.net>
Sent: Tuesday, August 28, 2001 8:57 AM


> On Monday 27 August 2001 23:27, Brian Pane wrote:
> 
> > How does this patch look?  It does the check only on bucket
> > boundaries, and it pushes out the buffered content if either:
> >   * the amount of buffered content is too large, or
> >   * the apr_bucket_read would block.
> >
> > +        if (rv == APR_EAGAIN) {
> 
> This needs to be if (APR_STATUS_IS_EAGAIN(rv)) {
> 
> EAGAIN is one of those annoying error conditions, that is completely non-portable,
> so we have to use a macro to check for it.

For those of you coming on to apr or httpd development recently (I count at least
five active patch submitters that haven't been here all that long) ... let me tell
a fast story.

Most non-unixes don't map directly to errno.  We started with a lookup table and
some other bogus stuff to try mapping errors.  It's horribly slow and a poor solution.
Even some Unix errors have this problem with EAGAIN/EWOULDBLOCK.  But PLEASE don't
simply code correctly in this one case, code consistently!

Anyone writing (rv == APR_ESOMETHING) is asking for problems.  That's why we implement
and maintain that APR_STATUS_IS_ESOMETHING() set of macros.  They are fast (no time
difference on Unix at all) and legible in mainline code.

Unless you are looking at an APR-defined status please use APR_STATUS_IS_EFOO to avoid
some difficult to debug bugs from creeping into the code.

Bill




Re: [PATCH] Re: chunking of content in mod_include?

Posted by Ryan Bloom <rb...@covalent.net>.
Good patch, I have just a few fixes that need to be made to it.
Most of these are incredibly non-obvious or intuitive.

On Monday 27 August 2001 23:27, Brian Pane wrote:

> How does this patch look?  It does the check only on bucket
> boundaries, and it pushes out the buffered content if either:
>   * the amount of buffered content is too large, or
>   * the apr_bucket_read would block.
>
> If this technique looks sound, I can create another patch that
> does the same thing for find_end_sequence.
>
> --Brian
>
>
> -        c = buf;
> -        while (c < buf + len) {
> -            if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
> -                apr_bucket *start_bucket;
>
> -                if (ctx->head_start_index > 0) {
> -                    start_index  = ctx->head_start_index;
> -                    start_bucket = ctx->head_start_bucket;
> +        rv = apr_bucket_read(dptr, &buf, &len, APR_NONBLOCK_READ);
> +        if (rv == APR_EAGAIN) {

This needs to be if (APR_STATUS_IS_EAGAIN(rv)) {

EAGAIN is one of those annoying error conditions, that is completely non-portable,
so we have to use a macro to check for it.

> +                rv = ap_pass_brigade(f->next, *bb);
> +                *bb = remainder;
> +                if (rv != APR_SUCCESS) {
> +                    return NULL;

This looks like a problem.  Basically, a future filter has just informed us that there
was a problem, but we are going to silently ignore it.  If we get a status other than
APR_SUCCESS from a future filter, then we _must_ return that error code to the
filter that called us.  If we don't, then we will continue to send data, even though
lower filters think that everything is done, and everything gets confused.  I have
closed this bug in mod_include multiple times, so I am VERY aware of it.  :-)

> +            rv = apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
> +        }
> +        if (rv != APR_SUCCESS) {
> +            return NULL;
> +        }

Same as above.  You can't lose that error code.

Fix those three things, and I'll give it a +1.

Ryan

______________________________________________________________
Ryan Bloom				rbb@apache.org
Covalent Technologies			rbb@covalent.net
--------------------------------------------------------------

[PATCH] Re: chunking of content in mod_include?

Posted by Brian Pane <bp...@pacbell.net>.
Thanks to everybody who reviewed the first patch.  This new
version adds error-checking fixes and moves the brigade-splitting
back into send_parsed_content.  I also added a nonblocking read
to find_end_sequence.

--Brian

Index: mod_include.h
===================================================================
RCS file: /home/cvspublic/httpd-2.0/modules/filters/mod_include.h,v
retrieving revision 1.21
diff -u -r1.21 mod_include.h
--- mod_include.h    2001/08/27 20:25:42    1.21
+++ mod_include.h    2001/08/29 04:48:12
@@ -138,6 +138,8 @@
     int          if_nesting_level;
     apr_size_t   parse_pos;
     int          bytes_parsed;
+    apr_status_t status;
+    int          output_now;
    
     apr_bucket   *head_start_bucket;
     apr_size_t   head_start_index;
Index: mod_include.c
===================================================================
RCS file: /home/cvspublic/httpd-2.0/modules/filters/mod_include.c,v
retrieving revision 1.136
diff -u -r1.136 mod_include.c
--- mod_include.c    2001/08/27 20:25:42    1.136
+++ mod_include.c    2001/08/29 04:48:13
@@ -144,7 +144,7 @@
  * returns NULL if no match found.
  */
 static apr_bucket *find_start_sequence(apr_bucket *dptr, include_ctx_t 
*ctx,
-                                      apr_bucket_brigade *bb, int 
*do_cleanup)
+                                       apr_bucket_brigade *bb, int 
*do_cleanup)
 {
     apr_size_t len;
     const char *c;
@@ -156,39 +156,56 @@
     *do_cleanup = 0;
 
     do {
+        apr_status_t rv;
+        int read_done = 0;
+
         if (APR_BUCKET_IS_EOS(dptr)) {
             break;
         }
-        apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
-        /* XXX handle retcodes */
-        if (len == 0) { /* end of pipe? */
-            break;
+
+        if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
+            ctx->output_now = 1;
         }
-        c = buf;
-        while (c < buf + len) {
-            if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
-                apr_bucket *start_bucket;
+        else if (ctx->bytes_parsed > 0) {
+            rv = apr_bucket_read(dptr, &buf, &len, APR_NONBLOCK_READ);
+            read_done = 1;
+            if (APR_STATUS_IS_EAGAIN(rv)) {
+                ctx->output_now = 1;
+            }
+        }
 
-                if (ctx->head_start_index > 0) {
-                    start_index  = ctx->head_start_index;
-                    start_bucket = ctx->head_start_bucket;
-                }
-                else {
-                    start_index  = (c - buf);
-                    start_bucket = dptr;
-                }
+        if (ctx->output_now) {
+            apr_size_t start_index;
+            apr_bucket *start_bucket;
+            apr_bucket_brigade *remainder;
+            if (ctx->head_start_index > 0) {
+                start_bucket = ctx->head_start_bucket;
                 apr_bucket_split(start_bucket, start_index);
-                tmp_bkt = APR_BUCKET_NEXT(start_bucket);
-                if (ctx->head_start_index > 0) {
-                    ctx->head_start_index  = 0;
-                    ctx->head_start_bucket = tmp_bkt;
-                    ctx->parse_pos = 0;
-                    ctx->state = PRE_HEAD;
-                }
-
-                return tmp_bkt;
+                start_bucket = APR_BUCKET_NEXT(start_bucket);
+                ctx->head_start_index = 0;
+                ctx->head_start_bucket = start_bucket;
+                ctx->parse_pos = 0;
+                ctx->state = PRE_HEAD;
             }
+            else {
+                start_bucket = dptr;
+            }
+            return start_bucket;
+        }
 
+        if (!read_done) {
+            rv = apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
+        }
+        if (!APR_STATUS_IS_SUCCESS(rv)) {
+            ctx->status = rv;
+            return NULL;
+        }
+
+        if (len == 0) { /* end of pipe? */
+            break;
+        }
+        c = buf;
+        while (c < buf + len) {
             if (*c == str[ctx->parse_pos]) {
                 if (ctx->state == PRE_HEAD) {
                     ctx->state             = PARSE_HEAD;
@@ -256,11 +273,40 @@
     const char *str = ENDING_SEQUENCE;
 
     do {
+        apr_status_t rv;
+        int read_done = 0;
+
         if (APR_BUCKET_IS_EOS(dptr)) {
             break;
+        }
+        if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
+            ctx->output_now = 1;
         }
-        apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
-        /* XXX handle retcodes */
+        else if (ctx->bytes_parsed > 0) {
+            rv = apr_bucket_read(dptr, &buf, &len, APR_NONBLOCK_READ);
+            read_done = 1;
+            if (APR_STATUS_IS_EAGAIN(rv)) {
+                ctx->output_now = 1;
+            }
+        }
+
+        if (ctx->output_now) {
+            if (ctx->state == PARSE_DIRECTIVE) {
+                /* gonna start over parsing the directive next time 
through */
+                ctx->directive_length = 0;
+                ctx->tag_length       = 0;
+            }
+            return dptr;
+        }
+
+        if (!read_done) {
+            rv = apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
+        }
+        if (!APR_STATUS_IS_SUCCESS(rv)) {
+            ctx->status = rv;
+            return NULL;
+        }
+
         if (len == 0) { /* end of pipe? */
             break;
         }
@@ -271,15 +317,6 @@
             c = buf;
         }
         while (c < buf + len) {
-            if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
-                if (ctx->state == PARSE_DIRECTIVE) {
-                    /* gonna start over parsing the directive next time 
through */
-                    ctx->directive_length = 0;
-                    ctx->tag_length       = 0;
-                }
-                return dptr;
-            }
-
             if (*c == str[ctx->parse_pos]) {
                 if (ctx->state != PARSE_TAIL) {
                     ctx->state             = PARSE_TAIL;
@@ -2364,6 +2401,9 @@
             apr_size_t cleanup_bytes = ctx->parse_pos;
 
             tmp_dptr = find_start_sequence(dptr, ctx, *bb, &do_cleanup);
+            if (!APR_STATUS_IS_SUCCESS(ctx->status)) {
+                return ctx->status;
+            }
 
             /* The few bytes stored in the ssi_tag_brigade turned out 
not to
              * be a tag after all. This can only happen if the starting
@@ -2401,7 +2441,9 @@
                     dptr = APR_BRIGADE_SENTINEL(*bb);
                 }
             }
-            else if ((tmp_dptr != NULL) && (ctx->bytes_parsed >= 
BYTE_COUNT_THRESHOLD)) {
+            else if ((tmp_dptr != NULL) &&
+                     (ctx->output_now ||
+                      (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD))) {
                                /* Send the large chunk of pre-tag 
bytes...  */
                 tag_and_after = apr_brigade_split(*bb, tmp_dptr);
                 rv = ap_pass_brigade(f->next, *bb);
@@ -2411,6 +2453,7 @@
                 *bb  = tag_and_after;
                 dptr = tmp_dptr;
                 ctx->bytes_parsed = 0;
+                ctx->output_now = 0;
             }
             else if (tmp_dptr == NULL) { /* There was no possible SSI 
tag in the */
                 dptr = APR_BRIGADE_SENTINEL(*bb);  /* remainder of this 
brigade...    */
@@ -2423,6 +2466,9 @@
              (ctx->state == PARSE_TAIL))       &&
             (dptr != APR_BRIGADE_SENTINEL(*bb))) {
             tmp_dptr = find_end_sequence(dptr, ctx, *bb);
+            if (!APR_STATUS_IS_SUCCESS(ctx->status)) {
+                return ctx->status;
+            }
 
             if (tmp_dptr != NULL) {
                 dptr = tmp_dptr;  /* Adjust bucket pos... */
@@ -2438,11 +2484,13 @@
                     APR_BRIGADE_CONCAT(ctx->ssi_tag_brigade, *bb);
                     *bb = tag_and_after;
                 }
-                else if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
+                else if (ctx->output_now ||
+                         (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD)) {
                     SPLIT_AND_PASS_PRETAG_BUCKETS(*bb, ctx, f->next, rv);
                     if (rv != APR_SUCCESS) {
                         return rv;
                     }
+                    ctx->output_now = 0;
                 }
             }
             else {
@@ -2715,6 +2763,7 @@
                 ctx->flags |= FLAG_NO_EXEC;
             }
             ctx->ssi_tag_brigade = apr_brigade_create(f->c->pool);
+            ctx->status = APR_SUCCESS;
 
             apr_cpystrn(ctx->error_str, conf->default_error_msg, 
sizeof(ctx->error_str));
             apr_cpystrn(ctx->time_str, conf->default_time_fmt, 
sizeof(ctx->time_str));






Re: [PATCH] Re: chunking of content in mod_include?

Posted by "Paul J. Reder" <re...@remulak.net>.
In addition to Ryan's comments I have the following comments...

Brian Pane wrote:
> How does this patch look?  It does the check only on bucket
> boundaries, and it pushes out the buffered content if either:
>   * the amount of buffered content is too large, or
>   * the apr_bucket_read would block.
> 
> If this technique looks sound, I can create another patch that
> does the same thing for find_end_sequence.
> 
> --Brian
> 
> Index: mod_include.c
> ===================================================================
> RCS file: /home/cvspublic/httpd-2.0/modules/filters/mod_include.c,v
> retrieving revision 1.136
> diff -u -r1.136 mod_include.c
> --- mod_include.c    2001/08/27 20:25:42    1.136
> +++ mod_include.c    2001/08/28 06:16:38
> @@ -143,9 +143,10 @@
>   * first byte of the BEGINNING_SEQUENCE (after finding a complete
> match) or it
>   * returns NULL if no match found.
>   */
> -static apr_bucket *find_start_sequence(apr_bucket *dptr, include_ctx_t
> *ctx,
> -                                      apr_bucket_brigade *bb, int
> *do_cleanup)
> +static apr_bucket *find_start_sequence(apr_bucket *dptr, ap_filter_t *f,
> +                                      apr_bucket_brigade **bb, int
> *do_cleanup)
>  {
> +    include_ctx_t *ctx = f->ctx;
>      apr_size_t len;
>      const char *c;
>      const char *buf;
> @@ -156,39 +157,43 @@
>      *do_cleanup = 0;
> 
>      do {
> +        apr_status_t rv;
> +
>          if (APR_BUCKET_IS_EOS(dptr)) {
>              break;
>          }
> -        apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
> -        /* XXX handle retcodes */
> -        if (len == 0) { /* end of pipe? */
> -            break;
> +
> +        if ((ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) &&
> +            !APR_BRIGADE_EMPTY(*bb)) {
> +            apr_bucket_brigade *remainder = apr_brigade_split(*bb, dptr);

This should check ctx->head_start_bucket. If it is non-NULL, use it instead
of dptr in the brigade split or you may be passing part of the tag along.

> +            rv = ap_pass_brigade(f->next, *bb);
> +            *bb = remainder;

You need to reset ctx->bytes_parsed to 0 here or this will trip again.

> +            if (rv != APR_SUCCESS) {
> +                return NULL;
> +            }

You may want to emulate the way the old code worked for threshold processing.
It split the bucket, but you don't *have* to do that. The important part is
what it did with the returned pointer and the context. It allowed the
code in send_parsed_content to do the actual pass_brigade. Thus the code
to split and pass the brigade existed in one common place for the
find_start_sequence and find_end_sequence code. So all you would have to 
do here is reset the context and return a pointer to the head_start_bucket.
You would do the same thing in find_end_sequence. This would limit the
duplicated code to one set of code instead of four.

>          }
> -        c = buf;
> -        while (c < buf + len) {
> -            if (ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) {
> -                apr_bucket *start_bucket;
> 
> -                if (ctx->head_start_index > 0) {
> -                    start_index  = ctx->head_start_index;
> -                    start_bucket = ctx->head_start_bucket;
> +        rv = apr_bucket_read(dptr, &buf, &len, APR_NONBLOCK_READ);
> +        if (rv == APR_EAGAIN) {
> +            if (!APR_BRIGADE_EMPTY(*bb)) {
> +                apr_bucket_brigade *remainder = apr_brigade_split(*bb,
> dptr);
> +                rv = ap_pass_brigade(f->next, *bb);
> +                *bb = remainder;

Same comments here...

> +                if (rv != APR_SUCCESS) {
> +                    return NULL;
>                  }
> -                else {
> -                    start_index  = (c - buf);
> -                    start_bucket = dptr;
> -                }
> -                apr_bucket_split(start_bucket, start_index);
> -                tmp_bkt = APR_BUCKET_NEXT(start_bucket);
> -                if (ctx->head_start_index > 0) {
> -                    ctx->head_start_index  = 0;
> -                    ctx->head_start_bucket = tmp_bkt;
> -                    ctx->parse_pos = 0;
> -                    ctx->state = PRE_HEAD;
> -                }
> -
> -                return tmp_bkt;
>              }
> +            rv = apr_bucket_read(dptr, &buf, &len, APR_BLOCK_READ);
> +        }
> +        if (rv != APR_SUCCESS) {
> +            return NULL;
> +        }
> 
> +        if (len == 0) { /* end of pipe? */
> +            break;
> +        }
> +        c = buf;
> +        while (c < buf + len) {
>              if (*c == str[ctx->parse_pos]) {
>                  if (ctx->state == PRE_HEAD) {
>                      ctx->state             = PARSE_HEAD;
> @@ -244,7 +249,7 @@
>              ctx->bytes_parsed++;
>          }
>          dptr = APR_BUCKET_NEXT(dptr);
> -    } while (dptr != APR_BRIGADE_SENTINEL(bb));
> +    } while (dptr != APR_BRIGADE_SENTINEL(*bb));
>      return NULL;
>  }
> 
> @@ -2363,7 +2368,7 @@
>              int do_cleanup = 0;
>              apr_size_t cleanup_bytes = ctx->parse_pos;
> 
> -            tmp_dptr = find_start_sequence(dptr, ctx, *bb, &do_cleanup);
> +            tmp_dptr = find_start_sequence(dptr, f, bb, &do_cleanup);
> 
>              /* The few bytes stored in the ssi_tag_brigade turned out
> not to
>               * be a tag after all. This can only happen if the starting

Re: [PATCH] Re: chunking of content in mod_include?

Posted by "Paul J. Reder" <re...@remulak.net>.
"Paul J. Reder" wrote:
> 
> Brian Pane wrote:
> > How does this patch look?  It does the check only on bucket
> > boundaries, and it pushes out the buffered content if either:
> >   * the amount of buffered content is too large, or
> >   * the apr_bucket_read would block.
> >
> > If this technique looks sound, I can create another patch that
> > does the same thing for find_end_sequence.
> >
> 
> I have some concerns about the way this is implemented. It doesn't
> look like it correctly keeps the context information when splitting
> the brigade. But the basic idea looks good.
> 
> I am looking at the code more carefully and will post another note
> or an updated patch soon.

Upon further checking I have determined that this does have the
potential problem that it may incorrectly pass along part of the tag. 

+        if ((ctx->bytes_parsed >= BYTE_COUNT_THRESHOLD) &&
+            !APR_BRIGADE_EMPTY(*bb)) {
+            apr_bucket_brigade *remainder = apr_brigade_split(*bb, dptr);
+            rv = ap_pass_brigade(f->next, *bb);
+            *bb = remainder;
+            if (rv != APR_SUCCESS) {
+                return NULL;
+            }
         }

The brigade split should be performed against ctx->head_start_bucket if
it is non-NULL. Part of the tag might be in a previous bucket, but the
current bucket is the one that throws it over the limit.

I need to go to a meeting now, but I will continue looking at this when
I get back. 

I will present all of my comments then. It still looks basically sound,
just needs a little tweaking.

Paul J. Reder

Re: [PATCH] Re: chunking of content in mod_include?

Posted by "Paul J. Reder" <re...@remulak.net>.
Brian Pane wrote:
> How does this patch look?  It does the check only on bucket
> boundaries, and it pushes out the buffered content if either:
>   * the amount of buffered content is too large, or
>   * the apr_bucket_read would block.
> 
> If this technique looks sound, I can create another patch that
> does the same thing for find_end_sequence.
> 

I have some concerns about the way this is implemented. It doesn't
look like it correctly keeps the context information when splitting
the brigade. But the basic idea looks good.

I am looking at the code more carefully and will post another note
or an updated patch soon.

Paul J. Reder