You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2002/10/31 00:56:59 UTC

SSL Input Filter bogosity

I spent alot of time over the last several days trying to shoot down the
obscure bugs in SSL lockup.  Some might have to do with the DBM and
apr_global_mutex bugs.  Others weren't so obvious ("Spurious Interrupt,
perhaps one of those OpenSSL confusions?")

The answer is we rarely know which end is up with the existing input
SSL filter.  We aren't tracking our own APR result codes correctly,
and nothing percolates properly.  We don't look (today) correctly at
the failure and retry cases.

The patch attached is a rewrite of the SSL input filtering.  Please take
a close look at the new input filter.  I've collapsed two functions that
didn't need to be split so it's easier to follow, now.

Bill

Re: SSL Input Filter bogosity

Posted by Justin Erenkrantz <je...@apache.org>.
--On Thursday, October 31, 2002 12:19 AM -0600 "William A. Rowe, Jr." 
<wr...@apache.org> wrote:

> OpenSSL 0.9.6g does so.  Why shouldn't we?

Because OpenSSL is a library, we're not.

> However, if we don't have inl worth of bytes, and they are sitting
> ready (on the socket) shouldn't we fetch them?  Forget the GETLINE
> bogosity, it means nothing to SSL.

No.  I think you are missing the point of having this code understand 
AP_MODE_GETLINE.  When we try to read a line, we don't have a way of 
really knowing how much to read.  So, we try to read a maximum of 8k 
(whatever AP_IOBUFSIZE is).  That merely defines what our maximum 
line length is (actually that is the maximum that mod_ssl will ever 
return on a read).  The generic read code doesn't have logic for 
determining when to stop.  That code is only contained in the logic 
that understands AP_MODE_GETLINE (does the memchr call).

Yet, in all probability, the chances are that the line is going to be 
very short (consider HTTP headers).  Therefore, as a critical 
optimization, we don't necessarily want to read the full 8k, but we 
want to try to see if we have enough with what we already have.  So, 
when we read anything with AP_MODE_GETLINE, we should exit out of our 
generic read call and then check to see if we've satisfied the 
getline requirements.

Removing the short-circuits is going to make all AP_MODE_GETLINE 
calls block for 8k of data every time.  That's going to be 
unacceptable when we're reading the headers (which is a blocking 
read) as we may end up reading too much data and we will end up 
reading past the headers (which commonly take less than 8k of data). 
When reading from the SSL socket generically, AP_MODE_GETLINE calls 
can only safely handle the output of one socket read, and I believe, 
if there was any data left over from the last read, we should attempt 
to check if that was a line as well (again, chances are that it is 
when called with AP_MODE_GETLINE).  Once we have the output of that 
socket read, the getline-aware logic can then determine if it is 
enough (it saw the LF), or to read some more (given the AP_IOBUFSIZE 
constraint).  Doing blind blocking as your patch does can't work.

Therefore, I believe these AP_MODE_GETLINE optimizations must stay in.

As a hunch, I would believe it would be better to return partial data 
up the filter chain sooner rather than waiting for the entirety even 
for AP_MODE_READBYTES calls as well.  There is no assumption that a 
filter *must* return all of the bytes it was requested.  On a 
blocking read, the only guarantee is that it should return at least 
one byte.  If we're really waiting around for the network, I think 
it'd be best to compute what we already have and then once we're 
ready, try to read from the network again.  Of course, I have no 
numbers to prove this thought, but I wouldn't be shocked if it is.

>> We should do a conditional on the APR_BRIGADE_EMPTY() check if
>> inbio->block is non-blocking.  It's considered a design error if a
>> filter returns an empty brigade on a blocking call.
>
> Who said we are blocking?  This could be a SPECULATIVE call
> with a NONBLOCKING request, no?

Huh?  We have the socket mode as we are passing that parameter to 
ap_get_brigade.  So, of course, we know if we are blocking.  Again, 
the common case is that we are blocking.

>> Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?
>> Not sure here.  Perhaps.
>
> Hmmm.  I was thinking about the METADATA case.
>
> Do you suppose we should percolate METADATA buckets back out
> to the filter_read of SSL?  I suppose metadata should just go
> through unharmed.

I think so.  We really don't have a case where METADATA originates 
from the core, but it's possible.

> However, we have to react to EOS, since that's the end of the input
> available to the SSL pump.

Yes, as a special case of METADATA.

> Getline means nothing in the context of fetching bytes off of
> an SSL socket.  We must shove the raw bytes into the SSL
> pump in order to return any sort of data (SPECULATIVE, READBYTES,
> GETLINE, etc) from the SSL pump.  The raw bytes just need
> to be fetched.
>
> Of course, it's generally nonblocking, so if we don't get a hit
> from the socket, the new code just returns whatever we got.

No, it's not generally non-blocking (see ap_rgetline_core for the 
initial call to ap_get_brigade - it is blocking).  So, attempting to 
read the full 8k when we don't have that available will not work. 
Again, the AP_MODE_GETLINE is important even within the generic 
reading of the code - see above.

> No, there really isn't.  It was impossible to look at inbio.rv when
> we needed too, since ctx wasn't passed, but the SSL ctx.

Then, change the function parameters.  I'm just concerned that we're 
going to be substantially adding to another function.  mod_ssl has a 
lot of places where functions go on and on and on and on and on. 
Breaking it into little pieces isn't that bad of an idea - this patch 
is making a function considerably longer and more complicated.

If you want to do something *really* productive with mod_ssl to help 
it be more understandable, please go and change those bizzaro 
function names to actually mean something.  ssl_io_filter_Input. 
Gah.  -- justin

Re: SSL Input Filter bogosity

Posted by "William A. Rowe, Jr." <wr...@apache.org>.
At 10:42 PM 10/30/2002, Justin Erenkrantz wrote:
>--On Wednesday, October 30, 2002 5:56 PM -0600 "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:
>
>>@@ -358,6 +357,12 @@
>>     SSLConnRec *sslconn = myConnConfig(inbio->f->c);
>>     int len = 0;
>>
>>+    inbio->rc = APR_SUCCESS;
>>+
>>+    /* OpenSSL catches this case, so should we. */
>>+    if (!in)
>>+        return 0;
>>+
>>     /* XXX: flush here only required for SSLv2;
>>      * OpenSSL calls BIO_flush() at the appropriate times for
>>      * the other protocols.
>
>Why are we attempting to catch NULL pointer exceptions?  I don't believe that there are any cases where our code could have a NULL ctx->buffer as those are well-defined char arrays.  Within mod_ssl, there shouldn't be any other entry points to the OpenSSL read calls but by entering this path with ctx->buffer, so I don't see why we should bother.  (Yes, photons could hit, so what?)

OpenSSL 0.9.6g does so.  Why shouldn't we?

>>@@ -366,11 +371,11 @@
>>         BIO_bucket_flush(inbio->wbio);
>>     }
>>
>>-    inbio->rc = APR_SUCCESS;
>>-
>>+    BIO_clear_retry_flags(bio);
>>+
>>     /* first use data already read from socket if any */
>>     if ((len = char_buffer_read(&inbio->cbuf, in, inl))) {
>>-        if ((len <= inl) || inbio->mode == AP_MODE_GETLINE) {
>>+        if (len >= inl) {
>>             return len;
>>         }
>>         inl -= len;
>
>Hmm, what is the point of this conditional change?  char_buffer_read shouldn't be returning anything larger than inl.  To me, it actually sounds that this segment should just be returning len all the time (as that conditional was *always* true).  No need for a conditional here.  If there's anything that we've already read, just return it. Clear out that buffer - if the user still wants more, then they can call again.  (Due to our structure, we're not sure if the caller really wants all of inl or not - almost certainly not for AP_MODE_GETLINE.)

However, if we don't have inl worth of bytes, and they are sitting
ready (on the socket) shouldn't we fetch them?  Forget the GETLINE
bogosity, it means nothing to SSL.

>>@@ -396,21 +399,58 @@
>>                                        AP_MODE_READBYTES, 
>inbio->block,
>>                                        inl);
>>
>>-            if ((inbio->rc != APR_SUCCESS) 
>||APR_BRIGADE_EMPTY(inbio->bb))
>>-            {
>>+            if (APR_STATUS_IS_EAGAIN(inbio->rc)
>>+                    || APR_STATUS_IS_EINTR(inbio->rc)) {
>>+                break;
>>+            }
>>+
>>+            if (inbio->rc != APR_SUCCESS) {
>>+                /* Unexpected errors discard the brigade */
>>+                apr_brigade_cleanup(inbio->bb);
>>+                inbio->bb = NULL;
>>+                return -1;
>>+            }
>>+
>>+            if (APR_BRIGADE_EMPTY(inbio->bb)) {
>>+                /* Treat an empty brigade as a retry case */
>>                 break;
>>             }
>>         }
>
>We should do a conditional on the APR_BRIGADE_EMPTY() check if inbio->block is non-blocking.  It's considered a design error if a filter returns an empty brigade on a blocking call.

Who said we are blocking?  This could be a SPECULATIVE call
with a NONBLOCKING request, no?

>>-        inbio->bucket = APR_BRIGADE_FIRST(inbio->bb);
>>+        bucket = APR_BRIGADE_FIRST(inbio->bb);
>>+
>>+        if (APR_BUCKET_IS_EOS(bucket)) {
>>+            if (len) {
>>+                /* Leave this EOS on the brigade */
>>+                break;
>>+            }
>>+            /* Consume the EOS and return 0, we already
>>+             * reset the retry flag above
>>+             */
>>+            apr_brigade_cleanup(inbio->bb);
>>+            inbio->bb = NULL;
>>+            return 0;
>>+        }
>
>Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?  Not sure here.  Perhaps.

Hmmm.  I was thinking about the METADATA case.

Do you suppose we should percolate METADATA buckets back out
to the filter_read of SSL?  I suppose metadata should just go through
unharmed.

However, we have to react to EOS, since that's the end of the input
available to the SSL pump.

>>-        if (inbio->mode == AP_MODE_GETLINE) {
>>-            /* only read from the socket once in getline mode.
>>-             * since callers buffer size is likely much larger than
>>-             * the request headers.  caller can always come back 
>for more
>>-             * if first read didn't get all the headers.
>>-             */
>>-            break;
>>-        }
>
>Again, what is the rationale for removing this?  We don't have a predetermined size for a GETLINE call - well, not 100% true, it sort of ends up being sizeof(ctx->buffer).  Yet, I think that should at most translate to only one socket read.  Not sure why we would want to do a full 8k read all the time.  The hope is that on a GETLINE mode, the odds are that the lines are really short - one socket call should be sufficient to get mulitple lines.  (Furthermore, we can probably 
>read from the buffer since multiple headers will be read at once!)

Getline means nothing in the context of fetching bytes off of
an SSL socket.  We must shove the raw bytes into the SSL
pump in order to return any sort of data (SPECULATIVE, READBYTES,
GETLINE, etc) from the SSL pump.  The raw bytes just need
to be fetched.

Of course, it's generally nonblocking, so if we don't get a hit
from the socket, the new code just returns whatever we got.


>Also, I'm not totally sold on the collasping of ssl_io_hook_read.  I think there's an advantage to creating a fine line between the OpenSSL calls and our more abstract calls.  But, whatever, no big deal.  I think shorter, concise functions are better...  -- justin

No, there really isn't.  It was impossible to look at inbio.rv when
we needed too, since ctx wasn't passed, but the SSL ctx.

We really need to watch the EAGAIN cases.  Right now, that is
what was most borked.  Problems in SSL only happen with some
reasonably busy traffic, interrupt and the EAGAIN situations.  If you
don't flood SSL, you won't see any problems today.

Bill


Re: SSL Input Filter bogosity

Posted by Justin Erenkrantz <je...@apache.org>.
--On Wednesday, October 30, 2002 5:56 PM -0600 "William A. Rowe, Jr." 
<wr...@rowe-clan.net> wrote:

> @@ -358,6 +357,12 @@
>      SSLConnRec *sslconn = myConnConfig(inbio->f->c);
>      int len = 0;
>
> +    inbio->rc = APR_SUCCESS;
> +
> +    /* OpenSSL catches this case, so should we. */
> +    if (!in)
> +        return 0;
> +
>      /* XXX: flush here only required for SSLv2;
>       * OpenSSL calls BIO_flush() at the appropriate times for
>       * the other protocols.

Why are we attempting to catch NULL pointer exceptions?  I don't 
believe that there are any cases where our code could have a NULL 
ctx->buffer as those are well-defined char arrays.  Within mod_ssl, 
there shouldn't be any other entry points to the OpenSSL read calls 
but by entering this path with ctx->buffer, so I don't see why we 
should bother.  (Yes, photons could hit, so what?)

> @@ -366,11 +371,11 @@
>          BIO_bucket_flush(inbio->wbio);
>      }
>
> -    inbio->rc = APR_SUCCESS;
> -
> +    BIO_clear_retry_flags(bio);
> +
>      /* first use data already read from socket if any */
>      if ((len = char_buffer_read(&inbio->cbuf, in, inl))) {
> -        if ((len <= inl) || inbio->mode == AP_MODE_GETLINE) {
> +        if (len >= inl) {
>              return len;
>          }
>          inl -= len;

Hmm, what is the point of this conditional change?  char_buffer_read 
shouldn't be returning anything larger than inl.  To me, it actually 
sounds that this segment should just be returning len all the time 
(as that conditional was *always* true).  No need for a conditional 
here.  If there's anything that we've already read, just return it. 
Clear out that buffer - if the user still wants more, then they can 
call again.  (Due to our structure, we're not sure if the caller 
really wants all of inl or not - almost certainly not for 
AP_MODE_GETLINE.)

> @@ -396,21 +399,58 @@
>                                         AP_MODE_READBYTES, 
inbio->block,
>                                         inl);
>
> -            if ((inbio->rc != APR_SUCCESS) 
||APR_BRIGADE_EMPTY(inbio->bb))
> -            {
> +            if (APR_STATUS_IS_EAGAIN(inbio->rc)
> +                    || APR_STATUS_IS_EINTR(inbio->rc)) {
> +                break;
> +            }
> +
> +            if (inbio->rc != APR_SUCCESS) {
> +                /* Unexpected errors discard the brigade */
> +                apr_brigade_cleanup(inbio->bb);
> +                inbio->bb = NULL;
> +                return -1;
> +            }
> +
> +            if (APR_BRIGADE_EMPTY(inbio->bb)) {
> +                /* Treat an empty brigade as a retry case */
>                  break;
>              }
>          }

We should do a conditional on the APR_BRIGADE_EMPTY() check if 
inbio->block is non-blocking.  It's considered a design error if a 
filter returns an empty brigade on a blocking call.

> -        inbio->bucket = APR_BRIGADE_FIRST(inbio->bb);
> +        bucket = APR_BRIGADE_FIRST(inbio->bb);
> +
> +        if (APR_BUCKET_IS_EOS(bucket)) {
> +            if (len) {
> +                /* Leave this EOS on the brigade */
> +                break;
> +            }
> +            /* Consume the EOS and return 0, we already
> +             * reset the retry flag above
> +             */
> +            apr_brigade_cleanup(inbio->bb);
> +            inbio->bb = NULL;
> +            return 0;
> +        }

Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?  Not 
sure here.  Perhaps.

> -        if (inbio->mode == AP_MODE_GETLINE) {
> -            /* only read from the socket once in getline mode.
> -             * since callers buffer size is likely much larger than
> -             * the request headers.  caller can always come back 
for more
> -             * if first read didn't get all the headers.
> -             */
> -            break;
> -        }

Again, what is the rationale for removing this?  We don't have a 
predetermined size for a GETLINE call - well, not 100% true, it sort 
of ends up being sizeof(ctx->buffer).  Yet, I think that should at 
most translate to only one socket read.  Not sure why we would want 
to do a full 8k read all the time.  The hope is that on a GETLINE 
mode, the odds are that the lines are really short - one socket call 
should be sufficient to get mulitple lines.  (Furthermore, we can 
probably read from the buffer since multiple headers will be read at 
once!)

Also, I'm not totally sold on the collasping of ssl_io_hook_read.  I 
think there's an advantage to creating a fine line between the 
OpenSSL calls and our more abstract calls.  But, whatever, no big 
deal.  I think shorter, concise functions are better...  -- justin