You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by "William A. Rowe, Jr." <wr...@rowe-clan.net> on 2002/10/31 00:56:59 UTC
SSL Input Filter bogosity
I spent alot of time over the last several days trying to shoot down the
obscure bugs in SSL lockup. Some might have to do with the DBM and
apr_global_mutex bugs. Others weren't so obvious ("Spurious Interrupt,
perhaps one of those OpenSSL confusions?")
The answer is we rarely know which end is up with the existing input
SSL filter. We aren't tracking our own APR result codes correctly,
and nothing percolates properly. We don't look (today) correctly at
the failure and retry cases.
The patch attached is a rewrite of the SSL input filtering. Please take
a close look at the new input filter. I've collapsed two functions that
didn't need to be split so it's easier to follow, now.
Bill
Re: SSL Input Filter bogosity
Posted by Justin Erenkrantz <je...@apache.org>.
--On Thursday, October 31, 2002 12:19 AM -0600 "William A. Rowe, Jr."
<wr...@apache.org> wrote:
> OpenSSL 0.9.6g does so. Why shouldn't we?
Because OpenSSL is a library, we're not.
> However, if we don't have inl worth of bytes, and they are sitting
> ready (on the socket) shouldn't we fetch them? Forget the GETLINE
> bogosity, it means nothing to SSL.
No. I think you are missing the point of having this code understand
AP_MODE_GETLINE. When we try to read a line, we don't have a way of
really knowing how much to read. So, we try to read a maximum of 8k
(whatever AP_IOBUFSIZE is). That merely defines what our maximum
line length is (actually that is the maximum that mod_ssl will ever
return on a read). The generic read code doesn't have logic for
determining when to stop. That code is only contained in the logic
that understands AP_MODE_GETLINE (does the memchr call).
Yet, in all probability, the chances are that the line is going to be
very short (consider HTTP headers). Therefore, as a critical
optimization, we don't necessarily want to read the full 8k, but we
want to try to see if we have enough with what we already have. So,
when we read anything with AP_MODE_GETLINE, we should exit out of our
generic read call and then check to see if we've satisfied the
getline requirements.
Removing the short-circuits is going to make all AP_MODE_GETLINE
calls block for 8k of data every time. That's going to be
unacceptable when we're reading the headers (which is a blocking
read) as we may end up reading too much data and we will end up
reading past the headers (which commonly take less than 8k of data).
When reading from the SSL socket generically, AP_MODE_GETLINE calls
can only safely handle the output of one socket read, and I believe,
if there was any data left over from the last read, we should attempt
to check if that was a line as well (again, chances are that it is
when called with AP_MODE_GETLINE). Once we have the output of that
socket read, the getline-aware logic can then determine if it is
enough (it saw the LF), or to read some more (given the AP_IOBUFSIZE
constraint). Doing blind blocking as your patch does can't work.
Therefore, I believe these AP_MODE_GETLINE optimizations must stay in.
As a hunch, I would believe it would be better to return partial data
up the filter chain sooner rather than waiting for the entirety even
for AP_MODE_READBYTES calls as well. There is no assumption that a
filter *must* return all of the bytes it was requested. On a
blocking read, the only guarantee is that it should return at least
one byte. If we're really waiting around for the network, I think
it'd be best to compute what we already have and then once we're
ready, try to read from the network again. Of course, I have no
numbers to prove this thought, but I wouldn't be shocked if it is.
>> We should do a conditional on the APR_BRIGADE_EMPTY() check if
>> inbio->block is non-blocking. It's considered a design error if a
>> filter returns an empty brigade on a blocking call.
>
> Who said we are blocking? This could be a SPECULATIVE call
> with a NONBLOCKING request, no?
Huh? We have the socket mode as we are passing that parameter to
ap_get_brigade. So, of course, we know if we are blocking. Again,
the common case is that we are blocking.
>> Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA?
>> Not sure here. Perhaps.
>
> Hmmm. I was thinking about the METADATA case.
>
> Do you suppose we should percolate METADATA buckets back out
> to the filter_read of SSL? I suppose metadata should just go
> through unharmed.
I think so. We really don't have a case where METADATA originates
from the core, but it's possible.
> However, we have to react to EOS, since that's the end of the input
> available to the SSL pump.
Yes, as a special case of METADATA.
> Getline means nothing in the context of fetching bytes off of
> an SSL socket. We must shove the raw bytes into the SSL
> pump in order to return any sort of data (SPECULATIVE, READBYTES,
> GETLINE, etc) from the SSL pump. The raw bytes just need
> to be fetched.
>
> Of course, it's generally nonblocking, so if we don't get a hit
> from the socket, the new code just returns whatever we got.
No, it's not generally non-blocking (see ap_rgetline_core for the
initial call to ap_get_brigade - it is blocking). So, attempting to
read the full 8k when we don't have that available will not work.
Again, the AP_MODE_GETLINE is important even within the generic
reading of the code - see above.
> No, there really isn't. It was impossible to look at inbio.rv when
> we needed too, since ctx wasn't passed, but the SSL ctx.
Then, change the function parameters. I'm just concerned that we're
going to be substantially adding to another function. mod_ssl has a
lot of places where functions go on and on and on and on and on.
Breaking it into little pieces isn't that bad of an idea - this patch
is making a function considerably longer and more complicated.
If you want to do something *really* productive with mod_ssl to help
it be more understandable, please go and change those bizzaro
function names to actually mean something. ssl_io_filter_Input.
Gah. -- justin
Re: SSL Input Filter bogosity
Posted by "William A. Rowe, Jr." <wr...@apache.org>.
At 10:42 PM 10/30/2002, Justin Erenkrantz wrote:
>--On Wednesday, October 30, 2002 5:56 PM -0600 "William A. Rowe, Jr." <wr...@rowe-clan.net> wrote:
>
>>@@ -358,6 +357,12 @@
>> SSLConnRec *sslconn = myConnConfig(inbio->f->c);
>> int len = 0;
>>
>>+ inbio->rc = APR_SUCCESS;
>>+
>>+ /* OpenSSL catches this case, so should we. */
>>+ if (!in)
>>+ return 0;
>>+
>> /* XXX: flush here only required for SSLv2;
>> * OpenSSL calls BIO_flush() at the appropriate times for
>> * the other protocols.
>
>Why are we attempting to catch NULL pointer exceptions? I don't believe that there are any cases where our code could have a NULL ctx->buffer as those are well-defined char arrays. Within mod_ssl, there shouldn't be any other entry points to the OpenSSL read calls but by entering this path with ctx->buffer, so I don't see why we should bother. (Yes, photons could hit, so what?)
OpenSSL 0.9.6g does so. Why shouldn't we?
>>@@ -366,11 +371,11 @@
>> BIO_bucket_flush(inbio->wbio);
>> }
>>
>>- inbio->rc = APR_SUCCESS;
>>-
>>+ BIO_clear_retry_flags(bio);
>>+
>> /* first use data already read from socket if any */
>> if ((len = char_buffer_read(&inbio->cbuf, in, inl))) {
>>- if ((len <= inl) || inbio->mode == AP_MODE_GETLINE) {
>>+ if (len >= inl) {
>> return len;
>> }
>> inl -= len;
>
>Hmm, what is the point of this conditional change? char_buffer_read shouldn't be returning anything larger than inl. To me, it actually sounds that this segment should just be returning len all the time (as that conditional was *always* true). No need for a conditional here. If there's anything that we've already read, just return it. Clear out that buffer - if the user still wants more, then they can call again. (Due to our structure, we're not sure if the caller really wants all of inl or not - almost certainly not for AP_MODE_GETLINE.)
However, if we don't have inl worth of bytes, and they are sitting
ready (on the socket) shouldn't we fetch them? Forget the GETLINE
bogosity, it means nothing to SSL.
>>@@ -396,21 +399,58 @@
>> AP_MODE_READBYTES,
>inbio->block,
>> inl);
>>
>>- if ((inbio->rc != APR_SUCCESS)
>||APR_BRIGADE_EMPTY(inbio->bb))
>>- {
>>+ if (APR_STATUS_IS_EAGAIN(inbio->rc)
>>+ || APR_STATUS_IS_EINTR(inbio->rc)) {
>>+ break;
>>+ }
>>+
>>+ if (inbio->rc != APR_SUCCESS) {
>>+ /* Unexpected errors discard the brigade */
>>+ apr_brigade_cleanup(inbio->bb);
>>+ inbio->bb = NULL;
>>+ return -1;
>>+ }
>>+
>>+ if (APR_BRIGADE_EMPTY(inbio->bb)) {
>>+ /* Treat an empty brigade as a retry case */
>> break;
>> }
>> }
>
>We should do a conditional on the APR_BRIGADE_EMPTY() check if inbio->block is non-blocking. It's considered a design error if a filter returns an empty brigade on a blocking call.
Who said we are blocking? This could be a SPECULATIVE call
with a NONBLOCKING request, no?
>>- inbio->bucket = APR_BRIGADE_FIRST(inbio->bb);
>>+ bucket = APR_BRIGADE_FIRST(inbio->bb);
>>+
>>+ if (APR_BUCKET_IS_EOS(bucket)) {
>>+ if (len) {
>>+ /* Leave this EOS on the brigade */
>>+ break;
>>+ }
>>+ /* Consume the EOS and return 0, we already
>>+ * reset the retry flag above
>>+ */
>>+ apr_brigade_cleanup(inbio->bb);
>>+ inbio->bb = NULL;
>>+ return 0;
>>+ }
>
>Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA? Not sure here. Perhaps.
Hmmm. I was thinking about the METADATA case.
Do you suppose we should percolate METADATA buckets back out
to the filter_read of SSL? I suppose metadata should just go through
unharmed.
However, we have to react to EOS, since that's the end of the input
available to the SSL pump.
>>- if (inbio->mode == AP_MODE_GETLINE) {
>>- /* only read from the socket once in getline mode.
>>- * since callers buffer size is likely much larger than
>>- * the request headers. caller can always come back
>for more
>>- * if first read didn't get all the headers.
>>- */
>>- break;
>>- }
>
>Again, what is the rationale for removing this? We don't have a predetermined size for a GETLINE call - well, not 100% true, it sort of ends up being sizeof(ctx->buffer). Yet, I think that should at most translate to only one socket read. Not sure why we would want to do a full 8k read all the time. The hope is that on a GETLINE mode, the odds are that the lines are really short - one socket call should be sufficient to get mulitple lines. (Furthermore, we can probably
>read from the buffer since multiple headers will be read at once!)
Getline means nothing in the context of fetching bytes off of
an SSL socket. We must shove the raw bytes into the SSL
pump in order to return any sort of data (SPECULATIVE, READBYTES,
GETLINE, etc) from the SSL pump. The raw bytes just need
to be fetched.
Of course, it's generally nonblocking, so if we don't get a hit
from the socket, the new code just returns whatever we got.
>Also, I'm not totally sold on the collasping of ssl_io_hook_read. I think there's an advantage to creating a fine line between the OpenSSL calls and our more abstract calls. But, whatever, no big deal. I think shorter, concise functions are better... -- justin
No, there really isn't. It was impossible to look at inbio.rv when
we needed too, since ctx wasn't passed, but the SSL ctx.
We really need to watch the EAGAIN cases. Right now, that is
what was most borked. Problems in SSL only happen with some
reasonably busy traffic, interrupt and the EAGAIN situations. If you
don't flood SSL, you won't see any problems today.
Bill
Re: SSL Input Filter bogosity
Posted by Justin Erenkrantz <je...@apache.org>.
--On Wednesday, October 30, 2002 5:56 PM -0600 "William A. Rowe, Jr."
<wr...@rowe-clan.net> wrote:
> @@ -358,6 +357,12 @@
> SSLConnRec *sslconn = myConnConfig(inbio->f->c);
> int len = 0;
>
> + inbio->rc = APR_SUCCESS;
> +
> + /* OpenSSL catches this case, so should we. */
> + if (!in)
> + return 0;
> +
> /* XXX: flush here only required for SSLv2;
> * OpenSSL calls BIO_flush() at the appropriate times for
> * the other protocols.
Why are we attempting to catch NULL pointer exceptions? I don't
believe that there are any cases where our code could have a NULL
ctx->buffer as those are well-defined char arrays. Within mod_ssl,
there shouldn't be any other entry points to the OpenSSL read calls
but by entering this path with ctx->buffer, so I don't see why we
should bother. (Yes, photons could hit, so what?)
> @@ -366,11 +371,11 @@
> BIO_bucket_flush(inbio->wbio);
> }
>
> - inbio->rc = APR_SUCCESS;
> -
> + BIO_clear_retry_flags(bio);
> +
> /* first use data already read from socket if any */
> if ((len = char_buffer_read(&inbio->cbuf, in, inl))) {
> - if ((len <= inl) || inbio->mode == AP_MODE_GETLINE) {
> + if (len >= inl) {
> return len;
> }
> inl -= len;
Hmm, what is the point of this conditional change? char_buffer_read
shouldn't be returning anything larger than inl. To me, it actually
sounds that this segment should just be returning len all the time
(as that conditional was *always* true). No need for a conditional
here. If there's anything that we've already read, just return it.
Clear out that buffer - if the user still wants more, then they can
call again. (Due to our structure, we're not sure if the caller
really wants all of inl or not - almost certainly not for
AP_MODE_GETLINE.)
> @@ -396,21 +399,58 @@
> AP_MODE_READBYTES,
inbio->block,
> inl);
>
> - if ((inbio->rc != APR_SUCCESS)
||APR_BRIGADE_EMPTY(inbio->bb))
> - {
> + if (APR_STATUS_IS_EAGAIN(inbio->rc)
> + || APR_STATUS_IS_EINTR(inbio->rc)) {
> + break;
> + }
> +
> + if (inbio->rc != APR_SUCCESS) {
> + /* Unexpected errors discard the brigade */
> + apr_brigade_cleanup(inbio->bb);
> + inbio->bb = NULL;
> + return -1;
> + }
> +
> + if (APR_BRIGADE_EMPTY(inbio->bb)) {
> + /* Treat an empty brigade as a retry case */
> break;
> }
> }
We should do a conditional on the APR_BRIGADE_EMPTY() check if
inbio->block is non-blocking. It's considered a design error if a
filter returns an empty brigade on a blocking call.
> - inbio->bucket = APR_BRIGADE_FIRST(inbio->bb);
> + bucket = APR_BRIGADE_FIRST(inbio->bb);
> +
> + if (APR_BUCKET_IS_EOS(bucket)) {
> + if (len) {
> + /* Leave this EOS on the brigade */
> + break;
> + }
> + /* Consume the EOS and return 0, we already
> + * reset the retry flag above
> + */
> + apr_brigade_cleanup(inbio->bb);
> + inbio->bb = NULL;
> + return 0;
> + }
Should the APR_BUCKET_IS_EOS rather be APR_BUCKET_IS_METADATA? Not
sure here. Perhaps.
> - if (inbio->mode == AP_MODE_GETLINE) {
> - /* only read from the socket once in getline mode.
> - * since callers buffer size is likely much larger than
> - * the request headers. caller can always come back
for more
> - * if first read didn't get all the headers.
> - */
> - break;
> - }
Again, what is the rationale for removing this? We don't have a
predetermined size for a GETLINE call - well, not 100% true, it sort
of ends up being sizeof(ctx->buffer). Yet, I think that should at
most translate to only one socket read. Not sure why we would want
to do a full 8k read all the time. The hope is that on a GETLINE
mode, the odds are that the lines are really short - one socket call
should be sufficient to get mulitple lines. (Furthermore, we can
probably read from the buffer since multiple headers will be read at
once!)
Also, I'm not totally sold on the collasping of ssl_io_hook_read. I
think there's an advantage to creating a fine line between the
OpenSSL calls and our more abstract calls. But, whatever, no big
deal. I think shorter, concise functions are better... -- justin