You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by GitBox <gi...@apache.org> on 2019/06/29 01:26:19 UTC

[GitHub] [tomcat] alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize

alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize
URL: https://github.com/apache/tomcat/pull/176
 
 
   On malformed requests, checkNormalize would throw an ArrayIndexOutOfBoundsException leading to a 500 response. This change fixes checkNormalize to return false instead of throwing exception on those inputs, and adds a few tests to check the new functionality.
   
   For the record, the exception is below:
   
   ```
   java.lang.ArrayIndexOutOfBoundsException: -1
           at org.apache.catalina.connector.CoyoteAdapter.checkNormalize(CoyoteAdapter.java:1275) ~[tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.catalina.connector.CoyoteAdapter.postParseRequest(CoyoteAdapter.java:647) ~[tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:337) ~[tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:408) ~[tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66) [tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:836) [tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1747) [tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49) [tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_212]
           at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_212]
           at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61) [tomcat-embed-core-9.0.19.jar!/:9.0.19]
           at java.lang.Thread.run(Thread.java:748) [na:1.8.0_212]
   ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: [GitHub] [tomcat] alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize

Posted by Mark Thomas <ma...@apache.org>.
On 29/06/2019 21:01, Mark Thomas wrote:
> On 29/06/2019 02:26, GitBox wrote:
>> alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize
>> URL: https://github.com/apache/tomcat/pull/176
>>  
>>  
>>    On malformed requests, checkNormalize would throw an ArrayIndexOutOfBoundsException leading to a 500 response. This change fixes checkNormalize to return false instead of throwing exception on those inputs, and adds a few tests to check the new functionality.
> 
> That there are URI sequences that can trigger this is certainly
> something that we need to fix.
> 
> On a slightly different topic, this got me thinking.>
> checkNormalize() is a test that runs on every request. It is essentially
> there to ensure that whatever character set has been specified for the
> Connector is "safe". In this case "safe" means when the bytes are
> converted to characters we don't get any unexpected "." or "/"
> characters in the result that make the character version of the URI
> non-normalized.
> 
> Rather than run this test on every request, we could:
> - pre-calculate the character sets we consider to be safe
> - throw an exception if the user attempts to configure the Connector to
>   use one
> 
> I think we could safely exclude any character set where any of the
> following were not true:
> 
> - 0x00 <-> '\u0000'
> - 0x2e <-> '.'
> - 0x2f <-> '/'
> - 0x5c <-> '\'
> 
> We could create a unit test that maintains/checks the list of "safe"
> character set canonical names. After creating the character set in
> Connector.setURIEncoding(), if the canonical name of the resulting
> character set is not in the safe list, throw an exception.
> 
> Then remove checkNormalize() and replace with a comment that explains
> why the conversion is known to be safe.
> 
> There are several loops through the URI in checkNormalize(). I think -
> I'd need to test to confirm - that removing them would provide a
> measurable performance benefit.
> 
> Thoughts?

I thought about this some more over the weekend and I think it is a good
idea in theory but a bad idea in practice.

checkNormalize() serves a purpose that wasn't addressed in my comments
above. Broken decoders. There is a history of issues with Java's UTF-8
decoder that could be exploited in the way checkNormalize() is designed
to protect against. While I am reasonably confident Java's UTF-8 decoder
is now safe, I am not confident that every Java decoder is bug free.
Therefore, I think the checkNormalize() call needs to be retained for
every request.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: [GitHub] [tomcat] alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

Mark,

On 6/29/19 16:01, Mark Thomas wrote:
> On 29/06/2019 02:26, GitBox wrote:
>> alpire opened a new pull request #176: CoyoteAdapter: fix
>> out-of-bounds read in checkNormalize URL:
>> https://github.com/apache/tomcat/pull/176
>> 
>> 
>> On malformed requests, checkNormalize would throw an
>> ArrayIndexOutOfBoundsException leading to a 500 response. This
>> change fixes checkNormalize to return false instead of throwing
>> exception on those inputs, and adds a few tests to check the new
>> functionality.
> 
> That there are URI sequences that can trigger this is certainly 
> something that we need to fix.
> 
> On a slightly different topic, this got me thinking.
> 
> checkNormalize() is a test that runs on every request. It is
> essentially there to ensure that whatever character set has been
> specified for the Connector is "safe". In this case "safe" means
> when the bytes are converted to characters we don't get any
> unexpected "." or "/" characters in the result that make the
> character version of the URI non-normalized.
> 
> Rather than run this test on every request, we could: -
> pre-calculate the character sets we consider to be safe - throw an
> exception if the user attempts to configure the Connector to use
> one
> 
> I think we could safely exclude any character set where any of the 
> following were not true:
> 
> - 0x00 <-> '\u0000' - 0x2e <-> '.' - 0x2f <-> '/' - 0x5c <-> '\'
> 
> We could create a unit test that maintains/checks the list of
> "safe" character set canonical names. After creating the character
> set in Connector.setURIEncoding(), if the canonical name of the
> resulting character set is not in the safe list, throw an
> exception.
> 
> Then remove checkNormalize() and replace with a comment that
> explains why the conversion is known to be safe.
> 
> There are several loops through the URI in checkNormalize(). I
> think - I'd need to test to confirm - that removing them would
> provide a measurable performance benefit.

I'm a little worried about edge cases that might cause a misread of a
multibyte character into several single-byte characters that aren't
supposed to be treated as such. Remember that an attacker is not
obligated to follow the rules of sane e.g. UTF-8 or SHIFT-JS or
whatever. They can claim one encoding and then send complete garbage
down the line. We need to remain safe in all cases.

So I think checking the string AFTER decoding is the only truly safe
way to operate.

Note that I can't think of a specific way to break this off the top of
my head.

What kind of performance benefit are you looking at, here? Certainly,
every bit helps, but is the wire-to-servlet pipeline really that long
at this point?

- -chris
-----BEGIN PGP SIGNATURE-----
Comment: Using GnuPG with Thunderbird - https://www.enigmail.net/

iQIzBAEBCAAdFiEEMmKgYcQvxMe7tcJcHPApP6U8pFgFAl0aIY8ACgkQHPApP6U8
pFhzrBAAwpIKVsiUIJiEQIEMsioeBSyShut/jncTjYNe1+JpdDWySDgQtuv6rzJm
ImMsi7/Iutm+OTvPyX/0ErkueYTwsdQTMS3dAGnyC+naViLQ+/C5XsOLIisHJwUy
suDJsVl5V0DbslXauhNiy+tsMzA8ipx87OneYO7gBMjcurLtMjvPiCnMvRS9eFah
C5bU6Emu9NrEC40MI7Jw+Il/loUIDlLbCDIqDjGxMb/tA8N0wXmIoIIbC1iAXPpy
fmSN4MyEUd0Xi4m/sv7kl71hmI09ivItHUtMdW89Tp8kvIEWUAnu3mDoLZGmw3+X
YZIxhW+95g4HydOh7ODuJa7Bqg+csa0ZpQl5u4jnnwGZrLWLGZ/w+tu5P83qHBp6
Yy1hdPVrhfAgLMV1Rmzj+5Heo0LGe2PbKvgpg4DMAFSed3SOnnW/w3NNPTjFhEBD
jiQi1TSXTZZQ6VRS/KWain4uQAWD2ouPipZl7XA8Uz+g0pEhh642s7LNUpbclH5y
JZ02GRL5GFAcb9ZTn0RlCQgDz/xK9Lm5eVbHmWkdtHw9CUgqWF9nD6ZhWXEEI39Z
vVQZboAZMlkijv3AJupjE6Q8PfPnXWaFdbtDqWMJQ+/c12xzPSGu+QcpgCYxyIfE
6kSig2FGR98lKXCE5bMMq6XuM1RNcHMyKUnOZ8NOxAhLfBbZcEw=
=zgdi
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: [GitHub] [tomcat] alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize

Posted by Mark Thomas <ma...@apache.org>.
On 29/06/2019 02:26, GitBox wrote:
> alpire opened a new pull request #176: CoyoteAdapter: fix out-of-bounds read in checkNormalize
> URL: https://github.com/apache/tomcat/pull/176
>  
>  
>    On malformed requests, checkNormalize would throw an ArrayIndexOutOfBoundsException leading to a 500 response. This change fixes checkNormalize to return false instead of throwing exception on those inputs, and adds a few tests to check the new functionality.

That there are URI sequences that can trigger this is certainly
something that we need to fix.

On a slightly different topic, this got me thinking.

checkNormalize() is a test that runs on every request. It is essentially
there to ensure that whatever character set has been specified for the
Connector is "safe". In this case "safe" means when the bytes are
converted to characters we don't get any unexpected "." or "/"
characters in the result that make the character version of the URI
non-normalized.

Rather than run this test on every request, we could:
- pre-calculate the character sets we consider to be safe
- throw an exception if the user attempts to configure the Connector to
  use one

I think we could safely exclude any character set where any of the
following were not true:

- 0x00 <-> '\u0000'
- 0x2e <-> '.'
- 0x2f <-> '/'
- 0x5c <-> '\'

We could create a unit test that maintains/checks the list of "safe"
character set canonical names. After creating the character set in
Connector.setURIEncoding(), if the canonical name of the resulting
character set is not in the safe list, throw an exception.

Then remove checkNormalize() and replace with a comment that explains
why the conversion is known to be safe.

There are several loops through the URI in checkNormalize(). I think -
I'd need to test to confirm - that removing them would provide a
measurable performance benefit.

Thoughts?

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org