Posted to dev@httpd.apache.org by Micha Lenk <mi...@lenk.info> on 2014/03/19 20:58:21 UTC

[PATCH] mod_proxy_html sometimes adds random characters to HTML pages smaller than 4 bytes

Hi Apache developers,

here is a bug that causes mod_proxy_html to add random characters 
(plus HTML code) to HTML pages if the document is smaller than 4 bytes. 
(Thomas, Ewald, this is issue #18378 in our Mantis.) The output looks 
like it comes from some kind of uninitialized memory: the added string 
sometimes matches part of what was delivered for a previous request. 
It also seems to happen only when the same browser sends multiple HTTP 
requests over a keep-alive connection.

The root cause is that the charset guessing with xml2enc needs to 
consume at least 4 bytes from the document to come to a conclusion. The 
consumed bytes are buffered so that they can later be prepended to the 
output again. But the code apparently assumes that at least 4 bytes are 
always available, which is not true for very short documents. In these 
cases the buffer may contain some bytes left behind from the previous 
request on the same connection.
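
To illustrate the failure mode described above, here is a minimal sketch 
in C. It is not the actual mod_xml2enc code; `sniff_ctx`, `sniff_buggy` 
and `sniff_checked` are hypothetical names, and the buffer reuse across 
requests is an assumption made only to reproduce the symptom.

```c
#include <stddef.h>
#include <string.h>

#define SNIFF_LEN 4 /* bytes the charset guesser wants to inspect */

/* Hypothetical stand-in for a buffer reused across requests on one connection. */
typedef struct {
    char buf[SNIFF_LEN];
} sniff_ctx;

/* Buggy variant: copies what is there, but claims SNIFF_LEN bytes were consumed. */
static size_t sniff_buggy(sniff_ctx *ctx, const char *doc, size_t len)
{
    size_t n = len < SNIFF_LEN ? len : SNIFF_LEN;
    memcpy(ctx->buf, doc, n);
    return SNIFF_LEN; /* wrong when len < SNIFF_LEN: buf[n..] is stale */
}

/* Checked variant: reports only the bytes actually consumed. */
static size_t sniff_checked(sniff_ctx *ctx, const char *doc, size_t len)
{
    size_t n = len < SNIFF_LEN ? len : SNIFF_LEN;
    memcpy(ctx->buf, doc, n);
    return n;
}
```

If the first document is `<html>` and the second is `ok`, the buggy 
variant would report 4 bytes to prepend, namely `oktm`: two fresh bytes 
plus two left over from the earlier document. The checked variant is 
shown only for contrast; the attached patch takes a different route and 
skips such short documents entirely.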

The attached patch fixes the issue by simply skipping documents smaller 
than 4 bytes. The rationale is that HTML rewriting can only do something 
useful when it works on an absolute URL (i.e. one including a scheme). 
Since the scheme "http" alone is already 4 bytes, a shorter document 
contains nothing to rewrite.
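
The size check can be pictured roughly as follows. This is only a sketch 
of the idea, not the contents of the attached patch; `MIN_DOC_LEN`, 
`filter_action` and `choose_action` are hypothetical names.

```c
#include <stddef.h>

#define MIN_DOC_LEN 4 /* length of "http", the shortest scheme worth rewriting */

typedef enum { PASS_THROUGH, REWRITE } filter_action;

/* Decide whether a document of doc_len bytes is worth handing to the rewriter. */
static filter_action choose_action(size_t doc_len)
{
    return doc_len < MIN_DOC_LEN ? PASS_THROUGH : REWRITE;
}
```

A document of 3 bytes or fewer is passed through untouched, so the 
charset sniffing never runs on input that is shorter than it expects.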

The patch is based on httpd trunk, rev. 1579365.

Please let me know whether I should file an issue in Apache's 
Bugzilla or whether that isn't needed in this case.

Regards,
Micha

Re: [PATCH] mod_proxy_html sometimes adds random characters to HTML pages smaller than 4 bytes

Posted by Micha Lenk <mi...@lenk.info>.
Hi,

On 19.03.2014 21:19, Jim Jagielski wrote:
> It's always best, imo, to follow-up with a bugzilla entry with
> description and patch.

Ok, this issue is now filed in ASF bugzilla as #56286.

Regards,
Micha

Re: [PATCH] mod_proxy_html sometimes adds random characters to HTML pages smaller than 4 bytes

Posted by Jim Jagielski <ji...@jaguNET.com>.
It's always best, imo, to follow-up with a bugzilla entry with
description and patch.

Thx!!
On Mar 19, 2014, at 3:58 PM, Micha Lenk <mi...@lenk.info> wrote:

> Hi Apache developers,
> 
> next is a bug that causes mod_proxy_html to add some random characters (+html code) to HTML pages, if the document is smaller than 4 bytes. (Thomas, Ewald, this is issue #18378 in our Mantis). It looks like the output is from some kind of uninitialized memory. The added string sometimes matches part of a previously delivered request. Also, it looks like this only happens when doing multiple HTTP requests with the same browser and using HTTP Keep Alive.
> 
> The root cause is that the charset guessing with xml2enc needs to consume at least 4 bytes from the document to come to a conclusion. The consumed bytes are buffered so that they can later get prepended to the output again. But apparently it is assumed that there are always at least 4 bytes available, which in some cases is not the case. In these cases the buffer may contain some bytes left behind from the previous request on the same connection.
> 
> The attached patch fixes that issue by simply skipping documents smaller than 4 bytes. The rationale behind this is, that for HTML rewriting to do something useful, it needs to work on an absolute URL (i.e. including a schema). But as the schema "http" is already 4 bytes, there would be nothing to rewrite.
> 
> The patch is based on httpd trunk, rev. 1579365.
> 
> Please provide feedback whether I should file an issue in Apaches Bugzilla or whether this isn't needed in this case.
> 
> Regards,
> Micha
> <modproxyhtml_skip_too_small_documents.patch>