Posted to users@httpd.apache.org by Martin Gerdes <ma...@googlemail.com> on 2009/11/10 16:42:00 UTC

Re: [users@httpd] activating xml2enc makes client getting HTML-Page take very long: How about deactivating conversions?

A completely different idea to solve my actual problem:

Someone else suggested just taking out the conversions altogether.
After all, I am converting right back into the encoding I converted from. I
have been assured that no link uses a character outside the first 128 (7-bit
ASCII). As far as I know, there are no HTML control characters outside of
7-bit ASCII either.
So shouldn't the parser be able to parse the ISO-8859-1 document as if
it were UTF-8? Yeah, I know it sounds horrible, but as far as I can tell it
should not actually break...

To the author of the module:
Could this work?
What would I have to change in the code to keep any input conversion from
happening?
(I will play around a bit myself, but I am not familiar with the code, nor
with Apache module logic. And it's been quite a few years since I last coded
C...)

At the very least this would tell us (if it works) whether or not the
conversions are to blame for the problems I experience.

Martin

Re: [users@httpd] activating xml2enc makes client getting HTML-Page take very long: How about deactivating conversions?

Posted by Martin Gerdes <ma...@googlemail.com>.

Alright, just forget I suggested that. If a byte above 127 (a character
outside of 7-bit ASCII) appears directly in front of an HTML control
character, the control character would get interpreted as part of the same
multi-byte character in UTF-8. In other words: it WILL break.
The suggestion just sounded too good. Back to the regularly scheduled
program...
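
To illustrate the failure described above, here is a minimal Python sketch.
It is only an illustration under stated assumptions: naive_utf8_decode is a
hypothetical, deliberately lenient decoder written for this example, and it
says nothing about how mod_xml2enc or the underlying parser actually behaves.
It trusts each UTF-8 lead byte and does not validate the bytes that follow.

```python
def naive_utf8_decode(data: bytes) -> list:
    """Split bytes into 'characters' by trusting each UTF-8 lead byte.

    Hypothetical lenient decoder: continuation bytes are NOT validated,
    so whatever follows a lead byte is swallowed into that character.
    """
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        if b >= 0xF0:
            n = 4      # lead byte of a 4-byte sequence
        elif b >= 0xE0:
            n = 3      # lead byte of a 3-byte sequence
        elif b >= 0xC0:
            n = 2      # lead byte of a 2-byte sequence
        else:
            n = 1      # ASCII, or a stray continuation byte
        chars.append(data[i:i + n])
        i += n
    return chars


def strict_utf8_ok(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "é" is the single byte 0xE9 in ISO-8859-1, which looks like the lead
# byte of a 3-byte sequence when the data is read as UTF-8.
latin1 = "café<b>".encode("iso-8859-1")

chars = naive_utf8_decode(latin1)
tag_opener_survives = b"<" in chars   # False: "<" was swallowed by 0xE9

print(chars)
print("tag opener survives:", tag_opener_survives)
print("strict decoder accepts it:", strict_utf8_ok(latin1))
```

With the lenient decoder the "<" ends up inside the same "character" as the
0xE9 byte, so the tag is never seen. A strict decoder rejects the input
instead. Either way, only pure 7-bit ASCII input passes through unharmed.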
