Posted to users@cocoon.apache.org by Andreas Hartmann <an...@apache.org> on 2004/10/05 12:43:32 UTC

XHTML and Entities: ' fails in IE

Hi Cocoon community,

I wonder if there is a solution for this problem by
now - searching the list only revealed questions, no
solutions :(


I'm using UTF-8 encoding with the
o.a.c.components.serializers.XHTMLSerializer.
Special characters seem to be encoded as entity references
by default (e.g., ' is encoded as &apos;)

Everything is fine except that Internet Explorer keeps on
showing the &apos; as &apos; instead of ' .

This seems to be a known IE issue:

http://artific.com/library/xhtml_1.0_entities.html

"Some entity names do not resolve (eg: &apos; under Microsoft Internet
Explorer) but their corresponding numeric names do resolve."


Is there anything I can do about this? Looking at the
EncodingSerializer etc. didn't lead me to a solution.


Thanks in advance,
-- Andreas


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org


Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
> [...]
>> What would be the benefit of using the "new" XHTMLSerializer? If any, 
>> I'd like to try it more seriously.
> 
> 
> The most apparent advantage is that it takes care of the
> header encoding information. Since I use it I have no problems
> regarding UTF-8 anymore.
> 
> Another difference is that it just passes the existing
> doctype definition instead of adding a configured one.
> So it is required to set the public-id and system-id in the
> XSLT or source document.
> 
> I guess the new one has a cleaner implementation, but I haven't
> compared the code yet. Maybe it makes sense to ask on cocoon-dev.

So, I guess I'd better stick to configuring XMLSerializer.
Thanks for your explanation.

> 
> -- Andreas
Your separator is broken. You'd better use: minus minus blank cr

-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Volkm@r wrote:
> Andreas Hartmann wrote:
>  > [...]
> 
>>> Did you try numerical entities like &#39;?
>>
>>
>>
>> That would actually help. The problem is that everything is
>> converted by the XHTMLSerializer:
>>
>> &apos; -> &apos;
>> &#39;  -> &apos;
>> '      -> &apos;
>>
>> I guess it really requires a patch to make it configurable.
>>
> 
> This is my first time trying the XHTMLSerializer and I only get an error
> instead of a result:
>  org.xml.sax.SAXException: Unable to map "http://www.w3.org/2000/xmlns/"

I know this problem. IIRC it was related to a Xalan/XSLTC mixture.
When I switched to Xalan everywhere, it vanished.


> Instead of org.apache.cocoon.components.serializers.XHTMLSerializer I've 
> always been using org.apache.cocoon.serialization.XMLSerializer with 
> configuration according to XHTML. That works perfectly and I never saw
> *any* character entities in the output.
> 
> What would be the benefit of using the "new" XHTMLSerializer? If any, 
> I'd like to try it more seriously.

The most apparent advantage is that it takes care of the
header encoding information. Since I use it I have no problems
regarding UTF-8 anymore.

Another difference is that it just passes the existing
doctype definition instead of adding a configured one.
So it is required to set the public-id and system-id in the
XSLT or source document.

I guess the new one has a cleaner implementation, but I haven't
compared the code yet. Maybe it makes sense to ask on cocoon-dev.

-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
 > [...]
>> Did you try numerical entities like &#39;?
> 
> 
> That would actually help. The problem is that everything is
> converted by the XHTMLSerializer:
> 
> &apos; -> &apos;
> &#39;  -> &apos;
> '      -> &apos;
> 
> I guess it really requires a patch to make it configurable.
> 

This is my first time trying the XHTMLSerializer and I only get an error
instead of a result:
  org.xml.sax.SAXException: Unable to map "http://www.w3.org/2000/xmlns/"

Instead of org.apache.cocoon.components.serializers.XHTMLSerializer I've 
always been using org.apache.cocoon.serialization.XMLSerializer with 
configuration according to XHTML. That works perfectly and I never saw
*any* character entities in the output.

What would be the benefit of using the "new" XHTMLSerializer? If any, 
I'd like to try it more seriously.
-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Volkm@r wrote:
> Andreas Hartmann wrote:
> 
>> The encoding is recognized as UTF-8, so this is not the problem.
>> This works fine with the new Serializer.
> 
> 
> Sorry for misunderstanding.

No problem :)

> 
>>
>> The problem is that IE doesn't expand the &apos; entity
>> (all UTF-8 characters and other entities are handled perfectly) ...
> 
> 
> Did you try numerical entities like &#39;?

That would actually help. The problem is that everything is
converted by the XHTMLSerializer:

&apos; -> &apos;
&#39;  -> &apos;
'      -> &apos;

I guess it really requires a patch to make it configurable.
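
Until such a patch exists, one stopgap is to rewrite the named entity after serialization. The following is only a minimal sketch: the class AposPostProcessor and its method are hypothetical names, not part of Cocoon, and it assumes the serialized page is available as a String (for example inside a servlet filter that buffers the response), which the serializer itself does not offer.

```java
// Hypothetical helper, not part of Cocoon: post-processes markup that has
// already been serialized and swaps the named apostrophe entity for the
// numeric one, which IE resolves according to the page quoted earlier.
final class AposPostProcessor {

    private AposPostProcessor() {
        // static utility, no instances
    }

    /** Replaces every literal "&apos;" with "&#39;" and leaves all other text alone. */
    static String replaceNamedApos(String serializedMarkup) {
        return serializedMarkup.replace("&apos;", "&#39;");
    }
}
```

An escaped ampersand such as "&amp;apos;" is not touched, because the sequence "&apos;" never occurs in it. Buffering the whole response has a cost, so this is a workaround rather than a fix.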

-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
> The encoding is recognized as UTF-8, so this is not the problem.
> This works fine with the new Serializer.

Sorry for misunderstanding.

> 
> The problem is that IE doesn't expand the &apos; entity
> (all UTF-8 characters and other entities are handled perfectly) ...

Did you try numerical entities like &#39;?

-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Volkm@r wrote:
> Andreas Hartmann wrote:
> 
>> I'm using only the XHTML serializer:
> 
> 
> Andreas, you didn't check the HTTP response headers' charset value, did
> you? I'm sure you have a charset mismatch.
> To ensure that the page is sent with the correct "charset=utf-8" header, 
> you'd better fix the serializer's configuration.

The encoding is recognized as UTF-8, so this is not the problem.
This works fine with the new Serializer.

The problem is that IE doesn't expand the &apos; entity
(all UTF-8 characters and other entities are handled perfectly) ...

>>
>> <map:serializer name="xhtml" logger="sitemap.serializer.xhtml"
>>     mime-type="text/html" pool-grow="2" pool-max="64" pool-min="2"
> 
> 
>       mime-type="text/html; charset=utf-8" pool-grow="2" pool-max="64"

That doesn't change anything.

> FYI, Internet Exploder does *not* evaluate the charset info from 
> processing instructions.

Yes, I know.

Anyway, thanks a lot!

-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
> I'm using only the XHTML serializer:

Andreas, you didn't check the HTTP response headers' charset value, did
you? I'm sure you have a charset mismatch.
To ensure that the page is sent with the correct "charset=utf-8" header, 
you'd better fix the serializer's configuration.

> 
> <map:serializer name="xhtml" logger="sitemap.serializer.xhtml"
>     mime-type="text/html" pool-grow="2" pool-max="64" pool-min="2"

       mime-type="text/html; charset=utf-8" pool-grow="2" pool-max="64"

>     src="org.apache.cocoon.components.serializers.XHTMLSerializer">
>   <encoding>UTF-8</encoding>
> </map:serializer>

FYI, Internet Exploder does *not* evaluate the charset info from 
processing instructions.
-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Volkm@r wrote:
> Andreas Hartmann wrote:
> 

[...]

> 
> Andreas, I can't reproduce your problem.

Strange ... seems to be a known issue.

> Regardless of whether I have &apos; or ', both the XHTML serializer and
> the HTML serializer produce ' as output.
> 
> What is your pipeline's configuration of both serializers?

I'm using only the XHTML serializer:

<map:serializer name="xhtml" logger="sitemap.serializer.xhtml"
     mime-type="text/html" pool-grow="2" pool-max="64" pool-min="2"
     src="org.apache.cocoon.components.serializers.XHTMLSerializer">
   <encoding>UTF-8</encoding>
</map:serializer>

...

<map:serialize type="xhtml"/>


For XML I'm using the old one (o.a.c.serialization.XMLSerializer).


-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
> Volkm@r wrote:
> 
>> Andreas Hartmann wrote:
> 
> 
> [...]
> 
>> Just type ' instead of &apos; and save using utf-8.
> 
> 
> 
> Thanks, but the source XML already contains a '.
> 
> If I use the XMLSerializer, it is displayed as '.
> Only the XHTMLSerializer changes it to &apos;
> 
> -- Andreas

Andreas, I can't reproduce your problem.
Regardless of whether I have &apos; or ', both the XHTML serializer and
the HTML serializer produce ' as output.

What is your pipeline's configuration of both serializers?

-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Colin Paul Adams wrote:
>>>>>>"Andreas" == Andreas Hartmann <an...@apache.org> writes:
> 
> 
>     Andreas> Volkm@r wrote:
>     >> Andreas Hartmann wrote:
> 
>     Andreas> [...]
> 
>     >> Just type ' instead of &apos; and save using utf-8.
> 
> 
>     Andreas> Thanks, but the source XML already contains a '.
> 
>     Andreas> If I use the XMLSerializer, it is displayed as '.  Only
>     Andreas> the XHTMLSerializer changes it to &apos;
> 
> Then don't use the XHTML serializer with IE - since when has IE
> supported XHTML? (it certainly didn't in 5.0 - I haven't used it
> since).

Well, kind of.

http://www.w3.org/MarkUp/2004/xhtml-faq#ie

-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by Colin Paul Adams <co...@colina.demon.co.uk>.
>>>>> "Andreas" == Andreas Hartmann <an...@apache.org> writes:

    Andreas> Volkm@r wrote:
    >> Andreas Hartmann wrote:

    Andreas> [...]

    >> Just type ' instead of &apos; and save using utf-8.


    Andreas> Thanks, but the source XML already contains a '.

    Andreas> If I use the XMLSerializer, it is displayed as '.  Only
    Andreas> the XHTMLSerializer changes it to &apos;

Then don't use the XHTML serializer with IE - since when has IE
supported XHTML? (it certainly didn't in 5.0 - I haven't used it
since).
-- 
Colin Paul Adams
Preston Lancashire



Re: XHTML and Entities: ' fails in IE

Posted by Andreas Hartmann <an...@apache.org>.
Volkm@r wrote:
> Andreas Hartmann wrote:

[...]

> Just type ' instead of &apos; and save using utf-8.


Thanks, but the source XML already contains a '.

If I use the XMLSerializer, it is displayed as '.
Only the XHTMLSerializer changes it to &apos;

-- Andreas




Re: XHTML and Entities: ' fails in IE

Posted by "Volkm@r" <pl...@arcor.de>.
Andreas Hartmann wrote:
> Hi Cocoon community,
> 
> I wonder if there is a solution for this problem by
> now - searching the list only revealed questions, no
> solutions :(
> 
> 
> I'm using UTF-8 encoding with the
> o.a.c.components.serializers.XHTMLSerializer.
> Special characters seem to be encoded as entity references
> by default (e.g., ' is encoded as &apos;)
> 
> Everything is fine except that Internet Explorer keeps on
> showing the &apos; as &apos; instead of ' .
> 
> This seems to be a known IE issue:
> 
> http://artific.com/library/xhtml_1.0_entities.html
> 
> "Some entity names do not resolve (eg: &apos; under Microsoft Internet
> Explorer) but their corresponding numeric names do resolve."
> 
> 
> Is there anything I can do about this? Looking at the
> EncodingSerializer etc. didn't lead me to a solution.

Just type ' instead of &apos; and save using utf-8.

-- 
Volkmar W. Pogatzki




Re: XHTML and Entities: ' fails in IE

Posted by Bertrand Delacretaz <bd...@apache.org>.
Le 5 oct. 04, à 12:43, Andreas Hartmann a écrit :

> ...Everything is fine except that Internet Explorer keeps on
> showing the &apos; as &apos; instead of ' .

You might know this already, but the encoding is controlled by the 
XHTMLEncoder class. One way to solve your problem might be to make it 
configurable, for example with an (aaaaaaaargh) "IE quirks" mode.
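
To make the idea concrete, here is a rough sketch of what such a switch could look like. It is not the actual XHTMLEncoder code: the class name ApostropheAwareEncoder and the ieQuirks flag are invented for illustration, and the real encoder may be structured quite differently.

```java
// Hypothetical illustration of a configurable "IE quirks" mode. This class
// is NOT Cocoon's XHTMLEncoder; names and structure are assumptions.
final class ApostropheAwareEncoder {

    // When true, the apostrophe is written as a numeric reference (&#39;)
    // instead of the named entity (&apos;) that IE fails to resolve.
    private final boolean ieQuirks;

    ApostropheAwareEncoder(boolean ieQuirks) {
        this.ieQuirks = ieQuirks;
    }

    /** Escapes the five XML special characters in the given text. */
    String encode(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append(ieQuirks ? "&#39;" : "&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }
}
```

With the flag set, new ApostropheAwareEncoder(true).encode("it's") yields "it&#39;s"; without it the result is "it&apos;s", which is the current behaviour described in this thread. How the flag would be wired to the serializer's sitemap configuration is left open here.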

-Bertrand

Re: XHTML and Entities: ' fails in IE

Posted by Joerg Heinicke <jo...@gmx.de>.
On 05.10.2004 12:43, Andreas Hartmann wrote:

> I'm using UTF-8 encoding with the
> o.a.c.components.serializers.XHTMLSerializer.

That's a Cocoon-specific class, not a Xalan class, so we probably can do 
something about it.

> Special characters seem to be encoded as entity references
> by default (e.g., ' is encoded as &apos;)
> 
> Everything is fine except that Internet Explorer keeps on
> showing the &apos; as &apos; instead of ' .
> 
> This seems to be a known IE issue:
> 
> http://artific.com/library/xhtml_1.0_entities.html
> 
> "Some entity names do not resolve (eg: &apos; under Microsoft Internet
> Explorer) but their corresponding numeric names do resolve."
> 
> 
> Is there anything I can do about this? Looking at the
> EncodingSerializer etc. didn't lead me to a solution.

Can you ask on the dev list? Pier has written the serializer components 
and he is only reading the dev list.

Joerg
