Posted to user@forrest.apache.org by Roland Becker <ro...@hellyfant.de> on 2005/05/09 20:16:36 UTC

encoding attribute in site.xml

Is there a reason that "site.xml", after running "forrest seed", has
no encoding attribute?

If there is a German umlaut in "site.xml", the build fails with:

linkmap.html  BROKEN: Invalid byte 2 of 2-byte UTF-8 sequence.
BUILD FAILED
C:\programme\apache-forrest-0.6\src\core\targets\site.xml:43: Java returned: 1

If I add the encoding attribute like this:
<?xml version="1.0" encoding="UTF-8"?>
then it works fine.
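
The error message is consistent with a parser that, finding no encoding declaration, reads the file as UTF-8 and then hits a byte that is not valid UTF-8. Below is a minimal Python sketch of that behaviour. It is not Forrest's actual code path: it uses Python's standard xml.etree.ElementTree rather than the Java parser Forrest runs on, and it assumes (the thread does not confirm this) that the editor saved the umlaut as a single ISO-8859-1 byte. The helper name try_parse is made up for illustration.

```python
import xml.etree.ElementTree as ET


def try_parse(data: bytes):
    """Return the root element's label attribute, or None if parsing fails."""
    try:
        return ET.fromstring(data).get("label")
    except ET.ParseError:
        return None


# Written with an escape so the example does not depend on how
# this source file itself is encoded.
label = "\u00dcber uns"
element = f'<about label="{label}"/>'

no_decl = '<?xml version="1.0"?>'
latin1_decl = '<?xml version="1.0" encoding="ISO-8859-1"?>'
utf8_decl = '<?xml version="1.0" encoding="UTF-8"?>'

# Assumption: the editor wrote ISO-8859-1 bytes but no declaration.
latin1_without_decl = (no_decl + element).encode("iso-8859-1")
# The same bytes, but with a declaration that matches them.
latin1_with_decl = (latin1_decl + element).encode("iso-8859-1")
# UTF-8 bytes with an explicit UTF-8 declaration.
utf8_with_decl = (utf8_decl + element).encode("utf-8")

print(try_parse(latin1_without_decl))
print(try_parse(latin1_with_decl))
print(try_parse(utf8_with_decl))
```

In ISO-8859-1 the letter "Ü" is the single byte 0xDC. Read as UTF-8, 0xDC announces a 2-byte sequence, and the following "b" is not a valid second byte, which would match the "Invalid byte 2 of 2-byte UTF-8 sequence" message.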

-- 
Roland Becker



Re: Encoding attribute in site.xml (copied from user list)

Posted by David Crossley <cr...@apache.org>.
Torsten, can you please provide a URL for your GT2004 presentation?

--David

Ferdinand Soethe wrote:
> Johannes Schaefer wrote:
> 
> JS> There is a related FAQ entry:
> JS>    http://forrest.apache.org/0.7/docs/faq.html#encoding
> 
> JS> Johannes
> 
> Thanks. I missed that because I only looked at the old 0.6 FAQs.
> What we should perhaps add is which encodings Forrest understands,
> since XML tools are only required to support UTF-8 and UTF-16;
> support for all other encodings is optional.
> 
> Obviously ISO-8859-1 is one of them. Any others confirmed?
> 
> There is a reference in the FAQ pointing to
> 
> > GT2004 presentation by Torsten Schlabach
> 
> which in fact only links to Torsten's bio.
> The best I could find there in terms of more details was
> http://orixo.com/events/gt2004/sessions.html#torsten.
> 
> Does anybody have the full presentation or know where to find it? If
> not, we might as well take out the link?!
> 
> --
> Ferdinand Soethe

Re: Encoding attribute in site.xml (copied from user list)

Posted by Ferdinand Soethe <sa...@soethe.net>.



Johannes Schaefer wrote:

JS> There is a related FAQ entry:
JS>    http://forrest.apache.org/0.7/docs/faq.html#encoding

JS> Johannes

Thanks. I missed that because I only looked at the old 0.6 FAQs.
What we should perhaps add is which encodings Forrest understands,
since XML tools are only required to support UTF-8 and UTF-16;
support for all other encodings is optional.

Obviously ISO-8859-1 is one of them. Any others confirmed?

There is a reference in the FAQ pointing to

> GT2004 presentation by Torsten Schlabach

which in fact only links to Torsten's bio.
The best I could find there in terms of more details was
http://orixo.com/events/gt2004/sessions.html#torsten.

Does anybody have the full presentation or know where to find it? If
not, we might as well take out the link?!



--
Ferdinand Soethe


Re: Encoding attribute in site.xml (copied from user list)

Posted by Johannes Schaefer <jo...@uidesign.de>.
There is a related FAQ entry:
   http://forrest.apache.org/0.7/docs/faq.html#encoding

Johannes


Ferdinand Soethe wrote:
> 
> Ross Gardler wrote:
> 
> RG> Since the template files do use UTF-8 it makes sense for them to be
> RG> defined as using UTF-8.
> 
> Agreed. No harm done in making it explicit.
> 
> RG> No, it is not an editor problem. In the absence of an encoding attribute
> RG> the editor will assume some encoding. Which encoding it assumes
> RG> depends on the local settings of the editor (in some cases
> RG> this means the settings of the operating system).
> 
> Just for the sake of the argument :-)
> 
> I just looked this up in some of my references and they
> state that an XML file without an encoding attribute is UTF-8 or
> UTF-16 (depending on the byte order mark).
> 
> Which to me means that XML editors that assume something else are not
> conforming to the XML standard. Or am I missing something?
> 
> --
> Ferdinand Soethe
> 
> 


-- 
User Interface Design GmbH * Teinacher Str. 38 * D-71634 Ludwigsburg
Phone +49 (0)7141 377 000 * Fax +49 (0)7141 377 00-99
Branch office: User Interface Design GmbH * Lehrer-Götz-Weg 11 * D-81825 München
www.uidesign.de

Book "User Interface Tuning" by Joachim Machate & Michael Burmester
www.user-interface-tuning.de

Re: Encoding attribute in site.xml (copied from user list)

Posted by Ross Gardler <rg...@apache.org>.
Ferdinand Soethe wrote:
> 
> Ross Gardler wrote:
> 
> RG> Since the template files do use UTF-8 it makes sense for them to be
> RG> defined as using UTF-8.
> 
> Agreed. No harm done in making it explicit.
> 
> RG> No, it is not an editor problem. In the absence of an encoding attribute
> RG> the editor will assume some encoding. Which encoding it assumes
> RG> depends on the local settings of the editor (in some cases
> RG> this means the settings of the operating system).
> 
> Just for the sake of the argument :-)
> 
> I just looked this up in some of my references and they
> state that an XML file without an encoding attribute is UTF-8 or
> UTF-16 (depending on the byte order mark).
> 
> Which to me means that XML editors that assume something else are not
> conforming to the XML standard. Or am I missing something?

No, it might well have been me who was missing something.

Nevertheless, there is no harm in having the encoding present, even if 
it is a problem with editors.

Ross

Re: Encoding attribute in site.xml (copied from user list)

Posted by Ferdinand Soethe <sa...@soethe.net>.

Ross Gardler wrote:

RG> Since the template files do use UTF-8 it makes sense for them to be
RG> defined as using UTF-8.

Agreed. No harm done in making it explicit.

RG> No, it is not an editor problem. In the absence of an encoding attribute
RG> the editor will assume some encoding. Which encoding it assumes
RG> depends on the local settings of the editor (in some cases
RG> this means the settings of the operating system).

Just for the sake of the argument :-)

I just looked this up in some of my references and they
state that an XML file without an encoding attribute is UTF-8 or
UTF-16 (depending on the byte order mark).

Which to me means that XML editors that assume something else are not
conforming to the XML standard. Or am I missing something?
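
The rule I am describing could be sketched roughly as follows in Python. This is a simplified illustration, not what any particular parser or editor actually does: it only looks at the byte order mark, ignores external hints such as MIME headers, and does not distinguish UTF-32 (whose little-endian mark begins with the same two bytes as UTF-16). The function name guess_xml_encoding is made up for this example.

```python
import codecs


def guess_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document that has no encoding declaration.

    Simplified: checks only the byte order mark (BOM).
    """
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    # A UTF-16 document is expected to start with a BOM in either byte order.
    if data.startswith(codecs.BOM_UTF16_BE) or data.startswith(codecs.BOM_UTF16_LE):
        return "UTF-16"
    # No BOM and no declaration: the document has to be UTF-8.
    return "UTF-8"


plain = b'<?xml version="1.0"?><site/>'
# Python's "utf-16" codec writes a BOM in front of the encoded text.
utf16 = '<?xml version="1.0"?><site/>'.encode("utf-16")

print(guess_xml_encoding(plain))
print(guess_xml_encoding(utf16))
```

Note that nothing in this rule leads to ISO-8859-1 or a Windows code page: a file in one of those encodings needs an explicit declaration.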

--
Ferdinand Soethe


Re: Encoding attribute in site.xml (copied from user list)

Posted by Ross Gardler <rg...@apache.org>.
Ferdinand Soethe wrote:
> Ross Gardler wrote:
> 
> RG> Roland Becker wrote:
> 
> 
>>>Is there a reason that "site.xml", after running "forrest seed", has
>>>no encoding attribute?
>>>
>>>If there is a German umlaut in "site.xml", the build fails with:
>>>
>>>linkmap.html  BROKEN: Invalid byte 2 of 2-byte UTF-8 sequence.
>>>BUILD FAILED
>>>C:\programme\apache-forrest-0.6\src\core\targets\site.xml:43: Java returned: 1
> 
> 
> RG> No reason that I am aware of or can imagine. I've updated the file in
> RG> site.xml for the next release.
> 
> 
> Funny, I have been using umlauts with 0.7 head all over my site.xml
> and never had a problem compiling it. See below:
> 

...

> Still unclear to me: Am I correct that this is an _editor problem_ in
> the sense that no encoding attribute means the file is UTF-8 but
> Roland's editor needed a stronger hint?

No, it is not an editor problem. In the absence of an encoding attribute
the editor will assume some encoding. Which encoding it assumes
depends on the local settings of the editor (in some cases
this means the settings of the operating system).

Since the template files do use UTF-8 it makes sense for them to be
defined as using UTF-8.
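
To illustrate why the editor's assumption matters, here is a small Python sketch (an illustration only, nothing Forrest-specific). It shows that the same label becomes different bytes depending on which encoding is used to save it; cp1252 is chosen here as an example of a typical Windows default and is an assumption about the reporter's setup.

```python
import locale

# Written with an escape so the example does not depend on how
# this source file itself is encoded.
text = "\u00dcber uns"

as_utf8 = text.encode("utf-8")      # the umlaut becomes two bytes
as_cp1252 = text.encode("cp1252")   # the umlaut becomes one byte

print(as_utf8)
print(as_cp1252)

# The encoding this platform would pick by default for text files;
# the result differs between machines and locale settings.
print(locale.getpreferredencoding(False))
```

A tool that falls back to the platform default when saving will therefore produce different files on different machines unless the encoding is stated explicitly.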

Ross


Encoding attribute in site.xml (copied from user list)

Posted by Ferdinand Soethe <sa...@soethe.net>.
Ross Gardler wrote:

RG> Roland Becker wrote:

>> Is there a reason that "site.xml", after running "forrest seed", has
>> no encoding attribute?
>> 
>> If there is a German umlaut in "site.xml", the build fails with:
>> 
>> linkmap.html  BROKEN: Invalid byte 2 of 2-byte UTF-8 sequence.
>> BUILD FAILED
>> C:\programme\apache-forrest-0.6\src\core\targets\site.xml:43: Java returned: 1

RG> No reason that I am aware of or can imagine. I've updated the file in
RG> site.xml for the next release.


Funny, I have been using umlauts with 0.7 head all over my site.xml
and never had a problem compiling it. See below:

<?xml version="1.0"?>

> <site label="Bildungsverein Hannover" href="" xmlns="http://apache.org/forrest/linkmap/1.0" tab="home">
>
>   <about label="Über uns" tab="home">
>     <über_uns label="Wir über uns" href="index.html"/>
>     <Dozentinnen label="Dozent/innen" href="dozentinnen.html"/>
>     <Lernorte label="Lernorte" href="lernorte.html"/>
>     <agbs label="Geschäftsbedingungen" href="agbs.html"/>
>     <newsletter label="Newsletter" href="newsletter.html"/>
>     <kontakt label="Kontakt" href="kontakt.html"/>
>     <impressum label="Impressum" href="impressum.html"/>
>     <wegweiser label="Wegweiser" href="Wegweiser.html"/>
>   </about>

I assumed that somebody had fixed the problem in the meantime but never
checked. Now I did, and found that Eclipse treats site.xml as UTF-8
even though there is no explicit declaration.

Perhaps we should amend the issue and the FAQ entry (How to use
special characters in the labels of the site.xml file?) to refer to
this problem and explain that you _can_ use special characters directly
if you use the standard UTF-8 encoding for site.xml (and make sure
your editor knows it).
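
One way to check whether a given site.xml really is UTF-8 before building might be a small script like the following Python sketch. It is only an illustration: the function name first_invalid_utf8_offset is made up, and the sample bytes assume an umlaut saved as a single ISO-8859-1 byte.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8")
        return None
    except UnicodeDecodeError as err:
        return err.start


# In-memory samples; for a real file, read its bytes first, for example
# with pathlib.Path("site.xml").read_bytes() (the path is hypothetical).
good = '<about label="\u00dcber uns"/>'.encode("utf-8")
bad = '<about label="\u00dcber uns"/>'.encode("iso-8859-1")

print(first_invalid_utf8_offset(good))
print(first_invalid_utf8_offset(bad))
```

If the function returns an offset, the file was most likely saved in another encoding, and either the editor setting or the encoding attribute needs to change.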

Still unclear to me: Am I correct that this is an _editor problem_ in
the sense that no encoding attribute means the file is UTF-8 but
Roland's editor needed a stronger hint?

Thanks,
Ferdinand Soethe


Re: encoding attribute in site.xml

Posted by Ross Gardler <rg...@apache.org>.
Roland Becker wrote:
> Is there a reason that "site.xml", after running "forrest seed", has
> no encoding attribute?
> 
> If there is a German umlaut in "site.xml", the build fails with:
> 
> linkmap.html  BROKEN: Invalid byte 2 of 2-byte UTF-8 sequence.
> BUILD FAILED
> C:\programme\apache-forrest-0.6\src\core\targets\site.xml:43: Java returned: 1
> 
> If I add the encoding attribute like this:
> <?xml version="1.0" encoding="UTF-8"?>
> then it works fine.

No reason that I am aware of or can imagine. I've updated the file in 
site.xml for the next release.

Thanks for pointing this out.

Ross