Posted to dev@forrest.apache.org by Dimitri Pissarenko <di...@gmx.net> on 2004/01/24 18:16:55 UTC
Russian text in Apache Forrest files
Hello!
If I enter Russian text in the file given below, I get this error:
<output>
BUILD FAILED
C:\Programme\apache-forrest-0.5.1-bin\forrest.build.xml:851: Could not validate document
C:\work\demetrix\homepage-work\homepage-src\src\documentation\content\xdocs\demetrix\contact.xml
</output>
I inserted the Russian text like any other text, without enclosing it in
any special tags, i.e.
<p>
Äìèòðèé Ïèñàðåíêî
</p>
How can I avoid the above error?
Thanks in advance
dap
PS0: If I remove the Russian text, the site is built without
complaints.
PS1: Here is the file that causes the error:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.2//EN"
"http://apache.org/forrest/dtd/document-v12.dtd">
<document>
<header>
<title>Contact</title>
</header>
<body>
<section>
<title>Contact</title>
<p>
For any information concerning demetrix contact
</p>
<p>
<link
href="http://members.inode.at/d.pissarenko">Dimitri Pissarenko</link>
</p>
<p>
Äìèòðèé Ïèñàðåíêî
</p>
<p>
E-Mail: <link
href="mailto:dimitri.pissarenko@gmx.net">dimitri.pissarenko@gmx.net</link>
</p>
<p>
ICQ: dap/176181430
</p>
<p>
You can send me messages in Russian, English and
German.
</p>
</section>
</body>
</document>
Re: Russian text in Apache Forrest files
Posted by Dimitri Pissarenko <di...@gmx.net>.
Thanks for your help!
Using jEdit, I saved the file as UTF-8 and could then build the website
properly.
Best regards
dap
Re: Russian text in Apache Forrest files
Posted by "J.Pietschmann" <j3...@yahoo.de>.
Dimitri Pissarenko wrote:
> Do you know how to convert Russian text from Windows 1251 to UTF-8?
- Check whether your preferred editor has something like a
"coding" or "encoding" or "character set" choice in the "save"
or "save as" dialog. Windows Notepad has had one since W2K.
- Manual method: Use character references in the XML. Check
http://www.unicode.org
for the code point numbers.
- Use a conversion utility. I use iconv on cygwin
http://www.cygwin.com
J.Pietschmann
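The conversion and the character-reference method suggested above can be sketched in a few lines of Python. This is only an illustration, assuming the source bytes really are Windows-1251; the function names cp1251_to_utf8 and to_char_refs are made up for this example and are not part of Forrest or iconv.

```python
def cp1251_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are assumed to be Windows-1251 as UTF-8."""
    return data.decode("cp1251").encode("utf-8")


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal XML character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# The name from contact.xml, stored the way a Windows-1251 editor would save it.
raw = "Дмитрий Писаренко".encode("cp1251")

# Read as Latin-1, these bytes show up as the garbled text seen in this thread.
print(raw.decode("latin-1"))

# After conversion the bytes are valid UTF-8 and match the XML declaration.
utf8 = cp1251_to_utf8(raw)
print(utf8)

# Alternative: keep the file pure ASCII by using character references.
print(to_char_refs(utf8.decode("utf-8")))
```

To convert a whole file, read it in binary mode, pass the bytes through cp1251_to_utf8, and write the result back in binary mode. The character-reference form is longer but survives any editor or encoding setting, since the file then contains only ASCII.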
Re: Russian text in Apache Forrest files
Posted by Dimitri Pissarenko <di...@gmx.net>.
Hello!
Thanks for your response!
>This seems to be Windows 1251 encoding (cyrillic) rather than UTF-8.
Do you know how to convert Russian text from Windows 1251 to UTF-8?
TIA
dap
Re: Russian text in Apache Forrest files
Posted by "J.Pietschmann" <j3...@yahoo.de>.
Dimitri Pissarenko wrote:
> Äìèòðèé Ïèñàðåíêî
This seems to be Windows 1251 encoding (cyrillic) rather than UTF-8.
I'd suggest checking the encoding again, and perhaps running the file
through some other validation tool, such as xmllint from libxml2 (available
with cygwin), in order to get a more detailed problem description.
J.Pietschmann
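A quick way to check the encoding, independent of any XML tool, is to try decoding the file's bytes as UTF-8 and see where that fails. The following is a minimal sketch; first_invalid_utf8 is a hypothetical helper name, and the sample bytes are constructed here rather than read from the actual contact.xml.

```python
from typing import Optional


def first_invalid_utf8(data: bytes) -> Optional[int]:
    """Return the byte offset of the first invalid UTF-8 sequence, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# A paragraph whose text was saved as Windows-1251 although the XML
# declaration says UTF-8 (constructed sample, not the real file).
sample = b"<p>\n" + "Дмитрий Писаренко".encode("cp1251") + b"\n</p>"

offset = first_invalid_utf8(sample)
if offset is None:
    print("bytes are valid UTF-8")
else:
    print("first invalid UTF-8 byte at offset", offset)
```

If the function reports an offset, the file does not match its encoding="UTF-8" declaration, which would explain why validation fails only when the Russian text is present.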
Re: Russian text in Apache Forrest files
Posted by Ross Gardler <rg...@apache.org>.
Dimitri Pissarenko wrote:
> If I enter a text in Russian in the file given below, I get the error
>
> <output>
> BUILD FAILED
> C:\Programme\apache-forrest-0.5.1-bin\forrest.build.xml:851: Could not validate document
> C:\work\demetrix\homepage-work\homepage-src\src\documentation\content\xdocs\demetrix\contact.xml
> </output>
>
> I inserted Russian text as any other text, without enclosing it with
> any special tags, ie
>
> <p>
> Äìèòðèé Ïèñàðåíêî
> </p>
>
> How can I avoid the above error?
I'm afraid I know nothing about international characters. However, as a
workaround until someone with more expertise responds, you could turn off
validation. This is done in the forrest.properties file. That will at
least get you started.
Ross