Posted to dev@forrest.apache.org by Dimitri Pissarenko <di...@gmx.net> on 2004/01/24 18:16:55 UTC

Russian text in Apache Forrest files

Hello!

If I enter Russian text in the file given below, I get the error

<output>
BUILD FAILED
C:\Programme\apache-forrest-0.5.1-bin\forrest.build.xml:851: Could not validate document
C:\work\demetrix\homepage-work\homepage-src\src\documentation\content\xdocs\demetrix\contact.xml
</output>

I inserted the Russian text like any other text, without enclosing it
in any special tags, i.e.

<p>
Äìèòðèé Ïèñàðåíêî
</p>

How can I avoid the above error?

Thanks in advance

dap

PS0: If I remove the Russian text, the site is built without
complaints.
PS1: Here is the file that causes the error:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE document PUBLIC "-//APACHE//DTD Documentation V1.2//EN"
"http://apache.org/forrest/dtd/document-v12.dtd">
<document>
  <header>
    <title>Contact</title>
  </header>
  <body>
    <section>
      <title>Contact</title>
      <p>
        For any information concerning demetrix contact
      </p>
      <p>
        <link href="http://members.inode.at/d.pissarenko">Dimitri Pissarenko</link>
      </p>
      <p>
        Äìèòðèé Ïèñàðåíêî
      </p>
      <p>
        E-Mail: <link href="mailto:dimitri.pissarenko@gmx.net">dimitri.pissarenko@gmx.net</link>
      </p>
      <p>
        ICQ: dap/176181430
      </p>
      <p>
        You can send me messages in Russian, English and German.
      </p>
    </section>
  </body>
</document>



Re: Russian text in Apache Forrest files

Posted by Dimitri Pissarenko <di...@gmx.net>.
Thanks for your help!

Using jEdit, I saved the file as UTF-8 and could then build the website
properly.

Best regards

dap

Re: Russian text in Apache Forrest files

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Dimitri Pissarenko wrote:
> Do you know how to convert Russian text from Windows 1251 to UTF-8?

- Check whether your preferred editor has something like a
  "coding", "encoding" or "character set" choice in the "save"
  or "save as" dialog. Windows Notepad has had one since W2K.
- Manual method: Use character references in the XML. Check
  http://www.unicode.org
  for the code point numbers.
- Use a conversion utility. I use iconv on cygwin
  http://www.cygwin.com
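
A minimal Python sketch of the last two methods, assuming the source
file really is Windows 1251. The function names are illustrative only;
they are not part of Forrest or iconv.

```python
from pathlib import Path


def convert_cp1251_to_utf8(src: str, dst: str) -> None:
    """Re-encode a text file from Windows-1251 to UTF-8.

    Assumes src really is Windows-1251; other input is silently mis-decoded.
    """
    text = Path(src).read_bytes().decode("cp1251")
    Path(dst).write_bytes(text.encode("utf-8"))


def to_char_refs(text: str) -> str:
    """Replace each non-ASCII character with an XML numeric character reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# The name from the thread, written as character references.
print(to_char_refs("<p>Дмитрий</p>"))
```

With iconv, the equivalent conversion would be along the lines of
`iconv -f CP1251 -t UTF-8 old.xml > new.xml` (file names hypothetical).
The encoding named in the XML declaration has to match the bytes
actually written; the file in this thread already declares UTF-8.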

J.Pietschmann

Re: Russian text in Apache Forrest files

Posted by Dimitri Pissarenko <di...@gmx.net>.
Hello!

Thanks for your response!

>This seems to be Windows 1251 encoding (cyrillic) rather than UTF-8.

Do you know how to convert Russian text from Windows 1251 to UTF-8?

TIA

dap

Re: Russian text in Apache Forrest files

Posted by "J.Pietschmann" <j3...@yahoo.de>.
Dimitri Pissarenko wrote:
> Äìèòðèé Ïèñàðåíêî

This seems to be Windows 1251 encoding (Cyrillic) rather than UTF-8.
I'd suggest checking the encoding again, and perhaps running the file
through some other validation tool, like xmllint from libxml2 (available
with cygwin), in order to get a more detailed problem description.
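
As a rough illustration of such an encoding check, here is a small
Python sketch. It is not a substitute for xmllint; it only tests whether
the bytes decode as UTF-8. The function name is made up for this
example.

```python
def diagnose_encoding(data: bytes) -> str:
    """Report whether data is valid UTF-8; if not, say where decoding fails."""
    try:
        data.decode("utf-8")
        return "valid UTF-8"
    except UnicodeDecodeError as err:
        bad = data[err.start]
        return f"not UTF-8: bad sequence starting with byte 0x{bad:02x} at offset {err.start}"


# Bytes as an editor saving in Windows-1251 would write the paragraph.
sample = "<p>Дмитрий</p>".encode("cp1251")
print(diagnose_encoding(sample))

# The same bytes displayed as Latin-1 give the garbled text quoted above.
print(sample.decode("latin-1"))
```

A parser that trusts the encoding="UTF-8" declaration hits the same
invalid byte sequence, which is presumably why validation fails.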

J.Pietschmann

Re: Russian text in Apache Forrest files

Posted by Ross Gardler <rg...@apache.org>.
Dimitri Pissarenko wrote:
> If I enter Russian text in the file given below, I get the error
> 
> <output>
> BUILD FAILED
> C:\Programme\apache-forrest-0.5.1-bin\forrest.build.xml:851: Could not validate document
> C:\work\demetrix\homepage-work\homepage-src\src\documentation\content\xdocs\demetrix\contact.xml
> </output>
> 
> I inserted the Russian text like any other text, without enclosing it
> in any special tags, i.e.
> 
> <p>
> Äìèòðèé Ïèñàðåíêî
> </p>
> 
> How can I avoid the above error?

I'm afraid I know nothing about international characters. However, as a
workaround until someone with more expertise responds, you could turn off
validation. This is done in the forrest.properties file. This will get
you started at least.

Ross