Posted to users@tomcat.apache.org by Michael Schuerig <mi...@schuerig.de> on 2004/09/30 00:58:14 UTC

Character encoding in XML tag files?

First things first, I'm using Tomcat 5.5.1. I'm moving some code from a 
JSP document to tag files. All of the files are in XML format. The head 
of one of my tag files looks like this:

<?xml version="1.0" encoding="ISO-8859-1"?>

<jsp:root version="2.0"
 xmlns:jsp="http://java.sun.com/JSP/Page">

  <jsp:directive.tag pageEncoding="ISO-8859-1" />
  <jsp:directive.tag body-content="empty"/>


I'm precompiling the file with this Ant target:

  <target name="jspc" depends="prepare">
    <mkdir dir="${build.home}/WEB-INF/src"/>
    <jasper2 
     javaencoding="UTF-8"
     validateXml="false"
     compile="false"
     uriroot="${build.home}" 
     webXmlFragment="${build.home}/WEB-INF/generated_web.xml" 
     addwebxmlmappings="true"
     outputDir="${build.home}/WEB-INF/src" /> 
  </target>

So far, all is working. Problems start when the file contains 
literal characters or character references such as 'Ü' and '&#160;'. 
These result in errors like this:

An error occurred at line: 47 in the jsp 
file: /WEB-INF/tags/veranstaltung.tagx
Generated servlet error:
Invalid character constant

The tag file is indeed encoded as ISO-8859-1. Alternatively, I tried 
UTF-8, with the declarations changed, too. The result was the same. And 
now for the weird part:

This doesn't compile
<span>Ü</span>

This either
<span>Ü
 </span>

But this does
<span>Ü
  </span>

And so does this
<span>Ü   </span>

So, there have to be 3 whitespace characters after the umlaut to make it 
pass through jasper. =:-O
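
As background for the two encodings mentioned above, here is a minimal 
Java sketch of how 'Ü' is represented in ISO-8859-1 and in UTF-8, and 
what happens when bytes written in one encoding are read back as the 
other. It is only an illustration of an encoding mismatch in general, 
not a diagnosis of what Jasper does with the tag file. The class and 
method names (UmlautBytes, hex, recode) are made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Tomcat or Jasper.
final class UmlautBytes {

    // Render a byte array as space-separated upper-case hex, e.g. "C3 9C".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encode text with one charset, then decode the same bytes with another.
    static String recode(String text, Charset writtenAs, Charset readAs) {
        return new String(text.getBytes(writtenAs), readAs);
    }

    public static void main(String[] args) {
        // U+00DC is 'Ü'; the escape keeps this source file pure ASCII.
        String umlaut = "\u00DC";

        System.out.println("ISO-8859-1 bytes: "
                + hex(umlaut.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8 bytes:      "
                + hex(umlaut.getBytes(StandardCharsets.UTF_8)));

        // A lone 0xDC is not valid UTF-8, so the decoder substitutes U+FFFD.
        String misread = recode(umlaut,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("Latin-1 bytes read as UTF-8 give U+"
                + String.format("%04X", (int) misread.charAt(0)));
    }
}
```

In ISO-8859-1 the umlaut is the single byte DC, in UTF-8 it is the two 
bytes C3 9C. Comparing a hex dump of the tag file and of the generated 
.java file against these values shows which encoding each file 
actually uses.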


Michael

-- 
Michael Schuerig                         Those people who smile a lot
mailto:michael@schuerig.de                             Watch the eyes
http://www.schuerig.de/michael/    --Ani DiFranco, Outta Me, Onto You

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-user-help@jakarta.apache.org


Re: Character encoding in XML tag files?

Posted by Michael Schuerig <mi...@schuerig.de>.
On Thursday 30 September 2004 02:01, Ben Souther wrote:
> I've had similar problems on FedoraCore.
> Setting the LANG environment variable to:
> en_US.iso885915
> took care of it.

My normal locale is somewhat involved

LANG=POSIX
LC_CTYPE=de_DE.iso885915@euro
LC_NUMERIC=de_DE.iso885915@euro
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY=de_DE.iso885915@euro
LC_MESSAGES="POSIX"
LC_PAPER=de_DE.iso885915@euro
LC_NAME=de_DE.iso885915@euro
LC_ADDRESS=de_DE.iso885915@euro
LC_TELEPHONE=de_DE.iso885915@euro
LC_MEASUREMENT=de_DE.iso885915@euro
LC_IDENTIFICATION=de_DE.iso885915@euro
LC_ALL=

But I just tried with LANG=de_DE.iso885915@euro and the result was the 
same as before.

Thanks for the hint nonetheless, I'd never have thought of that.
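
To check whether a locale change like the one above reaches the JVM at 
all, a small Java sketch can print the default encoding the JVM has 
picked up. The class name DefaultEncoding and the describe method are 
made up for this example. How the default is derived from LANG or 
LC_CTYPE depends on the JVM and platform, so the output is something 
to inspect, not a guaranteed value.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, not part of Tomcat or Jasper.
final class DefaultEncoding {

    // Summarise the encoding-related defaults of the running JVM.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it once with the usual environment and once with LANG set 
differently shows whether the JVM's default charset changes at all. 
It would have to be run under the same environment as the Ant build 
for the comparison to say anything about the precompile step.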

Michael

-- 
Michael Schuerig                       All good people read good books
mailto:michael@schuerig.de                Now your conscience is clear
http://www.schuerig.de/michael/ --Tanita Tikaram, Twist In My Sobriety



Re: Character encoding in XML tag files?

Posted by Ben Souther <bs...@fwdco.com>.
I've had similar problems on FedoraCore.
Setting the LANG environment variable to:
en_US.iso885915
took care of it.




