Posted to user@xmlbeans.apache.org by Matthias Wessendorf <mw...@pironet-ndh.com> on 2005/05/19 16:59:32 UTC

Using XMLBeans to parse content from XML (UTF-8?)

Hi all,

I am using (or rather, evaluating) XMLBeans (1.0.4) to read my content the Java way.

I created an XSD and generated my classes/interfaces. Fine! I now have an XML file
that contains some *text*. I read it with XMLBeans, which copies the content (from the XML)
into the generated classes/interfaces. Fine!

Now I copy the (string) properties from the generated classes/interfaces to my *business classes*
and present them in an HTML file. Fine! The website works.

But if I use the (string) properties inside an XSL-FO document (to generate a PDF), it doesn't work.
FOP throws an exception (java.io.UTFDataFormatException).
I guess something is wrong with how XMLBeans handles UTF-8? But what?

My XML document uses German umlauts, but it contains
<?xml version="1.0" encoding="utf-8"?>
(followed by my content...)

I don't think it is an issue with Apache FOP (the processor I am using for my XSL-FO).
I tested the XSL-FO while debugging:
I copied the XSL-FO string (containing the *value* from my XML document) to a *local* file, and FOP then created the PDF
as expected. But not *on the fly* (with the same XSL-FO string, FOP gives me a java.io.UTFDataFormatException).

Note: if I replace the German umlaut (ö -> o), it works.

I really think I am doing something wrong with XMLBeans and UTF-8.

Does anybody have an idea for me?

Thanks in advance.
Matthias

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@xmlbeans.apache.org
For additional commands, e-mail: user-help@xmlbeans.apache.org


Re: Using XMLBeans to parse content from XML (UTF-8?)

Posted by Mark Lewis <ma...@mir3.com>.
Oops, typed before I read.  Disregard this.

On Thu, 2005-05-19 at 15:04, Mark Lewis wrote:
> On Thu, 2005-05-19 at 07:59, Matthias Wessendorf wrote:
> ...
> > but, if I use the (string) properties inside a XSL-FO (to generate a PDF) it doesn't work.
> > FOP throws an exception (java.io.UTFDataFormatException)
> > I guess there is something with xmlbeans UTF-8 wrong?  But what ?
> > 
> > My XML document uses German umlauts, but it contains
> > <?xml version="1.0" encoding="utf-8"?>
> > (followed by my content...)
> ...
> 
> Most likely the XML document you are parsing is not really in UTF-8
> encoding.  If the document was being saved in a different encoding
> (win-1252, iso 8859-1, etc) then the umlaut will be encoded as a single
> extended (>127) character, which is an illegal sequence in UTF-8, and
> you will get the error you listed above.
> 
> To fix it, you can change the declared encoding to be the encoding the
> document is really in, or change the document to be in UTF-8.
> 
> -- Mark Lewis


Re: Using XMLBeans to parse content from XML (UTF-8?)

Posted by Mark Lewis <ma...@mir3.com>.
On Thu, 2005-05-19 at 07:59, Matthias Wessendorf wrote:
...
> but, if I use the (string) properties inside a XSL-FO (to generate a PDF) it doesn't work.
> FOP throws an exception (java.io.UTFDataFormatException)
> I guess there is something with xmlbeans UTF-8 wrong?  But what ?
> 
> My XML document uses German umlauts, but it contains
> <?xml version="1.0" encoding="utf-8"?>
> (followed by my content...)
...

Most likely the XML document you are parsing is not really in UTF-8
encoding.  If the document was saved in a different encoding
(windows-1252, ISO 8859-1, etc.) then the umlaut will be encoded as a single
extended (>127) byte, which is an illegal sequence in UTF-8, and
you will get the error you listed above.
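
As a rough illustration of the point above (not code from the original poster), the following sketch checks whether a byte array is well-formed UTF-8. The class name Utf8Check and the sample word are made up for the example, and it assumes a JDK recent enough to have java.nio.charset.StandardCharsets (Java 7 or later), which is newer than the setup discussed in this thread.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
public class Utf8Check {

    /** Returns true if the bytes decode cleanly as UTF-8. */
    static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String word = "sch\u00f6n"; // "schoen" written with an o-umlaut (U+00F6)

        // In ISO 8859-1 the umlaut is the single byte 0xF6.
        byte[] latin1 = word.getBytes(StandardCharsets.ISO_8859_1);
        // In UTF-8 the same character is the two bytes 0xC3 0xB6.
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        System.out.println(latin1.length + " bytes, valid UTF-8: " + isValidUtf8(latin1));
        System.out.println(utf8.length + " bytes, valid UTF-8: " + isValidUtf8(utf8));
    }
}
```

Running a check like this against the raw bytes of the XML file (or of the XSL-FO string as it is handed to FOP) is one way to see whether the declared encoding matches the actual one.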

To fix it, you can change the declared encoding to be the encoding the
document is really in, or change the document to be in UTF-8.
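
The second option, converting the document to UTF-8, could look roughly like the sketch below. It is only a sketch: the class name Transcode and the method toUtf8 are hypothetical, the caller has to know (or guess) the encoding the bytes are really in, and it again assumes Java 7 or later for StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
public class Transcode {

    /**
     * Re-encodes bytes from the encoding they are actually in to UTF-8.
     * Bytes that are malformed in the given charset are silently replaced
     * by the decoder, so this is not a validation step.
     */
    static byte[] toUtf8(byte[] input, Charset actualEncoding) {
        String text = new String(input, actualEncoding); // decode with the real encoding
        return text.getBytes(StandardCharsets.UTF_8);     // encode explicitly as UTF-8
    }

    public static void main(String[] args) {
        byte[] latin1 = { (byte) 0xF6 }; // o-umlaut as a single ISO 8859-1 byte
        byte[] utf8 = toUtf8(latin1, StandardCharsets.ISO_8859_1);
        for (byte b : utf8) {
            System.out.printf("0x%02X ", b & 0xFF);
        }
        System.out.println();
    }
}
```

The same care applies when turning a String into bytes before handing it to another tool: calling getBytes() without a charset uses the platform default, which may not match the encoding named in the XML declaration.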

-- Mark Lewis


