Posted to c-dev@xerces.apache.org by Peter Guyatt <pg...@telesoft-technologies.com> on 2004/05/11 14:28:15 UTC

UTF-8 Encoding problem

Hi All,

	I was wondering if someone could answer a question for me.

I parse a document as UTF-8 using Xerces-C 2.2.0 and get the following error
in my custom handler:

Fatal Error line 5, col 15, Message:An Exception occurred!
Type:TranscodingException, Message:An invalid multi-byte source text
sequence was encountered

The actual entry in the XML file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Maintenance>
	<DMPair>
		<Instance>1</Instance>
		<Name>simeecauaeiouaeiou</Name> <!-- Exception here -->
		<ServerPort>9001</ServerPort>
		<IpAddressNode0>172.16.3.28</IpAddressNode0>
		<IpAddressNode1>172.16.3.29</IpAddressNode1>
		<Enabled>False</Enabled>
		<FailureRoutingType>3</FailureRoutingType>
		<FailureRoutingData>2</FailureRoutingData>
	</DMPair>
</Maintenance>

I am pretty sure that the characters in the Name tag are fine, since I have
looked at the UTF-8 spec and their values appear to be in the valid set of
Unicode characters. This document also parses fine using Xerces-J-2.6.2.
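
A character being a valid Unicode code point does not by itself make the
bytes in the file valid UTF-8; that depends on how the file was saved. The
sketch below is only an illustration in Python, independent of Xerces. The
character U+00E9 is a hypothetical stand-in, since the archived message does
not preserve the real characters in the Name tag, and is_valid_utf8 is a
helper name made up for this example.

```python
# Hypothetical stand-in character; the real characters in <Name> are not
# preserved in the archived message.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = ch.encode("utf-8")      # the two-byte UTF-8 form
latin1_bytes = ch.encode("latin-1")  # the one-byte ISO-8859-1 form


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Same character, two different byte sequences; only one is valid UTF-8.
print(utf8_bytes.hex(), is_valid_utf8(utf8_bytes))
print(latin1_bytes.hex(), is_valid_utf8(latin1_bytes))
```

If a file declared as encoding="UTF-8" holds the one-byte form, a strict
UTF-8 decoder has grounds to reject it even though the character itself is a
legal Unicode character.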

Any insight into this will be greatly appreciated.

Thanks in advance

Pete






---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


RE: UTF-8 Encoding problem

Posted by Peter Guyatt <pg...@telesoft-technologies.com>.
Hi There,

    Thanks to everyone for the replies. However, once I actually encoded the
characters in the Name tag as UTF-8, Xerces was fine.
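
For anyone hitting the same error, here is a minimal sketch of how one might
locate the offending byte and re-encode the file. It assumes, purely for
illustration, that the file was saved as ISO-8859-1 while declaring UTF-8;
the thread does not say which encoding the file actually used. The sample
bytes and both function names are hypothetical, and the code is plain Python
rather than anything from Xerces.

```python
def first_invalid_utf8_offset(data: bytes) -> int:
    """Return the offset of the first byte that breaks UTF-8 decoding, or -1."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return -1


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode bytes that are assumed to be ISO-8859-1 as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# Hypothetical file content: the declaration says UTF-8, but the accented
# character in <Name> was saved as a single ISO-8859-1 byte (0xE9).
bad = b'<?xml version="1.0" encoding="UTF-8"?>\n<Name>sim\xe9</Name>\n'

offset = first_invalid_utf8_offset(bad)
fixed = latin1_to_utf8(bad)

# Show where decoding fails in the original, then confirm the fixed copy
# has no invalid sequence left.
print(offset, bad[offset:offset + 1].hex())
print(first_invalid_utf8_offset(fixed))
```

If the bytes really are ISO-8859-1, another option is to leave the file as it
is and change the declaration to encoding="ISO-8859-1" so that it matches the
bytes on disk.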

Thanks again

Pete
  -----Original Message-----
  From: PeiYong PY Zhang [mailto:peiyongz_xml@ca.ibm.com] On Behalf Of PeiYong Zhang
  Sent: 14 May 2004 16:46
  To: xerces-c-dev@xml.apache.org
  Subject: Re: UTF-8 Encoding problem



  Peter,

      I've tried your sample instance file with the current version, 2.5.0,
  and did not see any exception. Can you use this version rather than 2.2.0?

  Rgds
  Peiyong

Re: UTF-8 Encoding problem

Posted by PeiYong Zhang <pe...@ca.ibm.com>.
Peter,

    I've tried your sample instance file with the current version, 2.5.0,
and did not see any exception. Can you use this version rather than 2.2.0?

Rgds
Peiyong