Posted to j-users@xerces.apache.org by Allen Miller <al...@listenpoint.com> on 2002/05/16 22:13:08 UTC
trouble parsing unicode
Howdy,
I've tried parsing the following string with both Xerces 1.4.4 and 2.0.1,
and the em-dash comes back as a different character. Any ideas?
This is the string going in. Notice the em-dash following 'business':
<?xml version="1.0" encoding="UTF-8"?>
<group xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<groupHeader>
<version>1.0</version>
<locale>US</locale>
<createdBy>30</createdBy>
<creationDate>2001-08-10T17:00:00-08:00</creationDate>
</groupHeader>
<groupDefinition>
<type>Product</type>
<allowSubGroups>true</allowSubGroups>
</groupDefinition>
<groupData>
<name>pl01</name>
<description>But the benefit does not stop there.
Consolidating data, voice, and video networks into one architecture
enables managers not only to bring spiraling costs under control, it
also gives companies a new way to do business—a better way to
compete. </description>
<owner ownerID="30"></owner>
<memberList/>
</groupData>
</group>
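The post does not show how the string reaches the parser. As a minimal sketch, assuming JAXP's `DocumentBuilder` (which Xerces implements) and a hypothetical helper named `parseDescription`, the following encodes the string as UTF-8 bytes explicitly, so the bytes match the `encoding="UTF-8"` declaration instead of depending on the platform default encoding. `getTextContent` is a DOM Level 3 method and is newer than the Xerces versions mentioned above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class ParseUtf8 {
    // Hypothetical helper: parse the XML string and return the text of the
    // first <description> element.
    static String parseDescription(String xml) {
        try {
            // Encode explicitly as UTF-8 so the bytes agree with the XML declaration.
            byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(bytes));
            return doc.getElementsByTagName("description").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        // \u2014 is the em-dash; the document is a cut-down stand-in for the one above.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<group><description>business\u2014a better way</description></group>";
        String text = parseDescription(xml);
        // Report whether the em-dash survived parsing, without printing it.
        System.out.println("em-dash preserved: " + (text.indexOf('\u2014') >= 0));
    }
}
```

If this reports that the em-dash is preserved, the parser is probably not where the character changes.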
This is what is parsed. Notice the character following 'business':
<group>
<groupData>
<description>But the benefit does not stop there. Consolidating
data, voice, and video networks into one architecture
enables managers not only to bring spiraling costs under
control, it also gives companies a new way to do
businessùa better way to compete.
</description><memberList></memberList><name>pl01
1021576965889</name><owner
ownerID="30"></owner></groupData><groupDefinition>
<allowSubGroups>true</allowSubGroups><type>Product</type></groupDefinition><groupHeader>
<createdBy>30</createdBy><creationDate>2001-08-10T17:00:00.000</creationDate><locale>US</locale><version>1.0</version></groupHeader></group>
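To tell whether the parser or the display is at fault, it can help to print code points rather than characters. Byte 0x97 is the em-dash in windows-1252 and 'ù' in code page 437, so one possible explanation (an assumption, not something the post confirms) is that the parsed text is correct and only the console output is decoded with the wrong code page. A small sketch, with hypothetical helpers `dump` and `hexUtf8`:

```java
import java.nio.charset.StandardCharsets;

public class CodePointDump {
    // Render each code point as U+XXXX so the result does not depend on
    // the console's character encoding.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    // Show the UTF-8 bytes of a string as two-digit hex values.
    static String hexUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the text returned by the parser; \u2014 is the em-dash.
        String parsed = "s\u2014a";
        System.out.println(dump(parsed));       // U+2014 means the em-dash survived
        System.out.println(hexUtf8("\u2014"));  // the bytes a UTF-8 writer should emit
    }
}
```

If the dump of the parsed description shows U+2014, the data is intact and the output path is the place to look; U+00F9 would mean the character was already changed before or during parsing.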