Posted to j-dev@xerces.apache.org by Ric Emery <re...@cyclonecommerce.com> on 2000/08/03 02:47:50 UTC

shift_jis Element tags

Are there any known issues with using Shift_JIS characters in element tags
using Xerces 1.1.3? I receive the following error parsing an XML document
that contains shift_jis characters as element tags:
 
Parsing .\jp.xml threw: org.xml.sax.SAXParseException: The content of
elements must consist of well-formed character data or markup.
org.xml.sax.SAXParseException: The content of elements must consist of
well-formed character data or markup.
 
Using the same characters as element data does not exhibit the problem. I
have included as an attachment the XML I am trying to parse.
The document parses fine in IE5.
 
Thanks
 
 

Re: shift_jis Element tags

Posted by Andy Clark <an...@apache.org>.
Ric Emery wrote:
> Are there any known issues with using Shift_JIS characters in 
> element tags using Xerces 1.1.3? I receive the following error 
> parsing an XML document that contains shift_jis characters as 
> element tags:

I looked at your sample Shift_JIS file and I think I know the
reason why this is failing. Your file contains the following
Unicode characters as your element name:

  0xFF7A (half width katakana letter KO)
  0xFF8B (half width katakana letter HI)

In other words, the katakana for the word "coffee".

Look at the following productions in the XML specification:

  [5] Name ::= (Letter | '_' | ':') (NameChar)*
  [40] STag ::= '<' Name (S Attribute)* S? '>'
  [84] Letter ::= BaseChar | Ideographic
  [85] BaseChar ::= ...
  [86] Ideographic ::= ...

Neither BaseChar nor Ideographic includes those Unicode
characters, so they cannot appear in an element name. However,
the full-width katakana *is* allowed.
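
One possible workaround is to convert the half-width katakana to
full-width before using them in element names. The sketch below is
only an illustration and is not part of Xerces: the class and method
names are hypothetical, the BaseChar check covers just the katakana
range #x30A1-#x30FA from production [85] rather than the whole table,
and it assumes a JDK that provides java.text.Normalizer.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration; not part of Xerces.
class HalfwidthKatakanaCheck {

    // Partial check: only the katakana range of BaseChar [85] in
    // XML 1.0 (#x30A1-#x30FA). The real production lists many more ranges.
    static boolean isKatakanaBaseChar(char c) {
        return c >= '\u30A1' && c <= '\u30FA';
    }

    // Half-width katakana sit in the Halfwidth and Fullwidth Forms block.
    static boolean isHalfwidthKatakana(char c) {
        return c >= '\uFF66' && c <= '\uFF9F';
    }

    // NFKC compatibility normalization maps half-width katakana
    // to their full-width equivalents.
    static String toFullWidth(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        // The two characters from the sample file: half-width KO and HI.
        String name = "\uFF7A\uFF8B";
        String fixed = toFullWidth(name);
        for (char c : fixed.toCharArray()) {
            System.out.printf("U+%04X in katakana BaseChar range: %b%n",
                    (int) c, isKatakanaBaseChar(c));
        }
    }
}
```

Normalizing the file content this way changes the element names, so
anything that looks elements up by name would need the full-width
form as well.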

-- 
Andy Clark * IBM, JTC - Silicon Valley * andyc@apache.org

Re: shift_jis Element tags

Posted by Eric Ye <er...@locus.apache.org>.
Are you sure it parses fine in IE5? When I dragged and dropped your file into IE5, it issued an error about an unknown encoding, "shift_js".
_____


Eric Ye * IBM, JTC - Silicon Valley * ericye@locus.apache.org
