Posted to j-dev@xerces.apache.org by "Troy R. Bundy" <ce...@smart.net> on 2001/01/15 21:06:54 UTC

Issues with latest Xerces for Java

Hi-
	I am writing to ask about issues I have encountered with the
latest version of the Apache Xerces XML Parser for Java, both as
distributed from the IBM alphaWorks web site and as distributed from
the xerces.apache.org website. I have confirmed that the issues below
exist in both distributions (v3.1.0, or the one dated Sept 2000, from
IBM, and the latest version, whose bug fixes were made on Dec 6, 2000,
from Apache).

	If you have ANY information on how I can address the below
issues, then please email me and let me know what the solutions
are (cec@smart.net). Thank you!

  Problems:

1) I have gathered that the element types {timeInstant, timeDuration,
   recurringInstant, date, time} aren't supported. Is there a projected
   date for when the parser will support them?


2) THIS QUESTION CONCERNS XML SCHEMA:

   In a sample XSD/XML file pair, I had an element defined of type
   "decimal"; however, during the parsing of the XML file the element
   wasn't marked as being 'decimal' -- it was marked as 'NMTOKEN'.

    e.g.

   (snippet from the .xsd file)
   ----------------------------

    <complexType name="USAddress">
 	<sequence>
		<element name="city" type="string" />
		<element name="zip" type="decimal" />
	</sequence>
	<attribute name="country" type="NMTOKEN" use="fixed" value="US" />
    </complexType>

    <purchaseOrder orderDate="1999-10-20">
	<shipTo country="US">
		<city>Mill Valley</city>
		<state>CA</state>
		<zip>90952</zip>
	</shipTo>

    PROBLEM: 'zip' was reported by the parser as being of type 'NMTOKEN'; 
                  why??


3) THIS QUESTION CONCERNS XML SCHEMA:

 
   Can elements which are defined within a complexType be "ref"ed from
within another complexType?

4) THIS QUESTION CONCERNS XML SCHEMA:

  Given the following snippets (.xsd and .xml), I got an error. 
  I am not sure why I got the error or what meaning I should
  gather from the error -- can you explain??

  (snippet from .xml file)
  ------------------------
 
  <item partnum="872-AA">

  (snippet from .xsd file)
  ------------------------

  <element name="item">
     .
     .
     .
    <complexType>
      .
      .
      .
       <attribute name="partnum" type="SKU">
    </complexType>
  </element>

  <simpleType name="SKU">
 	<restriction base="string">
		<pattern value="\d{3}-[A-Z]{z}" />
	</restriction>
  </simpleType>

   ERROR: "Internal Error: this element have a simpleType
	   but no datatype validator was found, element prefix: -1,
           localpart: 60, rawname: 60, uri: 0, locapart: quantity"

  The error apparently occurs against the "partnum" attribute.

5) THIS QUESTION CONCERNS XML SCHEMA:

   Using DOM processing, it seems that every child node list has an extra 
   and apparently useless and incorrect child node in it with the
   following characteristics:

     name="#text"
     type="TEXT_NODE"
     value = (new line) or "\n"

   Why is this the case?

Please respond -- thank you -- Troy R. Bundy
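
On question 4, a small sketch of what the SKU pattern facet is meant to accept. The pattern as posted ends in "{z}", which is not a numeric quantifier; this sketch assumes "{2}" was intended, as in the SKU type of the W3C XML Schema Primer's purchase-order example. It uses java.util.regex as a stand-in for the schema validator (XML Schema regular expressions differ from Java's in some details), and the class and method names are made up for illustration. It does not show whether the typo is related to the internal error reported above.

```java
import java.util.regex.Pattern;

public class SkuPattern {
    // Assumption: quantifier {2} instead of the posted {z}.
    // Three digits, a hyphen, then two upper-case ASCII letters.
    static final Pattern SKU = Pattern.compile("\\d{3}-[A-Z]{2}");

    /** Returns true if the whole value matches the SKU pattern. */
    public static boolean isValidSku(String value) {
        // matches() tests the entire string, which mirrors how a schema
        // pattern facet applies to the whole attribute value.
        return SKU.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidSku("872-AA"));   // true
        System.out.println(isValidSku("872-aa"));   // false: lower-case letters
        System.out.println(isValidSku("0872-AA"));  // false: four digits
    }
}
```

Under that assumption, the partnum value "872-AA" from the .xml snippet satisfies the pattern.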


Re: Issues with latest Xerces for Java

Posted by Guoliang Cao <bi...@mail.com>.
Andy,

I just read the Features page.  Ignorable whitespace should be ignored by
DOMParser by default. However, when I output a document, a lot of text nodes
are still there.  Even when I call setFeature explicitly, it doesn't filter out
the whitespace-only text nodes.  I'm using Xerces 1.2.3.  What could be the
reason?

Thanks,
Guoliang

--- CODE --------------------------------------------
        DOMParser domParser = new DOMParser();

        domParser.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace", false);
        domParser.parse(is);

--- XML ----------------------------------------------
<?xml version="1.0"?>
<provRequest xmlns='http://www.ispsoft.com'
    xmlns:xsi='http://www.w3.org/1999/XMLSchema-instance'
    xsi:schemaLocation='http://www.ispsoft.com/ ...xsd'>

    <authToken>Auth1</authToken>
    <Account ID="aaa.bbb">
............

--- OUTPUT-------------------------------------------

<DOCUMENT> #document:

    <ELEM> provRequest:

    <ATTR> xmlns = 'http://www.ispsoft.com'

    <ATTR> xmlns:xsi = 'http://www.w3.org/1999/XMLSchema-instance'

    <ATTR> xsi:schemaLocation = 'http://www.ispsoft.com  ...xsd'

        <TEXT> #text:



        <ELEM> authToken:

            <TEXT> #text: Auth1

        <TEXT> #text:


        <ELEM> Account:

        <ATTR> ID = 'aaa.bbb'

            <TEXT> #text:




>
>
> > 5) THIS QUESTION CONCERNS XML SCHEMA:
> >
> >    Using DOM processing, it seems that every child node list has an extra
> >    and apparently useless and incorrect child node in it with the
> >    following characteristics:
>
> That's the whitespace at the end of your lines. Don't worry, it's
> normal and expected. XML parsers are required to report all of the
> information in the instance document -- and this information is
> in the document; it's just not important to *you*. ;)
>
> Check the features page in the docs for a feature you can use on
> the DOM parser to automatically remove the ignorable whitespace.
>
> --
> Andy Clark * IBM, TRL - Japan * andyc@apache.org
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: xerces-j-dev-unsubscribe@xml.apache.org
> For additional commands, e-mail: xerces-j-dev-help@xml.apache.org
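
A minimal sketch of the whitespace text nodes discussed above, and of one workaround when the feature appears to have no effect. It uses the JDK's JAXP DocumentBuilder rather than the Xerces DOMParser class, and the names WhitespaceNodes, parse and stripWhitespace are made up for illustration. As far as the Xerces features documentation describes it, include-ignorable-whitespace can only drop whitespace that the parser knows to be ignorable from a grammar's content model, so when no such grammar is in play a post-parse pass along these lines may be needed.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class WhitespaceNodes {
    /** Parses an XML string into a DOM tree with the default JAXP parser. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Recursively removes text nodes that contain only whitespace. */
    public static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // save before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<shipTo>\n  <city>Mill Valley</city>\n"
                   + "  <zip>90952</zip>\n</shipTo>";
        Node root = parse(xml).getDocumentElement();
        // The line breaks and indentation between elements are text nodes.
        System.out.println(root.getChildNodes().getLength()); // 5
        stripWhitespace(root);
        System.out.println(root.getChildNodes().getLength()); // 2
    }
}
```

Text nodes with real content, such as "Mill Valley", are left in place; only whitespace-only nodes are removed.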


Re: Issues with latest Xerces for Java

Posted by Andy Clark <an...@apache.org>.
"Troy R. Bundy" wrote:
> 1) I have gathered that the element types {timeInstant, timeDuration,
>    recurringInstant, date, time} aren't supported. Is there a projected
>    date for when the parser will support them?

As soon as you donate the code. ;)

> 2) THIS QUESTION CONCERNS XML SCHEMA:
> 
>    In a sample XSD/XML file pair, I had an element defined of type
>    "decimal"; however, during the parsing of the XML file the element
>    wasn't marked as being 'decimal' -- it was marked as 'NMTOKEN'.

Define "marked as NMTOKEN". Where is it being reported as NMTOKEN?