Posted to derby-dev@db.apache.org by "A B (JIRA)" <de...@db.apache.org> on 2006/09/08 02:54:23 UTC

[jira] Updated: (DERBY-1759) XMLSERIALIZE operator doesn't follow SQL/XML spec in some areas when serializing a sequence.

     [ http://issues.apache.org/jira/browse/DERBY-1759?page=all ]

A B updated DERBY-1759:
-----------------------

    Attachment: d1759_v1.patch
                d1759_v1.stat


Attaching a patch for this issue, d1759_v1.patch.  The patch does the following:

  1. Adds logic to SqlXmlUtil.serializeToString() to perform the
     steps of "sequence normalization" as required by XML serialization
     rules.  This includes throwing an error if the user attempts to
     explicitly serialize a sequence that contains one or more top-level
     attribute nodes.

  2. In order to ensure that the serialization error is only thrown
     when the user performs an explicit XMLSERIALIZE, I added a
     new field, "containsTopLevelAttr", to the XML class.  This field
     indicates whether or not the XML value corresponds to a sequence
     with a top-level attribute node.  If the user calls XMLSERIALIZE
     on an XMLDataValue for which containsTopLevelAttr is true,
     then we'll throw the serialization error 2200W as dictated by
     SQL/XML. 

  3. Adds appropriate test cases to lang/xml_general.sql to verify
     the fix.

  4. Since Xalan doesn't provide a built-in way to retrieve a sequence
     of attribute values (as opposed to attribute nodes), I also included
     in lang/xml_general.sql a (rather ugly) way to accomplish that
     task within the serialization restrictions of SQL/XML.
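
As a rough illustration of item 2, here is a minimal sketch of a flag that is recorded on the value but only checked on the explicit serialization path. The class XmlValueSketch and its methods are hypothetical stand-ins, not the actual Derby XML class or the code in the patch; only the field name "containsTopLevelAttr" and SQLSTATE 2200W come from the description above.

```java
import java.sql.SQLException;

/** Hypothetical stand-in for an XML value; NOT Derby's actual XML class. */
final class XmlValueSketch {
    private final String serializedText;

    // Mirrors the idea of the "containsTopLevelAttr" field described above:
    // true if the sequence behind this value has a top-level attribute node.
    private final boolean containsTopLevelAttr;

    XmlValueSketch(String serializedText, boolean containsTopLevelAttr) {
        this.serializedText = serializedText;
        this.containsTopLevelAttr = containsTopLevelAttr;
    }

    /** Internal access (assumed): does not check the flag, so it never throws. */
    String internalText() {
        return serializedText;
    }

    /** Explicit XMLSERIALIZE path: rejects values flagged with a top-level attribute. */
    String serializeExplicitly() throws SQLException {
        if (containsTopLevelAttr) {
            // The message text is illustrative; 2200W is the SQLSTATE named above.
            throw new SQLException(
                "Cannot serialize a sequence with a top-level attribute node",
                "2200W");
        }
        return serializedText;
    }
}
```

The point of keeping the flag on the value rather than failing at query time is that the error surfaces only when the user asks for serialization, which is the behavior item 2 describes.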

I also added logic to put spaces between adjacent atomic values in a result sequence, per XML "sequence normalization" rules.  However, I didn't add any test cases for this because I couldn't come up with an XPath query that would actually return such a sequence.  Nonetheless, the relevant logic seems straightforward and I have included it in this patch.
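
For reviewers who want to picture the space-insertion rule, here is a small standalone sketch, assuming a simplified item model in which each item is either an atomic value or an already-serialized node. The class SequenceJoinSketch and its Item type are hypothetical and are not the code in SqlXmlUtil; escaping of atomic values is deliberately left out.

```java
import java.util.List;

/** Hypothetical sketch of the "space between adjacent atomic values" rule. */
final class SequenceJoinSketch {

    /** Simplified item: an atomic value or an already-serialized node (assumption). */
    static final class Item {
        final String text;
        final boolean atomic;

        private Item(String text, boolean atomic) {
            this.text = text;
            this.atomic = atomic;
        }

        static Item atomic(String value) {
            return new Item(value, true);
        }

        static Item node(String markup) {
            return new Item(markup, false);
        }
    }

    /**
     * Concatenates the items in order, inserting a single space only
     * between two atomic values that are directly adjacent.
     */
    static String join(List<Item> items) {
        StringBuilder out = new StringBuilder();
        boolean previousWasAtomic = false;
        for (Item item : items) {
            if (item.atomic && previousWasAtomic) {
                out.append(' ');
            }
            out.append(item.text);
            previousWasAtomic = item.atomic;
        }
        return out.toString();
    }
}
```

With the two atomic values "48" and "1900-02-08" this yields "48 1900-02-08"; an atomic value next to a node gets no separator.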

I ran xmlSuite with these changes against insane jars with ibm142 on Windows 2000 and all tests passed.  I plan to run derbyall as a sanity check tonight but do not expect any failures since the changes are all limited to XML.  In the meantime, any early feedback/review/comments would be much appreciated.  Thanks!

> XMLSERIALIZE operator doesn't follow SQL/XML spec in some areas when serializing a sequence.
> --------------------------------------------------------------------------------------------
>
>                 Key: DERBY-1759
>                 URL: http://issues.apache.org/jira/browse/DERBY-1759
>             Project: Derby
>          Issue Type: Bug
>    Affects Versions: 10.2.1.0, 10.2.2.0, 10.3.0.0
>            Reporter: A B
>         Assigned To: A B
>             Fix For: 10.2.2.0, 10.3.0.0
>
>         Attachments: d1759_v1.patch, d1759_v1.stat
>
>
> The SQL/XML specification dictates that, when serializing a sequence of XML items, the XMLSERIALIZE operator must first "normalize" the sequence based on the rules defined here:
>   http://www.w3.org/TR/xslt-xquery-serialization/#serdm
> The current Derby implementation doesn't perform such normalization, which leads to two ways in which the results of an XMLSERIALIZE operator may not agree with the required behavior:
>   1. Sequences of atomic values will not have spaces between them,
>      but the space is required as part of step 3 of the normalization
>      rules at the above link.
>   2. Derby will allow serialization of a sequence even if it has
>      a top-level Attribute node in it, but the rules of normalization
>      dictate that an error should be thrown instead (step 7).
> Both of these behaviors can be seen with the following query.
> values
>   xmlserialize(
>     xmlquery('/ageinfo/@*' passing by ref
>       xmlparse(
>         document '<ageinfo age="48" birthdate="1900-02-08"/>'
>         preserve whitespace
>       )
>       empty on empty
>     )
>     as char(50)
>   )
> Derby will currently return the following result from this statement:
> 1
> --------------------------------------------------
> 481900-02-08
> This result does not abide by the SQL/XML specification because a) Derby allowed serialization of a sequence having a top-level attribute node (actually, the sequence had two), and b) the atomic values produced from the attributes were displayed without a space between them.
> The correct behavior for the above example is to return a serialization error caused by the presence of an Attribute node in the sequence.
> If the example were rewritten as, say:
> -    xmlquery('/ageinfo/@*' passing by ref
> +    xmlquery('fn:data(/ageinfo/@*)' passing by ref
> then the attribute nodes are no longer present--we only have their atomic values, which is allowed.  Thus the correct result should then be:
> 1
> --------------------------------------------------
> 48 1900-02-08
> Note, though, that Xalan doesn't appear to support the "fn:data" function, so this rewritten query won't actually work.  I tried using Xalan's built-in string function, as follows:
> -    xmlquery('/ageinfo/@*' passing by ref
> +    xmlquery('string(/ageinfo/@*)' passing by ref
> but Xalan only returns the first attribute in that case; it doesn't return the second one.  So part of this Jira issue is probably going to involve figuring out how to allow a user to retrieve a sequence of attribute *values* (as opposed to attribute nodes) using Xalan and still abide by the SQL/XML rules.
