Posted to derby-user@db.apache.org by Daniel John Debrunner <dj...@apache.org> on 2006/08/29 18:11:57 UTC

XMLPARSE/XMLSERIALIZE question

I insert an XML document into an XML column using XMLPARSE (with 10.2)
and select it using XMLSERIALIZE. The raw input to the XMLPARSE had as
its first line:

<?xml version="1.0" encoding="utf-8" ?>

I don't see that being generated when I select it with XMLSERIALIZE, is
that expected?

Thanks,
Dan.


Re: XMLPARSE/XMLSERIALIZE question

Posted by Army <qo...@gmail.com>.
Daniel John Debrunner wrote:
> I insert an XML document into an XML column using XMLPARSE (with 10.2)
> and select it using XMLSERIALIZE. The raw input to the XMLPARSE had as
> its first line:
> 
> <?xml version="1.0" encoding="utf-8" ?>
> 
> I don't see that being generated when I select it with XMLSERIALIZE, is
> that expected?

Short answer: Yes, that's expected, and is covered by the documentation here:

http://db.apache.org/derby/docs/dev/ref/rreffuncxmlserialize.html

<begin quote>

Attention: Serialization is performed based on the SQL/XML serialization rules. 
These rules, combined with the fact that Derby supports only a subset of the 
XMLSERIALIZE syntax, dictate that the results of an XMLSERIALIZE operation are 
not guaranteed to be in-tact copies of the original XML text.

<end quote>

Longer answer:

Since the XMLSERIALIZE operator currently supports neither the DOCUMENT nor the
CONTENT keyword, the SQL/XML spec says that the default is CONTENT (6.7:Syntax 
Rules:2.a).  Further, since the XMLSERIALIZE operator doesn't currently support 
the <XML declaration option> syntax, the SQL/XML spec says that the default for 
that option is "Unknown" (6.7:General Rules:2.f).  Put those together and that 
in turn means that the value of "OMIT XML DECLARATION" must be "Yes", as stated 
in section 10.15:General Rules:8.c.  So we omit the XML declaration when we 
serialize an XML document in Derby.

Army


Re: XMLPARSE/XMLSERIALIZE question

Posted by Army <qo...@gmail.com>.
Suavi Ali Demir wrote:
> Is the output of XMLSERIALIZE varchar?

The output type of XMLSERIALIZE is specified as part of the syntax.  Right now Derby 
requires that the target type be a SQL character type: CHAR, VARCHAR, LONG 
VARCHAR, or CLOB.

> If you get back a byte[] (or binary data), it should put the xml header 
> so that the client can parse the xml using correct encoding. 

Derby doesn't currently allow serialization to a binary type, so it looks like 
the xml header isn't required.
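
For a client that does end up needing bytes, the encoding happens on the client 
side, so the client is the one who knows which encoding to declare.  A minimal 
sketch in Java, assuming the string came back from XMLSERIALIZE without a 
declaration; the class and method names are hypothetical and not part of Derby:

```java
import java.nio.charset.StandardCharsets;

final class XmlBytesSketch {
    // Hypothetical helper, not a Derby API: prepend a declaration that names
    // the encoding we are about to use, then encode with that same encoding.
    static byte[] toUtf8Document(String serialized) {
        String declared = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>" + serialized;
        return declared.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf8Document("<a>\u00e9</a>");
        // Decoding with the declared encoding gives back the same characters.
        System.out.println(new String(bytes, StandardCharsets.UTF_8));
    }
}
```

The point of the sketch is only that the declared encoding and the encoding 
actually used for the bytes are chosen in the same place, so they cannot disagree.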

>   I am speculating that when you use xmlserialize, you probably serialize
> into varchar, so there is no need for encoding info. As far as the rest of
> the xml header goes, without the header it is still parseable, or it is 
> easier for the client to add <?xml version="1.0"?> characters than to
> remove them (The client may want to plug this xml content into an existing
> xml, if there is xml header it causes pain), so it is working correctly 
> i think.

Thanks for the extra input, Suavi!

Army


Re: XMLPARSE/XMLSERIALIZE question

Posted by Suavi Ali Demir <de...@yahoo.com>.
Is the output of XMLSERIALIZE varchar?
   
  If you are getting back a java string (or varchar etc), there is no need for the xml header to mention encoding (you already have characters). If you get back a byte[] (or binary data), it should put the xml header so that the client can parse the xml using correct encoding. 
   
  I am speculating that when you use xmlserialize, you probably serialize into varchar, so there is no need for encoding info. As far as the rest of the xml header goes, without the header it is still parseable, or it is easier for the client to add <?xml version="1.0"?> characters than to remove them (The client may want to plug this xml content into an existing xml, if there is xml header it causes pain), so it is working correctly i think.
   
  Also, "<?xml version="1.0" encoding="utf-8"?>" would not have been correct if you have the xml content as a java string in memory, because java strings are not utf-8.
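   
  To illustrate the points above for a Java client: a minimal sketch, assuming the serialized value is a single-rooted XML document held in a String. The class and method names are hypothetical; only the standard JAXP parser is used.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class XmlHeaderSketch {
    static final String DECLARATION = "<?xml version=\"1.0\"?>";

    // Parse XML text that is already characters; no declaration is needed
    // because there are no bytes whose encoding has to be guessed.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    // Add a declaration only if the text does not already start with one
    // (a simple prefix check, good enough for this sketch).
    static String withDeclaration(String xml) {
        return xml.startsWith("<?xml ") ? xml : DECLARATION + xml;
    }

    public static void main(String[] args) {
        String serialized = "<order><item>widget</item></order>";
        System.out.println(parse(serialized).getDocumentElement().getNodeName());
        System.out.println(withDeclaration(serialized));
    }
}
```

  Adding the header is a one-line prefix, while a client that wants to embed the content in a larger document can use the string as it is.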
   
  Regards,
  Ali
   
  
Daniel John Debrunner <dj...@apache.org> wrote:
  
I insert an XML document into an XML column using XMLPARSE (with 10.2)
and select it using XMLSERIALIZE. The raw input to the XMLPARSE had as
its first line:

<?xml version="1.0" encoding="utf-8" ?>

I don't see that being generated when I select it with XMLSERIALIZE, is
that expected?

Thanks,
Dan.