Posted to xindice-dev@xml.apache.org by Christian Gross <ma...@devspace.com> on 2003/01/11 16:59:12 UTC

Proposed Patch

Hi

I have appended a patch to the end of this email.  It addresses what I 
think is an issue.

Consider the following query:

//user[identifier="ZEUS20030108132147141"]

The XMLRPC answer is the following:

  <?xml version="1.0" encoding="UTF-8" ?>
  <methodResponse>
  <params>
  <param>
  <value>
  <struct>
  <member>
   <name>result</name>
   <value><?xml version="1.0"?> <result count="1"><user 
xmlns:src="http://xml.apache.org/xindice/Query" src:col="/db/users" 
src:key="cgross@devspace.com"> <identifier 
xmlns:src="http://xml.apache.org/xindice/Query">ZEUS20030108132147141</identifier> 
<username 
xmlns:src="http://xml.apache.org/xindice/Query">cgross@devspace.com</username> 
<password 
xmlns:src="http://xml.apache.org/xindice/Query">patches</password> 
<workspaces xmlns:src="http://xml.apache.org/xindice/Query"> <Name 
xmlns:src="http://xml.apache.org/xindice/Query">class scWorkspaces</Name> 
</workspaces></user></result></value>
   </member>
   </struct>
   </value>
   </param>
   </params>
   </methodResponse>

The problem I see with this result for a query is that you are embedding a 
document within a document.  This means I have three document processing 
cycles.  The first is to process the XMLRPC message.  Then I have to 
process the result value and see how many elements there are.  And then I 
have to parse each individual item.  The patch converts the result set into 
an array of elements that are returned to the user.  This makes processing 
the result set simpler since the client only needs to iterate the array.
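
A minimal sketch of what that client-side iteration could look like in Java. It assumes the XML-RPC client library decodes a struct into a Hashtable and an array into a Vector (as the Hashtable/Vector types in the patch suggest). The class name ArrayResultClient and the hand-built response in main are hypothetical stand-ins, not part of Xindice.

```java
import java.util.Hashtable;
import java.util.Vector;

public class ArrayResultClient {
    // Struct member names used by the patch: "result" and the new "count".
    static final String RESULT = "result";
    static final String COUNT = "count";

    /** Copies each serialized node out of the decoded response struct. */
    public static String[] readNodes(Hashtable<String, Object> response) {
        Vector<?> nodes = (Vector<?>) response.get(RESULT);
        String[] out = new String[nodes.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = (String) nodes.elementAt(i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for what an XML-RPC library would hand back.
        Vector<String> nodes = new Vector<>();
        nodes.addElement("<user><identifier>A</identifier></user>");
        nodes.addElement("<user><identifier>B</identifier></user>");

        Hashtable<String, Object> response = new Hashtable<>();
        response.put(COUNT, Integer.toString(nodes.size()));
        response.put(RESULT, nodes);

        // No second XML parse is needed just to find the individual items.
        for (String xml : readNodes(response)) {
            System.out.println(xml);
        }
    }
}
```

Each array entry is still an XML string, so a client that needs a DOM for an item parses only that item.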

Comments?


Christian Gross
Software Engineering Consultant / Trainer
http://www.devspace.com
North America: 1-450-675-4208
Europe: +41.1.701.1166


*****************************************
diff -r -u java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java 
../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java
--- 
java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java 
2003-01-11 16:47:05.000000000 +0100
+++ 
../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java 
2002-12-27 19:37:24.000000000 +0100
@@ -83,7 +83,6 @@
     public static final String NAMESPACES = "namespaces";
     public static final String CONFIGURATION = "configuration";
     public static final String META = "meta";
-   public static final String COUNT = "count";

     public static final String MISSING_COLLECTION_PARAM = "Required 
parameter 'collection' not found.";
     public static final String MISSING_NAME_PARAM = "Required parameter 
'name' not found.";
diff -r -u java/src/org/apache/xindice/server/rpc/messages/Query.java 
../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/messages/Query.java
--- java/src/org/apache/xindice/server/rpc/messages/Query.java  2003-01-11 
16:47:13.000000000 +0100
+++ 
../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/messages/Query.java 
2002-11-22 11:15:35.000000000 +0100
@@ -71,7 +71,6 @@

  import java.util.Enumeration;
  import java.util.Hashtable;
-import java.util.Vector;

     /**
      * Executes a query against a document or collection
@@ -106,7 +105,7 @@
        }

        Hashtable result = new Hashtable();
-      queryWrapper( result, ns);
+      result.put(RESULT, queryWrapper( ns ));
        return result;
     }

@@ -136,34 +135,35 @@
      * Adds additional meta data to the query response and turns it into a
      * Document.
      *
-    * @param result The Hashtable that contains the XMLRPC resultset
      * @param ns The NodeSet containing the query results.
      * @return the result set as an XML document
      * @exception Exception
      */
-   private void queryWrapper( Hashtable result, NodeSet ns ) throws 
Exception {
-       Vector resultArray = new Vector();
-       int count = 0;
-       while( ns != null && ns.hasMoreNodes())
-       {
-           Node n = ns.getNextNode();
-           if( n.getNodeType() == Node.DOCUMENT_NODE)
-           {
-               n = ((Document)n).getDocumentElement();
-           }
-
-           if( n instanceof DBNode)
-           {
-               ((DBNode)n).expandSource();
-           }
-
-           resultArray.addElement( TextWriter.toString( n));
-           count ++;
-       }
-       result.put( COUNT, Integer.toString( count));
-       result.put( RESULT, resultArray);
-   }
+   private String queryWrapper( NodeSet ns ) throws Exception {
+      // Turn the NodeSet into a document.
+      DocumentImpl doc = new DocumentImpl();
+
+      Element root = doc.createElement( "result" );
+      doc.appendChild( root );
+      int count = 0;
+      while ( ns != null && ns.hasMoreNodes() ) {
+         Node n = ns.getNextNode();

-}
+         if ( n.getNodeType() == Node.DOCUMENT_NODE ) {
+            n = ( ( Document ) n ).getDocumentElement();
+         }
+
+         if ( n instanceof DBNode ) {
+            ( ( DBNode ) n ).expandSource();
+         }

+         root.appendChild( doc.importNode( n, true ) );
+         count++;
+      }
+
+      root.setAttribute( "count", Integer.toString( count ) );

+      return TextWriter.toString( doc );
+   }
+
+}

Re: Proposed Patch

Posted by Gianugo Rabellino <gi...@apache.org>.

Christian Gross wrote:

>> 1. XML-RPC access in Xindice was, and still, is meant as a network 
>> transport for the networked XML:DB Java API: the fact of having a 
>> generic XML-RPC access to Xindice is just a (pleasant?) consequence. 

> Does this mean then that XMLRPC is a way to realize XMLDB?  If so then 
> it would be best for me to simply chuck away the XMLRPC and build an 
> AXIS layer.

Probably (and always IMHO) yes.

>> 2. Since XML-RPC direct access is somehow outside of the Xindice scope 
>> and not directly supported as it stands now, my suggestion is to feel 
>> free to add to your particular setup any XML-RPC method you might see 
>> as more consistent with your particular environment. We are currently 
>> discussing on (and if) give hooks to users to extend the XML-RPC layer 
>> in a structured way, but as of now it's more than enough to place your 
>> class in the o.a.x.server.rpc.messages package to add your own XML-RPC 
>> accessor.
> 
> 
> That would mean changing the way the class is loaded since it seems it 
> expects org.apache.xindice.rpc.messages.*, yes?

Exactly. Hopefully we'll soon come out with a mechanism to plug in 
XML-RPC methods in any namespace (even if, given that XML-RPC is just a 
transport, I'm personally not that convinced that this is a Good Thing).

>> This said, I'm still not getting what you mean when you talk about 
>> XPath result not being specification compliant. 
> 
> In the XPath specification consider the following:
> 
> 3.3 Node-sets
> 
> A location path can be used as an expression. The expression returns the 
> set of nodes selected by the path.
> 
> Now comes the question what is a node-set?  In the case of Xalan it is a 
> w3 nodelist.  And a nodelist is basically an array of nodes.  

Well, yes. With the exception that any node might contain an arbitrary 
number of nodes.

> But the problem is that the XMLRPC layer is used as a transport and the 
> result tag is generated by the *.rpc.messages.Query class, not the 
> XIndice infrastructure.  Consider the following code:
> 
>    private String queryWrapper( NodeSet ns ) throws Exception {
>       // Turn the NodeSet into a document.
[...]
> 
> The nodeset is the result of the query, which is correct, as per the 
> XPath spec.  But in the code the nodeset is converted into a document, 
> even though it does not need to be converted into a document.  XMLRPC 
> allows the saving of an array, which is a 1 to 1 mapping of a nodeset.  
> Saving the nodeset into a document breaks the spec of XPath 3.3 since 
> you are generating a document.  And for those clients that are expecting a 
> nodeset as a result-set they need to map the document back into a 
> nodeset.  And in my case where SOAP and the serialization is handled 
> automatically this mapping adds an extra processing cycle.

OK, now I see your point, thanks for clarifying it. Basically what you 
don't want to see is another Document embedded in a Document. Which 
makes sense, but what if we returned not a Document but a 
DocumentFragment or, even better, just a Node? I'm just making some wild 
guesses; I'd have to check what needs to be changed (probably 
o.a.x.client.xmldb.ResourceSetImpl.java, which takes a document as a 
parameter, should be enough: James, Kimbro?). But this said, we should 
also be careful about what this might change for people (ab?)using 
the XML-RPC API with other languages.
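
To make the DocumentFragment idea concrete, here is a hedged sketch that uses only the standard org.w3c.dom and JAXP APIs. FragmentResult and toFragment are hypothetical names; nothing here is existing Xindice code, and whether ResourceSetImpl could consume such a fragment is untested.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class FragmentResult {
    /** Copies the matched nodes into a fragment, with no wrapper element. */
    public static DocumentFragment toFragment(Document owner, NodeList nodes) {
        DocumentFragment fragment = owner.createDocumentFragment();
        for (int i = 0; i < nodes.getLength(); i++) {
            // importNode makes a deep copy, so the source tree is left untouched.
            fragment.appendChild(owner.importNode(nodes.item(i), true));
        }
        return fragment;
    }

    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document source = builder.parse(new InputSource(new StringReader(
                "<users><user>a</user><user>b</user></users>")));

        NodeList hits = source.getElementsByTagName("user");
        DocumentFragment fragment = toFragment(builder.newDocument(), hits);

        // The fragment's children are the result nodes themselves.
        System.out.println(fragment.getChildNodes().getLength());
    }
}
```

A fragment still has to be serialized for transport, so on its own this only removes the wrapper element, not the string escaping in the XML-RPC layer.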

Ciao,

-- 
Gianugo Rabellino

Re: Proposed Patch

Posted by Christian Gross <ma...@devspace.com>.

At 15:30 1/13/2003 +0100, you wrote:
>Christian Gross wrote:
>>At 23:43 1/13/2003 +1100, you wrote:
>1. XML-RPC access in Xindice was, and still, is meant as a network 
>transport for the networked XML:DB Java API: the fact of having a generic 
>XML-RPC access to Xindice is just a (pleasant?) consequence. This means 
>that it might well be possible that in the future the scenario will 
>change: as we switched from CORBA to XML-RPC, we might switch to RMI, SOAP 
>or WebDAV, without any backward compatibility issue to consider (meaning 
>that we will be backward compatible to XML:DB clients, not to XML-RPC 
>calls that the client makes). While I don't see the XML-RPC stuff going 
>away anytime soon, please note that there is no contract whatsoever that 
>the way XML-RPC access is implemented will not change in the future: our 
>only contract with users is to have a consistent client-server scenario 
>with the XML:DB APIs.

Does this mean then that XMLRPC is a way to realize XMLDB?  If so then it 
would be best for me to simply chuck away the XMLRPC and build an AXIS layer.

>2. Since XML-RPC direct access is somehow outside of the Xindice scope and 
>not directly supported as it stands now, my suggestion is to feel free to 
>add to your particular setup any XML-RPC method you might see as more 
>consistent with your particular environment. We are currently discussing 
>on (and if) give hooks to users to extend the XML-RPC layer in a 
>structured way, but as of now it's more than enough to place your class in 
>the o.a.x.server.rpc.messages package to add your own XML-RPC accessor.

That would mean changing the way the class is loaded since it seems it 
expects org.apache.xindice.rpc.messages.*, yes?

>This said, I'm still not getting what you mean when you talk about XPath 
>result not being specification compliant. AFAIK there is no cross-platform 
>and standard way of returning an XPath result, so the actual decision is 
>left to the implementor. But I'd be more than willing to know more about 
>this: can you point me out to some documentation on this topic?

In the XPath specification consider the following:

3.3 Node-sets

A location path can be used as an expression. The expression returns the 
set of nodes selected by the path.

Now comes the question what is a node-set?  In the case of Xalan it is a w3 
nodelist.  And a nodelist is basically an array of nodes.  Ok one can argue 
that nodelist is a node and therefore the result tag does not break the 
notation.  The only comment there is that XMLRPC is used to embed yet 
another document.  This is legal because the spec is vague.

But the problem is that the XMLRPC layer is used as a transport and the 
result tag is generated by the *.rpc.messages.Query class, not the XIndice 
infrastructure.  Consider the following code:

    private String queryWrapper( NodeSet ns ) throws Exception {
       // Turn the NodeSet into a document.
       DocumentImpl doc = new DocumentImpl();

       Element root = doc.createElement( "result" );
       doc.appendChild( root );
       int count = 0;
       while ( ns != null && ns.hasMoreNodes() ) {
          Node n = ns.getNextNode();

          if ( n.getNodeType() == Node.DOCUMENT_NODE ) {
             n = ( ( Document ) n ).getDocumentElement();
          }

          if ( n instanceof DBNode ) {
             ( ( DBNode ) n ).expandSource();
          }

          root.appendChild( doc.importNode( n, true ) );
          count++;
       }

       root.setAttribute( "count", Integer.toString( count ) );

       return TextWriter.toString( doc );
    }

The nodeset is the result of the query, which is correct, as per the XPath 
spec.  But in the code the nodeset is converted into a document, even 
though it does not need to be converted into a document.  XMLRPC allows the 
saving of an array, which is a 1 to 1 mapping of a nodeset.  Saving the 
nodeset into a document breaks the spec of XPath 3.3 since you are 
generating a document.  And those clients that are expecting a nodeset as a 
result set need to map the document back into a nodeset.  And in my 
case, where SOAP and the serialization are handled automatically, this 
mapping adds an extra processing cycle.
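
As an illustration of the "1 to 1 mapping" between a node-set and an array, the sketch below evaluates an XPath expression and serializes each selected node into its own list entry. It uses the standard javax.xml.xpath and javax.xml.transform APIs (which are newer than this thread) rather than Xindice's NodeSet and TextWriter classes; NodeSetToArray and select are hypothetical names.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeSetToArray {
    /** Returns one serialized string per node selected by the XPath. */
    public static List<String> select(String xml, String xpath) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // The XPath API hands back the node-set as a NodeList.
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(xpath, doc, XPathConstants.NODESET);

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> items = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(nodes.item(i)), new StreamResult(out));
            items.add(out.toString()); // one array entry per selected node
        }
        return items;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<users>"
                + "<user><identifier>A</identifier></user>"
                + "<user><identifier>B</identifier></user>"
                + "</users>";
        for (String item : select(xml, "//user")) {
            System.out.println(item);
        }
    }
}
```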

Christian Gross
Software Engineering Consultant / Trainer
http://www.devspace.com
North America: 1-450-675-4208
Europe: +41.1.701.1166

Re: Proposed Patch

Posted by Gianugo Rabellino <gi...@apache.org>.

Christian Gross wrote:
> At 23:43 1/13/2003 +1100, you wrote:

>> I must confess I am not well versed in Java, but is there much 
>> performance
>> difference between having Xalan (or whatever) pass you back an array of
>> result nodes, compared to extract the returned nodes from the document 
>> root?
>> Is saving a couple of lines of code worth breaking a lot of peoples
>> applications?
> 
> 
> The problem is that on my end there is a performance hit.  This is 
> because I have to load the XML document, extract the individual nodes 
> and then generate string buffers of those individual nodes.  This is 
> because the XPath that is returned is not specification compliant....

I don't want to delve into language-specific issues, but I have a couple 
of considerations on this topic.

1. XML-RPC access in Xindice was, and still is, meant as a network 
transport for the networked XML:DB Java API: the fact of having a 
generic XML-RPC access to Xindice is just a (pleasant?) consequence. 
This means that it might well be possible that in the future the 
scenario will change: as we switched from CORBA to XML-RPC, we might 
switch to RMI, SOAP or WebDAV, without any backward compatibility issue 
to consider (meaning that we will be backward compatible to XML:DB 
clients, not to XML-RPC calls that the client makes). While I don't see 
the XML-RPC stuff going away anytime soon, please note that there is no 
contract whatsoever that the way XML-RPC access is implemented will not 
change in the future: our only contract with users is to have a 
consistent client-server scenario with the XML:DB APIs.

2. Since direct XML-RPC access is somewhat outside of the Xindice scope 
and not directly supported as it stands now, my suggestion is to feel 
free to add to your particular setup any XML-RPC method you see as 
more consistent with your particular environment. We are currently 
discussing how (and whether) to give users hooks to extend the XML-RPC 
layer in a structured way, but as of now it's more than enough to place 
your class in the o.a.x.server.rpc.messages package to add your own 
XML-RPC accessor.

<disclaimer>
These are not official statements from the Xindice developers group, 
rather they are only my opinions derived from common sense and 
experience. The future scenario, anyhow, might change significantly, and 
I'm not at all against someone stepping up to maintain the XML-RPC 
access *if* and *when* it will be discontinued.
</disclaimer>

This said, I'm still not getting what you mean when you talk about the 
XPath result not being specification compliant. AFAIK there is no 
cross-platform and standard way of returning an XPath result, so the 
actual decision is left to the implementor. But I'd be more than willing 
to know more about this: can you point me to some documentation on 
this topic?

Ciao,

-- 
Gianugo Rabellino

Re: Proposed Patch

Posted by Terry Rosenbaum <Te...@amicas.com>.

If you want to eliminate the <result count="xxx">
wrapper, do not forget that you must also
deal with the embedded driver code (which cannot
return results as an XMLRPC array).

-Terry

Christian Gross wrote:

> At 23:43 1/13/2003 +1100, you wrote:
>
>
>> I must confess I am not well versed in Java, but is there much 
>> performance
>> difference between having Xalan (or whatever) pass you back an array of
>> result nodes, compared to extract the returned nodes from the 
>> document root?
>> Is saving a couple of lines of code worth breaking a lot of peoples
>> applications?
>
>
> The problem is that on my end there is a performance hit.  This is 
> because I have to load the XML document, extract the individual nodes 
> and then generate string buffers of those individual nodes.  This is 
> because the XPath that is returned is not specification compliant....
>
> The exact scenario is that I have a Web Service using the SOAP 
> protocol.  It makes a request for a series of documents.  The SOAP Web 
> Service has a business component that makes an XIndice request.  The 
> query data is then sent back to the client of the SOAP Web Service as 
> a series of DIME attachments.
>
> So when I was working I saw that in the XMLRPC layer there are two 
> performance issues.
>
> The first is that the XML query data is transported in the XMLRPC 
> layer as an escaped string.  I have found out that this is part of the 
> XMLRPC layer that converts the data automatically.  To get around this 
> issue in SOAP you use attachments which automatically leave the data 
> as is.  While small pieces of XML can be escaped and unescaped easily, 
> larger pieces with the number of users like what I am thinking will 
> have definite performance issues.
>
> The second is the separation of the XML documents into individual 
> pieces.  When a SOAP client makes a Web Service request they expect 
> the query to generate a number of XML documents that are DIME 
> attachments.
>
> Combine these two issues and you have a potentially SLOW access layer.
>
> Why not access the Xindice directly?  Because I want to write 
> extensions to the XMLRPC layer that make it possible to save data on a 
> cluster of XMLRPC Xindice databases....
>
> Comments?
>
>
> Christian Gross
> Software Engineering Consultant / Trainer
> http://www.devspace.com
> North America: 1-450-675-4208
> Europe: +41.1.701.1166
>

Re: Proposed Patch

Posted by Christian Gross <ma...@devspace.com>.

At 23:43 1/13/2003 +1100, you wrote:

>I must confess I am not well versed in Java, but is there much performance
>difference between having Xalan (or whatever) pass you back an array of
>result nodes, compared to extract the returned nodes from the document root?
>Is saving a couple of lines of code worth breaking a lot of peoples
>applications?

The problem is that on my end there is a performance hit.  This is because 
I have to load the XML document, extract the individual nodes and then 
generate string buffers of those individual nodes.  This is because the 
XPath that is returned is not specification compliant....

The exact scenario is that I have a Web Service using the SOAP 
protocol.  It makes a request for a series of documents.  The SOAP Web 
Service has a business component that makes an XIndice request.  The query 
data is then sent back to the client of the SOAP Web Service as a series of 
DIME attachments.

So when I was working I saw that in the XMLRPC layer there are two 
performance issues.

The first is that the XML query data is transported in the XMLRPC layer as 
an escaped string.  I have found out that this is part of the XMLRPC layer, 
which converts the data automatically.  To get around this issue in SOAP you 
use attachments, which automatically leave the data as is.  While small 
pieces of XML can be escaped and unescaped easily, larger pieces, combined 
with the number of users I have in mind, will have definite performance issues.
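
To show what that escaping involves, here is a hand-rolled sketch of the round trip an XML payload goes through when it is carried as an XML-RPC string value. XmlRpcEscaping is a hypothetical class written for illustration; it is not the code the XML-RPC library actually runs, and it handles only the three characters shown.

```java
public class XmlRpcEscaping {
    /** Escapes markup characters so XML can travel inside a string value. */
    public static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses escape(); "&amp;" is handled last to avoid double unescaping. */
    public static String unescape(String s) {
        return s.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&");
    }

    public static void main(String[] args) {
        String payload = "<result count=\"1\"><user>a&amp;b</user></result>";
        String wire = escape(payload);
        System.out.println(wire);
        System.out.println(unescape(wire).equals(payload));
    }
}
```

Both directions touch every character of the payload, which is why the cost grows with document size and with the number of concurrent requests.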

The second is the separation of the XML documents into individual 
pieces.  When a SOAP client makes a Web Service request they expect the 
query to generate a number of XML documents that are DIME attachments.

Combine these two issues and you have a potentially SLOW access layer.

Why not access the Xindice directly?  Because I want to write extensions to 
the XMLRPC layer that make it possible to save data on a cluster of XMLRPC 
Xindice databases....

Comments?

Christian Gross
Software Engineering Consultant / Trainer
http://www.devspace.com
North America: 1-450-675-4208
Europe: +41.1.701.1166

Re: Proposed Patch

Posted by Lachlan Donald <la...@ljd.cc>.

I for one use Xindice/XMLRPC with PHP extensively, and I do XSLT
transformations on query results. It would be a real pain if Xindice changed
the format of queries. I completely agree with the "if it's not broken, don't
fix it" sentiment.

I must confess I am not well versed in Java, but is there much performance
difference between having Xalan (or whatever) pass you back an array of
result nodes, compared to extracting the returned nodes from the document root?
Is saving a couple of lines of code worth breaking a lot of people's
applications?

- Lachlan Donald

Re: Proposed Patch

Posted by Christian Gross <ma...@devspace.com>.

At 10:13 1/13/2003 +0100, you wrote:
>Christian Gross wrote:
>>Hi
>>I have added the following patch at the end of this email.  It addresses 
>>the following, what I think is an issue.
> > The problem I see with this result for a query is that you are embedding
> > a document within a document.  This means I have three document
> > processing cycles.
>
>Hmmm... I'm currently in the "if it ain't broken don't fix it" mood. I 
>don't really understand the real advantage for this: we would be changing 
>the format of the query replies (thus having to change also all the client 
>code that uses the services) in order to gain just a little more comfort 
>in analyzing the results (presumably from another language, since using 
>the XML:DB API you end up with a ResourceSet). Am I missing something?

Ok imagine writing some client code to the query.  The client makes a query

//getdoc

Then an answer is received.  The answer is

<result count="xxx">
         <doc></doc>
         <doc></doc>
</result>

This is not XPath specification compliant, since XPath returns only a 
set of nodes.  For example, I tried multiple XPath visualizers such as 
XMLSpy.  Usually a node set is returned, and I think Xalan uses the object 
org.w3c.dom.traversal.NodeIterator to represent a node-set.  Using the 
current result format increases the processing requirements because either 
an object or a DOM has to be instantiated before the serialization can 
occur.  This is because you need to get past the result tag object before 
the real processing can occur.  And in the worst case you need to 
instantiate a DOM, strip the result tag, and save back to a string to get 
the actual result set that you wanted.
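
The worst case described above can be sketched as follows: parse the wrapper into a DOM, step past the result element, and serialize each child back to a string. This uses only the standard JAXP APIs; ResultUnwrapper and unwrap are hypothetical names, and the sample input in main is made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ResultUnwrapper {
    /** Strips the <result> wrapper and returns each child element as a string. */
    public static List<String> unwrap(String resultXml) throws Exception {
        // Extra cycle 1: build a DOM just to get past the wrapper element.
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(resultXml)));

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> items = new ArrayList<>();
        for (Node n = doc.getDocumentElement().getFirstChild();
                n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace between result items
            }
            // Extra cycle 2: turn each item back into a string.
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(n), new StreamResult(out));
            items.add(out.toString());
        }
        return items;
    }

    public static void main(String[] args) throws Exception {
        String wrapped = "<result count=\"2\"><doc>a</doc> <doc>b</doc></result>";
        for (String item : unwrap(wrapped)) {
            System.out.println(item);
        }
    }
}
```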

By converting the data to an array as per my patch the end client has the 
option to paste everything into the <result>...</result> as desired or the 
end client can retrieve the XML data and assemble it as a node-set.

Yes, the reason why I stumbled on this is because I was using C++ 
serialization.  However, there may be more people using the XMLRPC layer, 
since XML:DB is specific to Java, which means that for some people XML:DB 
is absolutely useless.

Comments?

Christian Gross
Software Engineering Consultant / Trainer
http://www.devspace.com
North America: 1-450-675-4208
Europe: +41.1.701.1166

Re: Proposed Patch

Posted by Gianugo Rabellino <gi...@apache.org>.

Christian Gross wrote:
> Hi
> 
> I have added the following patch at the end of this email.  It addresses 
> the following, what I think is an issue.
 > The problem I see with this result for a query is that you are embedding
 > a document within a document.  This means I have three document
 > processing cycles.

Hmmm... I'm currently in the "if it ain't broken don't fix it" mood. I 
don't really understand the real advantage for this: we would be 
changing the format of the query replies (thus having to change also all 
the client code that uses the services) in order to gain just a little 
more comfort in analyzing the results (presumably from another language, 
since using the XML:DB API you end up with a ResourceSet). Am I missing 
something?

Ciao,

-- 
Gianugo Rabellino