Posted to xindice-dev@xml.apache.org by Christian Gross <ma...@devspace.com> on 2003/01/11 16:59:12 UTC
Proposed Patch
Hi
I have attached a patch at the end of this email. It addresses what I
think is an issue.
Consider the following query:
//user[identifier="ZEUS20030108132147141"]
The XMLRPC answer is the following:
<?xml version="1.0" encoding="UTF-8" ?>
<methodResponse>
<params>
<param>
<value>
<struct>
<member>
<name>result</name>
<value><?xml version="1.0"?> <result count="1"><user
xmlns:src="http://xml.apache.org/xindice/Query" src:col="/db/users"
src:key="cgross@devspace.com"> <identifier
xmlns:src="http://xml.apache.org/xindice/Query">ZEUS20030108132147141</identifier>
<username
xmlns:src="http://xml.apache.org/xindice/Query">cgross@devspace.com</username>
<password
xmlns:src="http://xml.apache.org/xindice/Query">patches</password>
<workspaces xmlns:src="http://xml.apache.org/xindice/Query"> <Name
xmlns:src="http://xml.apache.org/xindice/Query">class scWorkspaces</Name>
</workspaces></user></result></value>
</member>
</struct>
</value>
</param>
</params>
</methodResponse>
The problem I see with this query result is that a document is embedded
within a document. This means I have three document processing cycles. The
first is to process the XMLRPC message. Then I have to process the result
value and see how many elements there are. And then I have to parse each
individual item. The patch converts the result set into an array of
elements that is returned to the user. This makes processing the result
set simpler, since the client only needs to iterate over the array.
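To illustrate the array form, here is a minimal client-side sketch. It assumes the XML-RPC library decodes the response struct into a java.util.Hashtable and the array into a java.util.Vector of strings; the class name ArrayResultClient and the sample data are hypothetical, and only the "result" and "count" keys come from the response and patch in this mail.

```java
import java.util.ArrayList;
import java.util.Hashtable;
import java.util.List;
import java.util.Vector;

public class ArrayResultClient {

    // Collects the serialized nodes from an array-style response.
    // Assumption: the "result" member holds a Vector with one XML string per node.
    static List<String> extractNodes(Hashtable<String, Object> response) {
        List<String> nodes = new ArrayList<>();
        Object result = response.get("result");
        if (result instanceof Vector) {
            for (Object item : (Vector<?>) result) {
                nodes.add(String.valueOf(item));
            }
        }
        return nodes;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for a decoded XML-RPC struct.
        Vector<String> items = new Vector<>();
        items.add("<user><identifier>ZEUS20030108132147141</identifier></user>");
        Hashtable<String, Object> response = new Hashtable<>();
        response.put("result", items);
        response.put("count", Integer.toString(items.size()));

        // One pass over the array; no wrapper document has to be parsed first.
        for (String xml : extractNodes(response)) {
            System.out.println(xml);
        }
    }
}
```

Each array entry can then be parsed, stored or forwarded on its own, which is the single remaining processing step after decoding the XMLRPC message.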
Comments?
Christian Gross
Software Engineering Consultant / Trainer
http://www.devspace.com
North America: 1-450-675-4208
Europe: +41.1.701.1166
*****************************************
diff -r -u java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java ../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java
--- java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java 2003-01-11 16:47:05.000000000 +0100
+++ ../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/RPCDefaultMessage.java 2002-12-27 19:37:24.000000000 +0100
@@ -83,7 +83,6 @@
public static final String NAMESPACES = "namespaces";
public static final String CONFIGURATION = "configuration";
public static final String META = "meta";
- public static final String COUNT = "count";
public static final String MISSING_COLLECTION_PARAM = "Required parameter 'collection' not found.";
public static final String MISSING_NAME_PARAM = "Required parameter 'name' not found.";
diff -r -u java/src/org/apache/xindice/server/rpc/messages/Query.java ../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/messages/Query.java
--- java/src/org/apache/xindice/server/rpc/messages/Query.java 2003-01-11 16:47:13.000000000 +0100
+++ ../xml-xindice.cvs/java/src/org/apache/xindice/server/rpc/messages/Query.java 2002-11-22 11:15:35.000000000 +0100
@@ -71,7 +71,6 @@
import java.util.Enumeration;
import java.util.Hashtable;
-import java.util.Vector;
/**
* Executes a query against a document or collection
@@ -106,7 +105,7 @@
}
Hashtable result = new Hashtable();
- queryWrapper( result, ns);
+ result.put(RESULT, queryWrapper( ns ));
return result;
}
@@ -136,34 +135,35 @@
* Adds additional meta data to the query response and turns it into a
* Document.
*
- * @param result The Hashtable that contains the XMLRPC resultset
* @param ns The NodeSet containing the query results.
* @return the result set as an XML document
* @exception Exception
*/
- private void queryWrapper( Hashtable result, NodeSet ns ) throws Exception {
- Vector resultArray = new Vector();
- int count = 0;
- while( ns != null && ns.hasMoreNodes())
- {
- Node n = ns.getNextNode();
- if( n.getNodeType() == Node.DOCUMENT_NODE)
- {
- n = ((Document)n).getDocumentElement();
- }
-
- if( n instanceof DBNode)
- {
- ((DBNode)n).expandSource();
- }
-
- resultArray.addElement( TextWriter.toString( n));
- count ++;
- }
- result.put( COUNT, Integer.toString( count));
- result.put( RESULT, resultArray);
- }
+ private String queryWrapper( NodeSet ns ) throws Exception {
+ // Turn the NodeSet into a document.
+ DocumentImpl doc = new DocumentImpl();
+
+ Element root = doc.createElement( "result" );
+ doc.appendChild( root );
+ int count = 0;
+ while ( ns != null && ns.hasMoreNodes() ) {
+ Node n = ns.getNextNode();
-}
+ if ( n.getNodeType() == Node.DOCUMENT_NODE ) {
+ n = ( ( Document ) n ).getDocumentElement();
+ }
+
+ if ( n instanceof DBNode ) {
+ ( ( DBNode ) n ).expandSource();
+ }
+ root.appendChild( doc.importNode( n, true ) );
+ count++;
+ }
+
+ root.setAttribute( "count", Integer.toString( count ) );
+ return TextWriter.toString( doc );
+ }
+
+}
Re: Proposed Patch
Posted by Gianugo Rabellino <gi...@apache.org>.
Christian Gross wrote:
>> 1. XML-RPC access in Xindice was, and still, is meant as a network
>> transport for the networked XML:DB Java API: the fact of having a
>> generic XML-RPC access to Xindice is just a (pleasant?) consequence.
> Does this mean then that XMLRPC is a way to realize XMLDB? If so then
> it would be best for me to simply chuck away the XMLRPC and build an
> AXIS layer.
Probably (and always IMHO) yes.
>> 2. Since XML-RPC direct access is somehow outside of the Xindice scope
>> and not directly supported as it stands now, my suggestion is to feel
>> free to add to your particular setup any XML-RPC method you might see
>> as more consistent with your particular environment. We are currently
>> discussing on (and if) give hooks to users to extend the XML-RPC layer
>> in a structured way, but as of now it's more than enough to place your
>> class in the o.a.x.server.rpc.messages package to add your own XML-RPC
>> accessor.
>
>
> That would mean changing the way the class is loaded since it seems it
> expects org.apache.xindice.rpc.messages.*, yes?
Exactly. Hopefully we'll soon come out with a mechanism to plug in
XML-RPC methods in any namespace (although, given that XML-RPC is just a
transport, I'm personally not convinced that this is a Good Thing).
>> This said, I'm still not getting what you mean when you talk about
>> XPath result not being specification compliant.
>
> In the XPath specification consider the following:
>
> 3.3 Node-sets
>
> A location path can be used as an expression. The expression returns the
> set of nodes selected by the path.
>
> Now comes the question what is a node-set? In the case of Xalan it is a
> w3 nodelist. And a nodelist is basically an array of nodes.
Well, yes. With the exception that any node might contain an arbitrary
number of nodes.
> But the problem is that the XMLRPC layer is used as a transport and the
> result tag is generated by the *.rpc.messages.Query class, not the
> XIndice infrastructure. Consider the following code:
>
> private String queryWrapper( NodeSet ns ) throws Exception {
> // Turn the NodeSet into a document.
[...]
>
> The nodeset is the result of the query, which is correct, as per the
> XPath spec. But in the code the nodeset is converted into a document,
> even though it does not need to be converted into a document. XMLRPC
> allows the saving of an array, which is a 1 to 1 mapping of a nodeset.
> Saving the nodeset into a document breaks the spec of XPath 3.3 since
> you generating a document. And for those clients that are expecting a
> nodeset as a result-set they need to map the document back into a
> nodeset. And in my case where SOAP and the serialization is handled
> automatically this mapping adds an extra processing cycle.
OK, now I see your point, thanks for clarifying it. Basically, what you
don't want to see is another Document embedded in a Document. Which
makes sense, but what if we returned not a Document but a
DocumentFragment or, even better, just a Node? I'm just making some wild
guesses here; I'd have to check what needs to be changed (probably
o.a.x.client.xmldb.ResourceSetImpl.java, which takes a document as a
parameter, should be enough: James, Kimbro?). That said, we should
also be careful about what this might change for people (ab?)using the
XML-RPC API with other languages.
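As a rough illustration of the DocumentFragment idea, the sketch below collects result nodes into a fragment using only the standard org.w3c.dom and JAXP classes. It is not Xindice code: the class FragmentResult and its helper methods are hypothetical, and a plain List of Node stands in for the Xindice NodeSet.

```java
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Node;

public class FragmentResult {

    // Creates an empty DOM document using the JDK's default parser.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Copies each result node into a fragment owned by 'owner'.
    // Unlike a Document, a fragment needs no single root element,
    // so no <result> wrapper has to be invented.
    static DocumentFragment toFragment(Document owner, List<Node> nodes) {
        DocumentFragment fragment = owner.createDocumentFragment();
        for (Node n : nodes) {
            fragment.appendChild(owner.importNode(n, true));
        }
        return fragment;
    }

    public static void main(String[] args) {
        // Hypothetical query hits living in some other document.
        Document source = newDocument();
        List<Node> hits = Arrays.<Node>asList(
                source.createElement("user"), source.createElement("user"));

        DocumentFragment fragment = toFragment(newDocument(), hits);
        System.out.println("children: " + fragment.getChildNodes().getLength());
    }
}
```

How such a fragment would be serialized over XML-RPC, and whether ResourceSetImpl could accept it, is left open here, as it is in the mail above.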
Ciao,
--
Gianugo Rabellino
Re: Proposed Patch
Posted by Christian Gross <ma...@devspace.com>.
At 15:30 1/13/2003 +0100, you wrote:
>Christian Gross wrote:
>>At 23:43 1/13/2003 +1100, you wrote:
>1. XML-RPC access in Xindice was, and still, is meant as a network
>transport for the networked XML:DB Java API: the fact of having a generic
>XML-RPC access to Xindice is just a (pleasant?) consequence. This means
>that it might well be possible that in the future the scenario will
>change: as we switched from CORBA to XML-RPC, we might switch to RMI, SOAP
>or WebDAV, without any backward compatibility issue to consider (meaning
>that we will be backward compatible to XML:DB clients, not to XML-RPC
>calls that the client makes). While I don't see the XML-RPC stuff going
>away anytime soon, please note that there is no contract whatsoever that
>the way XML-RPC access is implemented will not change in the future: our
>only contract with users is to have a consistent client-server scenario
>with the XML:DB APIs.
Does this mean then that XMLRPC is a way to realize XMLDB? If so then it
would be best for me to simply chuck away the XMLRPC and build an AXIS layer.
>2. Since XML-RPC direct access is somehow outside of the Xindice scope and
>not directly supported as it stands now, my suggestion is to feel free to
>add to your particular setup any XML-RPC method you might see as more
>consistent with your particular environment. We are currently discussing
>on (and if) give hooks to users to extend the XML-RPC layer in a
>structured way, but as of now it's more than enough to place your class in
>the o.a.x.server.rpc.messages package to add your own XML-RPC accessor.
That would mean changing the way the class is loaded since it seems it
expects org.apache.xindice.rpc.messages.*, yes?
>This said, I'm still not getting what you mean when you talk about XPath
>result not being specification compliant. AFAIK there is no cross-platform
>and standard way of returning an XPath result, so the actual decision is
>left to the implementor. But I'd be more than willing to know more about
>this: can you point me out to some documentation on this topic?
In the XPath specification consider the following:
3.3 Node-sets
A location path can be used as an expression. The expression returns the
set of nodes selected by the path.
Now comes the question: what is a node-set? In the case of Xalan it is a
W3C NodeList, and a NodeList is basically an array of nodes. OK, one can
argue that a NodeList is a node and therefore the result tag does not break
the notation. My only comment there is that XMLRPC is then used to embed
yet another document. This is legal because the spec is vague.
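To make the node-set point concrete, here is a small sketch using the JDK's javax.xml.xpath API rather than Xalan's or Xindice's own classes (an assumption of this example, as are the class name NodeSetDemo and the sample XML). A location path evaluated as a node-set comes back as a flat org.w3c.dom.NodeList with no wrapper element around it.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NodeSetDemo {

    // Evaluates a location path against an XML string and returns the
    // selected nodes as a NodeList (the JAXP mapping of an XPath node-set).
    static NodeList select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            InputSource source = new InputSource(new StringReader(xml));
            return (NodeList) xpath.evaluate(expression, source, XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample data.
        String xml = "<users><user><identifier>A</identifier></user>"
                + "<user><identifier>B</identifier></user></users>";
        NodeList hits = select(xml, "//user");
        for (int i = 0; i < hits.getLength(); i++) {
            System.out.println(hits.item(i).getTextContent());
        }
    }
}
```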
But the problem is that the XMLRPC layer is used as a transport, and the
result tag is generated by the *.rpc.messages.Query class, not the Xindice
infrastructure. Consider the following code:
private String queryWrapper( NodeSet ns ) throws Exception {
    // Turn the NodeSet into a document.
    DocumentImpl doc = new DocumentImpl();
    Element root = doc.createElement( "result" );
    doc.appendChild( root );
    int count = 0;
    while ( ns != null && ns.hasMoreNodes() ) {
        Node n = ns.getNextNode();
        if ( n.getNodeType() == Node.DOCUMENT_NODE ) {
            n = ( ( Document ) n ).getDocumentElement();
        }
        if ( n instanceof DBNode ) {
            ( ( DBNode ) n ).expandSource();
        }
        root.appendChild( doc.importNode( n, true ) );
        count++;
    }
    root.setAttribute( "count", Integer.toString( count ) );
    return TextWriter.toString( doc );
}
The node-set is the result of the query, which is correct as per the XPath
spec. But in the code the node-set is converted into a document, even
though it does not need to be. XMLRPC allows an array to be stored, which
is a 1-to-1 mapping of a node-set. Saving the node-set into a document
breaks section 3.3 of the XPath spec, since you are generating a document.
Clients that expect a node-set as a result set then need to map the
document back into a node-set. And in my case, where SOAP and the
serialization are handled automatically, this mapping adds an extra
processing cycle.
Christian Gross
Re: Proposed Patch
Posted by Gianugo Rabellino <gi...@apache.org>.
Christian Gross wrote:
> At 23:43 1/13/2003 +1100, you wrote:
>> I must confess I am not well versed in Java, but is there much
>> performance
>> difference between having Xalan (or whatever) pass you back an array of
>> result nodes, compared to extract the returned nodes from the document
>> root?
>> Is saving a couple of lines of code worth breaking a lot of peoples
>> applications?
>
>
> The problem is that on my end there is a performance hit. This is
> because I have to load the XML document, extract the individual nodes
> and then generate string buffers of those individual nodes. This is
> because the XPath that is returned is not specification compliant....
I don't want to delve into language-specific issues, but I have a couple
of considerations on this topic.
1. XML-RPC access in Xindice was, and still is, meant as a network
transport for the networked XML:DB Java API: having generic XML-RPC
access to Xindice is just a (pleasant?) consequence.
This means that it might well be possible that in the future the
scenario will change: as we switched from CORBA to XML-RPC, we might
switch to RMI, SOAP or WebDAV, without any backward compatibility issue
to consider (meaning that we will be backward compatible to XML:DB
clients, not to XML-RPC calls that the client makes). While I don't see
the XML-RPC stuff going away anytime soon, please note that there is no
contract whatsoever that the way XML-RPC access is implemented will not
change in the future: our only contract with users is to have a
consistent client-server scenario with the XML:DB APIs.
2. Since direct XML-RPC access is somewhat outside the Xindice scope
and not directly supported as it stands now, my suggestion is to feel
free to add to your particular setup any XML-RPC method you see as
more consistent with your environment. We are currently discussing how
(and whether) to give users hooks to extend the XML-RPC layer in a
structured way, but as of now it's enough to place your class in the
o.a.x.server.rpc.messages package to add your own XML-RPC accessor.
<disclaimer>
These are not official statements from the Xindice developers group,
rather they are only my opinions derived from common sense and
experience. The future scenario, anyhow, might change significantly, and
I'm not at all against someone stepping up to maintain the XML-RPC
access *if* and *when* it will be discontinued.
</disclaimer>
That said, I still don't get what you mean when you say the XPath result
is not specification compliant. AFAIK there is no cross-platform,
standard way of returning an XPath result, so the actual decision is left
to the implementor. But I'd be more than willing to learn more about
this: can you point me to some documentation on this topic?
Ciao,
--
Gianugo Rabellino
Re: Proposed Patch
Posted by Terry Rosenbaum <Te...@amicas.com>.
If you want to eliminate the <result count="xxx">
wrapper, do not forget that you must also
deal with the embedded driver code (which cannot
return results as an XMLRPC array).
-Terry
Christian Gross wrote:
> At 23:43 1/13/2003 +1100, you wrote:
>
>
>> I must confess I am not well versed in Java, but is there much
>> performance
>> difference between having Xalan (or whatever) pass you back an array of
>> result nodes, compared to extract the returned nodes from the
>> document root?
>> Is saving a couple of lines of code worth breaking a lot of peoples
>> applications?
>
>
> The problem is that on my end there is a performance hit. This is
> because I have to load the XML document, extract the individual nodes
> and then generate string buffers of those individual nodes. This is
> because the XPath that is returned is not specification compliant....
>
> The exact scenario is that I have a Web Service using the SOAP
> protocol. It makes a request for a series of documents. The SOAP Web
> Service has a business component that makes an XIndice request. The
> query data is then sent back to the client of the SOAP Web Service as
> a series of DIME attachments.
>
> So when I was working I saw that in the XMLRPC layer there are two
> performance issues.
>
> The first is that the XML query data is transported in the XMLRPC
> layer as an escaped string. I have found out that this is part of the
> XMLRPC layer that converts the data automatically. To get around this
> issue in SOAP you use attachments which automatically leave the data
> as is. While small pieces of XML can be escaped and unescaped easily,
> larger pieces with the number of users like what I am thinking will
> have definite performance issues.
>
> The second is the separation of the XML documents into individual
> pieces. When a SOAP client makes a Web Service request they expect
> the query to generate a number of XML documents that are DIME
> attachments.
>
> Combine these two issues and you have a potentially SLOW access layer.
>
> Why not access the Xindice directly? Because I want to write
> extensions to the XMLRPC layer that make it possible to save data on a
> cluster of XMLRPC Xindice databases....
>
> Comments?
>
>
> Christian Gross
> Software Engineering Consultant / Trainer
> http://www.devspace.com
> North America: 1-450-675-4208
> Europe: +41.1.701.1166
>
Re: Proposed Patch
Posted by Christian Gross <ma...@devspace.com>.
At 23:43 1/13/2003 +1100, you wrote:
>I must confess I am not well versed in Java, but is there much performance
>difference between having Xalan (or whatever) pass you back an array of
>result nodes, compared to extract the returned nodes from the document root?
>Is saving a couple of lines of code worth breaking a lot of peoples
>applications?
The problem is that on my end there is a performance hit. This is because
I have to load the XML document, extract the individual nodes and then
generate string buffers from those individual nodes. This is because the
XPath result that is returned is not specification compliant....
The exact scenario is that I have a Web Service using the SOAP
protocol. It makes a request for a series of documents. The SOAP Web
Service has a business component that makes an XIndice request. The query
data is then sent back to the client of the SOAP Web Service as a series of
DIME attachments.
So when I was working I saw that in the XMLRPC layer there are two
performance issues.
The first is that the XML query data is transported in the XMLRPC layer as
an escaped string. I have found that the XMLRPC layer performs this
conversion automatically. To get around this issue in SOAP you use
attachments, which leave the data as is. While small pieces of XML can be
escaped and unescaped cheaply, larger pieces, multiplied by the number of
users I am thinking of, will have definite performance issues.
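As a rough sketch of the escaping overhead, the snippet below escapes the XML markup characters in a payload the way a string-based transport has to. Which characters the XMLRPC library actually escapes is an assumption here, and the class name EscapeCost is hypothetical.

```java
public class EscapeCost {

    // Replaces the XML markup characters with entity references.
    // Every character of the payload is visited once, and the receiver
    // has to undo the same work before it can parse the result.
    static String escape(String xml) {
        StringBuilder sb = new StringBuilder(xml.length() + 16);
        for (int i = 0; i < xml.length(); i++) {
            char c = xml.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical payload; real query results would be much larger.
        String payload = "<user><username>cgross@devspace.com</username></user>";
        String escaped = escape(payload);
        System.out.println(payload.length() + " chars before, "
                + escaped.length() + " chars after escaping");
    }
}
```

The cost grows linearly with the payload size on both ends of the connection, which is what makes binary attachments attractive for large result sets.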
The second is the separation of the XML documents into individual
pieces. When a SOAP client makes a Web Service request they expect the
query to generate a number of XML documents that are DIME attachments.
Combine these two issues and you have a potentially SLOW access layer.
Why not access Xindice directly? Because I want to write extensions to
the XMLRPC layer that make it possible to save data on a cluster of XMLRPC
Xindice databases....
Comments?
Christian Gross
Re: Proposed Patch
Posted by Lachlan Donald <la...@ljd.cc>.
I for one use Xindice/XMLRPC with PHP extensively, and I do XSLT
transformations on query results. It would be a real pain if Xindice changed
the format of query results. I completely agree with the "if it's not
broken, don't fix it" view.
I must confess I am not well versed in Java, but is there much performance
difference between having Xalan (or whatever) pass you back an array of
result nodes and extracting the returned nodes from the document root?
Is saving a couple of lines of code worth breaking a lot of people's
applications?
- Lachlan Donald
Re: Proposed Patch
Posted by Christian Gross <ma...@devspace.com>.
At 10:13 1/13/2003 +0100, you wrote:
>Christian Gross wrote:
>>Hi
>>I have added the following patch at the end of this email. It addresses
>>the following, what I think is an issue.
> > The problem I see with this result for a query is that you are embedding
> > a document withing a document. This means I have three document
> > processing cycles.
>
>Hmmm... I'm currently in the "if it ain't broken don't fix it" mood. I
>don't really understand the real advantage for this: we would be changing
>the format of the query replies (thus having to change also all the client
>code that uses the services) in order to gain just a little more comfort
>in analyzing the results (presumably from another language, since using
>the XML:DB API you end up with a ResourceSet). Am I missing something?
OK, imagine writing some client code for the query. The client makes a query
//getdoc
Then an answer is received. The answer is
<result count="xxx">
<doc></doc>
<doc></doc>
</result>
This is not XPath specification compliant, since XPath returns only a
set of nodes. For example, I tried multiple XPath visualizers such as
XMLSpy. Usually a node-set is returned, and I think Xalan uses the object
org.w3c.dom.traversal.NodeIterator to represent a node-set. Using the
current result format increases the processing requirements, because either
an object or a DOM has to be instantiated before the serialization can
occur. This is because you need to get past the result tag before the
real processing can occur. And in the worst case you need to instantiate
a DOM, strip the result tag and save back to a string to get the actual
result set that you wanted.
By converting the data to an array as per my patch, the end client has the
option to paste everything into a <result>...</result> wrapper as desired,
or to retrieve the XML data and assemble it as a node-set.
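The worst case described above (instantiate a DOM, strip the result tag, save each node back to a string) can be sketched in Java with the standard JAXP classes. This is only an illustration of the extra work, not code from my client; the class name ResultUnwrapper and the sample input are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class ResultUnwrapper {

    // Parses the <result> wrapper document and re-serializes each child
    // element to its own string, discarding the wrapper itself.
    static List<String> unwrap(String resultXml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(resultXml)));
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            List<String> items = new ArrayList<>();
            for (Node n = doc.getDocumentElement().getFirstChild();
                    n != null; n = n.getNextSibling()) {
                if (n.getNodeType() != Node.ELEMENT_NODE) {
                    continue; // skip whitespace text between elements
                }
                StringWriter out = new StringWriter();
                serializer.transform(new DOMSource(n), new StreamResult(out));
                items.add(out.toString());
            }
            return items;
        } catch (Exception e) {
            throw new IllegalStateException("could not unwrap result", e);
        }
    }

    public static void main(String[] args) {
        String wrapped = "<result count=\"2\"><doc>a</doc> <doc>b</doc></result>";
        for (String item : unwrap(wrapped)) {
            System.out.println(item);
        }
    }
}
```

With an array result, none of this parsing and re-serializing would be needed on the client.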
Yes, the reason I stumbled on this is that I was using C++
serialization. However, there may be more people using the XMLRPC layer,
since XML:DB is specific to Java, which means that for some, XML:DB is
absolutely useless.
Comments?
Christian Gross
Re: Proposed Patch
Posted by Gianugo Rabellino <gi...@apache.org>.
Christian Gross wrote:
> Hi
>
> I have added the following patch at the end of this email. It addresses
> the following, what I think is an issue.
> The problem I see with this result for a query is that you are embedding
> a document withing a document. This means I have three document
> processing cycles.
Hmmm... I'm currently in the "if it ain't broken don't fix it" mood. I
don't really understand the real advantage of this: we would be
changing the format of the query replies (and thus also having to change
all the client code that uses the services) in order to gain just a little
more comfort in analyzing the results (presumably from another language,
since using the XML:DB API you end up with a ResourceSet). Am I missing
something?
Ciao,
--
Gianugo Rabellino