You are viewing a plain text version of this content. The canonical link for it is here.

Posted to users@jackrabbit.apache.org by "Melian, Jean-Francois" <jm...@cincom.com.INVALID> on 2021/01/21 11:10:26 UTC

SQL2 vs XPath performance

Hi

I am seeing performance issue on SQL2 queries vs XPATH Queries : Jackrabbit 2.20.0


We have structured own document node like this:

docNode (some "system"  properties)

-        propertiesNode (properties of the document)

o   contentNode (contains the main file)

o   attachmentsNode (contains some attached files)

*        doc1

*        doc2

*        ...

o   wordingsNode ( translations)

-        extractedTextsNode ( contains the text extracted of files of the document)

o   text1

o   text2

o   ...


All index of the nodes extractedTextsNode/* are aggregated in the node extractedTextsNode


I have a test base with less than  10 000 documents. And the following SQL query is very slow: about 1 minute to find 3854 uuid

SELECT object.[jcr:uuid] AS [object.jcr:uuid] FROM [kmdata:object] AS object
               INNER JOIN [kmdata:properties] AS properties ON ISSAMENODE(properties, object, [kmdata:propertiesNode])
               LEFT OUTER JOIN [nt:folder] AS extractsFolderSelector ON ISSAMENODE(extractsFolderSelector, object, [kmdata:extractedTextsNode])
               LEFT OUTER JOIN [kmdata:wordings] AS wordings ON ISSAMENODE(wordings, properties, [kmdata:wordingsNode])
WHERE ISDESCENDANTNODE(object, [/kmdata:root]) AND
(
               CONTAINS(object.[kmdata:name], 'tomber') OR
               CONTAINS(properties.*, 'tomber') OR
               CONTAINS(extractsFolderSelector.*, 'tomber') OR
               CONTAINS(wordings.*, 'tomber')
) ORDER BY wordings.fr, object.[kmdata:name]


The following XPATH query is faster : about 10 seconds to find the 3854 uuids

kmdata:root//element(*,kmdata:object)
[
               (
                              (
                                            jcr:contains(@kmdata:name, 'tomber') or
                                             jcr:contains(@kmdata:propertiesNode/kmdata:contentNode/kmdata:fileName, 'tomber') or
                                            jcr:contains(kmdata:propertiesNode, 'tomber') or
                                            jcr:contains(kmdata:extractedTextsNode/*, 'tomber') or
                                            jcr:contains(kmdata:propertiesNode/kmdata:wordingsNode, 'tomber')
                              )
               )
] order by kmdata:propertiesNode/kmdata:wordingsNode/@fr ascending, @kmdata:name ascending


Is it possible to improve performance of SQL2 or I have to translate the SQL2 query to XPATH query ?
Is there a utility to do this translation ?

Thanks for any help

Re: SQL2 vs XPath performance

Posted by Julian Reschke <ju...@gmx.de>.

Am 21.01.2021 um 12:10 schrieb Melian, Jean-Francois:
> Hi
>
> I am seeing performance issue on SQL2 queries vs XPATH Queries : Jackrabbit 2.20.0
> ...

That is strange - AFAIR, Jackrabbit translates both the the same
internal format. The likeliest reason for something like this is that
the queries indeed are not equivalent...

Best regards, Julian