Posted to oak-dev@jackrabbit.apache.org by "jorgeeflorez ." <jo...@gmail.com> on 2020/05/18 21:55:29 UTC

Query ordered by node name

Hello,
with the following query I am able to get file nodes ordered by name:

SELECT * FROM [nt:file] AS s WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER
BY NAME([s]) DESC

Unfortunately, because I do not have an index, on a big repository I get
warnings like:

WARN org.apache.jackrabbit.oak.plugins.index.Cursors$TraversingCursor  -
Traversed 81000 nodes with filter Filter(query=SELECT * FROM [nt:file] AS s
WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER BY NAME([s]) DESC,
path=/repo1/pruebaJF1/*); consider creating an index or changing the query

and the query takes a lot of time.
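
For anyone reproducing this: Oak accepts an "explain" prefix on a statement
and returns the query plan instead of the result rows, which shows whether an
index or a traversal was selected. A minimal Java sketch follows. The class
and method names are hypothetical, and it only builds the statement strings;
running them through a JCR QueryManager is left out.

```java
// Sketch only: builds the plain and "explain" forms of the query from this thread.
final class ExplainQuerySketch {

    // The JCR-SQL2 statement from the original post, unwrapped onto one line.
    static final String QUERY =
        "SELECT * FROM [nt:file] AS s "
        + "WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) "
        + "ORDER BY NAME([s]) DESC";

    // Prefix a statement with "explain" so that Oak reports the plan for it.
    static String explain(String statement) {
        return "explain " + statement.trim();
    }

    public static void main(String[] args) {
        System.out.println(QUERY);
        System.out.println(explain(QUERY));
    }
}
```

Comparing the plan before and after adding an index definition is a quick way
to confirm which index, if any, the query engine picked.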

I do not know how to define a proper index for name(). If I use the
following:
  - compatVersion = 2
  - async = "async"
  - jcr:primaryType = oak:QueryIndexDefinition
  - evaluatePathRestrictions = true
  - type = "lucene"
  + indexRules
   + nt:file
    + properties
     + primaryType
      - name = "jcr:primaryType"
      - propertyIndex = true
     + name
      - function = "fn:name"
      - ordered = true
      - type = "String"

the index is used (index cost is 501 compared to 80946 for traverse), but
the query takes more time than traversing, with warnings like:

WARN
org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor
 - Index-Traversed 80000 nodes with filter Filter(query=SELECT * FROM
[nt:file] AS s WHERE ISCHILDNODE(s, [/repo1/pruebaJF1]) ORDER BY NAME([s])
DESC, path=/repo1/pruebaJF1/*)

Thanks in advance.

Regards.

Jorge

Re: Query ordered by node name

Posted by "jorgeeflorez ." <jo...@gmail.com>.
Hi Julian, thanks for your reply.

> You could try the Oak Index Definition Generator.
>
> http://oakutils.appspot.com/generate/index
>
I regularly use that page to generate the index definitions I need. Oddly,
when I first tried the suggested index it was not chosen: traversing was
cheaper. I deleted the index and created it again, and now it is chosen, but
the query takes more time than traversing.

Traversing takes around 24 seconds.
With name = ":name" it takes around 39 seconds.
With name = "fn:name()" it takes around 39 seconds.
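
To put these measurements side by side, here is a small Java sketch of the
arithmetic. It assumes the timings are for the same query against the same
data; the class and method names are made up for illustration.

```java
import java.util.Locale;

// Sketch only: compares the timings reported in this message.
final class TimingComparisonSketch {

    // Ratio of the indexed run time to the traversal run time.
    static double slowdown(double indexedSeconds, double traversalSeconds) {
        return indexedSeconds / traversalSeconds;
    }

    public static void main(String[] args) {
        double traversalSeconds = 24.0; // "Traversing takes around 24 seconds."
        double indexedSeconds = 39.0;   // both index variants took around 39 seconds
        System.out.println(String.format(Locale.ROOT,
            "indexed / traversal = %.3f", slowdown(indexedSeconds, traversalSeconds)));
    }
}
```

With the numbers above the ratio is 1.625, so the indexed query is roughly
1.6 times slower than the traversal.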


When the index is created, this is part of what it logs:
INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate  - Reindexing
Traversed #860000 /repo1/pruebaJF2/Doc 463312580 [4673.91 nodes/s,
16826067.39 nodes/hr] (Elapsed 3.078 min, Expected 2.000 s, Completed
98.92%)
INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate  - Indexing report
    - /oak:index/bigIndex*(81213)

INFO org.apache.jackrabbit.oak.plugins.index.IndexUpdate  - Reindexing
completed
INFO org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate  - [async]
Reindexing completed for indexes: [/oak:index/bigIndex*(81213)] in 3.093
min (185608 ms)

I do not know what is going on. Maybe the index takes into account more
nodes than it should (such as version nodes), or something similar makes it
take longer than traversing.

Regards.

Jorge

Re: Query ordered by node name

Posted by Julian Sedding <js...@gmail.com>.
Alternatively, try [function = "fn:name()"], i.e. with the parentheses "()".

Regards
Julian

Re: Query ordered by node name

Posted by Julian Sedding <js...@gmail.com>.
Hi Jorge

You could try the Oak Index Definition Generator.

http://oakutils.appspot.com/generate/index

FWIW, in the "name" property node it sets [name = ":name"] instead of
[function = "fn:name"]. I don't know whether that makes a difference or
which is better, if either.
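
For reference, the three variants of the "name" property node tried in this
thread can be written out side by side. The Java sketch below only renders
them in the notation used in the original post. The class and method names
are hypothetical, and carrying over ordered = true and type = "String" to the
[name = ":name"] variant is an assumption, since the generator's full output
is not shown in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: prints the property-node variants discussed in this thread.
final class NamePropertyVariantsSketch {

    // Render one node in the "+ node / - property" notation of the original post.
    static String render(String nodeName, Map<String, String> props) {
        StringBuilder sb = new StringBuilder("+ ").append(nodeName).append('\n');
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(" - ").append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    // The one key/value that differs, followed by the settings from the original post.
    // Assumption: ordered and type are kept unchanged for every variant.
    static Map<String, String> variant(String key, String value) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(key, "\"" + value + "\"");
        props.put("ordered", "true");
        props.put("type", "\"String\"");
        return props;
    }

    public static void main(String[] args) {
        System.out.print(render("name", variant("function", "fn:name")));   // original post
        System.out.print(render("name", variant("function", "fn:name()"))); // with parentheses
        System.out.print(render("name", variant("name", ":name")));         // generator output
    }
}
```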

Regards
Julian