Posted to oak-issues@jackrabbit.apache.org by "Thomas Mueller (JIRA)" <ji...@apache.org> on 2017/08/07 13:57:00 UTC

[jira] [Commented] (OAK-937) Query engine index selection tweaks: shortcut and hint

    [ https://issues.apache.org/jira/browse/OAK-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16116614#comment-16116614 ] 

Thomas Mueller commented on OAK-937:
------------------------------------

> my concern is that it can be easily misused

I fully agree. It is somewhat similar to "option(traversal ok)": it is easy to add that to a query, and once it is there, the query is no longer easily detected as potentially slow. Other dangerous features we have are "includedPaths" and "excludedPaths" in a Lucene index.

> mark some features as experimental or expert

Sure, we need to do that. But even if we do, there is still a risk.

The main problem I want to address with this issue is the following: there are multiple Lucene index configurations with different aggregation rules, and there is a query that uses "contains(., '<abc>')". How can you ensure that the right index is used (the one with the aggregation rule you care about)?

The implementation I have so far is experimental, and I deliberately didn't document it, because I don't consider it the final design. It is mainly there to allow testing whether and how this works. I would like to discuss how best to solve the problem.

In my view, instead of "option(index abc)", which hardcodes exactly _one_ specific index, it is better to allow using a group of indexes. For example, each index can have a multi-valued property "tags", and a query can then specify "option(index tag <x>)". That way, a query can potentially use multiple indexes (those that have the given tag). When a new index is added, queries don't have to be changed; instead, the new index needs to define the right tags. This follows the principle "All problems in computer science can be solved by another level of indirection".

[~catholicon] [~chetanm] [~teofili] what do you think?
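
To make the tag proposal concrete, here is a minimal sketch of how tag-based candidate filtering might work. It is an illustration under assumptions, not the Oak implementation: the class IndexTagSketch, the type IndexDef, the precomputed cost field, and the methods candidates and select are all hypothetical names invented for this example. Only the multi-valued "tags" property and the idea of a query naming a tag come from the proposal above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: none of these types are Oak classes.
class IndexTagSketch {

    /** Hypothetical stand-in for an index definition with a multi-valued "tags" property. */
    static final class IndexDef {
        final String name;
        final Set<String> tags;
        final double cost; // assumed: a precomputed cost estimate for the query

        IndexDef(String name, double cost, String... tags) {
            this.name = name;
            this.cost = cost;
            this.tags = new HashSet<>(Arrays.asList(tags));
        }
    }

    /**
     * Keeps only the indexes that carry the tag named in the query.
     * A null tag models a query without an index tag option: all indexes stay candidates.
     */
    static List<IndexDef> candidates(List<IndexDef> all, String queryTag) {
        List<IndexDef> result = new ArrayList<>();
        for (IndexDef def : all) {
            if (queryTag == null || def.tags.contains(queryTag)) {
                result.add(def);
            }
        }
        return result;
    }

    /** Picks the cheapest candidate, or returns null if no index carries the tag. */
    static IndexDef select(List<IndexDef> all, String queryTag) {
        IndexDef best = null;
        for (IndexDef def : candidates(all, queryTag)) {
            if (best == null || def.cost < best.cost) {
                best = def;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<IndexDef> indexes = Arrays.asList(
                new IndexDef("luceneGeneric", 10.0),
                new IndexDef("luceneWithAggregation", 50.0, "x"));
        // Without a tag, the cheaper generic index wins; with tag "x", only the tagged index is eligible.
        System.out.println(select(indexes, null).name);
        System.out.println(select(indexes, "x").name);
    }
}
```

In this sketch the tag only narrows the candidate set, and the usual cost comparison still decides among the tagged indexes. Adding a new index with the same tag makes it eligible without touching any query, which is the indirection argued for above.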




> Query engine index selection tweaks: shortcut and hint
> ------------------------------------------------------
>
>                 Key: OAK-937
>                 URL: https://issues.apache.org/jira/browse/OAK-937
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: query
>            Reporter: Alex Deparvu
>            Assignee: Thomas Mueller
>            Priority: Critical
>              Labels: performance
>             Fix For: 1.8
>
>
> This issue covers 2 different changes related to the way the QueryEngine selects a query index:
>  Firstly, there could be a way to end the index selection process early via a known constant value: if an index returns a known value token (such as -1000), the query engine would stop iterating through the remaining index implementations and use that index directly.
>  Secondly, it would be nice to be able to specify a desired index (if one is known to perform better), thus skipping the existing selection mechanism (cost calculation and comparison). This could be done via query hints [0].
> [0] http://en.wikipedia.org/wiki/Hint_(SQL)
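
The first change in the issue description above (ending index selection early) could be sketched roughly as follows. This is a hedged illustration, not the Oak query engine: ShortcutSelectionSketch, CostedIndex, getCost, and select are hypothetical names, and the sentinel -1000 is only the example value mentioned in the description, not a confirmed constant.

```java
import java.util.List;

// Illustrative sketch only: not the Oak query engine.
class ShortcutSelectionSketch {

    /** Sentinel value; -1000 is the example from the issue text, not a confirmed constant. */
    static final double SHORTCUT_COST = -1000;

    /** Hypothetical index that reports a fixed cost and counts how often it was asked. */
    static final class CostedIndex {
        final String name;
        final double cost;
        int costCalls = 0;

        CostedIndex(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }

        double getCost() {
            costCalls++;
            return cost;
        }
    }

    /**
     * Returns the first index that reports the sentinel, without asking the
     * remaining indexes for their cost; otherwise returns the cheapest index.
     */
    static CostedIndex select(List<CostedIndex> indexes) {
        CostedIndex best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (CostedIndex index : indexes) {
            double cost = index.getCost();
            if (cost == SHORTCUT_COST) {
                return index; // shortcut: stop iterating and use this index directly
            }
            if (cost < bestCost) {
                bestCost = cost;
                best = index;
            }
        }
        return best;
    }
}
```

Note that in this sketch the outcome depends on iteration order: an index listed before the one returning the sentinel is still asked for its cost, while every index after it is skipped.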


