Posted to oak-issues@jackrabbit.apache.org by "Vikas Saurabh (JIRA)" <ji...@apache.org> on 2017/02/24 02:53:44 UTC
[jira] [Updated] (OAK-5707) [Oak lucene indexes] Clarify aggregates, nodeScopeIndex, propertyIndex, analyzed
[ https://issues.apache.org/jira/browse/OAK-5707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vikas Saurabh updated OAK-5707:
-------------------------------
Attachment: OAK-5707.patch
In the spirit of laziness, and rationalizing that I need to do this before planning how to document: attaching [^OAK-5707.patch]. It should have been a main class, but test cases just have better utility methods, so it's a test.
It prints 3 types of index definitions and how the data is stored in each index. The current output is at \[0]. The index dump is of the form:
{noformat}
<fieldName1>
<term1> => [<list of paths>]
<term2> => [<list of paths>]
...
<fieldName2>
....
....
{noformat}
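For illustration only, a dump of this shape could be produced from a nested mapping of field name to term to paths. The sketch below is Python rather than the Java test in the patch; {{format_index_dump}} and the sample mapping are hypothetical names, not part of Oak or of the attached patch.

```python
def format_index_dump(index):
    """Render {field: {term: [paths]}} in the dump form shown above.

    Hypothetical helper for illustration; not part of Oak or the patch.
    """
    lines = []
    for field, terms in index.items():
        lines.append(field)  # one line per field name
        for term, paths in terms.items():
            # one line per term, followed by the paths that contain it
            lines.append(f"{term} => [{', '.join(paths)}]")
    return "\n".join(lines)


# Sample data copied from the propIdx section of the output at [0].
sample = {
    "foo": {"fox jumping": ["/test"]},
    "testChild/bar": {"dog jumping": ["/test/test1", "/test"]},
}
print(format_index_dump(sample))
```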
It's just 3 new files, so the patch should cleanly apply. [~empire29], you might want to check it out and see if this shows what is getting stored.
My next step is to add queries and their plans to the output. That should make it a bit clearer how the index would be queried.
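As a rough picture of what such queries would have to work with, the sketch below models plain term lookups against the stored data. It is a toy in Python, not Oak's query engine or planner; {{lookup}} is a hypothetical helper, and the two mappings are copied from the propIdx and analyzedIdx dumps at \[0].

```python
# Terms copied from the propIdx and analyzedIdx dumps at [0].
PROP_IDX = {
    "foo": {"fox jumping": ["/test"]},
    "testChild/bar": {"dog jumping": ["/test/test1", "/test"]},
    "testChild/barX": {"dog jumping": ["/test/test2"]},
}
ANALYZED_IDX = {
    "full:foo": {"fox": ["/test"], "jumping": ["/test"]},
    "full:testChild/bar": {
        "dog": ["/test/test1", "/test"],
        "jumping": ["/test/test1", "/test"],
    },
    "full:testChild/barX": {"dog": ["/test/test2"], "jumping": ["/test/test2"]},
}


def lookup(index, field, term):
    """Return the paths stored for an exact term in a field (hypothetical helper)."""
    return index.get(field, {}).get(term, [])


# The property index stores the whole value as a single term ...
print(lookup(PROP_IDX, "foo", "fox jumping"))
# ... so a single word from the value is not a stored term there,
print(lookup(PROP_IDX, "foo", "fox"))
# while the analyzed index stores each word separately.
print(lookup(ANALYZED_IDX, "full:foo", "fox"))
```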
I hope that, with enough shuffling, I'll get to a point where the relevant points can be documented succinctly.
PS: Somehow the content tree dump doesn't follow the order in which the property definitions are present in the content tree :-/. The real order of prop defs is {{foo}}, {{bar}}, {{allBar}}.
\[0]:
{noformat}
----------------CONTENT-------------------
+/test
  -foo = fox jumping
  +test1
    +testChild
      -bar = dog jumping
  +test2
    +testChild
      -barX = dog jumping
  +testChild
    -bar = dog jumping
----------------propIdx--------------
Definition
----------
+/oak:index/propIdx
  -includedPaths = [/test]
  -reindexCount = 1
  -compatVersion = 2
  -reindex = false
  -type = lucene
  -jcr:primaryType = oak:QueryIndexDefinition
  +indexRules
    -jcr:primaryType = nt:unstructured
    +nt:base
      -jcr:primaryType = nt:unstructured
      +properties
        -jcr:primaryType = nt:unstructured
        +allBar
          -name = testChild/ba.*
          -propertyIndex = true
          -isRegexp = true
          -jcr:primaryType = nt:unstructured
        +foo
          -name = foo
          -propertyIndex = true
          -jcr:primaryType = nt:unstructured
        +bar
          -name = testChild/bar
          -propertyIndex = true
          -jcr:primaryType = nt:unstructured
Index
-----
foo
fox jumping => [/test]
testChild/bar
dog jumping => [/test/test1, /test]
testChild/barX
dog jumping => [/test/test2]
----------------analyzedIdx--------------
Definition
----------
+/oak:index/analyzedIdx
  -includedPaths = [/test]
  -reindexCount = 1
  -compatVersion = 2
  -reindex = false
  -type = lucene
  -jcr:primaryType = oak:QueryIndexDefinition
  +indexRules
    -jcr:primaryType = nt:unstructured
    +nt:base
      -jcr:primaryType = nt:unstructured
      +properties
        -jcr:primaryType = nt:unstructured
        +allBar
          -analyzed = true
          -name = testChild/ba.*
          -isRegexp = true
          -jcr:primaryType = nt:unstructured
        +foo
          -analyzed = true
          -name = foo
          -jcr:primaryType = nt:unstructured
        +bar
          -analyzed = true
          -name = testChild/bar
          -jcr:primaryType = nt:unstructured
Index
-----
:fulltext
test => [/test]
test1 => [/test/test1]
test2 => [/test/test2]
full:foo
fox => [/test]
jumping => [/test]
full:testChild/bar
dog => [/test/test1, /test]
jumping => [/test/test1, /test]
full:testChild/barX
dog => [/test/test2]
jumping => [/test/test2]
----------------nodeScopedIdx--------------
Definition
----------
+/oak:index/nodeScopedIdx
  -includedPaths = [/test]
  -reindexCount = 1
  -compatVersion = 2
  -reindex = false
  -type = lucene
  -jcr:primaryType = oak:QueryIndexDefinition
  +indexRules
    -jcr:primaryType = nt:unstructured
    +nt:base
      -jcr:primaryType = nt:unstructured
      +properties
        -jcr:primaryType = nt:unstructured
        +allBar
          -nodeScopeIndex = true
          -name = testChild/ba.*
          -isRegexp = true
          -jcr:primaryType = nt:unstructured
        +foo
          -nodeScopeIndex = true
          -name = foo
          -jcr:primaryType = nt:unstructured
        +bar
          -nodeScopeIndex = true
          -name = testChild/bar
          -jcr:primaryType = nt:unstructured
Index
-----
:fulltext
dog => [/test/test1, /test/test2, /test]
fox => [/test]
jumping => [/test/test1, /test/test2, /test]
test => [/test]
test1 => [/test/test1]
test2 => [/test/test2]
testchild => [/test/test1/testChild, /test/test2/testChild, /test/testChild]
{noformat}
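To make the difference between the three dumps easier to see, here is a toy model in Python that rebuilds the property-derived terms from the content tree. It is only a sketch under stated assumptions: it is not Oak's indexing code, {{build_index}} and {{relative_properties}} are hypothetical helpers, tokenizing is reduced to lower-casing and splitting on whitespace, and it deliberately leaves out the node-name terms ({{test}}, {{test1}}, ...) that appear under {{:fulltext}} in the real output.

```python
import re

# Content from the CONTENT section above: node path -> {property: value}.
CONTENT = {
    "/test": {"foo": "fox jumping"},
    "/test/test1": {},
    "/test/test1/testChild": {"bar": "dog jumping"},
    "/test/test2": {},
    "/test/test2/testChild": {"barX": "dog jumping"},
    "/test/testChild": {"bar": "dog jumping"},
}

# Property rules shared by the three definitions: "foo" plus the regexp
# "testChild/ba.*" (which also covers the explicit "testChild/bar" rule).
RULES = [re.compile(r"foo"), re.compile(r"testChild/ba.*")]


def relative_properties(path):
    """Yield (relative name, value) for a node's own and its direct children's properties."""
    for name, value in CONTENT[path].items():
        yield name, value
    for child_path, props in CONTENT.items():
        parent, _, child_name = child_path.rpartition("/")
        if parent == path:
            for name, value in props.items():
                yield f"{child_name}/{name}", value


def build_index(mode):
    """Toy model: return {field: {term: set(paths)}} for one of the three modes."""
    index = {}
    for path in CONTENT:
        for name, value in relative_properties(path):
            if not any(rule.fullmatch(name) for rule in RULES):
                continue
            tokens = value.lower().split()  # crude stand-in for an analyzer
            if mode == "propertyIndex":
                entries = [(name, value)]  # whole value, one field per property
            elif mode == "analyzed":
                entries = [(f"full:{name}", tok) for tok in tokens]
            else:  # "nodeScopeIndex": every token lands in one shared field
                entries = [(":fulltext", tok) for tok in tokens]
            for field, term in entries:
                index.setdefault(field, {}).setdefault(term, set()).add(path)
    return index


for mode in ("propertyIndex", "analyzed", "nodeScopeIndex"):
    print(mode, build_index(mode))
```

In this model the only thing that changes between modes is the field a value goes to and whether it is split into words, which is the same pattern visible in the three Index sections above.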
> [Oak lucene indexes] Clarify aggregates, nodeScopeIndex, propertyIndex, analyzed
> --------------------------------------------------------------------------------
>
> Key: OAK-5707
> URL: https://issues.apache.org/jira/browse/OAK-5707
> Project: Jackrabbit Oak
> Issue Type: Documentation
> Reporter: David Gonzalez
> Assignee: Vikas Saurabh
> Attachments: OAK-5707.patch
>
>
> Oak lucene documentation would benefit from clarifying the relationships and expected behaviors around aggregates, nodeScopeIndex, propertyIndex and analyzed.
> These features have some overlap in what they do and/or augment one another, but to the lay-developer it is unclear how these work in concert and/or what the implications of using the various features are.
> It's worth remembering that many developers are under the mindset (shifting from jackrabbit 2 -> oak) that oak indexing requires explicit inclusion of content into search results; thus implicit content inclusion into indexes via generalized aggregations (vs named properties) is unclear/unexpected to many.