Posted to dev@jackrabbit.apache.org by "Ard Schrijvers (JIRA)" <ji...@apache.org> on 2007/08/15 15:49:30 UTC

[jira] Created: (JCR-1064) Optimize queries that check for the existence of a property

Optimize queries that check for the existence of a property
-----------------------------------------------------------

                 Key: JCR-1064
                 URL: https://issues.apache.org/jira/browse/JCR-1064
             Project: Jackrabbit
          Issue Type: Improvement
          Components: indexing
    Affects Versions: 1.3.1
            Reporter: Ard Schrijvers
            Priority: Minor
             Fix For: 1.4


//*[@mytext] is transformed into an org.apache.jackrabbit.core.query.lucene.MatchAllQuery, which through the MatchAllWeight uses the MatchAllScorer. The calculateDocFilter() method in MatchAllScorer does not scale and becomes slow as the number of nodes grows.

Solution: Lucene documents will get a new field:

public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();

that holds the names of the properties available on this document.

NOTE: Lucene indices built without this performance improvement should still work and fall back to the original implementation.
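
To illustrate why an extra field with property names helps, here is a minimal sketch in plain Java. It is not the Lucene or Jackrabbit API: the class PropertyExistenceSketch and its methods are hypothetical, and the full scan only stands in for work that grows with the number of nodes; it does not reproduce calculateDocFilter().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for an index; hypothetical, NOT the Lucene or Jackrabbit API. */
final class PropertyExistenceSketch {

    /** Field name taken from the issue description. */
    static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();

    /** Per document: the names of the properties it carries. */
    private final List<Set<String>> docs = new ArrayList<>();

    /** Postings of the extra field: property name -> ids of documents having it. */
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    /** Adds a document and records each property name once in the extra field. */
    int addDocument(Set<String> propertyNames) {
        int docId = docs.size();
        docs.add(propertyNames);
        for (String name : propertyNames) {
            postings.computeIfAbsent(name, k -> new TreeSet<>()).add(docId);
        }
        return docId;
    }

    /** Stand-in for the slow path: work grows with the number of documents. */
    Set<Integer> scanForProperty(String name) {
        Set<Integer> hits = new TreeSet<>();
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).contains(name)) {
                hits.add(i);
            }
        }
        return hits;
    }

    /** Fast path: a single lookup in the postings of the extra field. */
    Set<Integer> lookupProperty(String name) {
        Set<Integer> hits = postings.get(name);
        if (hits == null) {
            return new TreeSet<>();
        }
        return hits;
    }

    public static void main(String[] args) {
        PropertyExistenceSketch index = new PropertyExistenceSketch();
        index.addDocument(Set.of("mytext", "title"));
        index.addDocument(Set.of("title"));
        index.addDocument(Set.of("mytext"));
        System.out.println(PROPERTIES_SET + " hits for mytext: " + index.lookupProperty("mytext"));
    }
}
```

Both methods return the same document ids; the difference is only in how much of the index has to be visited to find them.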

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Re: Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by Christoph Kiehl <ch...@sulu3000.de>.

Ard Schrijvers (JIRA) wrote:

> Ard Schrijvers commented on JCR-1064:
> -------------------------------------
> 
> ps : is it possible to mark the current patch as deprecated or something? 

I tend to remove invalid patches. Don't know how others think about that. But I think
it makes no sense to keep a patch that is faulty.

Cheers,
Christoph

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524078 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

I just found another issue with the MultiIndex. The recovery code also runs inside the constructor, and when the redo log is applied it may happen that a node is indexed, which in turn needs to know the IndexFormatVersion.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529762 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I have done the tests with old-style indices, removed the parent handler index or the workspace index, and tried with new-style indices; all work as they should.

I also like the replacement of IndexFormatVersion.version from String to int, which makes future version numbers possible. So, AFAICS, everything seems to work as it should.
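
As a rough illustration of why an int is handier than a String here, the sketch below shows version constants that can be compared numerically. The class name IntVersionSketch and the method isAtLeast() are made up for this example; the actual patch may look different.

```java
/** Hypothetical sketch of version constants backed by an int. */
final class IntVersionSketch {

    static final IntVersionSketch V1 = new IntVersionSketch(1);
    static final IntVersionSketch V2 = new IntVersionSketch(2);

    private final int version;

    private IntVersionSketch(int version) {
        this.version = version;
    }

    int getVersion() {
        return version;
    }

    /** True if this format is the same as or newer than the other one. */
    boolean isAtLeast(IntVersionSketch other) {
        return version >= other.version;
    }

    @Override
    public String toString() {
        return "V" + version;
    }

    public static void main(String[] args) {
        System.out.println(V2 + " is at least " + V1 + ": " + V2.isAtLeast(V1));
    }
}
```

With an int, a later V3 can be ordered against V1 and V2 without parsing strings.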

Thanks both for all the help regarding this issue.



[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524655 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

> I just found another issue with the MultiIndex. The recovery code also runs inside the constructor, and when the redo log is applied it may
> happen that a node is indexed, which in turn needs to know the IndexFormatVersion.

Do you have a preferred place to fix the problem with the redo log? I suppose you don't want the check for the index format style in Recovery.run(), right?

If you can give me your preference, I might be able to complete the patch/issue/improvement....\o/\o/


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment:     (was: JCR-1064.patch)


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064-2.patch

The other patch had changes in unit tests. I removed them.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522540 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I am now testing for the parent query handler's indexFormatVersion. If the two differ, I fall back to the old-style index format, because this one will always work. OK with you?


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529473 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I didn't look well enough (I only checked JCR-1064-2.patch and not JCR-1064-3.patch).

I'll do the tests this weekend, because I am just about to grab a nice cold beer (and can't think straight anymore this late) :-)

I'll let you know how the tests went,

Cheers


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529468 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

I attached the patch with the previous comment ;)

I already did some tests, basically the ones you mentioned. But it's always good to have someone else take another look at changes. So, if you have time, that'd be great.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Bertrand Delacretaz (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522724 ] 

Bertrand Delacretaz commented on JCR-1064:
------------------------------------------

>> ...ps : is it possible to mark the current patch as deprecated or something?
> I prefer to just keep the attached patches to have a history on what was discussed...

Just upload the patch again with the same filename - JIRA then greys out older versions but keeps them available, and shows their time and date when hovering the mouse over the names. See SOLR-69 for example.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522547 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

By the way: are you really sure that it matters whether the parent query handler is in the new format? AFAICS, it doesn't matter whether one index, even the parent one, is in the old or the new format.

All tests also run without problems. When I add the following test in doInit()

if (context.getParentHandler() instanceof SearchIndex) {
    SearchIndex parentQueryHandler = (SearchIndex) context.getParentHandler();
    if (parentQueryHandler.getIndexFormatVersion() != this.getIndexFormatVersion()) {
        /*
         * parentHandler is not allowed to be in a different format.
         * Fall back to old style.
         */
        log.warn("parentQueryHandler is in different format. Fallback to old format style");
        setIndexFormatVersion(IndexFormatVersion.V1);
    }
}

and I delete the parent index (repository/index/*) and the workspace index, everything runs in the new index format. However, the parent index is created differently than the workspace indices, so in the NodeIndexer

if (indexFormatVersion == IndexFormatVersion.V2) {
    addPropertyName(doc, propState.getName());
}

does not work, because indexFormatVersion is null for the parent query handler. Hence, after a restart, the system will fall back to the old format style, because the parent index format style is old!

So, I can add something like this to the NodeIndexer:

if (indexFormatVersion == null || indexFormatVersion == IndexFormatVersion.V2) {
    addPropertyName(doc, propState.getName());
}

This forces the parent query handler to index according to the new style. But I think the backwards compatibility becomes a little hacky.

I'll add a patch without the parent query handler check. If it turns out that it is needed, we can add it. WDYT?




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522481 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I am now creating a new IndexFormatVersion class (I think IndexFormatVersion is more descriptive than IndexVersion, right? If you like IndexVersion better, I can change it).

The remark above about how to compare [if(indexVersion.getInfo.equals("1") ) ] can be ignored. I understand how Marcel wants it.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522443 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I don't think it is much criticism :-)

I'll try to make your suggested changes and formatting fixes, and do the old createQuery() in the old style. This was indeed my point of doubt, as indicated already.

I'll try to have the new patch ready at short notice.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522461 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

Ard, thanks a lot for the patch, good work.

Here are some more comments:

- Instead of the boolean value newIndexFormat I would prefer version constants, something like:

public final class IndexVersion {

    public static final IndexVersion V1 = new IndexVersion("1");

    public static final IndexVersion V2 = new IndexVersion("2");

    private final String info;

    private IndexVersion(String info) {
        this.info = info;
    }

    public String getInfo() {
        return info;
    }
}

- SearchIndex.getIndexReader() always returns a new reader instance. That means you have to close the returned reader when you are done checking the index version.
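
The point about closing the reader could look roughly like the sketch below. FakeReader is a hypothetical stand-in for whatever SearchIndex.getIndexReader() returns, and deciding the version by looking for the "_:PROPERTIES_SET" field is an assumption of this example, not something the comment above prescribes.

```java
import java.io.Closeable;
import java.util.Set;

/** Hypothetical sketch: check the index version and always close the reader. */
final class ReaderCloseSketch {

    /** Stand-in for a freshly created index reader; not the Lucene class. */
    static final class FakeReader implements Closeable {
        private final Set<String> fieldNames;
        private boolean closed;

        FakeReader(Set<String> fieldNames) {
            this.fieldNames = fieldNames;
        }

        boolean hasField(String name) {
            if (closed) {
                throw new IllegalStateException("reader is closed");
            }
            return fieldNames.contains(name);
        }

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Returns "2" if the marker field exists, else "1"; closes the reader either way. */
    static String detectVersion(FakeReader reader) {
        try {
            return reader.hasField("_:PROPERTIES_SET") ? "2" : "1";
        } finally {
            // The caller received a new reader instance, so it owns it and must release it.
            reader.close();
        }
    }

    public static void main(String[] args) {
        FakeReader reader = new FakeReader(Set.of("_:PROPERTIES_SET"));
        System.out.println("version " + detectVersion(reader) + ", closed: " + reader.isClosed());
    }
}
```

The finally block makes sure the reader is released even if the check itself throws.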



[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523173 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

> I am doing the tests, with the parent index in old format, and the workspace index in new format,
> and this is no problem

well, that just means that there is no appropriate test

WRT the bootstrapping issue with the index and its format, I will create a separate issue and extract the initial index creation from the MultiIndex constructor.  See JCR-1093. Once this is solved, you can set the index format version before indexing the workspace.

> will never be called since indexFormatVersion == null. 

that's actually another point that should be changed. There should be a default value. I suggest we set it to V1.
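
A minimal sketch of what such a default could look like, with hypothetical names (DefaultVersionSketch, Indexer, fieldsFor) and an enum instead of the real IndexFormatVersion class. The field strings are placeholders, not the real index layout.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical sketch: the format version defaults to V1 and is never null. */
final class DefaultVersionSketch {

    enum FormatVersion { V1, V2 }

    static final class Indexer {
        // Default value, so a check against V2 is safe before anyone sets a version.
        private FormatVersion version = FormatVersion.V1;

        void setVersion(FormatVersion version) {
            this.version = Objects.requireNonNull(version, "version");
        }

        /** Returns the (placeholder) fields written for one property. */
        List<String> fieldsFor(String propertyName) {
            List<String> fields = new ArrayList<>();
            fields.add("value:" + propertyName);
            if (version == FormatVersion.V2) {
                // Only the new format records the property name separately.
                fields.add("_:PROPERTIES_SET:" + propertyName);
            }
            return fields;
        }
    }

    public static void main(String[] args) {
        Indexer indexer = new Indexer();
        System.out.println(indexer.fieldsFor("mytext"));
        indexer.setVersion(FormatVersion.V2);
        System.out.println(indexer.fieldsFor("mytext"));
    }
}
```

With a default in place, the indexer needs no null check: an unset version simply behaves like the old format.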


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522113 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

IndexingConfiguration might be null when not configured, so the above suggestion does not work. I'll go for an instance variable in the SearchIndex class.


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529478 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

> IndexFormatVersion now contains the logic to decide which format version an index uses 
> (MultiIndex shouldn't know which FieldNames are used).

good point...

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-4.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523020 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

Just a couple of minor issues that need to be resolved first:

- Please consistently use spaces and not tabs in the patch
- There's a System.out in SearchIndex.doInit()
- Please wrap the block with the version check into a try/finally and close the index reader in the finally block
- Please change !(indexFormatVersion == IndexFormatVersion.V2) to indexFormatVersion != IndexFormatVersion.V2
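
The last two points could look roughly like the following sketch. It is written in current Java and does not use the real Jackrabbit or Lucene classes: `FormatVersion`, `IndexReaderLike`, and `VersionCheckSketch` are hypothetical stand-ins, and only the field name `_:PROPERTIES_SET` is taken from the issue description.

```java
import java.util.Set;

// Hypothetical stand-ins for illustration; not the real Jackrabbit/Lucene API.
enum FormatVersion { V1, V2 }

interface IndexReaderLike extends AutoCloseable {
    Set<String> fieldNames();
    int numDocs();
    @Override
    void close();
}

final class VersionCheckSketch {
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    // Detects the format and always closes the reader, even if the check throws.
    static FormatVersion detect(IndexReaderLike reader) {
        try {
            boolean v2 = reader.fieldNames().contains(PROPERTIES_SET)
                    || reader.numDocs() == 0;
            return v2 ? FormatVersion.V2 : FormatVersion.V1;
        } finally {
            reader.close();
        }
    }

    // Enum constants are singletons, so != reads better than !(a == b).
    static boolean needsFallback(FormatVersion version) {
        return version != FormatVersion.V2;
    }
}
```

The `finally` block runs on both the normal and the exceptional path, which is the point of the request.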

There actually is an issue when the parent handler uses a different index format version. If the parent handler (system index) uses V1 and this handler (workspace index) uses V2, the query will use the PROPERTIES_SET field, which is not available in the system index.

A user may do the following:
- Upgrade a pre-1.4 repository (-> all indexes are V1)
- Re-index a workspace (-> workspace index will be V2)
- Execute a query on the workspace (-> will use V2 for queries)

This means if a query handler has a parent handler it must not use a more recent version than the one its parent is using!
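
A minimal sketch of that rule, assuming the versions are ordered from oldest to newest. `IndexFormat` and `QueryFormatSketch` are hypothetical names used only for illustration; they are not the classes from the patch.

```java
// Hypothetical version type; constants are declared from oldest to newest.
enum IndexFormat {
    V1, V2;

    // Returns the older of two versions (lower ordinal means older).
    static IndexFormat older(IndexFormat a, IndexFormat b) {
        return a.compareTo(b) <= 0 ? a : b;
    }
}

final class QueryFormatSketch {
    // parentFormat is null when the handler has no parent handler.
    static IndexFormat queryFormat(IndexFormat ownFormat, IndexFormat parentFormat) {
        if (parentFormat == null) {
            return ownFormat;
        }
        // Never query with a newer format than the parent index provides.
        return IndexFormat.older(ownFormat, parentFormat);
    }
}
```

With a V1 system index and a re-indexed V2 workspace index, `queryFormat(IndexFormat.V2, IndexFormat.V1)` yields `V1`, so the query would not rely on a field the parent index lacks.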

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522467 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

PS: Is it possible to mark the current patch as deprecated or something?

>         Attachments: JCR-1064.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522466 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Not closing SearchIndex.getIndexReader(): that is an ugly mistake. I'll correct it.

About the IndexVersion class: do you want this to be a SearchIndex inner class or a separate one?

And then, instead of 

if (newIndexFormat)

something like 

if (indexVersion.getInfo().equals("1"))

The reasoning is that this is more flexible: if we ever face another index format, we can add "3".

If you let me know, I will upload the patch (which I was about to do, by the way, with Christoph's changes; I almost hit post when I saw your suggestions :-) )
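
One possible shape for such a version class, as a sketch only. `IndexVersionSketch` and its methods are hypothetical and are not the IndexFormatVersion class that was eventually committed; the idea is that a typed, ordered version can grow a third value without touching callers that ask "at least V2?".

```java
// Hypothetical type-safe version class; a further format would be one more constant.
final class IndexVersionSketch {
    static final IndexVersionSketch V1 = new IndexVersionSketch(1);
    static final IndexVersionSketch V2 = new IndexVersionSketch(2);

    private final int version;

    private IndexVersionSketch(int version) {
        this.version = version;
    }

    int getVersion() {
        return version;
    }

    // True if this version is the same as or newer than the other one.
    boolean isAtLeast(IndexVersionSketch other) {
        return version >= other.version;
    }

    // The PROPERTIES_SET field is assumed to exist from V2 onwards.
    static boolean usePropertiesSet(IndexVersionSketch indexVersion) {
        return indexVersion.isAtLeast(V2);
    }
}
```

Compared with a boolean or a string comparison, callers express the capability they need instead of naming one specific format.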

>         Attachments: JCR-1064.patch

[jira] Resolved: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger resolved JCR-1064.
-----------------------------------

    Resolution: Fixed

Thanks all for your help. I've committed Christoph's version of the patch.

svn revision: 578711

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-4.patch, JCR-1064-DEPR.patch

[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064-2.patch

Patch without the check that the parent query handler uses the same index format. The old tests run, and I do not think there will be problems. Two workspaces can use different index formats, and both work.

>         Attachments: JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522508 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Hmmm, I have a problem.

When the index is empty at startup, the MultiIndex seems to index some documents:

if (indexNames.size() == 0) {
                reindexing = true;

This means that

index.getIndexReader().numDocs() == 0

is always false in SearchIndex.doInit().

I can try to make MultiIndex add FieldNames.PROPERTIES_SET by default when reindexing == true, but I am afraid that if somebody makes a change, it might break again. It is kind of a problem of the

allFieldNames.contains(FieldNames.PROPERTIES_SET) || numDocs == 0

test, which relies on several assumptions and is easy to break.

WDYT? Shall I change MultiIndex so that the PROPERTIES_SET field name is added when reindexing is true? I do not really like it, though.




[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12529463 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

> Sorry for not getting back to you sooner. I was distracted with other work lately. 

np, I was pretty occupied as well, but given the release plans for 1.4 I thought it would be worthwhile to solve it.

> Based on your work I've created a slightly different patch, which separates the physical index format version from the index format version that is used for a query. This means a MultiIndex has an index format version (decides how nodes are indexed, independent of a parent query handler) and also the SearchIndex has an index format version (decides how a query is executed, also takes a parent query handler into account). 

I think I do understand what you mean, but do you happen to have the patch available (or do you commit it directly)? I can do some testing if you want to see if it works out in all possible situations (index from scratch / existing indexes / existing indexes and remove a workspace index without parent, etc etc)? 

Regards Ard

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12524081 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------


We could move the code for the index format check to the MultiIndex constructor, couldn't we? If we do the check before Recovery.run(this, redoLog), we can call handler.setIndexFormatVersion(IndexFormatVersion indexFormatVersion) from the MultiIndex constructor.

OTOH, perhaps the MultiIndex is not the best place for this code. WDYT? 

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522978 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Marcel Reutegger wrote:
"One more thing, please include a check in SearchIndex.doInit(), which compares the IndexFormatVersion of 'this' with the parent query handler (if there is one). They have to be the same, otherwise queries might return wrong results."

The current patch is without this test in the parent query handler, because AFAICS everything works without this test (the parent handler does not need to have the PROPERTIES_SET field name AFAIU). 

Furthermore, I added a boolean 'newWorkSpaceIndex' to MultiIndex.java, since the initial index creation when none exists is done there. As Marcel suggested, SearchIndex.doInit() might be a better place for this index creation. 

Do you think the current patch can be applied to the trunk? 

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Issue Comment Edited: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522978 ] 

aschrijvers edited comment on JCR-1064 at 8/27/07 4:00 AM:
--------------------------------------------------------------

Marcel Reutegger wrote:
"One more thing, please include a check in SearchIndex.doInit(), which compares the IndexFormatVersion of 'this' with the parent query handler (if there is one). They have to be the same, otherwise queries might return wrong results."

The current patch is without this test in the parent query handler, because AFAICS everything works without this test (the parent handler does not need to have the PROPERTIES_SET field name AFAIU). 

Furthermore, I added a boolean 'newWorkSpaceIndex' to MultiIndex.java, since the initial index creation when none exists is done there. As Marcel suggested, SearchIndex.doInit() might be a better place for this index creation. 

Do you think the current patch can be applied to the trunk? 

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064.patch

Patch that implements JCR-1064. 

Changes: 

FieldNames.java
LuceneQueryBuilder.java
NodeIndexer.java
QueryImpl.java
SearchIndex.java

>         Attachments: JCR-1064.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528684 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

I have been thinking this issue over, since I think it is quite an important performance improvement and we should get it right. I want to remove the part of the MultiIndex constructor below resetVolatileIndex(); (thus the Recovery.run(this, redoLog); and the rest) and move it into a separate method, which I call from the SearchIndex after the MultiIndex constructor and the test for the index format.  

WDYT? 

>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522529 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Added patch again.

The problem is that in SearchIndex.doInit(), index.getIndexReader().numDocs() is always larger than 0 (this was not the case until recently, because it worked for me before).

The cause is that in MultiIndex, when indexNames.size() == 0, a persistent index is created. The Lucene documents that are added then do not use the NodeIndexer, so FieldNames.PROPERTIES_SET does not occur and numDocs > 0, which means the old index format is always chosen.

I can add to MultiIndex an instance variable with a getter, that is

boolean created; 

that I can call from SearchIndex.

In doInit(), I then check whether

1) FieldNames.PROPERTIES_SET exists, or
2) numDocs() == 0, or
3) index.getNewlyCreated() 

I do not see how it can be done otherwise, because I need to know whether a persistent index is created for the first time for the particular workspace. I do not really like the dependencies on other code for the backwards compatibility, though. WDYT?
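
The difference between the two checks can be sketched as follows. This is an illustration only: `NewFormatCheckSketch` and its method names are hypothetical, the field names are passed in as a plain set instead of being read from a Lucene index reader, and the `newlyCreated` flag stands for the proposed MultiIndex getter.

```java
import java.util.Set;

final class NewFormatCheckSketch {
    static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    // The original test: breaks once bootstrap documents exist in a fresh index.
    static boolean twoConditionCheck(Set<String> fieldNames, int numDocs) {
        return fieldNames.contains(PROPERTIES_SET) || numDocs == 0;
    }

    // The proposed test with the extra "index was just created" flag.
    static boolean threeConditionCheck(Set<String> fieldNames, int numDocs,
                                       boolean newlyCreated) {
        return twoConditionCheck(fieldNames, numDocs) || newlyCreated;
    }
}
```

For a freshly created index that already holds one document without the new field, the first method answers `false` and the second answers `true`, which is the case described above.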




>         Attachments: JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522092 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

According to the thread in [1], we have chosen to implement backwards compatibility with a check at startup whether the index is in the old or new format:

Christoph: boolean propertySetSupported = index.getIndexReader().getFieldNames(
     FieldOption.ALL).contains(FieldNames.PROPERTIES_SET)
     || index.getIndexReader().numDocs() == 0;

Since I need this boolean value in the NodeIndexer, I need this property to be available through the IndexingConfiguration, since I do not have access to the SearchIndex in the NodeIndexer. This means adding

void setNewIndexFormat(boolean newIndexFormat); and
boolean getNewIndexFormat(); 

In lucene/QueryImpl, I also need to pass an argument for the index format to

Query query = LuceneQueryBuilder.createQuery(root, session,
                index.getContext().getItemStateManager(), index.getNamespaceMappings(),
                index.getTextAnalyzer(), propReg, index.getSynonymProvider());

so that the format is available in LuceneQueryBuilder. WDYT? 

[1] http://www.mail-archive.com/dev@jackrabbit.apache.org/msg06907.html
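
For readers new to the issue, the idea behind the PROPERTIES_SET field can be illustrated with a toy in-memory model. This is not Lucene and not the patch: `PropertiesSetSketch` is a hypothetical class that records, per property name, which documents have that property, so that an existence check such as //*[@mytext] becomes a single lookup instead of a pass over all documents.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model of an index with a "properties set" field; not the Lucene API.
final class PropertiesSetSketch {
    // property name -> ids of the documents that have this property
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();
    private int nextId = 0;

    // Indexes a document by recording every property name it carries.
    int addDocument(Set<String> propertyNames) {
        int id = nextId++;
        for (String name : propertyNames) {
            postings.computeIfAbsent(name, k -> new TreeSet<>()).add(id);
        }
        return id;
    }

    // Existence query: one map lookup, independent of the total document count.
    SortedSet<Integer> docsWithProperty(String name) {
        SortedSet<Integer> ids = postings.get(name);
        return ids == null
                ? Collections.<Integer>emptySortedSet()
                : Collections.unmodifiableSortedSet(ids);
    }
}
```

Documents indexed before the field existed would simply be missing from the map, which is why an old-format index needs the fallback to the original implementation.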

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523176 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

> well, that just means that there is no appropriate test  

You mean that the tests just happen to work with the new and old format by coincidence?  I was really under the assumption that a query is run on one index at a time, and that whether this index is in the old or new format does not matter. Whether the system index is in the old format while the query runs on a workspace index in the new format shouldn't cause problems AFAICS. IMO, it is possible to port the jr impl to the new version while keeping all the indices, and when adding a new workspace, only this workspace will run in the new format.  But I do not have the overview like you do, so I am probably just missing something :-). I'll stop worrying about it and go for your solution.

> WRT the bootstrapping issue with the index and its format, I will create a separate issue and extract the initial index creation from the MultiIndex constructor. See JCR-1093. Once this is solved, you can set the index format version before indexing the workspace. 

That would be very nice. When you have finished, I'll create a new patch

> that's actually another point that should be changed. There should be a default value. I suggest we set it to V1.

Agreed. 

I'll wait for JCR-1093 and then create a new patch. 

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522147 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

1 note:

In LuceneQueryBuilder.createQuery:

public static Query createQuery(QueryRootNode root,
                                    SessionImpl session,
                                    ItemStateManager sharedItemMgr,
                                    NamespaceMappings nsMappings,
                                    Analyzer analyzer,
                                    PropertyTypeRegistry propReg)
            throws RepositoryException {
        return createQuery(root, session, sharedItemMgr, 
                nsMappings, analyzer, propReg, null);
    }

I changed the return statement to

        return createQuery(root, session, sharedItemMgr, 
                nsMappings, analyzer, propReg, null, true);

so the index format defaults to true (the new format) for this static call, because this static method does not seem to be used. Or would it be better to add the format as an argument to this static method? Or can we remove the method?




> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519970 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Reminder: I have not checked it yet, but I should also look at the index aggregates described in [1], to make sure that child nodes defined in an aggregate rule do not add their "available properties" to the PROPERTIES_SET of the indexed node.

[1] http://wiki.apache.org/jackrabbit/IndexingConfiguration

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522537 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Just had my patch ready (again :-) )

Would it be an idea to add the patch now, and file a separate JIRA issue to refactor the initial index creation logic into doInit()?

I don't mind trying to do that one, but I would like to get this patch in first and then solve the init logic. WDYT?



> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marcel Reutegger updated JCR-1064:
----------------------------------

    Attachment: JCR-1064-3.patch

Sorry for not getting back to you sooner. I was distracted with other work lately.

Based on your work I've created a slightly different patch, which separates the physical index format version from the index format version that is used for a query. This means a MultiIndex has an index format version (decides how nodes are indexed, independent of a parent query handler) and also the SearchIndex has an index format version (decides how a query is executed, also takes a parent query handler into account).

With this approach the above-mentioned issues are nicely avoided.

Ard, does that work for you?

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523205 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

> Ah, I see. That's where the misunderstanding is. Unless otherwise indicated (by static analysis of the query tree, see JCR-1066) a query is 
> executed on both indexes using a MultiReader. This means the query is only executed once and across both indexes. 

Now I am convinced and understand your concerns! :-)

I'll create a new patch with JCR-1093 taken into account, and a default value for indexFormatVersion 



> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522536 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

One more thing, please include a check in SearchIndex.doInit(), which compares the IndexFormatVersion of 'this' with the parent query handler (if there is one). They have to be the same, otherwise queries might return wrong results.
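
A minimal sketch of such a check, assuming a fail-fast behaviour. The thread does not say whether the real check should throw, log, or fall back, so the exception here is an assumption, and FormatVersion and ParentVersionCheck are hypothetical names.

```java
/** Hypothetical version marker; the thread only mentions V1 and V2. */
enum FormatVersion { V1, V2 }

final class ParentVersionCheck {
    private ParentVersionCheck() {}

    /**
     * Returns true when it is safe to run queries: either there is no
     * parent query handler (parentVersion == null) or both versions agree.
     */
    static boolean isConsistent(FormatVersion own, FormatVersion parentVersion) {
        return parentVersion == null || own == parentVersion;
    }

    /** Fails during initialisation instead of returning wrong results later. */
    static void requireConsistent(FormatVersion own, FormatVersion parentVersion) {
        if (!isConsistent(own, parentVersion)) {
            throw new IllegalStateException("index format " + own
                    + " differs from parent query handler format " + parentVersion);
        }
    }
}
```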

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523202 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

> You mean that the tests just happen to work with new and old format by coincidence?

yes, that's what I mean.

> I just really am in the assumption, that a query is done on one index at the time

Ah, I see. That's where the misunderstanding is. Unless otherwise indicated (by static analysis of the query tree, see JCR-1066) a query is executed on both indexes using a MultiReader. This means the query is only executed once and across both indexes.

Btw. JCR-1093 is now fixed.
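
To illustrate why a single query across both indexes is a problem for mixed formats, here is a simplified model in plain Java. It does not use Lucene; MixedFormatDemo and Doc are hypothetical, and a document from an old-format index is modelled as having an empty PROPERTIES_SET even if the node has the property.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MixedFormatDemo {
    /** Simplified document: an id plus the names stored in PROPERTIES_SET. */
    static final class Doc {
        final String id;
        final Set<String> propertiesSet;

        Doc(String id, String... names) {
            this.id = id;
            this.propertiesSet = new HashSet<>(Arrays.asList(names));
        }
    }

    private MixedFormatDemo() {}

    /** New-format existence query: only consults the PROPERTIES_SET field. */
    static List<String> queryNewFormat(List<Doc> combined, String property) {
        List<String> hits = new ArrayList<>();
        for (Doc d : combined) {
            if (d.propertiesSet.contains(property)) {
                hits.add(d.id);
            }
        }
        return hits;
    }
}
```

If the combined list holds a system-index document created as `new Doc("sys-node")` (old format, no PROPERTIES_SET) and a workspace document created as `new Doc("ws-node", "mytext")`, the new-format query for "mytext" finds only the workspace document, even when the system node also has that property.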

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064-2.patch

New patch with Marcel's remarks, except for the parent query handler, because I had some remarks and questions on that matter

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522526 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

> ps : is it possible to mark the current patch as deprecated or something?

I prefer to just keep the attached patches to have a history on what was discussed.

And right now, I actually wanted to review the patch again to see where exactly you are checking for the PROPERTIES_SET. But the patch is gone :-/

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519975 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

You should be just fine. Only the field FieldNames.FULLTEXT is packaged into an aggregate document.

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523033 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Aaah, I am sorry for the System.out. I replaced a patch and left a System.out call in. Stupid, I'll remove it! I'll address the other 4 (-) points as well.

About the parent handler: I knew that the system index can be in the old format, but AFAICS this is never an issue. When I am searching the index for workspace X, I think it does not matter whether the parent index is in the old format (I am running the tests with the parent index in the old format and the workspace index in the new format, and this is no problem).

As I see your example:

A user may do the following:
- Upgrade a pre 1.4 repository (-> all indexes are V1)
- Re-index a workspace (-> workspace index will be V2)
- Execute a query on the workspace (-> will use V2 for queries) 

this will just run fine, as I tested it this way. You can have workspaces with the old index style alongside workspaces with the new index style, and likewise a system index in the new or old format. 

It is hard to make this nicely backwards compatible, because the MultiIndex creates the index itself when there is no index yet.

For example, when in SearchIndex.doInit() the following line is executed

index = new MultiIndex(indexDir, this, context.getItemStateManager(),
                context.getRootId(), excludedIDs, nsMappings);

the system index is created. Because this is *before* the setIndexFormatVersion part in doInit(), in NodeIndexer this part

if(indexFormatVersion == IndexFormatVersion.V2) {
               addPropertyName(doc,propState.getName());
}

will never be called since indexFormatVersion == null. This means the system index is always indexed without the PROPERTIES_SET, and therefore always in the old format. 

Now, I just tested setting the default index format before creating the new MultiIndex, like:

setIndexFormatVersion(IndexFormatVersion.V2);
index = new MultiIndex(indexDir, this, context.getItemStateManager(),
               context.getRootId(), excludedIDs, nsMappings);

which might later in doInit() be set to V1.

So when a new index is created here, I get an index with the PROPERTIES_SET. But... I do not know whether the new MultiIndex(...) call also indexes when the index already exists, in which case it might index PROPERTIES_SET while the index should stay in the old format. I hope I am reasonably clear about the problems? :-)
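
The effect of the unset version can be shown with a small stand-alone sketch. BootstrapOrderDemo and its indexProperty method are hypothetical; only the V2 guard is taken from the snippet quoted above.

```java
import java.util.ArrayList;
import java.util.List;

final class BootstrapOrderDemo {
    enum Version { V1, V2 }

    private BootstrapOrderDemo() {}

    /** Hypothetical indexer: lists the fields it would write for one property. */
    static List<String> indexProperty(Version indexFormatVersion, String propertyName) {
        List<String> fields = new ArrayList<>();
        fields.add(propertyName);
        // Same guard as in NodeIndexer above: false for V1 and also for null.
        if (indexFormatVersion == Version.V2) {
            fields.add("_:PROPERTIES_SET");
        }
        return fields;
    }
}
```

Calling indexProperty(null, "mytext") writes only the property field, exactly as for V1, which is why an index built before setIndexFormatVersion has run ends up in the old format.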

I'll re-add the patch with your first 4 (-)  solved and wait if you can comment on my thing about the parent handler,

thanks for reviewing :-) 






> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Marcel Reutegger (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522535 ] 

Marcel Reutegger commented on JCR-1064:
---------------------------------------

We should probably extract the initial index logic from the MultiIndex. It doesn't feel right there anyway.

In SearchIndex.doInit() we'd have then:

1) do all the init stuff (extractor, indexing config, synonyms)
2) create MultiIndex (without creating initial index)
3) do the version check
4) if index size == 0 create an initial index
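
The four steps above can be sketched as follows. This is only an outline under stated assumptions: InitOrderSketch is a hypothetical class, the steps are recorded as strings instead of doing real work, and the rule "keep the format of an existing index, use V2 for a fresh one" is an assumption based on the earlier startup check.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderSketch {
    enum Version { V1, V2 }

    final List<String> steps = new ArrayList<>();
    Version version;              // unset until the version check has run
    Version initialIndexVersion;  // version in effect when the initial index was built

    /** existingFormat is null when no index exists on disk yet. */
    void doInit(Version existingFormat) {
        steps.add("init extractor, indexing config, synonyms");
        steps.add("create MultiIndex (no initial index)");
        // Version check: keep the format of an existing index, V2 for a fresh one.
        version = (existingFormat != null) ? existingFormat : Version.V2;
        steps.add("version check -> " + version);
        if (existingFormat == null) { // corresponds to index size == 0
            initialIndexVersion = version;
            steps.add("create initial index as " + version);
        }
    }
}
```

Because the version check runs before the initial index is created, the initial index is always written with a known format version.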


> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523228 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

Implemented the new indexing format again. There is a subtle difficulty though:

When I have one sysIndex and 2 workspace indices with the following format styles:

sysIndex = old
ws1Index = old
ws2Index = old

Now, deleting only the sysIndex will generate a sysIndex in the new format style in index.createInitialIndex(). 

Since ws1Index and ws2Index  are old, the parentQueryHandler should be set to old index style again. This is implemented. 

Now suppose you again have 

sysIndex = old
ws1Index = old
ws2Index = old

and remove sysIndex  *and*  ws1Index, then  at doInit() we would get 

sysIndex = new --> old  (but changed to old when ws2Index is initialised)
ws1Index = new
ws2Index = old

But querying ws1Index might then give problems, because sysIndex was reverted to "old" when ws2Index was initialized. To solve this, getIndexFormatVersion() always checks whether the parent handler and the current index have the same format. If not, it falls back to the old style.

This implies that after updating the jackrabbit version, you will get the new indexing format style if and only if you re-index all the indices you have so far. 
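
A minimal sketch of that fallback rule, assuming the version of the parent handler is available as a plain value (null when there is no parent). EffectiveVersion is a hypothetical helper, not the actual getIndexFormatVersion() implementation from the patch.

```java
final class EffectiveVersion {
    enum Version { V1, V2 }

    private EffectiveVersion() {}

    /**
     * parent may be null when there is no parent query handler.
     * Any disagreement falls back to V1 (the old style).
     */
    static Version effective(Version own, Version parent) {
        if (parent == null || own == parent) {
            return own;
        }
        return Version.V1;
    }
}
```

In the scenario above, ws1Index is new (V2) but sysIndex was reverted to old (V1), so effective(V2, V1) yields V1 and the query runs in the old style.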

Hope my explanation is clear! I'll prepare the patch



> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch
>
>
> //*[@mytext] is transformed into the org.apache.jackrabbit.core.query.lucene.MatchAllQuery, that through the MatchAllWeight uses the MatchAllScorer.  The calculateDocFilter() in MatchAllScorer  does not scale and becomes slow for growing number of nodes. 
> Solution: lucene documents will get a new Field:
> public static final String PROPERTIES_SET = "_:PROPERTIES_SET".intern();
> that holds the available properties of this document. 
> NOTE: Lucene indices build without this performance improvement should still work and fall back to the original implementation


[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064-2.patch

Patch that should implement all previous comments. 



> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-DEPR.patch

[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Jukka Zitting (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting updated JCR-1064:
-------------------------------

          Component/s: jackrabbit-core
    Affects Version/s:     (was: 1.3.1)

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing, jackrabbit-core
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-4.patch, JCR-1064-DEPR.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Christoph Kiehl (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522442 ] 

Christoph Kiehl commented on JCR-1064:
--------------------------------------

I like the patch so far. Just a few comments:

LuceneQueryBuilder: 
- I would rename getMatchAllQuery() to createMatchAllQuery() to make clear that a new instance is created
- I don't think it's valid for the old createQuery() variant to assume the new index format. I would rather have it assume the old index format, since that still works with the new index as well (so far). The optimal solution, though, would be to remove these methods. Let's wait and see what Marcel says.

SearchIndex:
- You seem to have called "Organize Imports". If you could adjust the import order to Jackrabbit's order, this would make the diff smaller
- I would rewrite the format check to:

        // The index is in the new format if either the index already contains
        // the field FieldNames.PROPERTIES_SET in any document or if the index
        // is empty
        Collection allFieldNames = index.getIndexReader().getFieldNames(
                FieldOption.ALL);
        newIndexFormat = allFieldNames.contains(FieldNames.PROPERTIES_SET)
                || index.getIndexReader().numDocs() == 0;

        if (!newIndexFormat) {
            log.warn("Index is in old format. This might imply slower queries. "
                    + "Re-index if possible.");
        }

My line was actually just a quick example. I think it's more readable if it is split into at least two lines. The other point is that I prefer to always enclose if-blocks in curly braces. This is less error-prone when adding new statements to the block.

Overall, you should take care to use spaces instead of tabs everywhere. If you use Eclipse, just edit your formatter preferences for that particular project and press <Ctrl>+I on the sections in question. This will re-indent those sections.

This sounds like a lot of criticism, but I really like and appreciate your work!
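
For readers without the patch at hand, the format check suggested above can be modeled without any Lucene dependency. This is only a sketch: the class name IndexFormatCheck and the method isNewFormat() are made up for illustration, and the field names and document count are passed in directly rather than read from an IndexReader.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;

/** Simplified, Lucene-free model of the index format check (hypothetical class). */
public class IndexFormatCheck {

    // Field name as given in the issue description.
    public static final String PROPERTIES_SET = "_:PROPERTIES_SET";

    /**
     * The index counts as "new format" if some document already carries the
     * PROPERTIES_SET field, or if the index is empty (nothing to re-index).
     */
    public static boolean isNewFormat(Collection<String> allFieldNames, int numDocs) {
        boolean hasPropertiesSet = allFieldNames.contains(PROPERTIES_SET);
        boolean isEmpty = numDocs == 0;
        return hasPropertiesSet || isEmpty;
    }

    public static void main(String[] args) {
        // Existing index written before the change: old format.
        System.out.println(isNewFormat(Arrays.asList("_:PROPERTIES"), 10));   // false
        // At least one document has the new field: new format.
        System.out.println(isNewFormat(
                Arrays.asList("_:PROPERTIES", PROPERTIES_SET), 10));           // true
        // Empty index: every document added from now on gets the new field.
        System.out.println(isNewFormat(Collections.<String>emptyList(), 0));  // true
    }
}
```

Treating an empty index as new format is what lets a freshly created workspace use the faster path without an explicit re-index.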

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064.patch

[jira] Commented: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522132 ] 

Ard Schrijvers commented on JCR-1064:
-------------------------------------

AFAICS, the MatchAllScorer won't be needed anymore with the new index format. Therefore, I won't add the changes to MatchAllScorer that I had, because they imply that the scorer needs to know the index format. For those interested, the improvement looked like this:

        if (newFormat) {
            // New format: a single PROPERTIES_SET term lists every document
            // that has the property, so one term lookup is enough.
            TermDocs docs = reader.termDocs(
                    new Term(FieldNames.PROPERTIES_SET, field));
            try {
                while (docs.next()) {
                    docFilter.set(docs.doc());
                }
            } finally {
                docs.close();
            }
        } else {
            // Old format: walk every PROPERTIES term that starts with the
            // property name and collect the documents of each term.
            String namedValue = FieldNames.createNamedValue(field, "");
            TermEnum terms = reader.terms(new Term(FieldNames.PROPERTIES, namedValue));
            try {
                TermDocs docs = reader.termDocs();
                try {
                    // Field names are interned, so == is a valid comparison.
                    while (terms.term() != null
                            && terms.term().field() == FieldNames.PROPERTIES
                            && terms.term().text().startsWith(namedValue)) {
                        docs.seek(terms);
                        while (docs.next()) {
                            docFilter.set(docs.doc());
                        }
                        terms.next();
                    }
                } finally {
                    docs.close();
                }
            } finally {
                terms.close();
            }
        }
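
To make the difference between the two branches concrete, here is a small stand-alone model that replaces the Lucene term dictionary with a sorted map. It is only a sketch under assumptions: the class PropertyExistenceLookup, its methods, and the '=' separator between property name and value are all hypothetical; the real term encoding is whatever FieldNames.createNamedValue() produces.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Lucene-free model of the old prefix scan versus the new single-term lookup. */
public class PropertyExistenceLookup {

    // Illustrative separator only; not the encoding Jackrabbit actually uses.
    static final char SEP = '=';

    /** Old format: visit every "<property><SEP><value>" term with the right prefix. */
    public static BitSet oldFormatScan(TreeMap<String, int[]> valueTerms, String property) {
        String prefix = property + SEP;
        BitSet docs = new BitSet();
        for (Map.Entry<String, int[]> e : valueTerms.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the block of terms belonging to this property
            }
            for (int doc : e.getValue()) {
                docs.set(doc);
            }
        }
        return docs;
    }

    /** New format: one term per property name, so a single lookup suffices. */
    public static BitSet newFormatLookup(Map<String, int[]> propertySetTerms, String property) {
        BitSet docs = new BitSet();
        int[] postings = propertySetTerms.get(property);
        if (postings != null) {
            for (int doc : postings) {
                docs.set(doc);
            }
        }
        return docs;
    }

    /** Sample old-style terms: docs 0 and 2 have mytext=a, doc 1 has mytext=b. */
    public static TreeMap<String, int[]> sampleValueTerms() {
        TreeMap<String, int[]> terms = new TreeMap<>();
        terms.put("mytext" + SEP + "a", new int[] {0, 2});
        terms.put("mytext" + SEP + "b", new int[] {1});
        terms.put("title" + SEP + "x", new int[] {2});
        return terms;
    }

    /** The same three documents, indexed by property name only. */
    public static Map<String, int[]> samplePropertySetTerms() {
        Map<String, int[]> terms = new HashMap<>();
        terms.put("mytext", new int[] {0, 1, 2});
        terms.put("title", new int[] {2});
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(oldFormatScan(sampleValueTerms(), "mytext"));         // {0, 1, 2}
        System.out.println(newFormatLookup(samplePropertySetTerms(), "mytext")); // {0, 1, 2}
    }
}
```

Both paths return the same documents. In this model the old scan touches one term per distinct value of the property, while the new lookup touches exactly one term regardless of how many distinct values exist.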

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4

[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Ard Schrijvers (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ard Schrijvers updated JCR-1064:
--------------------------------

    Attachment: JCR-1064-DEPR.patch

Added the deprecated patch for review

> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-DEPR.patch

[jira] Updated: (JCR-1064) Optimize queries that check for the existence of a property

Posted by "Christoph Kiehl (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/JCR-1064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christoph Kiehl updated JCR-1064:
---------------------------------

    Attachment: JCR-1064-4.patch

I like Marcel's solution to distinguish between physical index version and query index version. I just changed a few lines of Marcel's patch:

- IndexFormatVersion now contains the logic to decide which format version an index uses (MultiIndex shouldn't know which FieldNames are used)
- No logging in createMatchAllQuery()
- Some javadoc edits/typo corrections

WDYT?

And sorry, Ard, for the late response. I was quite busy the last few weeks as well. Thanks a lot for your work!
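
As a rough illustration of the division of labour described in the list above, a query factory can pick the implementation purely from a format version that was determined elsewhere, without logging and without knowing any field-detection logic. Everything in this sketch is hypothetical (the FormatVersion enum, the PropertyExistsQuery interface and the describe() strings); it is not the code from the attached patches.

```java
/** Lucene-free sketch of choosing the query implementation by index format version. */
public class MatchAllQueryFactory {

    /** Hypothetical versions: V1 has no PROPERTIES_SET field, V2 has it. */
    public enum FormatVersion { V1, V2 }

    /** Minimal stand-in for a query object; not the Lucene Query class. */
    public interface PropertyExistsQuery {
        String describe();
    }

    /**
     * Always creates a new query instance. The caller supplies the version,
     * so this method neither inspects the index nor logs anything.
     */
    public static PropertyExistsQuery createMatchAllQuery(
            final String property, FormatVersion version) {
        if (version == FormatVersion.V2) {
            // New format: a plain term lookup on the property name.
            return new PropertyExistsQuery() {
                public String describe() {
                    return "term lookup on _:PROPERTIES_SET for " + property;
                }
            };
        }
        // Old format: fall back to the original MatchAllQuery-style scan.
        return new PropertyExistsQuery() {
            public String describe() {
                return "MatchAllQuery scan for " + property;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(createMatchAllQuery("mytext", FormatVersion.V2).describe());
        System.out.println(createMatchAllQuery("mytext", FormatVersion.V1).describe());
    }
}
```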


> Optimize queries that check for the existence of a property
> -----------------------------------------------------------
>
>                 Key: JCR-1064
>                 URL: https://issues.apache.org/jira/browse/JCR-1064
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: indexing
>    Affects Versions: 1.3.1
>            Reporter: Ard Schrijvers
>            Priority: Minor
>             Fix For: 1.4
>
>         Attachments: JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-2.patch, JCR-1064-3.patch, JCR-1064-4.patch, JCR-1064-DEPR.patch