You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2010/11/29 17:39:13 UTC

[jira] Created: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Change all FilteredTermsEnum impls into TermsEnum decorators
------------------------------------------------------------

                 Key: LUCENE-2784
                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: Robert Muir
             Fix For: 4.0


Currently, FilteredTermsEnum has two ctors:
* FilteredTermsEnum(IndexReader reader, String field)
* FilteredTermsEnum(TermsEnum tenum)

But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor

In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.

Advantages:
* This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
* Removes silly checks such as if (tenum == null) in every next()
* Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.

I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964864#action_12964864 ] 

Uwe Schindler edited comment on LUCENE-2784 at 11/29/10 1:34 PM:
-----------------------------------------------------------------

That patch looks very good!

{noformat}
+      if (term == null) // TODO: is this really necessary? we should be positioned already?
{noformat}

The check is really necessary, as on the first call of nextSeekTerm on the uninitialized TermsEnum (before first next()), the current term is null. If we would change this and place it on the first term of the underlying enum we would break everything: TermsEnum must be unpositioned initially before next() [this was the broken thing before 4.0] and we would have an unneeded seek on a never needed term.

So you can remove that comment from patch!

      was (Author: thetaphi):
    That patch looks very good!

{noformat}
+      if (term == null) // TODO: is this really necessary? we should be positioned already?
{noformat}

The check is really necessary, as on the first call of nextSeekTerm on the uninitialized TermsEnum (before first next()), the current term is null. If we would change this and place it on the first term of the underlying enum we would break everything: TermsEnum must be unpositioned initially before next() [this was the broken thing before 4.0] and we would have an unneeded seek on a never needed term.

So you can remove that line from patch!
  
> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964864#action_12964864 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

That patch looks very good!

{noformat}
+      if (term == null) // TODO: is this really necessary? we should be positioned already?
{noformat}

The check is really necessary, as on the first call of nextSeekTerm on the uninitialized TermsEnum (before first next()), the current term is null. If we would change this and place it on the first term of the underlying enum we would break everything: TermsEnum must be unpositioned initially before next() [this was the broken thing before 4.0] and we would have an unneeded seek on a never needed term.

So you can remove that line from patch!

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Assigned: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler reassigned LUCENE-2784:
-------------------------------------

    Assignee: Uwe Schindler

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964832#action_12964832 ] 

Robert Muir commented on LUCENE-2784:
-------------------------------------

Right, we could check in every query and return TermsEnum.EMPTY

or a crazier idea, we could change MTQ.getTermsEnum(IndexReader reader, AttributeSource atts)
to instead take a TermsEnum versus a reader.

then MTQ itself could handle this if null, return TermsEnum.EMPTY case?

But i didnt try this yet.


> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964867#action_12964867 ] 

Uwe Schindler edited comment on LUCENE-2784 at 11/29/10 1:26 PM:
-----------------------------------------------------------------

bq. I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.

MultiTermQueryWrapperFilter still has it :-) And TermCollectingRewrite does not have it, which is correct because it knows that it works per-segment (that was always the case). But I think we should remove that in a separate issue.

      was (Author: thetaphi):
    bq. I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.

MultiTermQueryWrapperFilter still has it :-) And TermCollectingRewrite does not have it. But I think we should remove that in a separate issue.
  
> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964819#action_12964819 ] 

Robert Muir commented on LUCENE-2784:
-------------------------------------

bq. Looks god, but I think (no tests hit this), that you must do a null check when calling MultiFields.getTerms().

We need tests for this, and to decide how to handle it: e.g. return emptytermsenum or whatever is my first idea.

its still better than the empty condition being handled in FTE though (and checked on every call to next())

bq. We should really remove usage of MultiFields here, all Filters and Queries now work directly on segment readers.

I agree, we don't need multifields in these queries. (I just did it to be consistent with trunk).


> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964837#action_12964837 ] 

Uwe Schindler edited comment on LUCENE-2784 at 11/29/10 12:13 PM:
------------------------------------------------------------------

bq. or a crazier idea, we could change MTQ.getTermsEnum(IndexReader reader, AttributeSource atts) to instead take a TermsEnum versus a reader.

I already checked for that (that was the reason for my prev. comment).

But in my opinion, we should not do this. As some MTQs may quicker return EMPTY for some reasons, like for a range when upper<lower. So you don't even need to look into the IR's fields/terms. So I would lie to remove the checks in MTQ and simply call always MTQ.getTermsEnum and then exit if empty. Thats much more clean. I was just wondering why these checks were added by Mike? For TermRange and Numeric its too much work.

      was (Author: thetaphi):
    bq. or a crazier idea, we could change MTQ.getTermsEnum(IndexReader reader, AttributeSource atts)
to instead take a TermsEnum versus a reader.

I already checked for that (that was the reason for my prev. comment).

But in my opinion, we should not do this. As some MTQs may quicker return EMPTY for some reasons, like for a range when upper<lower. So you don't even need to look into the IR's fields/terms. So I would lie to remove the checks in MTQ and simply call always MTQ.getTermsEnum and then exit if empty. Thats much more clean. I was just wondering why these checks were added by Mike? For TermRange and Numeric its too much work.
  
> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964867#action_12964867 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

bq. I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.

MultiTermQueryWrapperFilter still has it :-) And TermCollectingRewrite does not have it. But I think we should remove that in a separate issue.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964809#action_12964809 ] 

Yonik Seeley commented on LUCENE-2784:
--------------------------------------

+1, definitely seems like the right path.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2784:
--------------------------------

    Attachment: LUCENE-2784.patch

ok, i removed the TODO (i understand now), and also some unused imports in tests.

other than that the patch is the same, all core+contrib+solr tests pass.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964816#action_12964816 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

Looks god, but I think (no tests hit this), that you must do a null check when calling MultiFields.getTerms().

We should really remove usage of MultiFields here, all Filters and Queries now work directly on segment readers.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964839#action_12964839 ] 

Robert Muir commented on LUCENE-2784:
-------------------------------------

ok, either way is fine by me. 
btw there is a similar check in AutomatonQuery too (where it looks at the query and knows it can return EMPTY)

we can clean up these getEnums though (such as that one) a lot though, i didnt do it in the patch...
it was almost a find-replace job.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2784:
--------------------------------

    Attachment: LUCENE-2784.patch

here is an improved patch: this changes MultiTermQuery.getTermsEnum(IndexReader ir) to MultiTermQuery.getTermsEnum(Terms terms).

I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.


> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Resolved: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir resolved LUCENE-2784.
---------------------------------

    Resolution: Fixed

Committed revision 1040379.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964837#action_12964837 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

bq. or a crazier idea, we could change MTQ.getTermsEnum(IndexReader reader, AttributeSource atts)
to instead take a TermsEnum versus a reader.

I already checked for that (that was the reason for my prev. comment).

But in my opinion, we should not do this. As some MTQs may quicker return EMPTY for some reasons, like for a range when upper<lower. So you don't even need to look into the IR's fields/terms. So I would lie to remove the checks in MTQ and simply call always MTQ.getTermsEnum and then exit if empty. Thats much more clean. I was just wondering why these checks were added by Mike? For TermRange and Numeric its too much work.

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Updated: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Robert Muir (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Muir updated LUCENE-2784:
--------------------------------

    Attachment: LUCENE-2784.patch

first quick stab

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964829#action_12964829 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

I looked into the code of MTQ: The reason why we don't hit the null case here is: MTQ itsself gets IR's Terms and checks for null, so later calls here always return something. In my opinion this is also not the right thing (I don't know for what reason this was added). Maybe we should simply also change MTQ.getTermsEnum() to simply take a enum to wrap?

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Issue Comment Edited: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964867#action_12964867 ] 

Uwe Schindler edited comment on LUCENE-2784 at 11/29/10 1:27 PM:
-----------------------------------------------------------------

bq. I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.

MultiTermQueryWrapperFilter still has it :-) And TermCollectingRewrite does not have it, which is correct because it knows that it works per-segment (that was always the case).

But I think we should remove remaining MultiFields in a separate issue.

      was (Author: thetaphi):
    bq. I also removed multifields usage in these multitermqueries (maybe missed a few, will look more), because all rewrite methods are per-segment.

MultiTermQueryWrapperFilter still has it :-) And TermCollectingRewrite does not have it, which is correct because it knows that it works per-segment (that was always the case). But I think we should remove that in a separate issue.
  
> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch, LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] Commented: (LUCENE-2784) Change all FilteredTermsEnum impls into TermsEnum decorators

Posted by "Uwe Schindler (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/LUCENE-2784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964820#action_12964820 ] 

Uwe Schindler commented on LUCENE-2784:
---------------------------------------

bq. We need tests for this, and to decide how to handle it: e.g. return emptytermsenum or whatever is my first idea.

For IR.terms() what should be called in MTQ.getTermsEnum(IR) - not MultiFields, null is a vlaid return value. Mike and you decided that. MTQ's internal handling can also live with null enum. I will look further into it and post modified patch!

> Change all FilteredTermsEnum impls into TermsEnum decorators
> ------------------------------------------------------------
>
>                 Key: LUCENE-2784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2784
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Robert Muir
>            Assignee: Uwe Schindler
>             Fix For: 4.0
>
>         Attachments: LUCENE-2784.patch
>
>
> Currently, FilteredTermsEnum has two ctors:
> * FilteredTermsEnum(IndexReader reader, String field)
> * FilteredTermsEnum(TermsEnum tenum)
> But most of our concrete implementations (e.g. TermsRangeEnum) use the IndexReader+field ctor
> In my opinion we should remove this ctor, and switch over all FilteredTermsEnum implementations to just take a TermsEnum.
> Advantages:
> * This simplifies FilteredTermsEnum and its subclasses, where they are more decorator-like (perhaps in the future we could compose them)
> * Removes silly checks such as if (tenum == null) in every next()
> * Allows for consumers to pass in enumerators however they want: e.g. its their responsibility if they want to use MultiFields or not, it shouldnt be buried in FilteredTermsEnum.
> I created a quick patch (all core+contrib+solr tests pass), but i think this opens up more possibilities for refactoring improvements that haven't yet been done in the patch: we should explore these too.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org