You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/11/18 19:40:10 UTC

[GitHub] [lucene] reta opened a new pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

reta opened a new pull request #455:
URL: https://github.com/apache/lucene/pull/455


   Signed-off-by: Andriy Redko <an...@aiven.io>
   
   # Description
   
   
   
   The usage of `MultiCollector` sometimes is very convenient along with `CollectorManager` instances (as an alternative to `MultiCollectorManager`), for example:
   
   ```
   class SomeCollectorManager implements CollectorManager<Collector, ?> {
       @Override
       public Collector newCollector() throws IOException {
         return MultiCollector.wrap(new Collector1(), new Collector2(), ...);
       }
       @Override
       public List<Integer> reduce(Collection<Collector> collectors) throws IOException {
         return ...;
       }
   }
   ```
   
   Unfortunately, the reduce() phase is lacking the access to  `MultiCollector::getCollectors` method, since it is package protected at the moment. Making `MultiCollector::getCollectors`  public would make it convenient to use `MultiCollector` + `CollectorManager` and implement the reduce phase by accessing individual collectors.
   
   # Solution
   
   Make `MultiCollector::getCollectors`  public 
   
   # Tests
   
   The test which illustrates the usage of the `MultiCollector` + `CollectorManager` has been added.
   
   # Checklist
   
   Please review the following and check all that apply:
   
   - [X] I have reviewed the guidelines for [How to Contribute](https://wiki.apache.org/lucene/HowToContribute) and my code conforms to the standards described there to the best of my ability.
   - [X] I have created a Jira issue and added the issue ID to my pull request title.
   - [X] I have given Lucene maintainers [access](https://help.github.com/en/articles/allowing-changes-to-a-pull-request-branch-created-from-a-fork) to contribute to my PR branch. (optional but recommended)
   - [X] I have developed this patch against the `main` branch.
   - [X] I have run `./gradlew check`.
   - [X] I have added tests for my changes.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] chatman commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
chatman commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974695631


   +1, I felt the need for this in Solr once. https://github.com/apache/lucene-solr/pull/675#discussion_r288442020


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
gsmiller commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974839318


   @reta ah yes, thanks for all the detail! I'll add my +1 as well, and while I'm here, I'll go ahead and merge. Thanks for adding tests as well :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller merged pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
gsmiller merged pull request #455:
URL: https://github.com/apache/lucene/pull/455


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974611343


   It's reasonable to me too. I can imagine how someone would like to implement similar logic to `MultiCollectorManager` (which internally uses `MultiCollector#getCollectors`) but return a proper object instead of an array of objects.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] reta commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
reta commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974514163


   Thanks for looking @gsmiller , we are working on experimental support of the concurrent segment search in the OpenSearch (if you are curious, please take a look at [1]). Since we are dealing with existing code, in certain cases it was much simpler to use the `CollectorManager` + `MultiCollector` instead of `MultiCollectorManager`. I assume it could be the case for other projects (fe [2]). Thank you.
   
   [1] https://github.com/opensearch-project/OpenSearch/pull/1500
   [2] https://github.com/Yelp/nrtsearch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974617749


   I'll merge next week unless someone beats me to it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
gsmiller commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974502955


   I don't have any strong opposition to making this public, but I would be curious to understand why `MultiCollectorManager` isn't a good fit for your use case. It looks like your desired functionality is exactly what `MultiCollectorManager` does, so I wonder if there's some deficiency there that should be improved instead of requiring users to implement their own collector manager on top of `MultiCollector`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] reta edited a comment on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
reta edited a comment on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974514163


   Thanks for looking @gsmiller , we are working on experimental support of the concurrent segment search in the OpenSearch (if you are curious, please take a look at [1]). Since we are dealing with existing code, in certain cases it was much simpler to use the `CollectorManager` + `MultiCollector` instead of `MultiCollectorManager`. I assume it could be the case for other projects (fe [2]). Thank you.
   
   **Update:** forgot to mention, dealing with `MultiCollectorManager` is not always the best option, since it returns `Object[]` as the result of reduce operation. It is understandable but dealing with typed collectors within the `CollectorManager` (the implementation details never escape its context) and returning strongly typed result is often (arguably) better.
   
   [1] https://github.com/opensearch-project/OpenSearch/pull/1500
   [2] https://github.com/Yelp/nrtsearch


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller merged pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
gsmiller merged pull request #455:
URL: https://github.com/apache/lucene/pull/455


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
mikemccand commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-973999812


   This sounds reasonable to me!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] chatman commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
chatman commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974695631


   +1, I felt the need for this in Solr once. https://github.com/apache/lucene-solr/pull/675#discussion_r288442020


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] reta commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
reta commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-973198716


   @jpountz if you don't mind taking a look, really appreciate it, thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
jpountz commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974611343






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] gsmiller commented on pull request #455: LUCENE-10244: Please consider opening MultiCollector::getCollectors for public use

Posted by GitBox <gi...@apache.org>.
gsmiller commented on pull request #455:
URL: https://github.com/apache/lucene/pull/455#issuecomment-974839318


   @reta ah yes, thanks for all the detail! I'll add my +1 as well, and while I'm here, I'll go ahead and merge. Thanks for adding tests as well :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org