Posted to solr-user@lucene.apache.org by Achim Domma <do...@procoders.net> on 2014/05/21 00:01:17 UTC

Extensibility and code reuse: SOLR vs Lucene

Hi,

I have a project where we need to do aggregations over faceted values. The stats component is no longer powerful enough, and the new statistics component does not seem to be ready yet. I understand that it's not easy to create a general-purpose component for this task. I decided to check whether I can solve my use case myself, but I'm struggling. Any clarification regarding the following points would be much appreciated:

- I assume that some of my use cases could be solved by using a custom collector. Lucene seems to be built to be extensible by deriving classes and overriding methods. That's how I would expect SOLID code to be. But looking at the SOLR code, I see a lot of hard-coded types and no way to just exchange the collector. This is the case for most of the code I have read, so I wonder: Is there another way to customize / extend SOLR? How is the SOLR code supposed to be reused?

- Several times I found code snippets like " if (collector instanceof DelegatingCollector) { ((DelegatingCollector)collector).finish(); } ". Such code is considered bad practice in every OO language I know. Am I missing something here? Is there a reason why it's solved like this?

cheers,
Achim

Re: Extensibility and code reuse: SOLR vs Lucene

Posted by Yonik Seeley <yo...@heliosearch.com>.
On Tue, May 20, 2014 at 6:01 PM, Achim Domma <do...@procoders.net> wrote:
> - Several times I found code snippets like " if (collector instanceof DelegatingCollector) { ((DelegatingCollector)collector).finish(); } ". Such code is considered bad practice in every OO language I know. Am I missing something here? Is there a reason why it's solved like this?

In a single code base you would be correct (we would just add a finish
method to the base Collector class).  When you are adding additional
functionality to an existing API/code base, however, this is often the
only way to do it.
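
As a minimal sketch of that situation (the types below are hypothetical
stand-ins, not the actual Lucene or Solr classes): BaseCollector plays the
role of an existing API that cannot be changed, so the extra finish() hook
lives on a subclass and the caller has to test for it.

```java
// Hypothetical stand-ins for illustration; NOT the real Lucene/Solr classes.
// BaseCollector plays the role of an existing API that cannot be changed.
abstract class BaseCollector {
    abstract void collect(int doc);
}

// Added later: a subclass with an extra lifecycle hook the base type lacks.
class FinishingCollector extends BaseCollector {
    int collected = 0;
    boolean finished = false;

    @Override
    void collect(int doc) {
        collected++;
    }

    // Signals that no more documents will arrive.
    void finish() {
        finished = true;
    }
}

class SearchLoop {
    // The caller only knows the base type, so it has to check for the hook.
    static void run(BaseCollector collector, int[] docs) {
        for (int doc : docs) {
            collector.collect(doc);
        }
        if (collector instanceof FinishingCollector) {
            ((FinishingCollector) collector).finish();
        }
    }
}
```

If the base class could be changed, finish() would simply become a no-op
method on BaseCollector and the instanceof check would disappear.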

What type of aggregation are you looking for?  The Heliosearch project
(a Solr fork) also has this:
http://heliosearch.org/solr-facet-functions/

-Yonik
http://heliosearch.org - facet functions, subfacets, off-heap filters&fieldcache

Re: Extensibility and code reuse: SOLR vs Lucene

Posted by Joel Bernstein <jo...@gmail.com>.
Achim,

Solr can be extended to plug in custom analytics. The code snippet you
mention is part of the framework that enables this.

Here is how you do it:

1) Create a QParserPlugin that returns a Query that implements PostFilter.
2) Then implement the PostFilter API and return a DelegatingCollector that
collects whatever you like.
3) DelegatingCollector.finish() signals your collector that the search has
completed.
4) You can output your analytics directly to the ResponseBuilder. You can
get a reference to the ResponseBuilder through a static call in the
SolrRequestInfo class.
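
The collector side of steps 2 to 4 can be sketched roughly as follows. This
is a simplified, self-contained model and not the Solr API: DocCollector,
ChainedCollector and SumPerCategoryCollector are hypothetical names, the
arrays stand in for per-document field values, and a plain Map stands in for
the response that real code would reach through the ResponseBuilder.

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified stand-ins for illustration; NOT the real Solr/Lucene classes.
interface DocCollector {
    void collect(int doc);
}

// Models the delegating idea: do extra work, then pass the doc down the chain.
abstract class ChainedCollector implements DocCollector {
    protected final DocCollector delegate;

    ChainedCollector(DocCollector delegate) {
        this.delegate = delegate;
    }

    @Override
    public void collect(int doc) {
        delegate.collect(doc);
    }

    // Called once after the last document has been collected.
    void finish() {
    }
}

// Hypothetical analytics collector: sums a numeric value per category.
class SumPerCategoryCollector extends ChainedCollector {
    private final String[] categories; // category of each doc, indexed by doc id
    private final double[] values;     // numeric value of each doc
    private final Map<String, Double> sums = new TreeMap<>();
    private final Map<String, Object> response; // stand-in for the response

    SumPerCategoryCollector(DocCollector delegate, String[] categories,
                            double[] values, Map<String, Object> response) {
        super(delegate);
        this.categories = categories;
        this.values = values;
        this.response = response;
    }

    @Override
    public void collect(int doc) {
        sums.merge(categories[doc], values[doc], Double::sum);
        super.collect(doc); // keep the rest of the chain working
    }

    @Override
    void finish() {
        // The search is complete, so publish the aggregated result.
        response.put("sumPerCategory", sums);
    }
}
```

Because the collector forwards every document to its delegate, the normal
result list is still produced while the aggregation runs alongside it.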

In Solr 4.9 you'll be able to implement your own MergeStrategy to merge
the results generated by DelegatingCollectors on the shards (SOLR-5973).
The pluggable collectors in that ticket are for ranking. The PostFilter
delegating collectors are a better place for doing custom analytics.


Joel Bernstein
Search Engineer at Heliosearch

