You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2014/08/13 22:52:13 UTC

[jira] [Comment Edited] (SOLR-5244) Exporting Full Sorted Result Sets

    [ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096099#comment-14096099 ] 

Joel Bernstein edited comment on SOLR-5244 at 8/13/14 8:51 PM:
---------------------------------------------------------------

We also need a graceful way to not have the /export handler go into distributed mode, which is not support in this release.

We could simply add distrib=false to the default params, which will work.

I think when we introduce the SearcHandler we should also introduce some basic distributed support as well, but this will take some thought and design.

As it stands now, users can write SolrCloud aware clients and solve a lot if interesting use case with this. So I think we should release it.


was (Author: joel.bernstein):
We also need a graceful way to not have the /export handler go into distributed mode, which is not support in this release.

For we could simple add distrib=false to the default param, which will work.

I think when we introduce the SearcHandler we should also introduce some basic distributed support as well, but this will take some thought and design.

As it stands now, users can write SolrCloud aware clients and solve a lot if interesting use case with this. So I think we should release it.

> Exporting Full Sorted Result Sets
> ---------------------------------
>
>                 Key: SOLR-5244
>                 URL: https://issues.apache.org/jira/browse/SOLR-5244
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 5.0
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>         Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch, SOLR-5244.patch
>
>
> This ticket allows Solr to export full sorted result sets. A new export request handler has been created that sets up the default writer type (SortingResponseWriter) and the required rank query (ExportQParserPlugin). The syntax is:
> {code}
> /solr/collection1/export?q=*:*&fl=a,b,c&sort=a desc,b desc
> {code}
> This capability will open up Solr for a whole range of uses that were typically done using aggregation engines like Hadoop. For example:
> *Large Distributed Joins*
> A client outside of Solr calls two different Solr collections and returns the results sorted by a join key. The client iterates through both streams and performs a merge join.
> *Fully Distributed Field Collapsing/Grouping*
> A client outside of Solr makes individual calls to all the servers in a single collection and returns results sorted by the collapse key. The client merge joins the sorted lists on the collapse key to perform the field collapse.
> *High Cardinality Distributed Aggregation*
> A client outside of Solr makes individual calls to all the servers in a single collection and sorts on a high cardinality field. The client then merge joins the sorted lists to perform the high cardinality aggregation.
> *Large Scale Time Series Rollups*
> A client outside Solr makes individual calls to all servers in a collection and sorts on time dimensions. The client merge joins the sorted result sets and rolls up the time dimensions as it iterates through the data.
> In these scenarios Solr is being used as a distributed sorting engine. Developers can write clients that take advantage of this sorting capability in any way they wish.
> *Session Analysis and Aggregation*
> A client outside Solr makes individual calls to all servers in a collection and sorts on the sessionID. The client merge joins the sorted results and aggregates sessions as it iterates through the results.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org