You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Joel Bernstein (JIRA)" <ji...@apache.org> on 2014/07/24 14:39:39 UTC

[jira] [Comment Edited] (SOLR-5244) Full Search Result Export

    [ https://issues.apache.org/jira/browse/SOLR-5244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073160#comment-14073160 ] 

Joel Bernstein edited comment on SOLR-5244 at 7/24/14 12:37 PM:
----------------------------------------------------------------

New patch with ExportQParserPlugin as a RankQuery, and a SortingResponseWriter that exports a full sorted result set in json format.


was (Author: joel.bernstein):
New patch with ExportQueryParserPlugin as a RankQuery, and a SortingResponseWriter that exports a full sorted result set in json format.

> Full Search Result Export
> -------------------------
>
>                 Key: SOLR-5244
>                 URL: https://issues.apache.org/jira/browse/SOLR-5244
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>    Affects Versions: 5.0
>            Reporter: Joel Bernstein
>            Assignee: Joel Bernstein
>            Priority: Minor
>             Fix For: 5.0, 4.10
>
>         Attachments: 0001-SOLR_5244.patch, SOLR-5244.patch, SOLR-5244.patch
>
>
> It would be great if Solr could efficiently export entire search result sets without scoring or ranking documents. This would allow external systems to perform rapid bulk imports from Solr. It also provides a possible platform for exporting results to support distributed join scenarios within Solr.
> This ticket provides a patch that has two pluggable components:
> 1) ExportQParserPlugin: which is a post filter that gathers a BitSet with document results and does not delegate to ranking collectors. Instead it puts the BitSet on the request context.
> 2) BinaryExportWriter: Is a output writer that iterates the BitSet and prints the entire result as a binary stream. A header is provided at the beginning of the stream so external clients can self configure.
> Note:
> These two components will be sufficient for a non-distributed environment. 
> For distributed export a new Request handler will need to be developed.
> After applying the patch and building the dist or example, you can register the components through the following changes to solrconfig.xml
> Register export contrib libraries:
> <lib dir="../../../dist/" regex="solr-export-\d.*\.jar" />
>  
> Register the "export" queryParser with the following line:
>  
> <queryParser name="export" class="org.apache.solr.export.ExportQParserPlugin"/>
>  
> Register the "xbin" writer:
>  
> <queryResponseWriter name="xbin" class="org.apache.solr.export.BinaryExportWriter"/>
>  
> The following query will perform the export:
> {code}
> http://localhost:8983/solr/collection1/select?q=*:*&fq={!export}&wt=xbin&fl=join_i
> {code}
> Initial patch supports export of four data-types:
> 1) Single value trie int, long and float
> 2) Binary doc values.
> The numerics are currently exported from the FieldCache and the Binary doc values can be in memory or on disk.
> Since this is designed to export very large result sets efficiently, stored fields are not used for the export.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org