Posted to solr-dev@lucene.apache.org by "Noble Paul (JIRA)" <ji...@apache.org> on 2009/01/03 11:07:44 UTC

[jira] Commented: (SOLR-906) Buffered / Streaming SolrServer implementation

    [ https://issues.apache.org/jira/browse/SOLR-906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12660457#action_12660457 ] 

Noble Paul commented on SOLR-906:
---------------------------------

One problem with the current implementation is that it writes everything to a local buffer and then uploads the whole content in one go. So essentially we are wasting time until all of your 40K docs have been written into this huge XML. Another issue is that this XML has to fit in memory.

We need to fix CommonsHttpSolrServer first. It must stream the docs.
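
To illustrate the difference, here is a minimal sketch in plain Java, not the actual CommonsHttpSolrServer code. The class StreamingSketch and the helpers toXml, sendBuffered and sendStreaming are hypothetical names, docs are reduced to bare id strings, and no XML escaping is done. In a real client the OutputStream would be the body of the HTTP request; one possibility (an assumption, not what SolrJ does) is java.net.HttpURLConnection with setChunkedStreamingMode.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class StreamingSketch {

    // Hypothetical helper: renders one doc as XML (ids assumed to need no escaping).
    static String toXml(String id) {
        return "<doc><field name=\"id\">" + id + "</field></doc>";
    }

    // Buffered: the whole <add> payload is built in memory before a single byte is sent.
    static void sendBuffered(Iterator<String> ids, OutputStream out) throws IOException {
        StringBuilder xml = new StringBuilder("<add>");
        while (ids.hasNext()) {
            xml.append(toXml(ids.next()));
        }
        xml.append("</add>");
        out.write(xml.toString().getBytes(StandardCharsets.UTF_8));
        out.flush();
    }

    // Streaming: each doc is handed to the writer as it is produced, so only the
    // writer's small internal buffer is held in memory, regardless of the doc count.
    static void sendStreaming(Iterator<String> ids, OutputStream out) throws IOException {
        Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        w.write("<add>");
        while (ids.hasNext()) {
            w.write(toXml(ids.next()));
        }
        w.write("</add>");
        w.flush();
    }

    public static void main(String[] args) throws IOException {
        List<String> ids = Arrays.asList("1", "2", "3");
        ByteArrayOutputStream buffered = new ByteArrayOutputStream();
        ByteArrayOutputStream streamed = new ByteArrayOutputStream();
        sendBuffered(ids.iterator(), buffered);
        sendStreaming(ids.iterator(), streamed);
        // Both variants put the same bytes on the wire; only peak memory and
        // the time at which the first byte can leave the client differ.
        System.out.println(Arrays.equals(buffered.toByteArray(), streamed.toByteArray()));
    }
}
```

The payload is identical either way, so the server side would not need to change for this part of the fix.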

Another enhancement is to use a different format (SOLR-865). It uses the javabin format, which can be extremely fast compared to XML, and the payload can be reduced substantially.

We can probably overcome the performance problems to a certain extent with these two fixes.





> Buffered / Streaming SolrServer implementation
> ----------------------------------------------
>
>                 Key: SOLR-906
>                 URL: https://issues.apache.org/jira/browse/SOLR-906
>             Project: Solr
>          Issue Type: New Feature
>          Components: clients - java
>            Reporter: Ryan McKinley
>            Assignee: Shalin Shekhar Mangar
>             Fix For: 1.4
>
>         Attachments: SOLR-906-StreamingHttpSolrServer.patch, SOLR-906-StreamingHttpSolrServer.patch, SOLR-906-StreamingHttpSolrServer.patch, SOLR-906-StreamingHttpSolrServer.patch, StreamingHttpSolrServer.java
>
>
> While indexing lots of documents, the CommonsHttpSolrServer add( SolrInputDocument ) is less than optimal.  It makes a new request for each document.
> With a "StreamingHttpSolrServer", documents are buffered and then written to a single open HTTP connection.
> For related discussion see:
> http://www.nabble.com/solr-performance-tt9055437.html#a20833680
