Posted to solr-user@lucene.apache.org by Tod <li...@gmail.com> on 2011/11/04 17:36:11 UTC

Batch indexing documents using ContentStreamUpdateRequest

This is a code fragment showing how I am issuing a ContentStreamUpdateRequest 
using CommonsHttpSolrServer:


   ContentStreamBase.URLStream csbu = new ContentStreamBase.URLStream(url);
   InputStream is = csbu.getStream();
   FastInputStream fis = new FastInputStream(is);

   csur.addContentStream(csbu);
   csur.setParam("literal.content_id","000000");
   csur.setParam("literal.contentitle","This is a test");
   csur.setParam("literal.title","This is a test");
   server.request(csur);
   server.commit();

   fis.close();


This works fine for one document (a PDF in this case).  When I wrap 
this in a while loop and try to add multiple documents I get:

org.apache.solr.client.solrj.SolrServerException: java.io.IOException: 
stream is closed

I've tried commenting out the fis.close(), and also using just a plain 
InputStream with and without a .close() call - neither works.  Is there a 
way to do this that I'm missing?


Thanks - Tod

Re: Batch indexing documents using ContentStreamUpdateRequest

Posted by Tod <li...@gmail.com>.
Answering my own question.

The ContentStreamUpdateRequest (csur) needs to be created inside the while 
loop, not outside it as I had it.  I'm still not seeing any dramatic 
performance improvement over Perl, though (which was the point of this 
exercise).  Indexing locks up after about 30-45 minutes of activity, and 
even a commit won't budge it.
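
The thread does not say why reusing the request fails. One plausible 
reading, offered here as an assumption, is that a request created outside 
the loop keeps every stream ever added to it, so the second send tries to 
re-read the first document's already-consumed stream. The sketch below 
models that with plain Java stand-ins (FakeUpdateRequest, OneShotStream and 
send are hypothetical and are not SolrJ classes); in the real code the 
equivalent change is constructing the ContentStreamUpdateRequest inside the 
while loop.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerDocumentRequestSketch {

    /** Hypothetical stand-in for a request that collects content streams (not SolrJ). */
    static class FakeUpdateRequest {
        private final List<InputStream> streams = new ArrayList<>();
        void addContentStream(InputStream in) { streams.add(in); }
        List<InputStream> getContentStreams() { return streams; }
    }

    /** Assumption: a stream that can be read once and fails after it has been closed. */
    static class OneShotStream extends InputStream {
        private final InputStream delegate;
        private boolean closed = false;
        OneShotStream(byte[] data) { this.delegate = new ByteArrayInputStream(data); }
        @Override public int read() throws IOException {
            if (closed) throw new IOException("stream is closed");
            return delegate.read();
        }
        @Override public void close() { closed = true; }
    }

    /** Hypothetical stand-in for sending a request: reads and closes every attached stream. */
    static int send(FakeUpdateRequest req) throws IOException {
        int total = 0;
        for (InputStream in : req.getContentStreams()) {
            try (InputStream s = in) {
                while (s.read() != -1) total++;
            }
        }
        return total; // number of bytes "sent"
    }

    /** The fix from the thread: a new request object for each document. */
    static int indexWithFreshRequests(List<byte[]> docs) throws IOException {
        int sent = 0;
        for (byte[] doc : docs) {
            FakeUpdateRequest req = new FakeUpdateRequest(); // created inside the loop
            req.addContentStream(new OneShotStream(doc));
            sent += send(req);
        }
        return sent;
    }

    /** The original arrangement: one request shared by every iteration. */
    static int indexWithSharedRequest(List<byte[]> docs) throws IOException {
        FakeUpdateRequest req = new FakeUpdateRequest(); // created once, outside the loop
        int sent = 0;
        for (byte[] doc : docs) {
            req.addContentStream(new OneShotStream(doc)); // earlier streams are still attached
            sent += send(req); // second pass re-reads the first, closed stream
        }
        return sent;
    }

    public static void main(String[] args) throws IOException {
        List<byte[]> docs = Arrays.asList("ab".getBytes(), "cde".getBytes());
        System.out.println("fresh request per document: "
                + indexWithFreshRequests(docs) + " bytes sent");
        try {
            indexWithSharedRequest(docs);
        } catch (IOException e) {
            System.out.println("shared request failed: " + e.getMessage());
        }
    }
}
```

In this model the shared request succeeds for the first document and fails 
on the second, which matches the symptom described above; whether SolrJ 
fails for exactly this reason is not confirmed in the thread.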


