Posted to solr-user@lucene.apache.org by Paul Libbrecht <pa...@activemath.org> on 2009/01/25 00:30:47 UTC

size of solr update document a limitation?

Hello Solr experts,

Is it good practice to post large Solr update documents
(e.g., 100 KB to 2 MB)?
Will Solr do the necessary tricks to make the field use a Reader
instead of Strings?
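For concreteness, here is a minimal sketch of posting an update document of the size being discussed. The core URL (`http://localhost:8983/solr/update`), the field names (`id`, `body`), and the document id are all assumptions for illustration, not anything prescribed by the thread:

```python
# Sketch: building and POSTing a large Solr XML update document.
# The Solr URL, core path, and field names below are hypothetical --
# adjust them for your own schema and deployment.
import urllib.request
from xml.sax.saxutils import escape


def build_update_doc(doc_id, body_text):
    """Build a Solr XML <add> payload with one large text field."""
    return (
        "<add><doc>"
        f'<field name="id">{escape(doc_id)}</field>'
        f'<field name="body">{escape(body_text)}</field>'
        "</doc></add>"
    ).encode("utf-8")


# A field of roughly 1.2 MB, in the 100 KB - 2 MB range from the question.
payload = build_update_doc("doc-1", "lorem ipsum " * 100_000)
print(len(payload))

# To actually send it (requires a running Solr instance):
# req = urllib.request.Request(
#     "http://localhost:8983/solr/update",
#     data=payload,
#     headers={"Content-Type": "text/xml; charset=utf-8"},
# )
# urllib.request.urlopen(req)
```

The whole payload is built as one in-memory string here, which is exactly the situation the question is probing: whether Solr can avoid materializing such a field as a String on the server side.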

thanks in advance

paul

Re: size of solr update document a limitation?

Posted by Paul Libbrecht <pa...@activemath.org>.
I even tried the Solr client (which communicates in a binary format), and
even there a Reader is converted... with toString()!

Yes, something of the sort you describe below is what I'm looking for.
I think a URL would be a safe bet for many applications.

paul


On 25 Jan 2009, at 14:39, Yonik Seeley wrote:

> I have thought about being able to specify single fields as different
> streams though (to be handled via a Reader and never brought entirely
> into memory)... perhaps something like a
> &fieldstream.myfieldname=URL_to_resource


Re: size of solr update document a limitation?

Posted by Yonik Seeley <yo...@apache.org>.
On Sat, Jan 24, 2009 at 6:30 PM, Paul Libbrecht <pa...@activemath.org> wrote:
> Is it good practice to post large Solr update documents
> (e.g., 100 KB to 2 MB)?
> Will Solr do the necessary tricks to make the field use a Reader instead of
> Strings?

Solr will stream a *document* at a time from the input stream fine,
but it can't stream a *field* at a time.
The reason is that Lucene's Document class needs all the fields at
once when indexing, and there isn't a way to make multiple Readers
from a single InputStream (the HTTP POST) w/o bringing it into memory
anyway.
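The document-at-a-time versus field-at-a-time distinction can be sketched with a plain XML pull parser. This is an illustrative analogy in Python, not Solr's actual update handler: each <doc> can be discarded after it is processed, so memory is bounded by the largest single document, but while a document is being handled every one of its fields is a fully materialized string (mirroring Lucene's Document needing all fields at once):

```python
# Sketch: streaming an <add> payload one <doc> at a time.
# Hypothetical field names; the point is the memory behavior, not the API.
import io
import xml.etree.ElementTree as ET


def index_docs(stream):
    """Process an <add> stream document by document, returning per-doc field sizes."""
    sizes = []
    for event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "doc":
            # Here, all fields of this one doc are strings in memory at once.
            fields = {f.get("name"): f.text for f in elem.iter("field")}
            sizes.append(sum(len(v or "") for v in fields.values()))
            elem.clear()  # free this doc before parsing the next one
    return sizes


# Three documents, each with a 1000-character body field.
xml_bytes = b"<add>" + b"".join(
    b'<doc><field name="id">%d</field><field name="body">%s</field></doc>'
    % (i, b"x" * 1000)
    for i in range(3)
) + b"</add>"

print(index_docs(io.BytesIO(xml_bytes)))  # -> [1001, 1001, 1001]
```

Making a single huge *field* behave like a stream would need a second pass over the same InputStream (or buffering it), which is the constraint described above.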

I have thought about being able to specify single fields as different
streams though (to be handled via a Reader and never brought entirely
into memory)... perhaps something like a
&fieldstream.myfieldname=URL_to_resource

-Yonik