Posted to solr-user@lucene.apache.org by Lance Norskog <go...@gmail.com> on 2009/12/01 22:47:45 UTC

Re: Maximum number of fields allowed in a Solr document

Lucene creates an array with one entry per document for every field you
sort on. If you sort on a thousand fields, Lucene will create 1000
different arrays of 500K ints each, per core. I assume these arrays are
cached in some way. In Solr, it is also possible to sort using a
function as the relevance value. This is rather slow, and caches no
data between queries.
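
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. It assumes 4-byte ints, one array per sorted field, and the worst
case where every one of the 1000 fields has been sorted on in all 5 cores
and all arrays stay in memory at once. The function name is made up for
illustration, and real usage will differ from this estimate.

```python
# Rough worst-case estimate of memory held by per-field sort arrays.
# Assumptions (not measured): 4 bytes per int, one array per sorted
# field per core, every array resident at the same time.
BYTES_PER_INT = 4


def sort_array_bytes(num_docs: int, num_sort_fields: int, num_cores: int = 1) -> int:
    """Return the estimated bytes used by the sort arrays."""
    return num_docs * num_sort_fields * BYTES_PER_INT * num_cores


per_core = sort_array_bytes(500_000, 1000)
all_cores = sort_array_bytes(500_000, 1000, num_cores=5)

print(f"per core: {per_core / 1e9:.1f} GB")   # per core: 2.0 GB
print(f"5 cores:  {all_cores / 1e9:.1f} GB")  # 5 cores:  10.0 GB
```

If only a handful of fields are ever sorted on, the figure shrinks in
proportion, since an array is only built for a field once it is used
for sorting.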

You may want to do sorting in your front-end applications, or get
database ids from Solr and do sorting in the database query.
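
A minimal sketch of the second option, using Python with an in-memory
SQLite database so it runs on its own. The table name, column names, and
the hard-coded id list are all hypothetical stand-ins; in a real setup
the ids would come back from a Solr query and the table would live in
your own database.

```python
import sqlite3

# Hypothetical sortable columns. Column names cannot be bound as SQL
# parameters, so they are checked against a whitelist instead.
SORTABLE_COLUMNS = {"price", "rank"}


def sort_ids_in_database(conn, ids, sort_column):
    """Return the given ids ordered by sort_column, sorted by the database."""
    if sort_column not in SORTABLE_COLUMNS:
        raise ValueError(f"unsupported sort column: {sort_column}")
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    sql = (
        f"SELECT id FROM items WHERE id IN ({placeholders}) "
        f"ORDER BY {sort_column}, id"
    )
    return [row[0] for row in conn.execute(sql, list(ids))]


# Demo data standing in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, price INTEGER, rank INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [(1, 30, 2), (2, 10, 3), (3, 20, 1), (4, 5, 9)],
)

ids_from_solr = [1, 2, 3]  # stand-in for ids returned by a Solr query
print(sort_ids_in_database(conn, ids_from_solr, "price"))  # [2, 3, 1]
```

This keeps the sort out of Solr entirely, so no per-field sort arrays
are built there. Very large id lists would need to be sent in batches,
since databases limit the number of bound parameters per statement.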

On Mon, Nov 30, 2009 at 7:14 AM, Alex Wang <aw...@crossview.com> wrote:
> Thanks Otis for the reply. Yes, this will be pretty memory intensive.
> The index consists of 5 cores with a maximum of 500K documents per
> core. I did search the archives before but did not find any definite
> answer. Thanks again!
>
> Alex
>
>
>
> On Nov 27, 2009, at 11:09 PM, Otis Gospodnetic wrote:
>
>> Hi Alex,
>>
>> There is no built-in limit.  The limit is going to be dictated by
>> your hardware resources.  In particular, this sounds like a memory
>> intensive app because of sorting on lots of different fields.  You
>> didn't mention the size of your index, but that's a factor, too.
>> Once in a while people on the list mention cases with lots and lots
>> of fields, so I'd check ML archives.
>>
>> Otis
>> --
>> Sematext is hiring -- http://sematext.com/about/jobs.html?mls
>> Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
>>
>>
>>
>> ----- Original Message ----
>>> From: Alex Wang <aw...@crossview.com>
>>> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
>>> Sent: Thu, November 26, 2009 12:47:36 PM
>>> Subject: Maximum number of fields allowed in a Solr document
>>>
>>> Hi,
>>>
>>> We are in the process of designing a Solr app where we might have
>>> millions of documents, and within each document we might have
>>> thousands of dynamic fields. These fields are small and only contain
>>> an integer, which needs to be retrievable and sortable.
>>>
>>> My questions are:
>>>
>>> 1. Is there a limit on the number of fields allowed per document?
>>> 2. What is the performance impact for such design?
>>> 3. Has anyone done this before and is it a wise thing to do?
>>>
>>> Thanks,
>>>
>>> Alex
>>
>
>



-- 
Lance Norskog
goksron@gmail.com