Posted to solr-user@lucene.apache.org by "Park, Michael" <Mi...@brown.edu> on 2009/09/21 21:27:25 UTC

what is too large for an indexed field

I am trying to place the value of around 390,000 characters into a
single field.  However, my search results have become inaccurate.  Is
this too large?  I tried bumping the maxFieldLength in the
solrconfig.xml file to 500,000 and it hasn't fixed the problem.

 

Thanks,

Mike
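
One thing worth keeping in mind with this setting: in Lucene, maxFieldLength caps the number of indexed terms (tokens) per field, not the number of characters, and the stock value is 10,000. Below is a rough sketch of what that cap does, assuming plain whitespace tokenization. The helper name and the sample document are made up for illustration and are not Solr's actual code path.

```python
def indexed_terms(text, max_field_length=10_000):
    """Approximate which terms get indexed when a field is capped.

    Assumption: whitespace tokenization; the cap counts tokens, not characters.
    Tokens past the cap are dropped from the index (the stored value is unaffected).
    """
    tokens = text.split()
    return tokens[:max_field_length]


# Hypothetical document: 12,000 filler words followed by one rare word.
doc = " ".join(["filler"] * 12_000 + ["needle"])

# With the default cap the rare word falls past the cutoff and is not searchable.
print("needle" in indexed_terms(doc))
# With the cap raised above the document's token count it is indexed.
print("needle" in indexed_terms(doc, 500_000))
```

If something like this is the cause, the word would still show up in the stored document while a search for it returns nothing, and a reindex after changing the setting would be needed.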


Re: what is too large for an indexed field

Posted by Erick Erickson <er...@gmail.com>.
You might also want to get a copy of Luke and examine your index to see what's
actually in there. Could you be being misled by, say, punctuation?

Erick

On Mon, Sep 21, 2009 at 4:28 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:

> On Mon, Sep 21, 2009 at 4:22 PM, Park, Michael <Mi...@brown.edu>
> wrote:
> > I get no results back on a search.  But I can see the actual word or
> > phrase in the stored doc.
>
> Ok cool - that should make it much easier to debug.
> #1) verify that you changed the maxFieldLength property in both places
> in solrconfig.xml, and that you restarted and reindexed.
> #2) if still broken, could you show the output after adding
> debugQuery=true to the request, along with a snippet from the document
> that should match?
>
> -Yonik
> http://www.lucidimagination.com
>

Re: what is too large for an indexed field

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Sep 21, 2009 at 4:22 PM, Park, Michael <Mi...@brown.edu> wrote:
> I get no results back on a search.  But I can see the actual word or phrase in the stored doc.

Ok cool - that should make it much easier to debug.
#1) verify that you changed the maxFieldLength property in both places
in solrconfig.xml, and that you restarted and reindexed.
#2) if still broken, could you show the output after adding
debugQuery=true to the request, along with a snippet from the document
that should match?

-Yonik
http://www.lucidimagination.com
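
For step #2, a minimal sketch of building a request with debugQuery=true added. The host, port, handler path, and the field name "body" are assumptions for illustration; substitute whatever your install uses.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query):
    """Build a select URL that asks Solr to include debug output."""
    params = {"q": query, "debugQuery": "true"}
    return base_url + "?" + urlencode(params)


# Assumed local install and a hypothetical field called "body".
url = debug_query_url("http://localhost:8983/solr/select", "body:needle")
print(url)
```

The debug section of the response shows how the query string was parsed, which helps when comparing it against the snippet that should match.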

RE: what is too large for an indexed field

Posted by "Park, Michael" <Mi...@brown.edu>.
I get no results back on a search.  But I can see the actual word or phrase in the stored doc.

-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: Monday, September 21, 2009 4:18 PM
To: solr-user@lucene.apache.org
Subject: Re: what is too large for an indexed field

On Mon, Sep 21, 2009 at 3:27 PM, Park, Michael <Mi...@brown.edu> wrote:
> I am trying to place the value of around 390,000 characters into a
> single field.  However, my search results have become inaccurate.

Do you mean that the document should score higher, or that the
document doesn't match a particular query?
If the former, keep in mind that length normalization penalizes long documents.

-Yonik
http://www.lucidimagination.com

Re: what is too large for an indexed field

Posted by Yonik Seeley <yo...@lucidimagination.com>.
On Mon, Sep 21, 2009 at 3:27 PM, Park, Michael <Mi...@brown.edu> wrote:
> I am trying to place the value of around 390,000 characters into a
> single field.  However, my search results have become inaccurate.

Do you mean that the document should score higher, or that the
document doesn't match a particular query?
If the former, keep in mind that length normalization penalizes long documents.

-Yonik
http://www.lucidimagination.com
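
To put a rough number on the length normalization point: a sketch assuming the classic Lucene similarity, where the length norm is 1/sqrt(number of terms in the field). The term counts are illustrative, and the index stores the norm in a single byte, so real values are coarser than what this prints.

```python
import math


def length_norm(num_terms):
    """Classic-similarity length norm: longer fields get a smaller factor."""
    return 1.0 / math.sqrt(num_terms)


# Compare a short field, a medium one, and a very long one.
for n in (100, 10_000, 65_000):
    print(n, round(length_norm(n), 4))
```

This only affects how high a matching document scores; it does not stop a document from matching at all.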

RE: what is too large for an indexed field

Posted by "Park, Michael" <Mi...@brown.edu>.
I'm using the solr.WhitespaceTokenizerFactory and the
solr.LowerCaseFilterFactory.  Is it safe to assume that a token would be
created for each word?  

I can't imagine anything that would be even close to 16383 chars. Is there
a way to dissect the tokens? 

Thanks, Mike
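
A quick way to get a feel for the tokens outside of Solr is to approximate the analysis chain by hand. This is only a sketch that mimics whitespace splitting followed by lowercasing; it is not the real analyzer, and the sample text and function name are made up.

```python
def longest_tokens(text, top=3):
    """Split on whitespace, lowercase, and return the longest distinct tokens."""
    tokens = {t.lower() for t in text.split()}
    # Longest first; ties broken alphabetically so the order is stable.
    return sorted(tokens, key=lambda t: (-len(t), t))[:top]


sample = "The quick brown fox, jumped over http://example.com/a/very/long/path TODAY."
for token in longest_tokens(sample):
    print(len(token), token)
```

Note that splitting on whitespace leaves punctuation attached, so "fox," and "today." come out as tokens with the comma and period included, and a search for the bare word would not match them.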

-----Original Message-----
From: Mark Miller [mailto:markrmiller@gmail.com] 
Sent: Monday, September 21, 2009 3:42 PM
To: solr-user@lucene.apache.org
Subject: Re: what is too large for an indexed field

Park, Michael wrote:
> I am trying to place the value of around 390,000 characters into a
> single field.  However, my search results have become inaccurate.  Is
> this too large?  I tried bumping the maxFieldLength in the
> solrconfig.xml file to 500,000 and it hasn't fixed the problem.
>
>  
>
> Thanks,
>
> Mike
>
>
>   
How large is your largest token? There is a hard limit of (I think) 16383
chars.

-- 
- Mark

http://www.lucidimagination.com




Re: what is too large for an indexed field

Posted by Mark Miller <ma...@gmail.com>.
Park, Michael wrote:
> I am trying to place the value of around 390,000 characters into a
> single field.  However, my search results have become inaccurate.  Is
> this too large?  I tried bumping the maxFieldLength in the
> solrconfig.xml file to 500,000 and it hasn't fixed the problem.
>
>  
>
> Thanks,
>
> Mike
>
>
>   
How large is your largest token? There is a hard limit of (I think) 16383
chars.

-- 
- Mark

http://www.lucidimagination.com