Posted to solr-user@lucene.apache.org by Mark Wilson <mw...@sanger.ac.uk> on 2013/03/01 15:35:23 UTC
Re: Solr 3.6.1 Query large field
Hi Otis
Thanks for the info. I tried two different approaches, and both seem to work okay.
First, I added
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
to the <indexConfig> section of solrconfig.xml.
Second, I tried adding the same
<filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
to the <fieldType><analyzer type="index"> section of schema.xml.
Both ways work ok.
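A minimal sketch of the schema.xml variant might look like the following. The fieldType name "text_long", the tokenizer, and the lowercase filter are illustrative assumptions, not taken from the actual configuration; only the LimitTokenCountFilterFactory line comes from this thread.

```xml
<!-- Hypothetical field type: the name and the tokenizer/filter chain are assumptions -->
<fieldType name="text_long" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Caps the number of tokens indexed per field value;
         set maxTokenCount above the largest document you expect -->
    <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="100000"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Since this changes index-time analysis, existing documents need to be reindexed before the tokens past the old limit become searchable.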
Cheers Mark
On 28/02/2013 08:05, "Otis Gospodnetic" <ot...@gmail.com> wrote:
> Mark,
>
> Look at
> http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/solrconfig.xml:
>
> <indexConfig>
> <!-- maxFieldLength was removed in 4.0. To get similar behavior, include a
> LimitTokenCountFilterFactory in your fieldType definition. E.g.
> <filter class="solr.LimitTokenCountFilterFactory" maxTokenCount="10000"/>
> -->
>
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
>
> On Wed, Feb 27, 2013 at 11:08 AM, Mark Wilson <mw...@sanger.ac.uk> wrote:
>
>> Hi
>>
>> I am using Nutch to crawl a site and post it to Solr 3.6.1. The page is
>> very large.
>>
>> When I query the index, using the Solr Admin query page, it only finds the
>> result if it is in the top X% of the page, probably about 30%.
>>
>> The page is about 79 KB, and consists of 19,067 words.
>>
>> Is there a setting somewhere that sets the maxFieldSize? Or maxTokenSize?
>>
>> I set the content field to be displayed on the result page, and it displays
>> all the data correctly, so I can see the tokens that my queries return no
>> results for.
>>
>> I can't split the page up, as it is auto-generated from a database.
>>
>> Any help gratefully received.
>>
>> Thanks Mark
>>
>>
>>
>> --
>> The Wellcome Trust Sanger Institute is operated by Genome Research
>> Limited, a charity registered in England with number 1021457 and a
>> company registered in England with number 2742969, whose registered
>> office is 215 Euston Road, London, NW1 2BE.
>>