Posted to solr-user@lucene.apache.org by Paul <pa...@nines.org> on 2011/03/23 17:05:20 UTC

Search failing for matched text in large field

I'm using solr 1.4.1.

I have a document that has a pretty big field. If I search for a
phrase that occurs near the start of that field, it works fine. If I
search for a phrase that appears even a little ways into the field, it
doesn't find it. Is there some limit to how far into a field solr will
search?

Here's the way I'm doing the search. All I'm changing is the text I'm
searching on to make it succeed or fail:

http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text

Or, if it is not related to how large the document is, what else could
it possibly be related to? Could there be some character in that field
that is stopping the search?

Re: Search failing for matched text in large field

Posted by Markus Jelsma <ma...@openindex.io>.
Enable TermVectors for the fields that you're going to highlight. If they are
disabled, Solr will re-analyze the field, killing performance.
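
As an illustration of the advice above, here is a minimal sketch that checks whether a field in schema.xml has term vectors switched on. The schema excerpt is hypothetical; only the field name "text" comes from the thread (hl.fl=text). It assumes the usual termVectors, termPositions and termOffsets field attributes, and that documents are reindexed after the schema is changed.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml excerpt; only the field name "text" is from the thread.
SCHEMA_XML = """
<schema name="example" version="1.2">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
    <field name="text" type="text" indexed="true" stored="true"
           termVectors="true" termPositions="true" termOffsets="true"/>
  </fields>
</schema>
"""

TERM_VECTOR_ATTRS = ("termVectors", "termPositions", "termOffsets")


def missing_term_vector_attrs(schema_xml, field_name):
    """Return the term-vector attributes that are not set to "true" on a field."""
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        if field.get("name") == field_name:
            return [a for a in TERM_VECTOR_ATTRS if field.get(a) != "true"]
    raise KeyError(field_name)


# An empty list means the field already stores full term vectors.
print(missing_term_vector_attrs(SCHEMA_XML, "text"))
print(missing_term_vector_attrs(SCHEMA_XML, "id"))
```

Whether this removes enough of the highlighting cost for a very large field would still have to be measured on the real index.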

> I looked into the search that I'm doing a little closer and it seems
> like the highlighting is slowing it down. If I do the query without
> requesting highlighting it is fast. (BTW, I also have faceting and
> pagination in my query. Faceting doesn't seem to change the response
> time much, adding &rows= and &start= does, but not prohibitively.)
> 
> The field in question needs to be stored=true, because it is needed
> for highlighting.
> 
> I'm thinking of doing this in two searches: first without highlighting
> and put a progress spinner next to each result, then do an ajax call
> to repeat the search with highlighting that can take its time to
> finish.
> 
> (I, too, have seen random really long response times that seem to be
> related to not enough RAM, but this isn't the problem because the
> results here are repeatable.)
> 
> On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott <sz...@zib.de> wrote:
> > On 23.03.2011 18:52, Paul wrote:
> >> I increased maxFieldLength and reindexed a small number of documents.
> >> That worked -- I got the correct results. In 3 minutes!
> > 
> > Did you mark the field in question as stored = false?
> > 
> > -Sascha
> > 
> >> I assume that if I reindex all my documents that all searches will
> >> become even slower. Is there any way to get all the results in a way
> >> that is quick enough that my user won't get bored waiting? Is there
> >> some optimization of this coming in solr 3.0?
> >> 
> >> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<sz...@zib.de>  wrote:
> >>> Hi Paul,
> >>> 
> >>> did you increase the value of the maxFieldLength parameter in your
> >>> solrconfig.xml?
> >>> 
> >>> -Sascha
> >>> 
> >>> On 23.03.2011 17:05, Paul wrote:
> >>>> I'm using solr 1.4.1.
> >>>> 
> >>>> I have a document that has a pretty big field. If I search for a
> >>>> phrase that occurs near the start of that field, it works fine. If I
> >>>> search for a phrase that appears even a little ways into the field, it
> >>>> doesn't find it. Is there some limit to how far into a field solr will
> >>>> search?
> >>>> 
> >>>> Here's the way I'm doing the search. All I'm changing is the text I'm
> >>>> searching on to make it succeed or fail:
> >>>> 
> >>>> 
> >>>> 
> >>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
> >>>> 
> >>>> Or, if it is not related to how large the document is, what else could
> >>>> it possibly be related to? Could there be some character in that field
> >>>> that is stopping the search?

Re: Search failing for matched text in large field

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Yeah, you aren't going to be able to do highlighting on a very large
field without terrible performance.  I believe it's just the nature of
the algorithm used by the highlighting component. I don't know of any
workaround, other than inventing a new highlighting algorithm and
writing a component for it.

Even with an AJAX call, you don't want to wait 3 minutes. Plus the load 
on your server.

On 3/23/2011 3:52 PM, Paul wrote:
> I looked into the search that I'm doing a little closer and it seems
> like the highlighting is slowing it down. If I do the query without
> requesting highlighting it is fast. (BTW, I also have faceting and
> pagination in my query. Faceting doesn't seem to change the response
> time much, adding &rows= and &start= does, but not prohibitively.)
>
> The field in question needs to be stored=true, because it is needed
> for highlighting.
>
> I'm thinking of doing this in two searches: first without highlighting
> and put a progress spinner next to each result, then do an ajax call
> to repeat the search with highlighting that can take its time to
> finish.
>
> (I, too, have seen random really long response times that seem to be
> related to not enough RAM, but this isn't the problem because the
> results here are repeatable.)
>
> On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott<sz...@zib.de>  wrote:
>> On 23.03.2011 18:52, Paul wrote:
>>> I increased maxFieldLength and reindexed a small number of documents.
>>> That worked -- I got the correct results. In 3 minutes!
>> Did you mark the field in question as stored = false?
>>
>> -Sascha
>>
>>> I assume that if I reindex all my documents that all searches will
>>> become even slower. Is there any way to get all the results in a way
>>> that is quick enough that my user won't get bored waiting? Is there
>>> some optimization of this coming in solr 3.0?
>>>
>>> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<sz...@zib.de>    wrote:
>>>> Hi Paul,
>>>>
>>>> did you increase the value of the maxFieldLength parameter in your
>>>> solrconfig.xml?
>>>>
>>>> -Sascha
>>>>
>>>> On 23.03.2011 17:05, Paul wrote:
>>>>> I'm using solr 1.4.1.
>>>>>
>>>>> I have a document that has a pretty big field. If I search for a
>>>>> phrase that occurs near the start of that field, it works fine. If I
>>>>> search for a phrase that appears even a little ways into the field, it
>>>>> doesn't find it. Is there some limit to how far into a field solr will
>>>>> search?
>>>>>
>>>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>>>> searching on to make it succeed or fail:
>>>>>
>>>>>
>>>>>
>>>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>>>
>>>>> Or, if it is not related to how large the document is, what else could
>>>>> it possibly be related to? Could there be some character in that field
>>>>> that is stopping the search?

Re: Search failing for matched text in large field

Posted by Paul <pa...@nines.org>.
I looked into the search that I'm doing a little closer and it seems
like the highlighting is slowing it down. If I do the query without
requesting highlighting it is fast. (BTW, I also have faceting and
pagination in my query. Faceting doesn't seem to change the response
time much, adding &rows= and &start= does, but not prohibitively.)

The field in question needs to be stored=true, because it is needed
for highlighting.

I'm thinking of doing this in two searches: first without highlighting
and put a progress spinner next to each result, then do an ajax call
to repeat the search with highlighting that can take its time to
finish.

(I, too, have seen random really long response times that seem to be
related to not enough RAM, but this isn't the problem because the
results here are repeatable.)
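
The two-search idea could be sketched roughly as follows: one URL for the fast query without highlighting, and a second one, to be fetched later by the ajax call, with highlighting turned on. The base URL and the hl/hl.fl parameters are the ones from the original post; the helper name and the rows/start defaults are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/my_core/select/"  # URL from the original post


def build_select_url(phrase, highlight, rows=10, start=0):
    """Build a select URL for a phrase query, with or without highlighting."""
    # Wrap the phrase in double quotes so it is searched as a phrase.
    params = [("q", '"%s"' % phrase), ("rows", rows), ("start", start)]
    if highlight:
        params += [("hl", "on"), ("hl.fl", "text")]
    return BASE_URL + "?" + urlencode(params)


# First request: quick result list. Second request: same query plus highlighting.
fast_url = build_select_url("search phrase", highlight=False)
slow_url = build_select_url("search phrase", highlight=True)
print(fast_url)
print(slow_url)
```

Both requests use the same q, rows and start, so the highlighted snippets from the second response can be matched to the results already shown from the first.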

On Wed, Mar 23, 2011 at 2:30 PM, Sascha Szott <sz...@zib.de> wrote:
> On 23.03.2011 18:52, Paul wrote:
>>
>> I increased maxFieldLength and reindexed a small number of documents.
>> That worked -- I got the correct results. In 3 minutes!
>
> Did you mark the field in question as stored = false?
>
> -Sascha
>
>>
>> I assume that if I reindex all my documents that all searches will
>> become even slower. Is there any way to get all the results in a way
>> that is quick enough that my user won't get bored waiting? Is there
>> some optimization of this coming in solr 3.0?
>>
>> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<sz...@zib.de>  wrote:
>>>
>>> Hi Paul,
>>>
>>> did you increase the value of the maxFieldLength parameter in your
>>> solrconfig.xml?
>>>
>>> -Sascha
>>>
>>> On 23.03.2011 17:05, Paul wrote:
>>>>
>>>> I'm using solr 1.4.1.
>>>>
>>>> I have a document that has a pretty big field. If I search for a
>>>> phrase that occurs near the start of that field, it works fine. If I
>>>> search for a phrase that appears even a little ways into the field, it
>>>> doesn't find it. Is there some limit to how far into a field solr will
>>>> search?
>>>>
>>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>>> searching on to make it succeed or fail:
>>>>
>>>>
>>>>
>>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>>
>>>> Or, if it is not related to how large the document is, what else could
>>>> it possibly be related to? Could there be some character in that field
>>>> that is stopping the search?
>>>
>

Storing Nested Fields

Posted by "Sethi, Parampreet" <pa...@teamaol.com>.
Hi All,

This is regarding nested array functionality. My requirements are:
1. Store the category and sub-category associated with each word in Solr.
2. Allow each word to be listed under multiple categories (and thus
sub-categories).
3. Query based on category or sub-category.

One way is to have two separate array fields in Solr and make sure that
field category[0] is the super-category of field sub-category[0].
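
A small sketch of that parallel-array layout, and of one pitfall it may have: a query against two multi-valued fields does not know which positions belong together. The field names, the sample word, and the combined "category/sub-category" field are all hypothetical and shown only as one possible alternative.

```python
def to_paths(pairs):
    """Turn (category, sub_category) pairs into single 'category/sub_category' values."""
    return ["%s/%s" % (cat, sub) for cat, sub in pairs]


# Hypothetical document: category[i] is the super-category of sub_category[i].
doc = {
    "word": "jaguar",
    "category": ["animals", "cars"],
    "sub_category": ["cats", "luxury"],
}
doc["category_path"] = to_paths(zip(doc["category"], doc["sub_category"]))


def matches_parallel(doc, cat, sub):
    # Mimics querying the two multi-valued fields independently: positions are ignored.
    return cat in doc["category"] and sub in doc["sub_category"]


def matches_path(doc, cat, sub):
    # Mimics querying the combined field, where each value keeps the pair together.
    return "%s/%s" % (cat, sub) in doc["category_path"]


print(matches_parallel(doc, "animals", "luxury"))  # pairs a category with the wrong sub-category
print(matches_path(doc, "animals", "luxury"))
print(matches_path(doc, "cars", "luxury"))
```

Keeping the two original fields as well would still allow queries on category alone or sub-category alone.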

Has anyone encountered a similar problem in Solr? Any suggestions would be
great.

Thanks
Param


Re: Search failing for matched text in large field

Posted by Sascha Szott <sz...@zib.de>.
On 23.03.2011 18:52, Paul wrote:
> I increased maxFieldLength and reindexed a small number of documents.
> That worked -- I got the correct results. In 3 minutes!
Did you mark the field in question as stored = false?

-Sascha

>
> I assume that if I reindex all my documents that all searches will
> become even slower. Is there any way to get all the results in a way
> that is quick enough that my user won't get bored waiting? Is there
> some optimization of this coming in solr 3.0?
>
> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<sz...@zib.de>  wrote:
>> Hi Paul,
>>
>> did you increase the value of the maxFieldLength parameter in your
>> solrconfig.xml?
>>
>> -Sascha
>>
>> On 23.03.2011 17:05, Paul wrote:
>>>
>>> I'm using solr 1.4.1.
>>>
>>> I have a document that has a pretty big field. If I search for a
>>> phrase that occurs near the start of that field, it works fine. If I
>>> search for a phrase that appears even a little ways into the field, it
>>> doesn't find it. Is there some limit to how far into a field solr will
>>> search?
>>>
>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>> searching on to make it succeed or fail:
>>>
>>>
>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>
>>> Or, if it is not related to how large the document is, what else could
>>> it possibly be related to? Could there be some character in that field
>>> that is stopping the search?
>>

Re: Search failing for matched text in large field

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Hmm, there's no reason it should take anywhere close to 3 minutes to get a
result from a simple search, even with very large documents/term lists,
especially if you're really JUST doing a simple search and aren't using
faceting, the statistics component, highlighting, etc. at this point. (If
you ARE using highlighting, that could be the culprit.)

You might need more RAM allocated to the Solr JVM.  For reasons I can't 
explain myself, I sometimes get pathologically slow search results when 
I don't have enough RAM, even though there aren't any errors in my logs 
or anything -- which adding more RAM fixes.

It's also possible (just taking random guesses, I am not familiar with
this part of Solr internals) that if you increased the maxFieldLength
on an existing index, but only reindexed SOME of the documents in that
index, then Solr is getting confused about your index. I don't know
if Solr can handle changing the maxFieldLength on an existing index
without re-indexing all docs.

Also, if you tell us HOW large you made maxFieldLength, someone (not me)
might be able to say whether it's so large that it could create some
other kind of problem.

On 3/23/2011 1:52 PM, Paul wrote:
> I increased maxFieldLength and reindexed a small number of documents.
> That worked -- I got the correct results. In 3 minutes!
>
> I assume that if I reindex all my documents that all searches will
> become even slower. Is there any way to get all the results in a way
> that is quick enough that my user won't get bored waiting? Is there
> some optimization of this coming in solr 3.0?
>
> On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott<sz...@zib.de>  wrote:
>> Hi Paul,
>>
>> did you increase the value of the maxFieldLength parameter in your
>> solrconfig.xml?
>>
>> -Sascha
>>
>> On 23.03.2011 17:05, Paul wrote:
>>> I'm using solr 1.4.1.
>>>
>>> I have a document that has a pretty big field. If I search for a
>>> phrase that occurs near the start of that field, it works fine. If I
>>> search for a phrase that appears even a little ways into the field, it
>>> doesn't find it. Is there some limit to how far into a field solr will
>>> search?
>>>
>>> Here's the way I'm doing the search. All I'm changing is the text I'm
>>> searching on to make it succeed or fail:
>>>
>>>
>>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>>
>>> Or, if it is not related to how large the document is, what else could
>>> it possibly be related to? Could there be some character in that field
>>> that is stopping the search?

Re: Search failing for matched text in large field

Posted by Paul <pa...@nines.org>.
I increased maxFieldLength and reindexed a small number of documents.
That worked -- I got the correct results. In 3 minutes!

I assume that if I reindex all my documents that all searches will
become even slower. Is there any way to get all the results in a way
that is quick enough that my user won't get bored waiting? Is there
some optimization of this coming in solr 3.0?

On Wed, Mar 23, 2011 at 12:15 PM, Sascha Szott <sz...@zib.de> wrote:
> Hi Paul,
>
> did you increase the value of the maxFieldLength parameter in your
> solrconfig.xml?
>
> -Sascha
>
> On 23.03.2011 17:05, Paul wrote:
>>
>> I'm using solr 1.4.1.
>>
>> I have a document that has a pretty big field. If I search for a
>> phrase that occurs near the start of that field, it works fine. If I
>> search for a phrase that appears even a little ways into the field, it
>> doesn't find it. Is there some limit to how far into a field solr will
>> search?
>>
>> Here's the way I'm doing the search. All I'm changing is the text I'm
>> searching on to make it succeed or fail:
>>
>>
>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>
>> Or, if it is not related to how large the document is, what else could
>> it possibly be related to? Could there be some character in that field
>> that is stopping the search?
>

Re: Search failing for matched text in large field

Posted by Paul <pa...@nines.org>.
Ah, no, I'll try that now.

What is the disadvantage of setting that to a really large number?

I do want the search to work for every word I give to solr. Otherwise
I wouldn't have indexed it to begin with.

On Wed, Mar 23, 2011 at 11:15 AM, Sascha Szott <sz...@zib.de> wrote:
> Hi Paul,
>
> did you increase the value of the maxFieldLength parameter in your
> solrconfig.xml?
>
> -Sascha
>
> On 23.03.2011 17:05, Paul wrote:
>>
>> I'm using solr 1.4.1.
>>
>> I have a document that has a pretty big field. If I search for a
>> phrase that occurs near the start of that field, it works fine. If I
>> search for a phrase that appears even a little ways into the field, it
>> doesn't find it. Is there some limit to how far into a field solr will
>> search?
>>
>> Here's the way I'm doing the search. All I'm changing is the text I'm
>> searching on to make it succeed or fail:
>>
>>
>> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>>
>> Or, if it is not related to how large the document is, what else could
>> it possibly be related to? Could there be some character in that field
>> that is stopping the search?
>

Re: Search failing for matched text in large field

Posted by Sascha Szott <sz...@zib.de>.
Hi Paul,

did you increase the value of the maxFieldLength parameter in your 
solrconfig.xml?

-Sascha
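
To show why this setting would produce exactly the reported symptom, here is a toy model, not Solr code. maxFieldLength caps the number of tokens indexed per field (the example solrconfig.xml shipped with Solr 1.4 sets it to 10000, as far as I know), so a phrase beyond the cap never reaches the index. The whitespace tokenizer and the tiny limit of 10 are assumptions chosen to keep the sketch short.

```python
def index_tokens(text, max_field_length):
    """Keep only the first max_field_length whitespace-separated tokens."""
    return text.lower().split()[:max_field_length]


def phrase_matches(indexed_tokens, phrase):
    """Return True if the words of phrase occur consecutively in indexed_tokens."""
    words = phrase.lower().split()
    n = len(words)
    return any(indexed_tokens[i:i + n] == words
               for i in range(len(indexed_tokens) - n + 1))


# 24 tokens in total: one phrase at the start, one at the very end.
document = "early phrase " + "filler " * 20 + "late phrase"

truncated = index_tokens(document, max_field_length=10)
print(phrase_matches(truncated, "early phrase"))   # phrase lies within the first 10 tokens
print(phrase_matches(truncated, "late phrase"))    # phrase lies beyond the cutoff
print(phrase_matches(index_tokens(document, 100), "late phrase"))
```

Raising the limit only helps documents indexed afterwards, so existing documents need to be reindexed.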

On 23.03.2011 17:05, Paul wrote:
> I'm using solr 1.4.1.
>
> I have a document that has a pretty big field. If I search for a
> phrase that occurs near the start of that field, it works fine. If I
> search for a phrase that appears even a little ways into the field, it
> doesn't find it. Is there some limit to how far into a field solr will
> search?
>
> Here's the way I'm doing the search. All I'm changing is the text I'm
> searching on to make it succeed or fail:
>
> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>
> Or, if it is not related to how large the document is, what else could
> it possibly be related to? Could there be some character in that field
> that is stopping the search?

Re: Search failing for matched text in large field

Posted by Jonathan Rochkind <ro...@jhu.edu>.
How large?

But rather than think about if there's something in the "searching" 
that's not working, the first step might be to make sure that everything 
in the _indexing_ is working -- that your field is actually being 
indexed as you intend.

I forget the best way to view what's in your index -- the Luke request 
handler in the Solr admin maybe?
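
For example, a request to the Luke request handler could be built like this. It is a sketch that assumes the handler is registered at /admin/luke under the core (as in the example configuration) and that it accepts the fl and numTerms parameters; the helper name is hypothetical and the core URL is taken from the original post.

```python
from urllib.parse import urlencode


def build_luke_url(core_url, field, num_terms=20):
    """Build a Luke request handler URL that reports index details for one field."""
    params = [("fl", field), ("numTerms", num_terms), ("wt", "json")]
    return core_url.rstrip("/") + "/admin/luke?" + urlencode(params)


luke_url = build_luke_url("http://localhost:8983/solr/my_core", "text")
print(luke_url)
```

The response lists the top terms for the field, which gives a first indication of whether the text was indexed as intended.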

On 3/23/2011 12:05 PM, Paul wrote:
> I'm using solr 1.4.1.
>
> I have a document that has a pretty big field. If I search for a
> phrase that occurs near the start of that field, it works fine. If I
> search for a phrase that appears even a little ways into the field, it
> doesn't find it. Is there some limit to how far into a field solr will
> search?
>
> Here's the way I'm doing the search. All I'm changing is the text I'm
> searching on to make it succeed or fail:
>
> http://localhost:8983/solr/my_core/select/?q=%22search+phrase%22&hl=on&hl.fl=text
>
> Or, if it is not related to how large the document is, what else could
> it possibly be related to? Could there be some character in that field
> that is stopping the search?
>