Posted to solr-user@lucene.apache.org by Rahul Mehta <ra...@gmail.com> on 2011/11/24 10:56:14 UTC

highlighting on range query

Hello,

I want to get the results of a range query with highlighting.

For example, this query
http://localhost:8983/solr/select?q=field1:[5000%20TO%206000]&fl=field2&hl=on&rows=5&wt=json&indent=on&hl.fl=field3

does not return anything in the highlighting section.

Please suggest how I can get highlighted results.

-- 
Thanks & Regards

Rahul Mehta

RE: highlighting performance poor with *.tar, *.gz files

Posted by Shyam Bhaskaran <Sh...@synopsys.com>.
Hi Erick,

Thanks for the response.

I am already using termVectors with offsets & positions enabled as shown below.


<field name="attachment_bodies"  type="text_rev"    indexed="true"  stored="true"  multiValued="true" termVectors="true" termPositions="true" termOffsets="true" />


I am indexing FAQ content. Some of these FAQs have attachments linked to them, such as PDF, DOC, *.TAR and *.GZIP files, that contain additional information related to the FAQ, and all of this content is indexed. When searching with highlighting, search performance degrades for archive files like *.gz, *.tar and *.zip, and the debug flag shows that highlighting takes longer for these files.

What could be the reason behind it? Is it because these files are unzipped and then highlighted from the index at display time?

Does highlighting depend on file size? That is, if the file is larger, does search performance degrade because of the highlighting?

I have tried reducing the maxAnalyzedChars value from 5 MB to 1 MB but still do not see any significant improvement in search and highlighting for these kinds of files.

Let me know if you can suggest any workaround for improving highlighting and search performance for these kinds of files, or for large files in general.


Thanks
Shyam


Re: highlighting performance poor with *.tar, *.gz files

Posted by Erick Erickson <er...@gmail.com>.
Highlighting is dependent on the size of the
data being fed through the highlighter. Unless you have
termVectors & offsets & positions enabled, the text
must be re-analyzed, see:
http://wiki.apache.org/solr/FieldOptionsByUseCase?highlight=%28termvector%29%7C%28retrieve%29%7C%28contents%29
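
To make the point above concrete, here is a minimal sketch that checks whether a schema field definition carries the attributes mentioned in this thread. The field XML is the attachment_bodies definition quoted elsewhere in the thread; the helper name missing_highlight_attrs and the REQUIRED list are hypothetical, chosen for illustration, and are not part of Solr.

```python
import xml.etree.ElementTree as ET

# Field definition as quoted in this thread (whitespace normalized).
FIELD_XML = (
    '<field name="attachment_bodies" type="text_rev" indexed="true" '
    'stored="true" multiValued="true" termVectors="true" '
    'termPositions="true" termOffsets="true" />'
)

# Assumption: the attributes this thread names as relevant to highlighting.
REQUIRED = ("stored", "termVectors", "termPositions", "termOffsets")


def missing_highlight_attrs(field_xml):
    """Return the attributes from REQUIRED that are not set to "true"."""
    field = ET.fromstring(field_xml)
    return [a for a in REQUIRED if field.get(a, "false").lower() != "true"]


missing = missing_highlight_attrs(FIELD_XML)
print(missing or "all highlighting-related attributes are set")
```

An empty result only says the field is configured as described; it says nothing about how long highlighting takes on large stored values.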

But highlighting compressed files seems like an odd
use-case, what is the business reason you need to do this?

Best
Erick

On Thu, Nov 24, 2011 at 10:28 AM, Shyam Bhaskaran
<Sh...@synopsys.com> wrote:
> Hi,
>
> It is observed that highlighting of search results is taking too much time especially for highlighting terms for archived files like *.gz, *.tar, *.zip.
> What could be the reason behind it ? Is it because these files are unzipped and then highlighted from the index during display time ?
> Or is it dependent on the size of the file ? Is there any way by which the search & highlighter performance improves for these kind of archived files (*.tar, *.zip etc)
>
> Let me know if there is any workaround for improving the highlighting and search performance for these kind of files?
>
> -Shyam
>

highlighting performance poor with *.tar, *.gz files

Posted by Shyam Bhaskaran <Sh...@synopsys.com>.
Hi,

It is observed that highlighting of search results takes too much time, especially when highlighting terms in archive files like *.gz, *.tar, *.zip.
What could be the reason behind it? Is it because these files are unzipped and then highlighted from the index at display time?
Or does it depend on the size of the file? Is there any way to improve search and highlighter performance for these kinds of archive files (*.tar, *.zip, etc.)?

Let me know if there is any workaround for improving highlighting and search performance for these kinds of files.

-Shyam

Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
I tried the URL below and got the same output. Any other suggestions?

http://localhost:8983/solr/select?q=rangefld:[5000%20TO%206000]&fl=lily.id,rangefld&hl=on&rows=5&wt=json&indent=on&hl.fl=rangefld&hl.highlightMultiTerm=true&hl.usePhraseHighlighter=true&hl.useFastVectorHighlighter=false

On Mon, Nov 28, 2011 at 8:10 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > and output is
> >
> > {
> >   "responseHeader":{
> >     "status":0,
> >     "QTime":4,
> >     "params":{
> >       "hl.highlightMultiTerm":"true",
> >       "fl":"lily.id,rangefld",
> >       "indent":"on",
> >
> > "hl.useFastVectorHighlighter":"false",
> >        "q":"rangefld:[5000 TO
> > 6000]",
> >       "hl.fl":"*,rangefld",
>
> I don't think hl.fl parameter accepts * value. Please try &hl.fl=rangefld
>
>
>


-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Ahmet Arslan <io...@yahoo.com>.
> and output is
> 
> {
>   "responseHeader":{
>     "status":0,
>     "QTime":4,
>     "params":{
>       "hl.highlightMultiTerm":"true",
>       "fl":"lily.id,rangefld",
>       "indent":"on",
>      
> "hl.useFastVectorHighlighter":"false",
>        "q":"rangefld:[5000 TO
> 6000]",
>       "hl.fl":"*,rangefld",

I don't think the hl.fl parameter accepts a * value. Please try &hl.fl=rangefld



Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
I tried this URL:

http://localhost:8983/solr/select?q=rangefld:[5000%20TO%206000]&fl=lily.id,rangefld&hl=on&rows=5&wt=json&indent=on&hl.fl=*,rangefld&hl.highlightMultiTerm=true&hl.usePhraseHighlighter=true&hl.useFastVectorHighlighter=false

and the output is

{
  "responseHeader":{
    "status":0,
    "QTime":4,
    "params":{
      "hl.highlightMultiTerm":"true",
      "fl":"lily.id,rangefld",
      "indent":"on",
      "hl.useFastVectorHighlighter":"false",
       "q":"rangefld:[5000 TO 6000]",
      "hl.fl":"*,rangefld",
      "wt":"json",
      "hl.usePhraseHighlighter":"true",
      "hl":"on",
      "rows":"5"}},
  "response":{"numFound":64,"start":0,"docs":[
      {
        "lily.id":"UUID.c5f00cd3-343a-47c1-ab16-ace104b2540f",
        "rangefld":5948},
      {
        "lily.id":"UUID.ed69ece0-1b24-4829-afb6-22eb242939f2",
        "rangefld":5749},
      {
        "lily.id":"UUID.afa0c654-2f26-4c5b-9fda-8b51c5ec080d",
        "rangefld":5739},
      {
        "lily.id":"UUID.d92b405d-f41e-4c85-9014-1b89a986ec42",
        "rangefld":5783},
      {
        "lily.id":"UUID.102adde5-cbff-4ca6-acb1-426bb14fb579",
        "rangefld":5753}]
  },
  "highlighting":{
    "UUID.c5f00cd3-343a-47c1-ab16-ace104b2540f":{},
    "UUID.ed69ece0-1b24-4829-afb6-22eb242939f2":{},
    "UUID.afa0c654-2f26-4c5b-9fda-8b51c5ec080d":{},
    "UUID.d92b405d-f41e-4c85-9014-1b89a986ec42":{},
    "UUID.102adde5-cbff-4ca6-acb1-426bb14fb579":{}}}

Why is rangefld not showing up in the highlighting result?

On Mon, Nov 28, 2011 at 12:47 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > Any other Suggestion. as these
> > suggestions are not working.
>
> Could it be that you are using FastVectorHighlighter? What happens when
> you add &hl.useFastVectorHighlighter=false to your search URL?
>



-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Ahmet Arslan <io...@yahoo.com>.
> Any other Suggestion. as these
> suggestions are not working.

Could it be that you are using FastVectorHighlighter? What happens when you add &hl.useFastVectorHighlighter=false to your search URL?

Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
Any other suggestions? These suggestions are not working.

On Thu, Nov 24, 2011 at 5:44 PM, Rahul Mehta <ra...@gmail.com>wrote:

> Any other Suggestion.
>
>
> On Thu, Nov 24, 2011 at 5:30 PM, Rahul Mehta <ra...@gmail.com>wrote:
>
>> Yes, I tried with specifiying hl.fl=field1, and field1 is indexed and
>> stored.
>>
>>
>> On Thu, Nov 24, 2011 at 5:23 PM, Ahmet Arslan <io...@yahoo.com> wrote:
>>
>>> > oh sorry forgot to tell you that i
>>> > added &hl.usePhraseHighlighter=true this
>>> > also , but still no result is coming .
>>>
>>> Did you specify field1 in hl.fl parameter?
>>>
>>> Plus you need you mark field1 as indexed="true" and stored="true" to
>>> enable highlighting.
>>>
>>> http://wiki.apache.org/solr/FieldOptionsByUseCase
>>>
>>>
>>
>>
>> --
>> Thanks & Regards
>>
>> Rahul Mehta
>>
>>
>>
>>
>
>
> --
> Thanks & Regards
>
> Rahul Mehta
>
>
>
>


-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
Any other suggestions?

On Thu, Nov 24, 2011 at 5:30 PM, Rahul Mehta <ra...@gmail.com>wrote:

> Yes, I tried with specifiying hl.fl=field1, and field1 is indexed and
> stored.
>
>
> On Thu, Nov 24, 2011 at 5:23 PM, Ahmet Arslan <io...@yahoo.com> wrote:
>
>> > oh sorry forgot to tell you that i
>> > added &hl.usePhraseHighlighter=true this
>> > also , but still no result is coming .
>>
>> Did you specify field1 in hl.fl parameter?
>>
>> Plus you need you mark field1 as indexed="true" and stored="true" to
>> enable highlighting.
>>
>> http://wiki.apache.org/solr/FieldOptionsByUseCase
>>
>>
>
>
> --
> Thanks & Regards
>
> Rahul Mehta
>
>
>
>


-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
Yes, I tried specifying hl.fl=field1, and field1 is indexed and stored.


On Thu, Nov 24, 2011 at 5:23 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > oh sorry forgot to tell you that i
> > added &hl.usePhraseHighlighter=true this
> > also , but still no result is coming .
>
> Did you specify field1 in hl.fl parameter?
>
> Plus you need you mark field1 as indexed="true" and stored="true" to
> enable highlighting.
>
> http://wiki.apache.org/solr/FieldOptionsByUseCase
>
>


-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Ahmet Arslan <io...@yahoo.com>.
> oh sorry forgot to tell you that i
> added &hl.usePhraseHighlighter=true this
> also , but still no result is coming .

Did you specify field1 in hl.fl parameter?

Plus, you need to mark field1 as indexed="true" and stored="true" to enable highlighting.

http://wiki.apache.org/solr/FieldOptionsByUseCase


Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
Oh sorry, I forgot to tell you that I also added &hl.usePhraseHighlighter=true,
but still no result is coming.

On Thu, Nov 24, 2011 at 5:14 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > I passed &hl.highlightMultiTerm=true in request ,* but
> > still field1 is not
> > coming in hightlighting.*
> >
> >
> http://localhsot:8983/solr/select?q=field1:[5000%20TO%206000]&fl=field2&hl=on&rows=5&wt=json&indent=on&hl.fl=field3&hl.highlightMultiTerm=true
> >
>
> As wiki says "If the SpanScorer is also being used..." which means you
> need to add &hl.usePhraseHighlighter=true too.
>



-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Ahmet Arslan <io...@yahoo.com>.
> I passed &hl.highlightMultiTerm=true in request ,* but
> still field1 is not
> coming in hightlighting.*
> 
> http://localhsot:8983/solr/select?q=field1:[5000%20TO%206000]&fl=field2&hl=on&rows=5&wt=json&indent=on&hl.fl=field3&hl.highlightMultiTerm=true
> 

As the wiki says, "If the SpanScorer is also being used...", which means you need to add &hl.usePhraseHighlighter=true too.
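
For reference, a minimal sketch of assembling such a request with the parameters discussed in this thread (hl.fl, hl.highlightMultiTerm, hl.usePhraseHighlighter). The helper name build_highlight_url is hypothetical, and the host, handler path and field name are taken from the URLs posted here. Note that hl.fl is set to the same field as the range query. Whether a numeric range field then yields snippets is not settled in this thread.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_highlight_url(base_url, field, low, high, rows=5):
    """Build a Solr select URL for a range query with highlighting enabled."""
    params = [
        ("q", f"{field}:[{low} TO {high}]"),
        ("fl", field),
        ("rows", rows),
        ("wt", "json"),
        ("indent", "on"),
        ("hl", "on"),
        ("hl.fl", field),                     # highlight the queried field
        ("hl.highlightMultiTerm", "true"),    # range queries are multi-term
        ("hl.usePhraseHighlighter", "true"),  # needed alongside the above
    ]
    # urlencode takes care of escaping the brackets and spaces in the query.
    return base_url + "?" + urlencode(params)


url = build_highlight_url("http://localhost:8983/solr/select", "field1", 5000, 6000)
parsed = parse_qs(urlsplit(url).query)  # decode again to inspect the parameters
print(url)
```

Building the query string with urlencode avoids hand-escaping the brackets and spaces, which is easy to get wrong when editing the URL by hand.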

Re: highlighting on range query

Posted by Rahul Mehta <ra...@gmail.com>.
Hi Ahmet,

I passed &hl.highlightMultiTerm=true in the request, but field1 is still not
showing up in the highlighting.

http://localhost:8983/solr/select?q=field1:[5000%20TO%206000]&fl=field2&hl=on&rows=5&wt=json&indent=on&hl.fl=field3&hl.highlightMultiTerm=true

I am using Solr 3.1.

Do I need to install the patch, or is there anything else I need to do?

On Thu, Nov 24, 2011 at 3:36 PM, Ahmet Arslan <io...@yahoo.com> wrote:

> > I want to have result of a range query with highlighted
> > Result.
>
> http://wiki.apache.org/solr/HighlightingParameters#hl.highlightMultiTerm
>



-- 
Thanks & Regards

Rahul Mehta

Re: highlighting on range query

Posted by Ahmet Arslan <io...@yahoo.com>.
> I want to have result of a range query with highlighted
> Result.

http://wiki.apache.org/solr/HighlightingParameters#hl.highlightMultiTerm