
Posted to solr-user@lucene.apache.org by Chuck Mysak <ch...@gmail.com> on 2009/11/13 10:21:40 UTC

highlighting issue lst.name is a leaf node

Hello list,

I'm new to Solr, but from what I've tried so far, it's awesome.
I have a small issue with the highlighting feature.

The query finds the documents (I can see that in the query analyzer), but the
highlighting list contains only empty entries:

<lst name="highlighting">
<lst name="c:\0596520107.pdf"/>
<lst name="c:\0470511389.pdf"/>
</lst>

(The files were added using
ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
and I set "literal.id" to the file name.)

The request handler in my solrconfig.xml looks like this:

  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- default values for query parameters -->
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <!--
       <int name="rows">10</int>
       <str name="fl">*</str>
       <str name="version">2.1</str>
        -->
       <bool name="hl">true</bool>
       <int name="hl.snippets">3</int>
       <int name="hl.fragsize">30</int>
       <str name="hl.simple.pre"><![CDATA[<span>]]></str>
       <str name="hl.simple.post"><![CDATA[</span>]]></str>
       <str name="hl.fl">*</str>
       <bool name="hl.requireFieldMatch">true</bool>
       <float name="hl.regex.slop">0.5</float>
       <str name="hl.regex.pattern">[-\w ,/\n\"']{20,200}</str>
       <bool name="hl.usePhraseHighlighter">true</bool>
     </lst>
  </requestHandler>

The schema.xml is untouched; I downloaded it yesterday with the latest stable
build.

At first I thought it had something to do with the extraction of the PDF,
but I also tried the demo XML docs and got the same result.

I'm new to this, so please help.

Thank you,

Chuck

Re: highlighting issue lst.name is a leaf node

Posted by Chuck Mysak <ch...@gmail.com>.

I found the solution.
If somebody runs into the same problem, here is how I solved it.

- while uploading the document:

            req.setParam("uprefix", "attr_");
            req.setParam("fmap.content", "attr_content");
            req.setParam("overwrite", "true");
            req.setParam("commit", "true");

- in the query, search the mapped field:

            http://localhost:8983/solr/select?q=attr_content:%22Django%22&rows=4

- in solrconfig.xml, set this in the request handler params:

       <str name="fl">id,title</str>

  so that the response does not include the whole text content.
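
For anyone who wants to try the same parameters without SolrJ, here is a minimal sketch that only builds the two request URLs with the Java standard library. The class and method names (SolrHighlightUrls, extractParams, selectParams, toUrl) are made up for illustration, and the base URLs assume the default example server at localhost:8983. As far as I understand it, and this is an assumption about the example schema rather than something stated above, the fix works because the highlighter needs a stored field, and the attr_* dynamic fields are stored while the default target for extracted content is not.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: it only assembles URLs, it does not talk to Solr.
class SolrHighlightUrls {

    // Parameters for /update/extract, as listed in the post.
    // literal.id carries the file name, as in the original question.
    static Map<String, String> extractParams(String fileName) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("literal.id", fileName);
        p.put("uprefix", "attr_");
        p.put("fmap.content", "attr_content");
        p.put("overwrite", "true");
        p.put("commit", "true");
        return p;
    }

    // Phrase query against the mapped field, as in the post's select URL.
    static Map<String, String> selectParams(String term, int rows) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "attr_content:\"" + term + "\"");
        p.put("rows", String.valueOf(rows));
        return p;
    }

    // Appends the parameters to the base URL, form-encoding keys and values.
    static String toUrl(String base, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(base);
        char sep = '?';
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                sb.append(sep)
                  .append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
                sep = '&';
            }
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(ex);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUrl("http://localhost:8983/solr/update/extract",
                extractParams("c:\\0596520107.pdf")));
        System.out.println(toUrl("http://localhost:8983/solr/select",
                selectParams("Django", 4)));
    }
}
```

The encoder also escapes the colon in the query (%3A), which is equivalent to the hand-written URL above. The file itself still has to be sent as the body of the extract request, which this sketch leaves out.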

Regards,
Chuck
