Posted to solr-user@lucene.apache.org by Pierre-Yves LANDRON <pl...@hotmail.com> on 2007/03/29 08:53:58 UTC

Snippets of indexed text

Hello everybody!

I am wondering if there is a way to get relevant snippets (the searched terms
in context) of indexed text in a Solr response to a query, instead of
just the entire indexed field? (More broadly, what are the options for
having Solr format the answer, e.g. highlighting terms?)

Thanks,
Kind regards,
P-Y Landron



Re: Snippets of indexed text

Posted by Thierry Collogne <th...@gmail.com>.
Glad I could help you.

On 02/04/07, Pierre-Yves LANDRON <pl...@hotmail.com> wrote:
>
> Yes, it works just fine, and it's great. :)
>
> Thanks Thierry: you were right, I wasn't looking at the right tag in the
> response.
> (My problem with the facet parameters is still unresolved, but I will work on
> that later.)
>
> The more I use Solr, the more I'm glad I chose this way of working
> with Lucene.
>
> Kind regards,
> P-Yves Landron
>

Re: Snippets of indexed text

Posted by Pierre-Yves LANDRON <pl...@hotmail.com>.
>And this is the part for the highlighted text :
>
><lst name="highlighting">
><lst name="col_36863_NL">
>  <arr name="content">
>    <str></str>
>  </arr>
></lst>
></lst>
>

Yes, it works just fine, and it's great. :)

Thanks Thierry: you were right, I wasn't looking at the right tag in the
response.
(My problem with the facet parameters is still unresolved, but I will work on
that later.)

The more I use Solr, the more I'm glad I chose this way of working
with Lucene.

Kind Regards...
P-Yves Landron



Re: Snippets of indexed text

Posted by Thierry Collogne <th...@gmail.com>.
I can't see anything wrong. But perhaps you are looking at the wrong part of
the response. It is the same case with facets.
You need to look further down in the XML response. Here I asked Solr to
highlight the field "content" and I used a facet called "type".

This is a sample of an XML response in our application:

<?xml version="1.0" encoding="UTF-8"?>
<response>

<lst name="responseHeader">
 <int name="status">0</int>
 <int name="QTime">5</int>
 <lst name="params">
  <str name="rows">10</str>
  <str name="start">0</str>

  <str name="facet">true</str>
  <str name="q">stamp AND site:3</str>
  <str name="version">2.2</str>
  <str name="hl.fl">content</str>
  <str name="facet.field">type</str>
  <str name="indent">on</str>

  <str name="hl">true</str>
 </lst>
</lst>
<result name="response" numFound="1" start="0">
 <doc>
  <str name="id">col_36863_NL</str>
  <str name="authorisation">ALL</str>
  <str name="content"></str>
  <str name="type">HR</str>
 </doc>
</result>
<lst name="facet_counts">
 <lst name="facet_queries"/>
 <lst name="facet_fields">
  <lst name="type">
    <int name="HR">1</int>
  </lst>
 </lst>
</lst>
<lst name="highlighting">
 <lst name="col_36863_NL">
  <arr name="content">
    <str></str>
  </arr>
 </lst>
</lst>
</response>


If you look at the end you see the following for facets

<lst name="facet_counts">
 <lst name="facet_queries"/>
 <lst name="facet_fields">
  <lst name="type">
    <int name="HR">1</int>
  </lst>
 </lst>
</lst>


And this is the part for the highlighted text :

<lst name="highlighting">
 <lst name="col_36863_NL">
  <arr name="content">
    <str></str>
  </arr>
 </lst>
</lst>

I hope this helps a bit. By the way, if you are using Java, it may be good
to check out the Java client here:

   http://issues.apache.org/jira/browse/SOLR-20

There is a comment with some code that I added. This code can be added to
the Java client to support highlighting.

If you need any more help, just post and I will try to help more.
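
For readers who are not using the Java client, the point about looking further down in the response can be sketched in Python with the standard library. This is only an illustrative sketch: the function names are made up for the example, and the snippet text in the sample is an invented placeholder (the <str> elements in the sample above are empty), written with the highlight tags escaped as is assumed for XML text.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response. The snippet text inside <str> is an
# invented placeholder; the <str> elements in the sample above are empty.
SAMPLE = """<response>
<lst name="facet_counts">
 <lst name="facet_queries"/>
 <lst name="facet_fields">
  <lst name="type">
    <int name="HR">1</int>
  </lst>
 </lst>
</lst>
<lst name="highlighting">
 <lst name="col_36863_NL">
  <arr name="content">
    <str>a rare &lt;em&gt;stamp&lt;/em&gt; from 1900</str>
  </arr>
 </lst>
</lst>
</response>"""


def highlights(root):
    """Return {document id: {field name: [snippet, ...]}}."""
    result = {}
    section = root.find("lst[@name='highlighting']")
    if section is None:
        return result  # highlighting was not requested or not returned
    for doc in section.findall("lst"):
        result[doc.get("name")] = {
            arr.get("name"): [s.text or "" for s in arr.findall("str")]
            for arr in doc.findall("arr")
        }
    return result


def facet_counts(root):
    """Return {facet field: {value: count}}."""
    result = {}
    fields = root.find("lst[@name='facet_counts']/lst[@name='facet_fields']")
    if fields is None:
        return result  # faceting was not requested or not returned
    for field in fields.findall("lst"):
        result[field.get("name")] = {
            item.get("name"): int(item.text) for item in field.findall("int")
        }
    return result


root = ET.fromstring(SAMPLE)
print(highlights(root))
print(facet_counts(root))
```

The snippets are keyed by document id, so they have to be matched back to the <doc> entries in the "response" section by the caller.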


On 30/03/07, Pierre-Yves LANDRON <pl...@hotmail.com> wrote:
>
> Obviously the hl parameters haven't been taken into account. I've got the
> same problem with the facet.mincount parameter; facets work fine, but this
> parameter is not taken into account for some reason...
>
> Did I do something wrong?
>

Re: Snippets of indexed text

Posted by Pierre-Yves LANDRON <pl...@hotmail.com>.
Hello,

Thanks for the info; it's exactly what I need. I can't manage to make it
work, though. It's strange because I have the same problem with facets: it
seems that some options are not taken into account...

For example, here is my request to Solr:
q=%28%28titre:moulin%29+OR+%28texte:moulin%29+OR+%28sujet:moulin%29+OR+%28desc:moulin%29%29&version=2.1&start=0&rows=12&fl=*+score&qt=standard&hl=true&hl.fl=texte,desc&hl.snippets=3&hl.fragsize=150

And an extract of the response is:
<doc>
<float name="score">0.0151801035</float>
<str name="PID">bml:8071</str>
<str name="texte">
Les Grands Moulins
Le chemin de la Bouteille n'est pas, comme son nom semblerait l'indiquer, le
chemin préféré des ivrognes. En l'occurrence, c'est plutôt le chemin des
Boulangers ou mieux encore (... truncated by me; in fact the whole field is
returned)
</str>
<str name="thumb">http://10.208.0.215:8080/fedora/get/bml:8071/Thumb</str>
<str name="type">page</str>
</doc>

Obviously the hl parameters haven't been taken into account. I've got the
same problem with the facet.mincount parameter; facets work fine, but this
parameter is not taken into account for some reason...

Did I do something wrong?

thanks,
kind regards,
p-y
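
One quick sanity check on a request like the one above is to decode the query string and list the highlighting parameters that were actually sent. The sketch below uses Python's standard library; the helper name is made up for the example. It only confirms what the client sent and says nothing about why the server returned the whole field.

```python
from urllib.parse import parse_qs

# The request string exactly as posted in the message above.
REQUEST = (
    "q=%28%28titre:moulin%29+OR+%28texte:moulin%29+OR+%28sujet:moulin%29"
    "+OR+%28desc:moulin%29%29&version=2.1&start=0&rows=12&fl=*+score"
    "&qt=standard&hl=true&hl.fl=texte,desc&hl.snippets=3&hl.fragsize=150"
)


def decode_params(query_string):
    """Decode a URL query string into {name: first value}."""
    return {name: values[0] for name, values in parse_qs(query_string).items()}


params = decode_params(REQUEST)

# Keep only the highlighting-related parameters (hl and hl.*).
hl_params = {k: v for k, v in params.items() if k == "hl" or k.startswith("hl.")}

for name, value in hl_params.items():
    print(name, "=", value)
```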




>From: "Thierry Collogne" <th...@gmail.com>
>Reply-To: solr-user@lucene.apache.org
>To: solr-user@lucene.apache.org
>Subject: Re: Snippets of indexed text
>Date: Thu, 29 Mar 2007 08:56:51 +0200
>
>It is possible. You need to pass highlighting parameters. Look here :
>
>      http://wiki.apache.org/solr/HighlightingParameters
>
>Hope this helps.
>



Re: Snippets of indexed text

Posted by Thierry Collogne <th...@gmail.com>.
It is possible. You need to pass highlighting parameters. Look here :

      http://wiki.apache.org/solr/HighlightingParameters

Hope this helps.
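
As a rough illustration of passing highlighting parameters, here is a small Python sketch that builds a select URL with hl, hl.fl, hl.snippets and hl.fragsize (the parameter names used elsewhere in this thread). The base URL, the function name, and the example query are assumptions made up for the illustration; adjust them to your own installation.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, fields, snippets=3, fragsize=150):
    """Build a Solr select URL that asks for highlighted snippets.

    base_url is hypothetical here; the parameter names are the ones
    used in this thread.
    """
    params = {
        "q": query,
        "hl": "true",                # turn highlighting on
        "hl.fl": ",".join(fields),   # fields to build snippets from
        "hl.snippets": snippets,     # maximum snippets per field
        "hl.fragsize": fragsize,     # approximate snippet size in characters
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host, port and path.
url = build_select_url(
    "http://localhost:8983/solr/select", "texte:moulin", ["texte", "desc"]
)
print(url)
```

The snippets come back in a separate "highlighting" section of the response rather than inside each <doc>.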
