Posted to solr-user@lucene.apache.org by lboutros <bo...@gmail.com> on 2011/06/03 21:15:51 UTC

Getting payloads in Highlighter

Hi all,

I need to highlight the searched words in the original text (XML) of a document.

So I'm trying to develop a new Highlighter which uses the defaultHighlighter
to highlight some fields, then retrieves the original text file/document
(from external or internal storage) and puts the highlighted parts into it.
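
A minimal sketch of that last step, splicing highlight markers back into the
original text, might look like the following. It assumes the character
offsets of each matched term in the original XML are already known and do
not overlap; the HighlightSplicer and Span names and the <em> markers are
hypothetical and come from neither Solr nor Lucene.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class HighlightSplicer {

    /** Hypothetical holder for one matched term, as character offsets into the original text. */
    static final class Span {
        final int start; // inclusive
        final int end;   // exclusive

        Span(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Wraps every span of the original text in pre/post markers. Spans must not overlap. */
    static String splice(String original, List<Span> spans, String pre, String post) {
        List<Span> sorted = new ArrayList<>(spans);
        // Process spans from last to first so that earlier offsets
        // remain valid after each insertion.
        sorted.sort(Comparator.comparingInt((Span s) -> s.start).reversed());

        StringBuilder out = new StringBuilder(original);
        for (Span s : sorted) {
            out.insert(s.end, post);   // closing marker first, it sits further right
            out.insert(s.start, pre);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<doc><p>solr payloads</p></doc>";
        List<Span> spans = new ArrayList<>();
        spans.add(new Span(8, 12));   // "solr"
        spans.add(new Span(13, 21));  // "payloads"
        System.out.println(splice(xml, spans, "<em>", "</em>"));
    }
}
```

Working from the end of the text towards the start avoids having to shift
the remaining offsets after each insertion.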

I'm using an additional field to hold the field offsets for each field in
each document.
To store the offsets (and perhaps other information), I'm using payloads. (I
cannot wait for the future DocValues.)
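
Since a payload is just an opaque byte array attached to a term position,
the offsets have to be packed into bytes somehow. The sketch below shows one
possible layout: two big-endian 4-byte ints, start offset then end offset.
This layout and the OffsetPayloadCodec name are assumptions made for
illustration only, not the format actually used here.

```java
import java.nio.ByteBuffer;

final class OffsetPayloadCodec {

    /** Number of bytes in one encoded payload: two 4-byte ints. */
    static final int LENGTH = 8;

    /** Packs a start/end offset pair into an 8-byte array (big-endian). */
    static byte[] encode(int start, int end) {
        return ByteBuffer.allocate(LENGTH).putInt(start).putInt(end).array();
    }

    /** Reads a start/end offset pair back from data, beginning at the given index. */
    static int[] decode(byte[] data, int offset) {
        ByteBuffer buf = ByteBuffer.wrap(data, offset, LENGTH);
        return new int[] { buf.getInt(), buf.getInt() };
    }

    public static void main(String[] args) {
        byte[] payload = encode(120, 134);
        int[] offsets = decode(payload, 0);
        System.out.println(offsets[0] + ".." + offsets[1]);
    }
}
```

A fixed-width layout like this keeps decoding trivial; a variable-length
encoding would make the payloads smaller at the cost of a slightly more
involved decoder.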

Now my question: what is the fastest way to retrieve payloads (TermPositions?)
for a given document, a given field and a given term?

If other methods exist to do that, I'm open to them :)

Ludovic.



-----
Jouve
France.
--
View this message in context: http://lucene.472066.n3.nabble.com/Getting-payloads-in-Highlighter-tp3020885p3020885.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Getting payloads in Highlighter

Posted by lboutros <bo...@gmail.com>.
To clarify a bit more, I took a look at this function:

termPositions

public TermPositions termPositions()
                            throws IOException

    Description copied from class: IndexReader
    Returns an unpositioned TermPositions enumerator. 

But it returns an unpositioned enumerator. Is there a way to get a
TermPositions directly positioned on a document, a field and a term?

Ludovic.


Re: Getting payloads in Highlighter

Posted by lboutros <bo...@gmail.com>.
The original document is not indexed. Currently it is just stored, and it
could be stored in a filesystem or a database in the future.

The different parts of a document are indexed in multiple fields with
different analyzers (stemming, multiple languages, regex, ...).

So I don't think your solution can be applied, but if I'm wrong, could you
please explain to me how?

Thanks,

Ludovic.



Re: Getting payloads in Highlighter

Posted by Ahmet Arslan <io...@yahoo.com>.
> I need to highlight searched words in the original text
> (xml) of a document. 

Why don't you remove the XML tags in an analyzer? You can highlight XML by doing so.