Posted to solr-user@lucene.apache.org by juancesarvillalba <ju...@gmail.com> on 2013/04/05 02:43:58 UTC

SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Hi,

I saw some similar problems in other threads, but I think this one is a
little different, and I couldn't find a solution.

I get the exception:

org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token
eightysix exceeds length of provided text sized 80

This happens, for example, when I query for a word that has synonyms, with
highlighting enabled. For example, I queried for 86, which has "eightysix"
as a synonym, and with highlighting I got the exception above.

The relevant configuration is:

Field Type:

Synonyms.txt:
Brand 86, 86, eightysix, eight six, eighty six, eighty-six

Default Highlighting Component:
100   70   0.5   [-\w ,/\n\&quot;&apos;]{20,200}   10   .,!? &#9;&#10;&#13;
WORD   en   US

Also, I saw that when I removed some words from the synonyms list, it works
correctly.

Does anyone have any idea what is wrong?

Best regards.
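
For readers unfamiliar with this exception: the Lucene highlighter raises it
when a token's start or end offset points past the end of the text it was
asked to highlight. The following is a minimal Python sketch of that kind of
check, not Lucene's actual code. The Token type, the check_offsets helper, the
80-character text, and the offsets are all hypothetical and only illustrate
how a synonym token can end up "outside" the stored text.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    term: str
    start_offset: int
    end_offset: int


class InvalidTokenOffsets(Exception):
    """Stand-in for Lucene's InvalidTokenOffsetsException (illustration only)."""


def check_offsets(tokens: List[Token], text: str) -> None:
    # Reject any token whose offsets point past the end of the text.
    for tok in tokens:
        if tok.start_offset > len(text) or tok.end_offset > len(text):
            raise InvalidTokenOffsets(
                f"Token {tok.term} exceeds length of provided text "
                f"sized {len(text)}"
            )


text = "x" * 80  # hypothetical 80-character field value
ok = [Token("86", 6, 8)]  # offsets inside the text
bad = [Token("eightysix", 78, 87)]  # hypothetical offsets past the text end

check_offsets(ok, text)  # passes silently
```

Calling check_offsets(bad, text) raises the stand-in exception, because the
end offset 87 is larger than the text length 80.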



--
View this message in context: http://lucene.472066.n3.nabble.com/SolR-InvalidTokenOffsetsException-with-Highlighter-and-Synonyms-tp4053988.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by Dmitry Kan <so...@gmail.com>.
Hi,

If you are not afraid of looking into the code, you could trace and
possibly fix this. Remember to submit a patch :)

Another (easier?) way is to put together a reproducible test case and file a
Jira issue.

Dmitry


On Tue, Apr 16, 2013 at 4:12 PM, juancesarvillalba <
juancesarvillalba@gmail.com> wrote:

>
>
> Hi,
>
> At the moment, I am not considering storing synonyms in the index,
> although it is something I will have to do at some point.
>
> It is strange that something as common as multi-word synonyms has a bug
> with highlighting, but I couldn't find any solution.
>
> Thanks for your help.

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by juancesarvillalba <ju...@gmail.com>.

Hi,

At the moment, I am not considering storing synonyms in the index, although
it is something I will have to do at some point.

It is strange that something as common as multi-word synonyms has a bug
with highlighting, but I couldn't find any solution.

Thanks for your help.

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by Dmitry Kan <so...@gmail.com>.
It could be a bug in the highlighter. But before claiming that, I would still
play around with different options, such as hl.fragsize and
hl.highlightMultiTerm.

Also, have you considered storing synonyms in the index?
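
One way to try those options is to vary them as request parameters and see
which combinations trigger the exception. Below is a minimal Python sketch
that only builds the request URLs. The host, the core name "collection1", and
the field name "name" are assumptions for illustration and are not taken from
this thread; the hl.* parameter names are the ones listed on the
HighlightingParameters wiki page.

```python
from urllib.parse import urlencode

# Hypothetical Solr endpoint; adjust host and core name to your setup.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def highlight_query_url(query, field, frag_size=100, multi_term=True):
    """Build a select URL with highlighting enabled for one field."""
    params = {
        "q": query,
        "hl": "true",
        "hl.fl": field,              # field to highlight (assumed name)
        "hl.fragsize": frag_size,    # 0 means use the whole field value
        "hl.highlightMultiTerm": "true" if multi_term else "false",
        "wt": "json",
    }
    return SOLR_SELECT + "?" + urlencode(params)


# Print one URL per fragment size to compare the responses by hand.
for size in (0, 50, 100):
    print(highlight_query_url("86", "name", frag_size=size))
```

If the exception disappears for some fragment sizes but not others, that
would be useful detail to include in a bug report.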


On Tue, Apr 16, 2013 at 9:42 AM, juancesarvillalba <
juancesarvillalba@gmail.com> wrote:

>  Hi,
>
> I am using the standard highlighter.
> http://wiki.apache.org/solr/HighlightingParameters
>
> Cheers

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by juancesarvillalba <ju...@gmail.com>.
 Hi,

I am using the standard highlighter.
http://wiki.apache.org/solr/HighlightingParameters

Cheers

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by Dmitry Kan <so...@gmail.com>.
Do you use the standard highlighter, or FastVectorHighlighter /
PhraseHighlighter?
Do you use the hl.highlightMultiTerm option
<http://wiki.apache.org/solr/HighlightingParameters#hl.highlightMultiTerm>?


On Tue, Apr 16, 2013 at 2:51 AM, juancesarvillalba <
juancesarvillalba@gmail.com> wrote:

>
> Hi,
>
> Previously I had a different configuration that worked, but with synonyms
> applied at query time.
>
> Now I have a requirement to add multi-word synonyms, which is why I am
> testing this configuration.
>
> With this configuration it still fails, even without multi-word synonyms.
> The problem happens only with Highlighting ON.

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by juancesarvillalba <ju...@gmail.com>.
Hi,

Previously I had a different configuration that worked, but with synonyms
applied at query time.

Now I have a requirement to add multi-word synonyms, which is why I am
testing this configuration.

With this configuration it still fails, even without multi-word synonyms.
The problem happens only with Highlighting ON.

Re: SolR InvalidTokenOffsetsException with Highlighter and Synonyms

Posted by Dmitry Kan <so...@gmail.com>.
Hi,

Does it work if you remove the synonyms that contain spaces, such as
"eighty six"?

Dmitry
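
To run that experiment quickly, the multi-word entries can be separated from
the single-word ones before editing the synonyms file. The following is a
minimal Python sketch, assuming the simple comma-separated line quoted in the
original message; split_synonyms is a hypothetical helper, and it does not
handle "=>" mappings, comments, or escaped commas in a real synonyms file.

```python
def split_synonyms(line):
    """Split one comma-separated synonyms line into single- and multi-word entries."""
    entries = [e.strip() for e in line.split(",") if e.strip()]
    # An entry counts as multi-word if it contains internal whitespace.
    single = [e for e in entries if len(e.split()) == 1]
    multi = [e for e in entries if len(e.split()) > 1]
    return single, multi


line = "Brand 86, 86, eightysix, eight six, eighty six, eighty-six"
single, multi = split_synonyms(line)

# Candidate replacement line with only single-word synonyms.
print(", ".join(single))
```

The multi list holds the entries that were dropped, so they can be added back
one at a time to find which one triggers the exception.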


On Fri, Apr 5, 2013 at 3:43 AM, juancesarvillalba <
juancesarvillalba@gmail.com> wrote:

> Hi,
>
> I saw some similar problems in other threads, but I think this one is a
> little different, and I couldn't find a solution.
>
> I get the exception:
>
> org.apache.lucene.search.highlight.InvalidTokenOffsetsException: Token
> eightysix exceeds length of provided text sized 80
>
> This happens, for example, when I query for a word that has synonyms, with
> highlighting enabled. For example, I queried for 86, which has "eightysix"
> as a synonym, and with highlighting I got the exception above.
>
> The relevant configuration is:
>
> Field Type:
>
> Synonyms.txt:
> Brand 86, 86, eightysix, eight six, eighty six, eighty-six
>
> Default Highlighting Component:
> 100   70   0.5   [-\w ,/\n\&quot;&apos;]{20,200}   10   .,!? &#9;&#10;&#13;
> WORD   en   US
>
> Also, I saw that when I removed some words from the synonyms list, it works
> correctly.
>
> Does anyone have any idea what is wrong?
>
> Best regards.