Posted to solr-user@lucene.apache.org by Lisa Riggle <li...@linkup.com> on 2011/07/07 21:19:35 UTC

Highlight not catching last letter(s)

Hi Guys!

Thanks for the help with my question regarding special characters in
indexes.  I have another question that I hope you can help with.

Right now, some of our company names contain special, non-alphanumeric
characters.  Many of these characters get stripped out during both
indexing and querying.  Unfortunately, I've noticed that if the company
name that's returned is 1-2 characters longer than the stripped query
string, highlighting misses the last 1-2 characters.

Example-
Company Name: Inter@ctive
    (@ symbol is removed from the token with PatternReplaceFilterFactory
during the indexing process)
Search Query: Inter@ctive
    (Here the @ symbol is removed by the PHP script before being sent to
Solr, so the term ends up being /Interctive/)
How it gets highlighted: *Inter@ctiv*e

This also happens when I search for a company name without any spaces
and Solr returns a version of the name that contains spaces.

Example-
Company Name: Best Buy
    (Notice the space)
Search Query: bestbuy
    (Notice the lack of space)
How it gets highlighted: *Best Bu*y
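
Both results are what you would get if the highlighted span were as long
as the stripped term rather than the original text.  The Python sketch
below reproduces the two outputs under that assumption.  It is only a
guess at the behavior, not confirmed against Solr's highlighter, and the
function name highlight_by_term_length is made up for illustration.

```python
def highlight_by_term_length(original, stripped_term, start=0):
    """Wrap part of `original` in asterisks, starting at `start`.

    Assumption (not confirmed Solr behavior): the end of the highlight is
    computed as start + len(stripped_term), i.e. from the length of the
    term AFTER characters were removed, not from the original text.
    """
    end = start + len(stripped_term)
    return original[:start] + "*" + original[start:end] + "*" + original[end:]


# "Interctive" is 10 characters, "Inter@ctive" is 11, so one is left out.
print(highlight_by_term_length("Inter@ctive", "Interctive"))

# "bestbuy" is 7 characters, "Best Buy" is 8, so one is left out.
print(highlight_by_term_length("Best Buy", "bestbuy"))
```

If that guess is right, the number of unhighlighted trailing characters
should always equal the number of characters that were stripped.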

I'm at a total loss on how to get around this.  Can anyone point me in
the right direction?