Posted to java-user@lucene.apache.org by Cool Coder <te...@yahoo.com> on 2007/10/24 23:27:27 UTC
Highlighter and href fields
Is there any way to stop the Highlighter from highlighting text that is part of an href/URL? The problem occurs when the field content is a URL that contains the query term, e.g. my search is for .net and the field has the value http://jkjsd.net. After applying the Highlighter, it becomes http://jkjsd<b>.net</b>, which is a broken URL. Can I filter it out?
- BR
Re: Highlighter and href fields
Posted by Cool Coder <te...@yahoo.com>.
OK, I understand now that I have a lot of work ahead of me.
>2. Use an Analyzer that recognizes URLs. That way you won't get partial
BTW, do you know of any Analyzer that can recognize URLs?
- BR
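On the URL-aware Analyzer question, here is a minimal sketch of the tokenization rule such an Analyzer would need, written in plain Java with no Lucene dependency. The class name UrlAwareTokens and the simplified URL pattern are assumptions made for illustration, not anything that ships with Lucene. The point is only that a URL comes out as one token, so a query term like "net" no longer equals any token inside http://jkjsd.net.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: shows the "URL as a single token" rule only.
// A real Lucene Analyzer would apply the same rule inside its Tokenizer.
final class UrlAwareTokens {
    // Assumption: a URL is scheme:// followed by non-whitespace characters.
    // The URL branch comes first in the alternation so a URL is consumed
    // whole before the plain-word branch gets a chance to split it.
    private static final Pattern TOKEN =
            Pattern.compile("(?i)(?:https?|ftp)://\\S+|[\\p{L}\\p{N}]+");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The URL stays in one piece; the standalone ".net" yields "net".
        System.out.println(tokenize("See http://jkjsd.net for .net docs"));
    }
}
```

Later Lucene releases include UAX29URLEmailTokenizer, which emits URLs and e-mail addresses as single tokens; whether it is available depends on the Lucene version in use.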
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Highlighter and href fields
Posted by Mark Miller <ma...@gmail.com>.
Nothing in the Highlighter per se will help you there. I see two
options off the top of my head:
1. Break the text up before feeding it to the highlighter, feed in all but
the URL parts, and then stitch the pieces back together, much as you might
do if highlighting an XML doc. Ugly, though.
2. Use an Analyzer that recognizes URLs. That way you won't get partial
URL matches like .net. Each URL would be a single token, and a search
would have to match the entire URL to hit it. Even if you already indexed
with a different Analyzer, you could use this special Analyzer just for
highlighting. It would act exactly the same as your indexing Analyzer,
but would parse any URL as a single token. Of course, if you are using
TokenSources, this is not an option.
- Mark
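For reference, option 1 (split, highlight the non-URL parts, stitch back together) could be sketched roughly as follows. This is plain Java with no Lucene dependency: highlightFragment is a hypothetical stand-in for whatever call produces the highlighted markup for a piece of text, and both the class name UrlSafeHighlight and the simplified URL pattern are assumptions made for illustration.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: highlights everything except URL spans.
final class UrlSafeHighlight {
    // Assumption: a URL is scheme:// followed by non-whitespace characters.
    // Trailing punctuation such as "." or ")" is treated as part of the URL.
    private static final Pattern URL =
            Pattern.compile("(?i)\\b(?:https?|ftp)://\\S+");

    static String highlightOutsideUrls(String text,
                                       UnaryOperator<String> highlightFragment) {
        StringBuilder out = new StringBuilder();
        Matcher m = URL.matcher(text);
        int last = 0;
        while (m.find()) {
            // Highlight the plain text that precedes this URL.
            if (m.start() > last) {
                out.append(highlightFragment.apply(text.substring(last, m.start())));
            }
            // Copy the URL through untouched.
            out.append(m.group());
            last = m.end();
        }
        // Highlight whatever follows the last URL.
        if (last < text.length()) {
            out.append(highlightFragment.apply(text.substring(last)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in highlighter: wraps every ".net" in <b> tags.
        UnaryOperator<String> boldDotNet = s -> s.replace(".net", "<b>.net</b>");
        System.out.println(
                highlightOutsideUrls("Try .net at http://jkjsd.net today", boldDotNet));
        // prints: Try <b>.net</b> at http://jkjsd.net today
    }
}
```

If the real highlighting call can return null for a piece of text with no matching terms, the stand-in would need to fall back to the original piece so that nothing is lost when stitching.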