Posted to solr-user@lucene.apache.org by darul <da...@gmail.com> on 2012/01/03 18:19:07 UTC

charFilter PatternReplaceCharFilterFactory and highlighting

Hello,

I wanted to use the PatternReplaceCharFilterFactory char filter to keep
specific content from being indexed.

In the end I ran into many problems with highlighting and offsets, so I
removed it. Example:



Example of content:



My charFilter should clean it up like this:



I do not understand why highlight offsets are disturbed by the charFilter.
Since it is defined first in the analysis chain, does it change the content
before highlighting takes place?
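
To make the offset question concrete, here is a minimal sketch in plain
Python (not Lucene or Solr code). It assumes a hypothetical pattern that
removes href attributes and a made-up sample string. It shows that once a
filter deletes characters, positions measured in the filtered text no longer
line up with the original text that highlighting is applied to, unless they
are mapped back.

```python
import re


def filter_with_offset_map(text, pattern):
    """Remove every match of pattern from text.

    Returns (filtered_text, offset_map), where offset_map[i] is the index
    in the original text of the character at index i of the filtered text.
    """
    filtered = []
    offset_map = []
    last = 0
    for match in re.finditer(pattern, text):
        # Keep everything between the previous match and this one.
        for i in range(last, match.start()):
            filtered.append(text[i])
            offset_map.append(i)
        last = match.end()
    # Keep the tail after the final match.
    for i in range(last, len(text)):
        filtered.append(text[i])
        offset_map.append(i)
    return "".join(filtered), offset_map


# Hypothetical sample content and pattern, for illustration only.
original = 'see <a href="/uploads/doc.pdf">word</a> here'
filtered, offset_map = filter_with_offset_map(original, r'href="[^"]*"')

# Offsets of the term as seen after filtering.
start = filtered.index("word")
end = start + len("word")

# Applying the filtered offsets directly to the original text misses the term.
uncorrected = original[start:end]

# Mapping the offsets back to the original text recovers the term.
corrected = original[offset_map[start]:offset_map[end - 1] + 1]

print(repr(uncorrected), repr(corrected))
```

If the offsets stored with the tokens are not corrected this way, or are
corrected wrongly, a highlighter working on the stored original text would
mark the wrong span or point outside the text.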



Do you have any solution? We really need the charFilter feature.

Thanks,

Jul





--
View this message in context: http://lucene.472066.n3.nabble.com/charFilter-PatternReplaceCharFilterFactory-and-highlighting-tp3629699p3629699.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: charFilter PatternReplaceCharFilterFactory and highlighting

Posted by darul <da...@gmail.com>.
Well, I guess it may be a bug somewhere:

https://issues.apache.org/jira/browse/LUCENE-2208


Re: charFilter PatternReplaceCharFilterFactory and highlighting

Posted by darul <da...@gmail.com>.
Here is the stack trace:


....



Re: charFilter PatternReplaceCharFilterFactory and highlighting

Posted by darul <da...@gmail.com>.
Some paths in our indexed content may contain words that match the query,
which we do not want; that is why I applied a CharFilter to skip them.

Here is an example of content before filtering:



After applying the regexp filter I provided in my previous message, it
should look like this, shouldn't it (link paths skipped at indexing time)?



I have also run query tests and get no matching results when searching for
"*uploads*" or "*content*", which is the behaviour we expect.

The problem is that when I enable highlighting and search for "*word*", it
throws an exception.
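
As a rough illustration of that expectation, the following plain-Python
sketch (not the actual Solr analysis chain) strips link paths before
splitting the text into terms. The pattern and the sample content are
hypothetical stand-ins, since the real ones are not shown in this thread.

```python
import re

# Hypothetical pattern: drop every href attribute, including its path.
LINK_PATH = re.compile(r'href="[^"]*"')


def index_terms(text):
    """Return the set of lowercase terms left after removing link paths."""
    cleaned = LINK_PATH.sub("", text)
    return set(re.findall(r"[a-z0-9]+", cleaned.lower()))


# Hypothetical sample content, for illustration only.
sample = '<a href="/content/uploads/report.pdf">word</a>'
terms = index_terms(sample)

# Words that only occur inside the link path are not indexed,
# while the visible link text still is.
print("uploads" in terms, "content" in terms, "word" in terms)
```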





Re: charFilter PatternReplaceCharFilterFactory and highlighting

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
Jul,

Maybe you missed "Example of content :" and "My charfilter should clean it like :"
in your previous mail? We need them in order to consider your problem. :->

koji
-- 
http://www.rondhuit.com/en/


(12/01/04 2:19), darul wrote:
> Hello,
>
> I wanted to use char filter PatternReplaceCharFilterFactory to avoid
> specific content to be indexed.
>
> At the end I get many issues with highlights and offsets...so I remove it,
> example :
>
>
>
> Example of content :
>
>
>
> My charfilter should clean it like :
>
>
>
> I do not understand why offset of highlights are disturbed by charFilter
> while it is defined in first, it may change content before highlight
> processing occurs ?
>
>
>
> Do you have any solutions, we really need charFilter feature ?
>
> Thanks,
>
> Jul