Posted to solr-user@lucene.apache.org by Kai Gülzau <kg...@novomind.com> on 2012/03/16 14:59:24 UTC
mailto: scheme aware tokenizer
Is there any analyzer out there which handles the mailto: scheme?
UAX29URLEmailTokenizer seems to split at the wrong place:
mailto:test@example.org ->
mailto:test
example.org
As a workaround, I use a char filter that inserts a space after the scheme:
<charFilter class="solr.PatternReplaceCharFilterFactory" pattern="mailto:" replacement="mailto: "/>
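For context, here is a minimal sketch of how this char filter might sit in a schema.xml field type. The field type name "text_mailto" is hypothetical, and pairing the filter with solr.UAX29URLEmailTokenizerFactory is an assumption based on the tokenizer named above. The resulting tokens should be checked on the Analysis page of the Solr admin UI.

```xml
<!-- Hypothetical field type; the name "text_mailto" is illustrative only -->
<fieldType name="text_mailto" class="solr.TextField">
  <analyzer>
    <!-- Char filters run before the tokenizer: insert a space after the scheme -->
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="mailto:" replacement="mailto: "/>
    <!-- Assumed tokenizer: the one discussed in this thread -->
    <tokenizer class="solr.UAX29URLEmailTokenizerFactory"/>
  </analyzer>
</fieldType>
```

The intent is that test@example.org reaches the tokenizer as a separate, whitespace-delimited address instead of being glued to the scheme.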
Regards,
Kai Gülzau
novomind AG
__________________________________
Bramfelder Straße 121 • 22305 Hamburg
phone +49 (0)40 808071138 • fax +49 (0)40 808071-100
email kguelzau@novomind.com • http://www.novomind.com
Executive Board: Peter Samuelsen (Chair) • Stefan Grieben • Thomas Köhler
Chairman of the Supervisory Board: Werner Preuschhof
Registered office: Hamburg • HR B93508 Amtsgericht Hamburg
RE: mailto: scheme aware tokenizer
Posted by Steven A Rowe <sa...@syr.edu>.
Hi Kai,
I have created an issue for this: https://issues.apache.org/jira/browse/LUCENE-3880
Thanks for reporting!
Steve