Posted to solr-user@lucene.apache.org by "Sethi, Parampreet" <pa...@teamaol.com> on 2010/10/08 17:33:02 UTC
Accented Search in Solr
Hi All,
I am using Solr 1.3 in my project. I would like to know whether there is any other way to make the queries below return the same results:
Gruyère-and-Zucchini
Gruyere-and-Zucchini
The first query has accented characters in it. While going through the Solr tokenizer and filter factory documentation, I found a filter factory, "solr.ISOLatin1AccentFilterFactory", that can be used to replace accented characters with their non-accented counterparts.
Is there any other way to do this search which is independent of how data is stored (whether in accented or non-accented form)?
Thanks for the help.
Regards,
param
Re: Accented Search in Solr
Posted by Otis Gospodnetic <ot...@yahoo.com>.
Param,
Note that the original value will be stored even if ISOLatin1AccentFilter
removes the accent for indexing / matching purposes.
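
The stored-versus-indexed distinction can be illustrated with a small Python sketch. This is not Solr code: fold_accents is a hypothetical helper that approximates accent removal with Unicode decomposition, and ISOLatin1AccentFilter itself may map some characters differently.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Approximate accent folding: decompose, then drop combining marks.

    Hypothetical helper for illustration only; it is not the mapping
    that Solr's ISOLatin1AccentFilter uses internally.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


original = "Gruyère-and-Zucchini"

# The stored value is the untouched original text.
stored_value = original

# The indexed term is what matching is done against, after folding.
indexed_term = fold_accents(original)

print(stored_value)
print(indexed_term)
```

Here the value handed back to the user keeps its accent, while the term used for matching does not.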
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
Re: Accented Search in Solr
Posted by Erick Erickson <er...@gmail.com>.
Not that I know of. Do note that the accent filter must be applied
consistently: whatever is active at query time MUST match what was active
at index time. In other words, if you indexed with the filter but search
without it, or vice versa, you won't get the results you expect.
Also note that, no matter what, the original text (without the filter
applied) is what's *stored*, untokenized. This is entirely independent of
what's *indexed*, even though both options are specified for the same field.
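
A toy model of the index-time / query-time symmetry, as a rough Python sketch rather than anything Solr actually runs. The names fold_accents, build_index, search and identity are made up for illustration, and the whole field value is treated as a single term to keep the example short.

```python
import unicodedata


def fold_accents(text):
    """Approximate accent folding (assumption: NFKD plus dropping marks)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def identity(text):
    """Stand-in for an analyzer with no accent filter configured."""
    return text


def build_index(docs, analyze):
    """Map each analyzed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        index.setdefault(analyze(text), set()).add(doc_id)
    return index


def search(index, query, analyze):
    """Analyze the query, then look the resulting term up in the index."""
    return sorted(index.get(analyze(query), set()))


docs = {1: "Gruyère-and-Zucchini"}

# Index built WITH folding, so the indexed term carries no accent.
folded_index = build_index(docs, fold_accents)

# Same analysis on both sides: accented and plain queries both match.
accented_hit = search(folded_index, "Gruyère-and-Zucchini", fold_accents)
plain_hit = search(folded_index, "Gruyere-and-Zucchini", fold_accents)

# Query analyzed WITHOUT folding: the accented query no longer matches.
mismatch = search(folded_index, "Gruyère-and-Zucchini", identity)

print(accented_hit, plain_hit, mismatch)
```

With folding on both sides, either spelling finds the document. Dropping the filter on the query side only makes the accented query miss, which is the mismatch described above.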
If this is irrelevant, what are you really trying to accomplish? This may be
an "XY" problem, see:
http://people.apache.org/~hossman/#xyproblem
Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue. Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341
Erick
Re: Accented Search in Solr
Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Accented Search in Solr
: References: <AA...@mail.gmail.com>
: In-Reply-To: <AA...@mail.gmail.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to and your question is "hidden" in that thread and gets less
attention. It makes following discussions in the mailing list archives
particularly difficult.
See Also: http://en.wikipedia.org/wiki/User:DonDiego/Thread_hijacking
-Hoss