Posted to solr-user@lucene.apache.org by al...@aim.com on 2011/12/21 22:09:25 UTC

How to apply relevant Stemmer to each document

Hello,

I would like to know whether, in the latest version of Solr, it is possible to apply the relevant stemmer to each doc depending on its lang field.
I searched the solr-user mailing list and found this thread:

http://lucene.472066.n3.nabble.com/Multiplexing-TokenFilter-for-multi-language-td3235341.html

but I am not sure whether it was developed into a JIRA ticket.

Thanks.
Alex.
 


Re: How to apply relevant Stemmer to each document

Posted by Erick Erickson <er...@gmail.com>.
Sure, but what about inappropriate stemming in one language that
happens to match something in another?

In general, putting multiple languages into a single field usually
only makes sense when the overwhelming majority of documents are
in one language...
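
To make the collision concrete, here is a minimal sketch in plain Python (not Solr code). It assumes a workaround where the query is stemmed with every language's stemmer and the results are ORed together. The suffix rules, the `STEMMERS` table, and the function names are all hypothetical toys, far cruder than real Snowball stemmers.

```python
import re
from collections import defaultdict

def strip_suffix(token, suffixes, min_stem=3):
    """Toy stemmer core: drop the first matching suffix, keep short tokens intact."""
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token

# Hypothetical per-language rules, for illustration only.
STEMMERS = {
    "en": lambda tok: strip_suffix(tok, ("ing", "ed", "e", "s")),
    "de": lambda tok: strip_suffix(tok, ("en", "e")),
}

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Index each document with the stemmer chosen by its lang value."""
    index = defaultdict(set)
    for doc_id, (lang, text) in docs.items():
        for tok in tokenize(text):
            index[STEMMERS[lang](tok)].add(doc_id)
    return index

def query_stems(query):
    """Stem the query once per language and pool the resulting terms."""
    return {stem(tok) for tok in tokenize(query) for stem in STEMMERS.values()}

def search_all_languages(index, query):
    hits = set()
    for term in query_stems(query):
        hits |= index.get(term, set())
    return hits

docs = {
    "d1": ("en", "shopping list"),
    "d2": ("en", "listen to music"),
}
index = build_index(docs)
stems = query_stems("listen")
hits = search_all_languages(index, "listen")
print(sorted(stems), sorted(hits))
```

In this toy setup the German-style rule turns the English query word "listen" into "list", so the shopping-list document comes back as well as the one that actually contains "listen". With real stemmers the specific collisions would differ, but the mechanism is the same.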

Best
Erick

On Thu, Dec 22, 2011 at 2:41 PM,  <al...@aim.com> wrote:
> Hi Erick,
>
> Why would querying be wrong?
>
> It is my understanding that if I have, let's say, 3 docs and each of them has been indexed with its own language stemmer, then sending a query will search all docs and return matching results. Let's say the query is "driving" and one of the docs contains "drive" and was stemmed by the English stemmer: it would return 1 result, whereas if I had applied the Russian stemmer to all docs, the result would be 0 docs.
>
> Am I missing something?
>
> Thanks.
> Alex.

Re: How to apply relevant Stemmer to each document

Posted by al...@aim.com.
Hi Erick,

Why would querying be wrong?

It is my understanding that if I have, let's say, 3 docs and each of them has been indexed with its own language stemmer, then sending a query will search all docs and return matching results. Let's say the query is "driving" and one of the docs contains "drive" and was stemmed by the English stemmer: it would return 1 result, whereas if I had applied the Russian stemmer to all docs, the result would be 0 docs.
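
The scenario above can be sketched in a few lines of plain Python (not Solr code). The two stemmers are hypothetical toys that only strip a handful of suffixes, and `count_hits` is an illustrative helper that applies the same stemmer to both the documents and the query.

```python
import re

def toy_english_stem(token):
    """Hypothetical English stemmer: strips a few common suffixes."""
    for suffix in ("ing", "ed", "e", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def toy_russian_stem(token):
    """Hypothetical Russian stemmer: strips a few Cyrillic endings only."""
    for suffix in ("ами", "ого", "ая", "ый", "а", "ы"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def analyze(text, stem):
    """Lowercase, split into word tokens, and stem each token."""
    return {stem(tok) for tok in re.findall(r"\w+", text.lower())}

def count_hits(docs, query, stem):
    """Count documents sharing at least one stemmed term with the query."""
    query_terms = analyze(query, stem)
    return sum(1 for text in docs if query_terms & analyze(text, stem))

docs = ["I drive to work"]
english_hits = count_hits(docs, "driving", toy_english_stem)
russian_hits = count_hits(docs, "driving", toy_russian_stem)
print(english_hits, russian_hits)
```

With these toy rules the English stemmer reduces both "drive" and "driving" to "driv", giving 1 hit, while the Russian-style stemmer leaves both words untouched, giving 0. Note that the sketch stems the query with the same stemmer as the document, which is the part that is hard to arrange when each document picks its own stemmer.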

Am I missing something?

Thanks.
Alex.

-----Original Message-----
From: Erick Erickson <er...@gmail.com>
To: solr-user <so...@lucene.apache.org>
Sent: Thu, Dec 22, 2011 11:06 am
Subject: Re: How to apply relevant Stemmer to each document


Not really. And it's hard to see how this would work in practice,
because stemming the document (even if you could) is only
half the battle.

How would querying work then? No matter what language you used
for your stemming, it would be wrong for all the documents that used a
different stemmer (or a stemmer based on a different language).

So I wouldn't hold out too much hope here.

Best
Erick


Re: How to apply relevant Stemmer to each document

Posted by Erick Erickson <er...@gmail.com>.
Not really. And it's hard to see how this would work in practice,
because stemming the document (even if you could) is only
half the battle.

How would querying work then? No matter what language you used
for your stemming, it would be wrong for all the documents that used a
different stemmer (or a stemmer based on a different language).
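
A rough sketch of the query-side problem, in plain Python rather than Solr: documents are indexed with a stemmer chosen by their lang value, but the query carries no lang value, so one stemmer has to be picked for it. The suffix rules, `STEMMERS`, `build_index`, and `search` are hypothetical names used only for illustration.

```python
import re
from collections import defaultdict

def strip_suffix(token, suffixes, min_stem=3):
    """Toy stemmer core: drop the first matching suffix, keep short tokens intact."""
    for suffix in suffixes:
        if token.endswith(suffix) and len(token) - len(suffix) >= min_stem:
            return token[: -len(suffix)]
    return token

# Hypothetical per-language rules, far simpler than real stemmers.
STEMMERS = {
    "en": lambda tok: strip_suffix(tok, ("ing", "ed", "e", "s")),
    "de": lambda tok: strip_suffix(tok, ("en", "e")),
}

def tokenize(text):
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Index time: each document is stemmed according to its own lang value."""
    index = defaultdict(set)
    for doc_id, (lang, text) in docs.items():
        for tok in tokenize(text):
            index[STEMMERS[lang](tok)].add(doc_id)
    return index

def search(index, query, lang):
    """Query time: a single stemmer must be chosen for the whole query."""
    hits = set()
    for tok in tokenize(query):
        hits |= index.get(STEMMERS[lang](tok), set())
    return hits

docs = {
    "d1": ("en", "driving lessons"),
    "d2": ("de", "wir fahren heute"),
}
index = build_index(docs)

for query in ("driving", "fahren"):
    for lang in ("en", "de"):
        print(query, lang, sorted(search(index, query, lang)))
```

In this toy index each query only finds its document when it is stemmed with the same language as that document. Whichever stemmer is chosen for the query, the terms it produces do not line up with documents that were indexed with the other one.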

So I wouldn't hold out too much hope here.

Best
Erick

On Wed, Dec 21, 2011 at 4:09 PM,  <al...@aim.com> wrote:
> Hello,
>
> I would like to know whether, in the latest version of Solr, it is possible to apply the relevant stemmer to each doc depending on its lang field.
> I searched the solr-user mailing list and found this thread:
>
> http://lucene.472066.n3.nabble.com/Multiplexing-TokenFilter-for-multi-language-td3235341.html
>
> but I am not sure whether it was developed into a JIRA ticket.
>
> Thanks.
> Alex.
>
>