Posted to solr-user@lucene.apache.org by bryan rasmussen <ra...@gmail.com> on 2011/04/19 17:15:49 UTC

testing of stemming

Hi,

I have a large number of queries that I want to test stemming on. Is
there a free-standing library I can run them against, without the
overhead of an HTTP request for each one?

Thanks,
Bryan Rasmussen

Re: testing of stemming

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Bryan,

Have a look at page 111 of Lucene in Action 2, section 4.1. Is that the sort of
thing you are after?
If so, we may have the code that produced it in the LIA2 source code
download...

You could also just write a small app/script that calls (via HTTP/SolrJ) one of 
the Solr analysis request handlers - if you look at solrconfig.xml you will see 
them defined there.
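
A minimal sketch of such a script, in Python, assuming a field analysis handler is registered at /analysis/field in solrconfig.xml. The base URL, handler path and the field type name "text" are placeholders, not values from this thread; check your own solrconfig.xml and schema. The JSON layout of the response varies by Solr version, so the sketch returns it unparsed.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed values: adjust to your own Solr install and schema.
SOLR_BASE = "http://localhost:8983/solr"   # placeholder base URL
ANALYSIS_PATH = "/analysis/field"          # handler path as registered in solrconfig.xml
FIELD_TYPE = "text"                        # hypothetical field type name


def build_analysis_url(word, base=SOLR_BASE, path=ANALYSIS_PATH,
                       field_type=FIELD_TYPE):
    """Build the request URL that asks Solr to analyze one word."""
    params = urlencode({
        "analysis.fieldtype": field_type,   # which analyzer chain to run
        "analysis.fieldvalue": word,        # the text to push through it
        "wt": "json",                       # ask for a JSON response
    })
    return f"{base}{path}?{params}"


def analyze(word, **kwargs):
    """Fetch the raw analysis response for one word (needs a running Solr)."""
    with urlopen(build_analysis_url(word, **kwargs)) as resp:
        return json.load(resp)
```

Calling analyze("virksomhed") for each query in a loop gives the per-filter token output, including the stemmed form, without going through a browser.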

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/




Re: testing of stemming

Posted by bryan rasmussen <ra...@gmail.com>.
that looks like a good starting point,

thanks,
bryan rasmussen

2011/4/19 François Schiettecatte <fs...@gmail.com>:
> I would start here:
>
>        http://snowball.tartarus.org/
>
> François

Re: testing of stemming

Posted by François Schiettecatte <fs...@gmail.com>.
I would start here:

	http://snowball.tartarus.org/

François



Re: testing of stemming

Posted by bryan rasmussen <ra...@gmail.com>.
Maybe not a library, but a command-line tool would be good: something
I can script or automate to test that when I search for the word
virksomhed in Danish, the results would also include virksomhederne
and other variations.

I guess I was hoping for something similar to a WordNet of stems...

but at worst I would be fine with checking specifically against my
index - I just didn't necessarily want to automate the browser to do
it as I figured it would be extra performance intensive.
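
A rough sketch of the kind of offline check described above, in Python. The toy_stem function is a hypothetical stand-in that strips a few endings; it is not the Snowball Danish algorithm and not what Solr does. The harness takes any callable that maps a word to its stem, so a real stemmer (for example one from the Snowball project) could be passed in instead.

```python
from collections import defaultdict

# Toy suffix list, longest first. An illustration only, NOT real Danish stemming rules.
TOY_SUFFIXES = ("erne", "er", "en", "e")


def toy_stem(word):
    """Hypothetical stand-in stemmer: strip the first matching suffix."""
    word = word.lower()
    for suffix in TOY_SUFFIXES:
        # Keep at least three characters of the word after stripping.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def same_stem(words, stem=toy_stem):
    """True if every word in the list reduces to one shared stem."""
    return len({stem(w) for w in words}) == 1


def group_by_stem(words, stem=toy_stem):
    """Map each stem to the words that produce it, in input order."""
    groups = defaultdict(list)
    for w in words:
        groups[stem(w)].append(w)
    return dict(groups)
```

With this in place, same_stem(["virksomhed", "virksomhederne"]) tells you whether a query term and a variant would meet at the same stem under the stemmer you plugged in, and group_by_stem over a word list gives a crude "WordNet of stems". The result only reflects your index if the stemmer matches the one configured in your schema.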

Thanks,
Bryan Rasmussen



On Tue, Apr 19, 2011 at 5:19 PM, Erick Erickson <er...@gmail.com> wrote:
> I'm not sure what a "free standing library" would look like. Do you
> want it to check that all the terms in your index are stemmed
> correctly (or at least as expected)?
>
> You have a bunch of queries. How would such a library test them
> against your corpus?
>
> There's not enough information here to give a meaningful answer....
>
> Best
> Erick

Re: testing of stemming

Posted by Erick Erickson <er...@gmail.com>.
I'm not sure what a "free standing library" would look like. Do you
want it to check that all the terms in your index are stemmed
correctly (or at least as expected)?

You have a bunch of queries. How would such a library test them
against your corpus?

There's not enough information here to give a meaningful answer....

Best
Erick
