Posted to java-user@lucene.apache.org by Ganesh <em...@yahoo.co.in> on 2009/03/06 07:44:42 UTC

Questions about analyzer

Hello all

1)
Which is better to use, the Snowball analyzer or a Lucene contrib analyzer? Is
there no built-in stop word list for the Snowball analyzer?

2)
Are Analyzer and QueryParser thread-safe? Can they be created once and
used from as many threads as needed?

3)
I am using the Snowball analyzer for indexing and searching. When I search for
windows AND vista, QueryParser adds AND as part of the search, but I am
expecting something like +windows +vista.

Regards
Ganesh 

Send instant messages to your online friends http://in.messenger.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Questions about analyzer

Posted by Michael McCandless <lu...@mikemccandless.com>.
Ganesh wrote:

> Mike, in one of his replies to the thread "Faceted search using Lucene",
> gave the following code review comment:
>
> * You are creating a new Analyzer & QueryParser every time, also
> creating unnecessary garbage; instead, they should be created once
> & reused.
>
> This made me ask the question below.
>
> Are QueryParser and Analyzer thread-safe?

Actually QueryParser is *not* thread-safe, but Lucene's core analyzers  
are.

Mike
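
The distinction above suggests a common pattern: share whatever is
thread-safe, and give each thread its own instance of anything that is not.
Below is a minimal sketch of that pattern in plain Java. StatefulParser and
ParserPerThread are hypothetical stand-ins invented for illustration; they
are not Lucene classes, and the real QueryParser API is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a parser that keeps mutable state between calls,
// which makes a single instance unsafe to share. This is NOT Lucene's QueryParser.
class StatefulParser {
    private final StringBuilder buffer = new StringBuilder(); // per-instance scratch state

    // Lowercases the terms and joins them with single spaces.
    String parse(String input) {
        buffer.setLength(0);
        for (String term : input.trim().split("\\s+")) {
            if (buffer.length() > 0) {
                buffer.append(' ');
            }
            buffer.append(term.toLowerCase(Locale.ROOT));
        }
        return buffer.toString();
    }
}

class ParserPerThread {
    // Each thread lazily creates one parser and reuses it for every later call,
    // so no parser is ever touched by two threads and none is rebuilt per query.
    private static final ThreadLocal<StatefulParser> PARSER =
            ThreadLocal.withInitial(StatefulParser::new);

    static String parse(String input) {
        return PARSER.get().parse(input);
    }

    // Submits the same input as several tasks to a small pool and collects the results.
    static List<String> parseConcurrently(String input, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                Callable<String> job = () -> parse(input);
                futures.add(pool.submit(job));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off in this sketch is one parser per thread rather than one per
query: the garbage from constant re-creation goes away without sharing a
non-thread-safe object. An alternative is to keep creating the
non-thread-safe object per request and share only the thread-safe one.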



Re: Questions about analyzer

Posted by Ganesh <em...@yahoo.co.in>.
Erick,

I got your reply, but I asked one more question.

Mike, in one of his replies to the thread "Faceted search using Lucene", gave the
following code review comment:

 * You are creating a new Analyzer & QueryParser every time, also
   creating unnecessary garbage; instead, they should be created once
   & reused.

This made me ask the question below.

Are QueryParser and Analyzer thread-safe?

Regards
Ganesh



Re: Questions about analyzer

Posted by Erick Erickson <er...@gmail.com>.
Yes, I replied 4 days ago. Is your spam filter interfering?

On Tue, Mar 10, 2009 at 8:35 AM, Ganesh <em...@yahoo.co.in> wrote:

> Any reply on this?

Re: Questions about analyzer

Posted by Ganesh <em...@yahoo.co.in>.
Any reply on this?



Re: Questions about analyzer

Posted by Ganesh <em...@yahoo.co.in>.
Mike, in one of his replies to the thread "Faceted search using Lucene", gave the
following code review comment:

   * You are creating a new Analyzer & QueryParser every time, also
     creating unnecessary garbage; instead, they should be created once
     & reused.

This made me ask the questions below.
Are QueryParser and Analyzer thread-safe?

Based on the application's default language, I need to pick an analyzer. A few
analyzers (Russian, French) are available in both Contrib and Snowball.
Which is more reliable?

Regards
Ganesh




Re: Questions about analyzer

Posted by Erick Erickson <er...@gmail.com>.
See below
On Fri, Mar 6, 2009 at 1:44 AM, Ganesh <em...@yahoo.co.in> wrote:

> Hello all
>
> 1)
> Which is better to use, the Snowball analyzer or a Lucene contrib analyzer? Is
> there no built-in stop word list for the Snowball analyzer?
>

What is the "Lucene contrib analyzer"? There are 12 of them......
And regardless, the answer is "It depends" on what
you're trying to accomplish, which you haven't stated.

I don't know if there's a default set of stop words, but this
seems like a pretty easy thing to test for yourself.


>
> 2)
> Are Analyzer and QueryParser thread-safe? Can they be created once and
> used from as many threads as needed?
>
From the FAQ
(http://wiki.apache.org/lucene-java/LuceneFAQ#head-42833b3bb259e10c64424a892a3f04840a187d80):

Is the QueryParser thread-safe?

No, it's not.


I'm unsure about analyzers, but they are so lightweight that, just to be
safe, I wouldn't go to the trouble of sharing them among threads.



>
> 3)
> I am using the Snowball analyzer for indexing and searching. When I search for
> windows AND vista, QueryParser adds AND as part of the search, but I am
> expecting something like +windows +vista.
>

How do you know this? Is this the result of query.toString()? And is
the "AND" *in your query* uppercase? One of the quirks of Lucene
is that it requires the operators AND, OR and NOT to be uppercase;
lowercase 'and', 'or' and 'not' are not interpreted as operators.

Best
Erick
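
To make the uppercase point concrete, here is a toy sketch of case-sensitive
operator recognition. ToyBooleanQuery is a hypothetical mini-parser written
only for this illustration; it is not Lucene's QueryParser and handles
nothing beyond a bare AND between whitespace-separated terms.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only: a hypothetical mini-parser, NOT Lucene's QueryParser.
class ToyBooleanQuery {
    // Rewrites "a AND b" as "+a +b". Every other token, including lowercase
    // "and", is kept as an ordinary optional term.
    static String rewrite(String query) {
        List<String> terms = new ArrayList<>();
        List<Boolean> required = new ArrayList<>();
        boolean pendingAnd = false;
        for (String token : query.trim().split("\\s+")) {
            if (token.equals("AND") && !terms.isEmpty()) {   // case-sensitive match
                required.set(required.size() - 1, true);       // left operand becomes required
                pendingAnd = true;
            } else {
                terms.add(token);
                required.add(pendingAnd);                      // right operand of a pending AND
                pendingAnd = false;
            }
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < terms.size(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            if (required.get(i)) {
                out.append('+');
            }
            out.append(terms.get(i));
        }
        return out.toString();
    }
}
```

With this sketch, rewrite("windows AND vista") returns "+windows +vista",
while rewrite("windows and vista") returns "windows and vista", where "and"
is just another term. In a real setup the analyzer would process that term
afterwards, so what happens to it depends on the analyzer in use.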

