Posted to java-user@lucene.apache.org by webspeak <we...@hotmail.com> on 2007/12/27 15:33:59 UTC

Search by KeyWord, the best practice

Hello,

I would like to search documents by "CUSTOMER".
So I search on the field "CUSTOMER" using a KeywordAnalyzer.

The CUSTOMER field is indexed with these parameters:
Field.Index.UN_TOKENIZED
Field.Store.YES

Is this the best practice?

-- 
View this message in context: http://www.nabble.com/Search-by-KeyWord%2C-the-best-practice-tp14513720p14513720.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Search by KeyWord, the best practice

Posted by Grant Ingersoll <gs...@apache.org>.
It depends on whether you want fuzzy matches on CUSTOMER or not.
Assuming this value contains things like first and last name, I would
think you would want to tokenize so that you can search for those
separately. If it truly contains something that is a single token,
then this should be fine. You only need to store the field if you want
to recover the value afterward, for display or something like that.
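
The difference between indexing a value and storing it can be sketched with a small in-memory model. This is only an illustration in plain Java, not Lucene code: the class StoredFieldSketch and its methods add, search, and storedValue are hypothetical names invented for this sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model (not the Lucene API): "indexed" makes a value searchable,
// "stored" makes the original value retrievable from a hit.
public class StoredFieldSketch {
    // Indexed side: whole field value -> ids of documents containing it.
    private final Map<String, List<Integer>> terms = new HashMap<>();
    // Stored side: document id -> original value, kept only on request.
    private final Map<Integer, String> stored = new HashMap<>();
    private int nextDocId = 0;

    // Always indexes the value; stores it only when 'store' is true.
    public int add(String customer, boolean store) {
        int docId = nextDocId++;
        terms.computeIfAbsent(customer, k -> new ArrayList<>()).add(docId);
        if (store) {
            stored.put(docId, customer);
        }
        return docId;
    }

    // Exact lookup on the indexed value; works whether or not it was stored.
    public List<Integer> search(String customer) {
        return terms.getOrDefault(customer, Collections.emptyList());
    }

    // Returns null when the value was indexed but not stored.
    public String storedValue(int docId) {
        return stored.get(docId);
    }

    public static void main(String[] args) {
        StoredFieldSketch index = new StoredFieldSketch();
        int withStore = index.add("ACME", true);
        int withoutStore = index.add("ACME", false);
        System.out.println(index.search("ACME"));            // both documents are found
        System.out.println(index.storedValue(withStore));    // original value is available
        System.out.println(index.storedValue(withoutStore)); // null: searchable, not retrievable
    }
}
```

Both documents are found by the search, but only the one added with store set to true can hand back the customer value for display.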

-Grant

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ






Re: Search by KeyWord, the best practice

Posted by Erick Erickson <er...@gmail.com>.
As long as you control both ends (i.e. what's indexed and what's searched)
then UN_TOKENIZED is fine. Note that case has to match, etc.

As an added benefit, you can sort by the field too...
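
One way to "control both ends" is to run the same normalization on the value before indexing it and on the value picked in the dropdown before searching. The sketch below models that in plain Java rather than with Lucene itself; the class ExactMatchSketch and the helper normalize are hypothetical names, and trimming plus lowercasing is an assumption about what normalization an application might want.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an untokenized field: the whole value is one key.
public class ExactMatchSketch {
    // Assumed normalization, applied identically at index time and query time.
    public static String normalize(String customer) {
        return customer.trim().toLowerCase(Locale.ROOT);
    }

    // Maps each normalized value to the ids of the documents that carry it.
    // A TreeMap keeps the keys in sorted order, which is easy here because
    // every document contributes exactly one key for this field.
    public static Map<String, List<Integer>> buildIndex(List<String> customers) {
        Map<String, List<Integer>> index = new TreeMap<>();
        for (int docId = 0; docId < customers.size(); docId++) {
            String key = normalize(customers.get(docId));
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(docId);
        }
        return index;
    }

    // Exact lookup of the selected value after the same normalization.
    public static List<Integer> search(Map<String, List<Integer>> index, String selected) {
        return index.getOrDefault(normalize(selected), Collections.emptyList());
    }

    public static void main(String[] args) {
        List<String> customers = Arrays.asList("Acme Corp", "Beta Ltd", "acme corp ");
        Map<String, List<Integer>> index = buildIndex(customers);
        System.out.println(search(index, "ACME CORP")); // prints [0, 2]
        System.out.println(index.keySet());             // prints [acme corp, beta ltd]
    }
}
```

Without the shared normalize step, "ACME CORP" would not find "Acme Corp", which is the case-matching caveat above.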

Best
Erick


Re: Search by KeyWord, the best practice

Posted by webspeak <we...@hotmail.com>.
Hello,

Thank you for your reply :-)
The customer value will be chosen from a dropdown list. The selected
value must match the value in the CUSTOMER field.

I don't think I need to tokenize it, since it is an exact match.





-- 
View this message in context: http://www.nabble.com/Search-by-KeyWord%2C-the-best-practice-tp14513720p14514446.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.




Re: Search by KeyWord, the best practice

Posted by Erick Erickson <er...@gmail.com>.
Well, it depends upon what you want to accomplish. By indexing
UN_TOKENIZED, the text is NOT broken up. So indexing
"some text" will not match if you search on "some", or "text", or
even "text some". Only a search for exactly "some text" will match.
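
The "some text" behavior can be modeled in a few lines of plain Java. This is a simplified sketch, not Lucene code: the class TokenizationSketch and its methods are hypothetical, and splitting on whitespace stands in for whatever a real analyzer would do.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of which terms end up in the index for one field value.
public class TokenizationSketch {
    // Untokenized: the whole field value is indexed as a single term.
    public static Set<String> indexUntokenized(String value) {
        return Collections.singleton(value);
    }

    // Tokenized (simplified): each whitespace-separated word becomes a term.
    public static Set<String> indexTokenized(String value) {
        return new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
    }

    // A single-term lookup succeeds only if that exact term was indexed.
    public static boolean matches(Set<String> terms, String queryTerm) {
        return terms.contains(queryTerm);
    }

    public static void main(String[] args) {
        Set<String> untokenized = indexUntokenized("some text");
        Set<String> tokenized = indexTokenized("some text");
        System.out.println(matches(untokenized, "some"));      // prints false
        System.out.println(matches(untokenized, "text some")); // prints false
        System.out.println(matches(untokenized, "some text")); // prints true
        System.out.println(matches(tokenized, "some"));        // prints true
    }
}
```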

You really, really, really need to tell us what it is you want to
accomplish before anyone can suggest best practices. What's
the use case you're trying to support?

Best
Erick
