Posted to solr-user@lucene.apache.org by Jamie Johnson <je...@gmail.com> on 2011/07/20 17:57:01 UTC

Tokenizer Question

I have a query which starts out with something like name:"john", and I
need to expand this to something like name:("john" "johnny").  I've
implemented a custom tokenizer which gets close but isn't quite right:
it outputs name:"john johnny".  Is there a simple example of doing
what I'm attempting?

Re: Tokenizer Question

Posted by Jamie Johnson <je...@gmail.com>.
Thanks, I'll try that now. I'm assuming I need to add the position
increment and offset attributes?
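
To make the question concrete, here is a minimal plain-Java sketch of what "stacking" an extra term on the same position means. It is a simplified model under stated assumptions, not Lucene code: Tok, expand and the variants map are hypothetical names invented for illustration. In a real Lucene TokenFilter the corresponding values would, as far as I know, be set through PositionIncrementAttribute.setPositionIncrement(0) and OffsetAttribute.setOffset(start, end) on the injected token.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Simplified model of token stacking. None of these classes are Lucene's.
class StackedTokenSketch {

    /** A term with its position increment and character offsets. */
    static final class Tok {
        final String term;
        final int posInc; // 0 means "same position as the previous token"
        final int start;
        final int end;

        Tok(String term, int posInc, int start, int end) {
            this.term = term;
            this.posInc = posInc;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Emits every input token unchanged, followed by its variants.
     * Each variant gets position increment 0 and reuses the offsets of
     * the original token, so it occupies the same position.
     */
    static List<Tok> expand(List<Tok> in, Map<String, List<String>> variants) {
        List<Tok> out = new ArrayList<>();
        for (Tok t : in) {
            out.add(t);
            List<String> extra = variants.getOrDefault(t.term, Collections.<String>emptyList());
            for (String v : extra) {
                out.add(new Tok(v, 0, t.start, t.end));
            }
        }
        return out;
    }
}
```

With the single input token john (increment 1, offsets 0-4) and a variants map of john -> [johnny], expand returns john followed by johnny with increment 0 and the same offsets.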

On Wed, Jul 20, 2011 at 3:44 PM, Chris Hostetter
<ho...@fucit.org> wrote:
>
> When the QueryParser gives hunks of text to an analyzer, and that analyzer
> produces multiple terms, the query parser has to decide how to build a
> query out of it.
>
> If the terms have identical position information, then it always builds an
> "OR" query (this is the typical synonym situation).  If the terms have
> differing positions, then the behavior is driven by the
> autoGeneratePhraseQueries attribute of the FieldType -- the default value
> of this depends on the version attribute of your top level <schema/> tag.
>
>
> : I have a query which starts out with something like name:"john", and I
> : need to expand this to something like name:("john" "johnny").  I've
> : implemented a custom tokenizer which gets close but isn't quite right:
> : it outputs name:"john johnny".  Is there a simple example of doing
> : what I'm attempting?
> :
>
> -Hoss
>

Re: Tokenizer Question

Posted by Chris Hostetter <ho...@fucit.org>.
When the QueryParser gives hunks of text to an analyzer, and that analyzer 
produces multiple terms, the query parser has to decide how to build a 
query out of it.

If the terms have identical position information, then it always builds an
"OR" query (this is the typical synonym situation).  If the terms have 
differing positions, then the behavior is driven by the 
autoGeneratePhraseQueries attribute of the FieldType -- the default value 
of this depends on the version attribute of your top level <schema/> tag.
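
The rule above can be sketched as a small self-contained Java model. This is only an illustration of the decision as described in this message, not the actual QueryParser source: Tok and build are hypothetical names, the output is a plain string rather than a Query object, and the sketch ignores explicit quoting in the query string and the case where some positions hold several terms and others hold one.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the decision described above. Not Lucene's classes.
class QueryShapeSketch {

    /** A term plus its position increment relative to the previous token. */
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    static String build(String field, List<Tok> toks, boolean autoGeneratePhraseQueries) {
        if (toks.isEmpty()) {
            return "";
        }
        // Group terms by position: an increment of 0 stays on the current position.
        List<List<String>> positions = new ArrayList<>();
        for (Tok t : toks) {
            if (t.posInc > 0 || positions.isEmpty()) {
                positions.add(new ArrayList<String>());
            }
            positions.get(positions.size() - 1).add(t.term);
        }
        if (positions.size() == 1) {
            List<String> terms = positions.get(0);
            if (terms.size() == 1) {
                return field + ":" + terms.get(0);
            }
            // Identical positions: the synonym case, always an OR query.
            return field + ":(" + String.join(" OR ", terms) + ")";
        }
        // Differing positions: the field type's flag decides.
        List<String> flat = new ArrayList<>();
        for (List<String> p : positions) {
            flat.addAll(p);
        }
        if (autoGeneratePhraseQueries) {
            return field + ":\"" + String.join(" ", flat) + "\"";
        }
        // Separate clauses; how they combine depends on the default operator.
        return field + ":(" + String.join(" ", flat) + ")";
    }
}
```

In this model, the tokens john (increment 1) and johnny (increment 0) give name:(john OR johnny), while john (1) and johnny (1) give name:"john johnny" when the flag is true and name:(john johnny) when it is false.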


: I have a query which starts out with something like name:"john", and I
: need to expand this to something like name:("john" "johnny").  I've
: implemented a custom tokenizer which gets close but isn't quite right:
: it outputs name:"john johnny".  Is there a simple example of doing
: what I'm attempting?
: 

-Hoss

Re: Tokenizer Question

Posted by Jamie Johnson <je...@gmail.com>.
My use case really isn't names, I just used that as a simplification.
I did look at the Synonym filter to see if I could implement a similar
filter (if that was a more appropriate place to do so) but even after
doing that I ended up with the same result.

On Wed, Jul 20, 2011 at 12:07 PM, Kyle Lee <ra...@gmail.com> wrote:
> I'm not sure how to accomplish what you're asking, but have you considered
> using a synonyms file? This would also allow you to catch ostensibly
> unrelated name substitutes such as Robert -> Bob and Richard -> Dick.
>
> On Wed, Jul 20, 2011 at 10:57 AM, Jamie Johnson <je...@gmail.com> wrote:
>
>> I have a query which starts out with something like name:"john", and I
>> need to expand this to something like name:("john" "johnny").  I've
>> implemented a custom tokenizer which gets close but isn't quite right:
>> it outputs name:"john johnny".  Is there a simple example of doing
>> what I'm attempting?
>>
>

Re: Tokenizer Question

Posted by Kyle Lee <ra...@gmail.com>.
I'm not sure how to accomplish what you're asking, but have you considered
using a synonyms file? This would also allow you to catch ostensibly
unrelated name substitutes such as Robert -> Bob and Richard -> Dick.

On Wed, Jul 20, 2011 at 10:57 AM, Jamie Johnson <je...@gmail.com> wrote:

> I have a query which starts out with something like name:"john", and I
> need to expand this to something like name:("john" "johnny").  I've
> implemented a custom tokenizer which gets close but isn't quite right:
> it outputs name:"john johnny".  Is there a simple example of doing
> what I'm attempting?
>