Posted to java-user@lucene.apache.org by JMA <mr...@comcast.net> on 2005/10/24 10:46:01 UTC

Frustrated with tokenized listing terms

Greetings...
Quick question, perhaps I am missing something.

I have a bunch of documents where one of the indexed fields is "author". For
example:

book1, by "John Smith"
book2, by "Steve Smith"
book3, by "John Smith"

I would like to find all distinct authors in my index.  I want to support
searches for author:smith, so I tokenize the author field at index time.
However, getTerms() then returns:

John (x2)
Smith (x3)
Steve (x1)

I would like to see:
John Smith (x2)
Steve Smith (x1)

I've solved this by indexing the field twice,
once as author:(searchable/not stored/tokenized)
and once as author_phrased:(not searchable/stored/not tokenized).

Then I query using the 'author' field while listing terms using the
'author_phrased' field.

This works, but is it the proper way to do it?

Thanks in advance,

JMA



---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Frustrated with tokenized listing terms

Posted by Chris Hostetter <ho...@fucit.org>.
: I've solved this by indexing the field twice,
: once as author:(searchable/not stored/tokenized)
: and once as author_phrased:(not searchable/stored/not tokenized).

: This works, but is it the proper way to do it?

It's the most effective/efficient method I can think of.
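
For illustration only, the two-field idea can be sketched in plain Java
without any Lucene calls. The class and method names below
(AuthorTermsSketch, tokenizedTerms, phrasedTerms, search) are hypothetical
stand-ins and are not part of the Lucene API; the sketch only shows why a
tokenized field yields per-word terms while an untokenized copy yields
whole-name terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the two-field approach; this is NOT the Lucene API.
class AuthorTermsSketch {

    // Mimics the tokenized "author" field: every word becomes its own term.
    static Map<String, Integer> tokenizedTerms(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            for (String token : author.trim().split("\\s+")) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Mimics the untokenized "author_phrased" field: the whole value is one term.
    static Map<String, Integer> phrasedTerms(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            counts.merge(author.trim(), 1, Integer::sum);
        }
        return counts;
    }

    // Mimics a search such as author:smith against the tokenized field.
    // Returns the positions of the matching entries (case-insensitive match).
    static List<Integer> search(List<String> authors, String word) {
        List<Integer> hits = new ArrayList<>();
        String wanted = word.toLowerCase(Locale.ROOT);
        for (int i = 0; i < authors.size(); i++) {
            for (String token : authors.get(i).trim().split("\\s+")) {
                if (token.toLowerCase(Locale.ROOT).equals(wanted)) {
                    hits.add(i);
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // book1, book2, book3 from the original question
        List<String> authors = Arrays.asList("John Smith", "Steve Smith", "John Smith");
        System.out.println(tokenizedTerms(authors)); // {John=2, Smith=3, Steve=1}
        System.out.println(phrasedTerms(authors));   // {John Smith=2, Steve Smith=1}
        System.out.println(search(authors, "smith")); // [0, 1, 2]
    }
}
```

Searching goes through the tokenized view, while the distinct-author
listing is read from the untokenized view, which matches the split between
'author' and 'author_phrased' described in the question.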



-Hoss

