Posted to java-user@lucene.apache.org by JMA <mr...@comcast.net> on 2005/10/24 10:46:01 UTC
Frustrated with tokenized listing terms
Greetings...
Quick question, perhaps I am missing something.
I have a bunch of documents where one of the indexed fields is "author". For
example:
book1, by "John Smith"
book2, by "Steve Smith"
book3, by "John Smith"
I would like to find all distinct authors in my index. I want to support
searches for author:smith, so I tokenize the author field during indexing.
However, getTerms() then returns:
John (x2)
Smith (x3)
Steve (x1)
I would like to see:
John Smith (x2)
Steve Smith (x1)
I've solved this by indexing the field twice:
once as author:(searchable/not stored/tokenized)
and once as author_phrased:(not searchable/stored/not tokenized).
Then I query using the 'author' field while listing terms using the
'author_phrased' field.
This works, but is it the proper way to do it?
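For illustration, here is a minimal sketch of why the two-field approach gives the listing I want. It is plain Java and does not use the Lucene API at all: the class and method names are made up for this example, and the whitespace split plus lower-casing is only an assumption about what a typical analyzer would do to the tokenized field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical model of the two fields; NOT Lucene code.
public class AuthorTermsSketch {

    // Assumed analyzer behavior: split on whitespace and lower-case each word.
    public static List<String> tokenize(String value) {
        List<String> terms = new ArrayList<>();
        for (String word : value.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                terms.add(word.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Term -> number of documents, as seen in the tokenized "author" field.
    public static Map<String, Integer> tokenizedTerms(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            // Use a set so each term is counted once per document.
            for (String term : new TreeSet<>(tokenize(author))) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Term -> number of documents, as seen in the untokenized
    // "author_phrased" field: the whole value is a single term.
    public static Map<String, Integer> phrasedTerms(List<String> authors) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String author : authors) {
            counts.merge(author, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> authors =
                Arrays.asList("John Smith", "Steve Smith", "John Smith");
        System.out.println(tokenizedTerms(authors)); // {john=2, smith=3, steve=1}
        System.out.println(phrasedTerms(authors));   // {John Smith=2, Steve Smith=1}
    }
}
```

The tokenized map is what makes author:smith match all three books, while the phrased map is the one that yields distinct full names.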
Thanks in advance,
JMA
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Frustrated with tokenized listing terms
Posted by Chris Hostetter <ho...@fucit.org>.
: I've solved this by indexing the field twice, once as author:(searchable/not
: stored/tokenized)
: and once as author_phrased:(not searchable/stored/not tokenized).
: This works, but is it the proper way to do it?
It's the most effective/efficient method I can think of.
-Hoss