You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by David Black <bl...@apple.com> on 2004/04/01 21:48:39 UTC
Nested category strategy
Hey All,
I'm trying to figure out the best approach to something.
Each document I index has an array of categories which looks like the
following example....
/Science/Medicine/Serology/blood gas
/Biology/Fluids/Blood/
etc.
Anyway, there's a couple things I'm trying to deal with.
1. The fact that we have an undefined array size. I can't just shove
these into a single field. I could explode them into multiple fields
on the fly like category_1, category_2. etc. etc
2. The fact that a search will need to be performed like " category:
/Science/Medicine/*" would need to return all items within that
category.
Thanks in advance to anyone who can give me some help here.
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
Re: Nested category strategy
Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Apr 1, 2004, at 2:48 PM, David Black wrote:
> Each document I index has an array of categories which looks like the
> following example....
>
> /Science/Medicine/Serology/blood gas
> /Biology/Fluids/Blood/
>
> etc.
>
> Anyway, there's a couple things I'm trying to deal with.
>
> 1. The fact that we have an undefined array size. I can't just shove
> these into a single field. I could explode them into multiple fields
> on the fly like category_1, category_2. etc. etc
Yes, you could use a single field.... and add them individually.
doc.add(Field.Keyword("category", category1))
doc.add(Field.Keyword("category", category2))
Or am I missing something about your scenario?
> 2. The fact that a search will need to be performed like " category:
> /Science/Medicine/*" would need to return all items within that
> category.
Keep in mind the issues you may encounter using QueryParser with such a
field selector. Choose your analyzer(s) carefully!
Erik
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
RE: Nested category strategy
Posted by Stephane James Vaucher <va...@cirano.qc.ca>.
Another possibility is to add all combinations in a single field.
addField("category", "/Science/");
addField("category", "/Science/Medicine");
addField("category", "/Science/Foo");
addField("category", "/Biology");
Your wildcard search should work, and you shouldn't have the problem with
a search "/Science/*".
HTH,
sv
On Thu, 1 Apr 2004, Tate Avery wrote:
>
> Could you put them all into a tab-delimited string and store that as a
> single field, then use a TabTokenizer on the field to search?
>
> And, if you need to, do a .split("\t") on the field value in order to break
> them back up into individual categories.
>
>
>
>
> -----Original Message-----
> From: David Black [mailto:black@apple.com]
> Sent: Thursday, April 01, 2004 2:49 PM
> To: lucene-user@jakarta.apache.org
> Subject: Nested category strategy
>
>
> Hey All,
>
> I'm trying to figure out the best approach to something.
>
> Each document I index has an array of categories which looks like the
> following example....
>
> /Science/Medicine/Serology/blood gas
> /Biology/Fluids/Blood/
>
> etc.
>
> Anyway, there's a couple things I'm trying to deal with.
>
> 1. The fact that we have an undefined array size. I can't just shove
> these into a single field. I could explode them into multiple fields
> on the fly like category_1, category_2. etc. etc
>
> 2. The fact that a search will need to be performed like " category:
> /Science/Medicine/*" would need to return all items within that
> category.
>
> Thanks in advance to anyone who can give me some help here.
>
> Thanks
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
RE: Nested category strategy
Posted by Tate Avery <ta...@nstein.com>.
Could you put them all into a tab-delimited string and store that as a
single field, then use a TabTokenizer on the field to search?
And, if you need to, do a .split("\t") on the field value in order to break
them back up into individual categories.
-----Original Message-----
From: David Black [mailto:black@apple.com]
Sent: Thursday, April 01, 2004 2:49 PM
To: lucene-user@jakarta.apache.org
Subject: Nested category strategy
Hey All,
I'm trying to figure out the best approach to something.
Each document I index has an array of categories which looks like the
following example....
/Science/Medicine/Serology/blood gas
/Biology/Fluids/Blood/
etc.
Anyway, there's a couple things I'm trying to deal with.
1. The fact that we have an undefined array size. I can't just shove
these into a single field. I could explode them into multiple fields
on the fly like category_1, category_2. etc. etc
2. The fact that a search will need to be performed like " category:
/Science/Medicine/*" would need to return all items within that
category.
Thanks in advance to anyone who can give me some help here.
Thanks
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org