Posted to solr-user@lucene.apache.org by Geoffrey Young <ge...@modperlcookbook.org> on 2008/02/22 15:19:03 UTC

multiple "things" in a document

hi all :)

I'm just getting up to speed with solr (and lucene, for that matter) for 
a new project.  after reading through the available docs I'm not finding 
an answer to my most basic (newbie, certainly) question.  please feel 
free to just point me to the proper doc :)

this isn't my actual use case, but it's close enough for general 
understanding... say I want to store data on a collection of SKUs which 
(for the unfamiliar :) are a combination of item + location.  so we 
might have

   sku
     id
     name
     item
     location

   item
     id
     name

   location
     id
     name

all of the schema.xml examples seem to deal with just a flat "thing" 
perhaps with multiple entries of the same field.  what I'm after is how 
to represent this kind of relationship in the schema, such that I can 
limit my result set to, say, a sku or item, but if I search on sku I can 
discriminate between the sku name and the item name in my results.

from my reading on lucene this is pretty basic stuff, but I don't see 
how the solr layer approaches this at all.  again, doc pointers much 
appreciated.

thanks for listening :)

--Geoff

RE: multiple "things" in a document

Posted by Will Johnson <wi...@gmail.com>.
Usually you do something like this (assuming the data lives in an RDBMS):

SELECT sku.id as skuid, sku.name as skuname, item.name as itemname,
location.name as locationname 
FROM sku, item, location
WHERE sku.item = item.id AND sku.location = location.id

Then you can search on any part of the 'flat' record and know which field
comes from where.
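
As a rough illustration of that flattening step, here is a minimal sketch
in Python that runs the same join against an in-memory SQLite database and
builds one flat record per SKU. The table contents, the function names
(flatten_skus, build_sample_db) and the use of SQLite are all assumptions
made for the example; nothing here is Solr-specific, and how the records
get posted to Solr is left out.

```python
import sqlite3


def build_sample_db():
    """Create a throwaway database with made-up sku/item/location rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE sku (
            id INTEGER PRIMARY KEY, name TEXT, item INTEGER, location INTEGER
        );
        INSERT INTO item VALUES (1, 'widget');
        INSERT INTO location VALUES (1, 'boston');
        INSERT INTO location VALUES (2, 'denver');
        INSERT INTO sku VALUES (10, 'widget-boston', 1, 1);
        INSERT INTO sku VALUES (11, 'widget-denver', 1, 2);
        """
    )
    return conn


def flatten_skus(conn):
    """Run the join from the message above; return one flat dict per SKU."""
    conn.row_factory = sqlite3.Row  # lets each row be read by column name
    rows = conn.execute(
        """
        SELECT sku.id AS skuid, sku.name AS skuname,
               item.name AS itemname, location.name AS locationname
        FROM sku, item, location
        WHERE sku.item = item.id AND sku.location = location.id
        ORDER BY sku.id
        """
    )
    return [dict(row) for row in rows]


docs = flatten_skus(build_sample_db())
for doc in docs:
    print(doc)
```

Each dict maps onto one flat document, so the SKU name and the item name
end up in separate fields (skuname vs. itemname) and stay distinguishable
in the results.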

Depending on the size of your corpus and the type of queries you want to be
able to serve, there are a million ways to optimize this, but this should get
you up and searching quickly enough.

- will


