Posted to solr-user@lucene.apache.org by Geoffrey Young <ge...@modperlcookbook.org> on 2008/02/22 15:19:03 UTC
multiple "things" in a document
hi all :)
I'm just getting up to speed with solr (and lucene, for that matter) for
a new project. after reading through the available docs I'm not finding
an answer to my most basic (newbie, certainly) question. please feel
free to just point me to the proper doc :)
this isn't my actual use case, but it's close enough for general
understanding... say I want to store data on a collection of SKUs which
(for the unfamiliar :) are a combination of item + location. so we
might have
sku
  id
  name
  item
  location
item
  id
  name
location
  id
  name
all of the schema.xml examples seem to deal with just a flat "thing"
perhaps with multiple entries of the same field. what I'm after is how
to represent this kind of relationship in the schema, such that I can
limit my result set to, say, a sku or item, but if I search on sku I can
discriminate between the sku name and the item name in my results.
from my reading on lucene this is pretty basic stuff, but I don't see
how the solr layer approaches this at all. again, doc pointers much
appreciated.
thanks for listening :)
--Geoff
RE: multiple "things" in a document
Posted by Will Johnson <wi...@gmail.com>.
Usually you do something like this (assuming the data lives in an RDBMS):
  SELECT sku.id AS skuid, sku.name AS skuname, item.name AS itemname,
         location.name AS locationname
  FROM sku, item, location
  WHERE sku.item = item.id AND sku.location = location.id
Then you can search on any part of the 'flat' record and know which field
comes from where.
Depending on the size of your corpus and the type of queries you want to be
able to serve, there are a million ways to optimize this, but this should get
you up and searching quickly enough.
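As a rough illustration of the flattening step, here is a minimal Python sketch that runs the same join and produces one flat record per SKU. The table layout follows the sku/item/location outline from the question; the in-memory SQLite database, the sample rows, and the build_flat_docs helper are assumptions made up for the example, not anything Solr provides.

```python
import sqlite3


def build_flat_docs(conn):
    """Join sku, item and location into one flat dict per SKU (hypothetical helper)."""
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict by column name
    rows = conn.execute(
        """
        SELECT sku.id AS skuid, sku.name AS skuname,
               item.name AS itemname, location.name AS locationname
        FROM sku
        JOIN item ON sku.item = item.id
        JOIN location ON sku.location = location.id
        ORDER BY sku.id
        """
    )
    return [dict(row) for row in rows]


# Made-up sample data, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sku (id INTEGER PRIMARY KEY, name TEXT,
                      item INTEGER, location INTEGER);
    INSERT INTO item VALUES (1, 'widget');
    INSERT INTO location VALUES (1, 'boston');
    INSERT INTO sku VALUES (10, 'widget-boston', 1, 1);
    """
)

docs = build_flat_docs(conn)
print(docs)
```

Each dict keeps the sku name and the item name under separate keys (skuname, itemname), so if every record is indexed as one document with matching field names, a search result can still tell the two apart.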
- will