Posted to solr-user@lucene.apache.org by Mike Krieger <mi...@instagram.com> on 2011/09/15 04:56:04 UTC

Complex Fields, Indexing & Storing

Hi all,

I have a quick question about how complex fields (subFields etc) interact with indexing and storage.

Let's say I have this (partial) schema:

<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
…
<field name="pnt" type="location" indexed="true" stored="true" required="true" />
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>


(slightly modified from the example/ project).

If my goals were to:

1) Always query the location by "lat,lng" (never the subfields directly)
2) Never need to pull out the 'pnt' value from results (I always just need the document's ID back, not its contents)
3) Not have any unnecessary indexes

Is there anything that I should change about the set-up above? Will the schema as it stands index both the two dynamic fields that get created, as well as the 'pnt' field above them (in effect creating 3 indexes)? Or is the "indexed='true'" on the 'pnt' field just needed so it properly connects to the two indexes created for the dynamic fields?

Thanks in advance,
Mike

Re: Complex Fields, Indexing & Storing

Posted by Erick Erickson <er...@gmail.com>.
Let's quickly review the difference between indexed and stored.

Indexing a field means you can search on it.

Stored means an entirely separate copy of the raw input is placed
in a particular file in your index (*.fdt to be exact).

These two settings are entirely orthogonal. In effect, when you
search on some terms, indexed="true" is all that is
necessary.

When you get a document from your index, or specify fl=field,
the data is fetched if and only if stored="true".
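
As a rough illustration of how the two flags combine, here is a
minimal sketch. The field names (title_text, sku_display, sku_exact)
and the "text"/"string" types are hypothetical and are not part of
the schema in the question.

```xml
<!-- Hypothetical fields, for illustration only -->

<!-- Searchable, but the original value cannot be returned via fl -->
<field name="title_text" type="text" indexed="true" stored="false"/>

<!-- Returned via fl, but not searchable -->
<field name="sku_display" type="string" indexed="false" stored="true"/>

<!-- Both searchable and returnable -->
<field name="sku_exact" type="string" indexed="true" stored="true"/>
```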

So in your example (and, BTW, what's the point of the
dynamic field in your question? It seems unrelated) you
could set stored="false" for everything except your id
field.
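
Applied to the schema in the question, that suggestion might look
like the sketch below. The id field definition is an assumption
(modelled on the stock example schema), since it is not shown in the
question. The dynamic field is kept here because, as far as I
understand LatLonType, it indexes latitude and longitude into two
sub-fields named with the subFieldSuffix, so a matching field
definition probably still has to exist.

```xml
<fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/>
<!-- other definitions elided in the original question -->

<!-- Assumed id field: the only value that needs to come back in results -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>

<!-- Queried as "lat,lng" but never returned, so stored="false" -->
<field name="pnt" type="location" indexed="true" stored="false" required="true"/>

<!-- Backs the two sub-fields that LatLonType is assumed to create for pnt -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/>
```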

Best
Erick
