Posted to solr-user@lucene.apache.org by Bijoy Deb <bi...@yahoo.com> on 2014/01/13 12:16:20 UTC

Can I store only the index in Solr and not the actual data

Hi,

I have my data in HDFS, which I need to index using Solr. In that case, does Solr always store both the data (the fields that need to be retrieved) and the index, or can it be configured to store only an index that points to the original data in HDFS?
Personally, I would prefer the latter, as the former will unnecessarily duplicate the data and occupy more disk space.
In short, I feel that, as with database indexes, my data should not have to be stored separately on another server (the Solr server); only an index that points to that data should be created.

I would highly appreciate it if you could let me know whether this is possible in Solr (creating only the index, without copying the retrievable data onto the Solr server).

Thanks
Bijoy

Re: Can I store only the index in Solr and not the actual data

Posted by David Santamauro <da...@gmail.com>.
On 01/13/2014 06:16 AM, Bijoy Deb wrote:
> Hi,
>
> I have my data in HDFS, which I need to index using Solr. In that case, does Solr always store both the data (the fields that need to be retrieved) and the index, or can it be configured to store only an index that points to the original data in HDFS?
> Personally, I would prefer the latter, as the former will unnecessarily duplicate the data and occupy more disk space.
> In short, I feel that, as with database indexes, my data should not have to be stored separately on another server (the Solr server); only an index that points to that data should be created.

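The attribute you are looking for is @stored on the field definitions in your schema.xml [1]. A field with indexed="true" and stored="false" is searchable, but its original value is not kept in the Solr index.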

[1] http://wiki.apache.org/solr/SchemaXml
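
As a rough illustration of the attribute in use, a schema.xml fragment along these lines keeps the bulk text searchable without storing it. The field names (id, hdfs_path, content) are hypothetical, and text_general is assumed to be a field type defined elsewhere in the schema, as it is in the stock Solr 4.x example schema. Only the indexed/stored combinations matter here.

```xml
<!-- Sketch only: field names are made up for illustration. -->

<!-- Small key field, stored so that search results can identify the document. -->
<field name="id" type="string" indexed="true" stored="true" required="true"/>

<!-- Pointer back to the original record in HDFS: stored so it can be returned,
     not indexed because nobody searches on it. -->
<field name="hdfs_path" type="string" indexed="false" stored="true"/>

<!-- The bulk of the data: searchable, but the original text is not kept in Solr. -->
<field name="content" type="text_general" indexed="true" stored="false"/>
```

With a layout like this, a query can match on content but the response only carries the stored fields (id and hdfs_path), so the application would fetch the full record from HDFS itself. At least one stored field is needed for that lookup, since Solr can only return values it has stored.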


David