Posted to solr-user@lucene.apache.org by Alejandro Calbazana <ac...@gmail.com> on 2013/10/30 02:37:31 UTC
Many Dynamic Fields + Indexing Strategy
Hi,
I have an application that has a fair number of dynamic fields in addition
to static fields. The use case is that a customer can create any number of
dynamic fields and associate them with domain objects that we then pull
into an indexed document. I have no way to know these fields in advance,
and the expectation is that they are searchable using a field/value
query. It is a multi-tenant environment, so a high volume of dynamic
fields could be created.
My question is whether there is a reasonable indexing strategy that can
accommodate such a use case. My concern is that I could end up with a
large number of dynamic fields, which would slow querying and full indexing
to a crawl. In some testing, I created unique dynamic fields and got into
the 50K - 100K range when my JVM began to behave poorly and went OOM. I
understand why this happens, but I'm interested in how to protect
against it.
My only thought at the moment is to split my single index into multiple
cores - one per tenant. Has anyone else had this requirement? How did you
handle it?
My schema is pretty much what I've described: a handful of static fields
with the stock dynamic field pattern definitions. I am using Solr 4.2.1.
Thanks,
Al
Re: Many Dynamic Fields + Indexing Strategy
Posted by Jack Krupansky <ja...@basetechnology.com>.
Every multitenant situation is going to be different, but at the extreme a
single core per tenant is the cleanest approach: it provides the best
separation and optimal performance, and supports full tf-idf relevancy of
document fields for each tenant.
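As a rough illustration of the core-per-tenant idea, here is a minimal Python sketch that routes a field/value query to a tenant's own core. The base URL, the `tenant_` core naming convention, and the `color_s` field are assumptions made for the example, not anything from the original setup; only the standard `/select` handler path and the `q` / `wt` parameters are stock Solr.

```python
import re
from urllib.parse import urlencode

# Assumed base URL of the Solr instance; adjust for the real deployment.
SOLR_BASE = "http://localhost:8983/solr"


def core_for_tenant(tenant_id: str) -> str:
    """Map a tenant ID to a core name (hypothetical naming convention)."""
    safe = re.sub(r"[^A-Za-z0-9_]", "_", tenant_id)
    return f"tenant_{safe}"


def field_value_query_url(tenant_id: str, field: str, value: str) -> str:
    """Build a /select URL for a field/value query against the tenant's core."""
    # Quote the value as a phrase and escape backslashes and double quotes
    # so user-supplied text cannot break out of the phrase.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    params = {"q": f'{field}:"{escaped}"', "wt": "json"}
    return f"{SOLR_BASE}/{core_for_tenant(tenant_id)}/select?{urlencode(params)}"


if __name__ == "__main__":
    # "color_s" assumes the stock "*_s" dynamic field pattern is in the schema.
    print(field_value_query_url("acme-01", "color_s", "dark red"))
```

Because each tenant's dynamic fields live only in that tenant's core, the number of unique fields per index is bounded by what one tenant creates rather than by the total across all tenants.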
You can also do a hybrid, where you have separate cores for the bulk data
for each tenant, but have a single common collection with a subset of tenant
data which your admin application can use to do searches across tenants for
common metadata.
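The hybrid layout could be sketched along these lines, again in Python and purely as an assumption-laden illustration: each source document is split into the full document destined for the tenant's own core and a small metadata-only document destined for the shared collection. The `COMMON_FIELDS` list and all field names are hypothetical.

```python
# Hypothetical set of fields that every tenant shares and that the admin
# application needs for cross-tenant searches.
COMMON_FIELDS = ("id", "tenant_id", "name")


def split_for_hybrid(doc: dict) -> tuple:
    """Return (tenant_doc, common_doc) for one source document.

    tenant_doc keeps every field, including the tenant's dynamic fields,
    and is meant for the per-tenant core. common_doc keeps only the shared
    metadata and is meant for the single common collection.
    """
    tenant_doc = dict(doc)
    common_doc = {k: doc[k] for k in COMMON_FIELDS if k in doc}
    return tenant_doc, common_doc


if __name__ == "__main__":
    source = {
        "id": "42",
        "tenant_id": "acme-01",
        "name": "Widget",
        "color_s": "dark red",  # tenant-defined dynamic field
    }
    tenant_doc, common_doc = split_for_hybrid(source)
    print(tenant_doc)
    print(common_doc)
```

Keeping the dynamic fields out of the common collection means its field count stays fixed no matter how many fields tenants define.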
-- Jack Krupansky