Posted to dev@lucene.apache.org by Trey Grainger <so...@gmail.com> on 2013/10/05 03:21:16 UTC

Roadmap for fixing features broken by core autodiscovery

There are two use-cases that appear broken with the new core auto-discovery
mechanism:

*1) The Core Admin Handler's CREATE command no longer works to create brand
new cores*
(unless you have logged on the box and created the core's directory
structure manually, which largely defeats the purpose of the "CREATE"
command).  With the old Solr.xml format, we could spin up as many cores as
we wanted to dynamically with the following command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
...
http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
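
For illustration, here is a minimal sketch of how an external system might
generate these CREATE calls. The helper name build_create_url is
hypothetical; the host, port, and parameter names are taken from the URLs
above. It only builds the URLs and does not contact a Solr server.

```python
from urllib.parse import urlencode

# Base URL copied from the example above; adjust host and port as needed.
CORE_ADMIN_URL = "http://localhost:8983/solr/admin/cores"


def build_create_url(core_name, instance_dir="collection1"):
    """Return a CoreAdmin CREATE URL for a core that shares instance_dir.

    Hypothetical helper: each core gets its own dataDir (<core_name>/data)
    while every core points at the same instanceDir.
    """
    params = [
        ("action", "CREATE"),
        ("name", core_name),
        ("instanceDir", instance_dir),
        ("dataDir", core_name + "/data"),
    ]
    # safe="/" leaves the slash in dataDir unescaped, matching the URLs above.
    return CORE_ADMIN_URL + "?" + urlencode(params, safe="/")


if __name__ == "__main__":
    for i in range(1, 4):
        print(build_create_url("newCore%d" % i))
```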

In the new core discovery mode, this exception is now thrown:
Error CREATEing SolrCore 'newCore1': Could not create a new core in
solr/collection1/as another core is already defined there

The exception is being intentionally thrown in CorePropertiesLocator.java
because a core.properties file already exists in solr/collection1 (and only
one can exist per directory).
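
The rule can be illustrated with a small sketch. This is not the code in
CorePropertiesLocator.java (which is Java); it is a Python approximation of
the behaviour described above, and the function name define_core and its
error text are assumptions made for the example.

```python
import os
import tempfile

CORE_PROPERTIES = "core.properties"


def define_core(instance_dir, core_name):
    """Write core.properties into instance_dir, refusing if one exists."""
    marker = os.path.join(instance_dir, CORE_PROPERTIES)
    if os.path.exists(marker):
        # The rule described above: only one core.properties per directory.
        raise RuntimeError(
            "another core is already defined in " + instance_dir)
    os.makedirs(instance_dir, exist_ok=True)
    with open(marker, "w") as f:
        f.write("name=" + core_name + "\n")
    return marker


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as solr_home:
        shared = os.path.join(solr_home, "collection1")
        define_core(shared, "collection1")   # first core: accepted
        try:
            define_core(shared, "newCore1")  # second core, same directory
        except RuntimeError as err:
            print("rejected:", err)
```

Under this rule, a second core that reuses instanceDir=collection1 is
always rejected, which is why the shared-instanceDir pattern stops working.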


*2) Having a shared configuration directory (instanceDir) across many cores
no longer works*.
Every core has to have its own conf/ directory, and this doesn't seem to
be overridable any longer.  Previously, it was possible to have many cores
share the same instanceDir (and just override their dataDir for obvious
reasons).  Now, it is necessary to copy and paste identical config files
for each Solr core.


I don't know if there's already a current roadmap for fixing this.  I saw
https://issues.apache.org/jira/browse/SOLR-4478, which suggested replacing
instanceDir with the ability to specify a named configSet.  This solves
problem 2, but not problem 1 (since you still can't have multiple
core.properties files in the same folder).  Based on Erick's comments in
the JIRA ticket, it also sounds like this ticket is dead at the moment.

There is definitely a need to have a shared config directory - whether that
is through a configSet or an explicit indexDir doesn't matter to me.
There's also a need to be able to dynamically create Solr cores from
external systems.  I currently can't upgrade to core auto discovery because
it doesn't allow dynamic core creation.  Does anyone have some thoughts on
how to best get these features working again under core autodiscovery?
Adding instanceDir to core.properties seems like an easy solution, but
there must be a desire not to do that or it would probably have already
been done.

I'm happy to contribute some time to resolving this if there is an
agreed-upon path forward.


Thanks,

-Trey

Re: Roadmap for fixing features broken by core autodiscovery

Posted by Erick Erickson <er...@gmail.com>.
Right, let's move this discussion to SOLR-4779. There's some history
here. Sharing named config sets got a bit wrapped up in sharing the
underlying solrconfig object. This latter has been taken off the
table, but we should discuss fixing Trey's issues up. Here's what the
thinking was:
There would be a directory like <solr_home>/configs/configset1,
<solr_home>/configs/configset2, etc. Then a new parameter for
core.properties or create or whatever like "configset=configset1" that
would be smart enough to look in <solr_home>/configs for an entire
conf directory named "configset1".
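
A rough sketch of that lookup, assuming the proposed layout. The
"configset" parameter was only a proposal at this point, and the function
name resolve_configset is hypothetical.

```python
import os


def resolve_configset(solr_home, configset):
    """Map configset=<name> to <solr_home>/configs/<name>.

    Assumes the proposed layout, in which each directory under
    <solr_home>/configs holds an entire conf directory.
    """
    path = os.path.join(solr_home, "configs", configset)
    if not os.path.isdir(path):
        raise FileNotFoundError("no config set directory at " + path)
    return path
```

With this layout, many cores could name the same config set while each
keeps its own core.properties in its own directory.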

Trey:
Does that work for your case? If so, please add your comments to 4779
and we can take it from there. FWIW, I don't think this is especially
hard, but time is always at a premium.

Erick

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Roadmap for fixing features broken by core autodiscovery

Posted by Shawn Heisey <so...@elyograg.org>.
On 10/4/2013 7:21 PM, Trey Grainger wrote:
> There are two use-cases that appear broken with the new core
> auto-discovery mechanism:
> 
> *1) The Core Admin Handler's CREATE command no longer works to create
> brand new cores* 
> (unless you have logged on the box and created the core's directory
> structure manually, which largely defeats the purpose of the "CREATE"
> command).  With the old Solr.xml format, we could spin up as many cores
> as we wanted to dynamically with the following command:
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCore1&instanceDir=collection1&dataDir=newCore1/data
> ...
> http://localhost:8983/solr/admin/cores?action=CREATE&name=newCoreN&instanceDir=collection1&dataDir=newCoreN/data
> 
> In the new core discovery mode, this exception is now thrown:
> Error CREATEing SolrCore 'newCore1': Could not create a new core in
> solr/collection1/as another core is already defined there

The CREATE action has *always* required that you have your configuration
on the disk before you call it.  You are sharing the instanceDir, which
is the only reason you can skip that step.

If you want completely dynamic creation, use SolrCloud, which keeps the
config in zookeeper and requires ZERO config information to exist on the
disk.

> *2) Having a shared configuration directory (instanceDir) across many
> cores no longer works*.  
> Every core has to have its own conf/ directory, and this doesn't seem
> to be overridable any longer.  Previously, it was possible to have many
> cores share the same instanceDir (and just override their dataDir for
> obvious reasons).  Now, it is necessary to copy and paste identical
> config files for each Solr core.

From what I understand talking to the people that worked on this, the
lack of a shared instanceDir was completely deliberate.  It's the only
way that core discovery can work in any kind of predictable and sane
manner.  The entire point of it is that every core is self-contained and
solr.xml isn't used to tell Solr about them.

I personally have never tried to share the instanceDir.  I do have
shared configs, though - my corename/conf directories have symlinks to a
shared config directory.  I also don't dynamically create cores - I have
seven shards, each of which has a live core and a build core.  There are
two other cores that serve as frontends, with the shards parameter in
the request handlers.
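
One reading of the symlink setup, as a minimal sketch. It assumes each
core's conf entry is a single symlink to the shared directory (the links
could equally be per file), that the platform supports symlinks, and that
the helper name link_shared_conf is hypothetical.

```python
import os


def link_shared_conf(solr_home, core_name, shared_conf):
    """Create <solr_home>/<core_name>/conf as a symlink to shared_conf."""
    core_dir = os.path.join(solr_home, core_name)
    os.makedirs(core_dir, exist_ok=True)
    link = os.path.join(core_dir, "conf")
    # Each core keeps its own directory, so it can hold its own
    # core.properties, while the config files live in one shared place.
    os.symlink(shared_conf, link, target_is_directory=True)
    return link
```

Because every core still has its own directory, this arrangement is
compatible with the one-core.properties-per-directory rule.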

> I don't know if there's already a current roadmap for fixing this.  I
> saw https://issues.apache.org/jira/browse/SOLR-4478, which suggested
> replacing instanceDir with the ability to specify a named configSet.
> This solves problem 2, but not problem 1 (since you still can't have
> multiple core.properties files in the same folder).  Based on Erick's
> comments in the JIRA ticket, it also sounds like this ticket is also
> dead at the moment.
> 
> There is definitely a need to have a shared config directory - whether
> that is through a configSet or an explicit indexDir doesn't matter to
> me.  There's also a need to be able to dynamically create Solr cores
> from external systems.  I currently can't upgrade to core auto discovery
> because it doesn't allow dynamic core creation.  Does anyone have some
> thoughts on how to best get these features working again under core
> autodiscovery?  Adding instanceDir to core.properties seems like an easy
> solution, but there must be a desire not to do that or it would probably
> have already been done.

Thankfully, you do not need to upgrade to core discovery anytime soon.
All future 4.x versions will support the old format, and any problems
with that will be considered bugs.  It will be mandatory in Solr 5.0,
which currently doesn't have any kind of release roadmap or timeframe.
I suspect that what we currently call "SolrCloud" will also be mandatory
in 5.0, and that gives you shared configs with zookeeper.  Requiring
zookeeper allows completely dynamic core/collection creation, because
the only thing that will be on the disk is the index and transaction log
data.

Thanks,
Shawn

