Posted to solr-dev@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2007/02/03 21:43:21 UTC

finer granularity of configuration

I'd like to be able to have a common schema.xml and solrconfig.xml  
but be able to fire up Solr instances pointed to different data  
directories.  I realize we have SOLR-79 in JIRA.  Is that the  
approach we want long term?

Here's an off-the-cuff idea... what if we hook Config.get() to look  
for system properties that would override configuration values.    
SolrCore does this:

	  dataDir = SolrConfig.config.get("dataDir",Config.getInstanceDir()+"data");

If it looked for a system property (perhaps with a "solr." prefix)  
you could override anything Config serves up.  Thoughts?
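
A minimal sketch of the override idea above, for illustration only. OverridableConfig is a hypothetical stand-in and not Solr's Config class: it assumes configuration values are held in a java.util.Properties object and that an override is looked up under a caller-supplied prefix such as "solr.".

```java
import java.util.Properties;

/** Hypothetical helper illustrating the proposal; not part of Solr. */
final class OverridableConfig {
    private final Properties fileValues; // stand-in for values parsed from solrconfig.xml
    private final String prefix;         // e.g. "solr."

    OverridableConfig(Properties fileValues, String prefix) {
        this.fileValues = fileValues;
        this.prefix = prefix;
    }

    /** Returns the system property (prefix + name) if set, else the file value, else def. */
    String get(String name, String def) {
        String override = System.getProperty(prefix + name);
        if (override != null) {
            return override;
        }
        return fileValues.getProperty(name, def);
    }

    public static void main(String[] args) {
        Properties file = new Properties();
        file.setProperty("dataDir", "/var/solr/data");
        OverridableConfig config = new OverridableConfig(file, "solr.");
        // Launch with -Dsolr.dataDir=/other/path to see the override take effect.
        System.out.println(config.get("dataDir", "data"));
    }
}
```

The lookup order is the only design decision here: a system property wins over the file, and the caller's default is used only when neither supplies a value.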

Speaking of which, is this incorrect in SolrCore.java?

	  public String getDataDir() { return index_path; }  // shouldn't this return dataDir?
	  public String getIndexDir() { return index_path; }

Erik


Re: finer granularity of configuration

Posted by Yonik Seeley <yo...@apache.org>.
On 2/3/07, Erik Hatcher <er...@ehatchersolutions.com> wrote:
> I'd like to be able to have a common schema.xml and solrconfig.xml
> but be able to fire up Solr instances pointed to different data
> directories.  I realize we have SOLR-79 in JIRA.  Is that the
> approach we want long term?
>
> Here's an off-the-cuff idea... what if we hook Config.get() to look
> for system properties that would override configuration values.
> SolrCore does this:
>
>           dataDir = SolrConfig.config.get("dataDir",Config.getInstanceDir()+"data");
>
> If it looked for a system property (perhaps with a "solr." prefix)
> you could override anything Config serves up.  Thoughts?

Seems like a good idea... and as long as the system properties don't clash,
this doesn't conflict with SOLR-79.
Perhaps the full path with a leading solr.

So -Dsolr.config.dataDir=/path

> Speaking of which, is this incorrect in SolrCore.java?
>
>           public String getDataDir() { return index_path; }  // shouldn't this return dataDir?
>           public String getIndexDir() { return index_path; }

Fixed.  getDataDir() wasn't used anywhere outside SolrCore, and I
changed uses of getDataDir to getIndexDir.

-Yonik

Re: finer granularity of configuration

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 5, 2007, at 1:01 AM, Chris Hostetter wrote:
> : Here's an off-the-cuff idea... what if we hook Config.get() to look
> : for system properties that would override configuration values.
> : SolrCore does this:
>
> the number of different system properties could get very messy ...

well, there would be a lot of places things could be hooked, for  
sure.  but that doesn't mean anyone would need to use them.   in  
fact, i don't see much need for external overriding of configuration  
other than to share configuration with multiple data directories.   
i'm not sure where the mess would be.  It'd be one line of code
added to Config.get(), no parsing changes, and only a few sentences  
needed to document.  property expansion seems overkill.

the difference between SOLR-79 and my proposal is the ability to
override a configuration value without touching the configuration file at all.

> another way to make this more customizable, would be to make sure we
> support Xinclude when parsing xml config files...

+1, though this can already be hacked with entity reference includes  
in XML, which was the old school way to share bits of Ant build files.

	Erik


Re: finer granularity of configuration

Posted by Chris Hostetter <ho...@fucit.org>.
: Here's an off-the-cuff idea... what if we hook Config.get() to look
: for system properties that would override configuration values.
: SolrCore does this:

the number of different system properties could get very messy ... i'd
much rather see a good patch for SOLR-79 (one thought i had rereading the
issue is that the property replacement could happen when the config was
parsed into the DOM tree at startup -- then all of the various methods
used to get config values wouldn't need to be changed -- just the parsing).

another way to make this more customizable would be to make sure we
support XInclude when parsing xml config files...

http://www.w3.org/TR/2004/PR-xinclude-20040930/
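
For illustration, a minimal sketch of turning on XInclude with the standard JAXP parser. The XIncludeConfigLoader class and its load method are hypothetical names, not existing Solr code; the sketch assumes the config file declares the http://www.w3.org/2001/XInclude namespace and that the JVM's default parser supports XInclude.

```java
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

/** Hypothetical loader; shows only the two parser settings XInclude needs. */
final class XIncludeConfigLoader {

    /** Parses configFile, resolving xi:include elements relative to the file's location. */
    static Document load(File configFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // XInclude processing requires namespace awareness
        factory.setXIncludeAware(true);  // ask the parser to expand xi:include elements
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(configFile);
    }

    public static void main(String[] args) throws Exception {
        // Pass the path to a config file that uses <xi:include href="..."/>.
        Document doc = load(new File(args[0]));
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

With this in place, a shared fragment could be kept in one file and pulled into several config files, each of which carries only its own differences.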



-Hoss


Re: finer granularity of configuration

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 10, 2007, at 3:25 PM, Chris Hostetter wrote:
> : Sounds totally fair (assigned _to me_).  To be honest, I only just
> : glanced at the code for SOLR-79. It may be just what the doctor
> : ordered with some tweaks.  I sorta have some experience with the
> : ${..} syntax and would switch the syntax in the patch to be Ant-like
> : in this regard.  I'll come up with some unit tests along the way too.
>
> yeah ... it would be cool to have a robust version of SOLR-79 so that
> everything *could* be in the hands of the users if they want it -- i just
> don't know enough about DOM manipulation to know if there's a clean way to
> completely decorate the tree with variable substitution.

I've implemented SOLR-79 and added my latest patch.  The DOM is  
manipulated in place and ${...} are substituted with system property  
lookups.

At this point it is robust enough, substituting an empty string in  
for a non-existent property reference.  If deemed desirable we could  
have some kind of default value handling, maybe
${prop.name:default value}, though not needed initially for my use case.
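
The in-place substitution described above can be sketched roughly as follows. This is not the SOLR-79 patch itself: PropertySubstituter is a hypothetical name, treating attribute values as well as text nodes is an assumption, and the ${prop.name:default value} form is deliberately left unimplemented. A missing property expands to the empty string, as described.

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical sketch of ${...} expansion over a parsed DOM tree. */
final class PropertySubstituter {
    private static final Pattern REF = Pattern.compile("\\$\\{([^}]*)\\}");

    /** Replaces each ${name} in text with props' value, or "" when the property is absent. */
    static String expand(String text, Properties props) {
        Matcher m = REF.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getProperty(m.group(1), "");
            // quoteReplacement keeps '$' and '\' in the value from being read as group references
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    /** Walks the tree rooted at node, rewriting text, CDATA and attribute values in place. */
    static void substitute(Node node, Properties props) {
        short type = node.getNodeType();
        if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
            node.setNodeValue(expand(node.getNodeValue(), props));
        }
        NamedNodeMap attrs = node.getAttributes(); // null for anything but elements
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                attr.setNodeValue(expand(attr.getNodeValue(), props));
            }
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            substitute(children.item(i), props);
        }
    }
}
```

A caller would parse the config file into a Document and then call PropertySubstituter.substitute(document, System.getProperties()) before any values are read, so the code that fetches config values needs no changes.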

Committable as-is?

	Erik




Re: finer granularity of configuration

Posted by Chris Hostetter <ho...@fucit.org>.
: Sounds totally fair (assigned _to me_).  To be honest, I only just
: glanced at the code for SOLR-79. It may be just what the doctor
: ordered with some tweaks.  I sorta have some experience with the
: ${..} syntax and would switch the syntax in the patch to be Ant-like
: in this regard.  I'll come up with some unit tests along the way too.

yeah ... it would be cool to have a robust version of SOLR-79 so that
everything *could* be in the hands of the users if they want it -- i just
don't know enough about DOM manipulation to know if there's a clean way to
completely decorate the tree with variable substitution.

: Hoss, you mentioned multiple server instances pointing to the same
: index directory.  Would that be a reasonable configuration?  Any
: contention issues with multiple Solr instances pointed at a single
: index?  I kinda always envisioned one-to-one Solr instance and index.

i've never tried it myself ... but if no more than one solr
instance is writing, the rest should be able to read just fine
(especially once we rev lucene and there are no more locks on opening an
index reader)



-Hoss


Re: finer granularity of configuration

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 9, 2007, at 10:47 PM, Chris Hostetter wrote:
> : : Does this patch make sense to commit as-is, or could it be committed
> : : with some tweaks, or is it not a good general approach and needs to
> : : be thought out more?
> :
> : I suppose it's okay...
>
> just to clarify that statement: i have no strong objection to it being
> committed, as long as Erik agrees to assign SOLR-79 to himself and
> work on it as soon as someone requests YASP (Yet Another System Property)
>
> 	:)
>
> does that sound like a fair deal Erik?

Sounds totally fair (assigned _to me_).  To be honest, I only just  
glanced at the code for SOLR-79. It may be just what the doctor  
ordered with some tweaks.  I sorta have some experience with the
${..} syntax and would switch the syntax in the patch to be Ant-like
in this regard.  I'll come up with some unit tests along the way too.

Thanks, Hoss, for the prod to do this the better way from the start.

Having control over the Solr configuration from the launching JVM  
above and beyond the configuration files makes a lot of sense.

What I'm working on is a system to bring up Solr + Flare instances  
easily, sharing schemas but different data directories, and such.  I  
can see cache settings being overridden.

Hoss, you mentioned multiple server instances pointing to the same  
index directory.  Would that be a reasonable configuration?  Any  
contention issues with multiple Solr instances pointed at a single  
index?  I kinda always envisioned one-to-one Solr instance and index.

	Erik


Re: finer granularity of configuration

Posted by Chris Hostetter <ho...@fucit.org>.
: : Does this patch make sense to commit as-is, or could it be committed
: : with some tweaks, or is it not a good general approach and needs to
: : be thought out more?
:
: I suppose it's okay...

just to clarify that statement: i have no strong objection to it being
committed, as long as Erik agrees to assign SOLR-79 to himself and
work on it as soon as someone requests YASP (Yet Another System Property)

	:)

does that sound like a fair deal Erik?



-Hoss


Re: finer granularity of configuration

Posted by Chris Hostetter <ho...@fucit.org>.
: My use case is this:  I want to have a single solrconfig.xml/
: schema.xml and Solr distribution, and be able to bring up a Solr
: instance for different data directories.  Nothing more than that.
: For different configurations and schemas, I'd make a copy of the
: configuration directory.
:
: Does this patch make sense to commit as-is, or could it be committed
: with some tweaks, or is it not a good general approach and needs to
: be thought out more?

I suppose it's okay...

i just worry that while for you the dataDir is the only critical
difference, other people may feel that they want to reuse the same dataDir
but with different solrconfig.xml (so they can have multiple ports serving
searches against the same index with different request handler configurations)
while other people will want the xslt directory to be different, etc...
at the moment the only system property we support is solr.solr.home (and
even that i feel like we should discourage unless there is a strong reason
why someone can't use JNDI) but with that one property people can still
share just about anything via symbolic file/directory links.

if we start adding more system properties to let people override config
values, i'd rather see us go all out with token replacement and not add any
more special system property names.




-Hoss


Re: finer granularity of configuration

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 3, 2007, at 3:43 PM, Erik Hatcher wrote:
> I'd like to be able to have a common schema.xml and solrconfig.xml  
> but be able to fire up Solr instances pointed to different data  
> directories.  I realize we have SOLR-79 in JIRA.  Is that the  
> approach we want long term?
>
> Here's an off-the-cuff idea... what if we hook Config.get() to look  
> for system properties that would override configuration values.    
> SolrCore does this:
>
> 	  dataDir = SolrConfig.config.get("dataDir",Config.getInstanceDir()+"data");
>
> If it looked for a system property (perhaps with a "solr." prefix)  
> you could override anything Config serves up.  Thoughts?

At first I went down the path of modifying Config.get (indirectly, by  
modifying Config.getVal), but it didn't feel quite right because the  
properties it asks for are XPaths.

dataDir is the only one I wanted to override, so I patched this instead:

Index: src/java/org/apache/solr/core/SolrCore.java
===================================================================
--- src/java/org/apache/solr/core/SolrCore.java (revision 505245)
+++ src/java/org/apache/solr/core/SolrCore.java (working copy)
@@ -182,7 +182,7 @@
        core = this;   // set singleton
        if (dataDir ==null) {
-        dataDir = SolrConfig.config.get("dataDir",Config.getInstanceDir()+"data");
+        dataDir = System.getProperty("solr.dataDir", SolrConfig.config.get("dataDir",Config.getInstanceDir()+"data"));
        }
        log.info("Opening new SolrCore at " + Config.getInstanceDir() + ", dataDir="+dataDir);

My use case is this:  I want to have a single solrconfig.xml/ 
schema.xml and Solr distribution, and be able to bring up a Solr  
instance for different data directories.  Nothing more than that.   
For different configurations and schemas, I'd make a copy of the  
configuration directory.

Does this patch make sense to commit as-is, or could it be committed  
with some tweaks, or is it not a good general approach and needs to  
be thought out more?

Thanks,
	Erik