Posted to solr-user@lucene.apache.org by Greenhorn Techie <gr...@gmail.com> on 2018/06/07 13:37:37 UTC

Solr start script

Hi,

For our project, we need to store Solr collections on HDFS.  While
exploring the documentation on this, I found the Lucidworks
documentation (
https://doc.lucidworks.com/lucidworks-hdpsearch/3.0.0/Guide-Install-Manual.html#hdfs-specific-changes)
, which mentions that the Solr start script can be passed many
arguments at startup. The example provided is as below:

bin/solr start -c \
   -z 10.0.0.1:2181,10.0.0.2:2181,10.0.0.3:2181/solr \
   -Dsolr.directoryFactory=HdfsDirectoryFactory \
   -Dsolr.lock.type=hdfs \
   -Dsolr.hdfs.home=hdfs://sandbox.hortonworks.com:8020/user/solr


What does it actually mean to pass directoryFactory settings to the
Solr start script? I was thinking the Directory Factory setting is something
that applies only at the collection level, i.e. we need to specify it within
the solrconfig.xml file *only*.

When the above settings are passed as part of start script, does that mean
whenever a new collection is created, Solr is going to store the indexes in
HDFS? But what if I upload my solrconfig.xml to ZK which contradicts with
this and contains NRTDirectoryFactory setting? Given the above start
script, should / could I skip the directory factory setting section in my
solrconfig.xml with the assumption that the collections are going to be
stored on HDFS *by default*?

This is confusing to me, hence I need the expert advice of the community.

Thanks

Re: Solr start script

Posted by Cassandra Targett <ca...@gmail.com>.
The reason you pass the DirectoryFactory at startup is so that every
collection/core that's created is automatically stored in HDFS, even before
solrconfig.xml is read to determine where it should be stored.

If you prefer to only store certain collections/cores in HDFS, you would
only set those properties in the solrconfig.xml files for the collection.

The properties do still need to be defined in solrconfig.xml, which the
documentation you pointed to says - make the change in solrconfig.xml, then
pass the properties at startup.
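
As a rough illustration of storing only certain collections in HDFS, the fragment below hard-codes the HDFS settings in that collection's solrconfig.xml instead of relying on startup properties. It is a minimal sketch, not taken from the linked documentation: the HDFS URL is simply the one from the start command in the original question, and a real config may need further HdfsDirectoryFactory parameters that are not shown here.

```xml
<!-- Sketch: per-collection HDFS storage, set directly in solrconfig.xml.
     The URL is reused from the example start command and is an assumption
     about the target cluster. -->
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://sandbox.hortonworks.com:8020/user/solr</str>
</directoryFactory>

<indexConfig>
  <!-- HDFS-backed indexes use the hdfs lock type -->
  <lockType>hdfs</lockType>
</indexConfig>
```

Because nothing in this fragment references a system property, only collections using this config set are affected, regardless of what is passed to bin/solr.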

On Thu, Jun 7, 2018 at 9:25 AM Greenhorn Techie <gr...@gmail.com>
wrote:

> Shawn, Thanks for your response. Please find my follow-up questions:
>
> 1. My understanding is that Directory Factory settings are typically at a
> collection / core level. If that's the case, what is the advantage of
> passing it along with the start script?
> 2. In your below response, did you mean that even though I pass the
> settings as part of start script, they don't have any value unless they are
> mentioned as part of the solrconfig.xml file?
> 3. As per my previous email, what does Solr do if my solrconfig.xml contains
> NRTDirectoryFactory setting while the solr script is started with HDFS
> settings?
>
> Thanks
>
>
> On 7 June 2018 at 15:08:02, Shawn Heisey (apache@elyograg.org) wrote:
>
> On 6/7/2018 7:37 AM, Greenhorn Techie wrote:
> > When the above settings are passed as part of start script, does that mean
> > whenever a new collection is created, Solr is going to store the indexes in
> > HDFS? But what if I upload my solrconfig.xml to ZK which contradicts with
> > this and contains NRTDirectoryFactory setting? Given the above start
> > script, should / could I skip the directory factory setting section in my
> > solrconfig.xml with the assumption that the collections are going to be
> > stored on HDFS *by default*?
>
> Those commandline options are Java system properties.  It looks like the
> example configs DO have settings in them that would use the
> solr.directoryFactory and solr.lock.type properties.  But if your
> solrconfig.xml file doesn't reference those properties, then they
> wouldn't make any difference.  The last one is probably a setting that
> HdfsDirectoryFactory uses that doesn't need to be explicitly referenced
> in a config file.
>
> Thanks,
> Shawn
>

Re: Solr start script

Posted by Greenhorn Techie <gr...@gmail.com>.
Shawn, Thanks for your response. Please find my follow-up questions:

1. My understanding is that Directory Factory settings are typically at a
collection / core level. If that's the case, what is the advantage of
passing it along with the start script?
2. In your below response, did you mean that even though I pass the
settings as part of start script, they don't have any value unless they are
mentioned as part of the solrconfig.xml file?
3. As per my previous email, what does Solr do if my solrconfig.xml contains
NRTDirectoryFactory setting while the solr script is started with HDFS
settings?

Thanks


On 7 June 2018 at 15:08:02, Shawn Heisey (apache@elyograg.org) wrote:

On 6/7/2018 7:37 AM, Greenhorn Techie wrote:
> When the above settings are passed as part of start script, does that mean
> whenever a new collection is created, Solr is going to store the indexes in
> HDFS? But what if I upload my solrconfig.xml to ZK which contradicts with
> this and contains NRTDirectoryFactory setting? Given the above start
> script, should / could I skip the directory factory setting section in my
> solrconfig.xml with the assumption that the collections are going to be
> stored on HDFS *by default*?

Those commandline options are Java system properties.  It looks like the
example configs DO have settings in them that would use the
solr.directoryFactory and solr.lock.type properties.  But if your
solrconfig.xml file doesn't reference those properties, then they
wouldn't make any difference.  The last one is probably a setting that
HdfsDirectoryFactory uses that doesn't need to be explicitly referenced
in a config file.

Thanks,
Shawn

Re: Solr start script

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/7/2018 7:37 AM, Greenhorn Techie wrote:
> When the above settings are passed as part of start script, does that mean
> whenever a new collection is created, Solr is going to store the indexes in
> HDFS? But what if I upload my solrconfig.xml to ZK which contradicts with
> this and contains NRTDirectoryFactory setting? Given the above start
> script, should / could I skip the directory factory setting section in my
> solrconfig.xml with the assumption that the collections are going to be
> stored on HDFS *by default*?

Those commandline options are Java system properties.  It looks like the 
example configs DO have settings in them that would use the 
solr.directoryFactory and solr.lock.type properties.  But if your 
solrconfig.xml file doesn't reference those properties, then they 
wouldn't make any difference.  The last one is probably a setting that 
HdfsDirectoryFactory uses that doesn't need to be explicitly referenced 
in a config file.
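
To make the point about referencing the properties concrete, here is a sketch of the property-substitution form used in Solr's example configs. The defaults shown after the colon are an assumption about the stock config sets and may differ in yours; "NRTDirectoryFactory" in the question presumably refers to solr.NRTCachingDirectoryFactory.

```xml
<!-- Sketch of a solrconfig.xml fragment. ${name:default} is replaced by the
     Java system property "name" if it is set, otherwise by "default". -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>

<indexConfig>
  <!-- Becomes "hdfs" when Solr is started with -Dsolr.lock.type=hdfs,
       otherwise falls back to "native". -->
  <lockType>${solr.lock.type:native}</lockType>
</indexConfig>
```

With this form, the -D options on the command line take effect. If the class attribute is instead a literal class name with no ${...} reference, the startup property has nothing to substitute into and is ignored for that collection.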

Thanks,
Shawn