Posted to solr-user@lucene.apache.org by jhittner <jo...@hittner.com> on 2014/01/07 20:55:38 UTC

questions on the collections API usage (solrcloud 4.5.1)

Hi, I'm working on setting up my first SolrCloud instance and have three
questions I was hoping for some help with.   I am running Solr 4.5.1.

1- I have 2 different schemas and 2 solrconfigs that I currently place in a
single conf folder that gets uploaded to zookeeper automatically by tomcat
(bootstrap_confdir).  I have been editing the core.properties files after
creating my collections to add the "schema=" and "config=" params to
instruct each core on what config file and schema to use.   How can I
configure these two options when calling the API so I don't need to log in to
the boxes and edit anything manually?   The command I currently use looks
something like:
http://solrvip:8080/solr/admin/collections?action=CREATE&name=zipcollection-a&numShards=2&replicationFactor=2&maxShardsPerNode=4

2- I store all my configs in a specific directory that I set with the tomcat
"bootstrap_confdir" option.   After creating my collection using the API
call above, the collection runs perfectly until the next restart.   After
restarting, Solr will not start back up until I create a "conf" directory in
each of the shard directories created by the API.   The directory can be
empty, but needs to exist in order for Solr to run.   Is this a bug or am I
doing something wrong?

3- As I stated in question 1, I put all my schemas and solrconfigs in a
single directory uploaded with the "bootstrap_confdir" directive.    Is
there any way to organize this in sub-directories?   I've tried several
ways, but can't seem to get it to work.   For example:
bootstrap_confdir=/solr/conf
/solr/conf/collection-a/schema.xml
/solr/conf/collection-b/schema.xml

Thanks in advance for the advice, and thanks to the community for the hard
work.




--
View this message in context: http://lucene.472066.n3.nabble.com/questions-on-the-collections-API-usage-solrcloud-4-5-1-tp4110059.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: questions on the collections API usage (solrcloud 4.5.1)

Posted by jhittner <jo...@hittner.com>.
You are correct.  I am trying to run two collections, each with its own
solrconfig and schema.  I'm just having some issues creating these
separate collections through the API without having to modify core.properties
and create the empty conf directory.   How do I tell the Collections API's
CREATE action what schema and solrconfig to use for the newly created collection?


Re: questions on the collections API usage (solrcloud 4.5.1)

Posted by Erick Erickson <er...@gmail.com>.
This sounds like you really want two different collections. Using two
different configs/schemas is not compatible with a single collection.
Perhaps you could back up and tell us why you think you need to do this?
Sounds like an XY problem.

Best
Erick

Re: questions on the collections API usage (solrcloud 4.5.1)

Posted by jhittner <jo...@hittner.com>.
> Let's take a step back and examine a more ideal way to do things. 

Shawn-
   Thanks very much for taking the time to explain this to me so clearly.  
I think I'm good to go now.   I will do some testing today.

-Jon


Re: questions on the collections API usage (solrcloud 4.5.1)

Posted by Shawn Heisey <so...@elyograg.org>.
Let's take a step back and examine a more ideal way to do things.

The bootstrap startup options are intended to be used once, and only to
convert a non-cloud setup to a cloud setup.  After that's done, they
should be removed.  In many cases, the only SolrCloud startup parameter
you really need is zkHost:

-DzkHost=zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181/solr

In this example, there are three zookeeper hosts (the minimum required
for a redundant setup), and Solr is using "/solr" as a chroot so the
zookeeper root is clean.  Note that you can put your zkHost value in
your solr.xml file so that no special startup options are required at all.
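
As a rough sketch of how that zkHost value is put together: the host:port
entries are joined with commas, and the chroot is appended once, after the
last host, not after each one.  The build_zkhost helper below is
hypothetical (it is not part of Solr); it only assembles the string.

```shell
#!/bin/sh
# Sketch: build a zkHost value from a chroot and a list of host:port pairs.
# build_zkhost is a hypothetical helper, not something Solr ships.
build_zkhost() {
    chroot="$1"
    shift
    hosts=""
    for hp in "$@"; do
        # Join the host:port entries with commas.
        hosts="${hosts:+$hosts,}$hp"
    done
    # The chroot is appended once, after the last host.
    printf '%s%s\n' "$hosts" "$chroot"
}

ZK_HOST=$(build_zkhost /solr zoo1.example.com:2181 zoo2.example.com:2181 zoo3.example.com:2181)
# Print the startup option in the same form as the example above.
echo "-DzkHost=$ZK_HOST"
```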

In my opinion, the bootstrap options shouldn't be used.  Instead, you
should use zkcli scripts to upload your config sets to zookeeper and
then just create collections that reference the config set you want to
use.  If you need to change the config, you can change the local copy
and then re-upload the config to zookeeper and do a RELOAD action on
whichever collections are using that config.

On the disk, a config set consists of a directory with solrconfig.xml,
schema.xml, and any other config files referenced by those two xml
files.  When uploaded into zookeeper, all the files in that directory
will be associated with a configName.  Only the copy in zookeeper will
actually be used.

Here's information on the zkcli command-line script:

https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities
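
Something like the following might work for keeping two config sets in
separate directories and uploading each under its own name.  Treat it as a
sketch under assumptions: the zkcli.sh location, the /solr/configs/...
layout, the names config-a and config-b, and the base URL are all
placeholders, and the script only prints the commands unless you run it
with RUN set to an empty value.

```shell
#!/bin/sh
# Sketch: upload two config sets to ZooKeeper, then reload a collection.
# Assumptions (not from this thread): the zkcli.sh path, the local
# /solr/configs/... layout, the config names, and the Solr base URL.
ZKCLI=${ZKCLI:-./cloud-scripts/zkcli.sh}
ZK_HOST=${ZK_HOST:-zoo1.example.com:2181,zoo2.example.com:2181,zoo3.example.com:2181/solr}
SOLR_URL=${SOLR_URL:-http://solrvip:8080/solr}
# Dry run by default: commands are printed, not executed.
# Set RUN to an empty value in the environment to execute them.
RUN=${RUN-echo}

upload_config() {
    # $1 = local config set directory, $2 = configName to store it under
    $RUN "$ZKCLI" -zkhost "$ZK_HOST" -cmd upconfig -confdir "$1" -confname "$2"
}

reload_collection() {
    # $1 = name of a collection that uses the re-uploaded config
    $RUN curl "$SOLR_URL/admin/collections?action=RELOAD&name=$1"
}

# Each directory holds its own solrconfig.xml and schema.xml.
upload_config /solr/configs/collection-a config-a
upload_config /solr/configs/collection-b config-b
# After changing and re-uploading a config, reload the collections using it.
reload_collection zipcollection-a
```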

To create a collection once a config set is uploaded, use a URL like the
following.  For this example, the uploaded config is named "config1" and
the new collection is named "test1".  I have left out the
maxShardsPerNode parameter (it defaults to 1), so with 2 shards times 2
replicas this command would require that you have at least four Solr nodes
in your cloud.

http://server:port/solr/admin/collections?action=CREATE&name=test1&numShards=2&replicationFactor=2&collection.configName=config1

There should be no need to edit anything, and you will not need to tell
Solr or any of the cores what file to use for solrconfig or schema.
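
For the two-collection case, a sketch along these lines might do, with
one CREATE call per collection, each pointing at its own config set.  The
base URL, collection names, and config names are placeholders, and the
create_url and cores_needed helpers are hypothetical.  As written it only
prints the curl commands; set RUN to an empty value to send them.

```shell
#!/bin/sh
# Sketch: create two collections, each bound to its own uploaded config set.
# Assumptions: the base URL, collection names, and config names are placeholders.
SOLR_URL=${SOLR_URL:-http://solrvip:8080/solr}
# Dry run by default; set RUN to an empty value to actually call Solr.
RUN=${RUN-echo}

create_url() {
    # $1 = collection, $2 = configName, $3 = numShards, $4 = replicationFactor
    printf '%s/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s&collection.configName=%s\n' \
        "$SOLR_URL" "$1" "$3" "$4" "$2"
}

cores_needed() {
    # Every shard gets replicationFactor cores: numShards * replicationFactor.
    echo $(( $1 * $2 ))
}

$RUN curl "$(create_url zipcollection-a config-a 2 2)"
$RUN curl "$(create_url zipcollection-b config-b 2 2)"
# With maxShardsPerNode left out, this is also the number of nodes required.
echo "cores per collection: $(cores_needed 2 2)"
```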

Thanks,
Shawn