Posted to users@solr.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2022/06/01 17:41:38 UTC

Re: Create a core via SolrClient, single server

Clemens,

On 5/30/22 02:02, Clemens WYSS (Helbling Technik) wrote:
> Given a connection to Solr ( e.g. adminSolrConnection )
> CoreAdminRequest.Create createCoreRequest = new CoreAdminRequest.Create();
> createCoreRequest.setCoreName( coreName );
> createCoreRequest.process( adminSolrConnection );

What is an "admin solr connection"? Is that any different than just a 
plain-old HttpSolrClient instance?

How can I provide the schema for the core once it's been created? Can I 
use the API for that, or do I have to resort to pushing the config file 
directly, similar to these kinds of curl commands:

curl -d "{ ... config }" \
    ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config

curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
    "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"

The CoreAdminRequest class has a createCore() method which takes a whole 
bunch of arguments, but the javadoc doesn't say what those arguments 
are. With parameter names like "configFile" I assume it's expecting a 
configuration file /name/ and not the actual configuration; same with 
schemaFile. I'm happy to make direct calls to the REST API, but if the 
SolrJ client will do it for me, I'd prefer that.

I'm also happy to write patches for CoreAdminRequest to that end.

-chris

> On 2022/05/25 21:25:09 Christopher Schultz wrote:
>> All,
>>
>> I have a non-clustered/ZK Solr instance and I'd like to create a core
>> using the Java SolrClient library. Is that currently possible? I only
>> see methods for working with documents in the current core (selected
>> when the client object is initially created, based upon the URL which
>> contains the core name).
>>
>> I'm using Solr 7.7.3 and the vanilla SolrJ client library.
>>
>> Thanks,
>> -chris
>>
> 

Re: Create a core via SolrClient, single server

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 6:31 PM, Shawn Heisey wrote:
>
> The end result is the same ... except in the second case, it 
> references the configset by name, which will be in the created 
> core.properties file.  If you were to change the config in the 
> configset directory and then reload each core, test_core would not see 
> the changes, but test_core2 would.  That's because test_core has a 
> complete copy of the config that is separate from the configset, and 
> test_core2 is referencing the configset. 

The configSet feature brings the shared config model from SolrCloud to 
standalone mode, with one difference -- with the configs in zookeeper, 
SolrCloud can share the same config between multiple nodes, but in 
standalone mode, the shared config is local to one Solr node.

Thanks,
Shawn


Re: Create a core via SolrClient, single server

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 3:34 PM, Christopher Schultz wrote:
> So I tried this with configSet=_default and I /did/ get a core 
> created. I didn't get the same thing I got from the CLI:
>
> This is what I get from "solr create -c test_core":

Using bin/solr to create a core does it in multiple steps.  It creates 
the core directory, and COPIES the named configset's conf directory 
(using _default if you don't specify one) to the core directory.  Then 
it calls /solr/admin/cores with "name" and "instanceDir" set to the name 
you gave it, which finds the just-copied config, creates the 
core.properties file, and adds the core to the running config.
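
The copy step described above can be sketched in plain Java. This is only an illustration of "copy the configset's conf directory into the core directory", not what bin/solr actually runs; the class and method names are made up for this example, and the paths you pass in (for instance a configsets/_default/conf directory under the Solr home) are assumptions about your installation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Solr or SolrJ.
final class ConfigsetCopy {

    /**
     * Recursively copies a configset's conf directory into a core's conf
     * directory. Fails if a target file already exists.
     */
    static void copyConf(Path sourceConf, Path targetConf) throws IOException {
        try (Stream<Path> paths = Files.walk(sourceConf)) {
            for (Path source : (Iterable<Path>) paths::iterator) {
                // Rebuild the same relative path underneath the target directory.
                Path target = targetConf.resolve(sourceConf.relativize(source).toString());
                if (Files.isDirectory(source)) {
                    Files.createDirectories(target);
                } else {
                    Files.createDirectories(target.getParent());
                    Files.copy(source, target);
                }
            }
        }
    }
}
```

This has to run on the Solr host itself (or against a shared filesystem), which is why it cannot replace a purely remote API call.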

> If I use SolrClient as in your example with configSet=_default, I get:
>
> test_core2
> test_core2/core.properties
> test_core2/data
> test_core2/data/tlog
> test_core2/data/snapshot_metadata
> test_core2/data/index
> test_core2/data/index/segments_1
> test_core2/data/index/write.lock

The end result is the same ... except in the second case, it references 
the configset by name, which will be in the created core.properties 
file.  If you were to change the config in the configset directory and 
then reload each core, test_core would not see the changes, but 
test_core2 would.  That's because test_core has a complete copy of the 
config that is separate from the configset, and test_core2 is 
referencing the configset.
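
A quick way to tell the two kinds of core apart is to look at core.properties. The sketch below assumes the file is in ordinary Java properties format and that a configset reference is stored under the key "configSet"; both are assumptions to verify against your own files, and the class name is invented for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Optional;
import java.util.Properties;

// Hypothetical helper, not part of Solr or SolrJ.
final class CorePropertiesCheck {

    /**
     * Returns the configset name a core refers to, or empty if the core
     * carries its own copy of the config (no "configSet" entry).
     */
    static Optional<String> referencedConfigSet(Reader coreProperties) {
        Properties props = new Properties();
        try {
            props.load(coreProperties);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Optional.ofNullable(props.getProperty("configSet"));
    }
}
```

A core that returns a value here would pick up configset changes on reload; one that returns empty would not.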

Thanks,
Shawn


Re: Create a core via SolrClient, single server

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,

On 6/1/22 16:34, Christopher Schultz wrote:
> Shawn,
> 
> On 6/1/22 15:18, Shawn Heisey wrote:
>> On 6/1/2022 11:41 AM, Christopher Schultz wrote:
>>> How can I provide the schema for the core once it's been created? Can 
>>> I use the API for that, or do I have to resort to pushing the config 
>>> file directly, similar to these kinds of curl commands:
>>>
>>> curl -d "{ ... config }" \
>>>    ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>>>
>>> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
>>>    "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
>>
>> There's a chicken and egg problem.  You can't use those endpoints 
>> until the core is created.  And you can't create the core without 
>> providing the config and schema.
>>
>> https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create
>>
>> (the large WARNING box in this section of the docs is the part I am 
>> referring you to)
> 
> I'll have a look.
> 
>> Assuming you're not going to be using cloud mode (in which case all 
>> configs are in zookeeper) you have two choices:  Create the core 
>> directory with a conf subdirectory that contains a config and a schema 
>> before calling the CoreAdmin API, or use the ConfigSets feature.
>>
>> https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode 
> 
> 
> When I used the CoreAdminRequest as suggested by Clemens, the Solr 
> server did create the core directory, but it's empty and I got that error.
> 
> Would it not be theoretically possible to accept the core-name and 
> config and schema all at once and provision the whole thing? This seems 
> like a big missing feature after 9 major versions. Maybe Solr standalone 
> is only for children :)
> 
> My notes for creating the core include:
> 
>     $ sudo -u solr ${SOLR_HOME}/bin/solr create -c [corename]
> 
> I've never tried to do that from a remote server, but I don't specify a 
> config file or a schema in that command, and it works. Of course, I get 
> this warning:
> 
> WARNING: Using _default configset with data driven schema functionality. 
> NOT RECOMMENDED for production use.
> 
> My next steps are to provide a small config and a schema using two 
> separate curl commands, which obviously only communicate over the REST API.
> 
> What magic is "solr create" performing that I can't use via the API? OR 
> can I simulate it in some way?
> 
>> Checking SolrJ, you would use createCore("corename", "corename", 
>> client) if you go with the first option and name the directory 
>> "corename" in the solr home.  It doesn't look like CoreAdminRequest 
>> has a convenience method for using a configSet when creating a core.
>>
>> I worked out how to do it as a generic request with SolrJ if you want 
>> to use the configsets feature:
>>
>> https://paste.elyograg.org/view/3cd9aac2
> 
> Cool, though this is basically using SolrClient as an HttpClient ;)
> 
> 
> Would this work with configSet=_default ? It would be great if I didn't 
> have to prepare a Solr installation other than making sure it's running 
> before my application is able to create cores.

So I tried this with configSet=_default and I /did/ get a core created. 
I didn't get the same thing I got from the CLI:

This is what I get from "solr create -c test_core":

$ find ${SOLR_HOME}/server/solr/test_core

test_core
test_core/core.properties
test_core/data
test_core/data/tlog
test_core/data/snapshot_metadata
test_core/data/index
test_core/data/index/segments_1
test_core/data/index/write.lock
test_core/conf
test_core/conf/managed-schema
test_core/conf/params.json
test_core/conf/lang
test_core/conf/lang/stopwords_gl.txt
test_core/conf/lang/stopwords_es.txt
test_core/conf/lang/stopwords_fi.txt
test_core/conf/lang/stopwords_da.txt
test_core/conf/lang/stopwords_hu.txt
test_core/conf/lang/stopwords_id.txt
test_core/conf/lang/hyphenations_ga.txt
test_core/conf/lang/contractions_it.txt
test_core/conf/lang/stopwords_ro.txt
test_core/conf/lang/stopwords_eu.txt
test_core/conf/lang/stopwords_pt.txt
test_core/conf/lang/stopwords_de.txt
test_core/conf/lang/stoptags_ja.txt
test_core/conf/lang/stopwords_it.txt
test_core/conf/lang/contractions_ca.txt
test_core/conf/lang/stopwords_ca.txt
test_core/conf/lang/stopwords_th.txt
test_core/conf/lang/stopwords_bg.txt
test_core/conf/lang/stopwords_lv.txt
test_core/conf/lang/userdict_ja.txt
test_core/conf/lang/stopwords_cz.txt
test_core/conf/lang/stopwords_ar.txt
test_core/conf/lang/stopwords_tr.txt
test_core/conf/lang/stemdict_nl.txt
test_core/conf/lang/stopwords_no.txt
test_core/conf/lang/stopwords_nl.txt
test_core/conf/lang/stopwords_fa.txt
test_core/conf/lang/stopwords_sv.txt
test_core/conf/lang/stopwords_el.txt
test_core/conf/lang/stopwords_ja.txt
test_core/conf/lang/stopwords_hi.txt
test_core/conf/lang/stopwords_en.txt
test_core/conf/lang/contractions_ga.txt
test_core/conf/lang/contractions_fr.txt
test_core/conf/lang/stopwords_ru.txt
test_core/conf/lang/stopwords_ga.txt
test_core/conf/lang/stopwords_fr.txt
test_core/conf/lang/stopwords_hy.txt
test_core/conf/protwords.txt
test_core/conf/synonyms.txt
test_core/conf/solrconfig.xml
test_core/conf/stopwords.txt

If I use SolrClient as in your example with configSet=_default, I get:

test_core2
test_core2/core.properties
test_core2/data
test_core2/data/tlog
test_core2/data/snapshot_metadata
test_core2/data/index
test_core2/data/index/segments_1
test_core2/data/index/write.lock

It's not clear to me whether or not I need all that stuff I get from 
"solr create -c" because I don't intend to use stopwords, etc. so it's 
probably fine.

So far, I have this code:

     CoreAdminRequest status = new CoreAdminRequest();
     status.setAction(CoreAdminAction.STATUS);
     status.setCoreName(coreName);
     CoreAdminResponse cores = status.process(solr);
     NamedList<?> coreStatus = cores.getCoreStatus(coreName);
     String name = (String)coreStatus.get("name");
     if(coreName.equals(name)) {
         logger.trace("Solr core " + coreName
             + " already exists; no need to create it");
     } else {
         logger.debug("Must create Solr core " + coreName);
         ModifiableSolrParams params = new ModifiableSolrParams();
         params.add("action", "CREATE");
         params.add("name", coreName);
         params.add("instanceDir", coreName);
         params.add("configSet", "_default");
         GenericSolrRequest req =
             new GenericSolrRequest(METHOD.GET, "/admin/cores", params);
         SimpleSolrResponse response = req.process(solr);
         logger.debug("response to Solr CREATE "
             + response.getResponse());
     }

How can I apply the config and schema, then? I know I can POST to 
/solr/corename/config and /solr/corename/schema. How do I do that with 
the SolrClient? It's pretty obvious given your example how to do GET 
requests, but I don't understand how to supply a request entity for a POST.

I have the JSON I want to use for both the config and schema ready to go.
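
For reference, one way to send a JSON request entity that does not depend on any particular SolrJ version is the JDK's own HTTP client. This is a minimal sketch, assuming Java 11 or later, a core name that is safe to place in a URL as-is, and a base URL such as http://localhost:8983/solr; the class and method names are made up for this example, and it is the Java equivalent of the curl commands rather than a SolrClient call.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of SolrJ.
final class SolrJsonPost {

    /** Builds a POST to {base}/{core}/{endpoint}, e.g. endpoint "config" or "schema". */
    static HttpRequest buildJsonPost(String solrBaseUrl, String coreName,
                                     String endpoint, String json) {
        URI uri = URI.create(solrBaseUrl + "/" + coreName + "/" + endpoint);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json, StandardCharsets.UTF_8))
                .build();
    }

    /** Sends the request and returns the response body, failing on non-200 status. */
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Solr returned HTTP " + response.statusCode()
                    + ": " + response.body());
        }
        return response.body();
    }
}
```

Usage would be along the lines of send(buildJsonPost(baseUrl, coreName, "schema", schemaJson)), followed by the same call with "config" and the config JSON.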

Thanks,
-chris

Re: Create a core via SolrClient, single server

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,

On 6/1/22 15:18, Shawn Heisey wrote:
> On 6/1/2022 11:41 AM, Christopher Schultz wrote:
>> How can I provide the schema for the core once it's been created? Can 
>> I use the API for that, or do I have to resort to pushing the config 
>> file directly, similar to these kinds of curl commands:
>>
>> curl -d "{ ... config }" \
>>    ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>>
>> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
>>    "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
> 
> There's a chicken and egg problem.  You can't use those endpoints until 
> the core is created.  And you can't create the core without providing 
> the config and schema.
> 
> https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create
> 
> (the large WARNING box in this section of the docs is the part I am 
> referring you to)

I'll have a look.

> Assuming you're not going to be using cloud mode (in which case all 
> configs are in zookeeper) you have two choices:  Create the core 
> directory with a conf subdirectory that contains a config and a schema 
> before calling the CoreAdmin API, or use the ConfigSets feature.
> 
> https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode 

When I used the CoreAdminRequest as suggested by Clemens, the Solr 
server did create the core directory, but it's empty and I got that error.

Would it not be theoretically possible to accept the core-name and 
config and schema all at once and provision the whole thing? This seems 
like a big missing feature after 9 major versions. Maybe Solr standalone 
is only for children :)

My notes for creating the core include:

    $ sudo -u solr ${SOLR_HOME}/bin/solr create -c [corename]

I've never tried to do that from a remote server, but I don't specify a 
config file or a schema in that command, and it works. Of course, I get 
this warning:

WARNING: Using _default configset with data driven schema functionality. 
NOT RECOMMENDED for production use.

My next steps are to provide a small config and a schema using two 
separate curl commands, which obviously only communicate over the REST API.

What magic is "solr create" performing that I can't use via the API? OR 
can I simulate it in some way?

> Checking SolrJ, you would use createCore("corename", "corename", client) 
> if you go with the first option and name the directory "corename" in the 
> solr home.  It doesn't look like CoreAdminRequest has a convenience 
> method for using a configSet when creating a core.
> 
> I worked out how to do it as a generic request with SolrJ if you want to 
> use the configsets feature:
> 
> https://paste.elyograg.org/view/3cd9aac2

Cool, though this is basically using SolrClient as an HttpClient ;)


Would this work with configSet=_default ? It would be great if I didn't 
have to prepare a Solr installation other than making sure it's running 
before my application is able to create cores.

Thanks,
-chris

Re: Create a core via SolrClient, single server

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 11:41 AM, Christopher Schultz wrote:
> How can I provide the schema for the core once it's been created? Can 
> I use the API for that, or do I have to resort to pushing the config 
> file directly, similar to these kinds of curl commands:
>
> curl -d "{ ... config }" \
>    ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>
> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
>    "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"

There's a chicken and egg problem.  You can't use those endpoints until 
the core is created.  And you can't create the core without providing 
the config and schema.

https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create

(the large WARNING box in this section of the docs is the part I am 
referring you to)

Assuming you're not going to be using cloud mode (in which case all 
configs are in zookeeper) you have two choices:  Create the core 
directory with a conf subdirectory that contains a config and a schema 
before calling the CoreAdmin API, or use the ConfigSets feature.

https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode

Checking SolrJ, you would use createCore("corename", "corename", client) 
if you go with the first option and name the directory "corename" in the 
solr home.  It doesn't look like CoreAdminRequest has a convenience 
method for using a configSet when creating a core.

I worked out how to do it as a generic request with SolrJ if you want to 
use the configsets feature:

https://paste.elyograg.org/view/3cd9aac2
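
The paste itself is not reproduced here. As a rough plain-HTTP equivalent of the configset option, the sketch below builds the CoreAdmin CREATE URL with the parameters that appear elsewhere in this thread (action, name, instanceDir, configSet). It assumes Java 10 or later and a base URL such as http://localhost:8983/solr; the class and method names are invented for this example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of SolrJ.
final class CoreAdminUrls {

    /** Builds {base}/admin/cores?action=CREATE&... using the core name as instanceDir. */
    static URI createCoreUri(String solrBaseUrl, String coreName, String configSet) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", coreName);
        params.put("instanceDir", coreName);
        params.put("configSet", configSet);

        // Form-encode each key and value, then join them into a query string.
        String query = params.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
        return URI.create(solrBaseUrl + "/admin/cores?" + query);
    }
}
```

A GET on the resulting URI with any HTTP client should have the same effect as the generic SolrJ request, since both end up calling the same CoreAdmin endpoint.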

Thanks,
Shawn


Re: Create a core via SolrClient, single server

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Clemens,

On 6/1/22 13:41, Christopher Schultz wrote:
> Clemens,
> 
> On 5/30/22 02:02, Clemens WYSS (Helbling Technik) wrote:
>> Given a connection to Solr ( e.g. adminSolrConnection )
>> CoreAdminRequest.Create createCoreRequest = new 
>> CoreAdminRequest.Create();
>> createCoreRequest.setCoreName( coreName );
>> createCoreRequest.process( adminSolrConnection );
> 
> What is an "admin solr connection"? Is that any different than just a 
> plain-old HttpSolrClient instance?

I found that I needed a client that wasn't pointing to any existing 
core, which wasn't a problem.

But I do get this error when trying to create the core:

org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: 
Error from server at ${SOLR_BASE_URL}/solr: Error CREATEing SolrCore 
'test_remote_create_core': Can't find resource 'solrconfig.xml' in 
classpath or '${SOLR_HOME}/server/solr/test_remote_create_core'

Here's the code I used to create the core:

     CoreAdminRequest car = new CoreAdminRequest.Create();
     car.setCoreName("test_remote_create_core");
     CoreAdminResponse response = car.process(solr);

I'd like to be able to bootstrap a core from my application if it 
doesn't exist. I can supply whatever information is necessary, but I 
need to be able to do it without doing anything other than making API 
calls, either via SolrJ or directly via HTTP/REST.

Thanks,
-chris