Posted to users@solr.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2022/06/01 17:41:38 UTC
Re: Create a core via SolrClient, single server
Clemens,
On 5/30/22 02:02, Clemens WYSS (Helbling Technik) wrote:
> Given a connection to Solr ( e.g. adminSolrConnection )
> CoreAdminRequest.Create createCoreRequest = new CoreAdminRequest.Create();
> createCoreRequest.setCoreName( coreName );
> createCoreRequest.process( adminSolrConnection );
What is an "admin solr connection"? Is that any different than just a
plain-old HttpSolrClient instance?
How can I provide the schema for the core once it's been created? Can I
use the API for that, or do I have to resort to pushing the config file
directly, similar to these kinds of curl commands:
curl -d "{ ... config }" \
${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
"${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
The CoreAdminRequest class has a createCore() method which takes a whole
bunch of arguments, but the javadoc doesn't say what those arguments
are. With parameter names like "configFile" I assume it's expecting a
configuration file /name/ and not the actual configuration; same with
schemaFile. I'm happy to make direct calls to the REST API, but if the
SolrJ client will do it for me, I'd prefer that.
I'm also happy to write patches for CoreAdminRequest to that end.
-chris
> On 2022/05/25 21:25:09 Christopher Schultz wrote:
>> All,
>>
>> I have a non-clustered/ZK Solr instance and I'd like to create a core
>> using the Java SolrClient library. Is that currently possible? I only
>> see methods for working with documents in the current core (selected
>> when the client object is initially created, based upon the URL which
>> contains the core name).
>>
>> I'm using Solr 7.7.3 and the vanilla SolrJ client library.
>>
>> Thanks,
>> -chris
>>
>
Re: Create a core via SolrClient, single server
Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 6:31 PM, Shawn Heisey wrote:
>
> The end result is the same ... except in the second case, it
> references the configset by name, which will be in the created
> core.properties file. If you were to change the config in the
> configset directory and then reload each core, test_core would not see
> the changes, but test_core2 would. That's because test_core has a
> complete copy of the config that is separate from the configset, and
> test_core2 is referencing the configset.
The configSet feature brings the shared config model from SolrCloud to
standalone mode, with one difference -- with the configs in ZooKeeper,
SolrCloud can share the same config between multiple nodes, but in
standalone mode, the shared config is local to one Solr node.
Thanks,
Shawn
Re: Create a core via SolrClient, single server
Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 3:34 PM, Christopher Schultz wrote:
> So I tried this with configSet=_default and I /did/ get a core
> created. I didn't get the same thing I got from the CLI:
>
> This is what I get from "solr create -c test_core":
Using bin/solr to create a core does it in multiple steps. It creates
the core directory, and COPIES the named configset's conf directory
(using _default if you don't specify one) to the core directory. Then
it calls /solr/admin/cores with "name" and "instanceDir" set to the name
you gave it, which finds the just-copied config, creates the
core.properties file, and adds the core to the running config.
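As a rough illustration of the copy step described above, here is a minimal Java sketch that uses only the JDK. It assumes the code runs on the Solr host with filesystem access. The class name, method name, and the directories passed in are hypothetical; none of this is part of bin/solr or SolrJ. The CoreAdmin call with "name" and "instanceDir" would still have to be made separately after the copy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

final class ConfigSetCopySketch {
    /**
     * Recursively copies <configSetDir>/conf to <coreDir>/conf.
     * Both directories are supplied by the caller (hypothetical layout).
     */
    static void copyConf(Path configSetDir, Path coreDir) throws IOException {
        Path source = configSetDir.resolve("conf");
        Path target = coreDir.resolve("conf");
        // Files.walk visits a directory before its contents, so parent
        // directories exist by the time their files are copied.
        try (Stream<Path> paths = Files.walk(source)) {
            paths.forEach(p -> {
                Path dest = target.resolve(source.relativize(p));
                try {
                    if (Files.isDirectory(p)) {
                        Files.createDirectories(dest);
                    } else {
                        Files.copy(p, dest);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

Because the core ends up with its own copy, later changes to the configset directory would not reach it, which matches the behaviour described for test_core.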
> If I use SolrClient as in your example with configSet=_default, I get:
>
> test_core2
> test_core2/core.properties
> test_core2/data
> test_core2/data/tlog
> test_core2/data/snapshot_metadata
> test_core2/data/index
> test_core2/data/index/segments_1
> test_core2/data/index/write.lock
The end result is the same ... except in the second case, it references
the configset by name, which will be in the created core.properties
file. If you were to change the config in the configset directory and
then reload each core, test_core would not see the changes, but
test_core2 would. That's because test_core has a complete copy of the
config that is separate from the configset, and test_core2 is
referencing the configset.
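One way to tell the two cases apart is to read core.properties. The sketch below uses only java.util.Properties and assumes the reference is stored under a key named configSet. The class name, method name, and the sample file contents are hypothetical, for illustration only.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

final class CorePropertiesSketch {
    /** Returns the configset name a core refers to, or null if it has none. */
    static String referencedConfigSet(Reader coreProperties) throws IOException {
        Properties props = new Properties();
        props.load(coreProperties);
        // Assumption: the configset reference is stored under "configSet".
        return props.getProperty("configSet");
    }

    public static void main(String[] args) throws IOException {
        // Illustrative contents only; real files may hold more entries.
        String shared = "name=test_core2\nconfigSet=_default\n";
        String copied = "name=test_core\n";
        System.out.println(referencedConfigSet(new StringReader(shared))); // _default
        System.out.println(referencedConfigSet(new StringReader(copied))); // null
    }
}
```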
Thanks,
Shawn
Re: Create a core via SolrClient, single server
Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,
On 6/1/22 16:34, Christopher Schultz wrote:
> Shawn,
>
> On 6/1/22 15:18, Shawn Heisey wrote:
>> On 6/1/2022 11:41 AM, Christopher Schultz wrote:
>>> How can I provide the schema for the core once it's been created? Can
>>> I use the API for that, or do I have to resort to pushing the config
>>> file directly, similar to these kinds of curl commands:
>>>
>>> curl -d "{ ... config }" \
>>> ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>>>
>>> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
>>> "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
>>
>> There's a chicken and egg problem. You can't use those endpoints
>> until the core is created. And you can't create the core without
>> providing the config and schema.
>>
>> https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create
>>
>> (the large WARNING box in this section of the docs is the part I am
>> referring you to)
>
> I'll have a look.
>
>> Assuming you're not going to be using cloud mode (in which case all
>> configs are in zookeeper) you have two choices: Create the core
>> directory with a conf subdirectory that contains a config and a schema
>> before calling the CoreAdmin API, or use the ConfigSets feature.
>>
>> https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode
>
>
> When I used the CoreAdminRequest as suggested by Clemens, the Solr
> server did create the core directory, but it's empty and I got that error.
>
> Would it not be theoretically possible to accept the core-name and
> config and schema all at once and provision the whole thing? This seems
> like a big missing feature after 9 major versions. Maybe Solr standalone
> is only for children :)
>
> My notes for creating the core include:
>
> $ sudo -u solr ${SOLR_HOME}/bin/solr create -c [corename]
>
> I've never tried to do that from a remote server, but I don't specify a
> config file or a schema in that command, and it works. Of course, I get
> this warning:
>
> WARNING: Using _default configset with data driven schema functionality.
> NOT RECOMMENDED for production use.
>
> My next steps are to provide a small config and a schema using two
> separate curl commands, which obviously only communicate over the REST API.
>
> What magic is "solr create" performing that I can't use via the API? OR
> can I simulate it in some way?
>
>> Checking SolrJ, you would use createCore("corename", "corename",
>> client) if you go with the first option and name the directory
>> "corename" in the solr home. It doesn't look like CoreAdminRequest
>> has a convenience method for using a configSet when creating a core.
>>
>> I worked out how to do it as a generic request with SolrJ if you want
>> to use the configsets feature:
>>
>> https://paste.elyograg.org/view/3cd9aac2
>
> Cool, though this is basically using SolrClient as an HttpClient ;)
>
>
> Would this work with configSet=_default ? It would be great if I didn't
> have to prepare a Solr installation other than making sure it's running
> before my application is able to create cores.
So I tried this with configSet=_default and I /did/ get a core created.
I didn't get the same thing I got from the CLI:
This is what I get from "solr create -c test_core":
$ find ${SOLR_HOME}/server/solr/test_core
test_core
test_core/core.properties
test_core/data
test_core/data/tlog
test_core/data/snapshot_metadata
test_core/data/index
test_core/data/index/segments_1
test_core/data/index/write.lock
test_core/conf
test_core/conf/managed-schema
test_core/conf/params.json
test_core/conf/lang
test_core/conf/lang/stopwords_gl.txt
test_core/conf/lang/stopwords_es.txt
test_core/conf/lang/stopwords_fi.txt
test_core/conf/lang/stopwords_da.txt
test_core/conf/lang/stopwords_hu.txt
test_core/conf/lang/stopwords_id.txt
test_core/conf/lang/hyphenations_ga.txt
test_core/conf/lang/contractions_it.txt
test_core/conf/lang/stopwords_ro.txt
test_core/conf/lang/stopwords_eu.txt
test_core/conf/lang/stopwords_pt.txt
test_core/conf/lang/stopwords_de.txt
test_core/conf/lang/stoptags_ja.txt
test_core/conf/lang/stopwords_it.txt
test_core/conf/lang/contractions_ca.txt
test_core/conf/lang/stopwords_ca.txt
test_core/conf/lang/stopwords_th.txt
test_core/conf/lang/stopwords_bg.txt
test_core/conf/lang/stopwords_lv.txt
test_core/conf/lang/userdict_ja.txt
test_core/conf/lang/stopwords_cz.txt
test_core/conf/lang/stopwords_ar.txt
test_core/conf/lang/stopwords_tr.txt
test_core/conf/lang/stemdict_nl.txt
test_core/conf/lang/stopwords_no.txt
test_core/conf/lang/stopwords_nl.txt
test_core/conf/lang/stopwords_fa.txt
test_core/conf/lang/stopwords_sv.txt
test_core/conf/lang/stopwords_el.txt
test_core/conf/lang/stopwords_ja.txt
test_core/conf/lang/stopwords_hi.txt
test_core/conf/lang/stopwords_en.txt
test_core/conf/lang/contractions_ga.txt
test_core/conf/lang/contractions_fr.txt
test_core/conf/lang/stopwords_ru.txt
test_core/conf/lang/stopwords_ga.txt
test_core/conf/lang/stopwords_fr.txt
test_core/conf/lang/stopwords_hy.txt
test_core/conf/protwords.txt
test_core/conf/synonyms.txt
test_core/conf/solrconfig.xml
test_core/conf/stopwords.txt
If I use SolrClient as in your example with configSet=_default, I get:
test_core2
test_core2/core.properties
test_core2/data
test_core2/data/tlog
test_core2/data/snapshot_metadata
test_core2/data/index
test_core2/data/index/segments_1
test_core2/data/index/write.lock
It's not clear to me whether or not I need all that stuff I get from
"solr create -c" because I don't intend to use stopwords, etc. so it's
probably fine.
So far, I have this code:
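import org.apache.solr.client.solrj.SolrRequest.METHOD;
import org.apache.solr.client.solrj.request.CoreAdminRequest;
import org.apache.solr.client.solrj.request.GenericSolrRequest;
import org.apache.solr.client.solrj.response.CoreAdminResponse;
import org.apache.solr.client.solrj.response.SimpleSolrResponse;
import org.apache.solr.common.params.CoreAdminParams.CoreAdminAction;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

// Ask Solr for the core's status; "solr" is a client that does not
// point to any existing core.
CoreAdminRequest status = new CoreAdminRequest();
status.setAction(CoreAdminAction.STATUS);
status.setCoreName(coreName);
CoreAdminResponse cores = status.process(solr);
NamedList<?> coreStatus = cores.getCoreStatus(coreName);
// Guard against a missing status entry before reading "name".
String name = (coreStatus == null) ? null : (String) coreStatus.get("name");
if (coreName.equals(name)) {
    logger.trace("Solr core " + coreName + " already exists; no need to create it");
} else {
    logger.debug("Must create Solr core " + coreName);
    // Create the core by referencing the _default configset by name.
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.add("action", "CREATE");
    params.add("name", coreName);
    params.add("instanceDir", coreName);
    params.add("configSet", "_default");
    GenericSolrRequest req =
        new GenericSolrRequest(METHOD.GET, "/admin/cores", params);
    SimpleSolrResponse response = req.process(solr);
    logger.debug("response to Solr CREATE " + response.getResponse());
}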
How can I apply the config and schema, then? I know I can POST to
/solr/corename/config and /solr/corename/schema. How do I do that with
the SolrClient? It's pretty obvious given your example how to do GET
requests, but I don't understand how to supply a request entity for a POST.
I have the JSON I want to use for both the config and schema ready to go.
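For reference, one way to send a JSON body without going through SolrJ at all is the JDK's own java.net.http client (Java 11 or later). This is only a sketch of that alternative, not a SolrJ answer to the question. The base URL, core name, class name, and the add-field command in main are illustrative assumptions.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class JsonPostSketch {
    /** Builds a POST with a JSON entity for <solrBaseUrl>/<core>/<endpoint>. */
    static HttpRequest jsonPost(String solrBaseUrl, String coreName,
                                String endpoint, String json) {
        URI uri = URI.create(solrBaseUrl + "/" + coreName + "/" + endpoint);
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    /** Sends the request and returns "<status> <body>". Needs a running server. */
    static String send(HttpRequest request) throws IOException, InterruptedException {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode() + " " + response.body();
    }

    public static void main(String[] args) {
        // Hypothetical schema command and base URL, for illustration only.
        String json = "{\"add-field\":{\"name\":\"title\",\"type\":\"string\",\"stored\":true}}";
        HttpRequest req = jsonPost("http://localhost:8983/solr", "test_core2", "schema", json);
        // Only the request is printed here; call send(req) against a live Solr.
        System.out.println(req.method() + " " + req.uri());
    }
}
```

The same helper would serve the config endpoint by passing "config" instead of "schema".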
Thanks,
-chris
Re: Create a core via SolrClient, single server
Posted by Christopher Schultz <ch...@christopherschultz.net>.
Shawn,
On 6/1/22 15:18, Shawn Heisey wrote:
> On 6/1/2022 11:41 AM, Christopher Schultz wrote:
>> How can I provide the schema for the core once it's been created? Can
>> I use the API for that, or do I have to resort to pushing the config
>> file directly, similar to these kinds of curl commands:
>>
>> curl -d "{ ... config }" \
>> ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>>
>> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
>> "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
>
> There's a chicken and egg problem. You can't use those endpoints until
> the core is created. And you can't create the core without providing
> the config and schema.
>
> https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create
>
> (the large WARNING box in this section of the docs is the part I am
> referring you to)
I'll have a look.
> Assuming you're not going to be using cloud mode (in which case all
> configs are in zookeeper) you have two choices: Create the core
> directory with a conf subdirectory that contains a config and a schema
> before calling the CoreAdmin API, or use the ConfigSets feature.
>
> https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode
When I used the CoreAdminRequest as suggested by Clemens, the Solr
server did create the core directory, but it's empty and I got that error.
Would it not be theoretically possible to accept the core-name and
config and schema all at once and provision the whole thing? This seems
like a big missing feature after 9 major versions. Maybe Solr standalone
is only for children :)
My notes for creating the core include:
$ sudo -u solr ${SOLR_HOME}/bin/solr create -c [corename]
I've never tried to do that from a remote server, but I don't specify a
config file or a schema in that command, and it works. Of course, I get
this warning:
WARNING: Using _default configset with data driven schema functionality.
NOT RECOMMENDED for production use.
My next steps are to provide a small config and a schema using two
separate curl commands, which obviously only communicate over the REST API.
What magic is "solr create" performing that I can't use via the API? OR
can I simulate it in some way?
> Checking SolrJ, you would use createCore("corename", "corename", client)
> if you go with the first option and name the directory "corename" in the
> solr home. It doesn't look like CoreAdminRequest has a convenience
> method for using a configSet when creating a core.
>
> I worked out how to do it as a generic request with SolrJ if you want to
> use the configsets feature:
>
> https://paste.elyograg.org/view/3cd9aac2
Cool, though this is basically using SolrClient as an HttpClient ;)
Would this work with configSet=_default ? It would be great if I didn't
have to prepare a Solr installation other than making sure it's running
before my application is able to create cores.
Thanks,
-chris
Re: Create a core via SolrClient, single server
Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/1/2022 11:41 AM, Christopher Schultz wrote:
> How can I provide the schema for the core once it's been created? Can
> I use the API for that, or do I have to resort to pushing the config
> file directly, similar to these kinds of curl commands:
>
> curl -d "{ ... config }" \
> ${SCHEME}://localhost:${PORT}/solr/${CORENAME}/config
>
> curl -H 'Content-Type: application/json' --data-binary '{ ... schema ... }' \
> "${SCHEME}://localhost:${PORT}/solr/${CORENAME}/schema"
There's a chicken and egg problem. You can't use those endpoints until
the core is created. And you can't create the core without providing
the config and schema.
https://solr.apache.org/guide/8_9/coreadmin-api.html#coreadmin-create
(the large WARNING box in this section of the docs is the part I am
referring you to)
Assuming you're not going to be using cloud mode (in which case all
configs are in ZooKeeper) you have two choices: Create the core
directory with a conf subdirectory that contains a config and a schema
before calling the CoreAdmin API, or use the ConfigSets feature.
https://solr.apache.org/guide/8_9/config-sets.html#configsets-in-standalone-mode
Checking SolrJ, you would use createCore("corename", "corename", client)
if you go with the first option and name the directory "corename" in the
Solr home. It doesn't look like CoreAdminRequest has a convenience
method for using a configSet when creating a core.
I worked out how to do it as a generic request with SolrJ if you want to
use the configsets feature:
https://paste.elyograg.org/view/3cd9aac2
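For comparison, the same kind of CREATE call can be written out as a plain URL. This JDK-only sketch is not the content of the paste above. It assumes the request carries the parameters action, name, instanceDir, and configSet, and the base URL and class name are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class CoreCreateUrlSketch {
    /** Builds a CoreAdmin CREATE URL that references a configset by name. */
    static String createCoreUrl(String solrBaseUrl, String coreName, String configSet) {
        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("action", "CREATE");
        params.put("name", coreName);
        params.put("instanceDir", coreName); // directory named after the core
        params.put("configSet", configSet);
        String query = params.entrySet().stream()
                .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
                .collect(Collectors.joining("&"));
        return solrBaseUrl + "/admin/cores?" + query;
    }

    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(createCoreUrl("http://localhost:8983/solr", "test_core2", "_default"));
    }
}
```

A GET on the resulting URL would be expected to behave like the generic SolrJ request, provided the named configset exists on the server.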
Thanks,
Shawn
Re: Create a core via SolrClient, single server
Posted by Christopher Schultz <ch...@christopherschultz.net>.
Clemens,
On 6/1/22 13:41, Christopher Schultz wrote:
> Clemens,
>
> On 5/30/22 02:02, Clemens WYSS (Helbling Technik) wrote:
>> Given a connection to Solr ( e.g. adminSolrConnection )
>> CoreAdminRequest.Create createCoreRequest = new
>> CoreAdminRequest.Create();
>> createCoreRequest.setCoreName( coreName );
>> createCoreRequest.process( adminSolrConnection );
>
> What is an "admin solr connection"? Is that any different than just a
> plain-old HttpSolrClient instance?
I found that I needed a client that wasn't pointing to any existing
core, which wasn't a problem.
But I do get this error when trying to create the core:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:
Error from server at ${SOLR_BASE_URL}/solr: Error CREATEing SolrCore
'test_remote_create_core': Can't find resource 'solrconfig.xml' in
classpath or '${SOLR_HOME}/server/solr/test_remote_create_core'
Here's the code I used to create the core:
CoreAdminRequest car = new CoreAdminRequest.Create();
car.setCoreName("test_remote_create_core");
CoreAdminResponse response = car.process(solr);
I'd like to be able to bootstrap a core from my application if it
doesn't exist. I can supply whatever information is necessary, but I
need to be able to do it without doing anything other than making API
calls, either via SolrJ or directly via HTTP/REST.
Thanks,
-chris