Posted to solr-user@lucene.apache.org by Tyrone Tse <ty...@hotmail.com> on 2020/07/21 17:32:30 UTC

Setting in solrconfig.xml does it override Solr REST Post calls with parameters commit=true&softCommit=false

I am using Solr 8.5 cloud, and in my collection I have edited the
solrconfig.xml file to use
<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

and commented out the default <autoCommit> configuration

  <!--
    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
  -->

We are using SolrJ to post files to Solr. Here is the snippet of Java
code that does it:

try (HttpSolrClient solrClient = solr.build()) {
    ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
    up.addFile(f, mimeType);
    String tempId = f.getName() + (new Date()).toString();
    up.setParam("literal.id", tempId);
    up.setParam("literal.username", user);
    up.setParam("literal.fileName", f.getName());
    up.setParam("literal.filePath", path);
    up.setParam("uprefix", "attr_");
    up.setParam("fmap.content", "attr_content");
    up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
    logger.info("PreRequest");
    solrClient.request(up);
    logger.info("PostRequest");
    resultId = tempId;
} catch (IOException | SolrServerException | HttpSolrClient.RemoteSolrException e) {
    logger.error("Error connecting.committing to Solr", e);
}

So I am not passing the parameter to do a softCommit in the SolrJ command.

When I posted a file to my Solr core and then looked at the solr.log file,
I saw the following entry:

2020-07-21 16:38:54.719 INFO  (qtp1546693040-302) [c:files s:shard1 r:core_node5 x:files_shard1_replica_n2] o.a.s.u.p.LogUpdateProcessorFactory [files_shard1_replica_n2]  webapp=/solr path=/update params={update.distrib=TOLEADER&update.chain=files-update-processor&waitSearcher=true&openSearcher=true&commit=true&softCommit=false&distrib.from=http://192.168.1.191:8983/solr/files_shard2_replica_n6/

Does having <autoSoftCommit> set in solrconfig.xml override REST POST
calls that have the parameter softCommit=false, and force a soft commit
when the data is posted to Solr?

Thanks in advance.

Re: Setting in solrconfig.xml does it override Solr REST Post calls with parameters commit=true&softCommit=false

Posted by Erick Erickson <er...@gmail.com>.
Yep. That assumes you can afford a delay of up to 5 minutes between the time you send a doc to Solr and the time your users can search for it.

Part of it depends on what your indexing rate is. If you’re only sending docs occasionally, you may want to make that longer. Frankly, though, the interval there isn’t too important in practice; it’s opening a new searcher that affects your setup most obviously.

Best,
Erick

> On Jul 21, 2020, at 3:52 PM, Tyrone Tse <ty...@hotmail.com> wrote:
> 
> Eric
> 
> Thanks for your quick response.
> So in the solrconfig.xml keep the  out of the box setting of 15 seconds
> 
>    <autoCommit>
>      <maxTime>15000</maxTime>
>      <openSearcher>false</openSearcher>
>    </autoCommit>
> 
> and also have the <autoSoftCommit> setting set to something like 5 minutes
>      <autoSoftCommit>
>        <maxTime>300000</maxTime>
>      </autoSoftCommit>
> 
> Then in the existing SolrJ code just simply delete the line
> 
> up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
> 
> Is this what you recommended I try.
> 
> Thanks
> 
> Tyrone
> 
> 


Re: Setting in solrconfig.xml does it override Solr REST Post calls with parameters commit=true&softCommit=false

Posted by Tyrone Tse <ty...@hotmail.com>.
Erick,

Thanks for your quick response.
So in solrconfig.xml I should keep the out-of-the-box setting of 15 seconds:

    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

and also set <autoSoftCommit> to something like 5 minutes:

    <autoSoftCommit>
      <maxTime>300000</maxTime>
    </autoSoftCommit>

Then in the existing SolrJ code, simply delete the line:

up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);

Is this what you recommend I try?

Thanks

Tyrone


On Tue, Jul 21, 2020 at 1:16 PM Erick Erickson <er...@gmail.com>
wrote:

> What you’re seeing is the input from the client. You’ve passed true, true,
> which are waitFlush and waitSearcher
> which sets softCommit _for that call_ to false. It has nothing to do with
> the settings in your config file.
>
> bq. I am not passing the parameter to do a softCommit  in the SolrJ
> command.
>
> I don’t think so. That’s a hard commit. This is a little tricky since the
> waitFlush and waitSearcher params don’t
> tell you that they are about hard commits. There used to only be hard
> commits, so...
>
> But…. these settings are highly suspicious. Here’s the long form:
>
>
> https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> It is risky to have your autocommit settings commented out. You risk
> transaction logs growing
> forever to no  purpose whatsoever.
>
> Your call does a hard commit _and_ opens a new searcher for, apparently,
> every document.
>
> But your autoSoftCommit settings also open a new searcher without doing
> anything about flushing
> data to disk every second. Usually, this is far too often unless you have
> extremely stringent latency
> requirements, and in this case unless you’re only indexing once in a great
> while your caches
> are pretty useless.
>
> I strongly urge you to uncomment autocommit settings. Make the autoCommit
> interval something
> reasonable (15-60 seconds for instance) with openSearcher=false.
>
> Then lengthen your autoSoftCommit settings to as long as you can stand.
> The longer the interval,
> the less work you’ll do opening new searchers, which is a rather expensive
> operation. I like
> 5-10 minutes if possible, but your app may require shorter intervals.
>
> Then don’t send any commit settings in your SolrJ program at all.
>
> Best,
> Erick
>

Re: Setting in solrconfig.xml does it override Solr REST Post calls with parameters commit=true&softCommit=false

Posted by Erick Erickson <er...@gmail.com>.
What you’re seeing is the input from the client. You’ve passed true, true, which are waitFlush and waitSearcher,
and passing only those two sets softCommit _for that call_ to false. It has nothing to do with the settings in your config file.
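
A minimal sketch of why those two booleans end up as softCommit=false in the logged request. This is a stand-alone model, not SolrJ's source: the class and method names are hypothetical, and the only grounded part is the set of parameters visible in the log entry (commit, softCommit, waitSearcher). The log shows no waitFlush parameter, so the sketch does not forward it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CommitParamsSketch {

    // Hypothetical model of a commit action called with only waitFlush and
    // waitSearcher: no soft commit is requested, so softCommit defaults to false.
    static Map<String, String> commitParams(boolean waitFlush, boolean waitSearcher) {
        return commitParams(waitFlush, waitSearcher, false);
    }

    // Hypothetical fuller form with an explicit softCommit flag.
    // waitFlush is accepted but not sent, matching the logged parameters.
    static Map<String, String> commitParams(boolean waitFlush, boolean waitSearcher,
                                            boolean softCommit) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("commit", "true");
        params.put("softCommit", String.valueOf(softCommit));
        params.put("waitSearcher", String.valueOf(waitSearcher));
        return params;
    }

    public static void main(String[] args) {
        // Prints {commit=true, softCommit=false, waitSearcher=true}
        System.out.println(commitParams(true, true));
    }
}
```

The point of the model is that softCommit=false comes from the request the client built, so nothing in solrconfig.xml is being overridden or consulted for that value.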

bq. I am not passing the parameter to do a softCommit in the SolrJ command.

I don’t think you are, either: that’s a hard commit. This is a little tricky since the waitFlush and waitSearcher params don’t
tell you that they are about hard commits. There used to be only hard commits, so...

But... these settings are highly suspicious. Here’s the long form:

https://lucidworks.com/post/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

It is risky to have your autoCommit settings commented out. You risk transaction logs growing
forever, to no purpose whatsoever.

Your call does a hard commit _and_ opens a new searcher for, apparently, every document.
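
One way to check this against the log entry from the original post is to split its params string and look at the commit-related keys. The sketch below uses only the Java standard library; the class and method names are made up for illustration, and the params string is copied from the post as far as it is shown there (the entry is cut off after the distrib.from URL).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LogParams {

    // Split "k1=v1&k2=v2" into an ordered map; a key without '=' maps to "".
    static Map<String, String> parse(String params) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : params.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                out.put(pair, "");
            } else {
                out.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return out;
    }

    // The params from the posted log entry, joined back into one string.
    static final String LOGGED =
        "update.distrib=TOLEADER&update.chain=files-update-processor"
        + "&waitSearcher=true&openSearcher=true&commit=true&softCommit=false"
        + "&distrib.from=http://192.168.1.191:8983/solr/files_shard2_replica_n6/";

    public static void main(String[] args) {
        Map<String, String> p = parse(LOGGED);
        for (String key : new String[] {"commit", "softCommit", "openSearcher"}) {
            System.out.println(key + "=" + p.get(key));
        }
    }
}
```

For this entry the three keys come out as commit=true, softCommit=false and openSearcher=true, which is the hard-commit-plus-new-searcher combination described above.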

But your autoSoftCommit settings also open a new searcher every second, without doing anything about
flushing data to disk. Usually, this is far too often unless you have extremely stringent latency
requirements, and in this case, unless you’re only indexing once in a great while, your caches
are pretty useless.

I strongly urge you to uncomment the autoCommit settings. Make the autoCommit interval something
reasonable (15-60 seconds, for instance) with openSearcher=false.

Then lengthen your autoSoftCommit settings to as long as you can stand. The longer the interval,
the less work you’ll do opening new searchers, which is a rather expensive operation. I like
5-10 minutes if possible, but your app may require shorter intervals.
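
To put rough numbers on that, here is a small back-of-the-envelope sketch. It assumes documents arrive continuously, so that every autoSoftCommit interval actually triggers a soft commit; with sparse indexing the real count would be lower. The class and method names are only for illustration.

```java
public class SearcherOpenRate {

    // Rough upper bound on searchers opened per hour by autoSoftCommit alone,
    // assuming continuous indexing so each maxTime interval ends in a soft commit.
    static long maxSearchersPerHour(long maxTimeMs) {
        return 3_600_000L / maxTimeMs;
    }

    public static void main(String[] args) {
        System.out.println(maxSearchersPerHour(1_000));    // 1 second   -> 3600
        System.out.println(maxSearchersPerHour(300_000));  // 5 minutes  -> 12
        System.out.println(maxSearchersPerHour(600_000));  // 10 minutes -> 6
    }
}
```

Under that assumption, moving from 1 second to 5 minutes cuts searcher opens from thousands per hour to about a dozen, which also gives the caches time to be useful.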

Then don’t send any commit settings in your SolrJ program at all.

Best,
Erick
