Posted to solr-user@lucene.apache.org by Puppy Linux Distros <vi...@gmail.com> on 2017/12/04 05:21:36 UTC

Re: check softCommit , autocommit and hard commit count

Hello,

Thanks, Shawn. Can you provide a command to find the total number of
autocommits in solr.log?

On Thu, Nov 30, 2017 at 7:20 PM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 11/30/2017 4:36 AM, Puppy Linux Distros wrote:
>
>> I am trying to calculate the total number of soft commits, autocommits,
>> and hard commits from the Solr logs. Can you please check whether the
>> commands below are correct?
>>
>> Let me know how to find the total soft commit, hard commit, and autocommit
>> counts from the logs.
>>
>>
>> *1. totalcommit=`cat $solrlogfile | grep "start commit" | wc -l`*
>>
>> *totalcommit =  **41906*
>>
>>
>> *2. totalsoftcommit=`cat $solrlogfile | grep "start commit" | grep
>> "softCommit=true" | wc -l`*
>>
>> *totalsoftcommit =  **921*
>>
>
> These look reasonable ... but be aware that the default logging config
> will roll the solr.log file to a new empty file when it reaches 4
> megabytes, which doesn't really take that long on a busy server, so if
> you're only looking at "solr.log" you may have an incomplete picture.  I
> personally change the roll size limit to 4 gigabytes so solr.log covers a
> lot more time.
>
> Solr restarts will *also* roll/archive logfiles, so you probably can't
> just look through every file in the logs directory that starts with
> "solr.log" -- it may be difficult to figure out exactly which files apply
> to the current running instance.  It might turn out that I'm completely
> wrong in that statement -- I haven't confirmed exactly what a Solr restart
> actually does with the logfiles.
>
>> *3. totalhardcommits=`cat $solrlogfile | grep "start commit" | grep
>> "softCommit=false" | grep "openSearcher=true" | wc -l`*
>>
>> *totalhardcommits=  **40982*
>>
>
> If you have configured autoCommit in solrconfig.xml and have set
> openSearcher to false in that config, then there will be hard commits that
> *don't* open a new searcher, so the "openSearcher=true" part will not catch
> those commits.  Example configs in recent versions have autoCommit set up
> this way, and this is the recommended config for *everybody*.  The default
> autoCommit interval in the example configs is 15 seconds, which I think is
> a little too aggressive, but this kind of commit is typically very fast, so
> I've never seen that config cause problems.
>
> The example configs do not have autoSoftCommit configured.  If users want
> to automatically do commits for visibility, we recommend that they use
> autoSoftCommit.
>
>> *4. totalautocommit=`cat $solrlogfile | grep "realtime" | wc -l`*
>>
>> *totalautocommit= 3*
>>
>
> These aren't autoCommits.  They are new searchers for the realtime get
> handler, which is capable of accessing documents that haven't been
> committed yet.  In addition to the index on disk, it searches the
> transaction logs.  Opening a new realtime searcher should be very fast, and
> they happen without any configuration. I'm not sure why you're only seeing
> this happen three times here. Presumably in a log where there are 40000
> total commits, you are doing a fair amount of indexing, so I would have
> expected a new realtime searcher to have been created much more frequently,
> even if there were no commits done at all.
>
> Maybe the realtime get handler can use the standard searcher, and only
> opens a new realtime searcher in cases where new documents have been
> indexed but there hasn't been a recent commit that opens a new searcher.
> If that's the case, then I have no idea how long it would wait before
> firing up a new realtime searcher.  I wouldn't expect that to be very long
> ... so if your indexing/committing cycles are normally very fast, maybe
> Solr doesn't feel it's necessary to open realtime searchers very often.
>
> Thanks,
> Shawn
>
>


-- 
Regards,

Vivek CV
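
For reference, the counting commands discussed above can be sketched as a small shell function. This is only a sketch under assumptions: the name `count_commits` is made up for illustration, and it assumes each commit is logged on a single line containing `start commit` together with a `softCommit=true` or `softCommit=false` flag, as the commands in this thread imply. It accepts several files so that rolled logs can be included, and it deliberately does not filter on `openSearcher`, so hard commits that do not open a new searcher are counted too.

```shell
#!/bin/sh
# Sketch: count commit types in one or more Solr log files in a single pass.
# Assumption (not verified for every Solr version): each commit is logged on
# one line containing "start commit" plus a softCommit=true/false flag.
count_commits() {
    awk '
        /start commit/ {
            total++
            if (/softCommit=true/) soft++            # soft commits
            else if (/softCommit=false/) hard++      # hard commits, any openSearcher value
        }
        END { printf "total=%d soft=%d hard=%d\n", total, soft, hard }
    ' "$@"
}
```

It could be called as `count_commits solr.log solr.log.1`, where the file names are placeholders for whichever archived logs cover the period of interest.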

Re: check softCommit , autocommit and hard commit count

Posted by Erick Erickson <er...@gmail.com>.
Neither commit does anything if no updates have been received.

But you don't need to wait for the devs to STOP DOING THAT ;). In
solrconfig.xml you can configure IgnoreCommitOptimizeUpdateProcessorFactory
in the update processor chain so that explicit commits sent by clients are
ignored. See the ref guide....

Best,
Erick

On Mon, Dec 4, 2017 at 12:53 AM, Puppy Linux Distros <vi...@gmail.com> wrote:
> Hi,
>
> Thanks Shawn for the help.
>
> I think I should have added a few more details to my previous mail.
>
> I know it's bad practice, but for various reasons our application fires
> hard commits from code: most /update requests are sent with commit=true,
> and the application rarely uses soft commits. I will recommend that the
> devs use more soft commits and make use of realtime searchers in the
> future.
>
> However, my current task is to upgrade Solr to the latest version, 7.1.0,
> so I need to measure the current traffic in Solr in order to tune the
> trade-offs for the new stack. The current stack is a bit older (4.10), so
> I have to parse the current Solr logs.
>
> I have my own log storage mechanism, so I have one month of solr.log
> stored, and rotation/archiving isn't an issue here. Once I get hold of the
> unique phrases that are logged for each type of commit (soft commit,
> automatic hard commit, explicit hard commit), I can build some metrics of
> the current traffic.
>
> Our current stack still has the default autocommit config:
> openSearcher=false and a 15s interval. We currently don't have automatic
> soft commits enabled; however, soft commits and hard commits are invoked
> explicitly by the application, so it's a bit hard to separate them in
> solr.log unless I find some unique phrases/regexes/words in the log lines
> that each type of commit writes. Any input in this area would be really
> helpful.
>
> In addition, I just wanted to confirm: if there are no pending updates to
> be written to disk, does autocommit really fire at its interval, or does it
> stay idle when there is nothing to write? Put another way, suppose I make a
> soft commit at the 5th second and an explicit hard commit at the 10th
> second. Will an autocommit still happen at the 15th second for no reason,
> even though the hard commit at the 10th second has already written the
> changes to disk and rebuilt the index? If it stays idle in that case, it
> would make sense that I see very few autocommit entries in the logs, since
> the application fires hard commits very frequently.
>
> Any help is appreciated.
> Thanks in advance,
>
> --
> Regards,
>
> Vivek CV

Re: check softCommit , autocommit and hard commit count

Posted by Shawn Heisey <ap...@elyograg.org>.
On 12/4/2017 1:53 AM, Puppy Linux Distros wrote:
> I know it's bad practice, but for various reasons our application fires
> hard commits from code: most /update requests are sent with commit=true,
> and the application rarely uses soft commits. I will recommend that the
> devs use more soft commits and make use of realtime searchers in the
> future.

Anyone who AUTOMATICALLY says it's bad practice to send hard commits 
doesn't fully understand all the mechanics.  It's true that soft commits 
are recommended when you want to see index changes, but this is only 
because they *MIGHT* be faster than hard commits.

It's actually the opening of the searcher that tends to be a performance 
killer, and soft commits DO open a new searcher. There are situations in 
which a soft commit is NOT any faster than a hard commit, but because it 
MIGHT be faster, they are generally recommended.

There are plenty of users who never worry about the difference and 
always use hard commits.  Most of the time this is long-time users who 
got into Solr before version 4.0, when soft commits were introduced.

> In addition, I just wanted to confirm: if there are no pending updates to
> be written to disk, does autocommit really fire at its interval, or does it
> stay idle when there is nothing to write? Put another way, suppose I make a
> soft commit at the 5th second and an explicit hard commit at the 10th
> second. Will an autocommit still happen at the 15th second for no reason,
> even though the hard commit at the 10th second has already written the
> changes to disk and rebuilt the index? If it stays idle in that case, it
> would make sense that I see very few autocommit entries in the logs, since
> the application fires hard commits very frequently.

As Erick said, if there have been no changes to the index, then *any* 
kind of commit will do nothing.  The automatic commits don't fire if 
there have been no changes to the index.  I do not know what happens in 
the log when a commit is requested that does nothing.

Thanks,
Shawn
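
One possible heuristic for the autocommit count asked about earlier in the thread, offered as a sketch only: with autoCommit configured with openSearcher=false, automatic hard commits should appear as `start commit` lines carrying both `softCommit=false` and `openSearcher=false`. This assumes the application's explicit commits keep the default of openSearcher=true; any explicit commit sent with openSearcher=false would be counted as well. The name `count_probable_autocommits` is hypothetical, and the log line layout is assumed from the commands quoted in this thread.

```shell
#!/bin/sh
# Sketch: estimate automatic hard commits in one or more Solr log files.
# Assumptions: autoCommit uses openSearcher=false, explicit commits from the
# application use openSearcher=true, and each commit is logged on one line
# that contains "start commit" and both flags.
count_probable_autocommits() {
    awk '
        /start commit/ && /softCommit=false/ && /openSearcher=false/ { n++ }
        END { print n + 0 }    # n + 0 prints 0 when nothing matched
    ' "$@"
}
```

The counts posted earlier leave 41906 - 921 - 40982 = 3 commits that are neither soft commits nor hard commits with openSearcher=true. That would be consistent with a handful of openSearcher=false hard commits, though nothing in the thread confirms it.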


Re: check softCommit , autocommit and hard commit count

Posted by Puppy Linux Distros <vi...@gmail.com>.
Hi,

Thanks Shawn for the help.

I think I should have added a few more details to my previous mail.

I know it's bad practice, but for various reasons our application fires
hard commits from code: most /update requests are sent with commit=true,
and the application rarely uses soft commits. I will recommend that the
devs use more soft commits and make use of realtime searchers in the
future.

However, my current task is to upgrade Solr to the latest version, 7.1.0,
so I need to measure the current traffic in Solr in order to tune the
trade-offs for the new stack. The current stack is a bit older (4.10), so
I have to parse the current Solr logs.

I have my own log storage mechanism, so I have one month of solr.log
stored, and rotation/archiving isn't an issue here. Once I get hold of the
unique phrases that are logged for each type of commit (soft commit,
automatic hard commit, explicit hard commit), I can build some metrics of
the current traffic.

Our current stack still has the default autocommit config:
openSearcher=false and a 15s interval. We currently don't have automatic
soft commits enabled; however, soft commits and hard commits are invoked
explicitly by the application, so it's a bit hard to separate them in
solr.log unless I find some unique phrases/regexes/words in the log lines
that each type of commit writes. Any input in this area would be really
helpful.

In addition, I just wanted to confirm: if there are no pending updates to
be written to disk, does autocommit really fire at its interval, or does it
stay idle when there is nothing to write? Put another way, suppose I make a
soft commit at the 5th second and an explicit hard commit at the 10th
second. Will an autocommit still happen at the 15th second for no reason,
even though the hard commit at the 10th second has already written the
changes to disk and rebuilt the index? If it stays idle in that case, it
would make sense that I see very few autocommit entries in the logs, since
the application fires hard commits very frequently.

Any help is appreciated.
Thanks in advance,

-- 
Regards,

Vivek CV