Posted to solr-user@lucene.apache.org by Midas A <te...@gmail.com> on 2015/11/30 07:18:57 UTC

Soft commit and hard commit

Machine configuration

RAM: 48 GB
CPU: 8 core
JVM : 36 GB

We are updating 70,000 docs/hr. What should our soft commit and hard commit
times be to get the best results?

Current configuration :
<autoCommit> <maxTime>60000</maxTime> <openSearcher>false</openSearcher> </autoCommit>


<autoSoftCommit> <maxTime>600000</maxTime> </autoSoftCommit>

There are no reads on the master server.

Re: Soft commit and hard commit

Posted by Alessandro Benedetti <ab...@apache.org>.
In particular, please give us additional details about your search use case.
If the master is not searched, do you mean you have a master/slave
architecture? In that case, how is replication managed?

If you are replicating old style, you will only be able to see what is on
disk at the moment of replication, which means only the hard-committed
segments. Is this your case?
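
For reference, old-style master/slave replication is usually wired up through
the ReplicationHandler in solrconfig.xml. A minimal sketch follows; the host
name, core name, conf files and polling interval are placeholders for
illustration only:

<!-- On the master: expose the index for replication after every hard commit -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="replicateAfter">startup</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- On each slave: poll the master (URL below is a placeholder) -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/core1/replication</str>
    <str name="pollInterval">00:01:00</str>
  </lst>
</requestHandler>

With replicateAfter set to commit, slaves only ever see hard-committed
segments, which is exactly the behaviour described above.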

Do you need soft commit at all?

From Erick's guide (see the config sketch after the quoted points):

Heavy (bulk) indexing
> The assumption here is that you’re interested in getting lots of data to
> the index as quickly as possible for search sometime in the future. I’m
> thinking original loads of a data source etc.
>
>    - Set your soft commit interval quite long. As in 10 minutes or even
>    longer (-1 for no soft commits at all). *Soft commit is about
>    visibility*, and my assumption here is that bulk indexing isn’t about
>    near real time searching so don’t do the extra work of opening any kind of
>    searcher.
>
>
>    - Set your hard commit intervals to 15 seconds, openSearcher=false.
>    Again the assumption is that you’re going to be just blasting data at Solr.
>    The worst case here is that you restart your system and have to replay 15
>    seconds or so of data from your tlog. If your system is bouncing up and
>    down more often than that, fix the reason for that first.
>
>
>    - Only after you’ve tried the simple things should you consider
>    refinements; they’re usually only required in unusual circumstances. But
>    they include:
>
>
>    - Turning off the tlog completely for the bulk-load operation
>
>
>    - Indexing offline with some kind of map-reduce process
>
>
>    - Only having a leader per shard, no replicas for the load, then
>    turning on replicas later and letting them do old-style replication to
>    catch up. Note that this is automatic, if the node discovers it is “too
>    far” out of sync with the leader, it initiates an old-style replication.
>    After it has caught up, it’ll get documents as they’re indexed to the
>    leader and keep its own tlog.
>
>
>    - etc.
>
>
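
Translated into solrconfig.xml, the bulk-indexing recommendation above would
look roughly like the sketch below (the 15-second hard commit and the -1 soft
commit come straight from the quoted guide; treat the exact numbers as a
starting point, not a prescription):

<autoCommit>
  <!-- hard commit: flush accumulated changes to disk every 15 seconds -->
  <maxTime>15000</maxTime>
  <!-- durability only; do not open a new searcher on each hard commit -->
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <!-- -1 disables automatic soft commits: nothing becomes searchable until you commit explicitly -->
  <maxTime>-1</maxTime>
</autoSoftCommit>
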
Cheers


On 30 November 2015 at 09:14, Ali Nazemian <al...@gmail.com> wrote:

> Dear Midas,
> Hi,
> AFAIK, Solr currently relies on the operating system's virtual memory (the
> page cache) for its memory-mapped index files. Therefore, using 36 GB out
> of 48 GB of RAM for the Java heap is not recommended. As a rule of thumb,
> do not allocate more than 25% of your total memory to the Solr JVM in
> usual situations.
> As for your main question, the right soft commit and hard commit settings
> for Solr depend heavily on your application. A really nice guide for this
> purpose was published by Lucidworks; to find the best values for soft
> commit and hard commit, please follow this guide:
>
> http://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> Best regards.
>
> On Mon, Nov 30, 2015 at 9:48 AM, Midas A <te...@gmail.com> wrote:
>
> > Machine configuration
> >
> > RAM: 48 GB
> > CPU: 8 core
> > JVM : 36 GB
> >
> > We are updating 70,000 docs/hr. What should our soft commit and hard
> > commit times be to get the best results?
> >
> > Current configuration :
> > <autoCommit> <maxTime>60000</maxTime> <openSearcher>false</openSearcher> </autoCommit>
> >
> >
> > <autoSoftCommit> <maxTime>600000</maxTime> </autoSoftCommit>
> >
> > There are no reads on the master server.
> >
>
>
>
> --
> A.Nazemian
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England

Re: Soft commit and hard commit

Posted by Ali Nazemian <al...@gmail.com>.
Dear Midas,
Hi,
AFAIK, Solr currently relies on the operating system's virtual memory (the
page cache) for its memory-mapped index files. Therefore, using 36 GB out of
48 GB of RAM for the Java heap is not recommended. As a rule of thumb, do not
allocate more than 25% of your total memory to the Solr JVM in usual
situations.
As for your main question, the right soft commit and hard commit settings for
Solr depend heavily on your application. A really nice guide for this purpose
was published by Lucidworks; to find the best values for soft commit and hard
commit, please follow this guide:
http://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
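
Applied to your setup, since nothing searches the master, one possible
starting point (a sketch only, to be validated against your own indexing
load) is to keep your current 60-second hard commit for durability and
disable soft commits entirely, because soft commits only buy visibility on
the node that receives them:

<autoCommit>
  <!-- hard commit: flush to disk every 60 seconds and keep the transaction log small -->
  <maxTime>60000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <!-- no searches hit this node, so automatic soft commits are unnecessary; -1 turns them off -->
  <maxTime>-1</maxTime>
</autoSoftCommit>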

Best regards.

On Mon, Nov 30, 2015 at 9:48 AM, Midas A <te...@gmail.com> wrote:

> Machine configuration
>
> RAM: 48 GB
> CPU: 8 core
> JVM : 36 GB
>
> We are updating 70,000 docs/hr. What should our soft commit and hard commit
> times be to get the best results?
>
> Current configuration :
> <autoCommit> <maxTime>60000</maxTime> <openSearcher>false</openSearcher> </autoCommit>
>
>
> <autoSoftCommit> <maxTime>600000</maxTime> </autoSoftCommit>
>
> There are no reads on the master server.
>



-- 
A.Nazemian