Posted to users@solr.apache.org by mtn search <se...@gmail.com> on 2024/03/07 23:12:44 UTC

Solr 6 Replication Question

Hello,

While building and deploying a new SolrCloud-based solution, our team is
still maintaining a large Solr 6 master/slave (M/S) deployment, and I have a
replication question.

We are in a situation where we likely need to take down our set of primary
slave cores.  My understanding is that there is a point where Solr will end
up doing a full replication of the set of index files (e.g. after the master
core was optimized) instead of replicating updates to index files.  If I am
correct that there is a tipping point, what characterizes it?  Number of
commits?  Last-modified date?

I am trying to anticipate, if the primaries are down for a few days, a week,
or more than a week, when we might face a number of cores performing a full
replication.

Thanks,
Matt

Re: Solr 6 Replication Question

Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 3/16/2024 11:52, mtn search wrote:
> While following up on a request to verify replication within our Solr farm,
> I discovered a puzzling state.  For a number of Solr 6 cores, the master
> and the core replicating from it have the same doc count, indexVersion
> number, and LastUpdatedDate values; however, the Gen value is off by one
> (master ahead by one).  Replication is not being triggered.

The indexes should be absolutely identical after replication occurs.

Unless there is something I am unaware of, which is possible, my guess
would be that either you happened to catch it while a replication was
still occurring, or your "replicateAfter" settings on the master are not
configured to trigger a replication for some event that is happening.
Enabling replicateAfter for commit is generally the most useful setting.
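
One way to check whether a core is merely mid-replication or genuinely
stuck is to ask the replication handler on both sides for its index version
and generation and compare them.  The following is only a minimal sketch in
Python.  The host names and core name are hypothetical, and it assumes the
"indexversion" command returns JSON containing "indexversion" and
"generation" keys; verify that against your own Solr 6 responses.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command):
    """Build a replication handler URL for one core (hypothetical hosts/cores)."""
    query = urlencode({"command": command, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def compare_index_state(master, slave):
    """Compare two already-parsed 'indexversion' responses.

    Assumes each dict carries 'indexversion' and 'generation' keys.
    """
    return {
        "same_version": master["indexversion"] == slave["indexversion"],
        "generation_gap": master["generation"] - slave["generation"],
    }


if __name__ == "__main__":
    # Hypothetical host and core names, for illustration only.
    print(replication_url("http://master-host:8983/solr", "core1", "indexversion"))
    print(replication_url("http://slave-host:8983/solr", "core1", "indexversion"))
    # Made-up values mirroring the state described in the question.
    print(compare_index_state(
        {"indexversion": 1710000000000, "generation": 12},
        {"indexversion": 1710000000000, "generation": 11},
    ))
```

Fetching each URL a few minutes apart and comparing the results should show
whether the generation gap closes on its own (a replication was in flight)
or persists (nothing is triggering it).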

I do not have access to systems using replication, so it will be 
difficult for me to verify what I am saying.

Thanks,
Shawn


Re: Solr 6 Replication Question

Posted by mtn search <se...@gmail.com>.
Follow-on question:

While following up on a request to verify replication within our Solr farm,
I discovered a puzzling state.  For a number of Solr 6 cores, the master
and the core replicating from it have the same doc count, indexVersion
number, and LastUpdatedDate values; however, the Gen value is off by one
(master ahead by one).  Replication is not being triggered.

Any ideas on what would cause this state, and on what it means for the
version to be the same while the gen differs?

Thanks,
Matt

On Mon, Mar 11, 2024 at 8:51 AM Matt Kuiper <ku...@gmail.com> wrote:

> Thanks Shawn!  That is helpful information.
>
> Matt
>

Re: Solr 6 Replication Question

Posted by Matt Kuiper <ku...@gmail.com>.
Thanks Shawn!  That is helpful information.

Matt

On Fri, Mar 8, 2024 at 2:26 AM Shawn Heisey <ap...@elyograg.org.invalid>
wrote:

> On 3/7/24 16:22, mtn search wrote:
> > Realized my question was not very clear.  It pertains to when we bring the
> > primary cores back up after X number of days: how does Solr determine
> > whether to do a standard replication of updates or a full replication of
> > the set of index files?
>
> There is no "tipping point".  It works almost exactly like rsync.
>
> If you optimize the index, then all the segment files will be different,
> so it will need to copy the whole index.  During normal operation, some
> of the index files may be merged into larger segments.  Anything that
> does not already exist on the target index will be copied, and it will
> delete any files that do not exist in the source.
>
> Thanks,
> Shawn
>
>

Re: Solr 6 Replication Question

Posted by Shawn Heisey <ap...@elyograg.org.INVALID>.
On 3/7/24 16:22, mtn search wrote:
> Realized my question was not very clear.  It pertains to when we bring the
> primary cores back up after X number of days: how does Solr determine
> whether to do a standard replication of updates or a full replication of
> the set of index files?

There is no "tipping point".  It works almost exactly like rsync.

If you optimize the index, then all the segment files will be different, 
so it will need to copy the whole index.  During normal operation, some 
of the index files may be merged into larger segments.  Anything that 
does not already exist on the target index will be copied, and it will 
delete any files that do not exist in the source.
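
The rsync-like behaviour described above can be illustrated with a small
Python sketch.  This is not Solr's actual code; it only models the idea by
comparing file names, and the segment file names below are made up.  The
real implementation may also compare sizes or checksums.

```python
def plan_replication(source_files, target_files):
    """Return (to_copy, to_delete) for a name-based, rsync-like sync.

    source_files: file names in the master's index directory
    target_files: file names in the replica's index directory
    """
    source = set(source_files)
    target = set(target_files)
    to_copy = sorted(source - target)    # missing on the target: fetch these
    to_delete = sorted(target - source)  # gone from the source: remove these
    return to_copy, to_delete


if __name__ == "__main__":
    # Normal operation: one new segment was added on the master.
    master = ["_a.cfs", "_b.cfs", "_c.cfs", "segments_5"]
    replica = ["_a.cfs", "_b.cfs", "segments_4"]
    print(plan_replication(master, replica))

    # After an optimize: every segment file on the master is new.
    optimized = ["_z.cfs", "segments_6"]
    print(plan_replication(optimized, replica))
```

In the first case only the new files are copied.  In the second case
nothing on the replica matches, so the whole index is copied, which is the
"full replication" case.  How long the replica was offline does not enter
into it; only the difference between the two file sets does.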

Thanks,
Shawn


Re: Solr 6 Replication Question

Posted by mtn search <se...@gmail.com>.
Realized my question was not very clear.  It pertains to when we bring the
primary cores back up after X number of days: how does Solr determine
whether to do a standard replication of updates or a full replication of
the set of index files?
