Posted to solr-user@lucene.apache.org by Alexandre Rocco <al...@gmail.com> on 2012/03/23 15:28:38 UTC

Slave index size growing fast

Hello,

We have a Solr index that averages 1.19 GB in size.
After configuring replication, the index on the slave machine is growing
exponentially.
Currently we have a slave whose index is 323.44 GB.
Is there anything that could cause this behavior?
The current replication config is below.

Master:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<lst name="master">
<str name="replicateAfter">commit</str>
<str name="replicateAfter">startup</str>
<str name="backupAfter">startup</str>
<str name="confFiles">
elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt
</str>
</lst>
</requestHandler>

Slave:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<lst name="slave">
<str name="masterUrl">http://master:8984/solr/Index/replication</str>
</lst>
</requestHandler>

Any pointers will be useful.

Thanks,
Alexandre

Re: Slave index size growing fast

Posted by Alexandre Rocco <al...@gmail.com>.

Erick,

I haven't changed maxCommitsToKeep yet.
We stopped the slave that had issues and removed the data dir as you
suggested, and after starting it again everything went back to normal.
I guess that at some point someone committed on the slave, or even copied the
master files over, and made this mess. I will check our internal docs to
prevent this from happening again.
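
For anyone repeating this cleanup, here is a minimal Python sketch of the "set the data dir aside" step. It assumes Solr on the slave has already been stopped. The function name and the `.bad-<timestamp>` suffix are made up for illustration, and it renames rather than deletes so the old files can be inspected before they are removed for good.

```python
import shutil
import time
from pathlib import Path


def set_aside_data_dir(data_dir):
    """Rename a stopped slave's data directory so Solr starts with an empty one.

    Assumes Solr on this slave is NOT running. Renaming (instead of deleting)
    keeps the suspect files around until the fresh replica has been verified.
    """
    src = Path(data_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"no such data directory: {src}")
    # Hypothetical naming scheme: data -> data.bad-YYYYmmddHHMMSS
    backup = src.with_name(f"{src.name}.bad-{time.strftime('%Y%m%d%H%M%S')}")
    shutil.move(str(src), str(backup))
    return backup
```

After restarting the slave and confirming that it pulled a fresh copy from the master, the renamed directory can be deleted to reclaim the disk space.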

Thanks for explaining the whole concept; it will help us understand the
whole process.

Best,
Alexandre

On Fri, Mar 23, 2012 at 4:05 PM, Erick Erickson <er...@gmail.com> wrote:

> Alexandre:
>
> Have you changed anything like <maxCommitsToKeep> on your slave?
> And do you have more than one slave? If you do, have you considered
> just blowing away the entire .../data directory on the slave and letting
> it re-start from scratch? I'd take the slave out of service for the
> duration of this operation, or do it when you are OK with some number of
> requests going to an empty index....
>
> Because having an index.<timestamp> directory indicates that sometime
> someone forced the slave to get out of sync, possibly as you say by
> doing a commit. Or sending docs to it to be indexed or some such. Starting
> the slave over should fix that if it's the root of your problem.
>
> Note a curious thing about the <timestamp>. When you start indexing, the
> index version is a timestamp. However, from that point on when the index
> changes, the version number is just incremented (not made the current
> time). This is to avoid problems with masters and slaves having different
> times. But a consequence of that is if your slave somehow gets an index
> that's newer, the replication process does the best it can to not delete
> indexes that are out of sync with the master and saves them away. This
> might be what you're seeing.
>
> I'm grasping at straws a bit here, but this seems possible.
>
> Best
> Erick

Re: Slave index size growing fast

Posted by Erick Erickson <er...@gmail.com>.

Alexandre:

Have you changed anything like <maxCommitsToKeep> on your slave?
And do you have more than one slave? If you do, have you considered
just blowing away the entire .../data directory on the slave and letting
it re-start from scratch? I'd take the slave out of service for the
duration of this operation, or do it when you are OK with some number of
requests going to an empty index....

Because having an index.<timestamp> directory indicates that sometime
someone forced the slave to get out of sync, possibly as you say by
doing a commit. Or sending docs to it to be indexed or some such. Starting
the slave over should fix that if it's the root of your problem.

Note a curious thing about the <timestamp>. When you start indexing, the
index version is a timestamp. However, from that point on when the index
changes, the version number is just incremented (not made the current
time). This is to avoid problems with masters and slaves having different
times. But a consequence of that is if your slave somehow gets an index
that's newer, the replication process does the best it can to not delete
indexes that are out of sync with the master and saves them away. This
might be what you're seeing.
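
To make the numbering scheme concrete, here is a toy Python model of what is described above. It is not Solr code: the class, the function, and the returned strings are invented for illustration, and the "slave ahead" branch only mirrors the behavior described in this message.

```python
import time


class IndexVersion:
    """Toy model of the version numbering described above (not Solr code)."""

    def __init__(self, now_ms=None):
        # The very first version is seeded from the wall clock...
        self.value = int(time.time() * 1000) if now_ms is None else now_ms

    def bump(self):
        # ...but every later change just adds one, so clocks stop mattering.
        self.value += 1
        return self.value


def replication_action(master_version, slave_version):
    """Illustrative decision only; the real logic lives inside Solr."""
    if slave_version == master_version:
        return "in sync"
    if slave_version < master_version:
        return "fetch changes"
    # A slave that is ahead has diverged from the master, e.g. after a
    # commit issued directly on the slave.
    return "slave ahead: full copy"
```

In this model, a commit issued directly on a slave bumps its version past the master's, which is the out-of-sync case.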

I'm grasping at straws a bit here, but this seems possible.

Best
Erick

On Fri, Mar 23, 2012 at 1:16 PM, Alexandre Rocco <al...@gmail.com> wrote:
> Tomás,
>
> The 300+GB size is only inside the index.20110926152410 dir. Inside there
> are a lot of files.
> I am almost convinced that something is messed up, like someone committed on
> this slave machine.
>
> Thanks

Re: Slave index size growing fast

Posted by Alexandre Rocco <al...@gmail.com>.

Tomás,

The 300+GB size is only inside the index.20110926152410 dir. Inside there
are a lot of files.
I am almost convinced that something is messed up, like someone committed on
this slave machine.

Thanks

2012/3/23 Tomás Fernández Löbbe <to...@gmail.com>

> Alexandre, in addition to what Erick said, you may want to check on the
> slave whether what's 300+GB is the "data" directory or the "index.<timestamp>"
> directory.

Re: Slave index size growing fast

Posted by Tomás Fernández Löbbe <to...@gmail.com>.

Alexandre, in addition to what Erick said, you may want to check on the
slave whether what's 300+GB is the "data" directory or the "index.<timestamp>"
directory.
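
One way to run that check is a small Python sketch like the one below, which totals the size of each direct subdirectory of the slave's data directory. The function names are made up for illustration, and the path in the usage note is a placeholder.

```python
import os


def dir_size_bytes(path):
    """Total size of the regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def sizes_by_subdir(data_dir):
    """Map each direct subdirectory of data_dir to its total size in bytes."""
    return {
        entry.name: dir_size_bytes(entry.path)
        for entry in os.scandir(data_dir)
        if entry.is_dir(follow_symlinks=False)
    }
```

Calling `sizes_by_subdir("/path/to/solr/data")` on the slave (substitute the real data directory) should show whether the space is in `index`, in an `index.<timestamp>` directory, or somewhere else.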

On Fri, Mar 23, 2012 at 12:25 PM, Erick Erickson <er...@gmail.com> wrote:

> Not really, unless perhaps you're issuing commits or optimizes
> on the _slave_ (which you should NOT do).
>
> Replication happens based on the version of the index on the master.
> True, it starts out as a timestamp, but then successive versions
> just have that number incremented. The version number
> in the index on the slave is compared against the one on the master,
> but the actual time (on the slave or master) is irrelevant. This is
> explicitly to avoid problems with time synching across
> machines/timezones/whatever....
>
> It would be instructive to look at the admin/info page to see what
> the index version is on the master and slave.
>
> But, if you optimize or commit (I think) on the _slave_, you might
> change the timestamp and mess things up (although I'm reaching
> here, I don't know this for certain).
>
> What does the index look like on the slave as compared to the master?
> Are there just a bunch of files on the slave? Or a bunch of directories?
>
> Instead of re-indexing on the master, you could try to bring down the
> slave, blow away the entire index and start it back up. Since this is a
> production system, I'd only try this if I had more than one slave. Although
> you could bring up a new slave and attach it to the master and see
> what happens there. You wouldn't affect production if you didn't point
> incoming requests at it...
>
> Best
> Erick

Re: Slave index size growing fast

Posted by Alexandre Rocco <al...@gmail.com>.

Erick,

The master /data dir contains only an index dir with a bunch of files.
On the slave, the /data dir contains an index.20110926152410 dir with a lot
more files than the master. That seems quite strange to me.

I guess that the config is right, since we have another slave that is
running fine with the same config.
The best bet would be to clean up this messed-up slave, try to sync it again,
and see what happens.

Thanks

On Fri, Mar 23, 2012 at 12:25 PM, Erick Erickson <er...@gmail.com> wrote:

> Not really, unless perhaps you're issuing commits or optimizes
> on the _slave_ (which you should NOT do).
>
> Replication happens based on the version of the index on the master.
> True, it starts out as a timestamp, but then successive versions
> just have that number incremented. The version number
> in the index on the slave is compared against the one on the master,
> but the actual time (on the slave or master) is irrelevant. This is
> explicitly to avoid problems with time synching across
> machines/timezones/whatever....
>
> It would be instructive to look at the admin/info page to see what
> the index version is on the master and slave.
>
> But, if you optimize or commit (I think) on the _slave_, you might
> change the timestamp and mess things up (although I'm reaching
> here, I don't know this for certain).
>
> What does the index look like on the slave as compared to the master?
> Are there just a bunch of files on the slave? Or a bunch of directories?
>
> Instead of re-indexing on the master, you could try to bring down the
> slave, blow away the entire index and start it back up. Since this is a
> production system, I'd only try this if I had more than one slave. Although
> you could bring up a new slave and attach it to the master and see
> what happens there. You wouldn't affect production if you didn't point
> incoming requests at it...
>
> Best
> Erick

Re: Slave index size growing fast

Posted by Erick Erickson <er...@gmail.com>.

Not really, unless perhaps you're issuing commits or optimizes
on the _slave_ (which you should NOT do).

Replication happens based on the version of the index on the master.
True, it starts out as a timestamp, but then successive versions
just have that number incremented. The version number
in the index on the slave is compared against the one on the master,
but the actual time (on the slave or master) is irrelevant. This is
explicitly to avoid problems with time synching across
machines/timezones/whatever....

It would be instructive to look at the admin/info page to see what
the index version is on the master and slave.
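
The same comparison can be scripted against the replication handler instead of the admin page. The Python sketch below is only a rough illustration: it assumes the handler's `details` command with `wt=json` returns a `details` object containing `indexVersion` and `generation`, which should be checked against the real output of your Solr version. The function names are invented.

```python
import json
from urllib.request import urlopen


def parse_details(body):
    """Pull (indexVersion, generation) out of a `command=details` JSON reply.

    The key names are an assumption about the response layout; verify them
    against what your Solr version actually returns.
    """
    details = json.loads(body)["details"]
    return int(details["indexVersion"]), int(details["generation"])


def fetch_details(replication_url, timeout=10):
    """Ask a replication handler for its details and parse the reply."""
    url = f"{replication_url}?command=details&wt=json"
    with urlopen(url, timeout=timeout) as resp:
        return parse_details(resp.read().decode("utf-8"))


def compare(master, slave):
    """Describe how the slave's index version relates to the master's."""
    if slave[0] == master[0]:
        return "in sync"
    if slave[0] > master[0]:
        return "slave is ahead of master"
    return "slave is behind master"
```

For example, `compare(fetch_details("http://master:8984/solr/Index/replication"), fetch_details(slave_url))`, where `slave_url` is the slave's own replication handler URL. A slave that reports a higher version than the master would fit the theory that something was committed on it.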

But, if you optimize or commit (I think) on the _slave_, you might
change the timestamp and mess things up (although I'm reaching
here, I don't know this for certain).

What does the index look like on the slave as compared to the master?
Are there just a bunch of files on the slave? Or a bunch of directories?

Instead of re-indexing on the master, you could try to bring down the
slave, blow away the entire index and start it back up. Since this is a
production system, I'd only try this if I had more than one slave. Although
you could bring up a new slave and attach it to the master and see
what happens there. You wouldn't affect production if you didn't point
incoming requests at it...

Best
Erick

On Fri, Mar 23, 2012 at 11:03 AM, Alexandre Rocco <al...@gmail.com> wrote:
> Erick,
>
> We're using Solr 3.3 on Linux (CentOS 5.6).
> The /data dir on master is actually 1.2G.
>
> I haven't tried to recreate the index yet. Since it's a production
> environment, I guess that I can stop replication and indexing and then
> recreate the master index to see if it makes any difference.
>
> Also just noticed another thread here named "Simple Slave Replication
> Question" that says it could be a problem if I'm seeing a
> /data/index with a timestamp on the slave node.
> Is this info relevant to this issue?
>
> Thanks,
> Alexandre
>
> On Fri, Mar 23, 2012 at 11:48 AM, Erick Erickson <er...@gmail.com>wrote:
>
>> What version of Solr and what operating system?
>>
>> But regardless, this shouldn't be happening. Indexes can
>> temporarily double in size, but any extras should be
>> cleaned up relatively soon.
>>
>> On the master, what's the total size of the <solr home>/data directory?
>> I'm a little suspicious of the <backupAfter> on your master, but I
>> don't think that's the root of your problem....
>>
>> Are you recreating the index on the master (by deleting the
>> index directory and starting over)?
>>
>> This is unusual, and I suspect it's something odd in your configuration,
>> but I confess I'm at a loss as to what.
>>
>> Best
>> Erick
>>
>> On Fri, Mar 23, 2012 at 10:28 AM, Alexandre Rocco <al...@gmail.com>
>> wrote:
>> > Hello,
>> >
>> > We have a Solr index that has an average of 1.19 GB in size.
>> > After configuring the replication, the slave machine is growing the index
>> > size expoentially.
>> > Currently we have an slave with 323.44 GB in size.
>> > Is there anything that could cause this behavior?
>> > The current replication config is below.
>> >
>> > Master:
>> > <requestHandler name="/replication" class="solr.ReplicationHandler">
>> > <lst name="master">
>> > <str name="replicateAfter">commit</str>
>> > <str name="replicateAfter">startup</str>
>> > <str name="backupAfter">startup</str>
>> > <str name="confFiles">
>> > elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt
>> > </str>
>> > </lst>
>> > </requestHandler>
>> >
>> > Slave:
>> > <requestHandler name="/replication" class="solr.ReplicationHandler">
>> > <lst name="slave">
>> > <str name="masterUrl">http://master:8984/solr/Index/replication</str>
>> > </lst>
>> > </requestHandler>
>> >
>> > Any pointers will be useful.
>> >
>> > Thanks,
>> > Alexandre
>>

Re: Slave index size growing fast

Posted by Alexandre Rocco <al...@gmail.com>.

Erick,

We're using Solr 3.3 on Linux (CentOS 5.6).
The /data dir on master is actually 1.2G.

I haven't tried to recreate the index yet. Since it's a production
environment, I guess that I can stop replication and indexing and then
recreate the master index to see if it makes any difference.

Also, I just noticed another thread here named "Simple Slave Replication
Question" which says it could be a problem if I'm seeing a
/data/index directory with a timestamp on the slave node.
Is this info relevant to this issue?
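
One way to check for the timestamped index directories mentioned above is
to list everything under the slave's data directory whose name looks like
an index directory. This is only a minimal sketch: the function name
find_index_dirs and the pattern "index" or "index.<digits>" are assumptions
made for illustration, not something confirmed in this thread.

```python
import os
import re

# Assumed naming: a plain "index" directory, or "index.<digits>" for a
# timestamped copy. Adjust the pattern if your directories look different.
INDEX_DIR = re.compile(r"^index(\.\d+)?$")


def find_index_dirs(data_dir):
    """Return the sorted names of index-like directories under data_dir."""
    names = []
    for entry in os.listdir(data_dir):
        full = os.path.join(data_dir, entry)
        if INDEX_DIR.match(entry) and os.path.isdir(full):
            names.append(entry)
    return sorted(names)


if __name__ == "__main__":
    import sys

    found = find_index_dirs(sys.argv[1])
    print("\n".join(found))
    if len(found) > 1:
        print("More than one index directory found; worth a closer look.")
```

Run it with the path to the slave's data directory as the only argument.
More than one match does not prove anything by itself, but it shows where
to look next.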

Thanks,
Alexandre

On Fri, Mar 23, 2012 at 11:48 AM, Erick Erickson <er...@gmail.com> wrote:

> What version of Solr and what operating system?
>
> But regardless, this shouldn't be happening. Indexes can
> temporarily double in size, but any extras should be
> cleaned up relatively soon.
>
> On the master, what's the total size of the <solr home>/data directory?
> I'm a little suspicious of the <backupAfter> on your master, but I
> don't think that's the root of your problem....
>
> Are you recreating the index on the master (by deleting the
> index directory and starting over)?
>
> This is unusual, and I suspect it's something odd in your configuration,
> but I confess I'm at a loss as to what.
>
> Best
> Erick
>
> On Fri, Mar 23, 2012 at 10:28 AM, Alexandre Rocco <al...@gmail.com>
> wrote:
> > Hello,
> >
> > We have a Solr index that averages 1.19 GB in size.
> > After configuring replication, the index on the slave machine is growing
> > exponentially.
> > Currently we have a slave with 323.44 GB in size.
> > Is there anything that could cause this behavior?
> > The current replication config is below.
> >
> > Master:
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> > <lst name="master">
> > <str name="replicateAfter">commit</str>
> > <str name="replicateAfter">startup</str>
> > <str name="backupAfter">startup</str>
> > <str name="confFiles">
> > elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt
> > </str>
> > </lst>
> > </requestHandler>
> >
> > Slave:
> > <requestHandler name="/replication" class="solr.ReplicationHandler">
> > <lst name="slave">
> > <str name="masterUrl">http://master:8984/solr/Index/replication</str>
> > </lst>
> > </requestHandler>
> >
> > Any pointers will be useful.
> >
> > Thanks,
> > Alexandre
>

Re: Slave index size growing fast

Posted by Erick Erickson <er...@gmail.com>.

What version of Solr and what operating system?

But regardless, this shouldn't be happening. Indexes can
temporarily double in size, but any extras should be
cleaned up relatively soon.

On the master, what's the total size of the <solr home>/data directory?
I'm a little suspicious of the <backupAfter> on your master, but I
don't think that's the root of your problem....
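
To answer the size question on either machine, something like the
following could be used. It is a minimal sketch that adds up file sizes
under a data directory and breaks the total down per immediate child, so
you can see whether the space sits in the index itself or in something
else, such as backups. The function names dir_size_bytes and size_by_child
are made up for this example.

```python
import os


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path (symlinks skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # A live index may delete files while we walk; skip those.
                pass
    return total


def size_by_child(data_dir):
    """Map each immediate child of data_dir to its size in bytes."""
    sizes = {}
    for entry in os.listdir(data_dir):
        full = os.path.join(data_dir, entry)
        if os.path.islink(full):
            continue
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
        elif os.path.isfile(full):
            sizes[entry] = os.path.getsize(full)
    return sizes


if __name__ == "__main__":
    import sys

    for name, size in sorted(size_by_child(sys.argv[1]).items()):
        print(f"{size / 2**30:8.2f} GiB  {name}")
```

Pass the path to the data directory as the only argument and compare the
output from the master with the output from the slave.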

Are you recreating the index on the master (by deleting the
index directory and starting over)?

This is unusual, and I suspect it's something odd in your configuration,
but I confess I'm at a loss as to what.

Best
Erick

On Fri, Mar 23, 2012 at 10:28 AM, Alexandre Rocco <al...@gmail.com> wrote:
> Hello,
>
> We have a Solr index that averages 1.19 GB in size.
> After configuring replication, the index on the slave machine is growing
> exponentially.
> Currently we have a slave with 323.44 GB in size.
> Is there anything that could cause this behavior?
> The current replication config is below.
>
> Master:
> <requestHandler name="/replication" class="solr.ReplicationHandler">
> <lst name="master">
> <str name="replicateAfter">commit</str>
> <str name="replicateAfter">startup</str>
> <str name="backupAfter">startup</str>
> <str name="confFiles">
> elevate.xml,protwords.txt,schema.xml,spellings.txt,stopwords.txt,synonyms.txt
> </str>
> </lst>
> </requestHandler>
>
> Slave:
> <requestHandler name="/replication" class="solr.ReplicationHandler">
> <lst name="slave">
> <str name="masterUrl">http://master:8984/solr/Index/replication</str>
> </lst>
> </requestHandler>
>
> Any pointers will be useful.
>
> Thanks,
> Alexandre