Posted to solr-user@lucene.apache.org by Prasi S <pr...@gmail.com> on 2013/10/22 09:25:24 UTC

SolrCloud frequently hanging

Hi all,
We are using SolrCloud 4.4 (with an external ZooKeeper and 2 Tomcats, one
Solr instance in each Tomcat) for indexing delimited files. Our index holds
220 million records. We have three different files, each with a partial set
of data.

We index the first file completely. The second and third files are then
applied as partial updates.

1. While testing indexing performance, we notice that Solr hangs
frequently after 2 days. It hangs for an hour or two, and then, if we hit
the admin URL, it comes back and resumes indexing. Why does this happen?

We have noticed that in the last 12 hours the hanging was very frequent:
for almost 6 of those hours Solr was in a hung state.

2. Also, commit time increases for the partial updates.


Do we need to tweak any parameter, or is this the expected behavior of
SolrCloud with a huge volume of data?


Thanks,
Prasi

Re: SolrCloud frequently hanging

Posted by Chris Geeringh <ge...@gmail.com>.
Prasi, as per the ticket I linked to earlier, I was running into problems
related to GC settings. It may be worth investigating; take a look at the
GC settings I'm running with in the ticket.
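
One rough way to check whether garbage collection is involved is to compare cumulative GC time with wall-clock time. The sketch below uses only the standard java.lang.management API. The class name GcSnapshot and its methods are made up for illustration. It reads the counters of the JVM it runs in, so looking at a Solr node would mean reading the same beans over JMX or from that JVM's GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: cumulative GC counters of the current JVM at one point in time. */
final class GcSnapshot {
    final long collections; // total number of collections so far
    final long millis;      // total time spent collecting so far, in milliseconds

    GcSnapshot(long collections, long millis) {
        this.collections = collections;
        this.millis = millis;
    }

    /** Sums the counters of every collector registered in this JVM. */
    static GcSnapshot take() {
        long count = 0;
        long time = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when a collector does not report the value.
            count += Math.max(0, gc.getCollectionCount());
            time += Math.max(0, gc.getCollectionTime());
        }
        return new GcSnapshot(count, time);
    }

    /** Fraction of the elapsed wall-clock time that was spent in GC between two snapshots. */
    static double gcFraction(GcSnapshot before, GcSnapshot after, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            throw new IllegalArgumentException("elapsedMillis must be positive");
        }
        return (after.millis - before.millis) / (double) elapsedMillis;
    }
}
```

A fraction close to 1 over a window in which indexing made no progress would point towards GC pauses. A fraction near 0 would suggest looking elsewhere.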

Cheers,
Chris



Re: SolrCloud frequently hanging

Posted by Prasi S <pr...@gmail.com>.
bq: ...three different files each with a partial set
of data.

We have to index around 170 metadata fields. Around 120 fields are in the
first file, 50 in the second file and 6 in the third file. All three files
have the same unique key. We use SolrJ to push these files to Solr. First,
we index the first file for the 220 million records. Then we take the
second file and do a partial update on the existing 220M. The same is then
repeated for the third file.
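
For illustration, here is a minimal sketch of what one such partial update could look like, assuming "partial update" means Solr's atomic updates (the message does not say so explicitly). In an atomic update the unique key is sent as a plain value and every changed field is wrapped in a modifier map such as {"set": value}. The class PartialUpdates and the field names are hypothetical. With SolrJ, the same modifier map would typically be passed as the field value to SolrInputDocument.addField.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds the shape of an atomic ("set") update document. */
final class PartialUpdates {
    private PartialUpdates() {}

    /**
     * Returns a document holding the unique key as a plain value and every
     * other field wrapped in a {"set": value} modifier.
     */
    static Map<String, Object> setFields(String keyField, String keyValue,
                                         Map<String, ?> newFields) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put(keyField, keyValue);
        for (Map.Entry<String, ?> e : newFields.entrySet()) {
            if (e.getKey().equals(keyField)) {
                throw new IllegalArgumentException("the unique key cannot be updated");
            }
            // "set" replaces the stored value of the field with the new one.
            doc.put(e.getKey(), Collections.singletonMap("set", e.getValue()));
        }
        return doc;
    }
}
```

As far as I know, Solr applies an atomic update by rebuilding the whole document from its stored fields and re-indexing it, so a partial update is not necessarily cheaper than indexing the full document.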

We commit in batches. Each batch consists of 20,000 records. Once 5 such
batches have been sent to Solr, we send a commit to Solr from the code. We
have disabled soft commits. The hard commit setting is as below.

     <autoCommit>
       <maxTime>${solr.autoCommit.maxTime:600000}</maxTime>
       <openSearcher>false</openSearcher>
     </autoCommit>
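
The client-side cadence described above (batches of 20,000 records, an explicit commit after every 5 batches) can be sketched roughly as follows. IndexSink is a hypothetical stand-in for the SolrJ client, not a SolrJ type, and BatchingIndexer is likewise made up for illustration. Note that the autoCommit block above additionally triggers a hard commit every 600000 ms (10 minutes) without opening a new searcher, independently of the explicit commits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an indexing client; not part of SolrJ. */
interface IndexSink<T> {
    void add(List<T> batch); // send one batch of documents
    void commit();           // issue an explicit commit
}

/** Buffers records into fixed-size batches and commits after every N batches. */
final class BatchingIndexer<T> {
    private final IndexSink<T> sink;
    private final int batchSize;        // 20,000 in the message above
    private final int batchesPerCommit; // 5 in the message above
    private final List<T> buffer = new ArrayList<>();
    private int batchesSinceCommit = 0;

    BatchingIndexer(IndexSink<T> sink, int batchSize, int batchesPerCommit) {
        if (batchSize <= 0 || batchesPerCommit <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.sink = sink;
        this.batchSize = batchSize;
        this.batchesPerCommit = batchesPerCommit;
    }

    /** Buffers one record and sends the batch once it is full. */
    void add(T record) {
        buffer.add(record);
        if (buffer.size() == batchSize) {
            flushBatch();
        }
    }

    /** Sends any partial batch and commits whatever is still uncommitted. */
    void finish() {
        if (!buffer.isEmpty()) {
            flushBatch();
        }
        if (batchesSinceCommit > 0) {
            sink.commit();
            batchesSinceCommit = 0;
        }
    }

    private void flushBatch() {
        sink.add(new ArrayList<>(buffer)); // copy, so the sink may keep the list
        buffer.clear();
        batchesSinceCommit++;
        if (batchesSinceCommit == batchesPerCommit) {
            sink.commit();
            batchesSinceCommit = 0;
        }
    }
}
```

With the values from the message this would be new BatchingIndexer<>(sink, 20000, 5), which issues an explicit commit once per 100,000 records.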


Thanks,
Prasi



Re: SolrCloud frequently hanging

Posted by Erick Erickson <er...@gmail.com>.
This is not a lot of data really.

bq: ...three different files each with a partial set
of data.

OK, what does this mean? Are you importing as CSV files or
something? Are you trying to commit tens of millions of documents at once?

This shouldn't be merging, since you're on 4.4, unless you're committing
far too frequently.

What are your commit settings? Both soft and hard? How are you
committing?

In short, there's not a lot of information to go on here; you need to
provide a number of details.

Best,
Erick



Re: SolrCloud frequently hanging

Posted by Chris Geeringh <ge...@gmail.com>.
Hi Prasi,

I have the same issue: I'm trying to import data, and after some amount of
time the cloud stops accepting updates. I can confirm that after hitting
the admin URL indexing appears to start again, although that doesn't seem
to last very long before it hangs again. I have a Jira ticket open; please
add any details to it: https://issues.apache.org/jira/browse/SOLR-5364
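
To put numbers on the stalls when adding details to the ticket, the indexing client could record how long it goes without an acknowledged batch. This is a minimal sketch in plain Java. The class StallTracker and its threshold are hypothetical. Timestamps are passed in explicitly (for example from System.currentTimeMillis()) so that the logic is easy to test.

```java
/** Hypothetical client-side tracker for periods without indexing progress. */
final class StallTracker {
    private final long thresholdMillis; // a gap at least this long counts as a stall
    private long lastProgressMillis;
    private long totalStalledMillis = 0;
    private int stallCount = 0;

    StallTracker(long thresholdMillis, long startMillis) {
        if (thresholdMillis <= 0) {
            throw new IllegalArgumentException("thresholdMillis must be positive");
        }
        this.thresholdMillis = thresholdMillis;
        this.lastProgressMillis = startMillis;
    }

    /** Call whenever a batch has been acknowledged by the server. */
    void recordProgress(long nowMillis) {
        long gap = nowMillis - lastProgressMillis;
        if (gap >= thresholdMillis) {
            // The gap that just ended was long enough to count as a stall.
            stallCount++;
            totalStalledMillis += gap;
        }
        lastProgressMillis = nowMillis;
    }

    /** True if no progress has been recorded for at least the threshold. */
    boolean isStalled(long nowMillis) {
        return nowMillis - lastProgressMillis >= thresholdMillis;
    }

    int stallCount() {
        return stallCount;
    }

    long totalStalledMillis() {
        return totalStalledMillis;
    }
}
```

Logging stallCount() and totalStalledMillis() alongside the time of each stall makes it possible to line the stalls up with GC logs or thread dumps taken on the Solr nodes.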

Cheers,
Chris

