Posted to solr-user@lucene.apache.org by Oliver Schrenk <ol...@gmail.com> on 2014/02/27 11:42:08 UTC

SolrCloud 4.7: Overseer tries to delete a non-existing collection, throws exception and loops

Hi,

I upgraded a small cluster from 4.3.1 to 4.7 in SolrCloud mode.

I deleted the old data and replaced the solr.xml with the example solr.xml that uses auto-discovery, but it seems there is still some old data somewhere, probably in ZooKeeper, that keeps my machine from starting.

It repeats the same log messages over and over:

	2014-02-27 11:00:49,011 INFO o.a.s.c.c.ZkStateReader [Thread-15] Updating cloud state from ZooKeeper...
	2014-02-27 11:00:49,012 INFO o.a.s.c.Overseer$ClusterStateUpdater [Thread-15] Removing collection: collection1 shard: 2 from clusterstate
	2014-02-27 11:00:49,012 ERROR o.a.s.c.Overseer$ClusterStateUpdater [Thread-15] Exception in Overseer main queue loop
	org.apache.solr.common.SolrException: Could not find collection:collection1
		at org.apache.solr.common.cloud.ClusterState.getCollection(ClusterState.java:175)
		at org.apache.solr.cloud.Overseer$ClusterStateUpdater.removeShard(Overseer.java:801)
		at org.apache.solr.cloud.Overseer$ClusterStateUpdater.processMessage(Overseer.java:230)
		at org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:187)
		at java.lang.Thread.run(Thread.java:722)

How can I delete that false information?

Regards
Oliver
	

Re: SolrCloud 4.7: Overseer tries to delete a non-existing collection, throws exception and loops

Posted by Yago Riveiro <ya...@gmail.com>.
I recommend attaching your log to the issue and describing the steps that led to this error. Your logs may contain valuable information.


--  
Yago Riveiro


Re: SolrCloud 4.7: Overseer tries to delete a non-existing collection, throws exception and loops

Posted by Yago Riveiro <ya...@gmail.com>.
I remember that I also had to empty my queue before restarting the cluster.

This bug is a little scary: if you have a scheduled job that deletes collections on the fly, your cluster can blow up and you won't know why.

--  
Yago Riveiro



Re: SolrCloud 4.7: Overseer tries to delete a non-existing collection, throws exception and loops

Posted by Oliver Schrenk <ol...@gmail.com>.
Interesting. I deleted all old collections, configs, and clusterstate.json from ZooKeeper and I still had that problem.

I’m quite new to ZooKeeper, so some of what I say might be wrong. It seems there were some outstanding changes in ZooKeeper; at least, I found some items in a queue node. I deleted all of them and was able to start my cluster.
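
For anyone who wants to script that cleanup, here is a minimal Python sketch of listing and deleting the items under a queue node. It is not taken from this thread: the path /overseer/queue is an assumption about where the Overseer keeps its pending state updates, so check it against your own ZooKeeper tree (and any chroot) first. The `zk` argument is any client object exposing `get_children(path)` and `delete(path)`, and the function names are made up for illustration.

```python
"""Sketch: inspect and clear pending items under a ZooKeeper queue node."""

# Assumed default, not confirmed by the thread: the Overseer's state-update
# queue. Verify the path in your own ZooKeeper tree before deleting anything.
DEFAULT_QUEUE_PATH = "/overseer/queue"


def list_queue_items(zk, queue_path=DEFAULT_QUEUE_PATH):
    """Return the child node names under queue_path, sorted by name."""
    return sorted(zk.get_children(queue_path))


def clear_queue(zk, queue_path=DEFAULT_QUEUE_PATH, dry_run=True):
    """Delete every child of queue_path and return the full paths affected.

    With dry_run=True (the default) nothing is deleted; the paths that
    would be removed are only reported.
    """
    names = list_queue_items(zk, queue_path)
    paths = [f"{queue_path}/{name}" for name in names]
    if not dry_run:
        for path in paths:
            zk.delete(path)
    return paths
```

A started `KazooClient` from the kazoo library should fit as `zk`, since it provides both methods. Running with the default `dry_run=True` first shows what would be removed; doing the real deletion while the Solr nodes are stopped is probably the safer order, though that is a guess rather than something tested here.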




Re: SolrCloud 4.7: Overseer tries to delete a non-existing collection, throws exception and loops

Posted by Yago Riveiro <ya...@gmail.com>.
I had some problems with the DELETE action too.

I reported this some time ago: https://issues.apache.org/jira/browse/SOLR-5559

The Overseer failed to delete a collection and the Solr cluster became unstable. I restarted my boxes and my cluster never came back online.

After some debugging, I found a shard folder with the name of the collection that I had deleted previously. I deleted it and the cluster came online again.

You will need to manually remove all entries for the deleted collection from clusterstate.json (if any exist).
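
As a rough illustration of that manual edit, the Python sketch below drops one collection from the text of a clusterstate.json. It assumes the Solr 4.x layout in which the file is a single JSON object keyed by collection name; that layout and the function name `remove_collection` are assumptions, not something shown in this thread. Fetching the file from ZooKeeper and writing it back are left out.

```python
import json


def remove_collection(clusterstate_json, collection):
    """Return (new_json_text, removed) with `collection` dropped.

    Assumes the top level of clusterstate.json is an object whose keys
    are collection names. `removed` is False if the name was not present.
    """
    state = json.loads(clusterstate_json)
    removed = collection in state
    if removed:
        del state[collection]
    return json.dumps(state, indent=2, sort_keys=True), removed
```

Keeping a copy of the original file before uploading the edited version makes it easy to roll back if the assumed layout does not match your cluster.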


-- 
Yago Riveiro