Posted to user@cassandra.apache.org by Jason Kania <ja...@ymail.com> on 2016/06/08 01:31:27 UTC

Nodetool repair inconsistencies

I am running a 3-node cluster of Cassandra 3.0.6 instances and encountered an error when running nodetool compact. I then ran nodetool repair, which returned no errors.
I then ran nodetool compact again and received the same error, so the repair had corrected nothing and yet reported no errors.
After that, I moved the problematic files out of the directory, restarted Cassandra, and ran the repair again. It again completed without errors, but no files were added to the directory that had contained the corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around, and numerous directories have vastly different amounts of content across the 3 nodes. There are 3 replicas, so I would expect to find similar amounts of content in the same data directory on each node.

Is there any way to dig deeper into this? I don't want to be caught out because replication/repair is silently failing. I have noticed that there is always a "some repair failed" message in the repair output, but it is completely unhelpful and has always been present.

Thanks,
Jason
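
The directory-size comparison described in this message can be made repeatable with a small script. The following is only a minimal sketch: the function names, the 2x threshold, and the "keyspace/table" keys are hypothetical, and the per-node sizes are assumed to have been collected separately on each node. On-disk size is also only a rough signal, since each node compacts independently and replicas holding the same data can legitimately differ in size.

```python
import os
from typing import Dict, List, Tuple


def directory_size(path: str) -> int:
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def flag_divergent_tables(
    sizes: Dict[str, Dict[str, int]], max_ratio: float = 2.0
) -> List[Tuple[str, int, int]]:
    """Flag table directories whose sizes differ widely across nodes.

    sizes maps node name -> {table directory: bytes on disk}.
    A directory missing on a node counts as 0 bytes.
    Returns (table, smallest, largest) for each table where the largest
    copy is more than max_ratio times the smallest, or where at least
    one node holds nothing while another holds data.
    """
    tables = sorted({t for per_node in sizes.values() for t in per_node})
    flagged = []
    for table in tables:
        per_node = [node_sizes.get(table, 0) for node_sizes in sizes.values()]
        smallest, largest = min(per_node), max(per_node)
        if largest == 0:
            continue  # empty everywhere, nothing to compare
        if smallest == 0 or largest / smallest > max_ratio:
            flagged.append((table, smallest, largest))
    return flagged


if __name__ == "__main__":
    # Hypothetical sizes, e.g. gathered by running directory_size() on each node.
    example = {
        "node1": {"ks/t1": 1000, "ks/t2": 500},
        "node2": {"ks/t1": 1100, "ks/t2": 0},
        "node3": {"ks/t1": 900},
    }
    for table, smallest, largest in flag_divergent_tables(example):
        print(f"{table}: smallest={smallest} largest={largest}")
```

A flagged directory is a prompt for further investigation on that table, not proof that the replicas are inconsistent.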

Re: Nodetool repair inconsistencies

Posted by Jason Kania <ja...@ymail.com>.
Hi Paul,
I have tried running 'nodetool compact' and the situation remains the same after I deleted the files that caused 'nodetool compact' to generate an exception in the first place.
My concern is that if I delete some sstable sets from a directory, or even completely eliminate the sstables in a directory on one machine, and then run 'nodetool repair' followed by 'nodetool compact', that directory remains empty. My understanding has been that these equivalently named directories should contain roughly the same amount of content.
Thanks,
Jason

From: Paul Fife <pa...@gmail.com>
To: user@cassandra.apache.org; Jason Kania <ja...@ymail.com>
Sent: Wednesday, June 8, 2016 12:55 PM
Subject: Re: Nodetool repair inconsistencies

Hi Jason -
Did you run a major compaction after the repair completed? Do you have other reasons besides the number/size of sstables to believe all nodes don't have a copy of the current data at the end of the repair operation?
Thanks,
Paul

Re: Nodetool repair inconsistencies

Posted by Paul Fife <pa...@gmail.com>.
Hi Jason -

Did you run a major compaction after the repair completed? Do you have
other reasons besides the number/size of sstables to believe all nodes
don't have a copy of the current data at the end of the repair operation?

Thanks,
Paul

On Wed, Jun 8, 2016 at 8:12 AM, Jason Kania <ja...@ymail.com> wrote:

> Hi Romain,
>
> The problem is that there is no error to share. I am focusing on the
> inconsistency that when I run nodetool repair, I get no errors, and yet the
> content in the same directory on the different nodes is vastly different.
> This lack of an error is the nature of my question, not the nodetool compact
> error.
>
> Thanks,
>
> Jason

Re: Nodetool repair inconsistencies

Posted by Jason Kania <ja...@ymail.com>.
Hi Romain,
The problem is that there is no error to share. I am focusing on the inconsistency that when I run nodetool repair, I get no errors, and yet the content in the same directory on the different nodes is vastly different. This lack of an error is the nature of my question, not the nodetool compact error.
Thanks,
Jason
From: Romain Hardouin <ro...@yahoo.fr>
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>; Jason Kania <ja...@ymail.com>
Sent: Wednesday, June 8, 2016 8:30 AM
Subject: Re: Nodetool repair inconsistencies

Hi Jason,
It's difficult for the community to help you if you don't share the error ;-)
What did the logs say when you ran a major compaction (i.e. the first error you encountered)?
Best,
Romain


Re: Nodetool repair inconsistencies

Posted by Romain Hardouin <ro...@yahoo.fr>.
Hi Jason,
It's difficult for the community to help you if you don't share the error ;-)
What did the logs say when you ran a major compaction (i.e. the first error you encountered)?
Best,
Romain

On Wednesday, June 8, 2016 at 3:34 AM, Jason Kania <ja...@ymail.com> wrote:

I am running a 3-node cluster of Cassandra 3.0.6 instances and encountered an error when running nodetool compact. I then ran nodetool repair, which returned no errors.
I then ran nodetool compact again and received the same error, so the repair had corrected nothing and yet reported no errors.
After that, I moved the problematic files out of the directory, restarted Cassandra, and ran the repair again. It again completed without errors, but no files were added to the directory that had contained the corrupt files. So nodetool repair does not seem to be making actual repairs.
I started looking around, and numerous directories have vastly different amounts of content across the 3 nodes. There are 3 replicas, so I would expect to find similar amounts of content in the same data directory on each node.

Is there any way to dig deeper into this? I don't want to be caught out because replication/repair is silently failing. I have noticed that there is always a "some repair failed" message in the repair output, but it is completely unhelpful and has always been present.

Thanks,
Jason