Posted to user@cassandra.apache.org by 祝海通 <zh...@gmail.com> on 2011/12/07 04:57:48 UTC

sstable count=0, why nodetool ring is not 0

hi, all

We are using Cassandra 1.0.2, and I am testing TTL behavior by loading 400 GB
of data. After all the data had expired, I waited several hours; nodetool ring
still reported 90 GB, so I ran a major compaction. After that, nodetool ring
showed 30 GB, but when I checked the file system I found zero SSTables.
Where does the 30 GB come from?

Best Regards
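One thing worth keeping in mind with TTL tests: expired data does not free disk space the moment it expires. An expired column becomes a tombstone, the tombstone is only purgeable after gc_grace_seconds, and the space is only reclaimed when a compaction rewrites the SSTables. A rough sketch of that timeline (assuming the default gc_grace_seconds of 864000, i.e. 10 days; the real value is set per column family):

```python
# Sketch: earliest time TTL'd data can actually be purged from disk.
# Assumes the default gc_grace_seconds of 864000 (10 days); the real
# value comes from the column family definition.

def purgeable_at(write_ts, ttl_seconds, gc_grace_seconds=864000):
    """Earliest time compaction can remove the data for good.

    write_ts + ttl       -> column expires and becomes a tombstone
    ... + gc_grace       -> the tombstone itself becomes purgeable
    Disk space is freed only when a compaction runs after this point.
    """
    expires_at = write_ts + ttl_seconds
    return expires_at + gc_grace_seconds

# Data written at t=0 with a 1-hour TTL and default gc_grace cannot be
# purged until 10 days and 1 hour later.
print(purgeable_at(0, 3600))  # 867600
```

So even after "all the data are expired," SSTables can legitimately hold it for days until gc_grace passes and a compaction runs.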

Re: sstable count=0, why nodetool ring is not 0

Posted by 祝海通 <zh...@gmail.com>.
In the Cassandra wiki I found this: "ColumnFamilyStoreMBean exposes
sstable space used as getLiveDiskSpaceUsed (only includes size of
non-obsolete files) and getTotalDiskSpaceUsed (includes everything)."
Maybe this is the answer.
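Those MBean attributes correspond to the "Space used (live)" and "Space used (total)" lines that nodetool cfstats prints; the gap between them is obsolete SSTable files that have been compacted away but not yet deleted from disk. A small sketch of pulling the two values out of cfstats output (the sample text below is illustrative, not captured from a real node):

```python
import re

# Sketch: extract "Space used (live)" and "Space used (total)" from
# `nodetool cfstats` output. SAMPLE is illustrative, not real output.
SAMPLE = """\
Column Family: mycf
    Space used (live): 32212254720
    Space used (total): 96636764160
"""

def space_used(cfstats_text):
    live = int(re.search(r"Space used \(live\): (\d+)", cfstats_text).group(1))
    total = int(re.search(r"Space used \(total\): (\d+)", cfstats_text).group(1))
    return live, total

live, total = space_used(SAMPLE)
# total - live: obsolete SSTable files not yet removed from disk.
print(total - live)
```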



On Wed, Dec 7, 2011 at 11:57 AM, 祝海通 <zh...@gmail.com> wrote:

> hi, all
>
> We are using Cassandra 1.0.2, and I am testing TTL behavior by loading
> 400 GB of data. After all the data had expired, I waited several hours;
> nodetool ring still reported 90 GB, so I ran a major compaction. After
> that, nodetool ring showed 30 GB, but when I checked the file system I
> found zero SSTables. Where does the 30 GB come from?
>
> Best Regards
>

Re: sstable count=0, why nodetool ring is not 0

Posted by 祝海通 <zh...@gmail.com>.
I counted the disk usage and found that, from nodetool info, the space used
(live) roughly matches what nodetool ring reports, while the space used
(total) roughly matches the actual disk usage.

thx


On Thu, Dec 8, 2011 at 9:53 AM, 祝海通 <zh...@gmail.com> wrote:

>
> We are testing the performance of Cassandra for Big Data, and now I also
> have this problem: from nodetool cfstats, the space used (live) is 7 times
> the space used (total). Why?
>
> thx
>
>
>
> On Wed, Dec 7, 2011 at 5:58 PM, Dotan N. <di...@gmail.com> wrote:
>
>> Hi,
>> What kind of process did you use for loading 400GB of data?
>>
>> Thanks
>> --
>> Dotan, @jondot <http://twitter.com/jondot>
>>
>>
>>
>> On Wed, Dec 7, 2011 at 5:57 AM, 祝海通 <zh...@gmail.com> wrote:
>>
>>> hi, all
>>>
>>> We are using Cassandra 1.0.2, and I am testing TTL behavior by loading
>>> 400 GB of data. After all the data had expired, I waited several hours;
>>> nodetool ring still reported 90 GB, so I ran a major compaction. After
>>> that, nodetool ring showed 30 GB, but when I checked the file system I
>>> found zero SSTables. Where does the 30 GB come from?
>>>
>>> Best Regards
>>>
>>
>>
>

Re: Cassandra behavior too fragile?

Posted by Peter Schuller <pe...@infidyne.com>.
> Thing is, why is it so easy for the repair process to break? OK, I admit I'm
> not sure why nodes are reported as "dead" once in a while, but it's
> absolutely certain that they don't simply fall off the edge or get knocked
> out for 10 minutes or anything like that. Why is there no built-in
> tolerance/retry mechanism, so that a node that seems silent for a minute
> can be contacted later, or, better yet, a different node with a relevant
> replica is contacted?
>
> As was evident from some presentations at Cassandra-NYC yesterday, failed
> compactions and repairs are a major problem for a number of users. The
> cluster can quickly become unusable. I think it would be a good idea to
> build more robustness into these procedures.

I am trying to argue for removing the failure-detector-kills-repair
behavior in https://issues.apache.org/jira/browse/CASSANDRA-3569, but I
don't know whether that will happen, since there is opposition.

However, that only fixes the particular issue you are having right now.
There are significant problems with repair, and the answer to why there
is no retry is probably that it takes a non-trivial amount of work to
make the current repair process fault-tolerant in the face of TCP
connections dying.

Personally, my pet ticket for fixing repair once and for all is
https://issues.apache.org/jira/browse/CASSANDRA-2699, which should, at
least as I envisioned it, fix a lot of problems, including making repair
much more robust to transient failures (it would just be robust
automatically, without specific code to deal with failures, because
repair work would happen piecemeal and incrementally, in a repeating
fashion, anyway). Nodes could basically be going up and down in any wild
haywire mode and things would just automatically continue to work in the
background. Repair would become irrelevant to cluster maintenance, and
you wouldn't really have to think about whether or not someone is
repairing. You would also not have to think about repair vs. gc grace
time, because it would all just sit there and work without intervention.
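The "piecemeal and incremental" idea can be illustrated conceptually: instead of one monolithic repair session, the node continuously walks small token subranges, so a transient failure only costs one small unit of work, which is simply retried later. A conceptual sketch (this is NOT the Cassandra implementation; `repair_subrange` is a hypothetical placeholder for the per-range repair work):

```python
# Conceptual sketch of incremental, piecemeal repair as described above.
# Not Cassandra code; repair_subrange() is a hypothetical placeholder.

def split_range(start, end, pieces):
    """Split a numeric token range into equal subranges."""
    step = (end - start) // pieces
    bounds = [start + i * step for i in range(pieces)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))

def incremental_repair(start, end, pieces, repair_subrange):
    """Repair small subranges one at a time, re-queueing failures.

    A transient node failure costs only the current subrange, which is
    retried later, instead of aborting the whole repair session.
    """
    queue = split_range(start, end, pieces)
    repaired = []
    while queue:
        sub = queue.pop(0)
        try:
            repair_subrange(sub)
            repaired.append(sub)
        except ConnectionError:
            queue.append(sub)  # neighbor flapped; retry this piece later
    return repaired

print(split_range(0, 1000, 4))  # [(0, 250), (250, 500), (500, 750), (750, 1000)]
```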

It's a pretty big ticket though and not something I'm gonna be working
on in my spare time, so I don't know whether or when I would actually
work on that ticket (depends on priorities). I have the ideas but I
can't promise to fix it :)

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Cassandra behavior too fragile?

Posted by Maxim Potekhin <po...@bnl.gov>.
OK, thanks to the excellent help of the Datastax folks, some of the more
severe inconsistencies in my Cassandra cluster were fixed (after a node
was down, compactions failed, etc.).

I'm still having the problems reported in the "repairs 0.8.6" thread.

Thing is, why is it so easy for the repair process to break? OK, I admit
I'm not sure why nodes are reported as "dead" once in a while, but it's
absolutely certain that they don't simply fall off the edge or get
knocked out for 10 minutes or anything like that. Why is there no
built-in tolerance/retry mechanism, so that a node that seems silent for
a minute can be contacted later, or, better yet, a different node with a
relevant replica is contacted?

As was evident from some presentations at Cassandra-NYC yesterday,
failed compactions and repairs are a major problem for a number of
users. The cluster can quickly become unusable. I think it would be a
good idea to build more robustness into these procedures.

Regards

Maxim
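The tolerance/retry mechanism asked for here amounts to backing off and retrying a briefly silent node instead of aborting at the first failure-detector signal. A generic sketch of that idea (not Cassandra code; `contact` is a hypothetical callable standing in for a network operation):

```python
import time

# Generic retry-with-backoff sketch: a briefly silent node is retried
# rather than failing the whole operation. Not Cassandra code; `contact`
# is a hypothetical callable.

def contact_with_retries(contact, retries=5, base_delay=1.0, sleep=time.sleep):
    """Try contact() up to `retries` times with exponential backoff.

    Only after every attempt fails is the node treated as dead.
    """
    delay = base_delay
    for attempt in range(retries):
        try:
            return contact()
        except ConnectionError:
            if attempt == retries - 1:
                raise  # all attempts exhausted; give up for real
            sleep(delay)
            delay *= 2  # back off: 1s, 2s, 4s, ...
```

The `sleep` parameter is injectable so the backoff schedule can be tested without actually waiting.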


Re: sstable count=0, why nodetool ring is not 0

Posted by 祝海通 <zh...@gmail.com>.
We are testing the performance of Cassandra for Big Data, and now I also
have this problem: from nodetool cfstats, the space used (live) is 7 times
the space used (total). Why?

thx


On Wed, Dec 7, 2011 at 5:58 PM, Dotan N. <di...@gmail.com> wrote:

> Hi,
> What kind of process did you use for loading 400GB of data?
>
> Thanks
> --
> Dotan, @jondot <http://twitter.com/jondot>
>
>
>
> On Wed, Dec 7, 2011 at 5:57 AM, 祝海通 <zh...@gmail.com> wrote:
>
>> hi, all
>>
>> We are using Cassandra 1.0.2, and I am testing TTL behavior by loading
>> 400 GB of data. After all the data had expired, I waited several hours;
>> nodetool ring still reported 90 GB, so I ran a major compaction. After
>> that, nodetool ring showed 30 GB, but when I checked the file system I
>> found zero SSTables. Where does the 30 GB come from?
>>
>> Best Regards
>>
>
>

Re: sstable count=0, why nodetool ring is not 0

Posted by "Dotan N." <di...@gmail.com>.
Hi,
What kind of process did you use for loading 400GB of data?

Thanks
--
Dotan, @jondot <http://twitter.com/jondot>



On Wed, Dec 7, 2011 at 5:57 AM, 祝海通 <zh...@gmail.com> wrote:

> hi, all
>
> We are using Cassandra 1.0.2, and I am testing TTL behavior by loading
> 400 GB of data. After all the data had expired, I waited several hours;
> nodetool ring still reported 90 GB, so I ran a major compaction. After
> that, nodetool ring showed 30 GB, but when I checked the file system I
> found zero SSTables. Where does the 30 GB come from?
>
> Best Regards
>