Posted to dev@cassandra.apache.org by Juan Manuel Formoso <jf...@gmail.com> on 2013/09/17 21:13:28 UTC

I don't understand shuffle progress

I am running shuffle on a cluster after upgrading to 1.2.X, and I don't
understand how to check progress.

I'm counting the lines of cassandra-shuffle ls, and it decreases VERY
slowly. Sometimes not at all after 24 hours of processing.

Is that value accurate? Does the shuffle operation support
disabling/re-enabling (or restarting the cluster) and resuming from the
last position? Or does it start over?
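
One rough way to follow progress is to sample the line count at intervals and
derive a rate from it. The following is a minimal sketch, not an official
tool: it assumes `cassandra-shuffle ls` is on the PATH and that every
non-empty output line is one pending relocation (if the command prints header
lines, the count is too high by that many). The function names are
hypothetical.

```python
import subprocess
import time


def count_pending(ls_output):
    """Count non-empty lines in the text printed by `cassandra-shuffle ls`."""
    return sum(1 for line in ls_output.splitlines() if line.strip())


def estimate_hours_left(samples):
    """Estimate hours remaining from (unix_seconds, pending_count) samples.

    Compares the first and last sample. Returns None when there are fewer
    than two samples or the count has not decreased, because no rate can
    be derived in that case.
    """
    if len(samples) < 2:
        return None
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    done = c0 - c1
    elapsed = t1 - t0
    if done <= 0 or elapsed <= 0:
        return None
    # remaining / (done / elapsed), converted from seconds to hours
    return c1 * elapsed / done / 3600.0


def poll(interval_seconds=3600):
    """Sample the queue forever, printing the count and a rough estimate."""
    samples = []
    while True:
        result = subprocess.run(
            ["cassandra-shuffle", "ls"],
            capture_output=True, text=True, check=True,
        )
        samples.append((time.time(), count_pending(result.stdout)))
        print(samples[-1][1], "pending; hours left:",
              estimate_hours_left(samples))
        time.sleep(interval_seconds)
```

Calling `poll()` from a terminal multiplexer session and leaving it running
gives a history of counts; a `None` estimate means the count did not move
between the first and the latest sample.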

-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

Will the new cluster be evenly balanced? Remember that the old one was
pre-1.2.X, so it had no vnodes.

I haven't used that tool, will look it up.

Thanks for the suggestion!

On Tuesday, September 17, 2013, David McNelis wrote:

> sstableloader is the way to go to load up the new cluster.
>
> On Tuesday, September 17, 2013, Juan Manuel Formoso wrote:
>
>> > If your shuffle succeeds, you will be the first reported case of
>> shuffle succeeding on a non-test cluster.
>>
>> Awesome! :O
>>
>> I'll try to migrate to a new cluster then.
>>
>> Any better alternatives than creating a small application that reads from
>> one cluster and inserts in the new one that anybody can suggest?
>>
>> On Tuesday, September 17, 2013, Robert Coli wrote:
>>
>>> On Tue, Sep 17, 2013 at 12:13 PM, Juan Manuel Formoso <
>>> jformoso@gmail.com>wrote:
>>>
>>> > I am running shuffle on a cluster after upgrading to 1.2.X, and I don't
>>> > understand how to check progress.
>>> >
>>>
>>> If your shuffle succeeds, you will be the first reported case of shuffle
>>> succeeding on a non-test cluster. Until I hear a report of someone having
>>> real world success, I recommend against using shuffle.
>>>
>>> If you want to enable vnodes on a cluster with existing data, IMO you
>>> should fork writes and bulk load a replacement cluster.
>>>
>>>
>>> > I'm counting the lines of cassandra-shuffle ls, and it decreases VERY
>>> > slowly. Sometimes not at all after 24 hours of processing.
>>> >
>>>
>>> I have heard reports of shuffle taking an insanely long amount of time,
>>> such as this, as well.
>>>
>>>
>>> > Is that value accurate?
>>> >
>>>
>>> Probably.
>>>
>>>
>>> > Does the shuffle operation supports disabling/re-enabling (or
>>> restarting
>>> > the cluster) and resuming from the last position? Or does it start
>>> over?
>>> >
>>>
>>> Yes, via the arguments "enable" and "disable". "clear" is what you use if
>>> you want to clear the queue and start over.
>>>
>>> Note that once you have started shuffle, you don't want to add/remove a
>>> node until the shuffle is complete.
>>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-5525
>>>
>>> =Rob
>>>
>>
>>
>> --
>> *Juan Manuel Formoso
>> *Senior Geek
>> http://twitter.com/juanformoso
>> http://seniorgeek.com.ar
>> LLAP
>>
>

-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by David McNelis <dm...@gmail.com>.

sstableloader is the way to go to load up the new cluster.

On Tuesday, September 17, 2013, Juan Manuel Formoso wrote:

> > If your shuffle succeeds, you will be the first reported case of
> shuffle succeeding on a non-test cluster.
>
> Awesome! :O
>
> I'll try to migrate to a new cluster then.
>
> Any better alternatives than creating a small application that reads from
> one cluster and inserts in the new one that anybody can suggest?
>
> On Tuesday, September 17, 2013, Robert Coli wrote:
>
>> On Tue, Sep 17, 2013 at 12:13 PM, Juan Manuel Formoso <jformoso@gmail.com
>> >wrote:
>>
>> > I am running shuffle on a cluster after upgrading to 1.2.X, and I don't
>> > understand how to check progress.
>> >
>>
>> If your shuffle succeeds, you will be the first reported case of shuffle
>> succeeding on a non-test cluster. Until I hear a report of someone having
>> real world success, I recommend against using shuffle.
>>
>> If you want to enable vnodes on a cluster with existing data, IMO you
>> should fork writes and bulk load a replacement cluster.
>>
>>
>> > I'm counting the lines of cassandra-shuffle ls, and it decreases VERY
>> > slowly. Sometimes not at all after 24 hours of processing.
>> >
>>
>> I have heard reports of shuffle taking an insanely long amount of time,
>> such as this, as well.
>>
>>
>> > Is that value accurate?
>> >
>>
>> Probably.
>>
>>
>> > Does the shuffle operation supports disabling/re-enabling (or restarting
>> > the cluster) and resuming from the last position? Or does it start over?
>> >
>>
>> Yes, via the arguments "enable" and "disable". "clear" is what you use if
>> you want to clear the queue and start over.
>>
>> Note that once you have started shuffle, you don't want to add/remove a
>> node until the shuffle is complete.
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-5525
>>
>> =Rob
>>
>
>
> --
> *Juan Manuel Formoso
> *Senior Geek
> http://twitter.com/juanformoso
> http://seniorgeek.com.ar
> LLAP
>

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

I have been trying to make it work non-stop since Friday afternoon. I
officially gave up today and I'm going to go the sstableloader route.

I wrote a little of what I tried here:
http://seniorgeek.com.ar/blog/2013/09/16/tips-for-running-cassandra-shuffle/
(I have yet to update it with the fact that I had to give up)

I would strongly recommend you don't use shuffle unless you have very
little data to move around.


On Tue, Sep 17, 2013 at 10:41 PM, Paulo Motta <pa...@gmail.com>wrote:

> That is very disappointing to hear. Vnodes support is one of the main
> reasons we're upgrading from 1.1.X to 1.2.X.
>
> So you're saying the only feasible way of enabling VNodes on an upgraded C*
> 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
> from the old cluster? Or is it possible to succeed on shuffling, even if
> that means waiting some weeks for the shuffle to complete?
>
>
> 2013/9/17 Robert Coli <rc...@eventbrite.com>
>
> > On Tue, Sep 17, 2013 at 4:00 PM, Juan Manuel Formoso <jformoso@gmail.com
> > >wrote:
> >
> > > Any better alternatives than creating a small application that reads
> from
> > > one cluster and inserts in the new one that anybody can suggest?
> > >
> > >
> > http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
> >
> > In theory if you wanted to do the "copy-the-files" method while enabling
> > vnodes on the target cluster, you could :
> >
> > 1) create new target cluster with vnodes enabled
> > 2) fork writes so they go to both source and target cluster
> > 3) copy 100% of sstables from all source nodes to all target nodes (being
> > sure to ensure non-collision of sstables of names, probably by adding a
> few
> > hundreds/thousands to the sequence of various nodes in a predictable
> > fashion)
> > 4) be certain that you did not accidentally resurrect data from purged
> > source sstables in 3)
> > 5) run cleanup compaction on all nodes in target cluster
> > 6) turn off writes to old source cluster
> >
> > =Rob
> > * notes that this process would make a good blog post.. :D
> >
>
>
>
> --
> Paulo Ricardo
>
> --
> European Master in Distributed Computing***
> Royal Institute of Technology - KTH
> *
> *Instituto Superior Técnico - IST*
> *http://paulormg.com*
>



-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by David McNelis <dm...@gmail.com>.

As Rob mentioned, no one (myself included) has successfully used shuffle in
the wild (that I've heard of).

Shuffle is *supposed* to be a transparent background process... and is
designed, in theory, to take a long time to run (weeks is the right way to
think of it).

Be sure to keep an eye on your drive space if you are going to wait it out.
 Unless you have < 1/2 of your drives in use you are going to need to run
cleanup periodically to avoid running out of disk space, because shuffle
NEVER removes data, only makes copies of the data on the new destination
nodes.

I think that is where people tend to see the most failures: because newer
versions of Cassandra can survive OK with more than 1/2 the disk in use,
more and more people are using > 50% of their disks.
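
The disk-space check described above can be sketched as follows. This is an
illustration only: the 50% threshold comes from the rule of thumb in this
message, the data directory path is an assumption about a default install,
and the function names are hypothetical.

```python
import shutil


def used_fraction(path):
    """Return the fraction (0.0 to 1.0) of the filesystem at `path` in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def should_run_cleanup(fraction_used, threshold=0.5):
    """True when disk usage has reached the threshold (default: half full)."""
    return fraction_used >= threshold


# Example (path is an assumption; adjust to your data_file_directories):
# if should_run_cleanup(used_fraction("/var/lib/cassandra/data")):
#     print("Consider running `nodetool cleanup` on this node")
```

Run from cron on each node, this only reports; whether and when to actually
run `nodetool cleanup` is still a judgment call for the operator.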

On Tue, Sep 17, 2013 at 9:41 PM, Paulo Motta <pa...@gmail.com>wrote:

> That is very disappointing to hear. Vnodes support is one of the main
> reasons we're upgrading from 1.1.X to 1.2.X.
>
> So you're saying the only feasible way of enabling VNodes on an upgraded
> C* 1.2 is by doing fork writes to a brand new cluster + bulk load of
> sstables from the old cluster? Or is it possible to succeed on shuffling,
> even if that means waiting some weeks for the shuffle to complete?
>
>
> 2013/9/17 Robert Coli <rc...@eventbrite.com>
>
>> On Tue, Sep 17, 2013 at 4:00 PM, Juan Manuel Formoso <jformoso@gmail.com
>> >wrote:
>>
>> > Any better alternatives than creating a small application that reads
>> from
>> > one cluster and inserts in the new one that anybody can suggest?
>> >
>> >
>> http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
>>
>> In theory if you wanted to do the "copy-the-files" method while enabling
>> vnodes on the target cluster, you could :
>>
>> 1) create new target cluster with vnodes enabled
>> 2) fork writes so they go to both source and target cluster
>> 3) copy 100% of sstables from all source nodes to all target nodes (being
>> sure to ensure non-collision of sstables of names, probably by adding a
>> few
>> hundreds/thousands to the sequence of various nodes in a predictable
>> fashion)
>> 4) be certain that you did not accidentally resurrect data from purged
>> source sstables in 3)
>> 5) run cleanup compaction on all nodes in target cluster
>> 6) turn off writes to old source cluster
>>
>> =Rob
>> * notes that this process would make a good blog post.. :D
>>
>
>
>
> --
> Paulo Ricardo
>
> --
> European Master in Distributed Computing***
> Royal Institute of Technology - KTH
> *
> *Instituto Superior Técnico - IST*
> *http://paulormg.com*
>

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

Awesome, thanks!

A few final questions:
1) Can I change the snitch in the live source cluster? I'm using
SimpleSnitch and would change it to GossipingPropertyFileSnitch (in
preparation for changing the replication strategy when the new cluster is
up and running).
2) Can I have different partitioners on the 2 clusters? I have
RandomPartitioner in the current one and I'd like to use Murmur3Partitioner
on the new one (it will be empty at first). Are partitioners only required
to be the same within a cluster, or also across clusters in different DCs?
3) Will I be able to remove the old DC from the cluster when I finish
rebuilding?

Thanks again!

On Wed, Sep 18, 2013 at 11:41 AM, Chris Burroughs <chris.burroughs@gmail.com
> wrote:

> http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html
>
> This is a basic outline.
>
>
>
> On 09/18/2013 10:32 AM, Juan Manuel Formoso wrote:
>
>> I really like this idea. I can create a new cluster and have it replicate
>> the old one, after it finishes I can remove the original.
>>
>> Any good resource that explains how to add a new datacenter to a live
>> single dc cluster that anybody can recommend?
>>
>>
>> On Wed, Sep 18, 2013 at 9:58 AM, Chris Burroughs
>> <ch...@gmail.com> wrote:
>>
>>> On 09/17/2013 09:41 PM, Paulo Motta wrote:
>>>
>>>> So you're saying the only feasible way of enabling VNodes on an upgraded
>>>> C* 1.2 is by doing fork writes to a brand new cluster + bulk load of
>>>> sstables from the old cluster? Or is it possible to succeed on shuffling,
>>>> even if that means waiting some weeks for the shuffle to complete?
>>>>
>>> In a multi "DC" cluster situation you *should* be able to bring up a new
>>> DC with vnodes, bootstrap it, and then decommission the old cluster.
>>>
>>>
>>
>>
>>
>

-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

Thanks. I did this and I finished rebuilding the new cluster in about 8
hours... much better option than shuffle (you have to have the hardware for
duplicating your environment though)


On Thu, Sep 19, 2013 at 7:21 PM, Jeremiah D Jordan <
jeremiah.jordan@gmail.com> wrote:

>
> http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/configuration/configVnodesProduction_t.html
>
> On Sep 18, 2013, at 9:41 AM, Chris Burroughs <ch...@gmail.com>
> wrote:
>
> >
> http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html
> >
> > This is a basic outline.
> >
> >
> > On 09/18/2013 10:32 AM, Juan Manuel Formoso wrote:
> >> I really like this idea. I can create a new cluster and have it
> replicate
> >> the old one, after it finishes I can remove the original.
> >>
> >> Any good resource that explains how to add a new datacenter to a live
> >> single dc cluster that anybody can recommend?
> >>
> >>
> >> On Wed, Sep 18, 2013 at 9:58 AM, Chris Burroughs
> >> <ch...@gmail.com>wrote:
> >>
> >>> On 09/17/2013 09:41 PM, Paulo Motta wrote:
> >>>
> >>>> So you're saying the only feasible way of enabling VNodes on an
> upgraded
> >>>> C*
> >>>> 1.2 is by doing fork writes to a brand new cluster + bulk load of
> sstables
> >>>> from the old cluster? Or is it possible to succeed on shuffling, even
> if
> >>>> that means waiting some weeks for the shuffle to complete?
> >>>>
> >>>
> >>> In a multi "DC" cluster situation you *should* be able to bring up a
> new
> >>> DC with vnodes, bootstrap it, and then decommission the old cluster.
> >>>
> >>
> >>
> >>
> >
>
>


-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by Jeremiah D Jordan <je...@gmail.com>.

http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/configuration/configVnodesProduction_t.html

On Sep 18, 2013, at 9:41 AM, Chris Burroughs <ch...@gmail.com> wrote:

> http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html
> 
> This is a basic outline.
> 
> 
> On 09/18/2013 10:32 AM, Juan Manuel Formoso wrote:
>> I really like this idea. I can create a new cluster and have it replicate
>> the old one, after it finishes I can remove the original.
>> 
>> Any good resource that explains how to add a new datacenter to a live
>> single dc cluster that anybody can recommend?
>> 
>> 
>> On Wed, Sep 18, 2013 at 9:58 AM, Chris Burroughs
>> <ch...@gmail.com>wrote:
>> 
>>> On 09/17/2013 09:41 PM, Paulo Motta wrote:
>>> 
>>>> So you're saying the only feasible way of enabling VNodes on an upgraded
>>>> C*
>>>> 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
>>>> from the old cluster? Or is it possible to succeed on shuffling, even if
>>>> that means waiting some weeks for the shuffle to complete?
>>>> 
>>> 
>>> In a multi "DC" cluster situation you *should* be able to bring up a new
>>> DC with vnodes, bootstrap it, and then decommission the old cluster.
>>> 
>> 
>> 
>> 
>

Re: I don't understand shuffle progress

Posted by Chris Burroughs <ch...@gmail.com>.

http://www.datastax.com/documentation/cassandra/1.2/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html

This is a basic outline.


On 09/18/2013 10:32 AM, Juan Manuel Formoso wrote:
> I really like this idea. I can create a new cluster and have it replicate
> the old one, after it finishes I can remove the original.
>
> Any good resource that explains how to add a new datacenter to a live
> single dc cluster that anybody can recommend?
>
>
> On Wed, Sep 18, 2013 at 9:58 AM, Chris Burroughs
> <ch...@gmail.com>wrote:
>
>> On 09/17/2013 09:41 PM, Paulo Motta wrote:
>>
>>> So you're saying the only feasible way of enabling VNodes on an upgraded
>>> C*
>>> 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
>>> from the old cluster? Or is it possible to succeed on shuffling, even if
>>> that means waiting some weeks for the shuffle to complete?
>>>
>>
>> In a multi "DC" cluster situation you *should* be able to bring up a new
>> DC with vnodes, bootstrap it, and then decommission the old cluster.
>>
>
>
>

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

I really like this idea. I can create a new cluster and have it replicate
the old one, after it finishes I can remove the original.

Any good resource that explains how to add a new datacenter to a live
single dc cluster that anybody can recommend?

On Wed, Sep 18, 2013 at 9:58 AM, Chris Burroughs
<ch...@gmail.com>wrote:

> On 09/17/2013 09:41 PM, Paulo Motta wrote:
>
>> So you're saying the only feasible way of enabling VNodes on an upgraded
>> C*
>> 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
>> from the old cluster? Or is it possible to succeed on shuffling, even if
>> that means waiting some weeks for the shuffle to complete?
>>
>
> In a multi "DC" cluster situation you *should* be able to bring up a new
> DC with vnodes, bootstrap it, and then decommission the old cluster.
>

-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by Chris Burroughs <ch...@gmail.com>.

On 09/17/2013 09:41 PM, Paulo Motta wrote:
> So you're saying the only feasible way of enabling VNodes on an upgraded C*
> 1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
> from the old cluster? Or is it possible to succeed on shuffling, even if
> that means waiting some weeks for the shuffle to complete?

In a multi "DC" cluster situation you *should* be able to bring up a new 
DC with vnodes, bootstrap it, and then decommission the old cluster.
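
As a rough illustration of that sequence, the sketch below only builds the
list of `nodetool` commands an operator would run; it does not execute
anything. It assumes the new nodes were already started with `num_tokens`
set in cassandra.yaml and have joined as a second datacenter, and it leaves
out the keyspace replication changes that have to happen before the rebuild
and before the decommission. Host names and the datacenter name are
hypothetical.

```python
def migration_commands(new_nodes, old_nodes, source_dc):
    """Return nodetool commands, in order, for moving data to a new DC.

    new_nodes: hosts in the new vnode-enabled datacenter
    old_nodes: hosts in the original datacenter, retired at the end
    source_dc: name of the original datacenter to stream data from
    """
    commands = []
    # Each new node streams its share of the data from the old DC.
    for host in new_nodes:
        commands.append("nodetool -h {} rebuild {}".format(host, source_dc))
    # Once clients and replication point at the new DC, retire old nodes.
    for host in old_nodes:
        commands.append("nodetool -h {} decommission".format(host))
    return commands


for line in migration_commands(["new1", "new2"], ["old1"], "DC1"):
    print(line)
```

Printing the plan first makes it easy to review the order before running
anything by hand, one node at a time.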

Re: I don't understand shuffle progress

Posted by Paulo Motta <pa...@gmail.com>.

That is very disappointing to hear. Vnodes support is one of the main
reasons we're upgrading from 1.1.X to 1.2.X.

So you're saying the only feasible way of enabling VNodes on an upgraded C*
1.2 is by doing fork writes to a brand new cluster + bulk load of sstables
from the old cluster? Or is it possible to succeed on shuffling, even if
that means waiting some weeks for the shuffle to complete?


2013/9/17 Robert Coli <rc...@eventbrite.com>

> On Tue, Sep 17, 2013 at 4:00 PM, Juan Manuel Formoso <jformoso@gmail.com
> >wrote:
>
> > Any better alternatives than creating a small application that reads from
> > one cluster and inserts in the new one that anybody can suggest?
> >
> >
> http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra
>
> In theory if you wanted to do the "copy-the-files" method while enabling
> vnodes on the target cluster, you could :
>
> 1) create new target cluster with vnodes enabled
> 2) fork writes so they go to both source and target cluster
> 3) copy 100% of sstables from all source nodes to all target nodes (being
> sure to ensure non-collision of sstables of names, probably by adding a few
> hundreds/thousands to the sequence of various nodes in a predictable
> fashion)
> 4) be certain that you did not accidentally resurrect data from purged
> source sstables in 3)
> 5) run cleanup compaction on all nodes in target cluster
> 6) turn off writes to old source cluster
>
> =Rob
> * notes that this process would make a good blog post.. :D
>



-- 
Paulo Ricardo

-- 
European Master in Distributed Computing
Royal Institute of Technology - KTH
Instituto Superior Técnico - IST
http://paulormg.com

Re: I don't understand shuffle progress

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Sep 17, 2013 at 4:00 PM, Juan Manuel Formoso <jf...@gmail.com>wrote:

> Any better alternatives than creating a small application that reads from
> one cluster and inserts in the new one that anybody can suggest?
>
>
http://www.palominodb.com/blog/2012/09/25/bulk-loading-options-cassandra

In theory if you wanted to do the "copy-the-files" method while enabling
vnodes on the target cluster, you could :

1) create new target cluster with vnodes enabled
2) fork writes so they go to both source and target cluster
3) copy 100% of sstables from all source nodes to all target nodes (being
sure to avoid sstable name collisions, probably by adding a few
hundred/thousand to the sequence numbers of each node's files in a
predictable fashion)
4) be certain that you did not accidentally resurrect data from purged
source sstables in 3)
5) run cleanup compaction on all nodes in target cluster
6) turn off writes to old source cluster
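
The renaming in step 3 could be sketched like this. It is an illustration
under assumptions, not a tested migration tool: it assumes 1.2-style file
names of the form `<keyspace>-<columnfamily>-<version>-<generation>-<component>`
(for example `Keyspace1-Standard1-ic-5-Data.db`), and the function names and
the stride of 10000 are hypothetical.

```python
import re

# prefix = keyspace-columnfamily, version = format tag such as "ic",
# generation = the per-node sequence number, component = e.g. "Data.db"
SSTABLE_NAME = re.compile(
    r"^(?P<prefix>.+)-(?P<version>[a-z]+)-(?P<generation>\d+)-(?P<component>[^-]+)$"
)


def offset_for_node(node_index, stride=10000):
    """Give each source node its own block of generation numbers."""
    return node_index * stride


def offset_generation(filename, offset):
    """Return `filename` with its generation number increased by `offset`."""
    match = SSTABLE_NAME.match(filename)
    if match is None:
        raise ValueError("not an sstable component name: " + filename)
    new_generation = int(match.group("generation")) + offset
    return "{}-{}-{}-{}".format(
        match.group("prefix"),
        match.group("version"),
        new_generation,
        match.group("component"),
    )


print(offset_generation("Keyspace1-Standard1-ic-5-Data.db", offset_for_node(1)))
```

Every component file of one sstable (Data, Index, Filter and so on) has to
get the same offset, and the stride has to be larger than the highest
generation number on any source node, otherwise names can still collide.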

=Rob
* notes that this process would make a good blog post.. :D

Re: I don't understand shuffle progress

Posted by Juan Manuel Formoso <jf...@gmail.com>.

> If your shuffle succeeds, you will be the first reported case of
> shuffle succeeding on a non-test cluster.

Awesome! :O

I'll try to migrate to a new cluster then.

Any better alternatives than creating a small application that reads from
one cluster and inserts in the new one that anybody can suggest?

On Tuesday, September 17, 2013, Robert Coli wrote:

> On Tue, Sep 17, 2013 at 12:13 PM, Juan Manuel Formoso <jformoso@gmail.com>
> wrote:
>
> > I am running shuffle on a cluster after upgrading to 1.2.X, and I don't
> > understand how to check progress.
> >
>
> If your shuffle succeeds, you will be the first reported case of shuffle
> succeeding on a non-test cluster. Until I hear a report of someone having
> real world success, I recommend against using shuffle.
>
> If you want to enable vnodes on a cluster with existing data, IMO you
> should fork writes and bulk load a replacement cluster.
>
>
> > I'm counting the lines of cassandra-shuffle ls, and it decreases VERY
> > slowly. Sometimes not at all after 24 hours of processing.
> >
>
> I have heard reports of shuffle taking an insanely long amount of time,
> such as this, as well.
>
>
> > Is that value accurate?
> >
>
> Probably.
>
>
> > Does the shuffle operation supports disabling/re-enabling (or restarting
> > the cluster) and resuming from the last position? Or does it start over?
> >
>
> Yes, via the arguments "enable" and "disable". "clear" is what you use if
> you want to clear the queue and start over.
>
> Note that once you have started shuffle, you don't want to add/remove a
> node until the shuffle is complete.
>
> https://issues.apache.org/jira/browse/CASSANDRA-5525
>
> =Rob
>


-- 
*Juan Manuel Formoso
*Senior Geek
http://twitter.com/juanformoso
http://seniorgeek.com.ar
LLAP

Re: I don't understand shuffle progress

Posted by Robert Coli <rc...@eventbrite.com>.

On Tue, Sep 17, 2013 at 12:13 PM, Juan Manuel Formoso <jf...@gmail.com> wrote:

> I am running shuffle on a cluster after upgrading to 1.2.X, and I don't
> understand how to check progress.
>

If your shuffle succeeds, you will be the first reported case of shuffle
succeeding on a non-test cluster. Until I hear a report of someone having
real world success, I recommend against using shuffle.

If you want to enable vnodes on a cluster with existing data, IMO you
should fork writes and bulk load a replacement cluster.

> I'm counting the lines of cassandra-shuffle ls, and it decreases VERY
> slowly. Sometimes not at all after 24 hours of processing.
>

I have also heard other reports of shuffle taking an insanely long time, as
in your case.

> Is that value accurate?
>

Probably.

> Does the shuffle operation support disabling/re-enabling (or restarting
> the cluster) and resuming from the last position? Or does it start over?
>

Yes, via the arguments "enable" and "disable". "clear" is what you use if
you want to clear the queue and start over.
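
If you want something a bit more informative than counting lines by hand,
here is a rough Python sketch. It assumes every non-blank line printed by
cassandra-shuffle ls is one pending relocation (I have not checked whether
the output has header lines) and it passes no connection options. The
function names are made up for illustration.

```python
import subprocess


def count_pending(ls_output: str) -> int:
    """Count non-blank lines, similar to `cassandra-shuffle ls | wc -l`."""
    return sum(1 for line in ls_output.splitlines() if line.strip())


def hours_remaining(earlier: int, later: int, hours_between: float):
    """Linear estimate of hours left from two samples of the pending count.

    Returns None when the count did not decrease between the samples,
    since no rate can be derived from that window.
    """
    done = earlier - later
    if done <= 0:
        return None
    return later * hours_between / done


def sample_pending() -> int:
    """Run `cassandra-shuffle ls` and count its pending entries."""
    result = subprocess.run(
        ["cassandra-shuffle", "ls"],
        capture_output=True,
        text=True,
        check=True,
    )
    return count_pending(result.stdout)
```

Take one sample with sample_pending(), take another some hours later, and
feed both counts to hours_remaining(). A None result matches the "not at all
after 24 hours" case: there is no progress to extrapolate from.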

Note that once you have started shuffle, you don't want to add/remove a
node until the shuffle is complete.

https://issues.apache.org/jira/browse/CASSANDRA-5525

=Rob
