Posted to user@spark.apache.org by "Areg Baghdasaryan (BLOOMBERG/ 731 LEX -)" <ab...@bloomberg.net> on 2014/09/23 23:40:25 UTC

Sorting a Table in Spark RDD

Hello,

So I have created a table in an RDD in Spark in this format:
     col1    col2
-----------------------
1.   10      11
2.   12      8
3.   9       13
4.   2       3

The RDD is distributed by rows (rows 1 and 2 on one node, rows 3 and 4 on another).
I want to sort each column of the table independently so that the output is the following:


     col1    col2
-----------------------
1.   2       3
2.   9       8
3.   10      11
4.   12      13

Is there an easy way to do this with a Spark RDD? The only way that I can think of so far is to transpose the table somehow.

Thanks
Areg 

Re: Sorting a Table in Spark RDD

Posted by Victor Tso-Guillen <vt...@paxata.com>.
You could pluck out each column into separate RDDs, sort them independently,
and zip them :)
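
To make the pluck/sort/zip idea concrete, here is a minimal sketch in plain Python on local lists, not on an actual RDD. The sample rows come from the question; the function name `sort_columns` is made up for illustration.

```python
# Local, non-distributed illustration of "pluck, sort, zip".
# The rows are the sample table from the original question.
rows = [(10, 11), (12, 8), (9, 13), (2, 3)]


def sort_columns(rows):
    """Sort every column independently and reassemble the rows.

    `sort_columns` is a hypothetical helper, not part of any Spark API.
    """
    # Pluck each column out of the row-oriented data (a transpose).
    columns = list(zip(*rows))
    # Sort each column on its own, ignoring the original row pairing.
    sorted_columns = [sorted(col) for col in columns]
    # Zip the sorted columns back together into rows.
    return list(zip(*sorted_columns))


result = sort_columns(rows)
for row in result:
    print(row)
```

On real RDDs the same three steps would roughly correspond to `map` (one RDD per column), `sortBy`, and `zip`. One caveat: `RDD.zip` expects both RDDs to have the same number of partitions and the same number of elements in each partition, which independent sorts are not guaranteed to produce. A possible workaround, offered as an assumption rather than a tested recipe, is to call `zipWithIndex` on each sorted column, key each value by its index, and `join` on that index.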
