Posted to user@spark.apache.org by Franc Carter <fr...@gmail.com> on 2016/01/09 22:55:15 UTC
pyspark: calculating row deltas
Hi,
I have a DataFrame with the columns
ID,Year,Value
I'd like to create a new column that is Value2-Value1, where Value1 is the
Value for the previous year (Year1 = Year2 - 1)
At the moment I am creating a new DataFrame with renamed columns and doing
DF.join(DF2, . . . .)
This looks cumbersome to me; is there a better way?
thanks
--
Franc
Re: pyspark: calculating row deltas
Posted by Franc Carter <fr...@gmail.com>.
Thanks
cheers
--
Franc
Re: pyspark: calculating row deltas
Posted by Blaž Šnuderl <sn...@gmail.com>.
This can be done using spark.sql and window functions. Take a look at
https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html
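A minimal sketch of the kind of window query the reply points to, using the sample rows from this thread. LAG ... OVER (PARTITION BY ... ORDER BY ...) is standard SQL window syntax; to keep the sketch runnable without a cluster it is executed here with Python's built-in sqlite3 as a stand-in engine, not with Spark. The table name "yearly", the aliases "Delta" and "d", and the helper row_deltas are made up for illustration. In Spark the same query shape would presumably be run against the DataFrame registered as a table, or expressed with pyspark.sql.functions.lag over a Window partitioned by ID and ordered by Year.

```python
import sqlite3

# Sample rows copied from the example in this thread: (ID, Year, Value).
ROWS = [
    (1, 2012, 100), (1, 2013, 102), (1, 2014, 106),
    (2, 2012, 110), (2, 2013, 118), (2, 2014, 128),
]

# LAG(Value) returns the Value of the previous row within the same ID,
# ordered by Year. The first row of each ID has no predecessor, so its
# delta is NULL and is filtered out by the outer query.
DELTA_QUERY = """
    SELECT d.ID, d.Year, d.Delta
    FROM (
        SELECT ID, Year,
               Value - LAG(Value) OVER (PARTITION BY ID ORDER BY Year) AS Delta
        FROM yearly
    ) AS d
    WHERE d.Delta IS NOT NULL
    ORDER BY d.ID, d.Year
"""


def row_deltas(rows):
    """Load rows into an in-memory table and return (ID, Year, Delta) tuples."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE yearly (ID INTEGER, Year INTEGER, Value INTEGER)")
        conn.executemany("INSERT INTO yearly VALUES (?, ?, ?)", rows)
        return conn.execute(DELTA_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in row_deltas(ROWS):
        print(row)
```

Note that LAG looks at the previous row, not strictly at Year-1. If an ID can skip a year, the two differ, and a join on Year = Year - 1 (or an extra check on the lagged Year) would be needed to match the stated rule exactly.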
Re: pyspark: calculating row deltas
Posted by Franc Carter <fr...@gmail.com>.
Sure, for a DataFrame that looks like this:
ID  Year  Value
1   2012  100
1   2013  102
1   2014  106
2   2012  110
2   2013  118
2   2014  128
I'd like to get back:
ID  Year  Value
1   2013  2
1   2014  4
2   2013  8
2   2014  10
i.e. the Value for an ID,Year combination is the Value for the ID,Year minus
the Value for the ID,Year-1
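The rule stated above can be pinned down with a small reference implementation in plain Python (not Spark), which may be handy for checking the output of whatever Spark approach is used. The function name year_over_year_deltas and the tuple layout (ID, Year, Value) are assumptions made for this sketch.

```python
def year_over_year_deltas(rows):
    """Return (ID, Year, Value - Value at Year-1) for every row whose
    ID has an entry for the preceding year; other rows are dropped."""
    # Index the values by (ID, Year) so the previous year can be looked up.
    values = {(id_, year): value for id_, year, value in rows}
    deltas = []
    for (id_, year), value in sorted(values.items()):
        previous = values.get((id_, year - 1))
        if previous is not None:
            deltas.append((id_, year, value - previous))
    return deltas


if __name__ == "__main__":
    sample = [
        (1, 2012, 100), (1, 2013, 102), (1, 2014, 106),
        (2, 2012, 110), (2, 2013, 118), (2, 2014, 128),
    ]
    for row in year_over_year_deltas(sample):
        print(row)
```

Because the lookup is keyed on Year-1 exactly, an ID with a missing year produces no row for the year after the gap, which is how the rule reads as written.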
thanks
--
Franc
Re: pyspark: calculating row deltas
Posted by Femi Anthony <fe...@gmail.com>.
Can you clarify what you mean with an actual example?
For example, if your data frame looks like this:
ID  Year  Value
1   2012  100
2   2013  101
3   2014  102
What's your desired output?
Femi
--
http://www.femibyte.com/twiki5/bin/view/Tech/
http://www.nextmatrix.com
"Great spirits have always encountered violent opposition from mediocre
minds." - Albert Einstein.