Posted to user@spark.apache.org by 张包峰 <pe...@qq.com> on 2014/07/08 04:04:50 UTC

Re: Pig 0.13, Spark, Spork

Hi guys, I previously checked out the old "spork" and updated it to Hadoop 2.0, Scala 2.10.3 and Spark 0.9.1; see my GitHub project: https://github.com/pelick/flare-spork


It is also highly experimental, and it just maps Pig physical operations directly to Spark RDD transformations/actions. It works for simple requests. :)
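
To give a rough idea of what "mapping physical operations to RDD transformations" can look like, here is a minimal sketch. It is not code from flare-spork or Spork. ToyRDD is a hypothetical in-memory stand-in (not the Spark API), and the operator names FILTER, FOREACH and GROUP, as well as translate and run_example, are assumptions made up for illustration.

```python
from collections import defaultdict


class ToyRDD:
    """In-memory stand-in for a Spark RDD (hypothetical, not the Spark API)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, pred):
        # Keep only the rows for which the predicate holds.
        return ToyRDD(r for r in self.rows if pred(r))

    def map(self, fn):
        # Apply fn to every row.
        return ToyRDD(fn(r) for r in self.rows)

    def group_by(self, key_fn):
        # Collect rows under their key, returned as (key, [rows]) pairs
        # sorted by key so the output order is deterministic.
        groups = defaultdict(list)
        for r in self.rows:
            groups[key_fn(r)].append(r)
        return ToyRDD(sorted(groups.items()))

    def collect(self):
        return list(self.rows)


def translate(plan, source):
    """Walk a linear plan of (operator, function) pairs and chain one
    transformation per operator. Operator names are illustrative only."""
    rdd = source
    for op, fn in plan:
        if op == "FILTER":
            rdd = rdd.filter(fn)
        elif op == "FOREACH":
            rdd = rdd.map(fn)
        elif op == "GROUP":
            rdd = rdd.group_by(fn)
        else:
            raise ValueError(f"unsupported operator: {op}")
    return rdd


def run_example():
    rows = ToyRDD([("alice", 3), ("bob", 7), ("alice", 9), ("carol", 1)])
    plan = [
        ("FILTER", lambda r: r[1] > 2),                          # drop small values
        ("GROUP", lambda r: r[0]),                               # group by name
        ("FOREACH", lambda g: (g[0], sum(v for _, v in g[1]))),  # sum per group
    ]
    return translate(plan, rows).collect()
```

A real engine would also have to handle branching plans, joins, loads and stores, which this linear sketch ignores.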


I am also interested in the progress of Spork. Is it being developed at Twitter in a non-open-source way?


------------------
Thanks
Zhang Baofeng
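Blog <http://blog.csdn.net/pelick> | Github <https://github.com/pelick> | Weibo <http://weibo.com/pelickzhang> | LinkedIn <http://www.linkedin.com/pub/zhang-baofeng/70/609/84>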




 




------------------ Original Message ------------------
From: "Mayur Rustagi";<ma...@gmail.com>;
Sent: Monday, July 7, 2014, 11:55 PM
To: "user@spark.apache.org"<us...@spark.apache.org>;
Subject: Re: Pig 0.13, Spark, Spork



That version is old :). We are not forking Pig but cleanly separating out the Pig execution engine. Let me know if you are willing to give it a go.



Also, I would love to know which features of Pig you are using.
 


Regards
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com
 @mayur_rustagi





 

On Mon, Jul 7, 2014 at 8:46 PM, Bertrand Dechoux <de...@gmail.com> wrote:
 I saw a wiki page from your company but with an old version of Spark.
 http://docs.sigmoidanalytics.com/index.php/Setting_up_spork_with_spark_0.8.1
 


I have no reason to use it yet but I am interested in the state of the initiative.
What's your point of view (personal and/or professional) on the Pig 0.13 release?
Is the pluggable execution engine flexible enough to avoid having Spork as a fork of Pig? Pig + Spark + Fork = Spork :D
 

As a (for now) external observer, I am glad to see competition in that space. It can only be good for the community in the end.

 Bertrand Dechoux
 

On Mon, Jul 7, 2014 at 5:00 PM, Mayur Rustagi <ma...@gmail.com> wrote:
 Hi,
 We have fixed many major issues around Spork and are deploying it with some customers. We would be happy to provide a working version for you to try out. We are looking for more folks to try it out and submit bugs.
 

Regards
Mayur 


Mayur Rustagi
Ph: +1 (760) 203 3257 http://www.sigmoidanalytics.com
 @mayur_rustagi





 

On Mon, Jul 7, 2014 at 8:21 PM, Bertrand Dechoux <de...@gmail.com> wrote:
 Hi,

I was wondering what the state of the Pig+Spark initiative is now that the execution engine of Pig is pluggable. Granted, that was done in order to use Tez, but could it be used by Spark? I know about a 'theoretical' project called Spork, but I don't know of any stable and maintained version of it.
 

Regards

Bertrand Dechoux

Re: Re: Pig 0.13, Spark, Spork

Posted by Mayur Rustagi <ma...@gmail.com>.
Also, it's far from bug-free :)
Let me know if you need any help to try it out.

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Wed, Jul 9, 2014 at 12:58 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Hi Bertrand,
>
> We've updated the document
> http://docs.sigmoidanalytics.com/index.php/Setting_up_spork_with_spark_0.9.0
>
> This is our working Github repo
> https://github.com/sigmoidanalytics/spork/tree/spork-0.9
>
> Feel free to open issues over here
> https://github.com/sigmoidanalytics/spork/issues
>
> Thanks
> Best Regards

Re: Re: Pig 0.13, Spark, Spork

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Hi Bertrand,

We've updated the document
http://docs.sigmoidanalytics.com/index.php/Setting_up_spork_with_spark_0.9.0

This is our working Github repo
https://github.com/sigmoidanalytics/spork/tree/spork-0.9

Feel free to open issues over here
https://github.com/sigmoidanalytics/spork/issues

Thanks
Best Regards



Re: Re: Pig 0.13, Spark, Spork

Posted by Bertrand Dechoux <de...@gmail.com>.
@Mayur: I won't argue over the semantics of a fork, but at the moment no
version of Spork takes the standard Pig as a dependency. On that, we should agree.

As for my use of Pig, I have no limitations. I am, however, interested in seeing
the rise of a 'no-SQL, high-level, non-programming language' for Spark.

@Zhang: Could you elaborate on your reference to Twitter?


Bertrand Dechoux

