Posted to user@spark.apache.org by Mayur Rustagi <ma...@gmail.com> on 2014/04/08 19:47:11 UTC

Re: Pig on Spark

Hi Aniket,
Thanks for all the work on Pig.
Finally got it working. A couple of high-level issues remain right now:

   - Getting it working on Spark 0.9.0
   - Getting UDFs working
   - Getting generate functionality working
   - Exhaustive test suite for Pig on Spark

Are you maintaining a Jira somewhere?

I am currently trying to deploy it on 0.9.0.

Regards
Mayur

Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Fri, Mar 14, 2014 at 1:37 PM, Aniket Mokashi <an...@gmail.com> wrote:

> We will post fixes from our side at https://github.com/twitter/pig.
>
> Top of our list are:
> 1. Make it work with pig-trunk (execution engine interface) (with Spark 0.8 or
> 0.9).
> 2. Support for algebraic UDFs (this mitigates the group-by OOM problems).
>
> Would definitely love more contribution on this.
>
> Thanks,
> Aniket
>
>
> On Fri, Mar 14, 2014 at 12:29 PM, Mayur Rustagi <ma...@gmail.com> wrote:
>
>> Damn, I am off to NY for Structure Conf. Would it be possible to meet
>> anytime after 28th March?
>> I am really interested in making it stable and production quality.
>>
>> Regards
>> Mayur Rustagi
>> Ph: +1 (760) 203 3257
>> http://www.sigmoidanalytics.com
>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>
>>
>>
>> On Fri, Mar 14, 2014 at 11:53 AM, Julien Le Dem <ju...@twitter.com> wrote:
>>
>>> Hi Mayur,
>>> Are you going to the Pig meetup this afternoon?
>>> http://www.meetup.com/PigUser/events/160604192/
>>> Aniket and I will be there.
>>> We would be happy to chat about Pig-on-Spark
>>>
>>>
>>>
>>> On Tue, Mar 11, 2014 at 8:56 AM, Mayur Rustagi <ma...@gmail.com> wrote:
>>>
>>>> Hi Lin,
>>>> We are working on getting Pig on Spark functional with 0.8.0. Have you
>>>> got it working on any Spark version?
>>>> Also, what functionality works on it?
>>>> Regards
>>>> Mayur
>>>>
>>>> Mayur Rustagi
>>>> Ph: +1 (760) 203 3257
>>>> http://www.sigmoidanalytics.com
>>>> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>>>>
>>>>
>>>>
>>>> On Mon, Mar 10, 2014 at 11:00 PM, Xiangrui Meng <me...@gmail.com> wrote:
>>>>
>>>>> Hi Sameer,
>>>>>
>>>>> Lin (cc'ed) could also give you some updates about Pig on Spark
>>>>> development on her side.
>>>>>
>>>>> Best,
>>>>> Xiangrui
>>>>>
>>>>> On Mon, Mar 10, 2014 at 12:52 PM, Sameer Tilak <ss...@live.com> wrote:
>>>>> > Hi Mayur,
>>>>> > We are planning to upgrade our distribution MR1 -> MR2 (YARN), and the
>>>>> > goal is to get Spork set up next month. I will keep you posted. Can you
>>>>> > please keep me informed about your progress as well?
>>>>> >
>>>>> > ________________________________
>>>>> > From: mayur.rustagi@gmail.com
>>>>> > Date: Mon, 10 Mar 2014 11:47:56 -0700
>>>>> >
>>>>> > Subject: Re: Pig on Spark
>>>>> > To: user@spark.apache.org
>>>>> >
>>>>> >
>>>>> > Hi Sameer,
>>>>> > Did you make any progress on this? My team is also trying it out and
>>>>> > would love to know some details of your progress.
>>>>> >
>>>>> > Mayur Rustagi
>>>>> > Ph: +1 (760) 203 3257
>>>>> > http://www.sigmoidanalytics.com
>>>>> > @mayur_rustagi
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Mar 6, 2014 at 2:20 PM, Sameer Tilak <ss...@live.com> wrote:
>>>>> >
>>>>> > Hi Aniket,
>>>>> > Many thanks! I will check this out.
>>>>> >
>>>>> > ________________________________
>>>>> > Date: Thu, 6 Mar 2014 13:46:50 -0800
>>>>> > Subject: Re: Pig on Spark
>>>>> > From: aniket486@gmail.com
>>>>> > To: user@spark.apache.org; tgraves_cs@yahoo.com
>>>>> >
>>>>> >
>>>>> > There is some work to make this work on YARN at
>>>>> > https://github.com/aniket486/pig. (So, compile Pig with ant
>>>>> > -Dhadoopversion=23)
>>>>> >
>>>>> > You can look at https://github.com/aniket486/pig/blob/spork/pig-spark to
>>>>> > find out what sort of env variables you need (sorry, I haven't been able to
>>>>> > clean this up; it is in progress). There are a few known issues with this; I
>>>>> > will work on fixing them soon.
>>>>> >
>>>>> > Known issues:
>>>>> > 1. Limit does not work (spork-fix)
>>>>> > 2. Foreach requires turning off schema-tuple-backend (should be a pig-jira)
>>>>> > 3. Algebraic UDFs don't work (spork-fix in progress)
>>>>> > 4. Group by rework (to avoid OOMs)
>>>>> > 5. UDF classloader issue (requires SPARK-1053, then you can put
>>>>> > pig-withouthadoop.jar as SPARK_JARS in SparkContext along with UDF jars)
>>>>> >
>>>>> > ~Aniket
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thu, Mar 6, 2014 at 1:36 PM, Tom Graves <tg...@yahoo.com> wrote:
>>>>> >
>>>>> > I had asked a similar question on the dev mailing list a while back (Jan
>>>>> > 22nd).
>>>>> >
>>>>> > See the archives:
>>>>> > http://mail-archives.apache.org/mod_mbox/spark-dev/201401.mbox/browser ->
>>>>> > look for spork.
>>>>> >
>>>>> > Basically Matei said:
>>>>> >
>>>>> > Yup, that was it, though I believe people at Twitter picked it up again
>>>>> > recently. I'd suggest asking Dmitriy if you know him. I've seen interest in
>>>>> > this from several other groups, and if there's enough of it, maybe we can
>>>>> > start another open source repo to track it. The work in that repo you
>>>>> > pointed to was done over one week, and already had most of Pig's operators
>>>>> > working. (I helped out with this prototype over Twitter's hack week.) That
>>>>> > work also calls the Scala API directly, because it was done before we had a
>>>>> > Java API; it should be easier with the Java one.
>>>>> >
>>>>> >
>>>>> > Tom
>>>>> >
>>>>> >
>>>>> >
>>>>> > On Thursday, March 6, 2014 3:11 PM, Sameer Tilak <ss...@live.com> wrote:
>>>>> > Hi everyone,
>>>>> >
>>>>> > We are using Pig to build our data pipeline. I came across Spork -- Pig
>>>>> > on Spark at https://github.com/dvryaboy/pig and am not sure if it is still
>>>>> > active.
>>>>> >
>>>>> > Can someone please let me know the status of Spork or any other effort that
>>>>> > will let us run Pig on Spark? We can significantly benefit by using Spark,
>>>>> > but we would like to keep using the existing Pig scripts.
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> >
>>>>> > --
>>>>> > "...:::Aniket:::... Quetzalco@tl"
>>>>> >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
>
> --
> "...:::Aniket:::... Quetzalco@tl"
>

Re: Pig on Spark

Posted by lalit1303 <la...@sigmoidanalytics.com>.
Hi,

We got Spork working on Spark 0.9.0.
The repository is available at:
https://github.com/sigmoidanalytics/pig/tree/spork-hadoopasm-fix

Please share your feedback.



-----
Lalit Yadav
lalit@sigmoidanalytics.com

Re: Pig on Spark

Posted by Mayur Rustagi <ma...@gmail.com>.
Bam !!!
http://docs.sigmoidanalytics.com/index.php/Setting_up_spork_with_spark_0.8.1


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Thu, Apr 10, 2014 at 3:07 AM, Konstantin Kudryavtsev <
kudryavtsev.konstantin@gmail.com> wrote:

> Hi Mayur,
>
> I wondered if you could share your findings in some way (github, blog
> post, etc). I guess your experience will be very interesting/useful for
> many people
>
> sent from Lenovo YogaTablet

Re: Pig on Spark

Posted by Konstantin Kudryavtsev <ku...@gmail.com>.
Hi Mayur,

I wondered if you could share your findings in some way (GitHub, blog post,
etc.). I guess your experience will be very interesting and useful for many
people.

sent from Lenovo YogaTablet