Posted to user@spark.apache.org by Marco Mistroni <mm...@gmail.com> on 2019/02/03 21:41:47 UTC

Re: testing frameworks

Hi,
Sorry to resurrect this thread.
Are there any Spark libraries for testing PySpark code? The GitHub project
linked earlier in this thread seems to target Scala.
Following the links in the original thread (and also searching), I found
pytest-spark · PyPI <https://pypi.org/project/pytest-spark/>

With kindest regards,
 Marco




On Tue, Jun 12, 2018 at 6:44 PM Ryan Adams <ra...@gmail.com> wrote:

> We use spark testing base for unit testing.  These tests execute on a very
> small amount of data that covers all paths the code can take (or most paths
> anyway).
>
> https://github.com/holdenk/spark-testing-base
>
> For integration testing we use automated routines to ensure that aggregate
> values match an aggregate baseline.
>
> Ryan
>
> Ryan Adams
> radams217@gmail.com
>
> On Tue, Jun 12, 2018 at 11:51 AM, Lars Albertsson <la...@mapflat.com>
> wrote:
>
>> Hi,
>>
>> I wrote this answer to the same question a couple of years ago:
>> https://www.mail-archive.com/user%40spark.apache.org/msg48032.html
>>
>> I have made a couple of presentations on the subject. Slides and video
>> are linked on this page: http://www.mapflat.com/presentations/
>>
>> You can find more material in this list of resources:
>> http://www.mapflat.com/lands/resources/reading-list
>>
>> Happy testing!
>>
>> Regards,
>>
>>
>>
>> Lars Albertsson
>> Data engineering consultant
>> www.mapflat.com
>> https://twitter.com/lalleal
>> +46 70 7687109
>> Calendar: http://www.mapflat.com/calendar
>>
>>
>> On Mon, May 21, 2018 at 2:24 PM, Steve Pruitt <bp...@opentext.com>
>> wrote:
>> > Hi,
>> >
>> >
>> >
>> > Can anyone recommend testing frameworks suitable for Spark jobs? Something
>> > that can be integrated into a CI tool would be great.
>> >
>> >
>> >
>> > Thanks.
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>>
>>
>
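
The integration-testing approach Ryan describes above (checking that aggregate
values match an aggregate baseline) could be sketched roughly as follows. This
is only an illustration in plain Python, not code from anyone in this thread:
the function name compare_to_baseline, the metric names, and the tolerance are
all assumptions, and in a real job the aggregates would be computed from the
Spark output rather than hard-coded.

```python
import math


def compare_to_baseline(actual, baseline, rel_tol=1e-6):
    """Return {metric: (expected, actual)} for every metric that deviates.

    Hypothetical helper: a metric missing from `actual` is reported with
    actual=None; metrics present only in `actual` are ignored.
    """
    mismatches = {}
    for name, expected in baseline.items():
        value = actual.get(name)
        if value is None or not math.isclose(value, expected, rel_tol=rel_tol):
            mismatches[name] = (expected, value)
    return mismatches


# Made-up aggregates for illustration; a real check might use a row count
# and a column sum collected from the job's output.
baseline = {"row_count": 1000, "total_amount": 52340.75}
job_output = {"row_count": 1000, "total_amount": 52340.75}

# An empty result means every baseline metric matched within the tolerance.
assert compare_to_baseline(job_output, baseline) == {}
```

A check like this can run as an ordinary test step in a CI tool, failing the
build whenever the returned dictionary is not empty.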

Re: testing frameworks

Posted by Marco Mistroni <mm...@gmail.com>.
Thanks Hichame, I will follow up on that.

Is anyone on this list using the Python version of spark-testing-base? It seems
there is support for DataFrames.

Thanks in advance and regards,
 Marco

On Sun, Feb 3, 2019 at 9:58 PM Hichame El Khalfi <hi...@elkhalfi.com>
wrote:

> Hi,
> You can use pysparkling => https://github.com/svenkreiss/pysparkling
> This library is useful if you are working with RDDs.
>
> Hope this helps,
>
> Hichame

Re: testing frameworks

Posted by Hichame El Khalfi <hi...@elkhalfi.com>.
Hi,
You can use pysparkling => https://github.com/svenkreiss/pysparkling
This library is useful if you are working with RDDs.

Hope this helps,

Hichame
