Posted to user@spark.apache.org by Marco Mistroni <mm...@gmail.com> on 2019/02/03 21:41:47 UTC
Re: testing frameworks
Hi,
Sorry to resurrect this thread.
Are there any Spark libraries for testing PySpark code? The GitHub project
mentioned earlier in this thread seems to be Scala-oriented.
Following the links in the original thread (and a quick web search), I found
pytest-spark · PyPI <https://pypi.org/project/pytest-spark/>
With kindest regards,
Marco
On Tue, Jun 12, 2018 at 6:44 PM Ryan Adams <ra...@gmail.com> wrote:
> We use spark-testing-base for unit testing. These tests execute on a very
> small amount of data that covers all paths the code can take (or most paths
> anyway).
>
> https://github.com/holdenk/spark-testing-base
>
> For integration testing we use automated routines to ensure that aggregate
> values match an aggregate baseline.
>
> Ryan
>
> Ryan Adams
> radams217@gmail.com
>
> On Tue, Jun 12, 2018 at 11:51 AM, Lars Albertsson <la...@mapflat.com>
> wrote:
>
>> Hi,
>>
>> I wrote this answer to the same question a couple of years ago:
>> https://www.mail-archive.com/user%40spark.apache.org/msg48032.html
>>
>> I have made a couple of presentations on the subject. Slides and video
>> are linked on this page: http://www.mapflat.com/presentations/
>>
>> You can find more material in this list of resources:
>> http://www.mapflat.com/lands/resources/reading-list
>>
>> Happy testing!
>>
>> Regards,
>>
>>
>>
>> Lars Albertsson
>> Data engineering consultant
>> www.mapflat.com
>> https://twitter.com/lalleal
>> +46 70 7687109
>> Calendar: http://www.mapflat.com/calendar
>>
>>
>> On Mon, May 21, 2018 at 2:24 PM, Steve Pruitt <bp...@opentext.com>
>> wrote:
>> > Hi,
>> >
>> >
>> >
>> > Can anyone recommend testing frameworks suitable for Spark jobs?
>> > Something that can be integrated into a CI tool would be great.
>> >
>> >
>> >
>> > Thanks.
>> >
>> >
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>>
>>
>
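
The integration check Ryan describes above (aggregate values compared against an aggregate baseline) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the routine his team uses: the metric names, the tolerance, and the `compare_to_baseline` helper are all hypothetical. In a real job the `actual` dictionary would presumably be built from aggregates collected to the driver; here it is computed from a small in-memory list so the sketch runs without Spark.

```python
import math


def compare_to_baseline(actual, baseline, rel_tol=1e-6):
    """Return (metric, actual_value, expected_value) for every mismatch."""
    mismatches = []
    for name, expected in baseline.items():
        value = actual.get(name)
        # A missing metric or a value outside the tolerance is a mismatch.
        if value is None or not math.isclose(value, expected, rel_tol=rel_tol):
            mismatches.append((name, value, expected))
    return mismatches


# Hypothetical baseline recorded from a known-good run.
baseline = {"row_count": 3, "total_amount": 60.0}

# Stand-in for the job output; a Spark job would aggregate a DataFrame instead.
rows = [{"amount": 10.0}, {"amount": 20.0}, {"amount": 30.0}]
actual = {
    "row_count": len(rows),
    "total_amount": sum(row["amount"] for row in rows),
}

assert compare_to_baseline(actual, baseline) == []
```

Returning the list of mismatches, rather than a single boolean, makes a failing CI run easier to diagnose.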
Re: testing frameworks
Posted by Marco Mistroni <mm...@gmail.com>.
Thanks Hichame, I will follow up on that.
Is anyone on this list using the Python version of spark-testing-base? It seems
there is support for DataFrames.
Thanks in advance and regards,
Marco
On Sun, Feb 3, 2019 at 9:58 PM Hichame El Khalfi <hi...@elkhalfi.com>
wrote:
> Hi,
> You can use pysparkling => https://github.com/svenkreiss/pysparkling
> This library is useful if you are working with RDDs.
>
> Hope this helps,
>
> Hichame
Re: testing frameworks
Posted by Hichame El Khalfi <hi...@elkhalfi.com>.
Hi,
You can use pysparkling => https://github.com/svenkreiss/pysparkling
This library is useful if you are working with RDDs.
Hope this helps,
Hichame