Posted to user@spark.apache.org by Holden Karau <ho...@pigscanfly.ca> on 2017/04/14 18:17:18 UTC

Spark Testing Library Discussion

Hi Spark Users (+ Some Spark Testing Devs on BCC),

A while back, on one of the many threads about testing in Spark, there was
some interest in having a chat about the state of Spark testing and what
people want/need.

So if you are interested in joining an online (with maybe an IRL component
if enough people are SF-based) chat about Spark testing, please fill out
this Doodle: https://doodle.com/poll/69y6yab4pyf7u8bn

I think reasonable topics of discussion could be:

1) What is the state of the different Spark testing libraries in the
different core (Scala, Python, R, Java) and extended languages (C#,
Javascript, etc.)?
2) How do we make these more easily discovered by users?
3) What are people looking for in their testing libraries that we are
missing? (can be functionality, documentation, etc.)
4) Are there any examples of well tested open source Spark projects and
where are they?

If you have other topics, that's awesome.

To clarify, this is about libraries and best practices for people testing
their Spark applications, and less about testing Spark's internals (although,
as illustrated by some of the libraries, there is some strong overlap in what
is required to make that work).

Cheers,

Holden :)

-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: Spark Testing Library Discussion

Posted by "lucas.gary@gmail.com" <lu...@gmail.com>.
Oh, interesting. I did send a PR, or thought I did; I will check this evening.

On Apr 29, 2017 10:04 AM, "Sam Elamin" <hu...@gmail.com> wrote:

> Hi lucas
>
>
> Thanks for the detailed feedback, that's really useful!
>
> I did suggest Github but my colleague asked for an email
>
> You raise a good point with the grammar, sure I will rephrase it. I am
> more than happy to merge in the PR if you send it
>
>
> That said I know you can make BDD tests using any framework but I am a
> lazy developer and would rather use the framework or library defaults to
> make it easier for other devs to pick up.
>
> The number of rows is only a start correct, we can add more tests to check
> the transformed version but I was going to point that out on the future
> part of the series since this one is mainly about raw extracts.
>
>
> Thank you very much for the feedback and I will be sure to add it once I
> have more feedback
>
>
> Maybe we can create a gist of all this or even a tiny book on best
> practices if people find it useful
>
> Looking forward to the PR!
>
> Regards
> Sam

Re: Spark Testing Library Discussion

Posted by Sam Elamin <hu...@gmail.com>.
Hi Lucas


Thanks for the detailed feedback, that's really useful!

I did suggest GitHub, but my colleague asked for an email.

You raise a good point about the grammar; sure, I will rephrase it. I am more
than happy to merge the PR if you send it.


That said, I know you can write BDD tests using any framework, but I am a
lazy developer and would rather use the framework or library defaults to
make it easier for other devs to pick up.
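
For reference, the "BDD in any framework" idea under discussion could look
roughly like the sketch below: Given/When/Then steps written as ordinary
helper functions inside a plain pytest-style test, with no BDD plugin. This
is only an illustration in plain Python (no Spark session), and every name
in it (given_raw_events, when_clicks_are_totalled_per_user, then_totals_are)
is hypothetical rather than taken from the blog post or from any library.

```python
# Given/When/Then expressed as plain helper functions (all names hypothetical).

def given_raw_events():
    """Given: a small, hand-written input data set."""
    return [
        {"user": "a", "clicks": 1},
        {"user": "a", "clicks": 2},
        {"user": "b", "clicks": 5},
    ]


def when_clicks_are_totalled_per_user(events):
    """When: the transformation under test is applied."""
    totals = {}
    for event in events:
        totals[event["user"]] = totals.get(event["user"], 0) + event["clicks"]
    return totals


def then_totals_are(actual, expected):
    """Then: the result matches the expected values exactly."""
    assert actual == expected


def test_clicks_are_totalled_per_user():
    events = given_raw_events()
    totals = when_clicks_are_totalled_per_user(events)
    then_totals_are(totals, {"a": 3, "b": 5})
```

The trade-off is the one raised above: this needs no extra dependency, while
a dedicated BDD library gives the steps a standard, discoverable form.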

Correct, the number of rows is only a start. We can add more tests to check
the transformed version, but I was going to point that out in a future
part of the series, since this one is mainly about raw extracts.


Thank you very much for the feedback; I will be sure to add it once I
have more feedback.


Maybe we can create a gist of all this, or even a tiny book on best
practices, if people find it useful.

Looking forward to the PR!

Regards
Sam





On Sat, 29 Apr 2017 at 06:36, lucas.gary@gmail.com <lu...@gmail.com>
wrote:

> Awesome, thanks.
>
> Just reading your post
>
> A few observations:
> 1) You're giving out Marius's email: "I have been lucky enough to
> build this pipeline with the amazing Marius Feteanu".  A linked or
> github link might be more helpful.
>
> 2) "If you are in Pyspark world sadly Holden’s test base wont work so
> I suggest you check out Pytest and pytest-bdd.".  doesn't read well to
> me, on first read I was wondering if Spark-Test-Base wasn't available
> in python... It took me about 20 seconds to figure out that you
> probably meant it doesn't allow for direct BDD semantics.  My 2nd
> observation here is that BDD semantics can be aped in any given
> testing framework.  You just need to be flexible :)
>
> 3) You're doing a transformation (IE JSON input against a JSON
> schema).  You are testing for # of rows which is a good start.  But I
> don't think that really exercises a test against your JSON schema. I
> tend to view schema as the things that need the most rigorous testing
> (it's code after all).  IE I would want to confirm that the output
> matches the expected shape and values after being loaded against the
> schema.
>
> I saw a few minor spelling and grammatical issues as well.  I put a PR
> into your blog for them.  I won't be offended if you squish it :)
>
> I should be getting into our testing 'how-to' stuff this week.  I'll
> scrape our org specific stuff and put it up to github this week as
> well.  It'll be in python so maybe we'll get both use cases covered
> with examples :)
>
> G

Re: Spark Testing Library Discussion

Posted by "lucas.gary@gmail.com" <lu...@gmail.com>.
Awesome, thanks.

Just reading your post

A few observations:
1) You're giving out Marius's email: "I have been lucky enough to
build this pipeline with the amazing Marius Feteanu".  A LinkedIn or
GitHub link might be more helpful.

2) "If you are in Pyspark world sadly Holden’s test base wont work so
I suggest you check out Pytest and pytest-bdd." doesn't read well to
me.  On first read I was wondering if spark-testing-base wasn't available
in Python...  It took me about 20 seconds to figure out that you
probably meant it doesn't allow for direct BDD semantics.  My second
observation here is that BDD semantics can be aped in any given
testing framework.  You just need to be flexible :)

3) You're doing a transformation (i.e. JSON input against a JSON
schema).  You are testing for # of rows, which is a good start, but I
don't think that really exercises your JSON schema.  I
tend to view schemas as the things that need the most rigorous testing
(they're code, after all), i.e. I would want to confirm that the output
matches the expected shape and values after being loaded against the
schema.
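
To make point 3 concrete, here is a minimal sketch of a test that checks
shape and values rather than only the row count. It is plain Python so it
runs without a Spark session; the field names, the EXPECTED_SCHEMA mapping
and the load_records helper are all assumptions made up for illustration,
not anything from the blog post. With PySpark the same idea would presumably
mean comparing the DataFrame's schema and its collected rows against
expected ones.

```python
import json

# Hypothetical expected shape: field name -> Python type used to coerce it.
EXPECTED_SCHEMA = {"id": int, "name": str, "amount": float}


def load_records(raw_lines, schema):
    """Parse JSON lines, keep only schema fields, coerce each to its type."""
    records = []
    for line in raw_lines:
        data = json.loads(line)
        missing = set(schema) - set(data)
        if missing:
            raise ValueError(f"missing fields: {sorted(missing)}")
        records.append({field: cast(data[field]) for field, cast in schema.items()})
    return records


def test_output_matches_shape_and_values():
    raw = [
        '{"id": "1", "name": "a", "amount": 2}',
        '{"id": 2, "name": "b", "amount": 3.5}',
    ]
    out = load_records(raw, EXPECTED_SCHEMA)

    # Row count: the check the post already has.
    assert len(out) == 2

    # Shape: every record has exactly the schema's fields, with the right types.
    for record in out:
        assert set(record) == set(EXPECTED_SCHEMA)
        for field, expected_type in EXPECTED_SCHEMA.items():
            assert isinstance(record[field], expected_type)

    # Values: the loaded output equals a hand-written expectation.
    assert out == [
        {"id": 1, "name": "a", "amount": 2.0},
        {"id": 2, "name": "b", "amount": 3.5},
    ]
```

A row-count assertion alone would still pass if "id" came back as a string
or a column went missing; the shape and value assertions are what catch that.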

I saw a few minor spelling and grammatical issues as well.  I put a PR
into your blog for them.  I won't be offended if you squish it :)

I should be getting into our testing 'how-to' stuff this week.  I'll
scrape out our org-specific stuff and put it up on GitHub this week as
well.  It'll be in Python, so maybe we'll get both use cases covered
with examples :)

G

On 27 April 2017 at 03:46, Sam Elamin <hu...@gmail.com> wrote:
> Hi
>
> @Lucas I certainly would love to write an integration testing library for
> workflows, I have a few ideas I would love to share with others and they are
> focused around Airflow since that is what we use
>
>
> As promised here is the first blog post in a series of posts I hope to write
> on how we build data pipelines
>
> Please feel free to retweet my original tweet and share because the more
> ideas we have the better!
>
> Feedback is always welcome!
>
> Regards
> Sam


Re: Spark Testing Library Discussion

Posted by Sam Elamin <hu...@gmail.com>.
Hi

@Lucas I would certainly love to write an integration testing library for
workflows. I have a few ideas I would love to share with others, and they
are focused around Airflow since that is what we use.


As promised, here
<https://samelamin.github.io/2017/04/27/Building-A-Datapipeline-part1/> is
the first blog post in a series of posts I hope to write on how we build
data pipelines.

Please feel free to retweet my original tweet
<https://twitter.com/samelamin/status/857546231492612096> and share because
the more ideas we have the better!

Feedback is always welcome!

Regards
Sam

On Tue, Apr 25, 2017 at 10:32 PM, lucas.gary@gmail.com <lucas.gary@gmail.com>
wrote:

> Hi all, whoever (Sam I think) was going to do some work on doing a
> template testing pipeline.  I'd love to be involved, I have a current task
> in my day job (data engineer) to flesh out our testing how-to / best
> practices for Spark jobs and I think I'll be doing something very similar
> for the next week or 2.
>
> I'll scrape out what i have now in the next day or so and put it up in a
> gist that I can share too.
>
> G

Re: Spark Testing Library Discussion

Posted by "lucas.gary@gmail.com" <lu...@gmail.com>.
Hi all, whoever (Sam, I think) was going to do some work on a template
testing pipeline: I'd love to be involved.  I have a current task in my day
job (data engineer) to flesh out our testing how-to / best practices for
Spark jobs, and I think I'll be doing something very similar for the next
week or 2.

I'll scrape out what I have now in the next day or so and put it up in a
gist that I can share too.

G

On 25 April 2017 at 13:04, Holden Karau <ho...@pigscanfly.ca> wrote:

> Urgh hangouts did something frustrating, updated link
> https://hangouts.google.com/hangouts/_/ha6kusycp5fvzei2trhay4uhhqe

Re: Spark Testing Library Discussion

Posted by Holden Karau <ho...@pigscanfly.ca>.
Sorry about that, hangouts on air broke in the first one :(

On Wed, Apr 26, 2017 at 8:41 AM, Marco Mistroni <mm...@gmail.com> wrote:

> Uh i stayed online in the other link but nobody joined....Will follow
> transcript
> Kr
>
> On 26 Apr 2017 9:35 am, "Holden Karau" <ho...@pigscanfly.ca> wrote:
>
>> And the recording of our discussion is at
>> https://www.youtube.com/watch?v=2q0uAldCQ8M
>> A few of us have follow up things and we will try and do another meeting
>> in about a month or two :)
>>
>> On Tue, Apr 25, 2017 at 1:04 PM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> Urgh hangouts did something frustrating, updated link
>>> https://hangouts.google.com/hangouts/_/ha6kusycp5fvzei2trhay4uhhqe
>>>
>>> On Mon, Apr 24, 2017 at 12:13 AM, Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>>
>>>> The (tentative) link for those interested is
>>>> https://hangouts.google.com/hangouts/_/oyjvcnffejcjhi6qazf3lysypue .
>>>>
>>>> On Mon, Apr 24, 2017 at 12:02 AM, Holden Karau <ho...@pigscanfly.ca>
>>>> wrote:
>>>>
>>>>> So 14 people have said they are available on Tuesday the 25th at 1PM
>>>>> pacific so we will do this meeting then (
>>>>> https://doodle.com/poll/69y6yab4pyf7u8bn ).
>>>>>
>>>>> Since hangouts tends to work ok on the Linux distro I'm running my
>>>>> default is to host this as a "hangouts-on-air" unless there are alternative
>>>>> ideas.
>>>>>
>>>>> I'll record the hangout and if it isn't terrible I'll post it for
>>>>> those who weren't able to make it (and for next time I'll include more
>>>>> European friendly time options - Doodle wouldn't let me update it once
>>>>> posted).
>>>>>
>>>>> On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca>
>>>>> wrote:
>>>>>
>>>>>> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>>>>>>
>>>>>> Awhile back on one of the many threads about testing in Spark there
>>>>>> was some interest in having a chat about the state of Spark testing and
>>>>>> what people want/need.
>>>>>>
>>>>>> So if you are interested in joining an online (with maybe an IRL
>>>>>> component if enough people are SF based) chat about Spark testing please
>>>>>> fill out this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>>>>>>
>>>>>> I think reasonable topics of discussion could be:
>>>>>>
>>>>>> 1) What is the state of the different Spark testing libraries in the
>>>>>> different core (Scala, Python, R, Java) and extended languages (C#,
>>>>>> Javascript, etc.)?
>>>>>> 2) How do we make these more easily discovered by users?
>>>>>> 3) What are people looking for in their testing libraries that we are
>>>>>> missing? (can be functionality, documentation, etc.)
>>>>>> 4) Are there any examples of well tested open source Spark projects
>>>>>> and where are they?
>>>>>>
>>>>>> If you have other topics that's awesome.
>>>>>>
>>>>>> To clarify this about libraries and best practices for people testing
>>>>>> their Spark applications, and less about testing Spark's internals
>>>>>> (although as illustrated by some of the libraries there is some strong
>>>>>> overlap in what is required to make that work).
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Holden :)
>>>>>>
>>>>>> --
>>>>>> Cell : 425-233-8271
>>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Cell : 425-233-8271
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Cell : 425-233-8271
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>
>>>
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: Spark Testing Library Discussion

Posted by Marco Mistroni <mm...@gmail.com>.
Uh, I stayed online on the other link but nobody joined... Will follow the
transcript.
Kr

On 26 Apr 2017 9:35 am, "Holden Karau" <ho...@pigscanfly.ca> wrote:

> And the recording of our discussion is at
> https://www.youtube.com/watch?v=2q0uAldCQ8M
> A few of us have follow up things and we will try and do another meeting
> in about a month or two :)
>
> On Tue, Apr 25, 2017 at 1:04 PM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
>
>> Urgh hangouts did something frustrating, updated link
>> https://hangouts.google.com/hangouts/_/ha6kusycp5fvzei2trhay4uhhqe
>>
>> On Mon, Apr 24, 2017 at 12:13 AM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> The (tentative) link for those interested is
>>> https://hangouts.google.com/hangouts/_/oyjvcnffejcjhi6qazf3lysypue .
>>>
>>> On Mon, Apr 24, 2017 at 12:02 AM, Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>>
>>>> So 14 people have said they are available on Tuesday the 25th at 1PM
>>>> pacific so we will do this meeting then (
>>>> https://doodle.com/poll/69y6yab4pyf7u8bn ).
>>>>
>>>> Since hangouts tends to work ok on the Linux distro I'm running my
>>>> default is to host this as a "hangouts-on-air" unless there are alternative
>>>> ideas.
>>>>
>>>> I'll record the hangout and if it isn't terrible I'll post it for those
>>>> who weren't able to make it (and for next time I'll include more European
>>>> friendly time options - Doodle wouldn't let me update it once posted).
>>>>
>>>> On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca>
>>>> wrote:
>>>>
>>>>> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>>>>>
>>>>> Awhile back on one of the many threads about testing in Spark there
>>>>> was some interest in having a chat about the state of Spark testing and
>>>>> what people want/need.
>>>>>
>>>>> So if you are interested in joining an online (with maybe an IRL
>>>>> component if enough people are SF based) chat about Spark testing please
>>>>> fill out this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>>>>>
>>>>> I think reasonable topics of discussion could be:
>>>>>
>>>>> 1) What is the state of the different Spark testing libraries in the
>>>>> different core (Scala, Python, R, Java) and extended languages (C#,
>>>>> Javascript, etc.)?
>>>>> 2) How do we make these more easily discovered by users?
>>>>> 3) What are people looking for in their testing libraries that we are
>>>>> missing? (can be functionality, documentation, etc.)
>>>>> 4) Are there any examples of well tested open source Spark projects
>>>>> and where are they?
>>>>>
>>>>> If you have other topics that's awesome.
>>>>>
>>>>> To clarify this about libraries and best practices for people testing
>>>>> their Spark applications, and less about testing Spark's internals
>>>>> (although as illustrated by some of the libraries there is some strong
>>>>> overlap in what is required to make that work).
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Holden :)
>>>>>
>>>>> --
>>>>> Cell : 425-233-8271
>>>>> Twitter: https://twitter.com/holdenkarau
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Cell : 425-233-8271
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>
>>>
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>

Re: Spark Testing Library Discussion

Posted by Holden Karau <ho...@pigscanfly.ca>.
And the recording of our discussion is at
https://www.youtube.com/watch?v=2q0uAldCQ8M
A few of us have follow up things and we will try and do another meeting in
about a month or two :)

On Tue, Apr 25, 2017 at 1:04 PM, Holden Karau <ho...@pigscanfly.ca> wrote:

> Urgh hangouts did something frustrating, updated link
> https://hangouts.google.com/hangouts/_/ha6kusycp5fvzei2trhay4uhhqe
>
> On Mon, Apr 24, 2017 at 12:13 AM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
>
>> The (tentative) link for those interested is
>> https://hangouts.google.com/hangouts/_/oyjvcnffejcjhi6qazf3lysypue .
>>
>> On Mon, Apr 24, 2017 at 12:02 AM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> So 14 people have said they are available on Tuesday the 25th at 1PM
>>> pacific so we will do this meeting then (
>>> https://doodle.com/poll/69y6yab4pyf7u8bn ).
>>>
>>> Since hangouts tends to work ok on the Linux distro I'm running my
>>> default is to host this as a "hangouts-on-air" unless there are alternative
>>> ideas.
>>>
>>> I'll record the hangout and if it isn't terrible I'll post it for those
>>> who weren't able to make it (and for next time I'll include more European
>>> friendly time options - Doodle wouldn't let me update it once posted).
>>>
>>> On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca>
>>> wrote:
>>>
>>>> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>>>>
>>>> Awhile back on one of the many threads about testing in Spark there was
>>>> some interest in having a chat about the state of Spark testing and what
>>>> people want/need.
>>>>
>>>> So if you are interested in joining an online (with maybe an IRL
>>>> component if enough people are SF based) chat about Spark testing please
>>>> fill out this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>>>>
>>>> I think reasonable topics of discussion could be:
>>>>
>>>> 1) What is the state of the different Spark testing libraries in the
>>>> different core (Scala, Python, R, Java) and extended languages (C#,
>>>> Javascript, etc.)?
>>>> 2) How do we make these more easily discovered by users?
>>>> 3) What are people looking for in their testing libraries that we are
>>>> missing? (can be functionality, documentation, etc.)
>>>> 4) Are there any examples of well tested open source Spark projects and
>>>> where are they?
>>>>
>>>> If you have other topics that's awesome.
>>>>
>>>> To clarify this about libraries and best practices for people testing
>>>> their Spark applications, and less about testing Spark's internals
>>>> (although as illustrated by some of the libraries there is some strong
>>>> overlap in what is required to make that work).
>>>>
>>>> Cheers,
>>>>
>>>> Holden :)
>>>>
>>>> --
>>>> Cell : 425-233-8271
>>>> Twitter: https://twitter.com/holdenkarau
>>>>
>>>
>>>
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: Spark Testing Library Discussion

Posted by Holden Karau <ho...@pigscanfly.ca>.
Urgh hangouts did something frustrating, updated link
https://hangouts.google.com/hangouts/_/ha6kusycp5fvzei2trhay4uhhqe

On Mon, Apr 24, 2017 at 12:13 AM, Holden Karau <ho...@pigscanfly.ca> wrote:

> The (tentative) link for those interested is
> https://hangouts.google.com/hangouts/_/oyjvcnffejcjhi6qazf3lysypue .
>
> On Mon, Apr 24, 2017 at 12:02 AM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
>
>> So 14 people have said they are available on Tuesday the 25th at 1PM
>> pacific so we will do this meeting then (
>> https://doodle.com/poll/69y6yab4pyf7u8bn ).
>>
>> Since hangouts tends to work ok on the Linux distro I'm running my
>> default is to host this as a "hangouts-on-air" unless there are alternative
>> ideas.
>>
>> I'll record the hangout and if it isn't terrible I'll post it for those
>> who weren't able to make it (and for next time I'll include more European
>> friendly time options - Doodle wouldn't let me update it once posted).
>>
>> On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca>
>> wrote:
>>
>>> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>>>
>>> Awhile back on one of the many threads about testing in Spark there was
>>> some interest in having a chat about the state of Spark testing and what
>>> people want/need.
>>>
>>> So if you are interested in joining an online (with maybe an IRL
>>> component if enough people are SF based) chat about Spark testing please
>>> fill out this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>>>
>>> I think reasonable topics of discussion could be:
>>>
>>> 1) What is the state of the different Spark testing libraries in the
>>> different core (Scala, Python, R, Java) and extended languages (C#,
>>> Javascript, etc.)?
>>> 2) How do we make these more easily discovered by users?
>>> 3) What are people looking for in their testing libraries that we are
>>> missing? (can be functionality, documentation, etc.)
>>> 4) Are there any examples of well tested open source Spark projects and
>>> where are they?
>>>
>>> If you have other topics that's awesome.
>>>
>>> To clarify this about libraries and best practices for people testing
>>> their Spark applications, and less about testing Spark's internals
>>> (although as illustrated by some of the libraries there is some strong
>>> overlap in what is required to make that work).
>>>
>>> Cheers,
>>>
>>> Holden :)
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: Spark Testing Library Discussion

Posted by Holden Karau <ho...@pigscanfly.ca>.
The (tentative) link for those interested is
https://hangouts.google.com/hangouts/_/oyjvcnffejcjhi6qazf3lysypue .

On Mon, Apr 24, 2017 at 12:02 AM, Holden Karau <ho...@pigscanfly.ca> wrote:

> So 14 people have said they are available on Tuesday the 25th at 1PM
> pacific so we will do this meeting then (
> https://doodle.com/poll/69y6yab4pyf7u8bn ).
>
> Since hangouts tends to work ok on the Linux distro I'm running my default
> is to host this as a "hangouts-on-air" unless there are alternative ideas.
>
> I'll record the hangout and if it isn't terrible I'll post it for those
> who weren't able to make it (and for next time I'll include more European
> friendly time options - Doodle wouldn't let me update it once posted).
>
> On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca>
> wrote:
>
>> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>>
>> Awhile back on one of the many threads about testing in Spark there was
>> some interest in having a chat about the state of Spark testing and what
>> people want/need.
>>
>> So if you are interested in joining an online (with maybe an IRL
>> component if enough people are SF based) chat about Spark testing please
>> fill out this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>>
>> I think reasonable topics of discussion could be:
>>
>> 1) What is the state of the different Spark testing libraries in the
>> different core (Scala, Python, R, Java) and extended languages (C#,
>> Javascript, etc.)?
>> 2) How do we make these more easily discovered by users?
>> 3) What are people looking for in their testing libraries that we are
>> missing? (can be functionality, documentation, etc.)
>> 4) Are there any examples of well tested open source Spark projects and
>> where are they?
>>
>> If you have other topics that's awesome.
>>
>> To clarify this about libraries and best practices for people testing
>> their Spark applications, and less about testing Spark's internals
>> (although as illustrated by some of the libraries there is some strong
>> overlap in what is required to make that work).
>>
>> Cheers,
>>
>> Holden :)
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau

Re: Spark Testing Library Discussion

Posted by Holden Karau <ho...@pigscanfly.ca>.
So 14 people have said they are available on Tuesday the 25th at 1 PM
Pacific, so we will do this meeting then (
https://doodle.com/poll/69y6yab4pyf7u8bn ).

Since hangouts tends to work ok on the Linux distro I'm running my default
is to host this as a "hangouts-on-air" unless there are alternative ideas.

I'll record the hangout and if it isn't terrible I'll post it for those who
weren't able to make it (and for next time I'll include more European
friendly time options - Doodle wouldn't let me update it once posted).

On Fri, Apr 14, 2017 at 11:17 AM, Holden Karau <ho...@pigscanfly.ca> wrote:

> Hi Spark Users (+ Some Spark Testing Devs on BCC),
>
> Awhile back on one of the many threads about testing in Spark there was
> some interest in having a chat about the state of Spark testing and what
> people want/need.
>
> So if you are interested in joining an online (with maybe an IRL component
> if enough people are SF based) chat about Spark testing please fill out
> this doodle - https://doodle.com/poll/69y6yab4pyf7u8bn
>
> I think reasonable topics of discussion could be:
>
> 1) What is the state of the different Spark testing libraries in the
> different core (Scala, Python, R, Java) and extended languages (C#,
> Javascript, etc.)?
> 2) How do we make these more easily discovered by users?
> 3) What are people looking for in their testing libraries that we are
> missing? (can be functionality, documentation, etc.)
> 4) Are there any examples of well tested open source Spark projects and
> where are they?
>
> If you have other topics that's awesome.
>
> To clarify this about libraries and best practices for people testing
> their Spark applications, and less about testing Spark's internals
> (although as illustrated by some of the libraries there is some strong
> overlap in what is required to make that work).
>
> Cheers,
>
> Holden :)
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>



-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau