Posted to dev@pirk.apache.org by Walter Ray-Dulany <ra...@apache.org> on 2016/10/06 13:32:57 UTC

Distributed tests take a long time, and that's ok.

Tim Ellison noted in a recent email to the list (subject: [CANCEL] [VOTE]:
Apache Pirk 0.2.0-incubating Release) the following:

> The distributed tests took a long time (~35mins IIRC), is that normal?

I feel like noting this in a separate thread, and discussing it here, is
worthwhile.

The distributed tests *do* take a long time. The reason isn't that any
one test is slow; indeed, most of the individual tests take only a handful
of seconds. The slowness comes from the very large number of tests. On
inspection, that number is the result of very thorough testing of the
large number of different actions that can be performed by Pirk and the
platforms it supports.

I think this is ok. As an example of why this thorough testing is
necessary: just yesterday, while finishing up PIRK-45, I ran into an error
15 minutes into the distributed tests. Without the distributed tests, that
code would have made it into a PR last night, and it is likely no one
would have caught the problem (it was a subtle "Spark has an ancient
package on its classpath pre-empting a newer version I'd included, and the
old package has a bug" error).
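
For anyone chasing a similar classpath conflict, here is a minimal sketch of one common way to see which jar a class was actually loaded from. It is a generic JVM diagnostic, not something taken from Pirk or from the PIRK-45 fix; the class name WhichJar and the method locate are hypothetical names chosen for illustration.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (not part of Pirk): reports where a class was loaded from.
class WhichJar {

    /** Returns "className <- location", or a note when the location is unknown. */
    static String locate(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String)
        // typically have no code source.
        if (source == null || source.getLocation() == null) {
            return cls.getName() + " <- (bootstrap or unknown source)";
        }
        URL location = source.getLocation();
        return cls.getName() + " <- " + location;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments; defaults to this class.
        if (args.length == 0) {
            System.out.println(locate(WhichJar.class));
        }
        for (String name : args) {
            System.out.println(locate(Class.forName(name)));
        }
    }
}
```

Running this inside the same job that fails, with the suspect class name as the argument, should show whether the platform's bundled jar or the newer one won.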

Re: Distributed tests take a long time, and that's ok.

Posted by Ellison Anne Williams <ea...@apache.org>.

Thanks for adding the instructions for Bluemix in the PR today - I will
check them out and try to run everything on Bluemix.

Yes, we should print 'SUCCESS' or 'FAILURE' messages at the end of the test
runs...
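
A minimal sketch of what such an end-of-run summary might look like, assuming each test reports a name and a pass/fail flag. TestSummary, record, allPassed and render are hypothetical names for illustration only; this is not how the Pirk test driver is currently written.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical collector (not part of Pirk): gathers outcomes, prints one verdict.
class TestSummary {

    // LinkedHashMap keeps the tests in the order they were run.
    private final Map<String, Boolean> outcomes = new LinkedHashMap<>();

    void record(String testName, boolean passed) {
        outcomes.put(testName, passed);
    }

    boolean allPassed() {
        return !outcomes.containsValue(false);
    }

    /** Lists the failed tests, then a final SUCCESS or FAILURE line. */
    String render() {
        StringBuilder sb = new StringBuilder();
        int failed = 0;
        for (Map.Entry<String, Boolean> e : outcomes.entrySet()) {
            if (!e.getValue()) {
                failed++;
                sb.append("  failed: ").append(e.getKey()).append('\n');
            }
        }
        sb.append(failed == 0 ? "SUCCESS" : "FAILURE")
          .append(": ").append(outcomes.size() - failed)
          .append(" of ").append(outcomes.size()).append(" tests passed");
        return sb.toString();
    }

    public static void main(String[] args) {
        TestSummary summary = new TestSummary();
        summary.record("hypothetical-test-1", true);
        summary.record("hypothetical-test-2", false);
        System.out.println(summary.render());
    }
}
```

A driver could also exit with a non-zero status when allPassed() is false, so that scripts see the failure without anyone reading the log.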

On Tue, Oct 11, 2016 at 8:06 AM, Tim Ellison <t....@gmail.com> wrote:

> There was certainly lots of logging going by - not that I was watching
> it constantly for half an hour - so I appreciate there is plenty of code
> being run and I trust that it is doing something useful ;-)
>
> I was also trusting that if there was a failure, it would be obvious,
> and there was nothing that said "FAILURE" at the end of the run, so I
> took that as a good sign.
>
> Testing is good.  Automatic background testing for this type of test
> suite may be better.  I'm not likely to run the dist tests on every
> pause in development if they take a long time and cost real money (a
> single run on AWS reported that it would cost me $2.50-ish*).
>
> It does run faster on the IBM JVM, and I can run it faster and for free
> on Bluemix, so I'll go that route for the moment -- but I don't want to
> lose sight of trying to make the testing readily available for anyone
> who drops by in the community.  As mooted earlier, maybe the best way to
> do that is to investigate a shared resource test environment hosted at
> the ASF.
>
> * I know, I'm cheap :-)
>
> Regards,
> Tim
>

Re: Distributed tests take a long time, and that's ok.

Posted by Tim Ellison <t....@gmail.com>.

On 06/10/16 14:32, Walter Ray-Dulany wrote:
> Tim Ellison noted in a recent email to the list (subject: [CANCEL] [VOTE]:
> Apache Pirk 0.2.0-incubating Release) the following:
> 
>> The distributed tests took a long time (~35mins IIRC), is that normal?
> 
> I feel like noting this in a separate thread, and discussing it here, is
> worthwhile.
> 
> The distributed tests *do* take a long time. The reason isn't that any
> one test is slow; indeed, most of the individual tests take only a handful
> of seconds. The slowness comes from the very large number of tests. On
> inspection, that number is the result of very thorough testing of the
> large number of different actions that can be performed by Pirk and the
> platforms it supports.

There was certainly lots of logging going by - not that I was watching
it constantly for half an hour - so I appreciate there is plenty of code
being run and I trust that it is doing something useful ;-)

I was also trusting that if there was a failure, it would be obvious,
and there was nothing that said "FAILURE" at the end of the run, so I
took that as a good sign.

> I think this is ok. As an example of why this thorough testing is
> necessary: just yesterday, while finishing up PIRK-45, I ran into an error
> 15 minutes into the distributed tests. Without the distributed tests, that
> code would have made it into a PR last night, and it is likely no one
> would have caught the problem (it was a subtle "Spark has an ancient
> package on its classpath pre-empting a newer version I'd included, and the
> old package has a bug" error).

Testing is good.  Automatic background testing for this type of test
suite may be better.  I'm not likely to run the dist tests on every
pause in development if they take a long time and cost real money (a
single run on AWS reported that it would cost me $2.50-ish*).

It does run faster on the IBM JVM, and I can run it faster and for free
on Bluemix, so I'll go that route for the moment -- but I don't want to
lose sight of trying to make the testing readily available for anyone
who drops by in the community.  As mooted earlier, maybe the best way to
do that is to investigate a shared resource test environment hosted at
the ASF.

* I know, I'm cheap :-)

Regards,
Tim

Re: Distributed tests take a long time, and that's ok.

Posted by Ellison Anne Williams <ea...@apache.org>.

The tests were designed to be Responder-platform-centric. So, if one wants
to test Pirk on Spark, one can run the corresponding test suite. The
individual tests within each Responder platform's scope test the various
cases (and corner cases) of Pirk.

One correction to what you wrote about test input: the inputs for the
tests (see the DistributedTestDriver initialize(fs) method) are created
once per run of the driver. If multiple platforms are being tested, the
inputs are only created one time.

I am not at all opposed to refactoring the tests (they absolutely need a
bit of cleanup).

On Thu, Oct 6, 2016 at 6:39 PM, Darin Johnson <db...@gmail.com>
wrote:

> Being pretty familiar with that section of the code base right now, I
> think a few things could be done.  Regardless, I'm refactoring it for
> submodules, so it'll be easier to iterate on these. The main place I'd
> start: currently each platform test creates a new input, which means the
> querier logic is repeated for the number of platforms tested. Setting up
> the test inputs once, running all platforms for that test, and then
> tearing down the inputs would be a start.
>
>
> Maybe have all tests implement an interface like:
>
>     public interface BaseTest {
>
>       // Build the test inputs once, before any platform runs.
>       void createInputs(...);
>
>       // Return true if a platform's results match what the test expects.
>       boolean checkOutputs(List results);
>
>     }
>
> Then a runner could do:
>
>     BaseTest myTest = new MyTest();
>     myTest.createInputs(...);
>
>     // Responder: whatever type wraps one platform's test run.
>     for (Responder responder : responders) {
>       List results = responder.runDistributedTest();
>       if (!myTest.checkOutputs(results)) {
>         fail();
>       }
>     }
>
>
> If nothing else it'd make adding tests easier.

Re: Distributed tests take a long time, and that's ok.

Posted by Darin Johnson <db...@gmail.com>.

Being pretty familiar with that section of the code base right now, I
think a few things could be done.  Regardless, I'm refactoring it for
submodules, so it'll be easier to iterate on these. The main place I'd
start: currently each platform test creates a new input, which means the
querier logic is repeated for the number of platforms tested. Setting up
the test inputs once, running all platforms for that test, and then
tearing down the inputs would be a start.


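Maybe have all tests implement an interface like:

    public interface BaseTest {

      // Build the test inputs once, before any platform runs.
      void createInputs(...);

      // Return true if a platform's results match what the test expects.
      boolean checkOutputs(List results);

    }

Then a runner could do:

    BaseTest myTest = new MyTest();
    myTest.createInputs(...);

    // Responder: whatever type wraps one platform's test run.
    for (Responder responder : responders) {
      List results = responder.runDistributedTest();
      if (!myTest.checkOutputs(results)) {
        fail();
      }
    }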


If nothing else it'd make adding tests easier.
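
To make the idea concrete, here is a self-contained sketch of the BaseTest approach that compiles and runs on its own. It rests on several assumptions: the Responder interface, the tearDownInputs method, CountingTest, SharedInputRunner and the stub platforms are all hypothetical stand-ins written for this example, and none of them is taken from the Pirk code base.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical runner: one set of inputs, checked against every platform.
class SharedInputRunner {

    /** Runs one test against every responder; returns the number of failures. */
    static int run(BaseTest test, List<Responder> responders) {
        int failures = 0;
        test.createInputs();                 // built once, not once per platform
        try {
            for (Responder responder : responders) {
                List<String> results = responder.runDistributedTest();
                if (!test.checkOutputs(results)) {
                    failures++;
                    System.out.println("FAILED on " + responder.name());
                }
            }
        } finally {
            test.tearDownInputs();           // torn down once, even on error
        }
        return failures;
    }

    /** Stub platform that simply returns canned output. */
    static Responder stub(final String name, final List<String> output) {
        return new Responder() {
            public String name() { return name; }
            public List<String> runDistributedTest() { return output; }
        };
    }

    public static void main(String[] args) {
        CountingTest test = new CountingTest();
        List<Responder> responders = Arrays.asList(
            stub("platformA", Arrays.asList("a", "b")),
            stub("platformB", Arrays.asList("a", "x")));   // deliberately wrong
        int failures = run(test, responders);
        System.out.println(failures + " failure(s); inputs created "
            + test.inputsCreated + " time(s)");
    }
}

/** One test case: owns its inputs and validates any platform's output. */
interface BaseTest {
    void createInputs();
    boolean checkOutputs(List<String> results);
    void tearDownInputs();                   // assumption: not in the original sketch
}

/** Assumed abstraction over a responder platform. */
interface Responder {
    String name();
    List<String> runDistributedTest();
}

/** Example test that counts how often its inputs are built. */
class CountingTest implements BaseTest {
    int inputsCreated = 0;
    private List<String> expected;

    public void createInputs() {
        inputsCreated++;
        expected = Arrays.asList("a", "b");
    }

    public boolean checkOutputs(List<String> results) {
        return expected.equals(results);
    }

    public void tearDownInputs() {
        expected = null;
    }
}
```

With the two stub platforms above, the run reports one failure (platformB) and shows that the inputs were created a single time, which is the saving being proposed.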


