You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@pirk.apache.org by Darin Johnson <db...@gmail.com> on 2016/09/29 00:50:34 UTC

Some Obversations while refactoring into submodules

Hey guys, started working on refactoring into submodules and noticed a few
things.

First the class DistributedTestSuite in org.apache.pirk.tests package is
directly calling SparkLauncher, which can be avoided using platform
(already a TODO). This will be needed to break the spark responder out.  No
issues here' expected it.

Also noticed that DistributedTestSuite calls BaseTests.testDNSHostnameQuery
which in turn calls the static method performQuery in
DistTestSuite.performQuery so we've got a cyclic dependency.  I'm thinking
this should be refactored, which shouldn't be hard - most of the method are
static.

I figured I should solicit some thoughts on the best approach to going
about this before jumping in.  Comments welcome, especially as I haven't
written anything yet :).

My initial thoughts are the DistTestSuite should see what distributed
frameworks are available (via plugin) and then run those tests.  This will
likely require a setup helper for each framework though - the DistTestSuite
might then need to skip frameworks where the setup helper isn't found.  If
this seems to be the approach people would like I'll start stubbing it out
and get some opinions.

Darin

Re: Some Obversations while refactoring into submodules

Posted by Ellison Anne Williams <ea...@apache.org>.

Looks good Darin - look forward to the PR and integrating it into Pirk! :)

On Wed, Oct 5, 2016 at 5:18 AM, Tim Ellison <t....@gmail.com> wrote:

> On 04/10/16 07:19, Darin Johnson wrote:
> > Spent some time working on this tonight.  Currently it's here:
> > https://github.com/DarinJ/incubator-pirk/tree/submodules
>
> Thanks Darin, looking good.
>
> My first thought is that the StandaloneResponder look s a bit out of
> place now in core, but it may be overkill to pull it out into its own
> module.  WDYT?
>
> > if anyone wants to
> > provide a sanity check on how I've modified the ResponderPlugin to handle
> > the distributed tests (Look at the ResponderPlugin and maybe
> > SparkResponder).  Here we added a hasDistributedTest and
> > runDistributedTest, the reason for hasDistributedTest is that Storm
> didn't
> > have a test in BaseTests.
>
> Yeah, though in time I hope that the BaseTests don't differentiate on
> distributed vs stand alone tests.  I'm afraid it is a bit of a Pirk-ism
> for the code to know about implementation details elsewhere.  IMHO the
> various test queries should run on all available plug-ins, and not be
> concerned with the distributed/otherwise nature of the implementation.
>
> > The basic idea is to provide the Plugins the ability to setup their own
> > distributed test and provide BaseTests back with the list of objects.  I
> > moved performQuery from DistTestSuite to BaseTests and made it private,
> > this got rid of a circular dependency.  There's a few things I don't
> > particularly like the setup necessarily handled in two functions and
> > there's a lot of file reads/writes.  I don't think this can be helped due
> > to the nature of the tests.  I'm also torn on calling main, using
> > ProcessBuilder, each frameworks special pragmatic launcher.
>
> It's a journey and this is a big step in the right direction  :-)
>
> > Please note at present, you have to specify a test to run (i.e. -t 1:JS),
> > this is because not all responders are in all jars now (so some tests
> fail
> > saying plugin not found). That should be corrected tomorrow, I wouldn't
> > recommend going that far. I'll have the TestSuite look for available
> > platforms that have a distributed test and then run them.  I also plan to
> > start consolidating functions in DistTestSuite.java and BaseTests.java.
> >
> > Also in BaseTests, it seems really awkward that Standalone is there but
> has
> > special performQuery methods and special results.  Is there a particular
> > reason for this?  It would be nice to make it so standalone was tested
> the
> > same way.
>
> Definitely.
>
> > The other modules should be easier to break out.  I have bothered
> > optimizing maven yet.
>
> I assume you mean for the submodule dependencies etc?  No problem.
>
> Your branch is testing ok for me locally:
> [INFO] Apache Pirk (incubating) Project ................... SUCCESS [
> 1.392 s]
> [INFO] pirk-core .......................................... SUCCESS
> [01:17 min]
> [INFO] pirk-storm ......................................... SUCCESS [
> 54.000 s]
> [INFO] pirk-spark ......................................... SUCCESS [
> 3.275 s]
> [INFO] pirk-mapreduce ..................................... SUCCESS [
> 1.095 s]
>
> The overall direction is good.  Once the release is cut, and you are
> happy that it is reasonably in shape, I think it would be good to switch
> over to the modular builds so we can collaborate on the refinements.
>
> Regards,
> Tim
>
> > On Thu, Sep 29, 2016 at 7:39 AM, Darin Johnson <db...@gmail.com>
> > wrote:
> >
> >> Great, will do. Unfortunately, I don't think this will be done by end of
> >> week.  We'll see.
> >>
> >> On Sep 29, 2016 6:32 AM, "Ellison Anne Williams" <eawilliams@apache.org
> >
> >> wrote:
> >>
> >>> I would say that the DistributedTestSuite is ripe for refactor - it
> grew
> >>> pretty organically and hasn't had a good refactoring yet. ;)
> >>>
> >>> What you suggested sounds reasonable - look forward to seeing the code.
> >>>
> >>> On Thu, Sep 29, 2016 at 4:48 AM, Tim Ellison <t....@gmail.com>
> >>> wrote:
> >>>
> >>>> On 29/09/16 01:50, Darin Johnson wrote:
> >>>>> Hey guys, started working on refactoring into submodules and noticed
> a
> >>>> few
> >>>>> things.
> >>>>>
> >>>>> First the class DistributedTestSuite in org.apache.pirk.tests package
> >>> is
> >>>>> directly calling SparkLauncher, which can be avoided using platform
> >>>>> (already a TODO). This will be needed to break the spark responder
> >>> out.
> >>>> No
> >>>>> issues here' expected it.
> >>>>>
> >>>>> Also noticed that DistributedTestSuite calls
> >>>> BaseTests.testDNSHostnameQuery
> >>>>> which in turn calls the static method performQuery in
> >>>>> DistTestSuite.performQuery so we've got a cyclic dependency.  I'm
> >>>> thinking
> >>>>> this should be refactored, which shouldn't be hard - most of the
> >>> method
> >>>> are
> >>>>> static.
> >>>>>
> >>>>> I figured I should solicit some thoughts on the best approach to
> going
> >>>>> about this before jumping in.  Comments welcome, especially as I
> >>> haven't
> >>>>> written anything yet :).
> >>>>>
> >>>>> My initial thoughts are the DistTestSuite should see what distributed
> >>>>> frameworks are available (via plugin) and then run those tests.  This
> >>>> will
> >>>>> likely require a setup helper for each framework though - the
> >>>> DistTestSuite
> >>>>> might then need to skip frameworks where the setup helper isn't
> found.
> >>>> If
> >>>>> this seems to be the approach people would like I'll start stubbing
> it
> >>>> out
> >>>>> and get some opinions.
> >>>>
> >>>> Yep, sounds reasonable.  The distributed tests should discover what
> >>>> platforms are available and run on them, perhaps printing a warning if
> >>>> an expected project platform is not found.
> >>>>
> >>>> Looking forward to seeing the code.
> >>>>
> >>>> Regards,
> >>>> Tim
> >>>>
> >>>>
> >>>
> >>
> >
>

Re: Some Obversations while refactoring into submodules

Posted by Tim Ellison <t....@gmail.com>.

On 04/10/16 07:19, Darin Johnson wrote:
> Spent some time working on this tonight.  Currently it's here:
> https://github.com/DarinJ/incubator-pirk/tree/submodules 

Thanks Darin, looking good.

My first thought is that the StandaloneResponder look s a bit out of
place now in core, but it may be overkill to pull it out into its own
module.  WDYT?

> if anyone wants to
> provide a sanity check on how I've modified the ResponderPlugin to handle
> the distributed tests (Look at the ResponderPlugin and maybe
> SparkResponder).  Here we added a hasDistributedTest and
> runDistributedTest, the reason for hasDistributedTest is that Storm didn't
> have a test in BaseTests.

Yeah, though in time I hope that the BaseTests don't differentiate on
distributed vs stand alone tests.  I'm afraid it is a bit of a Pirk-ism
for the code to know about implementation details elsewhere.  IMHO the
various test queries should run on all available plug-ins, and not be
concerned with the distributed/otherwise nature of the implementation.

> The basic idea is to provide the Plugins the ability to setup their own
> distributed test and provide BaseTests back with the list of objects.  I
> moved performQuery from DistTestSuite to BaseTests and made it private,
> this got rid of a circular dependency.  There's a few things I don't
> particularly like the setup necessarily handled in two functions and
> there's a lot of file reads/writes.  I don't think this can be helped due
> to the nature of the tests.  I'm also torn on calling main, using
> ProcessBuilder, each frameworks special pragmatic launcher.

It's a journey and this is a big step in the right direction  :-)

> Please note at present, you have to specify a test to run (i.e. -t 1:JS),
> this is because not all responders are in all jars now (so some tests fail
> saying plugin not found). That should be corrected tomorrow, I wouldn't
> recommend going that far. I'll have the TestSuite look for available
> platforms that have a distributed test and then run them.  I also plan to
> start consolidating functions in DistTestSuite.java and BaseTests.java.
> 
> Also in BaseTests, it seems really awkward that Standalone is there but has
> special performQuery methods and special results.  Is there a particular
> reason for this?  It would be nice to make it so standalone was tested the
> same way.

Definitely.

> The other modules should be easier to break out.  I have bothered
> optimizing maven yet.

I assume you mean for the submodule dependencies etc?  No problem.

Your branch is testing ok for me locally:
[INFO] Apache Pirk (incubating) Project ................... SUCCESS [
1.392 s]
[INFO] pirk-core .......................................... SUCCESS
[01:17 min]
[INFO] pirk-storm ......................................... SUCCESS [
54.000 s]
[INFO] pirk-spark ......................................... SUCCESS [
3.275 s]
[INFO] pirk-mapreduce ..................................... SUCCESS [
1.095 s]

The overall direction is good.  Once the release is cut, and you are
happy that it is reasonably in shape, I think it would be good to switch
over to the modular builds so we can collaborate on the refinements.

Regards,
Tim

> On Thu, Sep 29, 2016 at 7:39 AM, Darin Johnson <db...@gmail.com>
> wrote:
> 
>> Great, will do. Unfortunately, I don't think this will be done by end of
>> week.  We'll see.
>>
>> On Sep 29, 2016 6:32 AM, "Ellison Anne Williams" <ea...@apache.org>
>> wrote:
>>
>>> I would say that the DistributedTestSuite is ripe for refactor - it grew
>>> pretty organically and hasn't had a good refactoring yet. ;)
>>>
>>> What you suggested sounds reasonable - look forward to seeing the code.
>>>
>>> On Thu, Sep 29, 2016 at 4:48 AM, Tim Ellison <t....@gmail.com>
>>> wrote:
>>>
>>>> On 29/09/16 01:50, Darin Johnson wrote:
>>>>> Hey guys, started working on refactoring into submodules and noticed a
>>>> few
>>>>> things.
>>>>>
>>>>> First the class DistributedTestSuite in org.apache.pirk.tests package
>>> is
>>>>> directly calling SparkLauncher, which can be avoided using platform
>>>>> (already a TODO). This will be needed to break the spark responder
>>> out.
>>>> No
>>>>> issues here' expected it.
>>>>>
>>>>> Also noticed that DistributedTestSuite calls
>>>> BaseTests.testDNSHostnameQuery
>>>>> which in turn calls the static method performQuery in
>>>>> DistTestSuite.performQuery so we've got a cyclic dependency.  I'm
>>>> thinking
>>>>> this should be refactored, which shouldn't be hard - most of the
>>> method
>>>> are
>>>>> static.
>>>>>
>>>>> I figured I should solicit some thoughts on the best approach to going
>>>>> about this before jumping in.  Comments welcome, especially as I
>>> haven't
>>>>> written anything yet :).
>>>>>
>>>>> My initial thoughts are the DistTestSuite should see what distributed
>>>>> frameworks are available (via plugin) and then run those tests.  This
>>>> will
>>>>> likely require a setup helper for each framework though - the
>>>> DistTestSuite
>>>>> might then need to skip frameworks where the setup helper isn't found.
>>>> If
>>>>> this seems to be the approach people would like I'll start stubbing it
>>>> out
>>>>> and get some opinions.
>>>>
>>>> Yep, sounds reasonable.  The distributed tests should discover what
>>>> platforms are available and run on them, perhaps printing a warning if
>>>> an expected project platform is not found.
>>>>
>>>> Looking forward to seeing the code.
>>>>
>>>> Regards,
>>>> Tim
>>>>
>>>>
>>>
>>
>

Re: Some Obversations while refactoring into submodules

Posted by Darin Johnson <db...@gmail.com>.

Spent some time working on this tonight.  Currently it's here:
https://github.com/DarinJ/incubator-pirk/tree/submodules if anyone wants to
provide a sanity check on how I've modified the ResponderPlugin to handle
the distributed tests (Look at the ResponderPlugin and maybe
SparkResponder).  Here we added a hasDistributedTest and
runDistributedTest, the reason for hasDistributedTest is that Storm didn't
have a test in BaseTests.

The basic idea is to provide the Plugins the ability to setup their own
distributed test and provide BaseTests back with the list of objects.  I
moved performQuery from DistTestSuite to BaseTests and made it private,
this got rid of a circular dependency.  There's a few things I don't
particularly like the setup necessarily handled in two functions and
there's a lot of file reads/writes.  I don't think this can be helped due
to the nature of the tests.  I'm also torn on calling main, using
ProcessBuilder, each frameworks special pragmatic launcher.

Please note at present, you have to specify a test to run (i.e. -t 1:JS),
this is because not all responders are in all jars now (so some tests fail
saying plugin not found). That should be corrected tomorrow, I wouldn't
recommend going that far. I'll have the TestSuite look for available
platforms that have a distributed test and then run them.  I also plan to
start consolidating functions in DistTestSuite.java and BaseTests.java.

Also in BaseTests, it seems really awkward that Standalone is there but has
special performQuery methods and special results.  Is there a particular
reason for this?  It would be nice to make it so standalone was tested the
same way.

The other modules should be easier to break out.  I have bothered
optimizing maven yet.

Darin

On Thu, Sep 29, 2016 at 7:39 AM, Darin Johnson <db...@gmail.com>
wrote:

> Great, will do. Unfortunately, I don't think this will be done by end of
> week.  We'll see.
>
> On Sep 29, 2016 6:32 AM, "Ellison Anne Williams" <ea...@apache.org>
> wrote:
>
>> I would say that the DistributedTestSuite is ripe for refactor - it grew
>> pretty organically and hasn't had a good refactoring yet. ;)
>>
>> What you suggested sounds reasonable - look forward to seeing the code.
>>
>> On Thu, Sep 29, 2016 at 4:48 AM, Tim Ellison <t....@gmail.com>
>> wrote:
>>
>> > On 29/09/16 01:50, Darin Johnson wrote:
>> > > Hey guys, started working on refactoring into submodules and noticed a
>> > few
>> > > things.
>> > >
>> > > First the class DistributedTestSuite in org.apache.pirk.tests package
>> is
>> > > directly calling SparkLauncher, which can be avoided using platform
>> > > (already a TODO). This will be needed to break the spark responder
>> out.
>> > No
>> > > issues here' expected it.
>> > >
>> > > Also noticed that DistributedTestSuite calls
>> > BaseTests.testDNSHostnameQuery
>> > > which in turn calls the static method performQuery in
>> > > DistTestSuite.performQuery so we've got a cyclic dependency.  I'm
>> > thinking
>> > > this should be refactored, which shouldn't be hard - most of the
>> method
>> > are
>> > > static.
>> > >
>> > > I figured I should solicit some thoughts on the best approach to going
>> > > about this before jumping in.  Comments welcome, especially as I
>> haven't
>> > > written anything yet :).
>> > >
>> > > My initial thoughts are the DistTestSuite should see what distributed
>> > > frameworks are available (via plugin) and then run those tests.  This
>> > will
>> > > likely require a setup helper for each framework though - the
>> > DistTestSuite
>> > > might then need to skip frameworks where the setup helper isn't found.
>> > If
>> > > this seems to be the approach people would like I'll start stubbing it
>> > out
>> > > and get some opinions.
>> >
>> > Yep, sounds reasonable.  The distributed tests should discover what
>> > platforms are available and run on them, perhaps printing a warning if
>> > an expected project platform is not found.
>> >
>> > Looking forward to seeing the code.
>> >
>> > Regards,
>> > Tim
>> >
>> >
>>
>

Re: Some Obversations while refactoring into submodules

Posted by Darin Johnson <db...@gmail.com>.

Great, will do. Unfortunately, I don't think this will be done by end of
week.  We'll see.

On Sep 29, 2016 6:32 AM, "Ellison Anne Williams" <ea...@apache.org>
wrote:

> I would say that the DistributedTestSuite is ripe for refactor - it grew
> pretty organically and hasn't had a good refactoring yet. ;)
>
> What you suggested sounds reasonable - look forward to seeing the code.
>
> On Thu, Sep 29, 2016 at 4:48 AM, Tim Ellison <t....@gmail.com>
> wrote:
>
> > On 29/09/16 01:50, Darin Johnson wrote:
> > > Hey guys, started working on refactoring into submodules and noticed a
> > few
> > > things.
> > >
> > > First the class DistributedTestSuite in org.apache.pirk.tests package
> is
> > > directly calling SparkLauncher, which can be avoided using platform
> > > (already a TODO). This will be needed to break the spark responder out.
> > No
> > > issues here' expected it.
> > >
> > > Also noticed that DistributedTestSuite calls
> > BaseTests.testDNSHostnameQuery
> > > which in turn calls the static method performQuery in
> > > DistTestSuite.performQuery so we've got a cyclic dependency.  I'm
> > thinking
> > > this should be refactored, which shouldn't be hard - most of the method
> > are
> > > static.
> > >
> > > I figured I should solicit some thoughts on the best approach to going
> > > about this before jumping in.  Comments welcome, especially as I
> haven't
> > > written anything yet :).
> > >
> > > My initial thoughts are the DistTestSuite should see what distributed
> > > frameworks are available (via plugin) and then run those tests.  This
> > will
> > > likely require a setup helper for each framework though - the
> > DistTestSuite
> > > might then need to skip frameworks where the setup helper isn't found.
> > If
> > > this seems to be the approach people would like I'll start stubbing it
> > out
> > > and get some opinions.
> >
> > Yep, sounds reasonable.  The distributed tests should discover what
> > platforms are available and run on them, perhaps printing a warning if
> > an expected project platform is not found.
> >
> > Looking forward to seeing the code.
> >
> > Regards,
> > Tim
> >
> >
>

Re: Some Obversations while refactoring into submodules

Posted by Ellison Anne Williams <ea...@apache.org>.

I would say that the DistributedTestSuite is ripe for refactor - it grew
pretty organically and hasn't had a good refactoring yet. ;)

What you suggested sounds reasonable - look forward to seeing the code.

On Thu, Sep 29, 2016 at 4:48 AM, Tim Ellison <t....@gmail.com> wrote:

> On 29/09/16 01:50, Darin Johnson wrote:
> > Hey guys, started working on refactoring into submodules and noticed a
> few
> > things.
> >
> > First the class DistributedTestSuite in org.apache.pirk.tests package is
> > directly calling SparkLauncher, which can be avoided using platform
> > (already a TODO). This will be needed to break the spark responder out.
> No
> > issues here' expected it.
> >
> > Also noticed that DistributedTestSuite calls
> BaseTests.testDNSHostnameQuery
> > which in turn calls the static method performQuery in
> > DistTestSuite.performQuery so we've got a cyclic dependency.  I'm
> thinking
> > this should be refactored, which shouldn't be hard - most of the method
> are
> > static.
> >
> > I figured I should solicit some thoughts on the best approach to going
> > about this before jumping in.  Comments welcome, especially as I haven't
> > written anything yet :).
> >
> > My initial thoughts are the DistTestSuite should see what distributed
> > frameworks are available (via plugin) and then run those tests.  This
> will
> > likely require a setup helper for each framework though - the
> DistTestSuite
> > might then need to skip frameworks where the setup helper isn't found.
> If
> > this seems to be the approach people would like I'll start stubbing it
> out
> > and get some opinions.
>
> Yep, sounds reasonable.  The distributed tests should discover what
> platforms are available and run on them, perhaps printing a warning if
> an expected project platform is not found.
>
> Looking forward to seeing the code.
>
> Regards,
> Tim
>
>

Re: Some Obversations while refactoring into submodules

Posted by Tim Ellison <t....@gmail.com>.

On 29/09/16 01:50, Darin Johnson wrote:
> Hey guys, started working on refactoring into submodules and noticed a few
> things.
> 
> First the class DistributedTestSuite in org.apache.pirk.tests package is
> directly calling SparkLauncher, which can be avoided using platform
> (already a TODO). This will be needed to break the spark responder out.  No
> issues here' expected it.
> 
> Also noticed that DistributedTestSuite calls BaseTests.testDNSHostnameQuery
> which in turn calls the static method performQuery in
> DistTestSuite.performQuery so we've got a cyclic dependency.  I'm thinking
> this should be refactored, which shouldn't be hard - most of the method are
> static.
> 
> I figured I should solicit some thoughts on the best approach to going
> about this before jumping in.  Comments welcome, especially as I haven't
> written anything yet :).
> 
> My initial thoughts are the DistTestSuite should see what distributed
> frameworks are available (via plugin) and then run those tests.  This will
> likely require a setup helper for each framework though - the DistTestSuite
> might then need to skip frameworks where the setup helper isn't found.  If
> this seems to be the approach people would like I'll start stubbing it out
> and get some opinions.

Yep, sounds reasonable.  The distributed tests should discover what
platforms are available and run on them, perhaps printing a warning if
an expected project platform is not found.

Looking forward to seeing the code.

Regards,
Tim