Posted to dev@drill.apache.org by Hanifi Gunes <hg...@maprtech.com> on 2015/04/07 02:06:14 UTC

Regarding unit test failures -- patterns & possible fixes

Hey devs,


I have been dealing with sporadic unit test failures since last week. I'd
like to share my findings on the failing unit tests and outline the
patterns causing them.

The general advice I was given was to run the tests with a lower fork count
(the default is 4). This sounds OK as a way to get around the problem. If
that is your intent, feel free to give it a try. However, I wanted to
investigate further, so I went ahead and set the fork count to 8, the
number of cores my testing machine has.

The first set of failures manifested around TestUnionAll &
TestExampleQueries. After Hakeem's pointer, I noticed that some test cases
in TestUnionAll & TestExampleQueries create & drop views with the same
name. In a concurrent setting (fork count > 1) with no strict ordering
among test cases, this wreaks havoc, throwing arbitrary errors at each
test run. https://issues.apache.org/jira/browse/DRILL-2684 takes care of
this. After applying this fix, I can now run the java-exec tests cleanly,
faster & consistently.

Then my hive tests started failing, since multiple test classes use the
same metadata folders and clean up afterwards without caring whether other
hive tests are still running. I have a patch to unique-ify the metadata
folders per test class. https://issues.apache.org/jira/browse/DRILL-2685
tracks this; however, the patch is not public yet.
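
To illustrate the per-class folder idea, here is a minimal Java sketch. It
is not taken from the DRILL-2685 patch; the class PerClassTestDirs and its
methods are hypothetical names, and it assumes a plain temp directory is an
acceptable stand-in for a metadata folder. Each call creates a fresh
directory whose name starts with the test class name, so one class cleaning
up cannot remove files another class is still using.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not part of Drill.
final class PerClassTestDirs {
    private PerClassTestDirs() {}

    /** Creates a new, uniquely named temp directory prefixed with the test class name. */
    static Path createFor(Class<?> testClass) {
        try {
            return Files.createTempDirectory(testClass.getSimpleName() + "-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes the directory and everything below it, children before parents. */
    static void deleteRecursively(Path root) {
        try (Stream<Path> walk = Files.walk(root)) {
            walk.sorted(Comparator.reverseOrder()).forEach(path -> {
                try {
                    Files.delete(path);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = createFor(PerClassTestDirs.class);
        System.out.println("metadata folder for this class: " + dir);
        deleteRecursively(dir);
    }
}
```

A test class would call createFor once in its setup and deleteRecursively
in its teardown, touching only its own folder.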

The last problem relates to hive-related tests living under the jdbc
module. I know that there has been an ongoing effort to migrate these tests
from the jdbc module to where they belong. I am not sure about the
timeframe for the migration, but I would think the sooner the better.

To avoid further failures, I would strongly recommend that devs use view
names and external resources that are unique across test cases. One idea is
to suffix a view name with the test case/class name; for instance, prefer
names_view_test_union_all over a more generic names_view. Also, with an
increasing number of test cases being checked in, it takes more and more
time to complete a test run. We should consider concurrent test runs a
legitimate everyday use case and design tests accordingly.
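
The naming convention above could be automated with something like the
following sketch. UniqueViewNames and viewNameFor are hypothetical names,
not existing Drill utilities; the sketch simply converts a test class name
to snake_case and appends it to the base view name.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Drill.
final class UniqueViewNames {
    private UniqueViewNames() {}

    /** Appends the snake_cased test class name to a base view name. */
    static String viewNameFor(String baseName, String testClassName) {
        String snake = testClassName
                // Put an underscore at every lower-to-upper case boundary.
                .replaceAll("([a-z0-9])([A-Z])", "$1_$2")
                .toLowerCase(Locale.ROOT);
        return baseName + "_" + snake;
    }

    /** Convenience overload that takes the test class itself. */
    static String viewNameFor(String baseName, Class<?> testClass) {
        return viewNameFor(baseName, testClass.getSimpleName());
    }

    public static void main(String[] args) {
        // Prints names_view_test_union_all
        System.out.println(viewNameFor("names_view", "TestUnionAll"));
    }
}
```

Deriving the suffix from the class keeps the name stable across runs, which
makes leftovers easy to trace back to the test that created them.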


I would be interested in hearing about other unit test failure patterns so
as to stay alert. Feel free to share any you have discovered.


Regards.
-Hanifi

Re: Regarding unit test failures -- patterns & possible fixes

Posted by Ted Dunning <te...@gmail.com>.
My mac knows it has 2 physical and 4 logical cores:

ted:spy$ sysctl hw | grep cpu
hw.ncpu: 4
hw.activecpu: 4
hw.physicalcpu: 2
hw.physicalcpu_max: 2
hw.logicalcpu: 4
hw.logicalcpu_max: 4
hw.cputype: 7
hw.cpusubtype: 4
hw.cpu64bit_capable: 1
hw.cpufamily: 526772277
hw.cpufrequency: 2900000000
hw.cpufrequency_min: 2900000000
hw.cpufrequency_max: 2900000000
hw.cputhreadtype: 1
hw.ncpu = 4
hw.cpufrequency = 2900000000
hw.availcpu = 4



Re: Regarding unit test failures -- patterns & possible fixes

Posted by Hanifi Gunes <hg...@maprtech.com>.
I have just seen Jacques' comment under the patch. I am not sure whether we
should set forks to a fraction of available cores. Any ideas?



Re: Regarding unit test failures -- patterns & possible fixes

Posted by Hanifi Gunes <hg...@maprtech.com>.
@Ted, I actually have a patch pending review for this at
https://issues.apache.org/jira/browse/DRILL-2039. However, @Hakeem you may
still end up in a troubling situation as your mac might have 4 logical
cores =)

Running sysctl hw | grep cpu should tell you.


-Hanifi



Re: Regarding unit test failures -- patterns & possible fixes

Posted by Ted Dunning <te...@gmail.com>.
Maven allows the forkCount to be set in terms of the number of cores on the
machine by appending a C to the number.

(I think)
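
For reference, the Maven Surefire plugin documents that a forkCount such as
1C or 0.5C is multiplied by the number of available CPU cores. The sketch
below shows how such a value could resolve to a number of forks. ForkCounts
and resolve are hypothetical names, and the rounding rule (truncate, but
keep at least one fork for a positive request) is an assumption rather than
a statement of the plugin's exact behaviour. The JVM's
Runtime.availableProcessors() typically counts logical cores, so a machine
with 2 physical and 4 logical cores would likely resolve 1C to 4.

```java
// Hypothetical helper for illustration only; not the Surefire implementation.
final class ForkCounts {
    private ForkCounts() {}

    /** Resolves a plain count such as "4" or a core multiplier such as "0.5C". */
    static int resolve(String forkCount, int availableCores) {
        String trimmed = forkCount.trim();
        if (trimmed.endsWith("C")) {
            double multiplier =
                    Double.parseDouble(trimmed.substring(0, trimmed.length() - 1));
            double product = multiplier * availableCores;
            // Assumption: truncate, but never round a positive request down to zero.
            return product > 0 ? Math.max((int) product, 1) : 0;
        }
        return Integer.parseInt(trimmed);
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores seen by the JVM: " + cores);
        System.out.println("0.5C resolves to " + resolve("0.5C", cores) + " forks");
    }
}
```

A fractional multiplier like 0.5C is one way to express "a fraction of the
available cores" without hard-coding a number per machine.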




Re: Regarding unit test failures -- patterns & possible fixes

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
From my personal experience, using a forkCount of 4 makes some tests time
out, not because of concurrency problems but just because my machine is
not powerful enough to run that many tests in parallel.




-- 

Abdelhakim Deneche

Software Engineer
