Posted to dev@hbase.apache.org by Apekshit Sharma <ap...@cloudera.com> on 2017/12/12 00:07:57 UTC

Suggestion to speed up precommit - Reduce versions in Hadoop check

Hi

+1 hadoopcheck 52m 1s Patch does not cause any errors with Hadoop 2.6.1
2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 or 3.0.0-alpha4.

Almost an hour to check against 10 versions, and it's only going to increase
as more 2.6.x, 2.7.x, and 3.0.x releases come out.

The suggestion here is simple: let's check against only the latest maintenance
release for each minor version, i.e. 2.6.5, 2.7.4, and 3.0.0-alpha4.
Advantage: saves ~40 minutes of pre-commit time.
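
For concreteness, here is a minimal sketch of the "latest maintenance release
per minor line" filtering - illustrative only, with a made-up ALL_VERSIONS
variable rather than the actual precommit configuration, and assuming GNU sort
(for -V):

#!/bin/bash
# Sketch: reduce the full Hadoop version list to the newest maintenance
# release of each minor line (2.6.x, 2.7.x, 3.0.x).
ALL_VERSIONS="2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 2.7.4 3.0.0-alpha4"

printf '%s\n' ${ALL_VERSIONS} \
  | sort -V \
  | awk -F. '{latest[$1"."$2] = $0} END {for (m in latest) print latest[m]}' \
  | sort -V
# prints 2.6.5, 2.7.4 and 3.0.0-alpha4, one per line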

Justification:
- We only do compile checks, and maintenance releases are not supposed to
make API-breaking changes, so checking against the latest maintenance release
of each minor version should be enough.
- We rarely see a hadoopcheck -1, and the most recent ones have been due to
3.0. Those will still be caught.
- Nightlies can still check against all Hadoop versions (since nightlies are
supposed to do holistic testing).
- Analyzing the 201 precommits from 10100 (11/29) to 10300 (12/8) [10 days]:
  138 had a +1 hadoopcheck
   15 had a -1 hadoopcheck
  (the remaining 48 probably failed even before that - merge issues, etc.)


Spot checking some failures:
[10241, 10246, 10225, 10269, 10151, 10156, 10184, 10250, 10298, 10227, 10294,
10223, 10251, 10119, 10230]

10241: All 2.6.x failed. Others didn't run
10246: All 10 versions failed.
10184: All 2.6.x and 2.7.x failed. Others didn't run
10223: All 10 versions failed
10230: All 2.6.x failed. Others didn't run

The common pattern: all the maintenance versions of a minor line fail together.
(Not sure why 2.7.* is sometimes not reported when 2.6.* fails, but that's
irrelevant to this discussion.)

What do you say - check only the latest maintenance releases in precommit (and
let nightlies do holistic testing against all versions)?

-- Appy

Re: Suggestion to speed up precommit - Reduce versions in Hadoop check

Posted by Ted Yu <yu...@gmail.com>.
bq. check against only the latest maintenance release for each minor
version i.e. 2.6.5, 2.7.4 and 3.0.0-alpha4

Makes sense.

For Hadoop 3, we can build against 3.0.0-beta1.

Cheers


Re: Suggestion to speed up precommit - Reduce versions in Hadoop check

Posted by Apekshit Sharma <ap...@cloudera.com>.
https://issues.apache.org/jira/browse/HBASE-19489

On Mon, Dec 11, 2017 at 4:30 PM, Josh Elser <el...@apache.org> wrote:

> +1


-- 

-- Appy

Re: Suggestion to speed up precommit - Reduce versions in Hadoop check

Posted by Josh Elser <el...@apache.org>.
+1


Re: Suggestion to speed up precommit - Reduce versions in Hadoop check

Posted by Apekshit Sharma <ap...@cloudera.com>.
Oh, btw, here's the little piece of code if anyone wants to analyze more.

Script to collect the precommit runs' console text:

#!/bin/bash

for i in `seq 10100 10300`; do
  wget -a log -O ${i} https://builds.apache.org/job/PreCommit-HBASE-Build/${i}/consoleText
done

Number of failed runs:
grep "|  -1  |    hadoopcheck" `ls 1*` | awk '{x[$1] = 1} END{for (i in x) print i;}' | wc -l

Number of passed runs:
grep "|  +1  |    hadoopcheck" `ls 1*` | awk '{x[$1] = 1} END{for (i in x) print i;}' | wc -l
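
A possible follow-up, sketch only: print each run's hadoopcheck verdict line so
individual builds can be spot-checked quickly. It assumes the downloaded logs
contain the same "|  +1  |    hadoopcheck" / "|  -1  |    hadoopcheck" table
rows that the greps above match, and GNU grep (for -m1):

for f in `ls 1*`; do
  # First line matching the hadoopcheck row, whether +1 or -1.
  verdict=$(grep -m1 -E "\|  [+-]1  \|    hadoopcheck" "${f}")
  if [ -n "${verdict}" ]; then
    echo "${f}: ${verdict}"
  fi
done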

-- Appy

