Posted to dev@hbase.apache.org by Andrew Purtell <ap...@apache.org> on 2019/04/03 23:44:06 UTC

The fourth HBase 1.5.0 release candidate (RC3) is available

The fourth HBase 1.5.0 release candidate (RC3) is available for download at
https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
artifacts are available in the temporary repository
https://repository.apache.org/content/repositories/orgapachehbase-1292/

The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).

A detailed source and binary compatibility report for this release is
available for your review at
https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
.

A list of the 115 issues resolved in this release can be found at
https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
changelog of the last branch-1.4 release, 1.4.9.

Please try out the candidate and vote +1/0/-1.

The vote will be open for at least 72 hours. Unless there is an objection, I
will try to close it Friday April 12, 2019 if we have sufficient votes.

Prior to making this announcement I made the following preflight checks:

    RAT check passes (7u80)
    Unit test suite passes (7u80, 8u181)*
    Opened the UI in a browser, poked around
    LTT load 100M rows with 100% verification and 20% updates (8u181)
    ITBLL 1B rows with slowDeterministic monkey (8u181)
    ITBLL 1B rows with serverKilling monkey (8u181)

There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
tests do not represent serious test failures that would prevent a release.


-- 
Best regards,
Andrew

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
One more clarification: I meant before the FORMAL release announcement,
not the RC announcement, and it was not a complaint.

Best Regards,
Yu


On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <an...@gmail.com>
wrote:

> “However it's good to find the issue earlier if there
> really is any, before release announced.”
>
> I run the complete unit test suite before announcing a release candidate.
> Just to be clear.
>
> Totally agree we should get these problems sorted before an actual
> release. My policy is to cancel a RC if anyone vetoes for this reason...
> want as much coverage and varying environments as we can manage.
>
> Thank you for your help so far and I hope the failures you see result in
> analysis and fixes that lead to better test stability.
>
> > On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> >
> > Confirmed that with the 1.4.7 source the listed cases passed (all are in the
> > 1st part of hbase-server, so the result comes out quickly)... Also confirmed
> > the test run order is the same...
> >
> > Will try 1.5.0 again to rule out environment differences caused by time.
> > If 1.5.0 still fails, I will start a git bisect to locate the first
> > bad commit.
> >
> > I was also expecting an easy pass and +1 as always to save time and effort,
> > but obviously no luck. However it's good to find the issue earlier if there
> > really is any, before release announced.
> >
> > Best Regards,
> > Yu
> >
> >
> >> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> >>
> >> Fine, let's focus on verifying whether it's a real problem rather than
> >> arguing about wording; after all, that's not my intention...
> >>
> >> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> >> using the same env and all tests passed w/o issue. That's where my concern
> >> lies and the main reason I gave a -1 vote. I'm running against the 1.4.7
> >> source on the same env now; let's see the result.
> >>
> >> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
> >> wrote:
> >>
> >>> I believe the test execution order matters. We run some tests in
> >>> parallel. The ordering of tests is determined by readdir() results, and
> >>> this differs from host to host and checkout to checkout. So when you see a
> >>> repeatable group of failures, that’s great. And when someone else doesn’t
> >>> see those same tests fail, or they cannot be reproduced when running by
> >>> themselves, the commonly accepted term of art for this is “flaky”.
> >>>
> >>>
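The point about readdir() ordering can be illustrated with a small sketch. It is not how Surefire or the HBase build discovers tests; it only shows that a raw directory listing carries no ordering guarantee, so a runner that wants a reproducible order has to sort explicitly. The helper name and file names are hypothetical.

```python
import os
import tempfile

def discover_tests(directory, deterministic=False):
    """Return test file names found in `directory`.

    os.listdir() returns entries in arbitrary, filesystem-dependent order,
    much like readdir(); sorting is what makes the order reproducible.
    """
    names = [n for n in os.listdir(directory) if n.startswith("Test")]
    return sorted(names) if deterministic else names

with tempfile.TemporaryDirectory() as d:
    # Hypothetical test class files, created in non-alphabetical order.
    for name in ["TestStoreFile.java", "TestBulkLoad.java", "TestHFile.java"]:
        open(os.path.join(d, name), "w").close()
    raw_order = discover_tests(d)
    stable_order = discover_tests(d, deterministic=True)
    print("raw listing:   ", raw_order)     # order may differ between hosts
    print("sorted listing:", stable_order)  # always alphabetical
```

If two tests interfere with each other only when they run in a particular sequence, a host whose listing happens to produce that sequence will fail repeatably while another host never does.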
> >>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>
> >>>> Sorry but I'd call it "possible environment related problem" or "some
> >>>> feature may not work well in specific environment", rather than flaky.
> >>>>
> >>>> Will check against the 1.4.7 released source package before opening any
> >>>> JIRA.
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
> andrew.purtell@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> And if they pass in my environment, what should we call it then? I
> >>>>> have no doubt you are seeing failures. Therefore, can you please file
> >>>>> JIRAs and attach information that can help identify a fix? Thanks.
> >>>>>
> >>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>
> >>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> >>>>>> on two different envs separately, so it sums up to 6 stable failures
> >>>>>> for each case, and from my perspective this is not flaky.
> >>>>>>
> >>>>>> IIRC no such issue was observed last time when verifying 1.4.7 on the
> >>>>>> same env; will double check.
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Yu
> >>>>>>
> >>>>>>
> >>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
> >>> andrew.purtell@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> There are two failure cases, it looks like. And these look like flakes.
> >>>>>>>
> >>>>>>> The wrong FS assertions are not something I see when I run these tests
> >>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
> >>>>>>> suggest is, since you can reproduce it, do a git bisect to find the commit
> >>>>>>> that introduced the problem. Then we can revert it. As an alternative we
> >>>>>>> can open a JIRA, report the problem, temporarily @Ignore the test, and
> >>>>>>> continue. This latter option should only be done if we are fairly
> >>>>>>> confident it is a test-only problem.
> >>>>>>>
> >>>>>>> The connect exceptions are interesting. I see these sometimes when the
> >>>>>>> suite is executed, not this particular case, but when the failed test is
> >>>>>>> executed by itself it always passes. It is possible that some change to
> >>>>>>> classes related to the minicluster or startup or shutdown timing is the
> >>>>>>> cause, but it is test-time flaky behavior. I’m not happy about this but it
> >>>>>>> doesn’t actually fail the release because the failure is never repeatable
> >>>>>>> when the test is run standalone.
> >>>>>>>
> >>>>>>> In general it would be great if some attention was paid to test
> >>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> >>>>>>> everything is perfect or there will never be another 1.x release,
> >>>>>>> certainly not from branch-1. So, tests which fail repeatedly block a
> >>>>>>> release IMHO but flakes do not.
> >>>>>>>
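The bisect suggestion above amounts to a binary search over the commit history. A minimal sketch of that idea follows, assuming a linear history in which every commit after the first bad one is also bad. The function name, the numbered commits, and the `is_bad` predicate are illustrative; in practice the predicate would be a run of the failing tests at that commit.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an ordered list of commits for the first bad one.

    Assumes commits[0] is known good, commits[-1] is known bad, and that
    every commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    steps = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        steps += 1
        if is_bad(commits[mid]):
            bad = mid   # the culprit is at mid or earlier
        else:
            good = mid  # the culprit is after mid
    return commits[bad], steps

# Hypothetical history of 100 commits where commit 80 introduced the failure.
history = list(range(100))
culprit, steps = first_bad_commit(history, lambda c: c >= 80)
print(culprit, steps)
```

Only about log2(N) test runs are needed, which is what makes bisecting practical even when each run of the suite is slow. With real git, the equivalent workflow uses `git bisect start`, `git bisect good <rev>`, `git bisect bad <rev>`, and optionally `git bisect run <script>`.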
> >>>>>>>
> >>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> -1
> >>>>>>>>
> >>>>>>>> Observed many UT failures when checking the source package (tried
> >>>>>>>> multiple rounds on two different environments, macOS and Linux, and
> >>>>>>>> got the same result), including (but not limited to):
> >>>>>>>>
> >>>>>>>> TestBulkLoad:
> >>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
> >>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
> >>>>>>>> expected: hdfs://localhost:55938
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> >>>>>>>>
> >>>>>>>> TestStoreFile:
> >>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
> >>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> >>>>>>>>
> >>>>>>>> TestHFile:
> >>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
> >>>>>>>> Time elapsed: 0.08 s  <<< ERROR!
> >>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>> Caused by: java.net.ConnectException: Connection refused
> >>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>>
> >>>>>>>> TestBlocksScanned:
> >>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
> >>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
> >>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
> >>>>>>>> expected: file:///
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> >>>>>>>>
> >>>>>>>> And please let me know if this is a known issue I'm not aware of.
> >>>>>>>> Thanks.
> >>>>>>>>
> >>>>>>>> Best Regards,
> >>>>>>>> Yu
> >>>>>>>>
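The two "Wrong FS" traces in the report above show a path from one filesystem being handed to a filesystem object bound to another (a file: path where hdfs://localhost:55938 was expected, and the reverse). The sketch below is a simplified stand-in for that kind of scheme and authority check, not Hadoop's actual code. The function name and the example paths are assumptions made for illustration.

```python
from urllib.parse import urlparse

def check_path(fs_uri, path):
    """Raise if `path` names a different filesystem than `fs_uri`.

    Simplified illustration: a path without a scheme is treated as
    relative to fs_uri and accepted; otherwise the scheme and authority
    must match.
    """
    fs, p = urlparse(fs_uri), urlparse(path)
    if not p.scheme:
        return
    if (p.scheme.lower(), p.netloc.lower()) != (fs.scheme.lower(), fs.netloc.lower()):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")

check_path("hdfs://localhost:55938", "/data/default/t1")  # accepted
try:
    # Hypothetical local path passed to an HDFS-bound filesystem.
    check_path("hdfs://localhost:55938", "file:/var/folders/tmp/data")
except ValueError as e:
    print(e)
```

Seen this way, the errors would be consistent with a test picking up a default filesystem setting left behind by another test, which fits the later discussion about test interaction, though the thread itself does not confirm that as the cause.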
> >>>>>>>>
> >>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to the
> >>>>>>>>> Qingming Festival holiday here in China)
> >>>>>>>>>
> >>>>>>>>> Still verifying the release, just some quick feedback: observed some
> >>>>>>>>> incompatible changes in the compatibility report, including
> >>>>>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
> >>>>>>>>>
> >>>>>>>>> Unrelated but noticeable: the 1.4.9 release note URL is invalid on
> >>>>>>>>> https://hbase.apache.org/downloads.html
> >>>>>>>>>
> >>>>>>>>> Best Regards,
> >>>>>>>>> Yu
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
> apurtell@apache.org>
> >>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. There
> >>>>>>>>>> are small differences in workloads D and F (slightly worse) and workload
> >>>>>>>>>> E (slightly better) that do not indicate a serious regression.
> >>>>>>>>>>
> >>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> >>>>>>>>>> c3.8xlarge x 5
> >>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> >>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> >>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> >>>>>>>>>> Hadoop 2.9.2
> >>>>>>>>>> Init: Load 100 M rows and snapshot
> >>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> >>>>>>>>>> Args: -threads 100 -target 50000
> >>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
> >>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
> >>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
> >>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
> >>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload A
> >>>>>>>>>>
> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> >>>>>>>>>> [READ], AverageLatency(us) 544 559
> >>>>>>>>>> [READ], MinLatency(us) 267 292
> >>>>>>>>>> [READ], MaxLatency(us) 165631 185087
> >>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
> >>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
> >>>>>>>>>> [UPDATE], MinLatency(us) 702 646
> >>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload B
> >>>>>>>>>>
> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> >>>>>>>>>> [READ], AverageLatency(us) 454 471
> >>>>>>>>>> [READ], MinLatency(us) 203 213
> >>>>>>>>>> [READ], MaxLatency(us) 183423 174207
> >>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
> >>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
> >>>>>>>>>> [UPDATE], MinLatency(us) 746 726
> >>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload C
> >>>>>>>>>>
> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> >>>>>>>>>> [READ], AverageLatency(us) 332 327
> >>>>>>>>>> [READ], MinLatency(us) 175 179
> >>>>>>>>>> [READ], MaxLatency(us) 210559 170367
> >>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
> >>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload D
> >>>>>>>>>>
> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> >>>>>>>>>> [READ], AverageLatency(us) 487 547
> >>>>>>>>>> [READ], MinLatency(us) 210 214
> >>>>>>>>>> [READ], MaxLatency(us) 192255 177535
> >>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
> >>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
> >>>>>>>>>> [INSERT], MinLatency(us) 807 788
> >>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
> >>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> >>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload E
> >>>>>>>>>>
> >>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> >>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
> >>>>>>>>>> [SCAN], MinLatency(us) 696 678
> >>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
> >>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> >>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> >>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
> >>>>>>>>>> [INSERT], MinLatency(us) 887 815
> >>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
> >>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> >>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> >>>>>>>>>>
> >>>>>>>>>> YCSB Workload F
> >>>>>>>>>>
> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> >>>>>>>>>> [READ], AverageLatency(us) 856 1137
> >>>>>>>>>> [READ], MinLatency(us) 262 257
> >>>>>>>>>> [READ], MaxLatency(us) 205567 222335
> >>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
> >>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
> >>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> >>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> >>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> >>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> >>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> >>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
> >>>>>>>>>> [UPDATE], MinLatency(us) 737 687
> >>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> >>>>>>>>>>
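To put the "slightly worse" and "slightly better" readings of the tables above into numbers, here is a short sketch that computes the relative change between the 1.4.9 and 1.5.0 columns for three rows copied from those tables. The helper name and the choice of rows are illustrative; the thread does not define a threshold for what counts as noise.

```python
def pct_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# (metric, 1.4.9, 1.5.0), values copied from the YCSB tables above.
rows = [
    ("F OVERALL Throughput(ops/sec)", 49859, 48976),
    ("D READ AverageLatency(us)", 487, 547),
    ("E SCAN AverageLatency(us)", 3548, 2687),
]
for metric, old, new in rows:
    print(f"{metric:32s} {pct_change(old, new):+7.2f}%")
```

For throughput a negative change is a slowdown, while for latency a negative change is an improvement, so the sign has to be read per metric.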
> >>>>>>>>>>
> >>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the efforts boss.
> >>>>>>>>>>>
> >>>>>>>>>>> Since it's a new minor release, do we have a performance comparison
> >>>>>>>>>>> report with 1.4.9 as we did when releasing 1.4.0? If so, any reference?
> >>>>>>>>>>> Many thanks!
> >>>>>>>>>>>
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Yu
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Best regards,
> >>>>>>>>>> Andrew
> >>>>>>>>>>
> >>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
> >>>>> truth's
> >>>>>>>>>> decrepit hands
> >>>>>>>>>> - A23, Crosstalk
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <ap...@apache.org>.
These commits improve results on my end FWIW:

commit 539de1cae922e6ce498993b9f5409f5edb90d382 (HEAD -> branch-1, asf/branch-1)
Author: Wellington Chevreuil <we...@gmail.com>
Date:   Wed Apr 17 18:54:34 2019 -0700
    HBASE-21959 - CompactionTool should close the store it uses for
    compacting files, in order to properly archive compacted files.
    Reapply without unit test
    Change-Id: If852529e79274a77eb08cac13936f02776232608
    Signed-off-by: Xu Cang <xu...@apache.org>
    Amending-Author: Andrew Purtell <ap...@apache.org>

commit 46e0e880561150a6362540ca161e7ecf1539ea02
Author: Andrew Purtell <ap...@apache.org>
Date:   Wed Apr 17 18:54:34 2019 -0700
    Revert "HBASE-21959 - CompactionTool should close the store it uses for
    compacting files, in order to properly archive compacted files."
    This reverts commit c1a64aaa1a75abd0a89209c317a3fecd81853fe6.


On Wed, Apr 17, 2019 at 10:38 AM Andrew Purtell <ap...@apache.org> wrote:

> I'm testing a change that keeps the change to CompactionTool but drops the
> unit test. Will let you know how it goes.
>
>
> On Wed, Apr 17, 2019 at 10:28 AM Xu Cang <xc...@salesforce.com.invalid>
> wrote:
>
>> I just saw this email, Andrew. Should I re-open HBASE-21959? And revert it
>> before we understand/fix why it caused the test failure?
>> Regarding the failing test, do you mean this one "TestBlocksRead"?
>> Thanks,
>>
>> Xu
>>
>> On Tue, Apr 16, 2019 at 9:47 PM Andrew Purtell <an...@gmail.com>
>> wrote:
>>
>> > I've bisected twice and it lands on this commit:
>> >
>> > commit 6bc46bb10920c1c335b784b01d2a326db1a3d587 (HEAD, refs/bisect/bad)
>> >     HBASE-21959 CompactionTool should close the store it uses for
>> >     compacting files, in order to properly archive compacted files.
>> >
>> > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java |   2 ++
>> > hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionTool.java | 100
>> >
>> > At first glance it's hard to see how this change is relevant, but it does
>> > introduce a new unit test.
>> >
>> >
>> > On Tue, Apr 16, 2019 at 7:48 PM Andrew Purtell <
>> andrew.purtell@gmail.com>
>> > wrote:
>> >
>> > > I’ve been able to reproduce it sometimes too and am bisecting. It may be
>> > > an interaction between test cases, not a failure per se, but it does seem
>> > > to have a recent cause, as you pointed out. I’ll be looking at it.
>> > >
>> > > Thank you for your kind consideration and for revoking your veto.
>> > >
>> > > A coprocessor API fix was just committed to branch-1 so I want to roll a
>> > > new RC soon to include it. There is also an issue open to improve the
>> > > behavior of the UI when the profiler link is clicked but system support
>> > > is not available.
>> > >
>> > > > On Apr 16, 2019, at 7:40 PM, Yu Li <ca...@gmail.com> wrote:
>> > > >
>> > > > After more investigation, the ConnectionRefused exception could be
>> > > > reproduced with "mvn -Dtest=<case_name> test" after a complete run of
>> > > > all cases through "mvn -PrunAllTests clean test", but not with a clean
>> > > > standalone run (with "mvn *clean* test"). So now I'm more convinced
>> > > > it's some kind of environment chaos caused by parallel execution of
>> > > > test cases, and not a blocker issue.
>> > > >
>> > > > @Andrew It seems to me that the kerby jar is not included in our binary
>> > > > package, so I'm not sure whether a new RC is required by HBASE-22219.
>> > > > Anyway I'd like to revoke my -1 vote now. Thanks.
>> > > >
>> > > > Best Regards,
>> > > > Yu
>> > > >
>> > > >
>> > > >> On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:
>> > > >>
>> > > >> Sorry for the late response due to job priority.
>> > > >>
>> > > >> This ConnectionRefused issue cannot be reproduced on my laptop (macOS
>> > > >> 10.14.4) but can be on the Linux env. I've checked and confirmed that
>> > > >> it passes with the 1.4.7/1.4.9 source packages but fails consistently
>> > > >> with 1.5.0. Performing a git bisect now, will report back later.
>> > > >>
>> > > >> Best Regards,
>> > > >> Yu
>> > > >>
>> > > >>
>> > > >> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <
>> > andrew.purtell@gmail.com>
>> > > >> wrote:
>> > > >>
>> > > >>> I also see the occasional ConnectionRefused errors. They don’t
>> > > >>> reproduce if you run the test standalone. I also only see them on a
>> > > >>> Linux dev host. That may be enough to find by bisect the commit that
>> > > >>> introduced this behavior. Working on it. There is a JIRA filed for this
>> > > >>> one. Search for “TestBlocksRead” and label “branch-1”.
>> > > >>>
>> > > >>> Thanks for the investigations.
>> > > >>>
>> > > >>>> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
>> > > >>>>
>> > > >>>> Quick updates:
>> > > >>>>
>> > > >>>> W/ the patch for HBASE-22219, i.e. upgrading the kerby version to 1.0.1,
>> > > >>>> the failures listed above in the 1st part of hbase-server disappeared.
>> > > >>>>
>> > > >>>> However, in the 2nd part of the hbase-server UTs there are still many
>> > > >>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
>> > > >>>> can be reproduced easily with the -Dtest=xxx command on my
>> > > >>>> environments. Still checking the root cause.
>> > > >>>>
>> > > >>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
>> > > >>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
>> > > >>>> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead) Time elapsed: 0.17 s  <<< ERROR!
>> > > >>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>> > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>> > > >>>> Caused by: java.net.ConnectException: Connection refused
>> > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>> > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>> > > >>>>
>> > > >>>> Best Regards,
>> > > >>>> Yu
>> > > >>>>
>> > > >>>>
>> > > >>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
>> > > >>>>>
>> > > >>>>> I have no doubt that you've run the tests locally before announcing a
>> > > >>>>> release as you're always a great RM boss. And this shows one value of
>> > > >>>>> verifying a release: different voters have different environments.
>> > > >>>>>
>> > > >>>>> Now I think the failures may be Kerberos related, since I may have
>> > > >>>>> changed some system configuration when doing Flink testing on this env
>> > > >>>>> weeks ago. Located one issue (HBASE-22219), which is also observed in
>> > > >>>>> 1.4.7; will further investigate.
>> > > >>>>>
>> > > >>>>> Best Regards,
>> > > >>>>> Yu
>> > > >>>>>
>> > > >>>>>
>> > > >>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <
>> > > andrew.purtell@gmail.com
>> > > >>>>
>> > > >>>>> wrote:
>> > > >>>>>
>> > > >>>>>> “However it's good to find the issue earlier if there
>> > > >>>>>> really is any, before release announced.”
>> > > >>>>>>
>> > > >>>>>> I run the complete unit test suite before announcing a release
>> > > >>> candidate.
>> > > >>>>>> Just to be clear.
>> > > >>>>>>
>> > > >>>>>> Totally agree we should get these problems sorted before an
>> actual
>> > > >>>>>> release. My policy is to cancel a RC if anyone vetoes for this
>> > > >>> reason...
>> > > >>>>>> want as much coverage and varying environments as we can
>> manage.
>> > > >>>>>>
>> > > >>>>>> Thank you for your help so far and I hope the failures you see
>> > > result
>> > > >>> in
>> > > >>>>>> analysis and fixes that lead to better test stability.
>> > > >>>>>>
>> > > >>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
>> > > >>>>>>>
>> > > >>>>>>> Confirmed in 1.4.7 source the listed out cases passed (all in
>> the
>> > > 1st
>> > > >>>>>> part
>> > > >>>>>>> of hbase-server so the result comes out quickly.)... Also
>> > confirmed
>> > > >>> the
>> > > >>>>>>> test ran order are the same...
>> > > >>>>>>>
>> > > >>>>>>> Will try 1.5.0 again to prevent the environment difference
>> caused
>> > > by
>> > > >>>>>> time.
>> > > >>>>>>> If 1.5.0 still fails, will start to do the git bisect to
>> locate
>> > the
>> > > >>>>>> first
>> > > >>>>>>> bad commit.
>> > > >>>>>>>
>> > > >>>>>>> Was also expecting an easy pass and +1 as always to save time
>> and
>> > > >>>>>> efforts,
>> > > >>>>>>> but obvious no luck. However it's good to find the issue
>> earlier
>> > if
>> > > >>>>>> there
>> > > >>>>>>> really is any, before release announced.
>> > > >>>>>>>
>> > > >>>>>>> Best Regards,
>> > > >>>>>>> Yu
>> > > >>>>>>>
>> > > >>>>>>>
>> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com>
>> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>> Fine, let's focus on verifying whether it's a real problem
>> > rather
>> > > >>> than
>> > > >>>>>>>> arguing about wording, after all that's not my intention...
>> > > >>>>>>>>
>> > > >>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and
>> > > IIRC I
>> > > >>>>>> was
>> > > >>>>>>>> using the same env and all tests passed w/o issue, that's
>> where
>> > my
>> > > >>>>>> concern
>> > > >>>>>>>> lies and the main reason I gave a -1 vote. I'm running
>> against
>> > > 1.4.7
>> > > >>>>>> source
>> > > >>>>>>>> on the same now and let's see the result.
>> > > >>>>>>>>
>> > > >>>>>>>> [1]
>> > > https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>> > > >>>>>>>>
>> > > >>>>>>>> Best Regards,
>> > > >>>>>>>> Yu
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <
>> > > >>> andrew.purtell@gmail.com
>> > > >>>>>>>
>> > > >>>>>>>> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>>> I believe the test execution order matters. We run some
>> tests
>> > in
>> > > >>>>>>>>> parallel. The ordering of tests is determined by readdir()
>> > > results
>> > > >>>>>> and this
>> > > >>>>>>>>> differs from host to host and checkout to checkout. So when
>> you
>> > > >>> see a
>> > > >>>>>>>>> repeatable group of failures, that’s great. And when someone
>> > else
>> > > >>>>>> doesn’t
>> > > >>>>>>>>> see those same tests fail, or they cannot be reproduced when
>> > > >>> running
>> > > >>>>>> by
>> > > >>>>>>>>> themselves, the commonly accepted term of art for this is
>> > > “flaky”.
>> > > >>>>>>>>>
>> > > >>>>>>>>>
>> > > >>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com>
>> wrote:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> Sorry but I'd call it "possible environment related
>> problem"
>> > or
>> > > >>> "some
>> > > >>>>>>>>>> feature may not work well in specific environment", rather
>> > than
>> > > a
>> > > >>>>>> flaky.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> Will check against 1.4.7 released source package before
>> > opening
>> > > >>> any
>> > > >>>>>>>>> JIRA.
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> Best Regards,
>> > > >>>>>>>>>> Yu
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
>> > > >>>>>> andrew.purtell@gmail.com>
>> > > >>>>>>>>>> wrote:
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>> And if they pass in my environment, what should we call it then? I
>> > > >>>>>>>>>>> have no doubt you are seeing failures. Can you therefore please file
>> > > >>>>>>>>>>> JIRAs and attach information that can help identify a fix? Thanks.
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com>
>> > wrote:
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>> > > >>>>>>>>>>>> on two different envs separately, so each case failed 6 times in a row
>> > > >>>>>>>>>>>> (3 attempts on each of the 2 envs), and from my perspective this is not
>> > > >>>>>>>>>>>> flaky.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> IIRC no such issue was observed last time when verifying 1.4.7 on the
>> > > >>>>>>>>>>>> same env; will double check.
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> Best Regards,
>> > > >>>>>>>>>>>> Yu
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>> > > >>>>>>>>> andrew.purtell@gmail.com>
>> > > >>>>>>>>>>>> wrote:
>> > > >>>>>>>>>>>>
>> > > >>>>>>>>>>>>> There are two failure cases it looks like. And this
>> looks
>> > > like
>> > > >>>>>>>>> flakes.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> The wrong FS assertions are not something I see when I
>> run
>> > > >>> these
>> > > >>>>>>>>> tests
>> > > >>>>>>>>>>>>> myself. I am not able to investigate something I can’t
>> > > >>> reproduce.
>> > > >>>>>>>>> What I
>> > > >>>>>>>>>>>>> suggest is since you can reproduce do a git bisect to
>> find
>> > > the
>> > > >>>>>> commit
>> > > >>>>>>>>>>> that
>> > > >>>>>>>>>>>>> introduced the problem. Then we can revert it. As an
>> > > >>> alternative
>> > > >>>>>> we
>> > > >>>>>>>>> can
>> > > >>>>>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the
>> > > test,
>> > > >>> and
>> > > >>>>>>>>>>>>> continue. This latter option only should be done if we
>> are
>> > > >>> fairly
>> > > >>>>>>>>>>> confident
>> > > >>>>>>>>>>>>> it is a test only problem.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> The connect exceptions are interesting. I see these
>> > sometimes
>> > > >>> when
>> > > >>>>>>>>> the
>> > > >>>>>>>>>>>>> suite is executed, not this particular case, but when
>> the
>> > > >>> failed
>> > > >>>>>>>>> test is
>> > > >>>>>>>>>>>>> executed by itself it always passes. It is possible some
>> > > >>> change to
>> > > >>>>>>>>>>> classes
>> > > >>>>>>>>>>>>> related to the minicluster or startup or shutdown timing
>> > are
>> > > >>> the
>> > > >>>>>>>>> cause,
>> > > >>>>>>>>>>> but
>> > > >>>>>>>>>>>>> it is test time flaky behavior. I’m not happy about this
>> > but
>> > > it
>> > > >>>>>>>>> doesn’t
>> > > >>>>>>>>>>>>> actually fail the release because the failure is never
>> > > >>> repeatable
>> > > >>>>>>>>> when
>> > > >>>>>>>>>>> the
>> > > >>>>>>>>>>>>> test is run standalone.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>> In general it would be great if some attention was paid
>> to
>> > > test
>> > > >>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to
>> > > insist
>> > > >>>>>> that
>> > > >>>>>>>>>>>>> everything is perfect or there will never be another 1.x
>> > > >>> release,
>> > > >>>>>>>>>>> certainly
>> > > >>>>>>>>>>>>> not from branch-1. So, tests which fail repeatedly
>> block a
>> > > >>> release
>> > > >>>>>>>>> IMHO
>> > > >>>>>>>>>>> but
>> > > >>>>>>>>>>>>> flakes do not.
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com>
>> > > wrote:
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> -1
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Observed many UT failures when checking the source
>> package
>> > > >>> (tried
>> > > >>>>>>>>>>>>> multiple
>> > > >>>>>>>>>>>>>> rounds on two different environments, MacOs and Linux,
>> got
>> > > the
>> > > >>>>>> same
>> > > >>>>>>>>>>>>>> result), including (but not limited to):
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> TestBulkLoad:
>> > > >>>>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>> > > >>>>>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>> > > >>>>>>>>>>>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>> > > >>>>>>>>>>>>>> expected: hdfs://localhost:55938
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> TestStoreFile:
>> > > >>>>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>> > > >>>>>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
>> > > >>>>>>>>>>>>>> failed on connection exception: java.net.ConnectException: Connection
>> > > >>>>>>>>>>>>>> refused; For more details see:
>> > > >>>>>>>>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> TestHFile:
>> > > >>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
>> > > >>>>>>>>>>>>>> Time elapsed: 0.08 s  <<< ERROR!
>> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From
>> > > >>>>>>>>>>>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>> > > >>>>>>>>>>>>>> connection exception: java.net.ConnectException: Connection refused;
>> > > >>>>>>>>>>>>>> For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> > > >>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> TestBlocksScanned:
>> > > >>>>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>> > > >>>>>>>>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
>> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>> > > >>>>>>>>>>>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>> > > >>>>>>>>>>>>>> expected: file:///
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> And please let me know if any known issue I'm not aware
>> > of.
>> > > >>>>>> Thanks.
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>> Best Regards,
>> > > >>>>>>>>>>>>>> Yu
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com>
>> > > wrote:
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for
>> the
>> > lag
>> > > >>> due
>> > > >>>>>> to
>> > > >>>>>>>>>>>>>>> Qingming Festival Holiday here in China)
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>> Still verifying the release, just some quick feedback:
>> > > >>> observed
>> > > >>>>>>>>> some
>> > > >>>>>>>>>>>>>>> incompatible changes in compatibility report including
>> > > >>>>>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in
>> > > ReleaseNote.
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL
>> is
>> > > >>>>>> invalid on
>> > > >>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>> Best Regards,
>> > > >>>>>>>>>>>>>>> Yu
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
>> > > >>>>>> apurtell@apache.org>
>> > > >>>>>>>>>>>>> wrote:
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation.
>> > > >>>>>>>>>>>>>>>> There are small differences in workloads D and F (slightly worse)
>> > > >>>>>>>>>>>>>>>> and workload E (slightly better) that do not indicate a serious
>> > > >>>>>>>>>>>>>>>> regression.
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>> > > >>>>>>>>>>>>>>>> c3.8xlarge x 5
>> > > >>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>> > > >>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>> > > >>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>> > > >>>>>>>>>>>>>>>> Hadoop 2.9.2
>> > > >>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
>> > > >>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>> > > >>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
>> > > >>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>> > > >>>>>>>>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>> > > >>>>>>>>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>> > > >>>>>>>>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>> > > >>>>>>>>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload A
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
>> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 267 292
>> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload B
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
>> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 203 213
>> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload C
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
>> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 175 179
>> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload D
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
>> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 210 214
>> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
>> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload E
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>> > > >>>>>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>> > > >>>>>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
>> > > >>>>>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>> > > >>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>> > > >>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
>> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> YCSB Workload F
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
>> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 262 257
>> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <
>> carp84@gmail.com
>> > >
>> > > >>>>>> wrote:
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>> Thanks for the efforts boss.
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>> Since it's a new minor release, do we have
>> performance
>> > > >>>>>> comparison
>> > > >>>>>>>>>>>>> report
>> > > >>>>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so,
>> any
>> > > >>>>>> reference?
>> > > >>>>>>>>>>> Many
>> > > >>>>>>>>>>>>>>>>> thanks!
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>> Best Regards,
>> > > >>>>>>>>>>>>>>>>> Yu
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <
>> > > >>>>>> apurtell@apache.org
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> wrote:
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
>> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
>> > > >>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
>> > > >>>>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’ (b0bc7225c5).
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
>> > > >>>>>>>>>>>>>>>>>> available for your review at
>> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>> > > >>>>>>>>>>>>>>>>>> .
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
>> > > >>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
>> > > >>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will try to
>> > > >>>>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight checks:
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> RAT check passes (7u80)
>> > > >>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>> > > >>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
>> > > >>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
>> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
>> > > >>>>>>>>>>>>>>>>>> tests do not represent serious test failures that would prevent a release.
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>> --
>> > > >>>>>>>>>>>>>>>>>> Best regards,
>> > > >>>>>>>>>>>>>>>>>> Andrew
>> > > >>>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> --
>> > > >>>>>>>>>>>>>>>> Best regards,
>> > > >>>>>>>>>>>>>>>> Andrew
>> > > >>>>>>>>>>>>>>>>
>> > > >>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning
>> > torn
>> > > >>> from
>> > > >>>>>>>>>>> truth's
>> > > >>>>>>>>>>>>>>>> decrepit hands
>> > > >>>>>>>>>>>>>>>> - A23, Crosstalk
>> > > >>>>>>>>>>>>>>>>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <ap...@apache.org>.
I'm testing a change that keeps the change to CompactionTool but drops the
unit test. Will let you know how it goes.
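
For anyone who wants to compare a full-suite run with and without the unit test, here is a minimal sketch in Python that pulls the failing classes out of surefire output. It assumes the per-class summary line format quoted further down in this thread; the helper names are made up for illustration, and the "after" line is a hypothetical placeholder, not a result anyone has reported.

```python
import re

# Per-class surefire summary line, in the format quoted in this thread, e.g.
# "[ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: ... - in <class>"
SUMMARY = re.compile(
    r"Tests run: (\d+), Failures: (\d+), Errors: (\d+), Skipped: (\d+)"
    r".*? - in (\S+)"
)


def failing_classes(log_text):
    """Map each test class that had failures or errors to (failures, errors)."""
    failing = {}
    for line in log_text.splitlines():
        match = SUMMARY.search(line)
        if match is None:
            continue
        failures, errors = int(match.group(2)), int(match.group(3))
        if failures or errors:
            failing[match.group(5)] = (failures, errors)
    return failing


def fixed_by_change(before_log, after_log):
    """Classes that fail in the 'before' run but not in the 'after' run."""
    return sorted(set(failing_classes(before_log)) - set(failing_classes(after_log)))


# Summary line as reported in this thread for the 1.5.0 RC3 source.
before = (
    "[ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, "
    "Time elapsed: 0.853 s <<< FAILURE! - in "
    "org.apache.hadoop.hbase.regionserver.TestBlocksRead\n"
)
# Hypothetical 'after' line, for illustration only; not a result from this thread.
after = (
    "[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, "
    "Time elapsed: 0.9 s - in "
    "org.apache.hadoop.hbase.regionserver.TestBlocksRead\n"
)
print(fixed_by_change(before, after))
```

Feeding it the console output of two complete runs would list the classes that stop failing once the test is dropped. It only looks at summary lines, so it says nothing about why a class failed.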


On Wed, Apr 17, 2019 at 10:28 AM Xu Cang <xc...@salesforce.com.invalid>
wrote:

> I just saw this email, Andrew. Should I re-open HBASE-21959? And revert it
> before we understand/fix why it caused the test failure?
> Regarding the failing test, do you mean this one "TestBlocksRead"?
> Thanks,
>
> Xu
>
> On Tue, Apr 16, 2019 at 9:47 PM Andrew Purtell <an...@gmail.com>
> wrote:
>
> > I've bisected twice and it lands on this commit:
> >
> > > commit 6bc46bb10920c1c335b784b01d2a326db1a3d587 (HEAD, refs/bisect/bad)
> > >     HBASE-21959 CompactionTool should close the store it uses for
> > >     compacting files, in order to properly archive compacted files.
> > >
> > > hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java | 2 ++
> > > hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionTool.java | 100
> >
> > At first glance it's hard to see how this change is relevant, but it does
> > introduce a new unit test.
> >
> >
> > On Tue, Apr 16, 2019 at 7:48 PM Andrew Purtell <andrew.purtell@gmail.com
> >
> > wrote:
> >
> > > > I’ve been able to reproduce it sometimes too and am bisecting. It may be
> > > > an interaction between test cases, not a failure per se, but does seem to
> > > > have a recent cause, as you pointed out. I’ll be looking at it.
> > > >
> > > > Thank you for your kind consideration and for revoking your veto.
> > > >
> > > > A coprocessor API fix was just committed to branch-1 so I want to roll a
> > > > new RC soon to include it. There is also an issue open to improve the
> > > > behavior of the UI when the profiler link is clicked but system support
> > > > is not available.
> > >
> > > > On Apr 16, 2019, at 7:40 PM, Yu Li <ca...@gmail.com> wrote:
> > > >
> > > > > After more investigation, the ConnectionRefused exception can be
> > > > > reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
> > > > > cases through "mvn -PrunAllTests clean test", but not with a clean
> > > > > standalone run (with "mvn *clean* test"). So now I'm more convinced it's
> > > > > some kind of environment chaos caused by parallel execution of test cases,
> > > > > and not a blocker issue.
> > > >
> > > > @Andrew It seems to me that kerby jar is not included in our binary
> > > > package, so I'm not sure whether a new RC is required by HBASE-22219.
> > > > Anyway I'd like to revoke my -1 vote now. Thanks.
> > > >
> > > > Best Regards,
> > > > Yu
> > > >
> > > >
> > > >> On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:
> > > >>
> > > >> Sorry for the late response due to job priority.
> > > >>
> > > > >> This ConnectionRefused issue cannot be reproduced on my laptop (MacOS
> > > > >> 10.14.4) but can be on the Linux env. I've checked and confirmed that it
> > > > >> passes with the 1.4.7/1.4.9 source packages but fails consistently with
> > > > >> 1.5.0. Performing a git bisect now, will report back later.
> > > >>
> > > >> Best Regards,
> > > >> Yu
> > > >>
> > > >>
> > > >> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <
> > andrew.purtell@gmail.com>
> > > >> wrote:
> > > >>
> > > > >>> I also see the occasional ConnectionRefused errors. They don’t reproduce
> > > > >>> if you run the test standalone. I also only see them on a Linux dev host.
> > > > >>> That may be enough to find, by bisect, the commit that introduced this
> > > > >>> behavior. Working on it. There is a JIRA filed for this one. Search for
> > > > >>> “TestBlocksRead” and label “branch-1”.
> > > >>>
> > > >>> Thanks for the investigations.
> > > >>>
> > > >>>> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
> > > >>>>
> > > >>>> Quick updates:
> > > >>>>
> > > > >>>> With the patch from HBASE-22219 (that is, upgrading the kerby version to
> > > > >>>> 1.0.1), the failures listed above in the 1st part of hbase-server
> > > > >>>> disappeared.
> > > > >>>>
> > > > >>>> However, in the 2nd part of the hbase-server UTs there are still many
> > > > >>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
> > > > >>>> can be reproduced easily with the -Dtest=xxx command on my environments.
> > > > >>>> Still checking the root cause.
> > > >>>>
> > > > >>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > > > >>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > > > >>>> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)  Time elapsed: 0.17 s  <<< ERROR!
> > > > >>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > > > >>>> Caused by: java.net.ConnectException: Connection refused
> > > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > > > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > > >>>>
> > > >>>> Best Regards,
> > > >>>> Yu
> > > >>>>
> > > >>>>
> > > >>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>
> > > > >>>>> I have no doubt that you've run the tests locally before announcing a
> > > > >>>>> release as you're always a great RM boss. And this shows one value of
> > > > >>>>> verifying a release: different voters have different environments.
> > > > >>>>>
> > > > >>>>> Now I think the failures may be Kerberos related, since I may have
> > > > >>>>> changed some system configuration when doing Flink testing on this env
> > > > >>>>> weeks ago. Located one issue (HBASE-22219) which is also observed in
> > > > >>>>> 1.4.7; will investigate further.
> > > >>>>>
> > > >>>>> Best Regards,
> > > >>>>> Yu
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <
> > > andrew.purtell@gmail.com
> > > >>>>
> > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> “However it's good to find the issue earlier if there
> > > >>>>>> really is any, before release announced.”
> > > >>>>>>
> > > >>>>>> I run the complete unit test suite before announcing a release
> > > >>> candidate.
> > > >>>>>> Just to be clear.
> > > >>>>>>
> > > >>>>>> Totally agree we should get these problems sorted before an
> actual
> > > >>>>>> release. My policy is to cancel a RC if anyone vetoes for this
> > > >>> reason...
> > > >>>>>> want as much coverage and varying environments as we can manage.
> > > >>>>>>
> > > >>>>>> Thank you for your help so far and I hope the failures you see
> > > result
> > > >>> in
> > > >>>>>> analysis and fixes that lead to better test stability.
> > > >>>>>>
> > > >>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>
> > > > >>>>>>> Confirmed that in the 1.4.7 source the listed cases pass (all in the 1st
> > > > >>>>>>> part of hbase-server, so the result comes out quickly.)... Also confirmed
> > > > >>>>>>> that the test run order is the same...
> > > > >>>>>>>
> > > > >>>>>>> Will try 1.5.0 again to rule out environment differences caused by time.
> > > > >>>>>>> If 1.5.0 still fails, will start a git bisect to locate the first
> > > > >>>>>>> bad commit.
> > > > >>>>>>>
> > > > >>>>>>> Was also expecting an easy pass and +1 as always to save time and effort,
> > > > >>>>>>> but obviously no luck. However it's good to find the issue earlier if there
> > > > >>>>>>> really is any, before release announced.
> > > >>>>>>>
> > > >>>>>>> Best Regards,
> > > >>>>>>> Yu
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>>
> > > >>>>>>>> Fine, let's focus on verifying whether it's a real problem rather than
> > > >>>>>>>> arguing about wording; after all, that's not my intention...
> > > >>>>>>>>
> > > >>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> > > >>>>>>>> using the same env and all tests passed w/o issue; that's where my concern
> > > >>>>>>>> lies and the main reason I gave a -1 vote. I'm running against the 1.4.7
> > > >>>>>>>> source on the same env now and let's see the result.
> > > >>>>>>>>
> > > >>>>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> > > >>>>>>>>
> > > >>>>>>>> Best Regards,
> > > >>>>>>>> Yu
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com>
> > > >>>>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> I believe the test execution order matters. We run some tests in
> > > >>>>>>>>> parallel. The ordering of tests is determined by readdir() results and this
> > > >>>>>>>>> differs from host to host and checkout to checkout. So when you see a
> > > >>>>>>>>> repeatable group of failures, that’s great. And when someone else doesn’t
> > > >>>>>>>>> see those same tests fail, or they cannot be reproduced when running by
> > > >>>>>>>>> themselves, the commonly accepted term of art for this is “flaky”.
> > > >>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Sorry but I'd call it "possible environment related problem" or "some
> > > >>>>>>>>>> feature may not work well in specific environment", rather than flaky.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Will check against the 1.4.7 released source package before opening any
> > > >>>>>>>>>> JIRA.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best Regards,
> > > >>>>>>>>>> Yu
> > > >>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purtell@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> And if they pass in my environment, then what should we call it? I
> > > >>>>>>>>>>> have no doubt you are seeing failures. Therefore can you please file JIRAs
> > > >>>>>>>>>>> and attach information that can help identify a fix? Thanks.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> > > >>>>>>>>>>>> on two different envs separately, so it sums up to 6 stable
> > > >>>>>>>>>>>> failures for each case, and from my perspective this is not flaky.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> IIRC, last time when verifying 1.4.7 on the same env no such issue was
> > > >>>>>>>>>>>> observed; will double check.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purtell@gmail.com>
> > > >>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> There are two failure cases it looks like. And these look like flakes.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The wrong FS assertions are not something I see when I run these tests
> > > >>>>>>>>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
> > > >>>>>>>>>>>>> suggest is, since you can reproduce, do a git bisect to find the commit
> > > >>>>>>>>>>>>> that introduced the problem. Then we can revert it. As an alternative we
> > > >>>>>>>>>>>>> can open a JIRA, report the problem, temporarily @Ignore the test, and
> > > >>>>>>>>>>>>> continue. This latter option should only be done if we are fairly
> > > >>>>>>>>>>>>> confident it is a test only problem.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> The connect exceptions are interesting. I see these sometimes when the
> > > >>>>>>>>>>>>> suite is executed, not this particular case, but when the failed test is
> > > >>>>>>>>>>>>> executed by itself it always passes. It is possible some change to classes
> > > >>>>>>>>>>>>> related to the minicluster or startup or shutdown timing is the cause, but
> > > >>>>>>>>>>>>> it is test time flaky behavior. I’m not happy about this but it doesn’t
> > > >>>>>>>>>>>>> actually fail the release because the failure is never repeatable when the
> > > >>>>>>>>>>>>> test is run standalone.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> In general it would be great if some attention was paid to test
> > > >>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> > > >>>>>>>>>>>>> everything is perfect or there will never be another 1.x release, certainly
> > > >>>>>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a release IMHO but
> > > >>>>>>>>>>>>> flakes do not.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> -1
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Observed many UT failures when checking the source package (tried multiple
> > > >>>>>>>>>>>>>> rounds on two different environments, macOS and Linux, got the same
> > > >>>>>>>>>>>>>> result), including (but not limited to):
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestBulkLoad:
> > > >>>>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestStoreFile:
> > > >>>>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestHFile:
> > > >>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
> > > >>>>>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > > >>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> TestBlocksScanned:
> > > >>>>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
> > > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> > > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> And please let me know if there is any known issue I'm not aware of.
> > > >>>>>>>>>>>>>> Thanks.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
> > > >>>>>>>>>>>>>>> Qingming Festival Holiday here in China)
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Still verifying the release, just some quick feedback: observed some
> > > >>>>>>>>>>>>>>> incompatible changes in the compatibility report, including
> > > >>>>>>>>>>>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Unrelated but noticeable: the 1.4.9 release note URL is invalid on
> > > >>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurtell@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
> > > >>>>>>>>>>>>>>>> differences in workloads D and F (slightly worse) and workload E (slightly
> > > >>>>>>>>>>>>>>>> better) that do not indicate serious regression.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> > > >>>>>>>>>>>>>>>> c3.8xlarge x 5
> > > >>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> > > >>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> > > >>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> > > >>>>>>>>>>>>>>>> Hadoop 2.9.2
> > > >>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
> > > >>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> > > >>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
> > > >>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
> > > >>>>>>>>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
> > > >>>>>>>>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
> > > >>>>>>>>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
> > > >>>>>>>>>>>>>>>> '0'}
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload A
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 267 292
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload B
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 203 213
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload C
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 175 179
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload D
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 210 214
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload E
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> > > >>>>>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
> > > >>>>>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
> > > >>>>>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
> > > >>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> > > >>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> > > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
> > > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
> > > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
> > > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> > > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> YCSB Workload F
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
> > > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> > > >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
> > > >>>>>>>>>>>>>>>> [READ], MinLatency(us) 262 257
> > > >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
> > > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
> > > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> > > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> > > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
> > > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
> > > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
> > > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> > > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <carp84@gmail.com> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Thanks for the efforts boss.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Since it's a new minor release, do we have a performance comparison report
> > > >>>>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
> > > >>>>>>>>>>>>>>>>> thanks!
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Best Regards,
> > > >>>>>>>>>>>>>>>>> Yu
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
> > > >>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
> > > >>>>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
> > > >>>>>>>>>>>>>>>>>> available for your review at
> > > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> > > >>>>>>>>>>>>>>>>>> .
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
> > > >>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> > > >>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will try to
> > > >>>>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight checks:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> RAT check passes (7u80)
> > > >>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
> > > >>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
> > > >>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> > > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
> > > >>>>>>>>>>>>>>>>>> tests do not represent serious test failures that would prevent a release.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>>>>> Best regards,
> > > >>>>>>>>>>>>>>>>>> Andrew
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> --
> > > >>>>>>>>>>>>>>>> Best regards,
> > > >>>>>>>>>>>>>>>> Andrew
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning
> > torn
> > > >>> from
> > > >>>>>>>>>>> truth's
> > > >>>>>>>>>>>>>>>> decrepit hands
> > > >>>>>>>>>>>>>>>> - A23, Crosstalk
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >    - A23, Crosstalk
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Xu Cang <xc...@salesforce.com.INVALID>.
I just saw this email, Andrew. Should I re-open HBASE-21959 and revert it
until we understand and fix why it caused the test failure?
Regarding the failing test, do you mean TestBlocksRead?
Thanks,

Xu

On Tue, Apr 16, 2019 at 9:47 PM Andrew Purtell <an...@gmail.com>
wrote:

> I've bisected twice and it lands on this commit:
>
> commit 6bc46bb10920c1c335b784b01d2a326db1a3d587 (HEAD, refs/bisect/bad)
>     HBASE-21959 CompactionTool should close the store it uses for
>     compacting files, in order to properly archive compacted files.
>
>  hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java     |   2 ++
>  hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionTool.java | 100
>
> At first glance it's hard to see how this change is relevant, but it does
> introduce a new unit test.
>
>
> On Tue, Apr 16, 2019 at 7:48 PM Andrew Purtell <an...@gmail.com>
> wrote:
>
> > I’ve been able to reproduce it sometimes too and am bisecting. It may be
> > an interaction between test cases, not a failure per se, but does seem to
> > have a recent cause, as you pointed out. I’ll be looking at it.
> >
> > Thank you for your kind consideration and for revoking your veto.
> >
> > A coprocessor API fix was just committed to branch-1 so I want to roll a
> > new RC soon to include it. There is also an issue open to improve the
> > behavior of the UI when the profiler link is clicked but system support is
> > not available.
> >
> > > On Apr 16, 2019, at 7:40 PM, Yu Li <ca...@gmail.com> wrote:
> > >
> > > After more investigation, the ConnectionRefused exception could be
> > > reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
> > > cases through "mvn -PrunAllTests clean test", but not by a clean
> > > standalone run (with "mvn *clean* test"). So now I'm more convinced it's
> > > some kind of environment chaos caused by parallel execution of test cases,
> > > and not a blocker issue.
> > >
> > > @Andrew It seems to me that kerby jar is not included in our binary
> > > package, so I'm not sure whether a new RC is required by HBASE-22219.
> > > Anyway I'd like to revoke my -1 vote now. Thanks.
> > >
> > > Best Regards,
> > > Yu
> > >
> > >
> > >> On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:
> > >>
> > >> Sorry for the late response due to job priority.
> > >>
> > >> This ConnectionRefused issue cannot be reproduced on my laptop (macOS
> > >> 10.14.4) but can be on the Linux env. And I've checked and confirmed it
> > >> passes with the 1.4.7/1.4.9 source packages but consistently fails with
> > >> 1.5.0; performing a git bisect now, will report back later.
> > >>
> > >> Best Regards,
> > >> Yu
> > >>
> > >>
> > >> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <andrew.purtell@gmail.com>
> > >> wrote:
> > >>
> > >>> I also see the occasional ConnectionRefused errors. They don’t reproduce
> > >>> if you run the test standalone. I also only see them on a Linux dev host.
> > >>> That may be enough to find by bisect the commit that introduced this
> > >>> behavior. Working on it. There is a JIRA filed for this one. Search for
> > >>> “TestBlocksRead” and label “branch-1”.
> > >>>
> > >>> Thanks for the investigations.
> > >>>
> > >>>> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
> > >>>>
> > >>>> Quick updates:
> > >>>>
> > >>>> With the HBASE-22219 patch, i.e. upgrading the kerby version to 1.0.1, the
> > >>>> failures listed above in the 1st part of hbase-server disappeared.
> > >>>>
> > >>>> However, in the 2nd part of the hbase-server UTs there are still many
> > >>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
> > >>>> can be reproduced easily with the -Dtest=xxx command on my environments;
> > >>>> still checking the root cause.
> > >>>>
> > >>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > >>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > >>>> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)  Time elapsed: 0.17 s  <<< ERROR!
> > >>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > >>>> Caused by: java.net.ConnectException: Connection refused
> > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> > >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > >>>>
> > >>>> Best Regards,
> > >>>> Yu
> > >>>>
> > >>>>
> > >>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
> > >>>>>
> > >>>>> I have no doubt that you've run the tests locally before announcing a
> > >>>>> release, as you're always a great RM boss. And this shows one value of
> > >>>>> verifying a release: different voters have different environments.
> > >>>>>
> > >>>>> Now I think the failures may be Kerberos related, since I may have
> > >>>>> changed some system configuration when doing Flink testing on this env
> > >>>>> weeks ago. Located one issue (HBASE-22219), which is also observed in
> > >>>>> 1.4.7; will investigate further.
> > >>>>>
> > >>>>> Best Regards,
> > >>>>> Yu
> > >>>>>
> > >>>>>
> > >>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <andrew.purtell@gmail.com>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> “However it's good to find the issue earlier if there
> > >>>>>> really is any, before release announced.”
> > >>>>>>
> > >>>>>> I run the complete unit test suite before announcing a release candidate.
> > >>>>>> Just to be clear.
> > >>>>>>
> > >>>>>> Totally agree we should get these problems sorted before an actual
> > >>>>>> release. My policy is to cancel an RC if anyone vetoes for this reason...
> > >>>>>> want as much coverage and varying environments as we can manage.
> > >>>>>>
> > >>>>>> Thank you for your help so far and I hope the failures you see result in
> > >>>>>> analysis and fixes that lead to better test stability.
> > >>>>>>
> > >>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>> Confirmed that in the 1.4.7 source the listed cases passed (all are in
> > >>>>>>> the 1st part of hbase-server, so the result comes out quickly)... Also
> > >>>>>>> confirmed the test run order is the same...
> > >>>>>>>
> > >>>>>>> Will try 1.5.0 again to rule out environment differences caused by
> > >>>>>>> time. If 1.5.0 still fails, will start a git bisect to locate the
> > >>>>>>> first bad commit.
> > >>>>>>>
> > >>>>>>> Was also expecting an easy pass and +1 as always to save time and
> > >>>>>>> effort, but obviously no luck. However it's good to find the issue
> > >>>>>>> earlier if there really is any, before release announced.
> > >>>>>>>
> > >>>>>>> Best Regards,
> > >>>>>>> Yu
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>> Fine, let's focus on verifying whether it's a real problem rather
> > >>>>>>>> than arguing about wording; after all, that's not my intention...
> > >>>>>>>>
> > >>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
> > >>>>>>>> was using the same env and all tests passed w/o issue. That's where my
> > >>>>>>>> concern lies and the main reason I gave a -1 vote. I'm running against
> > >>>>>>>> the 1.4.7 source on the same env now; let's see the result.
> > >>>>>>>>
> > >>>>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> > >>>>>>>>
> > >>>>>>>> Best Regards,
> > >>>>>>>> Yu
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> I believe the test execution order matters. We run some tests in
> > >>>>>>>>> parallel. The ordering of tests is determined by readdir() results,
> > >>>>>>>>> and this differs from host to host and checkout to checkout. So when
> > >>>>>>>>> you see a repeatable group of failures, that’s great. And when someone
> > >>>>>>>>> else doesn’t see those same tests fail, or they cannot be reproduced
> > >>>>>>>>> when running by themselves, the commonly accepted term of art for this
> > >>>>>>>>> is “flaky”.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> Sorry but I'd call it "possible environment related problem" or
> > >>>>>>>>>> "some feature may not work well in specific environment", rather than
> > >>>>>>>>>> a flaky.
> > >>>>>>>>>>
> > >>>>>>>>>> Will check against the 1.4.7 released source package before opening
> > >>>>>>>>>> any JIRA.
> > >>>>>>>>>>
> > >>>>>>>>>> Best Regards,
> > >>>>>>>>>> Yu
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purtell@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> And if they pass in my environment, then what should we call it
> > >>>>>>>>>>> then? I have no doubt you are seeing failures. Therefore can you
> > >>>>>>>>>>> please file JIRAs and attach information that can help identify a
> > >>>>>>>>>>> fix? Thanks.
> > >>>>>>>>>>>
> > >>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2
> > >>>>>>>>>>>> option on two different envs separately, so it adds up to 6
> > >>>>>>>>>>>> consistent failures for each case, and from my perspective this is
> > >>>>>>>>>>>> not flaky.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> IIRC no such issue was observed last time, when verifying 1.4.7 on
> > >>>>>>>>>>>> the same env. Will double check.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>> Yu
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purtell@gmail.com>
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> There are two failure cases, it looks like. And these look like
> > >>>>>>>>>>>>> flakes.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The wrong FS assertions are not something I see when I run these
> > >>>>>>>>>>>>> tests myself. I am not able to investigate something I can’t
> > >>>>>>>>>>>>> reproduce. What I suggest is, since you can reproduce, do a git
> > >>>>>>>>>>>>> bisect to find the commit that introduced the problem. Then we can
> > >>>>>>>>>>>>> revert it. As an alternative we can open a JIRA, report the problem,
> > >>>>>>>>>>>>> temporarily @Ignore the test, and continue. This latter option
> > >>>>>>>>>>>>> should only be done if we are fairly confident it is a test-only
> > >>>>>>>>>>>>> problem.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> The connect exceptions are interesting. I see these sometimes when
> > >>>>>>>>>>>>> the suite is executed, not this particular case, but when the failed
> > >>>>>>>>>>>>> test is executed by itself it always passes. It is possible some
> > >>>>>>>>>>>>> change to classes related to the minicluster or startup or shutdown
> > >>>>>>>>>>>>> timing is the cause, but it is test-time flaky behavior. I’m not
> > >>>>>>>>>>>>> happy about this but it doesn’t actually fail the release because
> > >>>>>>>>>>>>> the failure is never repeatable when the test is run standalone.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> In general it would be great if some attention was paid to test
> > >>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> > >>>>>>>>>>>>> everything is perfect or there will never be another 1.x release,
> > >>>>>>>>>>>>> certainly not from branch-1. So, tests which fail repeatedly block a
> > >>>>>>>>>>>>> release IMHO but flakes do not.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> -1
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Observed many UT failures when checking the source package (tried
> > >>>>>>>>>>>>>> multiple rounds on two different environments, macOS and Linux, and
> > >>>>>>>>>>>>>> got the same result), including (but not limited to):
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> TestBulkLoad:
> > >>>>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad) Time elapsed: 0.083 s  <<< ERROR!
> > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> TestStoreFile:
> > >>>>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile) Time elapsed: 0.083 s  <<< ERROR!
> > >>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> TestHFile:
> > >>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile) Time elapsed: 0.08 s  <<< ERROR!
> > >>>>>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > >>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> TestBlocksScanned:
> > >>>>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned) Time elapsed: 0.069 s  <<< ERROR!
> > >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> > >>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> And please let me know if there is any known issue I'm not aware of.
> > >>>>>>>>>>>>>> Thanks.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>>>> Yu
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
> > >>>>>>>>>>>>>>> the Qingming Festival holiday here in China)
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Still verifying the release, just some quick feedback: observed some
> > >>>>>>>>>>>>>>> incompatible changes in the compatibility report, including
> > >>>>>>>>>>>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release
> > >>>>>>>>>>>>>>> notes.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Unrelated but noticeable: the 1.4.9 release note URL is invalid on
> > >>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>>>>> Yu
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurtell@apache.org>
> > >>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation.
> > >>>>>>>>>>>>>>>> Small differences in workloads D and F (slightly worse) and workload
> > >>>>>>>>>>>>>>>> E (slightly better) that do not indicate serious regression.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> > >>>>>>>>>>>>>>>> c3.8xlarge x 5
> > >>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> > >>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> > >>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> > >>>>>>>>>>>>>>>> Hadoop 2.9.2
> > >>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
> > >>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> > >>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
> > >>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload A
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 50k/op/s                        1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                200592   200583
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)         49852    49855
> > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)               544      559
> > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                   267      292
> > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                165631   185087
> > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)        738      742
> > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)       1877     1961
> > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)            1370     1181
> > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)                 702      646
> > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)              180735   177279
> > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)     1943     1652
> > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)     3257     3085
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload B
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 50k/op/s                        1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                200599   200581
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)         49850    49855
> > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)               454      471
> > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                   203      213
> > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                183423   174207
> > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)        563      599
> > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)       1360     1172
> > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)            1064     1029
> > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)                 746      726
> > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)              163455   101631
> > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)     1327     1157
> > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)     2241     1898
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload C
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 50k/op/s                        1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                200541   200538
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)         49865    49865
> > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)               332      327
> > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                   175      179
> > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                210559   170367
> > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)        410      396
> > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)        871      892
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload D
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 50k/op/s                        1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                200579   200562
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)         49855    49859
> > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)               487      547
> > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                   210      214
> > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                192255   177535
> > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)        973     1529
> > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)       1836     2683
> > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us)            1239     1152
> > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us)                 807      788
> > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us)              184575   148735
> > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us)     1496     1243
> > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us)     2965     2495
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload E
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 10k/op/s                        1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                100605   100568
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)          9939     9943
> > >>>>>>>>>>>>>>>> [SCAN], AverageLatency(us)              3548     2687
> > >>>>>>>>>>>>>>>> [SCAN], MinLatency(us)                   696      678
> > >>>>>>>>>>>>>>>> [SCAN], MaxLatency(us)               1059839   238463
> > >>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us)       8327     6791
> > >>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us)      17647    14415
> > >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us)            2688     1555
> > >>>>>>>>>>>>>>>> [INSERT], MinLatency(us)                 887      815
> > >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us)              173311   154623
> > >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us)     4455     2571
> > >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us)     9303     5375
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> YCSB Workload F
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> target 50k/op/s                                   1.4.9    1.5.0
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms)                           200562   204178
> > >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec)                    49859    48976
> > >>>>>>>>>>>>>>>> [READ], AverageLatency(us)                          856     1137
> > >>>>>>>>>>>>>>>> [READ], MinLatency(us)                              262      257
> > >>>>>>>>>>>>>>>> [READ], MaxLatency(us)                           205567   222335
> > >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us)                  2365     3475
> > >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us)                  3099     4143
> > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us)            2559     2917
> > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us)                1100     1034
> > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us)              208767   204799
> > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us)     5747     7627
> > >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us)     7203     8919
> > >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us)                       1700     1777
> > >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us)                            737      687
> > >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us)                          97983    94271
> > >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us)                3377     4147
> > >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us)                4147     4831
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <carp84@gmail.com> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks for the efforts boss.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Since it's a new minor release, do we have a performance comparison
> > >>>>>>>>>>>>>>>>> report against 1.4.9 as we did when releasing 1.4.0? If so, any
> > >>>>>>>>>>>>>>>>> reference? Many thanks!
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best Regards,
> > >>>>>>>>>>>>>>>>> Yu
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org>
> > >>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
> > >>>>>>>>>>>>>>>>>> download at
> > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
> > >>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
> > >>>>>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
> > >>>>>>>>>>>>>>>>>> available for your review at
> > >>>>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> > >>>>>>>>>>>>>>>>>> .
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
> > >>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> > >>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless there is an
> > >>>>>>>>>>>>>>>>>> objection I will try to close it Friday April 12, 2019 if we have
> > >>>>>>>>>>>>>>>>>> sufficient votes.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight
> > >>>>>>>>>>>>>>>>>> checks:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> RAT check passes (7u80)
> > >>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
> > >>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
> > >>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
> > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> > >>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These
> > >>>>>>>>>>>>>>>>>> flaky tests do not represent serious test failures that would prevent
> > >>>>>>>>>>>>>>>>>> a release.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>>>>>>>> Andrew
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> --
> > >>>>>>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>>>>>> Andrew
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from truth's
> > >>>>>>>>>>>>>>>> decrepit hands
> > >>>>>>>>>>>>>>>> - A23, Crosstalk
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
I've bisected twice and it lands on this commit:

commit 6bc46bb10920c1c335b784b01d2a326db1a3d587 (HEAD, refs/bisect/bad)
    HBASE-21959 CompactionTool should close the store it uses for compacting files, in order to properly archive compacted files.
 hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/CompactionTool.java |   2 ++
 hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompactionTool.java | 100

At first glance it's hard to see how this change is relevant, but it does
introduce a new unit test.
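
For anyone following along, here is a rough sketch of the search a bisect
performs, plus one way to cope with a failure that only shows up on some runs.
This is an illustration in Python, not what git does internally and not the
script I used. The names first_bad_commit, fails_in_any_run and run_once are
made up for the example, and it assumes the commits are listed oldest to
newest, that the newest one is known bad, and that every commit after the
first bad one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is known bad,
    and once a commit is bad every later commit is bad as well.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[hi]


def fails_in_any_run(run_once: Callable[[str], bool], attempts: int) -> Callable[[str], bool]:
    """Build an is_bad() check that retries an intermittent failure.

    run_once(commit) is a hypothetical callback that returns True when the
    test run fails at that commit. A commit counts as bad if any attempt fails.
    """
    def is_bad(commit: str) -> bool:
        return any(run_once(commit) for _ in range(attempts))
    return is_bad
```

The retry wrapper only reduces the chance of marking a bad commit as good; with
a failure this intermittent a single bisect can still land on the wrong commit,
which is why it is worth repeating the whole bisect and comparing results.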


On Tue, Apr 16, 2019 at 7:48 PM Andrew Purtell <an...@gmail.com>
wrote:

> I’ve been able to reproduce it sometimes too and am bisecting. It may be
> an interaction between test cases, not a failure per se, but it does seem to
> have a recent cause, as you pointed out. I’ll be looking at it.
>
> Thank you for your kind consideration and for revoking your veto.
>
> A coprocessor API fix was just committed to branch-1 so I want to roll a
> new RC soon to include it. There is also an issue open to improve the
> behavior of the UI when the profiler link is clicked but system support is
> not available.
>
> > On Apr 16, 2019, at 7:40 PM, Yu Li <ca...@gmail.com> wrote:
> >
> > After more investigation, the ConnectionRefused exception can be
> > reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
> > cases through "mvn -PrunAllTests clean test", but not by a clean
> > standalone run (with "mvn *clean* test"). So now I'm more convinced it's
> > some kind of environment chaos caused by parallel execution of test
> > cases, and not a blocker issue.
> >
> > @Andrew It seems to me that kerby jar is not included in our binary
> > package, so I'm not sure whether a new RC is required by HBASE-22219.
> > Anyway I'd like to revoke my -1 vote now. Thanks.
> >
> > Best Regards,
> > Yu
> >
> >
> >> On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:
> >>
> >> Sorry for the late response; other work took priority.
> >>
> >> This ConnectionRefused issue cannot be reproduced on my laptop (macOS
> >> 10.14.4) but can be on the Linux env. I've checked and confirmed that the
> >> tests pass with the 1.4.7/1.4.9 source packages but fail consistently with
> >> 1.5.0. I'm performing a git bisect now and will report back later.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <an...@gmail.com>
> >> wrote:
> >>
> >>> I also see the occasional ConnectionRefused errors. They don’t reproduce
> >>> if you run the test standalone. I also only see them on a Linux dev host.
> >>> That may be enough to find by bisect the commit that introduced this
> >>> behavior. Working on it. There is a JIRA filed for this one. Search for
> >>> “TestBlocksRead” and label “branch-1”.
> >>>
> >>> Thanks for the investigations.
> >>>
> >>>> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
> >>>>
> >>>> Quick updates:
> >>>>
> >>>> With the patch from HBASE-22219 (that is, upgrading the kerby version to
> >>>> 1.0.1), the failures listed above in the 1st part of hbase-server
> >>>> disappeared.
> >>>>
> >>>> However, in the 2nd part of the hbase-server UTs there are still many
> >>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
> >>>> can be reproduced easily with the -Dtest=xxx option in my environments.
> >>>> Still checking the root cause.
> >>>>
> >>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> >>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
> >>>> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead) Time elapsed: 0.17 s  <<< ERROR!
> >>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> >>>> Caused by: java.net.ConnectException: Connection refused
> >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> >>>>       at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
> >>>>>
> >>>>> I have no doubt that you've run the tests locally before announcing a
> >>>>> release as you're always a great RM boss. And this shows one value of
> >>>>> verifying a release: different voters have different environments.
> >>>>>
> >>>>> Now I think the failures may be Kerberos related, since I may have
> >>>>> changed some system configuration when doing Flink testing on this env
> >>>>> weeks ago. I located one issue (HBASE-22219), which is also observed in
> >>>>> 1.4.7, and will investigate further.
> >>>>>
> >>>>> Best Regards,
> >>>>> Yu
> >>>>>
> >>>>>
> >>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <andrew.purtell@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> “However it's good to find the issue earlier if there
> >>>>>> really is any, before release announced.”
> >>>>>>
> >>>>>> I run the complete unit test suite before announcing a release
> >>>>>> candidate. Just to be clear.
> >>>>>>
> >>>>>> Totally agree we should get these problems sorted before an actual
> >>>>>> release. My policy is to cancel an RC if anyone vetoes for this
> >>>>>> reason... I want as much coverage and as many varying environments as
> >>>>>> we can manage.
> >>>>>>
> >>>>>> Thank you for your help so far and I hope the failures you see result
> >>>>>> in analysis and fixes that lead to better test stability.
> >>>>>>
> >>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Confirmed that in the 1.4.7 source the listed cases passed (all are in
> >>>>>>> the 1st part of hbase-server, so the result comes out quickly)... Also
> >>>>>>> confirmed the test run order is the same...
> >>>>>>>
> >>>>>>> Will try 1.5.0 again to rule out environment differences caused by
> >>>>>>> time. If 1.5.0 still fails, will start a git bisect to locate the
> >>>>>>> first bad commit.
> >>>>>>>
> >>>>>>> Was also expecting an easy pass and +1 as always to save time and
> >>>>>>> effort, but obviously no luck. However it's good to find the issue
> >>>>>>> earlier if there really is any, before release announced.
> >>>>>>>
> >>>>>>> Best Regards,
> >>>>>>> Yu
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Fine, let's focus on verifying whether it's a real problem rather
> >>>>>>>> than arguing about wording; after all, that's not my intention...
> >>>>>>>>
> >>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
> >>>>>>>> was using the same env and all tests passed w/o issue. That's where my
> >>>>>>>> concern lies and the main reason I gave a -1 vote. I'm running against
> >>>>>>>> the 1.4.7 source on the same env now; let's see the result.
> >>>>>>>>
> >>>>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> >>>>>>>>
> >>>>>>>> Best Regards,
> >>>>>>>> Yu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> I believe the test execution order matters. We run some tests in
> >>>>>>>>> parallel. The ordering of tests is determined by readdir() results,
> >>>>>>>>> and this differs from host to host and checkout to checkout. So when
> >>>>>>>>> you see a repeatable group of failures, that’s great. And when someone
> >>>>>>>>> else doesn’t see those same tests fail, or they cannot be reproduced
> >>>>>>>>> when running by themselves, the commonly accepted term of art for this
> >>>>>>>>> is “flaky”.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Sorry but I'd call it "possible environment related problem" or
> >>> "some
> >>>>>>>>>> feature may not work well in specific environment", rather than
> a
> >>>>>> flaky.
> >>>>>>>>>>
> >>>>>>>>>> Will check against 1.4.7 released source package before opening
> >>> any
> >>>>>>>>> JIRA.
> >>>>>>>>>>
> >>>>>>>>>> Best Regards,
> >>>>>>>>>> Yu
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
> >>>>>> andrew.purtell@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> And if they pass in my environment , then what should we call
> it
> >>>>>> then.
> >>>>>>>>> I
> >>>>>>>>>>> have no doubt you are seeing failures. Therefore can you please
> >>> file
> >>>>>>>>> JIRAs
> >>>>>>>>>>> and attach information that can help identify a fix. Thanks.
> >>>>>>>>>>>
> >>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> I ran the test suite with the
> >>> -Dsurefire.rerunFailingTestsCount=2
> >>>>>>>>> option
> >>>>>>>>>>>> and on two different env separately, so it sums up to 6 times
> >>>>>> stable
> >>>>>>>>>>>> failure for each case, and from my perspective this is not
> >>> flaky.
> >>>>>>>>>>>>
> >>>>>>>>>>>> IIRC last time when verifying 1.4.7 on the same env no such
> >>> issue
> >>>>>>>>>>> observed,
> >>>>>>>>>>>> will double check.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Yu
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
> >>>>>>>>> andrew.purtell@gmail.com>
> >>>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> There are two failure cases it looks like. And this looks
> like
> >>>>>>>>> flakes.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The wrong FS assertions are not something I see when I run
> >>> these
> >>>>>>>>> tests
> >>>>>>>>>>>>> myself. I am not able to investigate something I can’t
> >>> reproduce.
> >>>>>>>>> What I
> >>>>>>>>>>>>> suggest is since you can reproduce do a git bisect to find
> the
> >>>>>> commit
> >>>>>>>>>>> that
> >>>>>>>>>>>>> introduced the problem. Then we can revert it. As an
> >>> alternative
> >>>>>> we
> >>>>>>>>> can
> >>>>>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the
> test,
> >>> and
> >>>>>>>>>>>>> continue. This latter option only should be done if we are
> >>> fairly
> >>>>>>>>>>> confident
> >>>>>>>>>>>>> it is a test only problem.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The connect exceptions are interesting. I see these sometimes
> >>> when
> >>>>>>>>> the
> >>>>>>>>>>>>> suite is executed, not this particular case, but when the
> >>> failed
> >>>>>>>>> test is
> >>>>>>>>>>>>> executed by itself it always passes. It is possible some
> >>> change to
> >>>>>>>>>>> classes
> >>>>>>>>>>>>> related to the minicluster or startup or shutdown timing are
> >>> the
> >>>>>>>>> cause,
> >>>>>>>>>>> but
> >>>>>>>>>>>>> it is test time flaky behavior. I’m not happy about this but
> it
> >>>>>>>>> doesn’t
> >>>>>>>>>>>>> actually fail the release because the failure is never
> >>> repeatable
> >>>>>>>>> when
> >>>>>>>>>>> the
> >>>>>>>>>>>>> test is run standalone.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> In general it would be great if some attention was paid to
> test
> >>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to
> insist
> >>>>>> that
> >>>>>>>>>>>>> everything is perfect or there will never be another 1.x
> >>> release,
> >>>>>>>>>>> certainly
> >>>>>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a
> >>> release
> >>>>>>>>> IMHO
> >>>>>>>>>>> but
> >>>>>>>>>>>>> flakes do not.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com>
> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> -1
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Observed many UT failures when checking the source package
> >>> (tried
> >>>>>>>>>>>>> multiple
> >>>>>>>>>>>>>> rounds on two different environments, MacOs and Linux, got
> the
> >>>>>> same
> >>>>>>>>>>>>>> result), including (but not limited to):
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TestBulkload:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
> >>>>>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
> >>>>>>>>>>>>>> expected: hdfs://localhost:55938
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TestStoreFile:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
> >>>>>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to
> >>>>>>>>>>>>> localhost:55938
> >>>>>>>>>>>>>> failed on connection exception: java.net.ConnectException:
> >>>>>>>>> Connection
> >>>>>>>>>>>>>> refused; For more details see:
> >>>>>>>>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TestHFile:
> >>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
> >>> Time
> >>>>>>>>>>> elapsed:
> >>>>>>>>>>>>>> 0.08 s  <<< ERROR!
> >>>>>>>>>>>>>> java.net.ConnectException: Call From
> >>>>>>>>>>>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to
> >>> localhost:35529
> >>>>>>>>> failed
> >>>>>>>>>>> on
> >>>>>>>>>>>>>> connection exception: java.net.ConnectException: Connection
> >>>>>> refused;
> >>>>>>>>>>> For
> >>>>>>>>>>>>>> more details see:
> >>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>> org.apache.hadoop.hbase.io
> >>>>>>>>>>>>> .hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>> org.apache.hadoop.hbase.io
> >>>>>>>>>>>>> .hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> TestBlocksScanned:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
> >>>>>>>>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
> >>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>>>>> hdfs://localhost:35529/tmp/
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08
> >>>>>>>>>>>>> ,
> >>>>>>>>>>>>>> expected: file:///
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> >>>>>>>>>>>>>>   at
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> And please let me know if any known issue I'm not aware of.
> >>>>>> Thanks.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com>
> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag
> >>> due
> >>>>>> to
> >>>>>>>>>>>>>>> Qingming Festival Holiday here in China)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Still verifying the release, just some quick feedback:
> >>> observed
> >>>>>>>>> some
> >>>>>>>>>>>>>>> incompatible changes in compatibility report including
> >>>>>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in
> ReleaseNote.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is
> >>>>>> invalid on
> >>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
> >>>>>> apurtell@apache.org>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB
> >>>>>> evaluation.
> >>>>>>>>>>> Small
> >>>>>>>>>>>>>>>> differences in workloads D and F (slightly worse) and
> >>> workload
> >>>>>> E
> >>>>>>>>>>>>> (slightly
> >>>>>>>>>>>>>>>> better) that do not indicate serious regression.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> >>>>>>>>>>>>>>>> c3.8xlarge x 5
> >>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build
> 1.8.0_181-shenandoah-b13)
> >>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch
> >>> -XX:+UseNUMA
> >>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> >>>>>>>>>>>>>>>> Hadoop 2.9.2
> >>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
> >>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run
> 10
> >>> M
> >>>>>>>>>>>>> operations
> >>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
> >>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS
> =>
> >>>>>> '1',
> >>>>>>>>>>>>> IN_MEMORY
> >>>>>>>>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE',
> >>> DATA_BLOCK_ENCODING
> >>>>>> =>
> >>>>>>>>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY',
> >>>>>>>>>>>>> MIN_VERSIONS =>
> >>>>>>>>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536',
> >>>>>>>>> REPLICATION_SCOPE =>
> >>>>>>>>>>>>>>>> '0'}
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload A
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
> >>>>>>>>>>>>>>>> [READ], MinLatency(us) 267 292
> >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
> >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
> >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
> >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
> >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
> >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
> >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload B
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
> >>>>>>>>>>>>>>>> [READ], MinLatency(us) 203 213
> >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
> >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
> >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
> >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
> >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
> >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
> >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload C
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
> >>>>>>>>>>>>>>>> [READ], MinLatency(us) 175 179
> >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
> >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
> >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload D
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
> >>>>>>>>>>>>>>>> [READ], MinLatency(us) 210 214
> >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
> >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
> >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
> >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
> >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
> >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
> >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload E
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> >>>>>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
> >>>>>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
> >>>>>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
> >>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> >>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> >>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
> >>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
> >>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
> >>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> >>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> YCSB Workload F
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
> >>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> >>>>>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
> >>>>>>>>>>>>>>>> [READ], MinLatency(us) 262 257
> >>>>>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
> >>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
> >>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
> >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> >>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> >>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
> >>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
> >>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
> >>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> >>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks for the efforts boss.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Since it's a new minor release, do we have performance
> >>>>>> comparison
> >>>>>>>>>>>>> report
> >>>>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any
> >>>>>> reference?
> >>>>>>>>>>> Many
> >>>>>>>>>>>>>>>>> thanks!
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <
> >>>>>> apurtell@apache.org
> >>>>>>>>>>
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is
> >>> available
> >>>>>> for
> >>>>>>>>>>>>>>>> download
> >>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>
> >>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
> >>>>>>>>> and
> >>>>>>>>>>>>>>>> Maven
> >>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3'
> >>>>>>>>>>>>> (b0bc7225c5).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for
> this
> >>>>>>>>> release
> >>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>> available for your review at
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> >>>>>>>>>>>>>>>>>> .
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be
> >>>>>> found
> >>>>>>>>> at
> >>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is
> >>> derived
> >>>>>> from
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless
> >>>>>> objection I
> >>>>>>>>>>> will
> >>>>>>>>>>>>>>>> try
> >>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient
> >>> votes.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following
> >>>>>> preflight
> >>>>>>>>>>>>>>>> checks:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> RAT check passes (7u80)
> >>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
> >>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
> >>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20%
> updates
> >>>>>>>>> (8u181)
> >>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> >>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and
> >>> HBASE-21905.
> >>>>>>>>> These
> >>>>>>>>>>>>>>>> flaky
> >>>>>>>>>>>>>>>>>> tests do not represent serious test failures that would
> >>>>>> prevent
> >>>>>>>>> a
> >>>>>>>>>>>>>>>>> release.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>>>>> Andrew
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>>> Andrew
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn
> >>> from
> >>>>>>>>>>> truth's
> >>>>>>>>>>>>>>>> decrepit hands
> >>>>>>>>>>>>>>>> - A23, Crosstalk
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
I’ve been able to reproduce it sometimes too and am bisecting. It may be an interaction between test cases rather than a failure per se, but it does seem to have a recent cause, as you pointed out. I’ll be looking at it.
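
As a toy illustration of what "an interaction between test cases" can look like, here is a minimal Python sketch. It is not HBase code and not a diagnosis of the failures in this thread. The names DEFAULT_FS, test_with_minicluster and test_local_files are hypothetical, and the port number is arbitrary. It only shows how a test that passes standalone can fail when another test that changes shared state happens to run before it in the same process.

```python
# Toy model only: every name below is hypothetical, nothing here is HBase API.

# Stands in for process-wide state shared by all tests in one JVM/process,
# such as a default filesystem setting.
DEFAULT_FS = {"uri": "file:///"}


def test_with_minicluster():
    # Points the shared default at a (pretend) mini cluster and never
    # restores the previous value when it finishes.
    DEFAULT_FS["uri"] = "hdfs://localhost:12345"
    return True


def test_local_files():
    # Quietly assumes the shared default is still the local filesystem.
    return DEFAULT_FS["uri"] == "file:///"


def run(order):
    """Run the given tests in order inside one fresh 'process'."""
    DEFAULT_FS["uri"] = "file:///"  # state a brand-new process would start with
    return {test.__name__: test() for test in order}


if __name__ == "__main__":
    # Standalone: the test passes.
    print(run([test_local_files]))
    # Same test, but scheduled after the state-leaking test: it fails.
    print(run([test_with_minicluster, test_local_files]))
    # Reversed order: both pass, so the failure depends on ordering.
    print(run([test_local_files, test_with_minicluster]))
```

In this model the second test fails only in one of the two orderings, which is the kind of behavior that a standalone rerun will not reproduce.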

Thank you for your kind consideration and for revoking your veto.

A coprocessor API fix was just committed to branch-1 so I want to roll a new RC soon to include it. There is also an issue open to improve the behavior of the UI when the profiler link is clicked but system support is not available. 

> On Apr 16, 2019, at 7:40 PM, Yu Li <ca...@gmail.com> wrote:
> 
> After more investigation, the ConnectionRefused exception could be
> reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
> cases through "mvn -PrunAllTests clean test", but cannot by a clean
> standalone run (with "mvn *clean* test"). So now I'm more convinced it's
> some kind of environment chaos caused by parallel execution of test cases,
> and not a blocker issue.
> 
> @Andrew It seems to me that kerby jar is not included in our binary
> package, so I'm not sure whether a new RC is required by HBASE-22219.
> Anyway I'd like to revoke my -1 vote now. Thanks.
> 
> Best Regards,
> Yu
> 
> 
>> On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:
>> 
>> Sorry for the late response due to job priority.
>> 
>> This ConnectionRefused issue cannot be reproduced on my laptop (MacOS
>> 10.14.4) but could on the linux env. And I've checked and confirmed it
>> could pass with 1.4.7/1.4.9 source package but stably failed with 1.5.0,
>> performing a git bisect now, will report back later.
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <an...@gmail.com>
>> wrote:
>> 
>>> I also see the occasional ConnectionRefused errors. They don’t reproduce
>>> if you run the test standalone. I also only see them on a Linux dev host.
>>> That may be enough to find by bisect the commit that introduced this
>>> behavior. Working on it. There is a JIRA filed for this one. Search for
>>> “TestBlocksRead” and label “branch-1”.
>>> 
>>> Thanks for the investigations.
>>> 
>>>> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
>>>> 
>>>> Quick updates:
>>>> 
>>>> W/ patch of HBASE-22219 or say upgrading kerby version to 1.0.1, the
>>>> failures listed above in the 1st part of hbase-server disappeared.
>>>> 
>>>> However, in the 2nd part of hbase-server UT there're still many
>>>> ConnectionRefused exceptions (17 errors in total) as shown below, which
>>>> could be reproduced easily with -Dtest=xxx command on my environments,
>>>> still checking the root cause.
>>>> 
>>>> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
>>>> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed:
>>>> 0.853 s <<< FAILURE! - in
>>>> org.apache.hadoop.hbase.regionserver.TestBlocksRead
>>>> [ERROR]
>>>> 
>>> testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)
>>>> Time elapsed: 0.17 s  <<< ERROR!
>>>> java.net.ConnectException: Call From
>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed
>>> on
>>>> connection exception: java.net.ConnectException: Connection refused; For
>>>> more details see:
>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>>>>       at
>>>> 
>>> org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>>>>       at
>>>> 
>>> org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>>>> Caused by: java.net.ConnectException: Connection refused
>>>>       at
>>>> 
>>> org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>>>>       at
>>>> 
>>> org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>>>> 
>>>> Best Regards,
>>>> Yu
>>>> 
>>>> 
>>>>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
>>>>> 
>>>>> I have no doubt that you've run the tests locally before announcing a
>>>>> release as you're always a great RM boss. And this shows one value of
>>>>> verifying release, that different voter has different environments.
>>>>> 
>>>>> Now I think the failures may be kerberos related, since I possibly has
>>>>> changed some system configuration when doing Flink testing on this env
>>>>> weeks ago. Located one issue (HBASE-22219) which also observed in
>>> 1.4.7,
>>>>> will further investigate.
>>>>> 
>>>>> Best Regards,
>>>>> Yu
>>>>> 
>>>>> 
>>>>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <andrew.purtell@gmail.com
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> “However it's good to find the issue earlier if there
>>>>>> really is any, before release announced.”
>>>>>> 
>>>>>> I run the complete unit test suite before announcing a release
>>> candidate.
>>>>>> Just to be clear.
>>>>>> 
>>>>>> Totally agree we should get these problems sorted before an actual
>>>>>> release. My policy is to cancel a RC if anyone vetoes for this
>>> reason...
>>>>>> want as much coverage and varying environments as we can manage.
>>>>>> 
>>>>>> Thank you for your help so far and I hope the failures you see result
>>> in
>>>>>> analysis and fixes that lead to better test stability.
>>>>>> 
>>>>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Confirmed in 1.4.7 source the listed out cases passed (all in the 1st
>>>>>> part
>>>>>>> of hbase-server so the result comes out quickly.)... Also confirmed
>>> the
>>>>>>> test ran order are the same...
>>>>>>> 
>>>>>>> Will try 1.5.0 again to prevent the environment difference caused by
>>>>>> time.
>>>>>>> If 1.5.0 still fails, will start to do the git bisect to locate the
>>>>>> first
>>>>>>> bad commit.
>>>>>>> 
>>>>>>> Was also expecting an easy pass and +1 as always to save time and
>>>>>> efforts,
>>>>>>> but obvious no luck. However it's good to find the issue earlier if
>>>>>> there
>>>>>>> really is any, before release announced.
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>> 
>>>>>>> 
>>>>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Fine, let's focus on verifying whether it's a real problem rather
>>> than
>>>>>>>> arguing about wording, after all that's not my intention...
>>>>>>>> 
>>>>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
>>>>>> was
>>>>>>>> using the same env and all tests passed w/o issue, that's where my
>>>>>> concern
>>>>>>>> lies and the main reason I gave a -1 vote. I'm running against 1.4.7
>>>>>> source
>>>>>>>> on the same now and let's see the result.
>>>>>>>> 
>>>>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Yu
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <
>>> andrew.purtell@gmail.com
>>>>>>> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> I believe the test execution order matters. We run some tests in
>>>>>>>>> parallel. The ordering of tests is determined by readdir() results
>>>>>> and this
>>>>>>>>> differs from host to host and checkout to checkout. So when you
>>> see a
>>>>>>>>> repeatable group of failures, that’s great. And when someone else
>>>>>> doesn’t
>>>>>>>>> see those same tests fail, or they cannot be reproduced when
>>> running
>>>>>> by
>>>>>>>>> themselves, the commonly accepted term of art for this is “flaky”.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Sorry but I'd call it "possible environment related problem" or
>>> "some
>>>>>>>>>> feature may not work well in specific environment", rather than a
>>>>>> flaky.
>>>>>>>>>> 
>>>>>>>>>> Will check against 1.4.7 released source package before opening
>>> any
>>>>>>>>> JIRA.
>>>>>>>>>> 
>>>>>>>>>> Best Regards,
>>>>>>>>>> Yu
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
>>>>>> andrew.purtell@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> And if they pass in my environment , then what should we call it
>>>>>> then.
>>>>>>>>> I
>>>>>>>>>>> have no doubt you are seeing failures. Therefore can you please
>>> file
>>>>>>>>> JIRAs
>>>>>>>>>>> and attach information that can help identify a fix. Thanks.
>>>>>>>>>>> 
>>>>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I ran the test suite with the
>>> -Dsurefire.rerunFailingTestsCount=2
>>>>>>>>> option
>>>>>>>>>>>> and on two different env separately, so it sums up to 6 times
>>>>>> stable
>>>>>>>>>>>> failure for each case, and from my perspective this is not
>>> flaky.
>>>>>>>>>>>> 
>>>>>>>>>>>> IIRC last time when verifying 1.4.7 on the same env no such
>>> issue
>>>>>>>>>>> observed,
>>>>>>>>>>>> will double check.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Yu
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>>>>>>>>> andrew.purtell@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> There are two failure cases it looks like. And this looks like
>>>>>>>>> flakes.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The wrong FS assertions are not something I see when I run
>>> these
>>>>>>>>> tests
>>>>>>>>>>>>> myself. I am not able to investigate something I can’t
>>> reproduce.
>>>>>>>>> What I
>>>>>>>>>>>>> suggest is since you can reproduce do a git bisect to find the
>>>>>> commit
>>>>>>>>>>> that
>>>>>>>>>>>>> introduced the problem. Then we can revert it. As an
>>> alternative
>>>>>> we
>>>>>>>>> can
>>>>>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the test,
>>> and
>>>>>>>>>>>>> continue. This latter option only should be done if we are
>>> fairly
>>>>>>>>>>> confident
>>>>>>>>>>>>> it is a test only problem.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The connect exceptions are interesting. I see these sometimes
>>> when
>>>>>>>>> the
>>>>>>>>>>>>> suite is executed, not this particular case, but when the
>>> failed
>>>>>>>>> test is
>>>>>>>>>>>>> executed by itself it always passes. It is possible some
>>> change to
>>>>>>>>>>> classes
>>>>>>>>>>>>> related to the minicluster or startup or shutdown timing are
>>> the
>>>>>>>>> cause,
>>>>>>>>>>> but
>>>>>>>>>>>>> it is test time flaky behavior. I’m not happy about this but it
>>>>>>>>> doesn’t
>>>>>>>>>>>>> actually fail the release because the failure is never
>>> repeatable
>>>>>>>>> when
>>>>>>>>>>> the
>>>>>>>>>>>>> test is run standalone.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In general it would be great if some attention was paid to test
>>>>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist
>>>>>> that
>>>>>>>>>>>>> everything is perfect or there will never be another 1.x
>>> release,
>>>>>>>>>>> certainly
>>>>>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a
>>> release
>>>>>>>>> IMHO
>>>>>>>>>>> but
>>>>>>>>>>>>> flakes do not.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> -1
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Observed many UT failures when checking the source package
>>> (tried
>>>>>>>>>>>>> multiple
>>>>>>>>>>>>>> rounds on two different environments, MacOs and Linux, got the
>>>>>> same
>>>>>>>>>>>>>> result), including (but not limited to):
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TestBulkload:
>>>>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad) Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TestStoreFile:
>>>>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile) Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TestHFile:
>>>>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile) Time elapsed: 0.08 s  <<< ERROR!
>>>>>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> TestBlocksScanned:
>>>>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned) Time elapsed: 0.069 s  <<< ERROR!
>>>>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>>>>>>>>>>>>   at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> And please let me know if any known issue I'm not aware of.
>>>>>> Thanks.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Yu
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag
>>> due
>>>>>> to
>>>>>>>>>>>>>>> Qingming Festival Holiday here in China)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Still verifying the release, just some quick feedback:
>>> observed
>>>>>>>>> some
>>>>>>>>>>>>>>> incompatible changes in compatibility report including
>>>>>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is
>>>>>> invalid on
>>>>>>>>>>>>>>> https://hbase.apache.org/downloads.html
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>> Yu
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
>>>>>> apurtell@apache.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The difference is basically noise per the usual YCSB
>>>>>> evaluation.
>>>>>>>>>>> Small
>>>>>>>>>>>>>>>> differences in workloads D and F (slightly worse) and
>>> workload
>>>>>> E
>>>>>>>>>>>>> (slightly
>>>>>>>>>>>>>>>> better) that do not indicate serious regression.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>>>>>>>>>>>>> c3.8xlarge x 5
>>>>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch
>>> -XX:+UseNUMA
>>>>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>>>>>>>>>>> Hadoop 2.9.2
>>>>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
>>>>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10
>>> M
>>>>>>>>>>>>> operations
>>>>>>>>>>>>>>>> Args: -threads 100 -target 50000
>>>>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS =>
>>>>>> '1',
>>>>>>>>>>>>> IN_MEMORY
>>>>>>>>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE',
>>> DATA_BLOCK_ENCODING
>>>>>> =>
>>>>>>>>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY',
>>>>>>>>>>>>> MIN_VERSIONS =>
>>>>>>>>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536',
>>>>>>>>> REPLICATION_SCOPE =>
>>>>>>>>>>>>>>>> '0'}
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload A
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>>>>>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
>>>>>>>>>>>>>>>> [READ], MinLatency(us) 267 292
>>>>>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us), 1877 1961
>>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload B
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>>>>>>>>>>>>>>>> [READ], AverageLatency(us),  454 471
>>>>>>>>>>>>>>>> [READ], MinLatency(us) 203 213
>>>>>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload C
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>>>>>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
>>>>>>>>>>>>>>>> [READ], MinLatency(us) 175 179
>>>>>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload D
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>>>>>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
>>>>>>>>>>>>>>>> [READ], MinLatency(us) 210 214
>>>>>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
>>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload E
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>>>>>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>>>>>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
>>>>>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>>>>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>>>>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>>>>>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>>>>>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
>>>>>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>>>>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>>>>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> YCSB Workload F
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>>>>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>>>>>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
>>>>>>>>>>>>>>>> [READ], MinLatency(us) 262 257
>>>>>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>>>>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>>>>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>>>>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>>>>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>>>>>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>>>>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>>>>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>>>>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com>
>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks for the efforts boss.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Since it's a new minor release, do we have performance
>>>>>> comparison
>>>>>>>>>>>>> report
>>>>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any
>>>>>> reference?
>>>>>>>>>>> Many
>>>>>>>>>>>>>>>>> thanks!
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>>>>> Yu
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <
>>>>>> apurtell@apache.org
>>>>>>>>>> 
>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is
>>> available
>>>>>> for
>>>>>>>>>>>>>>>> download
>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>> 
>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
>>>>>>>>> and
>>>>>>>>>>>>>>>> Maven
>>>>>>>>>>>>>>>>>> artifacts are available in the temporary repository
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
>>>>>>>>>>>>> (b0bc7225c5).
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this
>>>>>>>>> release
>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> available for your review at
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be
>>>>>> found
>>>>>>>>> at
>>>>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is
>>> derived
>>>>>> from
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless
>>>>>> objection I
>>>>>>>>>>> will
>>>>>>>>>>>>>>>> try
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient
>>> votes.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Prior to making this announcement I made the following
>>>>>> preflight
>>>>>>>>>>>>>>>> checks:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> RAT check passes (7u80)
>>>>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>>>>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
>>>>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates
>>>>>>>>> (8u181)
>>>>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and
>>> HBASE-21905.
>>>>>>>>> These
>>>>>>>>>>>>>>>> flaky
>>>>>>>>>>>>>>>>>> tests do not represent serious test failures that would
>>>>>> prevent
>>>>>>>>> a
>>>>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn
>>> from
>>>>>>>>>>> truth's
>>>>>>>>>>>>>>>> decrepit hands
>>>>>>>>>>>>>>>> - A23, Crosstalk
>>>>>>>>>>>>>>>> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
After more investigation, the ConnectionRefused exception can be
reproduced with "mvn -Dtest=<case_name> test" after a complete run of all
cases through "mvn -PrunAllTests clean test", but not with a clean
standalone run (with "mvn *clean* test"). So now I'm more convinced it's
some kind of environment chaos caused by parallel execution of test cases,
and not a blocker issue.
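
For anyone wanting to retrace this, the sequence described above might be
scripted roughly as follows. This is only a sketch: the mvn invocations are
the ones quoted in this message, while the wrapper functions, the DRY_RUN
switch, the default test class, and the assumption that the clean standalone
run also passes -Dtest are illustrative additions, not something stated in
the thread. Run it from the same directory the original commands were run in.

```shell
#!/bin/sh
# Sketch of the reproduction sequence described in the message above.
# DRY_RUN, TEST_CASE and the function names are hypothetical helpers.

DRY_RUN="${DRY_RUN:-1}"                   # 1 = only print commands (default), 0 = execute them
TEST_CASE="${TEST_CASE:-TestBlocksRead}"  # one of the affected classes; assumed default

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Full suite first, then the single case WITHOUT "clean":
# this is the sequence reported to show ConnectionRefused.
polluted_run() {
  run mvn -PrunAllTests clean test
  run mvn -Dtest="$1" test
}

# Single case WITH "clean": reported not to reproduce the failure.
# (Passing -Dtest here is an assumption about the original command.)
clean_run() {
  run mvn -Dtest="$1" clean test
}

polluted_run "$TEST_CASE"
clean_run "$TEST_CASE"
```

By default the script only prints the three commands; setting DRY_RUN=0
executes them, and TEST_CASE selects which test class to rerun.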

@Andrew It seems to me that the kerby jar is not included in our binary
package, so I'm not sure whether HBASE-22219 requires a new RC.
Anyway, I'd like to revoke my -1 vote now. Thanks.

Best Regards,
Yu


On Tue, 16 Apr 2019 at 10:19, Yu Li <ca...@gmail.com> wrote:

> Sorry for the late response due to job priority.
>
> This ConnectionRefused issue cannot be reproduced on my laptop (macOS
> 10.14.4) but can be on the Linux env. I've checked and confirmed that the
> tests pass with the 1.4.7/1.4.9 source packages but fail consistently with
> 1.5.0. I'm performing a git bisect now and will report back later.
>
> Best Regards,
> Yu
>
>
> On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <an...@gmail.com>
> wrote:
>
>> I also see the occasional ConnectionRefused errors. They don’t reproduce
>> if you run the test standalone. I also only see them on a Linux dev host.
>> That may be enough to find by bisect the commit that introduced this
>> behavior. Working on it. There is a JIRA filed for this one. Search for
>> “TestBlocksRead” and label “branch-1”.
>>
>> Thanks for the investigations.
>>
>> > On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
>> >
>> > Quick updates:
>> >
>> > W/ patch of HBASE-22219 or say upgrading kerby version to 1.0.1, the
>> > failures listed above in the 1st part of hbase-server disappeared.
>> >
>> > However, in the 2nd part of hbase-server UT there're still many
>> > ConnectionRefused exceptions (17 errors in total) as shown below, which
>> > could be reproduced easily with -Dtest=xxx command on my environments,
>> > still checking the root cause.
>> >
>> > [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
>> > [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
>> > [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead) Time elapsed: 0.17 s  <<< ERROR!
>> > java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>> > Caused by: java.net.ConnectException: Connection refused
>> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> >> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
>> >>
>> >> I have no doubt that you've run the tests locally before announcing a
>> >> release as you're always a great RM boss. And this shows one value of
>> >> verifying a release: different voters have different environments.
>> >>
>> >> Now I think the failures may be Kerberos related, since I may have
>> >> changed some system configuration when doing Flink testing on this env
>> >> weeks ago. Located one issue (HBASE-22219), which is also observed in 1.4.7;
>> >> will investigate further.
>> >>
>> >> Best Regards,
>> >> Yu
>> >>
>> >>
>> >> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <andrew.purtell@gmail.com
>> >
>> >> wrote:
>> >>
>> >>> “However it's good to find the issue earlier if there
>> >>> really is any, before release announced.”
>> >>>
>> >>> I run the complete unit test suite before announcing a release candidate.
>> >>> Just to be clear.
>> >>>
>> >>> Totally agree we should get these problems sorted before an actual
>> >>> release. My policy is to cancel a RC if anyone vetoes for this reason...
>> >>> want as much coverage and varying environments as we can manage.
>> >>>
>> >>> Thank you for your help so far and I hope the failures you see result in
>> >>> analysis and fixes that lead to better test stability.
>> >>>
>> >>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>
>> >>>> Confirmed in 1.4.7 source the listed out cases passed (all in the 1st part
>> >>>> of hbase-server so the result comes out quickly.)... Also confirmed the
>> >>>> test ran order are the same...
>> >>>>
>> >>>> Will try 1.5.0 again to prevent the environment difference caused by time.
>> >>>> If 1.5.0 still fails, will start to do the git bisect to locate the first
>> >>>> bad commit.
>> >>>>
>> >>>> Was also expecting an easy pass and +1 as always to save time and efforts,
>> >>>> but obvious no luck. However it's good to find the issue earlier if there
>> >>>> really is any, before release announced.
>> >>>>
>> >>>> Best Regards,
>> >>>> Yu
>> >>>>
>> >>>>
>> >>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
>> >>>>>
>> >>>>> Fine, let's focus on verifying whether it's a real problem rather
>> than
>> >>>>> arguing about wording, after all that's not my intention...
>> >>>>>
>> >>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
>> >>> was
>> >>>>> using the same env and all tests passed w/o issue, that's where my
>> >>> concern
>> >>>>> lies and the main reason I gave a -1 vote. I'm running against 1.4.7
>> >>> source
>> >>>>> on the same now and let's see the result.
>> >>>>>
>> >>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>> >>>>>
>> >>>>> Best Regards,
>> >>>>> Yu
>> >>>>>
>> >>>>>
>> >>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <
>> andrew.purtell@gmail.com
>> >>>>
>> >>>>> wrote:
>> >>>>>
>> >>>>>> I believe the test execution order matters. We run some tests in
>> >>>>>> parallel. The ordering of tests is determined by readdir() results and this
>> >>>>>> differs from host to host and checkout to checkout. So when you see a
>> >>>>>> repeatable group of failures, that’s great. And when someone else doesn’t
>> >>>>>> see those same tests fail, or they cannot be reproduced when running by
>> >>>>>> themselves, the commonly accepted term of art for this is “flaky”.
>> >>>>>>
>> >>>>>>
>> >>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>
>> >>>>>>> Sorry but I'd call it "possible environment related problem" or
>> "some
>> >>>>>>> feature may not work well in specific environment", rather than a
>> >>> flaky.
>> >>>>>>>
>> >>>>>>> Will check against 1.4.7 released source package before opening
>> any
>> >>>>>> JIRA.
>> >>>>>>>
>> >>>>>>> Best Regards,
>> >>>>>>> Yu
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
>> >>> andrew.purtell@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>
>> >>>>>>>> And if they pass in my environment , then what should we call it
>> >>> then.
>> >>>>>> I
>> >>>>>>>> have no doubt you are seeing failures. Therefore can you please
>> file
>> >>>>>> JIRAs
>> >>>>>>>> and attach information that can help identify a fix. Thanks.
>> >>>>>>>>
>> >>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>> >>>>>>>>> and on two different env separately, so it sums up to 6 times stable
>> >>>>>>>>> failure for each case, and from my perspective this is not flaky.
>> >>>>>>>>>
>> >>>>>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue observed,
>> >>>>>>>>> will double check.
>> >>>>>>>>>
>> >>>>>>>>> Best Regards,
>> >>>>>>>>> Yu
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>> >>>>>> andrew.purtell@gmail.com>
>> >>>>>>>>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> There are two failure cases it looks like. And this looks like flakes.
>> >>>>>>>>>>
>> >>>>>>>>>> The wrong FS assertions are not something I see when I run these tests
>> >>>>>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
>> >>>>>>>>>> suggest is since you can reproduce do a git bisect to find the commit that
>> >>>>>>>>>> introduced the problem. Then we can revert it. As an alternative we can
>> >>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the test, and
>> >>>>>>>>>> continue. This latter option only should be done if we are fairly confident
>> >>>>>>>>>> it is a test only problem.
>> >>>>>>>>>>
>> >>>>>>>>>> The connect exceptions are interesting. I see these sometimes when the
>> >>>>>>>>>> suite is executed, not this particular case, but when the failed test is
>> >>>>>>>>>> executed by itself it always passes. It is possible some change to classes
>> >>>>>>>>>> related to the minicluster or startup or shutdown timing are the cause, but
>> >>>>>>>>>> it is test time flaky behavior. I’m not happy about this but it doesn’t
>> >>>>>>>>>> actually fail the release because the failure is never repeatable when the
>> >>>>>>>>>> test is run standalone.
>> >>>>>>>>>>
>> >>>>>>>>>> In general it would be great if some attention was paid to test
>> >>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>> >>>>>>>>>> everything is perfect or there will never be another 1.x release, certainly
>> >>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a release IMHO but
>> >>>>>>>>>> flakes do not.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> -1
>> >>>>>>>>>>>
>> >>>>>>>>>>> Observed many UT failures when checking the source package
>> (tried
>> >>>>>>>>>> multiple
>> >>>>>>>>>>> rounds on two different environments, MacOs and Linux, got the
>> >>> same
>> >>>>>>>>>>> result), including (but not limited to):
>> >>>>>>>>>>>
>> >>>>>>>>>>> TestBulkload:
>> >>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad) Time elapsed: 0.083 s  <<< ERROR!
>> >>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>> >>>>>>>>>>>
>> >>>>>>>>>>> TestStoreFile:
>> >>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile) Time elapsed: 0.083 s  <<< ERROR!
>> >>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>> >>>>>>>>>>>
>> >>>>>>>>>>> TestHFile:
>> >>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile) Time elapsed: 0.08 s  <<< ERROR!
>> >>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>>>>>>>
>> >>>>>>>>>>> TestBlocksScanned:
>> >>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned) Time elapsed: 0.069 s  <<< ERROR!
>> >>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>> >>>>>>>>>>>
>> >>>>>>>>>>> And please let me know if any known issue I'm not aware of.
>> >>> Thanks.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Best Regards,
>> >>>>>>>>>>> Yu
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag
>> due
>> >>> to
>> >>>>>>>>>>>> Qingming Festival Holiday here in China)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Still verifying the release, just some quick feedback:
>> observed
>> >>>>>> some
>> >>>>>>>>>>>> incompatible changes in compatibility report including
>> >>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is
>> >>> invalid on
>> >>>>>>>>>>>> https://hbase.apache.org/downloads.html
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>> Yu
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
>> >>> apurtell@apache.org>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> The difference is basically noise per the usual YCSB
>> >>> evaluation.
>> >>>>>>>> Small
>> >>>>>>>>>>>>> differences in workloads D and F (slightly worse) and
>> workload
>> >>> E
>> >>>>>>>>>> (slightly
>> >>>>>>>>>>>>> better) that do not indicate serious regression.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>> >>>>>>>>>>>>> c3.8xlarge x 5
>> >>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>> >>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>> >>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>> >>>>>>>>>>>>> Hadoop 2.9.2
>> >>>>>>>>>>>>> Init: Load 100 M rows and snapshot
>> >>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>> >>>>>>>>>>>>> Args: -threads 100 -target 50000
>> >>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>> >>>>>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>> >>>>>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>> >>>>>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>> >>>>>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload A
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>> >>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
>> >>>>>>>>>>>>> [READ], MinLatency(us) 267 292
>> >>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload B
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>> >>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
>> >>>>>>>>>>>>> [READ], MinLatency(us) 203 213
>> >>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload C
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>> >>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
>> >>>>>>>>>>>>> [READ], MinLatency(us) 175 179
>> >>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload D
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>> >>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
>> >>>>>>>>>>>>> [READ], MinLatency(us) 210 214
>> >>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>> >>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>> >>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
>> >>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>> >>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>> >>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload E
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>> >>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>> >>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
>> >>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>> >>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>> >>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>> >>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>> >>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
>> >>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>> >>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>> >>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> YCSB Workload F
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>> >>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
>> >>>>>>>>>>>>> [READ], MinLatency(us) 262 257
>> >>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>> >>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>> >>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>> >>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>> >>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>> >>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Thanks for the efforts boss.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Since it's a new minor release, do we have a performance comparison
>> >>>>>>>>>>>>>> report against 1.4.9, as we did when releasing 1.4.0? If so, is there
>> >>>>>>>>>>>>>> any reference? Many thanks!
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Best Regards,
>> >>>>>>>>>>>>>> Yu
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org>
>> >>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>> >>>>>>>>>>>>>>> download at https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
>> >>>>>>>>>>>>>>> and Maven artifacts are available in the temporary repository
>> >>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
>> >>>>>>>>>>>>>>> available for your review at
>> >>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>> >>>>>>>>>>>>>>> .
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
>> >>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
>> >>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will
>> >>>>>>>>>>>>>>> try to close it Friday April 12, 2019 if we have sufficient votes.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight
>> >>>>>>>>>>>>>>> checks:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> RAT check passes (7u80)
>> >>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>> >>>>>>>>>>>>>>> Opened the UI in a browser, poked around
>> >>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
>> >>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>> >>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These
>> >>>>>>>>>>>>>>> flaky tests do not represent serious test failures that would prevent
>> >>>>>>>>>>>>>>> a release.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>> Best regards,
>> >>>>>>>>>>>>>>> Andrew
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> --
>> >>>>>>>>>>>>> Best regards,
>> >>>>>>>>>>>>> Andrew
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>> >>>>>>>>>>>>> truth's decrepit hands
>> >>>>>>>>>>>>> - A23, Crosstalk
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>
>> >>>
>> >>
>>
>
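
One way to put numbers on the "slightly worse" and "slightly better" workloads in the quoted YCSB report is to compute the relative change per metric. The sketch below is an illustration only: the helper name relative_change is hypothetical, and the three sample rows are copied from the report (workloads D, E and F) rather than measured again.

```python
# Illustrative only: relative change between the 1.4.9 and 1.5.0 columns
# of the quoted YCSB report. relative_change is a made-up helper name.

def relative_change(old, new):
    """Return (new - old) / old as a percentage."""
    return (new - old) / old * 100.0

# (workload, metric) -> (1.4.9 value, 1.5.0 value), copied from the report.
samples = {
    ("D", "[READ] 95thPercentileLatency(us)"): (973, 1529),
    ("E", "[SCAN] AverageLatency(us)"): (3548, 2687),
    ("F", "[READ] AverageLatency(us)"): (856, 1137),
}

for (workload, metric), (old, new) in samples.items():
    # A positive value means the 1.5.0 latency is higher than 1.4.9.
    print(f"Workload {workload} {metric}: {relative_change(old, new):+.1f}%")
```

The same helper can be applied to any other row of the report; whether a given change counts as noise still depends on run-to-run variance, which a single pair of runs does not show.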

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Sorry for the late response; other work took priority.

The ConnectionRefused issue cannot be reproduced on my laptop (macOS
10.14.4), but it does reproduce in the Linux env. I've also confirmed that
the tests pass with the 1.4.7 and 1.4.9 source packages but fail
consistently with 1.5.0. I'm performing a git bisect now and will report
back later.
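
For readers less familiar with it, a bisect is a binary search over the commits between a known-good and a known-bad revision. The following is a minimal sketch of that search, not git's implementation: the commit labels are placeholders, and is_bad is a hypothetical stand-in for "check out this commit and run the failing test".

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first commit where is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is good,
    the last one is bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi]

# Placeholder history: "c0" stands for the good release, "c9" for the bad one.
history = [f"c{i}" for i in range(10)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

The number of test runs grows with the logarithm of the commit range, which is why bisecting stays practical even between two minor releases.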

Best Regards,
Yu


On Sat, 13 Apr 2019 at 00:38, Andrew Purtell <an...@gmail.com>
wrote:

> I also see the occasional ConnectionRefused errors. They don’t reproduce
> if you run the test standalone, and I only see them on a Linux dev host.
> That may be enough to find, by bisect, the commit that introduced this
> behavior. Working on it. There is a JIRA filed for this one; search for
> “TestBlocksRead” and the label “branch-1”.
>
> Thanks for the investigations.
>
> > On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
> >
> > Quick update:
> >
> > With the patch from HBASE-22219 (that is, upgrading the kerby version to
> > 1.0.1), the failures listed above in the 1st part of hbase-server
> > disappeared.
> >
> > However, in the 2nd part of the hbase-server UT there are still many
> > ConnectionRefused exceptions (17 errors in total), as shown below. They
> > can be reproduced easily with the -Dtest=xxx option in my environments.
> > I'm still checking the root cause.
> >
> > [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed:
> > 0.853 s <<< FAILURE! - in
> > org.apache.hadoop.hbase.regionserver.TestBlocksRead
> > [ERROR]
> > testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)
> > Time elapsed: 0.17 s  <<< ERROR!
> > java.net.ConnectException: Call From
> > z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on
> > connection exception: java.net.ConnectException: Connection refused; For
> > more details see:
> > http://wiki.apache.org/hadoop/ConnectionRefused
> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> > Caused by: java.net.ConnectException: Connection refused
> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
> >        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> >
> > Best Regards,
> > Yu
> >
> >
> >> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
> >>
> >> I have no doubt that you've run the tests locally before announcing a
> >> release, as you're always a great RM, boss. This also shows one value of
> >> verifying a release: different voters have different environments.
> >>
> >> I now think the failures may be Kerberos related, since I may have
> >> changed some system configuration when doing Flink testing on this env
> >> weeks ago. I located one issue (HBASE-22219), which is also observed in
> >> 1.4.7, and will investigate further.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <an...@gmail.com>
> >> wrote:
> >>
> >>> “However it's good to find the issue earlier if there
> >>> really is any, before release announced.”
> >>>
> >>> I run the complete unit test suite before announcing a release
> >>> candidate. Just to be clear.
> >>>
> >>> Totally agree we should get these problems sorted before an actual
> >>> release. My policy is to cancel a RC if anyone vetoes for this reason...
> >>> want as much coverage and varying environments as we can manage.
> >>>
> >>> Thank you for your help so far and I hope the failures you see result in
> >>> analysis and fixes that lead to better test stability.
> >>>
> >>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>
> >>>> Confirmed that the listed cases pass with the 1.4.7 source (all are in
> >>>> the 1st part of hbase-server, so the result comes out quickly). Also
> >>>> confirmed that the test run order is the same.
> >>>>
> >>>> I will try 1.5.0 again to rule out any environment difference caused by
> >>>> time. If 1.5.0 still fails, I will start a git bisect to locate the
> >>>> first bad commit.
> >>>>
> >>>> I was also expecting an easy pass and a +1 as always, to save time and
> >>>> effort, but obviously no luck. However it's good to find the issue
> >>>> earlier if there really is any, before release announced.
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> >>>>>
> >>>>> Fine, let's focus on verifying whether it's a real problem rather than
> >>>>> arguing about wording; after all, that's not my intention...
> >>>>>
> >>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> >>>>> using the same env and all tests passed w/o issue. That's where my
> >>>>> concern lies and the main reason I gave a -1 vote. I'm running against
> >>>>> the 1.4.7 source on the same env now; let's see the result.
> >>>>>
> >>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> >>>>>
> >>>>> Best Regards,
> >>>>> Yu
> >>>>>
> >>>>>
> >>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> I believe the test execution order matters. We run some tests in
> >>>>>> parallel. The ordering of tests is determined by readdir() results, and
> >>>>>> this differs from host to host and checkout to checkout. So when you
> >>>>>> see a repeatable group of failures, that’s great. And when someone else
> >>>>>> doesn’t see those same tests fail, or they cannot be reproduced when
> >>>>>> running by themselves, the commonly accepted term of art for this is
> >>>>>> “flaky”.
> >>>>>>
> >>>>>>
> >>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Sorry, but I'd call it a "possible environment-related problem" or
> >>>>>>> "some feature may not work well in a specific environment", rather
> >>>>>>> than flaky.
> >>>>>>>
> >>>>>>> I will check against the released 1.4.7 source package before opening
> >>>>>>> any JIRA.
> >>>>>>>
> >>>>>>> Best Regards,
> >>>>>>> Yu
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purtell@gmail.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> And if they pass in my environment, what should we call it then? I
> >>>>>>>> have no doubt you are seeing failures. Therefore can you please file
> >>>>>>>> JIRAs and attach information that can help identify a fix. Thanks.
> >>>>>>>>
> >>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2
> >>>>>>>>> option on two different envs separately, so each case failed
> >>>>>>>>> consistently 6 times in total, and from my perspective this is not
> >>>>>>>>> flaky.
> >>>>>>>>>
> >>>>>>>>> IIRC no such issue was observed last time, when verifying 1.4.7 on
> >>>>>>>>> the same env; I will double check.
> >>>>>>>>>
> >>>>>>>>> Best Regards,
> >>>>>>>>> Yu
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purtell@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> There are two failure cases, it looks like. And these look like
> >>>>>>>>>> flakes.
> >>>>>>>>>>
> >>>>>>>>>> The wrong FS assertions are not something I see when I run these
> >>>>>>>>>> tests myself. I am not able to investigate something I can’t
> >>>>>>>>>> reproduce. What I suggest is, since you can reproduce, do a git
> >>>>>>>>>> bisect to find the commit that introduced the problem. Then we can
> >>>>>>>>>> revert it. As an alternative we can open a JIRA, report the problem,
> >>>>>>>>>> temporarily @Ignore the test, and continue. This latter option
> >>>>>>>>>> should only be done if we are fairly confident it is a test-only
> >>>>>>>>>> problem.
> >>>>>>>>>>
> >>>>>>>>>> The connect exceptions are interesting. I see these sometimes when
> >>>>>>>>>> the suite is executed, not this particular case, but when the failed
> >>>>>>>>>> test is executed by itself it always passes. It is possible some
> >>>>>>>>>> change to classes related to the minicluster or startup or shutdown
> >>>>>>>>>> timing is the cause, but it is test-time flaky behavior. I’m not
> >>>>>>>>>> happy about this but it doesn’t actually fail the release because
> >>>>>>>>>> the failure is never repeatable when the test is run standalone.
> >>>>>>>>>>
> >>>>>>>>>> In general it would be great if some attention was paid to test
> >>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> >>>>>>>>>> everything is perfect or there will never be another 1.x release,
> >>>>>>>>>> certainly not from branch-1. So, tests which fail repeatedly block a
> >>>>>>>>>> release IMHO but flakes do not.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> -1
> >>>>>>>>>>>
> >>>>>>>>>>> I observed many UT failures when checking the source package (tried
> >>>>>>>>>>> multiple rounds on two different environments, macOS and Linux, and
> >>>>>>>>>>> got the same result), including (but not limited to):
> >>>>>>>>>>>
> >>>>>>>>>>> TestBulkLoad:
> >>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
> >>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
> >>>>>>>>>>> expected: hdfs://localhost:55938
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> >>>>>>>>>>>
> >>>>>>>>>>> TestStoreFile:
> >>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
> >>>>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
> >>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to
> >>>>>>>>>>> localhost:55938 failed on connection exception:
> >>>>>>>>>>> java.net.ConnectException: Connection refused; For more details see:
> >>>>>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> >>>>>>>>>>>
> >>>>>>>>>>> TestHFile:
> >>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
> >>>>>>>>>>> Time elapsed: 0.08 s  <<< ERROR!
> >>>>>>>>>>> java.net.ConnectException: Call From
> >>>>>>>>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed
> >>>>>>>>>>> on connection exception: java.net.ConnectException: Connection
> >>>>>>>>>>> refused; For more details see:
> >>>>>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
> >>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>>>>>>>
> >>>>>>>>>>> TestBlocksScanned:
> >>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
> >>>>>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
> >>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
> >>>>>>>>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
> >>>>>>>>>>> expected: file:///
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> >>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> >>>>>>>>>>>
> >>>>>>>>>>> And please let me know if there is any known issue I'm not aware of.
> >>>>>>>>>>> Thanks.
> >>>>>>>>>>>
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Yu
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> The performance report LGTM, thanks! (And sorry for the lag due to
> >>>>>>>>>>>> the Qingming Festival holiday here in China.)
> >>>>>>>>>>>>
> >>>>>>>>>>>> I'm still verifying the release, but some quick feedback: I observed
> >>>>>>>>>>>> some incompatible changes in the compatibility report, including
> >>>>>>>>>>>> HBASE-21492 and HBASE-21684, that are worth a reminder in the
> >>>>>>>>>>>> release notes.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Unrelated but worth noting: the 1.4.9 release notes URL is invalid on
> >>>>>>>>>>>> https://hbase.apache.org/downloads.html
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Yu
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurtell@apache.org> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation.
> >>>>>>>>>>>>> There are small differences in workloads D and F (slightly worse)
> >>>>>>>>>>>>> and workload E (slightly better) that do not indicate a serious
> >>>>>>>>>>>>> regression.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> >>>>>>>>>>>>> c3.8xlarge x 5
> >>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> >>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> >>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> >>>>>>>>>>>>> Hadoop 2.9.2
> >>>>>>>>>>>>> Init: Load 100 M rows and snapshot
> >>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> >>>>>>>>>>>>> Args: -threads 100 -target 50000
> >>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
> >>>>>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
> >>>>>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
> >>>>>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
> >>>>>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload A
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> >>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
> >>>>>>>>>>>>> [READ], MinLatency(us) 267 292
> >>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload B
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> >>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
> >>>>>>>>>>>>> [READ], MinLatency(us) 203 213
> >>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload C
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> >>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
> >>>>>>>>>>>>> [READ], MinLatency(us) 175 179
> >>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload D
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> >>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
> >>>>>>>>>>>>> [READ], MinLatency(us) 210 214
> >>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
> >>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
> >>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
> >>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
> >>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> >>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload E
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> >>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
> >>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
> >>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
> >>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> >>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> >>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
> >>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
> >>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
> >>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> >>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> YCSB Workload F
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
> >>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> >>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
> >>>>>>>>>>>>> [READ], MinLatency(us) 262 257
> >>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
> >>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
> >>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
> >>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> >>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> >>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> >>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> >>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> >>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
> >>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
> >>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
> >>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> >>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for the efforts boss.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Since it's a new minor release, do we have a performance comparison
> >>>>>>>>>>>>>> report against 1.4.9, as we did when releasing 1.4.0? If so, is there
> >>>>>>>>>>>>>> any reference? Many thanks!
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
> >>>>>>>>>>>>>>> download at https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
> >>>>>>>>>>>>>>> and Maven artifacts are available in the temporary repository
> >>>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3' (b0bc7225c5).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> A detailed source and binary compatibility report for this release is
> >>>>>>>>>>>>>>> available for your review at
> >>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> >>>>>>>>>>>>>>> .
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
> >>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> >>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will
> >>>>>>>>>>>>>>> try to close it Friday April 12, 2019 if we have sufficient votes.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Prior to making this announcement I made the following preflight
> >>>>>>>>>>>>>>> checks:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> RAT check passes (7u80)
> >>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
> >>>>>>>>>>>>>>> Opened the UI in a browser, poked around
> >>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
> >>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
> >>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These
> >>>>>>>>>>>>>>> flaky tests do not represent serious test failures that would prevent
> >>>>>>>>>>>>>>> a release.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>>>> Andrew
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> --
> >>>>>>>>>>>>> Best regards,
> >>>>>>>>>>>>> Andrew
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn
> from
> >>>>>>>> truth's
> >>>>>>>>>>>>> decrepit hands
> >>>>>>>>>>>>> - A23, Crosstalk
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
I also see occasional ConnectionRefused errors. They don’t reproduce when the test is run standalone, and I only see them on a Linux dev host. That may be enough to bisect and find the commit that introduced this behavior. Working on it. There is a JIRA filed for this one; search for “TestBlocksRead” with the label “branch-1”.

Thanks for the investigations. 
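
For readers unfamiliar with bisecting: below is a minimal sketch, in Python, of the binary search that a bisect performs over an ordered commit range. It is an illustration only, not HBase tooling. The commit names and the `is_bad` predicate are hypothetical stand-ins for "check out this commit and run the failing test".

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and once a commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


if __name__ == "__main__":
    # Hypothetical history in which "c5" introduced the failure.
    history = ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"]
    bad_from = history.index("c5")
    print(first_bad_commit(history, lambda c: history.index(c) >= bad_from))
```

For an intermittent failure, `is_bad` would presumably need to run the test several times before declaring a commit good, since a single passing run proves little.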

> On Apr 12, 2019, at 6:36 AM, Yu Li <ca...@gmail.com> wrote:
> 
> Quick updates:
> 
> With the HBASE-22219 patch applied (that is, upgrading the kerby version to
> 1.0.1), the failures listed above in the 1st part of hbase-server disappeared.
> 
> However, in the 2nd part of the hbase-server UTs there are still many
> ConnectionRefused exceptions (17 errors in total), as shown below. They can
> be reproduced easily with the -Dtest=xxx option in my environments; I am
> still checking the root cause.
> 
> [INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
> [ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
> [ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)  Time elapsed: 0.17 s  <<< ERROR!
> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> Caused by: java.net.ConnectException: Connection refused
>        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
>        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
> 
> Best Regards,
> Yu
> 
> 
>> On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:
>> 
>> I have no doubt that you ran the tests locally before announcing the
>> release, as you're always a great RM boss. And this shows one value of
>> verifying a release: different voters have different environments.
>> 
>> Now I think the failures may be Kerberos related, since I may have
>> changed some system configuration when doing Flink testing on this env
>> weeks ago. I located one issue (HBASE-22219), which is also observed in
>> 1.4.7, and will investigate further.
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <an...@gmail.com>
>> wrote:
>> 
>>> “However it's good to find the issue earlier if there
>>> really is any, before release announced.”
>>> 
>>> I run the complete unit test suite before announcing a release candidate.
>>> Just to be clear.
>>> 
>>> Totally agree we should get these problems sorted before an actual
>>> release. My policy is to cancel a RC if anyone vetoes for this reason...
>>> want as much coverage and varying environments as we can manage.
>>> 
>>> Thank you for your help so far and I hope the failures you see result in
>>> analysis and fixes that lead to better test stability.
>>> 
>>>> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
>>>> 
>>>> Confirmed in 1.4.7 source the listed out cases passed (all in the 1st
>>> part
>>>> of hbase-server so the result comes out quickly.)... Also confirmed the
>>>> test ran order are the same...
>>>> 
>>>> Will try 1.5.0 again to prevent the environment difference caused by
>>> time.
>>>> If 1.5.0 still fails, will start to do the git bisect to locate the
>>> first
>>>> bad commit.
>>>> 
>>>> Was also expecting an easy pass and +1 as always to save time and
>>> efforts,
>>>> but obvious no luck. However it's good to find the issue earlier if
>>> there
>>>> really is any, before release announced.
>>>> 
>>>> Best Regards,
>>>> Yu
>>>> 
>>>> 
>>>>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
>>>>> 
>>>>> Fine, let's focus on verifying whether it's a real problem rather than
>>>>> arguing about wording, after all that's not my intention...
>>>>> 
>>>>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
>>> was
>>>>> using the same env and all tests passed w/o issue, that's where my
>>> concern
>>>>> lies and the main reason I gave a -1 vote. I'm running against 1.4.7
>>> source
>>>>> on the same now and let's see the result.
>>>>> 
>>>>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>>>>> 
>>>>> Best Regards,
>>>>> Yu
>>>>> 
>>>>> 
>>>>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com
>>>> 
>>>>> wrote:
>>>>> 
>>>>>> I believe the test execution order matters. We run some tests in
>>>>>> parallel. The ordering of tests is determined by readdir() results
>>> and this
>>>>>> differs from host to host and checkout to checkout. So when you see a
>>>>>> repeatable group of failures, that’s great. And when someone else
>>> doesn’t
>>>>>> see those same tests fail, or they cannot be reproduced when running
>>> by
>>>>>> themselves, the commonly accepted term of art for this is “flaky”.
>>>>>> 
>>>>>> 
>>>>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>> 
>>>>>>> Sorry but I'd call it "possible environment related problem" or "some
>>>>>>> feature may not work well in specific environment", rather than a
>>> flaky.
>>>>>>> 
>>>>>>> Will check against 1.4.7 released source package before opening any
>>>>>> JIRA.
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
>>> andrew.purtell@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> And if they pass in my environment , then what should we call it
>>> then.
>>>>>> I
>>>>>>>> have no doubt you are seeing failures. Therefore can you please file
>>>>>> JIRAs
>>>>>>>> and attach information that can help identify a fix. Thanks.
>>>>>>>> 
>>>>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2
>>>>>> option
>>>>>>>>> and on two different env separately, so it sums up to 6 times
>>> stable
>>>>>>>>> failure for each case, and from my perspective this is not flaky.
>>>>>>>>> 
>>>>>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue
>>>>>>>> observed,
>>>>>>>>> will double check.
>>>>>>>>> 
>>>>>>>>> Best Regards,
>>>>>>>>> Yu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>>>>>> andrew.purtell@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> There are two failure cases it looks like. And this looks like
>>>>>> flakes.
>>>>>>>>>> 
>>>>>>>>>> The wrong FS assertions are not something I see when I run these
>>>>>> tests
>>>>>>>>>> myself. I am not able to investigate something I can’t reproduce.
>>>>>> What I
>>>>>>>>>> suggest is since you can reproduce do a git bisect to find the
>>> commit
>>>>>>>> that
>>>>>>>>>> introduced the problem. Then we can revert it. As an alternative
>>> we
>>>>>> can
>>>>>>>>>> open a JIRA, report the problem, temporarily @ignore the test, and
>>>>>>>>>> continue. This latter option only should be done if we are fairly
>>>>>>>> confident
>>>>>>>>>> it is a test only problem.
>>>>>>>>>> 
>>>>>>>>>> The connect exceptions are interesting. I see these sometimes when
>>>>>> the
>>>>>>>>>> suite is executed, not this particular case, but when the failed
>>>>>> test is
>>>>>>>>>> executed by itself it always passes. It is possible some change to
>>>>>>>> classes
>>>>>>>>>> related to the minicluster or startup or shutdown timing are the
>>>>>> cause,
>>>>>>>> but
>>>>>>>>>> it is test time flaky behavior. I’m not happy about this but it
>>>>>> doesn’t
>>>>>>>>>> actually fail the release because the failure is never repeatable
>>>>>> when
>>>>>>>> the
>>>>>>>>>> test is run standalone.
>>>>>>>>>> 
>>>>>>>>>> In general it would be great if some attention was paid to test
>>>>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist
>>> that
>>>>>>>>>> everything is perfect or there will never be another 1.x release,
>>>>>>>> certainly
>>>>>>>>>> not from branch-1. So, tests which fail repeatedly block a release
>>>>>> IMHO
>>>>>>>> but
>>>>>>>>>> flakes do not.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> -1
>>>>>>>>>>> 
>>>>>>>>>>> Observed many UT failures when checking the source package (tried
>>>>>>>>>> multiple
>>>>>>>>>>> rounds on two different environments, MacOs and Linux, got the
>>> same
>>>>>>>>>>> result), including (but not limited to):
>>>>>>>>>>> 
>>>>>>>>>>> TestBulkLoad:
>>>>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>>>>>>>>> 
>>>>>>>>>>> TestStoreFile:
>>>>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>>>>>>>>> 
>>>>>>>>>>> TestHFile:
>>>>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
>>>>>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>>>>    at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>>>>> 
>>>>>>>>>>> TestBlocksScanned:
>>>>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
>>>>>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>>>>>>>>>    at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>>>>>>>>> 
>>>>>>>>>>> And please let me know if any known issue I'm not aware of.
>>> Thanks.
>>>>>>>>>>> 
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Yu
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due
>>> to
>>>>>>>>>>>> Qingming Festival Holiday here in China)
>>>>>>>>>>>> 
>>>>>>>>>>>> Still verifying the release, just some quick feedback: observed
>>>>>> some
>>>>>>>>>>>> incompatible changes in compatibility report including
>>>>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>>>>>>>>>>>> 
>>>>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is
>>> invalid on
>>>>>>>>>>>> https://hbase.apache.org/downloads.html
>>>>>>>>>>>> 
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Yu
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <
>>> apurtell@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The difference is basically noise per the usual YCSB
>>> evaluation.
>>>>>>>> Small
>>>>>>>>>>>>> differences in workloads D and F (slightly worse) and workload
>>> E
>>>>>>>>>> (slightly
>>>>>>>>>>>>> better) that do not indicate serious regression.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>>>>>>>>>> c3.8xlarge x 5
>>>>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>>>>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>>>>>>>> Hadoop 2.9.2
>>>>>>>>>>>>> Init: Load 100 M rows and snapshot
>>>>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M
>>>>>>>>>> operations
>>>>>>>>>>>>> Args: -threads 100 -target 50000
>>>>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS =>
>>> '1',
>>>>>>>>>> IN_MEMORY
>>>>>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING
>>> =>
>>>>>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY',
>>>>>>>>>> MIN_VERSIONS =>
>>>>>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536',
>>>>>> REPLICATION_SCOPE =>
>>>>>>>>>>>>> '0'}
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload A
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>>>>>>>>>>>>> [READ], AverageLatency(us) 544 559
>>>>>>>>>>>>> [READ], MinLatency(us) 267 292
>>>>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>>>>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload B
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>>>>>>>>>>>>> [READ], AverageLatency(us) 454 471
>>>>>>>>>>>>> [READ], MinLatency(us) 203 213
>>>>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>>>>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload C
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>>>>>>>>>>>>> [READ], AverageLatency(us) 332 327
>>>>>>>>>>>>> [READ], MinLatency(us) 175 179
>>>>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload D
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>>>>>>>>>>>>> [READ], AverageLatency(us) 487 547
>>>>>>>>>>>>> [READ], MinLatency(us) 210 214
>>>>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>>>>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>>>>>>>>>>>>> [INSERT], MinLatency(us) 807 788
>>>>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload E
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>>>>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>>>>>>>>>>>>> [SCAN], MinLatency(us) 696 678
>>>>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>>>>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>>>>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>>>>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>>>>>>>>>>>>> [INSERT], MinLatency(us) 887 815
>>>>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>>>>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>>>>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>>>>>>>>>>>>> 
>>>>>>>>>>>>> YCSB Workload F
>>>>>>>>>>>>> 
>>>>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>>>>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>>>>>>>>>>>>> [READ], AverageLatency(us) 856 1137
>>>>>>>>>>>>> [READ], MinLatency(us) 262 257
>>>>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>>>>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>>>>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>>>>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>>>>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>>>>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>>>>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>>>>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>>>>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>>>>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>>>>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>>>>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>>>>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com>
>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for the efforts boss.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Since it's a new minor release, do we have performance
>>> comparison
>>>>>>>>>> report
>>>>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any
>>> reference?
>>>>>>>> Many
>>>>>>>>>>>>>> thanks!
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Yu
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <
>>> apurtell@apache.org
>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available
>>> for
>>>>>>>>>>>>> download
>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
>>>>>> and
>>>>>>>>>>>>> Maven
>>>>>>>>>>>>>>> artifacts are available in the temporary repository
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>> 
>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
>>>>>>>>>> (b0bc7225c5).
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> A detailed source and binary compatibility report for this
>>>>>> release
>>>>>>>> is
>>>>>>>>>>>>>>> available for your review at
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> A list of the 115 issues resolved in this release can be
>>> found
>>>>>> at
>>>>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived
>>> from
>>>>>>>> the
>>>>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless
>>> objection I
>>>>>>>> will
>>>>>>>>>>>>> try
>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Prior to making this announcement I made the following
>>> preflight
>>>>>>>>>>>>> checks:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> RAT check passes (7u80)
>>>>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>>>>>>>>>>>>>>> Opened the UI in a browser, poked around
>>>>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates
>>>>>> (8u181)
>>>>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
>>>>>> These
>>>>>>>>>>>>> flaky
>>>>>>>>>>>>>>> tests do not represent serious test failures that would
>>> prevent
>>>>>> a
>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>> Andrew
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>>>>>>>> truth's
>>>>>>>>>>>>> decrepit hands
>>>>>>>>>>>>> - A23, Crosstalk
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>> 
>>> 
>> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Quick updates:

With the HBASE-22219 patch applied (that is, upgrading the kerby version to
1.0.1), the failures listed above in the 1st part of hbase-server disappeared.

However, in the 2nd part of the hbase-server UTs there are still many
ConnectionRefused exceptions (17 errors in total), as shown below. They can
be reproduced easily with the -Dtest=xxx option in my environments; I am
still checking the root cause.

[INFO] Running org.apache.hadoop.hbase.regionserver.TestBlocksRead
[ERROR] Tests run: 4, Failures: 0, Errors: 4, Skipped: 0, Time elapsed: 0.853 s <<< FAILURE! - in org.apache.hadoop.hbase.regionserver.TestBlocksRead
[ERROR] testBlocksStoredWhenCachingDisabled(org.apache.hadoop.hbase.regionserver.TestBlocksRead)  Time elapsed: 0.17 s  <<< ERROR!
java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35669 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
Caused by: java.net.ConnectException: Connection refused
        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.initHRegion(TestBlocksRead.java:112)
        at org.apache.hadoop.hbase.regionserver.TestBlocksRead.testBlocksStoredWhenCachingDisabled(TestBlocksRead.java:389)
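
As background on the exception itself: "Connection refused" is what a TCP client gets when nothing is listening on the target port. That would be consistent with a test contacting a local service port that is not up, or is no longer up, although that cause is not confirmed here. Below is a minimal Python sketch of the mechanism, not HBase code; the helper name `is_port_open` is made up for illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except (ConnectionRefusedError, socket.timeout):
        return False


if __name__ == "__main__":
    # Start a listener on an ephemeral port, then shut it down.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    print("while listening:", is_port_open("127.0.0.1", port))
    server.close()
    # With no listener left, the same connect attempt is refused.
    print("after close:", is_port_open("127.0.0.1", port))
```

A check like this, run against the port named in the trace while the test is executing, might help tell "the service never started" apart from "the service was already shut down".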

Best Regards,
Yu


On Fri, 12 Apr 2019 at 13:11, Yu Li <ca...@gmail.com> wrote:

> I have no doubt that you ran the tests locally before announcing the
> release, as you're always a great RM boss. And this shows one value of
> verifying a release: different voters have different environments.
>
> Now I think the failures may be Kerberos related, since I may have
> changed some system configuration when doing Flink testing on this env
> weeks ago. I located one issue (HBASE-22219), which is also observed in
> 1.4.7, and will investigate further.
>
> Best Regards,
> Yu
>
>
> On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <an...@gmail.com>
> wrote:
>
>> “However it's good to find the issue earlier if there
>> really is any, before release announced.”
>>
>> I run the complete unit test suite before announcing a release candidate.
>> Just to be clear.
>>
>> Totally agree we should get these problems sorted before an actual
>> release. My policy is to cancel a RC if anyone vetoes for this reason...
>> want as much coverage and varying environments as we can manage.
>>
>> Thank you for your help so far and I hope the failures you see result in
>> analysis and fixes that lead to better test stability.
>>
>> > On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
>> >
>> > Confirmed in 1.4.7 source the listed out cases passed (all in the 1st
>> part
>> > of hbase-server so the result comes out quickly.)... Also confirmed the
>> > test ran order are the same...
>> >
>> > Will try 1.5.0 again to prevent the environment difference caused by
>> time.
>> > If 1.5.0 still fails, will start to do the git bisect to locate the
>> first
>> > bad commit.
>> >
>> > Was also expecting an easy pass and +1 as always to save time and
>> efforts,
>> > but obvious no luck. However it's good to find the issue earlier if
>> there
>> > really is any, before release announced.
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> >> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
>> >>
>> >> Fine, let's focus on verifying whether it's a real problem rather than
>> >> arguing about wording, after all that's not my intention...
>> >>
>> >> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I
>> was
>> >> using the same env and all tests passed w/o issue, that's where my
>> concern
>> >> lies and the main reason I gave a -1 vote. I'm running against 1.4.7
>> source
>> >> on the same now and let's see the result.
>> >>
>> >> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>> >>
>> >> Best Regards,
>> >> Yu
>> >>
>> >>
>> >> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <andrew.purtell@gmail.com
>> >
>> >> wrote:
>> >>
>> >>> I believe the test execution order matters. We run some tests in
>> >>> parallel. The ordering of tests is determined by readdir() results
>> and this
>> >>> differs from host to host and checkout to checkout. So when you see a
>> >>> repeatable group of failures, that’s great. And when someone else
>> doesn’t
>> >>> see those same tests fail, or they cannot be reproduced when running
>> by
>> >>> themselves, the commonly accepted term of art for this is “flaky”.
>> >>>
>> >>>
>> >>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>
>> >>>> Sorry but I'd call it "possible environment related problem" or "some
>> >>>> feature may not work well in specific environment", rather than a
>> flaky.
>> >>>>
>> >>>> Will check against 1.4.7 released source package before opening any
>> >>> JIRA.
>> >>>>
>> >>>> Best Regards,
>> >>>> Yu
>> >>>>
>> >>>>
>> >>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <
>> andrew.purtell@gmail.com>
>> >>>> wrote:
>> >>>>
>> >>>>> And if they pass in my environment , then what should we call it
>> then.
>> >>> I
>> >>>>> have no doubt you are seeing failures. Therefore can you please file
>> >>> JIRAs
>> >>>>> and attach information that can help identify a fix. Thanks.
>> >>>>>
>> >>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2
>> >>> option
>> >>>>>> and on two different env separately, so it sums up to 6 times
>> stable
>> >>>>>> failure for each case, and from my perspective this is not flaky.
>> >>>>>>
>> >>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue
>> >>>>> observed,
>> >>>>>> will double check.
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Yu
>> >>>>>>
>> >>>>>>
>> >>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>> >>> andrew.purtell@gmail.com>
>> >>>>>> wrote:
>> >>>>>>
>> >>>>>>> There are two failure cases it looks like. And this looks like
>> >>> flakes.
>> >>>>>>>
>> >>>>>>> The wrong FS assertions are not something I see when I run these
>> >>> tests
>> >>>>>>> myself. I am not able to investigate something I can’t reproduce.
>> >>> What I
>> >>>>>>> suggest is since you can reproduce do a git bisect to find the
>> commit
>> >>>>> that
>> >>>>>>> introduced the problem. Then we can revert it. As an alternative
>> we
>> >>> can
>> >>>>>>> open a JIRA, report the problem, temporarily @ignore the test, and
>> >>>>>>> continue. This latter option only should be done if we are fairly
>> >>>>> confident
>> >>>>>>> it is a test only problem.
>> >>>>>>>
>> >>>>>>> The connect exceptions are interesting. I see these sometimes when
>> >>> the
>> >>>>>>> suite is executed, not this particular case, but when the failed
>> >>> test is
>> >>>>>>> executed by itself it always passes. It is possible some change to
>> >>>>> classes
>> >>>>>>> related to the minicluster or startup or shutdown timing are the
>> >>> cause,
>> >>>>> but
>> >>>>>>> it is test time flaky behavior. I’m not happy about this but it
>> >>> doesn’t
>> >>>>>>> actually fail the release because the failure is never repeatable
>> >>> when
>> >>>>> the
>> >>>>>>> test is run standalone.
>> >>>>>>>
>> >>>>>>> In general it would be great if some attention was paid to test
>> >>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist
>> that
>> >>>>>>> everything is perfect or there will never be another 1.x release,
>> >>>>> certainly
>> >>>>>>> not from branch-1. So, tests which fail repeatedly block a release
>> >>> IMHO
>> >>>>> but
>> >>>>>>> flakes do not.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> -1
>> >>>>>>>>
>> >>>>>>>> Observed many UT failures when checking the source package (tried
>> >>>>>>>> multiple rounds on two different environments, MacOs and Linux, got
>> >>>>>>>> the same result), including (but not limited to):
>> >>>>>>>>
>> >>>>>>>> TestBulkload:
>> >>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>> >>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>> >>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>> >>>>>>>> expected: hdfs://localhost:55938
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>> >>>>>>>>
>> >>>>>>>> TestStoreFile:
>> >>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>> >>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>> >>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
>> >>>>>>>> failed on connection exception: java.net.ConnectException: Connection
>> >>>>>>>> refused; For more details see:
>> >>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>> >>>>>>>>
>> >>>>>>>> TestHFile:
>> >>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed:
>> >>>>>>>> 0.08 s  <<< ERROR!
>> >>>>>>>> java.net.ConnectException: Call From
>> >>>>>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>> >>>>>>>> connection exception: java.net.ConnectException: Connection refused; For
>> >>>>>>>> more details see: http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>>>> Caused by: java.net.ConnectException: Connection refused
>> >>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>>>>
>> >>>>>>>> TestBlocksScanned:
>> >>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>> >>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
>> >>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>> >>>>>>>> expected: file:///
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>> >>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>> >>>>>>>>
>> >>>>>>>> And please let me know if any known issue I'm not aware of. Thanks.
>> >>>>>>>>
>> >>>>>>>> Best Regards,
>> >>>>>>>> Yu
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>
>> >>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
>> >>>>>>>>> Qingming Festival Holiday here in China)
>> >>>>>>>>>
>> >>>>>>>>> Still verifying the release, just some quick feedback: observed some
>> >>>>>>>>> incompatible changes in compatibility report including
>> >>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>> >>>>>>>>>
>> >>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
>> >>>>>>>>> https://hbase.apache.org/downloads.html
>> >>>>>>>>>
>> >>>>>>>>> Best Regards,
>> >>>>>>>>> Yu
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <apurtell@apache.org>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
>> >>>>>>>>>> differences in workloads D and F (slightly worse) and workload E (slightly
>> >>>>>>>>>> better) that do not indicate serious regression.
>> >>>>>>>>>>
>> >>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>> >>>>>>>>>> c3.8xlarge x 5
>> >>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>> >>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>> >>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>> >>>>>>>>>> Hadoop 2.9.2
>> >>>>>>>>>> Init: Load 100 M rows and snapshot
>> >>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>> >>>>>>>>>> Args: -threads 100 -target 50000
>> >>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
>> >>>>>>>>>> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
>> >>>>>>>>>> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
>> >>>>>>>>>> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
>> >>>>>>>>>> '0'}
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload A
>> >>>>>>>>>>
>> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>> >>>>>>>>>> [READ], AverageLatency(us) 544 559
>> >>>>>>>>>> [READ], MinLatency(us) 267 292
>> >>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>> >>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>> >>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>> >>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>> >>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload B
>> >>>>>>>>>>
>> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>> >>>>>>>>>> [READ], AverageLatency(us) 454 471
>> >>>>>>>>>> [READ], MinLatency(us) 203 213
>> >>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>> >>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>> >>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>> >>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>> >>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload C
>> >>>>>>>>>>
>> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>> >>>>>>>>>> [READ], AverageLatency(us) 332 327
>> >>>>>>>>>> [READ], MinLatency(us) 175 179
>> >>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>> >>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>> >>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload D
>> >>>>>>>>>>
>> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>> >>>>>>>>>> [READ], AverageLatency(us) 487 547
>> >>>>>>>>>> [READ], MinLatency(us) 210 214
>> >>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>> >>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>> >>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>> >>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>> >>>>>>>>>> [INSERT], MinLatency(us) 807 788
>> >>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>> >>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>> >>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload E
>> >>>>>>>>>>
>> >>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>> >>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>> >>>>>>>>>> [SCAN], MinLatency(us) 696 678
>> >>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>> >>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>> >>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>> >>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>> >>>>>>>>>> [INSERT], MinLatency(us) 887 815
>> >>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>> >>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>> >>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>> >>>>>>>>>>
>> >>>>>>>>>> YCSB Workload F
>> >>>>>>>>>>
>> >>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>> >>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>> >>>>>>>>>> [READ], AverageLatency(us) 856 1137
>> >>>>>>>>>> [READ], MinLatency(us) 262 257
>> >>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>> >>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>> >>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>> >>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>> >>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>> >>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>> >>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>> >>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>> >>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>> >>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>> >>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>> >>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>> >>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>> Thanks for the efforts boss.
>> >>>>>>>>>>>
>> >>>>>>>>>>> Since it's a new minor release, do we have performance comparison report
>> >>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
>> >>>>>>>>>>> thanks!
>> >>>>>>>>>>>
>> >>>>>>>>>>> Best Regards,
>> >>>>>>>>>>> Yu
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
>> >>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
>> >>>>>>>>>>>> artifacts are available in the temporary repository
>> >>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’ (b0bc7225c5).
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> A detailed source and binary compatibility report for this release is
>> >>>>>>>>>>>> available for your review at
>> >>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>> >>>>>>>>>>>> .
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
>> >>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
>> >>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I will try to
>> >>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Prior to making this announcement I made the following preflight checks:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> RAT check passes (7u80)
>> >>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>> >>>>>>>>>>>> Opened the UI in a browser, poked around
>> >>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates (8u181)
>> >>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>> >>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
>> >>>>>>>>>>>> tests do not represent serious test failures that would prevent a release.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> --
>> >>>>>>>>>>>> Best regards,
>> >>>>>>>>>>>> Andrew
>> >>>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> --
>> >>>>>>>>>> Best regards,
>> >>>>>>>>>> Andrew
>> >>>>>>>>>>
>> >>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>> >>>>> truth's
>> >>>>>>>>>> decrepit hands
>> >>>>>>>>>> - A23, Crosstalk
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>
>> >>>>>
>> >>>
>> >>
>>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
I have no doubt that you ran the tests locally before announcing a
release, as you're always a great RM, boss. And this shows one value of
verifying a release: different voters have different environments.

Now I think the failures may be Kerberos related, since I may have
changed some system configuration when doing Flink testing on this env
weeks ago. Located one issue (HBASE-22219), which is also observed in
1.4.7; will investigate further.
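
For readers following the "Wrong FS" errors reported in this thread, here is a minimal sketch of the kind of scheme/authority comparison that produces such a message when the filesystem a test is configured with disagrees with the path it is handed. This is not Hadoop's code: check_path is a hypothetical helper, the paths are shortened stand-ins, and the logic is deliberately simplified (real filesystem clients also handle default ports and default-filesystem fallbacks).

```python
from urllib.parse import urlparse


def check_path(fs_uri: str, path: str) -> None:
    """Raise ValueError when `path` names a different filesystem than `fs_uri`.

    Hypothetical helper for illustration only; simplified relative to Hadoop.
    """
    fs = urlparse(fs_uri)
    target = urlparse(path)
    if not target.scheme:
        return  # no scheme: the path is resolved against this filesystem
    same_scheme = target.scheme.lower() == fs.scheme.lower()
    # An empty authority on the path is accepted here (a simplification).
    same_authority = not target.netloc or target.netloc.lower() == fs.netloc.lower()
    if same_scheme and same_authority:
        return
    raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


# Mirrors the two shapes of mismatch reported in the thread (paths shortened).
for fs_uri, path in [
    ("hdfs://localhost:55938", "file:/tmp/junit/data/default/t1"),
    ("file:///", "hdfs://localhost:35529/tmp/hbase/data/default/t2"),
]:
    try:
        check_path(fs_uri, path)
    except ValueError as err:
        print(err)
```

Under these assumptions, a mismatch in either direction (local path against an HDFS filesystem, or HDFS path against the local filesystem) points at which default filesystem the test picked up, which is consistent with an environment or configuration difference rather than a code change.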

Best Regards,
Yu


On Fri, 12 Apr 2019 at 12:38, Andrew Purtell <an...@gmail.com>
wrote:

> “However it's good to find the issue earlier if there
> really is any, before release announced.”
>
> I run the complete unit test suite before announcing a release candidate.
> Just to be clear.
>
> Totally agree we should get these problems sorted before an actual
> release. My policy is to cancel a RC if anyone vetoes for this reason...
> want as much coverage and varying environments as we can manage.
>
> Thank you for your help so far and I hope the failures you see result in
> analysis and fixes that lead to better test stability.
>
> > On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> >
> > Confirmed in 1.4.7 source the listed out cases passed (all in the 1st part
> > of hbase-server so the result comes out quickly.)... Also confirmed the
> > test ran order are the same...
> >
> > Will try 1.5.0 again to prevent the environment difference caused by time.
> > If 1.5.0 still fails, will start to do the git bisect to locate the first
> > bad commit.
> >
> > Was also expecting an easy pass and +1 as always to save time and efforts,
> > but obvious no luck. However it's good to find the issue earlier if there
> > really is any, before release announced.
> >
> > Best Regards,
> > Yu
> >
> >
> >> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
> >>
> >> Fine, let's focus on verifying whether it's a real problem rather than
> >> arguing about wording, after all that's not my intention...
> >>
> >> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> >> using the same env and all tests passed w/o issue, that's where my concern
> >> lies and the main reason I gave a -1 vote. I'm running against 1.4.7 source
> >> on the same now and let's see the result.
> >>
> >> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
> >> wrote:
> >>
> >>> I believe the test execution order matters. We run some tests in
> >>> parallel. The ordering of tests is determined by readdir() results and this
> >>> differs from host to host and checkout to checkout. So when you see a
> >>> repeatable group of failures, that’s great. And when someone else doesn’t
> >>> see those same tests fail, or they cannot be reproduced when running by
> >>> themselves, the commonly accepted term of art for this is “flaky”.
> >>>
> >>>
> >>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>
> >>>> Sorry but I'd call it "possible environment related problem" or "some
> >>>> feature may not work well in specific environment", rather than a flaky.
> >>>>
> >>>> Will check against 1.4.7 released source package before opening any JIRA.
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <andrew.purtell@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> And if they pass in my environment, then what should we call it then. I
> >>>>> have no doubt you are seeing failures. Therefore can you please file JIRAs
> >>>>> and attach information that can help identify a fix. Thanks.
> >>>>>
> >>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>>
> >>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> >>>>>> and on two different env separately, so it sums up to 6 times stable
> >>>>>> failure for each case, and from my perspective this is not flaky.
> >>>>>>
> >>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue observed,
> >>>>>> will double check.
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Yu
> >>>>>>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
“However it's good to find the issue earlier if there
really is any, before release announced.”

Just to be clear, I run the complete unit test suite before announcing a release candidate.

Totally agree we should get these problems sorted before an actual release. My policy is to cancel an RC if anyone vetoes for this reason; I want as much coverage across varying environments as we can manage.

Thank you for your help so far. I hope the failures you see result in analysis and fixes that lead to better test stability.
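
On the ordering argument quoted further down (test order follows readdir() results and differs from host to host and checkout to checkout), here is a minimal Java sketch of the effect. It is an illustration under assumptions, not HBase or Surefire code: the class TestOrderSketch and its method names are hypothetical, and the file names are only the test classes mentioned in this thread. java.io.File.list() documents no ordering guarantee, so the first listing may differ between machines, while the sorted listing does not.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not taken from the HBase build.
class TestOrderSketch {

    /** Test class files in whatever order the filesystem happens to report them. */
    static List<String> discoveryOrder(File dir) {
        String[] names = dir.list((d, n) -> n.startsWith("Test") && n.endsWith(".class"));
        return names == null ? new ArrayList<>() : new ArrayList<>(Arrays.asList(names));
    }

    /** Same files, sorted by name so the order is identical on every host. */
    static List<String> stableOrder(File dir) {
        List<String> names = discoveryOrder(dir);
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("order-sketch");
        for (String n : new String[] {"TestStoreFile.class", "TestBulkLoad.class", "TestHFile.class"}) {
            Files.createFile(dir.resolve(n));
        }
        // The first line may vary by filesystem; the second should not.
        System.out.println("as listed: " + discoveryOrder(dir.toFile()));
        System.out.println("sorted:    " + stableOrder(dir.toFile()));
    }
}
```

If tests share state such as a minicluster or a default filesystem setting, a different listing order can change which test runs first and therefore which ones fail, which is one way the same checkout can behave differently in two environments.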

> On Apr 11, 2019, at 9:32 PM, Yu Li <ca...@gmail.com> wrote:
> 
> Confirmed in 1.4.7 source the listed out cases passed (all in the 1st part
> of hbase-server so the result comes out quickly.)... Also confirmed the
> test ran order are the same...
> 
> Will try 1.5.0 again to prevent the environment difference caused by time.
> If 1.5.0 still fails, will start to do the git bisect to locate the first
> bad commit.
> 
> Was also expecting an easy pass and +1 as always to save time and efforts,
> but obvious no luck. However it's good to find the issue earlier if there
> really is any, before release announced.
> 
> Best Regards,
> Yu
> 
> 
>> On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:
>> 
>> Fine, let's focus on verifying whether it's a real problem rather than
>> arguing about wording, after all that's not my intention...
>> 
>> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
>> using the same env and all tests passed w/o issue, that's where my concern
>> lies and the main reason I gave a -1 vote. I'm running against 1.4.7 source
>> on the same now and let's see the result.
>> 
>> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>> 
>> Best Regards,
>> Yu
>> 
>> 
>> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
>> wrote:
>> 
>>> I believe the test execution order matters. We run some tests in
>>> parallel. The ordering of tests is determined by readdir() results and this
>>> differs from host to host and checkout to checkout. So when you see a
>>> repeatable group of failures, that’s great. And when someone else doesn’t
>>> see those same tests fail, or they cannot be reproduced when running by
>>> themselves, the commonly accepted term of art for this is “flaky”.
>>> 
>>> 
>>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>>>> 
>>>> Sorry but I'd call it "possible environment related problem" or "some
>>>> feature may not work well in specific environment", rather than a flaky.
>>>> 
>>>> Will check against 1.4.7 released source package before opening any
>>> JIRA.
>>>> 
>>>> Best Regards,
>>>> Yu
>>>> 
>>>> 
>>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
>>>> wrote:
>>>> 
>>>>> And if they pass in my environment, then what should we call it then.
>>>>> I have no doubt you are seeing failures. Therefore can you please file
>>>>> JIRAs and attach information that can help identify a fix. Thanks.
>>>>> 
>>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>> 
>>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>>>>>> and on two different env separately, so it sums up to 6 times stable
>>>>>> failure for each case, and from my perspective this is not flaky.
>>>>>> 
>>>>>> IIRC last time when verifying 1.4.7 on the same env no such issue
>>>>>> observed, will double check.
>>>>>> 
>>>>>> Best Regards,
>>>>>> Yu
>>>>>> 
>>>>>> 
>>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>>> andrew.purtell@gmail.com>
>>>>>> wrote:
>>>>>> 
>>>>>>> There are two failure cases it looks like. And this looks like flakes.
>>>>>>> 
>>>>>>> The wrong FS assertions are not something I see when I run these tests
>>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
>>>>>>> suggest is since you can reproduce do a git bisect to find the commit
>>>>>>> that introduced the problem. Then we can revert it. As an alternative we
>>>>>>> can open a JIRA, report the problem, temporarily @ignore the test, and
>>>>>>> continue. This latter option only should be done if we are fairly
>>>>>>> confident it is a test only problem.
>>>>>>> 
>>>>>>> The connect exceptions are interesting. I see these sometimes when the
>>>>>>> suite is executed, not this particular case, but when the failed test is
>>>>>>> executed by itself it always passes. It is possible some change to
>>>>>>> classes related to the minicluster or startup or shutdown timing are the
>>>>>>> cause, but it is test time flaky behavior. I’m not happy about this but
>>>>>>> it doesn’t actually fail the release because the failure is never
>>>>>>> repeatable when the test is run standalone.
>>>>>>> 
>>>>>>> In general it would be great if some attention was paid to test
>>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>>>>>>> everything is perfect or there will never be another 1.x release,
>>>>>>> certainly not from branch-1. So, tests which fail repeatedly block a
>>>>>>> release IMHO but flakes do not.
>>>>>>> 
>>>>>>> 
>>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> -1
>>>>>>>> 
>>>>>>>> Observed many UT failures when checking the source package (tried
>>>>>>>> multiple rounds on two different environments, MacOs and Linux, got the
>>>>>>>> same result), including (but not limited to):
>>>>>>>> 
>>>>>>>> TestBulkload:
>>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>>>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>>>>>>>> expected: hdfs://localhost:55938
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>>>>>> 
>>>>>>>> TestStoreFile:
>>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>>>>>>>> Time elapsed: 0.083 s  <<< ERROR!
>>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
>>>>>>>> failed on connection exception: java.net.ConnectException: Connection
>>>>>>>> refused; For more details see:
>>>>>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>>>>>> 
>>>>>>>> TestHFile:
>>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
>>>>>>>> Time elapsed: 0.08 s  <<< ERROR!
>>>>>>>> java.net.ConnectException: Call From
>>>>>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>>>>>>>> connection exception: java.net.ConnectException: Connection refused;
>>>>>>>> For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>>> 
>>>>>>>> TestBlocksScanned:
>>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>>>>>>>> Time elapsed: 0.069 s  <<< ERROR!
>>>>>>>> java.lang.IllegalArgumentException: Wrong FS:
>>>>>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>>>>>>>> expected: file:///
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>>>>>> 
>>>>>>>> And please let me know if any known issue I'm not aware of. Thanks.
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Yu
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
>>>>>>>>> Qingming Festival Holiday here in China)
>>>>>>>>> 
>>>>>>>>> Still verifying the release, just some quick feedback: observed
>>> some
>>>>>>>>> incompatible changes in compatibility report including
>>>>>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>>>>>>>>> 
>>>>>>>>> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
>>>>>>>>> https://hbase.apache.org/downloads.html
>>>>>>>>> 
>>>>>>>>> Best Regards,
>>>>>>>>> Yu
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org>
>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
>>>>>>>>>> differences in workloads D and F (slightly worse) and workload E
>>>>>>>>>> (slightly better) that do not indicate serious regression.
>>>>>>>>>> 
>>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>>>>>>> c3.8xlarge x 5
>>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>>>>> Hadoop 2.9.2
>>>>>>>>>> Init: Load 100 M rows and snapshot
>>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>>>>>>>>>> Args: -threads 100 -target 50000
>>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload A
>>>>>>>>>> 
>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>>>>>>>>>> [READ], AverageLatency(us) 544 559
>>>>>>>>>> [READ], MinLatency(us) 267 292
>>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload B
>>>>>>>>>> 
>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>>>>>>>>>> [READ], AverageLatency(us) 454 471
>>>>>>>>>> [READ], MinLatency(us) 203 213
>>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload C
>>>>>>>>>> 
>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>>>>>>>>>> [READ], AverageLatency(us) 332 327
>>>>>>>>>> [READ], MinLatency(us) 175 179
>>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload D
>>>>>>>>>> 
>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>>>>>>>>>> [READ], AverageLatency(us) 487 547
>>>>>>>>>> [READ], MinLatency(us) 210 214
>>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>>>>>>>>>> [INSERT], MinLatency(us) 807 788
>>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload E
>>>>>>>>>> 
>>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>>>>>>>>>> [SCAN], MinLatency(us) 696 678
>>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>>>>>>>>>> [INSERT], MinLatency(us) 887 815
>>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>>>>>>>>>> 
>>>>>>>>>> YCSB Workload F
>>>>>>>>>> 
>>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>>>>>>>>>> [READ], AverageLatency(us) 856 1137
>>>>>>>>>> [READ], MinLatency(us) 262 257
>>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Thanks for the efforts boss.
>>>>>>>>>>> 
>>>>>>>>>>> Since it's a new minor release, do we have performance comparison
>>>>>>> report
>>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference?
>>>>> Many
>>>>>>>>>>> thanks!
>>>>>>>>>>> 
>>>>>>>>>>> Best Regards,
>>>>>>>>>>> Yu
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org
>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>>>>>>>>>> download
>>>>>>>>>>> at
>>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
>>> and
>>>>>>>>>> Maven
>>>>>>>>>>>> artifacts are available in the temporary repository
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>>>>>> 
>>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
>>>>>>> (b0bc7225c5).
>>>>>>>>>>>> 
>>>>>>>>>>>> A detailed source and binary compatibility report for this
>>> release
>>>>> is
>>>>>>>>>>>> available for your review at
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>>>>>> .
>>>>>>>>>>>> 
>>>>>>>>>>>> A list of the 115 issues resolved in this release can be found
>>> at
>>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
>>>>> the
>>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>>>>>> 
>>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>>>>>> 
>>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I
>>>>> will
>>>>>>>>>> try
>>>>>>>>>>> to
>>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>>>>>>>>>>>> 
>>>>>>>>>>>> Prior to making this announcement I made the following preflight
>>>>>>>>>> checks:
>>>>>>>>>>>> 
>>>>>>>>>>>> RAT check passes (7u80)
>>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>>>>>>>>>>>> Opened the UI in a browser, poked around
>>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates
>>> (8u181)
>>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>>>>>> 
>>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
>>> These
>>>>>>>>>> flaky
>>>>>>>>>>>> tests do not represent serious test failures that would prevent
>>> a
>>>>>>>>>>> release.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> --
>>>>>>>>>>>> Best regards,
>>>>>>>>>>>> Andrew
>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> --
>>>>>>>>>> Best regards,
>>>>>>>>>> Andrew
>>>>>>>>>> 
>>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>>>>> truth's
>>>>>>>>>> decrepit hands
>>>>>>>>>> - A23, Crosstalk
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> 
>> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Confirmed that the listed cases pass in the 1.4.7 source (they are all in
the first part of hbase-server, so the result comes out quickly). Also
confirmed that the test run order is the same.

Will try 1.5.0 again to rule out environment differences caused by time.
If 1.5.0 still fails, I will start a git bisect to locate the first bad
commit.

I was also expecting an easy pass and a +1 as always, to save time and
effort, but obviously no luck. However, it's good to find the issue
earlier, if there really is one, before the release is announced.

Best Regards,
Yu
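
The bisect step mentioned above amounts to a binary search for the first bad commit between a known-good and a known-bad point. A minimal sketch, assuming a linear history (git bisect itself also copes with merges) and a hypothetical isBad check standing in for "build this commit and run the failing tests"; the class name and the commit names are placeholders, not real HBase commits:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of the search that git bisect automates.
class BisectSketch {

    /**
     * Returns the index of the first bad commit in a history ordered oldest
     * to newest. Assumes the first commit is good, the last commit is bad,
     * and every commit after the first bad one is also bad.
     */
    static int firstBad(List<String> commits, Predicate<String> isBad) {
        int good = 0;                  // newest commit known to pass
        int bad = commits.size() - 1;  // oldest commit known to fail
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (isBad.test(commits.get(mid))) {
                bad = mid;   // failure already present here, look earlier
            } else {
                good = mid;  // still passing here, look later
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> commits = Arrays.asList("c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7");
        // Pretend the regression landed in c5.
        int idx = firstBad(commits, c -> Integer.parseInt(c.substring(1)) >= 5);
        System.out.println("first bad commit: " + commits.get(idx));
    }
}
```

Each probe halves the remaining range, so the number of full test runs grows only with the logarithm of the number of commits between the good and bad points.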


On Fri, 12 Apr 2019 at 12:16, Yu Li <ca...@gmail.com> wrote:

> Fine, let's focus on verifying whether it's a real problem rather than
> arguing about wording, after all that's not my intention...
>
> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> using the same env and all tests passed w/o issue, that's where my concern
> lies and the main reason I gave a -1 vote. I'm running against 1.4.7 source
> on the same now and let's see the result.
>
> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
>
> Best Regards,
> Yu
>
>
> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
> wrote:
>
>> I believe the test execution order matters. We run some tests in
>> parallel. The ordering of tests is determined by readdir() results and this
>> differs from host to host and checkout to checkout. So when you see a
>> repeatable group of failures, that’s great. And when someone else doesn’t
>> see those same tests fail, or they cannot be reproduced when running by
>> themselves, the commonly accepted term of art for this is “flaky”.
>>
>>
>> > On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>> >
>> > Sorry but I'd call it "possible environment related problem" or "some
>> > feature may not work well in specific environment", rather than a flaky.
>> >
>> > Will check against 1.4.7 released source package before opening any
>> JIRA.
>> >
>> > Best Regards,
>> > Yu
>> >
>> >
>> > On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
>> > wrote:
>> >
>> >> And if they pass in my environment, then what should we call it then.
>> >> I have no doubt you are seeing failures. Therefore can you please file
>> >> JIRAs and attach information that can help identify a fix. Thanks.
>> >>
>> >>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>
>> >>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>> >>> and on two different env separately, so it sums up to 6 times stable
>> >>> failure for each case, and from my perspective this is not flaky.
>> >>>
>> >>> IIRC last time when verifying 1.4.7 on the same env no such issue
>> >>> observed, will double check.
>> >>>
>> >>> Best Regards,
>> >>> Yu
>> >>>
>> >>>
>> >>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <
>> andrew.purtell@gmail.com>
>> >>> wrote:
>> >>>
>> >>>> There are two failure cases it looks like. And this looks like flakes.
>> >>>>
>> >>>> The wrong FS assertions are not something I see when I run these tests
>> >>>> myself. I am not able to investigate something I can’t reproduce. What I
>> >>>> suggest is since you can reproduce do a git bisect to find the commit
>> >>>> that introduced the problem. Then we can revert it. As an alternative we
>> >>>> can open a JIRA, report the problem, temporarily @ignore the test, and
>> >>>> continue. This latter option only should be done if we are fairly
>> >>>> confident it is a test only problem.
>> >>>>
>> >>>> The connect exceptions are interesting. I see these sometimes when the
>> >>>> suite is executed, not this particular case, but when the failed test is
>> >>>> executed by itself it always passes. It is possible some change to
>> >>>> classes related to the minicluster or startup or shutdown timing are the
>> >>>> cause, but it is test time flaky behavior. I’m not happy about this but
>> >>>> it doesn’t actually fail the release because the failure is never
>> >>>> repeatable when the test is run standalone.
>> >>>>
>> >>>> In general it would be great if some attention was paid to test
>> >>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>> >>>> everything is perfect or there will never be another 1.x release,
>> >>>> certainly not from branch-1. So, tests which fail repeatedly block a
>> >>>> release IMHO but flakes do not.
>> >>>>
>> >>>>
>> >>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>> >>>>>
>> >>>>> -1
>> >>>>>
>> >>>>> Observed many UT failures when checking the source package (tried
>> >>>>> multiple rounds on two different environments, MacOs and Linux, got the
>> >>>>> same result), including (but not limited to):
>> >>>>>
>> >>>>> TestBulkload:
>> >>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
>> >>>>> Time elapsed: 0.083 s  <<< ERROR!
>> >>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
>> >>>>> expected: hdfs://localhost:55938
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>> >>>>>
>> >>>>> TestStoreFile:
>> >>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
>> >>>>> Time elapsed: 0.083 s  <<< ERROR!
>> >>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
>> >>>>> failed on connection exception: java.net.ConnectException: Connection
>> >>>>> refused; For more details see:
>> >>>>> http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>> >>>>>
>> >>>>> TestHFile:
>> >>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)
>> >>>>> Time elapsed: 0.08 s  <<< ERROR!
>> >>>>> java.net.ConnectException: Call From
>> >>>>> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
>> >>>>> connection exception: java.net.ConnectException: Connection refused;
>> >>>>> For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>> >>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>> Caused by: java.net.ConnectException: Connection refused
>> >>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>> >>>>>
>> >>>>> TestBlocksScanned:
>> >>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
>> >>>>> Time elapsed: 0.069 s  <<< ERROR!
>> >>>>> java.lang.IllegalArgumentException: Wrong FS:
>> >>>>> hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
>> >>>>> expected: file:///
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>> >>>>>
>> >>>>> And please let me know if any known issue I'm not aware of. Thanks.
>> >>>>>
>> >>>>> Best Regards,
>> >>>>> Yu
>> >>>>>
>> >>>>>
>> >>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
>> >>>>>> Qingming Festival Holiday here in China)
>> >>>>>>
>> >>>>>> Still verifying the release, just some quick feedback: observed
>> some
>> >>>>>> incompatible changes in compatibility report including
>> >>>>>> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>> >>>>>>
>> >>>>>> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
>> >>>>>> https://hbase.apache.org/downloads.html
>> >>>>>>
>> >>>>>> Best Regards,
>> >>>>>> Yu
>> >>>>>>
>> >>>>>>
>> >>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org>
>> >>>> wrote:
>> >>>>>>>
>> >>>>>>> The difference is basically noise per the usual YCSB evaluation. Small
>> >>>>>>> differences in workloads D and F (slightly worse) and workload E
>> >>>>>>> (slightly better) that do not indicate serious regression.
>> >>>>>>>
>> >>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>> >>>>>>> c3.8xlarge x 5
>> >>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>> >>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>> >>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>> >>>>>>> Hadoop 2.9.2
>> >>>>>>> Init: Load 100 M rows and snapshot
>> >>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>> >>>>>>> Args: -threads 100 -target 50000
>> >>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>> >>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>> >>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>> >>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>> >>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> YCSB Workload A
>> >>>>>>>
>> >>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 200592 200583
>> >>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>> >>>>>>> [READ], AverageLatency(us) 544 559
>> >>>>>>> [READ], MinLatency(us) 267 292
>> >>>>>>> [READ], MaxLatency(us) 165631 185087
>> >>>>>>> [READ], 95thPercentileLatency(us) 738 742
>> >>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>> >>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>> >>>>>>> [UPDATE], MinLatency(us) 702 646
>> >>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>> >>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>> >>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>> >>>>>>>
>> >>>>>>> YCSB Workload B
>> >>>>>>>
>> >>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 200599 200581
>> >>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>> >>>>>>> [READ], AverageLatency(us) 454 471
>> >>>>>>> [READ], MinLatency(us) 203 213
>> >>>>>>> [READ], MaxLatency(us) 183423 174207
>> >>>>>>> [READ], 95thPercentileLatency(us) 563 599
>> >>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>> >>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>> >>>>>>> [UPDATE], MinLatency(us) 746 726
>> >>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>> >>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>> >>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>> >>>>>>>
>> >>>>>>> YCSB Workload C
>> >>>>>>>
>> >>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 200541 200538
>> >>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>> >>>>>>> [READ], AverageLatency(us) 332 327
>> >>>>>>> [READ], MinLatency(us) 175 179
>> >>>>>>> [READ], MaxLatency(us) 210559 170367
>> >>>>>>> [READ], 95thPercentileLatency(us) 410 396
>> >>>>>>> [READ], 99thPercentileLatency(us) 871 892
>> >>>>>>>
>> >>>>>>> YCSB Workload D
>> >>>>>>>
>> >>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 200579 200562
>> >>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>> >>>>>>> [READ], AverageLatency(us) 487 547
>> >>>>>>> [READ], MinLatency(us) 210 214
>> >>>>>>> [READ], MaxLatency(us) 192255 177535
>> >>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>> >>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>> >>>>>>> [INSERT], AverageLatency(us) 1239 1152
>> >>>>>>> [INSERT], MinLatency(us) 807 788
>> >>>>>>> [INSERT], MaxLatency(us) 184575 148735
>> >>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>> >>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>> >>>>>>>
>> >>>>>>> YCSB Workload E
>> >>>>>>>
>> >>>>>>> target 10k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 100605 100568
>> >>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>> >>>>>>> [SCAN], AverageLatency(us) 3548 2687
>> >>>>>>> [SCAN], MinLatency(us) 696 678
>> >>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>> >>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>> >>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>> >>>>>>> [INSERT], AverageLatency(us) 2688 1555
>> >>>>>>> [INSERT], MinLatency(us) 887 815
>> >>>>>>> [INSERT], MaxLatency(us) 173311 154623
>> >>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>> >>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>> >>>>>>>
>> >>>>>>> YCSB Workload F
>> >>>>>>>
>> >>>>>>> target 50k/op/s 1.4.9 1.5.0
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> [OVERALL], RunTime(ms) 200562 204178
>> >>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>> >>>>>>> [READ], AverageLatency(us) 856 1137
>> >>>>>>> [READ], MinLatency(us) 262 257
>> >>>>>>> [READ], MaxLatency(us) 205567 222335
>> >>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>> >>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>> >>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>> >>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>> >>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>> >>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>> >>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>> >>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>> >>>>>>> [UPDATE], MinLatency(us) 737 687
>> >>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>> >>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>> >>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> Thanks for the efforts boss.
>> >>>>>>>>
>> >>>>>>>> Since it's a new minor release, do we have performance comparison
>> >>>> report
>> >>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference?
>> >> Many
>> >>>>>>>> thanks!
>> >>>>>>>>
>> >>>>>>>> Best Regards,
>> >>>>>>>> Yu
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <apurtell@apache.org> wrote:
>> >>>>>>>>
>> >>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>> >>>>>>> download
>> >>>>>>>> at
>> >>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/
>> and
>> >>>>>>> Maven
>> >>>>>>>>> artifacts are available in the temporary repository
>> >>>>>>>>>
>> >>>>>>>
>> >>>>
>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>> >>>>>>>>>
>> >>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
>> >>>> (b0bc7225c5).
>> >>>>>>>>>
>> >>>>>>>>> A detailed source and binary compatibility report for this
>> release
>> >> is
>> >>>>>>>>> available for your review at
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>
>> >>
>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>> >>>>>>>>> .
>> >>>>>>>>>
>> >>>>>>>>> A list of the 115 issues resolved in this release can be found
>> at
>> >>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
>> >> the
>> >>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>> >>>>>>>>>
>> >>>>>>>>> Please try out the candidate and vote +1/0/-1.
>> >>>>>>>>>
>> >>>>>>>>> The vote will be open for at least 72 hours. Unless objection I
>> >> will
>> >>>>>>> try
>> >>>>>>>> to
>> >>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>> >>>>>>>>>
>> >>>>>>>>> Prior to making this announcement I made the following preflight
>> >>>>>>> checks:
>> >>>>>>>>>
>> >>>>>>>>>  RAT check passes (7u80)
>> >>>>>>>>>  Unit test suite passes (7u80, 8u181)*
>> >>>>>>>>>  Opened the UI in a browser, poked around
>> >>>>>>>>>  LTT load 100M rows with 100% verification and 20% updates
>> (8u181)
>> >>>>>>>>>  ITBLL 1B rows with slowDeterministic monkey (8u181)
>> >>>>>>>>>  ITBLL 1B rows with serverKilling monkey (8u181)
>> >>>>>>>>>
>> >>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
>> These
>> >>>>>>> flaky
>> >>>>>>>>> tests do not represent serious test failures that would prevent
>> a
>> >>>>>>>> release.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> Best regards,
>> >>>>>>>>> Andrew
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> Best regards,
>> >>>>>>> Andrew
>> >>>>>>>
>> >>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>> >> truth's
>> >>>>>>> decrepit hands
>> >>>>>>> - A23, Crosstalk
>> >>>>>>>
>> >>>>>>
>> >>>>
>> >>
>>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
That sounds good. 

> On Apr 11, 2019, at 9:16 PM, Yu Li <ca...@gmail.com> wrote:
> 
> Fine, let's focus on verifying whether it's a real problem rather than
> arguing about wording, after all that's not my intention...
> 
> As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
> using the same env and all tests passed w/o issue. That's where my concern
> lies and the main reason I gave a -1 vote. I'm now running the tests against
> the 1.4.7 source on the same env, so let's see the result.
> 
> [1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
> 
> Best Regards,
> Yu
> 
> 
> On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
> wrote:
> 
>> I believe the test execution order matters. We run some tests in parallel.
>> The ordering of tests is determined by readdir() results and this differs
>> from host to host and checkout to checkout. So when you see a repeatable
>> group of failures, that’s great. And when someone else doesn’t see those
>> same tests fail, or they cannot be reproduced when running by themselves,
>> the commonly accepted term of art for this is “flaky”.
>> 
>> 
>>> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
>>> 
>>> Sorry but I'd call it a "possible environment-related problem" or "some
>>> feature may not work well in a specific environment", rather than flaky.
>>> 
>>> Will check against 1.4.7 released source package before opening any JIRA.
>>> 
>>> Best Regards,
>>> Yu
>>> 
>>> 
>>> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
>>> wrote:
>>> 
>>>> And if they pass in my environment, then what should we call it? I
>>>> have no doubt you are seeing failures. Therefore, can you please file JIRAs
>>>> and attach information that can help identify a fix? Thanks.
>>>> 
>>>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>>>>> 
>>>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>>>>> on two different envs separately, so each case failed consistently 6
>>>>> times (3 executions per env), and from my perspective this is not flaky.
>>>>> 
>>>>> IIRC no such issue was observed last time when verifying 1.4.7 on the
>>>>> same env; will double check.
>>>>> 
>>>>> Best Regards,
>>>>> Yu
>>>>> 
>>>>> 
>>>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purtell@gmail.com> wrote:
>>>>> 
>>>>>> It looks like there are two failure cases, and both look like flakes.
>>>>>> 
>>>>>> The wrong FS assertions are not something I see when I run these tests
>>>>>> myself. I am not able to investigate something I can’t reproduce. What I
>>>>>> suggest is, since you can reproduce it, do a git bisect to find the commit
>>>>>> that introduced the problem. Then we can revert it. As an alternative we
>>>>>> can open a JIRA, report the problem, temporarily @Ignore the test, and
>>>>>> continue. This latter option should only be done if we are fairly
>>>>>> confident it is a test-only problem.
>>>>>> 
>>>>>> The connect exceptions are interesting. I see these sometimes when the
>>>>>> suite is executed, not this particular case, but when the failed test is
>>>>>> executed by itself it always passes. It is possible some change to classes
>>>>>> related to the minicluster or startup or shutdown timing is the cause, but
>>>>>> it is test-time flaky behavior. I’m not happy about this but it doesn’t
>>>>>> actually fail the release because the failure is never repeatable when the
>>>>>> test is run standalone.
>>>>>> 
>>>>>> In general it would be great if some attention was paid to test
>>>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>>>>>> everything is perfect or there will never be another 1.x release, certainly
>>>>>> not from branch-1. So, tests which fail repeatedly block a release IMHO, but
>>>>>> flakes do not.
>>>>>> 
>>>>>> 
>>>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>>>>>>> 
>>>>>>> -1
>>>>>>> 
>>>>>>> Observed many UT failures when checking the source package (tried
>>>>>> multiple
>>>>>>> rounds on two different environments, MacOs and Linux, got the same
>>>>>>> result), including (but not limited to):
>>>>>>> 
>>>>>>> TestBulkLoad:
>>>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
>>>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>>>>> 
>>>>>>> TestStoreFile:
>>>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
>>>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>>>>> 
>>>>>>> TestHFile:
>>>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
>>>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>>     at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>>>> 
>>>>>>> TestBlocksScanned:
>>>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
>>>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>>>>>     at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>>>>> 
>>>>>>> And please let me know if any known issue I'm not aware of. Thanks.
>>>>>>> 
>>>>>>> Best Regards,
>>>>>>> Yu
>>>>>>> 
>>>>>>> 
>>>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
>>>>>>>> Qingming Festival Holiday here in China)
>>>>>>>> 
>>>>>>>> Still verifying the release, just some quick feedback: I observed some
>>>>>>>> incompatible changes in the compatibility report, including
>>>>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
>>>>>>>> 
>>>>>>>> Unrelated but worth noting: the 1.4.9 release note URL is invalid on
>>>>>>>> https://hbase.apache.org/downloads.html
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Yu
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org>
>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>> The difference is basically noise per the usual YCSB evaluation. There
>>>>>>>>> are small differences in workloads D and F (slightly worse) and workload
>>>>>>>>> E (slightly better) that do not indicate serious regression.
>>>>>>>>> 
>>>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>>>>>> c3.8xlarge x 5
>>>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>>>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>>>> Hadoop 2.9.2
>>>>>>>>> Init: Load 100 M rows and snapshot
>>>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
>>>>>>>>> Args: -threads 100 -target 50000
>>>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>>>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>>>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>>>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>>>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> YCSB Workload A
>>>>>>>>> 
>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>>>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>>>>>>>>> [READ], AverageLatency(us) 544 559
>>>>>>>>> [READ], MinLatency(us) 267 292
>>>>>>>>> [READ], MaxLatency(us) 165631 185087
>>>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>>>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>>>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>>>>>>>>> [UPDATE], MinLatency(us) 702 646
>>>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>>>>>>>>> 
>>>>>>>>> YCSB Workload B
>>>>>>>>> 
>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>>>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>>>>>>>>> [READ], AverageLatency(us) 454 471
>>>>>>>>> [READ], MinLatency(us) 203 213
>>>>>>>>> [READ], MaxLatency(us) 183423 174207
>>>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>>>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>>>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>>>>>>>>> [UPDATE], MinLatency(us) 746 726
>>>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>>>>>>>>> 
>>>>>>>>> YCSB Workload C
>>>>>>>>> 
>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>>>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>>>>>>>>> [READ], AverageLatency(us) 332 327
>>>>>>>>> [READ], MinLatency(us) 175 179
>>>>>>>>> [READ], MaxLatency(us) 210559 170367
>>>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>>>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>>>>>>>>> 
>>>>>>>>> YCSB Workload D
>>>>>>>>> 
>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>>>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>>>>>>>>> [READ], AverageLatency(us) 487 547
>>>>>>>>> [READ], MinLatency(us) 210 214
>>>>>>>>> [READ], MaxLatency(us) 192255 177535
>>>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>>>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>>>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>>>>>>>>> [INSERT], MinLatency(us) 807 788
>>>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>>>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>>>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>>>>>>>>> 
>>>>>>>>> YCSB Workload E
>>>>>>>>> 
>>>>>>>>> target 10k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>>>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>>>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>>>>>>>>> [SCAN], MinLatency(us) 696 678
>>>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>>>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>>>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>>>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>>>>>>>>> [INSERT], MinLatency(us) 887 815
>>>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>>>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>>>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>>>>>>>>> 
>>>>>>>>> YCSB Workload F
>>>>>>>>> 
>>>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>>>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>>>>>>>>> [READ], AverageLatency(us) 856 1137
>>>>>>>>> [READ], MinLatency(us) 262 257
>>>>>>>>> [READ], MaxLatency(us) 205567 222335
>>>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>>>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>>>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>>>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>>>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>>>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>>>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>>>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>>>>>>>>> [UPDATE], MinLatency(us) 737 687
>>>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>>>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>>>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Thanks for the efforts boss.
>>>>>>>>>> 
>>>>>>>>>> Since it's a new minor release, do we have performance comparison
>>>>>> report
>>>>>>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference?
>>>> Many
>>>>>>>>>> thanks!
>>>>>>>>>> 
>>>>>>>>>> Best Regards,
>>>>>>>>>> Yu
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>>>>>>>>> download
>>>>>>>>>> at
>>>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and
>>>>>>>>> Maven
>>>>>>>>>>> artifacts are available in the temporary repository
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>>>>> 
>>>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
>>>>>> (b0bc7225c5).
>>>>>>>>>>> 
>>>>>>>>>>> A detailed source and binary compatibility report for this
>> release
>>>> is
>>>>>>>>>>> available for your review at
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>>>>> .
>>>>>>>>>>> 
>>>>>>>>>>> A list of the 115 issues resolved in this release can be found at
>>>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
>>>> the
>>>>>>>>>>> changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>>>>> 
>>>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>>>>> 
>>>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I
>>>> will
>>>>>>>>> try
>>>>>>>>>> to
>>>>>>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
>>>>>>>>>>> 
>>>>>>>>>>> Prior to making this announcement I made the following preflight
>>>>>>>>> checks:
>>>>>>>>>>> 
>>>>>>>>>>> RAT check passes (7u80)
>>>>>>>>>>> Unit test suite passes (7u80, 8u181)*
>>>>>>>>>>> Opened the UI in a browser, poked around
>>>>>>>>>>> LTT load 100M rows with 100% verification and 20% updates
>> (8u181)
>>>>>>>>>>> ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>>>>>> ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>>>>> 
>>>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
>> These
>>>>>>>>> flaky
>>>>>>>>>>> tests do not represent serious test failures that would prevent a
>>>>>>>>>> release.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Andrew
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>>>> truth's
>>>>>>>>> decrepit hands
>>>>>>>>> - A23, Crosstalk
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Fine, let's focus on verifying whether it's a real problem rather than
arguing about wording, after all that's not my intention...

As mentioned, I participated in the 1.4.7 release vote[1] and IIRC I was
using the same env and all tests passed w/o issue. That's where my concern
lies and the main reason I gave a -1 vote. I'm now running the tests against
the 1.4.7 source on the same env, so let's see the result.

[1] https://www.mail-archive.com/dev@hbase.apache.org/msg51380.html
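
A minimal sketch of how the two runs could be compared, assuming the surefire
console output of each run (1.4.7 and 1.5.0RC3) has been saved as text and
that each failed test is reported on one line ending in "<<< ERROR!" or
"<<< FAILURE!", as in the reports quoted in this thread once mail-client line
wrapping is undone. The helper names failed_tests and compare_runs are
hypothetical and not part of HBase or Maven.

```python
import re

# Matches a surefire per-test failure line, e.g.
#   testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
# Assumption: the whole report sits on a single line in the saved log.
FAILED_TEST = re.compile(
    r"^(?P<method>\w+)\((?P<cls>[\w.$]+)\)\s+Time elapsed:.*<<< (?:ERROR|FAILURE)!"
)


def failed_tests(log_text):
    """Return the set of 'Class#method' names reported as failed in one log."""
    failed = set()
    for line in log_text.splitlines():
        match = FAILED_TEST.match(line.strip())
        if match:
            # A set collapses repeated reports of the same test caused by
            # -Dsurefire.rerunFailingTestsCount reruns.
            failed.add(f"{match.group('cls')}#{match.group('method')}")
    return failed


def compare_runs(baseline_log, candidate_log):
    """Split candidate failures into those absent from the baseline and those shared."""
    baseline = failed_tests(baseline_log)
    candidate = failed_tests(candidate_log)
    return {
        "only_in_candidate": sorted(candidate - baseline),
        "in_both": sorted(candidate & baseline),
    }
```

Tests listed under "only_in_candidate" fail on the candidate but not on the
baseline in the same env, so they are the ones worth a JIRA or a git bisect;
tests under "in_both" point more towards the environment.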

Best Regards,
Yu


On Fri, 12 Apr 2019 at 12:05, Andrew Purtell <an...@gmail.com>
wrote:

> I believe the test execution order matters. We run some tests in parallel.
> The ordering of tests is determined by readdir() results and this differs
> from host to host and checkout to checkout. So when you see a repeatable
> group of failures, that’s great. And when someone else doesn’t see those
> same tests fail, or they cannot be reproduced when running by themselves,
> the commonly accepted term of art for this is “flaky”.
>
>
> > On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> >
> > Sorry but I'd call it a "possible environment-related problem" or "some
> > feature may not work well in a specific environment", rather than flaky.
> >
> > Will check against 1.4.7 released source package before opening any JIRA.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
> > wrote:
> >
> >> And if they pass in my environment, then what should we call it? I
> >> have no doubt you are seeing failures. Therefore, can you please file JIRAs
> >> and attach information that can help identify a fix? Thanks.
> >>
> >>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >>>
> >>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> >>> on two different envs separately, so each case failed consistently 6
> >>> times (3 executions per env), and from my perspective this is not flaky.
> >>>
> >>> IIRC no such issue was observed last time when verifying 1.4.7 on the
> >>> same env; will double check.
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>
> >>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <andrew.purtell@gmail.com> wrote:
> >>>
> >>>> It looks like there are two failure cases, and both look like flakes.
> >>>>
> >>>> The wrong FS assertions are not something I see when I run these tests
> >>>> myself. I am not able to investigate something I can’t reproduce. What I
> >>>> suggest is, since you can reproduce it, do a git bisect to find the commit
> >>>> that introduced the problem. Then we can revert it. As an alternative we
> >>>> can open a JIRA, report the problem, temporarily @Ignore the test, and
> >>>> continue. This latter option should only be done if we are fairly
> >>>> confident it is a test-only problem.
> >>>>
> >>>> The connect exceptions are interesting. I see these sometimes when the
> >>>> suite is executed, not this particular case, but when the failed test is
> >>>> executed by itself it always passes. It is possible some change to classes
> >>>> related to the minicluster or startup or shutdown timing is the cause, but
> >>>> it is test-time flaky behavior. I’m not happy about this but it doesn’t
> >>>> actually fail the release because the failure is never repeatable when the
> >>>> test is run standalone.
> >>>>
> >>>> In general it would be great if some attention was paid to test
> >>>> cleanliness on branch-1. As RM I’m not in a position to insist that
> >>>> everything is perfect or there will never be another 1.x release, certainly
> >>>> not from branch-1. So, tests which fail repeatedly block a release IMHO, but
> >>>> flakes do not.
> >>>>
> >>>>
> >>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> >>>>>
> >>>>> -1
> >>>>>
> >>>>> Observed many UT failures when checking the source package (tried
> >>>> multiple
> >>>>> rounds on two different environments, MacOs and Linux, got the same
> >>>>> result), including (but not limited to):
> >>>>>
> >>>>> TestBulkLoad:
> >>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
> >>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> >>>>>
> >>>>> TestStoreFile:
> >>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
> >>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> >>>>>
> >>>>> TestHFile:
> >>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
> >>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> >>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>> Caused by: java.net.ConnectException: Connection refused
> >>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>>>
> >>>>> TestBlocksScanned:
> >>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
> >>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> >>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> >>>>>
> >>>>> And please let me know if any known issue I'm not aware of. Thanks.
> >>>>>
> >>>>> Best Regards,
> >>>>> Yu
> >>>>>
> >>>>>
> >>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> >>>>>>
> >>>>>> The performance report LGTM, thanks! (and sorry for the lag due to
> >>>>>> Qingming Festival Holiday here in China)
> >>>>>>
> >>>>>> Still verifying the release, just some quick feedback: I observed some
> >>>>>> incompatible changes in the compatibility report, including
> >>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
> >>>>>>
> >>>>>> Unrelated but worth noting: the 1.4.9 release note URL is invalid on
> >>>>>> https://hbase.apache.org/downloads.html
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Yu
> >>>>>>
> >>>>>>
> >>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org>
> >>>> wrote:
> >>>>>>>
> >>>>>>> The difference is basically noise per the usual YCSB evaluation. There
> >>>>>>> are small differences in workloads D and F (slightly worse) and workload
> >>>>>>> E (slightly better) that do not indicate serious regression.
> >>>>>>>
> >>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> >>>>>>> c3.8xlarge x 5
> >>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> >>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> >>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> >>>>>>> Hadoop 2.9.2
> >>>>>>> Init: Load 100 M rows and snapshot
> >>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> >>>>>>> Args: -threads 100 -target 50000
> >>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
> >>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
> >>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
> >>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
> >>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> >>>>>>>
> >>>>>>>
> >>>>>>> YCSB Workload A
> >>>>>>>
> >>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 200592 200583
> >>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> >>>>>>> [READ], AverageLatency(us) 544 559
> >>>>>>> [READ], MinLatency(us) 267 292
> >>>>>>> [READ], MaxLatency(us) 165631 185087
> >>>>>>> [READ], 95thPercentileLatency(us) 738 742
> >>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
> >>>>>>> [UPDATE], AverageLatency(us) 1370 1181
> >>>>>>> [UPDATE], MinLatency(us) 702 646
> >>>>>>> [UPDATE], MaxLatency(us) 180735 177279
> >>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> >>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> >>>>>>>
> >>>>>>> YCSB Workload B
> >>>>>>>
> >>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 200599 200581
> >>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> >>>>>>> [READ], AverageLatency(us) 454 471
> >>>>>>> [READ], MinLatency(us) 203 213
> >>>>>>> [READ], MaxLatency(us) 183423 174207
> >>>>>>> [READ], 95thPercentileLatency(us) 563 599
> >>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
> >>>>>>> [UPDATE], AverageLatency(us) 1064 1029
> >>>>>>> [UPDATE], MinLatency(us) 746 726
> >>>>>>> [UPDATE], MaxLatency(us) 163455 101631
> >>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> >>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> >>>>>>>
> >>>>>>> YCSB Workload C
> >>>>>>>
> >>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 200541 200538
> >>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> >>>>>>> [READ], AverageLatency(us) 332 327
> >>>>>>> [READ], MinLatency(us) 175 179
> >>>>>>> [READ], MaxLatency(us) 210559 170367
> >>>>>>> [READ], 95thPercentileLatency(us) 410 396
> >>>>>>> [READ], 99thPercentileLatency(us) 871 892
> >>>>>>>
> >>>>>>> YCSB Workload D
> >>>>>>>
> >>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 200579 200562
> >>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> >>>>>>> [READ], AverageLatency(us) 487 547
> >>>>>>> [READ], MinLatency(us) 210 214
> >>>>>>> [READ], MaxLatency(us) 192255 177535
> >>>>>>> [READ], 95thPercentileLatency(us) 973 1529
> >>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
> >>>>>>> [INSERT], AverageLatency(us) 1239 1152
> >>>>>>> [INSERT], MinLatency(us) 807 788
> >>>>>>> [INSERT], MaxLatency(us) 184575 148735
> >>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> >>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> >>>>>>>
> >>>>>>> YCSB Workload E
> >>>>>>>
> >>>>>>> target 10k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 100605 100568
> >>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> >>>>>>> [SCAN], AverageLatency(us) 3548 2687
> >>>>>>> [SCAN], MinLatency(us) 696 678
> >>>>>>> [SCAN], MaxLatency(us) 1059839 238463
> >>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> >>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> >>>>>>> [INSERT], AverageLatency(us) 2688 1555
> >>>>>>> [INSERT], MinLatency(us) 887 815
> >>>>>>> [INSERT], MaxLatency(us) 173311 154623
> >>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> >>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> >>>>>>>
> >>>>>>> YCSB Workload F
> >>>>>>>
> >>>>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> [OVERALL], RunTime(ms) 200562 204178
> >>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> >>>>>>> [READ], AverageLatency(us) 856 1137
> >>>>>>> [READ], MinLatency(us) 262 257
> >>>>>>> [READ], MaxLatency(us) 205567 222335
> >>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
> >>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
> >>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> >>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> >>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> >>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> >>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> >>>>>>> [UPDATE], AverageLatency(us) 1700 1777
> >>>>>>> [UPDATE], MinLatency(us) 737 687
> >>>>>>> [UPDATE], MaxLatency(us) 97983 94271
> >>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> >>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Thanks for the efforts boss.
> >>>>>>>>
> >>>>>>>> Since it's a new minor release, do we have performance comparison
> >>>>>>>> report with 1.4.9 as we did when releasing 1.4.0? If so, any
> >>>>>>>> reference? Many thanks!
> >>>>>>>>
> >>>>>>>> Best Regards,
> >>>>>>>> Yu
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
> >>>>>>>>> download at
> >>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and
> >>>>>>>>> Maven artifacts are available in the temporary repository
> >>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> >>>>>>>>>
> >>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3'
> >>>>>>>>> (b0bc7225c5).
> >>>>>>>>>
> >>>>>>>>> A detailed source and binary compatibility report for this release
> >>>>>>>>> is available for your review at
> >>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> >>>>>>>>> .
> >>>>>>>>>
> >>>>>>>>> A list of the 115 issues resolved in this release can be found at
> >>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
> >>>>>>>>> the changelog of the last branch-1.4 release, 1.4.9.
> >>>>>>>>>
> >>>>>>>>> Please try out the candidate and vote +1/0/-1.
> >>>>>>>>>
> >>>>>>>>> The vote will be open for at least 72 hours. Unless objection I
> >>>>>>>>> will try to close it Friday April 12, 2019 if we have sufficient
> >>>>>>>>> votes.
> >>>>>>>>>
> >>>>>>>>> Prior to making this announcement I made the following preflight
> >>>>>>>>> checks:
> >>>>>>>>>
> >>>>>>>>>  RAT check passes (7u80)
> >>>>>>>>>  Unit test suite passes (7u80, 8u181)*
> >>>>>>>>>  Opened the UI in a browser, poked around
> >>>>>>>>>  LTT load 100M rows with 100% verification and 20% updates (8u181)
> >>>>>>>>>  ITBLL 1B rows with slowDeterministic monkey (8u181)
> >>>>>>>>>  ITBLL 1B rows with serverKilling monkey (8u181)
> >>>>>>>>>
> >>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
> >>>>>>>>> These flaky tests do not represent serious test failures that
> >>>>>>>>> would prevent a release.
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> Best regards,
> >>>>>>>>> Andrew
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best regards,
> >>>>>>> Andrew
> >>>>>>>
> >>>>>>> Words like orphans lost among the crosstalk, meaning torn from
> >>>>>>> truth's decrepit hands
> >>>>>>> - A23, Crosstalk
> >>>>>>>
> >>>>>>
> >>>>
> >>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
I believe the test execution order matters. We run some tests in parallel, and the ordering of tests is determined by readdir() results, which differ from host to host and checkout to checkout. So when you see a repeatable group of failures, that’s great. But when someone else doesn’t see those same tests fail, or the failures cannot be reproduced when the tests are run by themselves, the commonly accepted term of art for this is “flaky”.
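
As a rough illustration of the ordering point (a sketch only, not how Surefire or the HBase build actually discovers tests): the order in which a directory listing comes back is not guaranteed, so anything that runs tests in listing order can behave differently from host to host, while sorting the names gives every host the same order. The file names are borrowed from the failures reported in this thread; the helper names and the temporary "checkout" directory are made up for the example.

```python
import os
import tempfile


def discovery_order(directory):
    """Return entries in whatever order the OS hands them back.

    os.listdir() documents its result as being in arbitrary order,
    so this can differ between filesystems, hosts, and checkouts.
    """
    return os.listdir(directory)


def deterministic_order(directory):
    """Return the same entries sorted by name, identical on every host."""
    return sorted(os.listdir(directory))


def make_fake_checkout():
    """Create a temp directory holding a few empty 'test class' files."""
    root = tempfile.mkdtemp(prefix="fake-checkout-")
    for name in ("TestStoreFile.java", "TestBulkLoad.java",
                 "TestHFile.java", "TestBlocksScanned.java"):
        # The files only need to exist; their content is irrelevant here.
        open(os.path.join(root, name), "w").close()
    return root


if __name__ == "__main__":
    checkout = make_fake_checkout()
    print("listing order:", discovery_order(checkout))
    print("sorted order: ", deterministic_order(checkout))
```

If a group of tests only fails under one particular listing order, forcing a fixed order like this on both hosts is one way to check whether the failures follow the order or the environment.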


> On Apr 11, 2019, at 8:52 PM, Yu Li <ca...@gmail.com> wrote:
> 
> Sorry, but I'd call it a "possible environment-related problem" or "some
> feature may not work well in a specific environment", rather than a flaky
> test.
> 
> I will check against the released 1.4.7 source package before opening any
> JIRA.
> 
> Best Regards,
> Yu
> 
> 
> On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
> wrote:
> 
>> And if they pass in my environment, then what should we call it? I
>> have no doubt you are seeing failures. Therefore, can you please file JIRAs
>> and attach information that can help identify a fix? Thanks.
>> 
>>> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
>>> 
>>> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
>>> on two different environments separately, so each case failed 6 times in
>>> total (3 attempts per environment), and from my perspective this is not
>>> flaky.
>>> 
>>> IIRC, no such issue was observed the last time I verified 1.4.7 on the
>>> same environments; I will double check.
>>> 
>>> Best Regards,
>>> Yu
>>> 
>>> 
>>> On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <an...@gmail.com>
>>> wrote:
>>> 
>>>> It looks like there are two failure cases, and both look like flakes.
>>>> 
>>>> The wrong FS assertions are not something I see when I run these tests
>>>> myself. I am not able to investigate something I can’t reproduce. Since
>>>> you can reproduce them, I suggest doing a git bisect to find the commit
>>>> that introduced the problem. Then we can revert it. As an alternative we
>>>> can open a JIRA, report the problem, temporarily @Ignore the test, and
>>>> continue. This latter option should only be taken if we are fairly
>>>> confident it is a test-only problem.
>>>> 
>>>> The connect exceptions are interesting. I see these sometimes when the
>>>> suite is executed, not this particular case, but when the failed test is
>>>> executed by itself it always passes. It is possible that some change to
>>>> classes related to the minicluster, or to startup or shutdown timing, is
>>>> the cause, but it is test-time flaky behavior. I’m not happy about this,
>>>> but it doesn’t actually fail the release because the failure is never
>>>> repeatable when the test is run standalone.
>>>> 
>>>> In general it would be great if some attention was paid to test
>>>> cleanliness on branch-1. As RM I’m not in a position to insist that
>>>> everything is perfect or there will never be another 1.x release,
>>>> certainly not from branch-1. So, tests which fail repeatedly block a
>>>> release IMHO, but flakes do not.
>>>> 
>>>> 
>>>>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
>>>>> 
>>>>> -1
>>>>> 
>>>>> Observed many UT failures when checking the source package (tried
>>>>> multiple rounds on two different environments, macOS and Linux, and got
>>>>> the same result), including (but not limited to):
>>>>> 
>>>>> TestBulkLoad:
>>>>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
>>>>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
>>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>>>>>      at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
>>>>> 
>>>>> TestStoreFile:
>>>>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
>>>>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>>>>>      at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
>>>>> 
>>>>> TestHFile:
>>>>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
>>>>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>      at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
>>>>> 
>>>>> TestBlocksScanned:
>>>>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
>>>>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
>>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>>>>>      at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
>>>>> 
>>>>> Please let me know if this is a known issue I'm not aware of. Thanks.
>>>>> 
>>>>> Best Regards,
>>>>> Yu
>>>>> 
>>>>> 
>>>>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
>>>>>> 
>>>>>> The performance report LGTM, thanks! (And sorry for the lag due to the
>>>>>> Qingming Festival holiday here in China.)
>>>>>> 
>>>>>> Still verifying the release, just some quick feedback: I observed some
>>>>>> incompatible changes in the compatibility report, including
>>>>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
>>>>>> 
>>>>>> Unrelated but worth noting: the 1.4.9 release note URL is invalid on
>>>>>> https://hbase.apache.org/downloads.html
>>>>>> 
>>>>>> Best Regards,
>>>>>> Yu
>>>>>> 
>>>>>> 
>>>>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org> wrote:
>>>>>>> 
>>>>>>> The difference is basically noise per the usual YCSB evaluation. There
>>>>>>> are small differences in workloads D and F (slightly worse) and
>>>>>>> workload E (slightly better) that do not indicate a serious regression.
>>>>>>> 
>>>>>>> Linux version 4.14.55-62.37.amzn1.x86_64
>>>>>>> c3.8xlarge x 5
>>>>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
>>>>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
>>>>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
>>>>>>> Hadoop 2.9.2
>>>>>>> Init: Load 100 M rows and snapshot
>>>>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M
>>>>>>> operations
>>>>>>> Args: -threads 100 -target 50000
>>>>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
>>>>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
>>>>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
>>>>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
>>>>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
>>>>>>> 
>>>>>>> 
>>>>>>> YCSB Workload A
>>>>>>> 
>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 200592 200583
>>>>>>> [OVERALL], Throughput(ops/sec) 49852 49855
>>>>>>> [READ], AverageLatency(us) 544 559
>>>>>>> [READ], MinLatency(us) 267 292
>>>>>>> [READ], MaxLatency(us) 165631 185087
>>>>>>> [READ], 95thPercentileLatency(us) 738 742
>>>>>>> [READ], 99thPercentileLatency(us) 1877 1961
>>>>>>> [UPDATE], AverageLatency(us) 1370 1181
>>>>>>> [UPDATE], MinLatency(us) 702 646
>>>>>>> [UPDATE], MaxLatency(us) 180735 177279
>>>>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
>>>>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
>>>>>>> 
>>>>>>> YCSB Workload B
>>>>>>> 
>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 200599 200581
>>>>>>> [OVERALL], Throughput(ops/sec) 49850 49855
>>>>>>> [READ], AverageLatency(us) 454 471
>>>>>>> [READ], MinLatency(us) 203 213
>>>>>>> [READ], MaxLatency(us) 183423 174207
>>>>>>> [READ], 95thPercentileLatency(us) 563 599
>>>>>>> [READ], 99thPercentileLatency(us) 1360 1172
>>>>>>> [UPDATE], AverageLatency(us) 1064 1029
>>>>>>> [UPDATE], MinLatency(us) 746 726
>>>>>>> [UPDATE], MaxLatency(us) 163455 101631
>>>>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
>>>>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
>>>>>>> 
>>>>>>> YCSB Workload C
>>>>>>> 
>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 200541 200538
>>>>>>> [OVERALL], Throughput(ops/sec) 49865 49865
>>>>>>> [READ], AverageLatency(us) 332 327
>>>>>>> [READ], MinLatency(us) 175 179
>>>>>>> [READ], MaxLatency(us) 210559 170367
>>>>>>> [READ], 95thPercentileLatency(us) 410 396
>>>>>>> [READ], 99thPercentileLatency(us) 871 892
>>>>>>> 
>>>>>>> YCSB Workload D
>>>>>>> 
>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 200579 200562
>>>>>>> [OVERALL], Throughput(ops/sec) 49855 49859
>>>>>>> [READ], AverageLatency(us) 487 547
>>>>>>> [READ], MinLatency(us) 210 214
>>>>>>> [READ], MaxLatency(us) 192255 177535
>>>>>>> [READ], 95thPercentileLatency(us) 973 1529
>>>>>>> [READ], 99thPercentileLatency(us) 1836 2683
>>>>>>> [INSERT], AverageLatency(us) 1239 1152
>>>>>>> [INSERT], MinLatency(us) 807 788
>>>>>>> [INSERT], MaxLatency(us) 184575 148735
>>>>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
>>>>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
>>>>>>> 
>>>>>>> YCSB Workload E
>>>>>>> 
>>>>>>> target 10k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 100605 100568
>>>>>>> [OVERALL], Throughput(ops/sec) 9939 9943
>>>>>>> [SCAN], AverageLatency(us) 3548 2687
>>>>>>> [SCAN], MinLatency(us) 696 678
>>>>>>> [SCAN], MaxLatency(us) 1059839 238463
>>>>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
>>>>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
>>>>>>> [INSERT], AverageLatency(us) 2688 1555
>>>>>>> [INSERT], MinLatency(us) 887 815
>>>>>>> [INSERT], MaxLatency(us) 173311 154623
>>>>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
>>>>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
>>>>>>> 
>>>>>>> YCSB Workload F
>>>>>>> 
>>>>>>> target 50k/op/s 1.4.9 1.5.0
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> [OVERALL], RunTime(ms) 200562 204178
>>>>>>> [OVERALL], Throughput(ops/sec) 49859 48976
>>>>>>> [READ], AverageLatency(us) 856 1137
>>>>>>> [READ], MinLatency(us) 262 257
>>>>>>> [READ], MaxLatency(us) 205567 222335
>>>>>>> [READ], 95thPercentileLatency(us) 2365 3475
>>>>>>> [READ], 99thPercentileLatency(us) 3099 4143
>>>>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
>>>>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
>>>>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
>>>>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
>>>>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
>>>>>>> [UPDATE], AverageLatency(us) 1700 1777
>>>>>>> [UPDATE], MinLatency(us) 737 687
>>>>>>> [UPDATE], MaxLatency(us) 97983 94271
>>>>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
>>>>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
>>>>>>> 
>>>>>>> 
>>>>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
>>>>>>>> 
>>>>>>>> Thanks for the efforts boss.
>>>>>>>> 
>>>>>>>> Since it's a new minor release, do we have performance comparison
>>>>>>>> report with 1.4.9 as we did when releasing 1.4.0? If so, any
>>>>>>>> reference? Many thanks!
>>>>>>>> 
>>>>>>>> Best Regards,
>>>>>>>> Yu
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org> wrote:
>>>>>>>> 
>>>>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
>>>>>>>>> download at
>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and
>>>>>>>>> Maven artifacts are available in the temporary repository
>>>>>>>>> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>>>>>>>>> 
>>>>>>>>> The git tag corresponding to the candidate is '1.5.0RC3'
>>>>>>>>> (b0bc7225c5).
>>>>>>>>> 
>>>>>>>>> A detailed source and binary compatibility report for this release
>>>>>>>>> is available for your review at
>>>>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
>>>>>>>>> .
>>>>>>>>> 
>>>>>>>>> A list of the 115 issues resolved in this release can be found at
>>>>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
>>>>>>>>> the changelog of the last branch-1.4 release, 1.4.9.
>>>>>>>>> 
>>>>>>>>> Please try out the candidate and vote +1/0/-1.
>>>>>>>>> 
>>>>>>>>> The vote will be open for at least 72 hours. Unless objection I
>>>>>>>>> will try to close it Friday April 12, 2019 if we have sufficient
>>>>>>>>> votes.
>>>>>>>>> 
>>>>>>>>> Prior to making this announcement I made the following preflight
>>>>>>>>> checks:
>>>>>>>>> 
>>>>>>>>>  RAT check passes (7u80)
>>>>>>>>>  Unit test suite passes (7u80, 8u181)*
>>>>>>>>>  Opened the UI in a browser, poked around
>>>>>>>>>  LTT load 100M rows with 100% verification and 20% updates (8u181)
>>>>>>>>>  ITBLL 1B rows with slowDeterministic monkey (8u181)
>>>>>>>>>  ITBLL 1B rows with serverKilling monkey (8u181)
>>>>>>>>> 
>>>>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905.
>>>>>>>>> These flaky tests do not represent serious test failures that
>>>>>>>>> would prevent a release.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> Best regards,
>>>>>>>>> Andrew
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Best regards,
>>>>>>> Andrew
>>>>>>> 
>>>>>>> Words like orphans lost among the crosstalk, meaning torn from
>>>>>>> truth's decrepit hands
>>>>>>> - A23, Crosstalk
>>>>>>> 
>>>>>> 
>>>> 
>> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Sorry, but I'd call it a "possible environment-related problem" or "some
feature may not work well in a specific environment", rather than a flaky
test.

I will check against the released 1.4.7 source package before opening any
JIRA.

Best Regards,
Yu


On Fri, 12 Apr 2019 at 11:37, Andrew Purtell <an...@gmail.com>
wrote:

> And if they pass in my environment, then what should we call it? I
> have no doubt you are seeing failures. Therefore, can you please file JIRAs
> and attach information that can help identify a fix? Thanks.
>
> > On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> >
> > I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> > on two different environments separately, so each case failed 6 times in
> > total (3 attempts per environment), and from my perspective this is not
> > flaky.
> >
> > IIRC, no such issue was observed the last time I verified 1.4.7 on the
> > same environments; I will double check.
> >
> > Best Regards,
> > Yu
> >
> >
> > On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <an...@gmail.com>
> > wrote:
> >
> >> It looks like there are two failure cases, and both look like flakes.
> >>
> >> The wrong FS assertions are not something I see when I run these tests
> >> myself. I am not able to investigate something I can’t reproduce. Since
> >> you can reproduce them, I suggest doing a git bisect to find the commit
> >> that introduced the problem. Then we can revert it. As an alternative we
> >> can open a JIRA, report the problem, temporarily @Ignore the test, and
> >> continue. This latter option should only be taken if we are fairly
> >> confident it is a test-only problem.
> >>
> >> The connect exceptions are interesting. I see these sometimes when the
> >> suite is executed, not this particular case, but when the failed test is
> >> executed by itself it always passes. It is possible that some change to
> >> classes related to the minicluster, or to startup or shutdown timing, is
> >> the cause, but it is test-time flaky behavior. I’m not happy about this,
> >> but it doesn’t actually fail the release because the failure is never
> >> repeatable when the test is run standalone.
> >>
> >> In general it would be great if some attention was paid to test
> >> cleanliness on branch-1. As RM I’m not in a position to insist that
> >> everything is perfect or there will never be another 1.x release,
> >> certainly not from branch-1. So, tests which fail repeatedly block a
> >> release IMHO, but flakes do not.
> >>
> >>
> >>> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> >>>
> >>> -1
> >>>
> >>> Observed many UT failures when checking the source package (tried
> >>> multiple rounds on two different environments, macOS and Linux, and got
> >>> the same result), including (but not limited to):
> >>>
> >>> TestBulkLoad:
> >>> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
> >>> java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
> >>>       at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
> >>>       at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
> >>>       at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> >>>
> >>> TestStoreFile:
> >>> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
> >>> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> >>>       at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
> >>>       at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> >>>
> >>> TestHFile:
> >>> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
> >>> java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
> >>>       at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>> Caused by: java.net.ConnectException: Connection refused
> >>>       at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> >>>
> >>> TestBlocksScanned:
> >>> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
> >>> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
> >>>       at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
> >>>       at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> >>>
> >>> Please let me know if this is a known issue I'm not aware of. Thanks.
> >>>
> >>> Best Regards,
> >>> Yu
> >>>
> >>>
> >>>> On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:
> >>>>
> >>>> The performance report LGTM, thanks! (And sorry for the lag due to the
> >>>> Qingming Festival holiday here in China.)
> >>>>
> >>>> Still verifying the release, just some quick feedback: I observed some
> >>>> incompatible changes in the compatibility report, including
> >>>> HBASE-21492/HBASE-21684, which are worth a reminder in the release notes.
> >>>>
> >>>> Unrelated but worth noting: the 1.4.9 release note URL is invalid on
> >>>> https://hbase.apache.org/downloads.html
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>>> On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org> wrote:
> >>>>>
> >>>>> The difference is basically noise per the usual YCSB evaluation. There
> >>>>> are small differences in workloads D and F (slightly worse) and
> >>>>> workload E (slightly better) that do not indicate a serious regression.
> >>>>>
> >>>>> Linux version 4.14.55-62.37.amzn1.x86_64
> >>>>> c3.8xlarge x 5
> >>>>> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> >>>>> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> >>>>> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> >>>>> Hadoop 2.9.2
> >>>>> Init: Load 100 M rows and snapshot
> >>>>> Run: Delete table, clone and redeploy from snapshot, run 10 M
> >>>>> operations
> >>>>> Args: -threads 100 -target 50000
> >>>>> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1',
> >>>>> IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
> >>>>> DATA_BLOCK_ENCODING => 'ROW_INDEX_V1', TTL => 'FOREVER',
> >>>>> COMPRESSION => 'SNAPPY', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
> >>>>> BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> >>>>>
> >>>>>
> >>>>> YCSB Workload A
> >>>>>
> >>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 200592 200583
> >>>>> [OVERALL], Throughput(ops/sec) 49852 49855
> >>>>> [READ], AverageLatency(us) 544 559
> >>>>> [READ], MinLatency(us) 267 292
> >>>>> [READ], MaxLatency(us) 165631 185087
> >>>>> [READ], 95thPercentileLatency(us) 738 742
> >>>>> [READ], 99thPercentileLatency(us) 1877 1961
> >>>>> [UPDATE], AverageLatency(us) 1370 1181
> >>>>> [UPDATE], MinLatency(us) 702 646
> >>>>> [UPDATE], MaxLatency(us) 180735 177279
> >>>>> [UPDATE], 95thPercentileLatency(us) 1943 1652
> >>>>> [UPDATE], 99thPercentileLatency(us) 3257 3085
> >>>>>
> >>>>> YCSB Workload B
> >>>>>
> >>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 200599 200581
> >>>>> [OVERALL], Throughput(ops/sec) 49850 49855
> >>>>> [READ], AverageLatency(us) 454 471
> >>>>> [READ], MinLatency(us) 203 213
> >>>>> [READ], MaxLatency(us) 183423 174207
> >>>>> [READ], 95thPercentileLatency(us) 563 599
> >>>>> [READ], 99thPercentileLatency(us) 1360 1172
> >>>>> [UPDATE], AverageLatency(us) 1064 1029
> >>>>> [UPDATE], MinLatency(us) 746 726
> >>>>> [UPDATE], MaxLatency(us) 163455 101631
> >>>>> [UPDATE], 95thPercentileLatency(us) 1327 1157
> >>>>> [UPDATE], 99thPercentileLatency(us) 2241 1898
> >>>>>
> >>>>> YCSB Workload C
> >>>>>
> >>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 200541 200538
> >>>>> [OVERALL], Throughput(ops/sec) 49865 49865
> >>>>> [READ], AverageLatency(us) 332 327
> >>>>> [READ], MinLatency(us) 175 179
> >>>>> [READ], MaxLatency(us) 210559 170367
> >>>>> [READ], 95thPercentileLatency(us) 410 396
> >>>>> [READ], 99thPercentileLatency(us) 871 892
> >>>>>
> >>>>> YCSB Workload D
> >>>>>
> >>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 200579 200562
> >>>>> [OVERALL], Throughput(ops/sec) 49855 49859
> >>>>> [READ], AverageLatency(us) 487 547
> >>>>> [READ], MinLatency(us) 210 214
> >>>>> [READ], MaxLatency(us) 192255 177535
> >>>>> [READ], 95thPercentileLatency(us) 973 1529
> >>>>> [READ], 99thPercentileLatency(us) 1836 2683
> >>>>> [INSERT], AverageLatency(us) 1239 1152
> >>>>> [INSERT], MinLatency(us) 807 788
> >>>>> [INSERT], MaxLatency(us) 184575 148735
> >>>>> [INSERT], 95thPercentileLatency(us) 1496 1243
> >>>>> [INSERT], 99thPercentileLatency(us) 2965 2495
> >>>>>
> >>>>> YCSB Workload E
> >>>>>
> >>>>> target 10k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 100605 100568
> >>>>> [OVERALL], Throughput(ops/sec) 9939 9943
> >>>>> [SCAN], AverageLatency(us) 3548 2687
> >>>>> [SCAN], MinLatency(us) 696 678
> >>>>> [SCAN], MaxLatency(us) 1059839 238463
> >>>>> [SCAN], 95thPercentileLatency(us) 8327 6791
> >>>>> [SCAN], 99thPercentileLatency(us) 17647 14415
> >>>>> [INSERT], AverageLatency(us) 2688 1555
> >>>>> [INSERT], MinLatency(us) 887 815
> >>>>> [INSERT], MaxLatency(us) 173311 154623
> >>>>> [INSERT], 95thPercentileLatency(us) 4455 2571
> >>>>> [INSERT], 99thPercentileLatency(us) 9303 5375
> >>>>>
> >>>>> YCSB Workload F
> >>>>>
> >>>>> target 50k/op/s 1.4.9 1.5.0
> >>>>>
> >>>>>
> >>>>>
> >>>>> [OVERALL], RunTime(ms) 200562 204178
> >>>>> [OVERALL], Throughput(ops/sec) 49859 48976
> >>>>> [READ], AverageLatency(us) 856 1137
> >>>>> [READ], MinLatency(us) 262 257
> >>>>> [READ], MaxLatency(us) 205567 222335
> >>>>> [READ], 95thPercentileLatency(us) 2365 3475
> >>>>> [READ], 99thPercentileLatency(us) 3099 4143
> >>>>> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> >>>>> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> >>>>> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> >>>>> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> >>>>> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> >>>>> [UPDATE], AverageLatency(us) 1700 1777
> >>>>> [UPDATE], MinLatency(us) 737 687
> >>>>> [UPDATE], MaxLatency(us) 97983 94271
> >>>>> [UPDATE], 95thPercentileLatency(us) 3377 4147
> >>>>> [UPDATE], 99thPercentileLatency(us) 4147 4831
> >>>>>
> >>>>>
> >>>>>> On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:
> >>>>>>
> >>>>>> Thanks for the efforts boss.
> >>>>>>
> >>>>>> Since it's a new minor release, do we have performance comparison
> >> report
> >>>>>> with 1.4.9 as we did when releasing 1.4.0? If so, any reference?
> Many
> >>>>>> thanks!
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Yu
> >>>>>>
> >>>>>>
> >>>>>> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org>
> >>>>> wrote:
> >>>>>>
> >>>>>>> The fourth HBase 1.5.0 release candidate (RC3) is available for
> >>>>> download
> >>>>>> at
> >>>>>>> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and
> >>>>> Maven
> >>>>>>> artifacts are available in the temporary repository
> >>>>>>>
> >>>>>
> >> https://repository.apache.org/content/repositories/orgapachehbase-1292/
> >>>>>>>
> >>>>>>> The git tag corresponding to the candidate is '1.5.0RC3’
> >> (b0bc7225c5).
> >>>>>>>
> >>>>>>> A detailed source and binary compatibility report for this release
> is
> >>>>>>> available for your review at
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> >>>>>>> .
> >>>>>>>
> >>>>>>> A list of the 115 issues resolved in this release can be found at
> >>>>>>> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from
> the
> >>>>>>> changelog of the last branch-1.4 release, 1.4.9.
> >>>>>>>
> >>>>>>> Please try out the candidate and vote +1/0/-1.
> >>>>>>>
> >>>>>>> The vote will be open for at least 72 hours. Unless objection I
> will
> >>>>> try
> >>>>>> to
> >>>>>>> close it Friday April 12, 2019 if we have sufficient votes.
> >>>>>>>
> >>>>>>> Prior to making this announcement I made the following preflight
> >>>>> checks:
> >>>>>>>
> >>>>>>>   RAT check passes (7u80)
> >>>>>>>   Unit test suite passes (7u80, 8u181)*
> >>>>>>>   Opened the UI in a browser, poked around
> >>>>>>>   LTT load 100M rows with 100% verification and 20% updates (8u181)
> >>>>>>>   ITBLL 1B rows with slowDeterministic monkey (8u181)
> >>>>>>>   ITBLL 1B rows with serverKilling monkey (8u181)
> >>>>>>>
> >>>>>>> There are known flaky tests. See HBASE-21904 and HBASE-21905. These
> >>>>> flaky
> >>>>>>> tests do not represent serious test failures that would prevent a
> >>>>>> release.
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Best regards,
> >>>>>>> Andrew
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best regards,
> >>>>> Andrew
> >>>>>
> >>>>> Words like orphans lost among the crosstalk, meaning torn from
> truth's
> >>>>> decrepit hands
> >>>>>  - A23, Crosstalk
> >>>>>
> >>>>
> >>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
And if they pass in my environment, what should we call it then? I have no doubt you are seeing failures. Can you please file JIRAs and attach information that can help identify a fix? Thanks.

> On Apr 11, 2019, at 8:35 PM, Yu Li <ca...@gmail.com> wrote:
> 
> I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
> and on two different env separately, so it sums up to 6 times stable
> failure for each case, and from my perspective this is not flaky.
> 
> IIRC last time when verifying 1.4.7 on the same env no such issue observed,
> will double check.
> 
> Best Regards,
> Yu
> 
> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
I ran the test suite with the -Dsurefire.rerunFailingTestsCount=2 option
on two different environments separately, so each case failed consistently
six times in total, and from my perspective this is not flaky.
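
To make the arithmetic concrete: with rerunFailingTestsCount=2, Surefire runs a failing test once and then retries it up to two more times, so a test that never passes is executed three times per environment, six times across two. A minimal sketch of that counting, and of the flaky versus consistent distinction, follows. The function names and the pass/fail lists are illustrative only and are not taken from the actual Surefire reports.

```python
def max_executions(rerun_failing_tests_count, environments):
    """Upper bound on executions of one failing test across all environments.

    Each environment runs the test once, then reruns it up to
    rerun_failing_tests_count times while it keeps failing.
    """
    return environments * (1 + rerun_failing_tests_count)


def classify(attempts):
    """Classify one test from its attempts in execution order (True = passed)."""
    if not attempts:
        raise ValueError("at least one attempt is required")
    if attempts[0]:
        return "pass"                # passed on the first run, no rerun needed
    if any(attempts[1:]):
        return "flaky"               # failed first, then passed on a rerun
    return "consistent failure"      # failed on every attempt


total_runs = max_executions(rerun_failing_tests_count=2, environments=2)
verdict = classify([False, False, False])
```

Under this reading, a test that fails all three attempts in both environments falls in the "consistent failure" bucket rather than the "flaky" one.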

IIRC, when I last verified 1.4.7 in the same environments, no such issue was
observed; I will double-check.

Best Regards,
Yu


On Fri, 12 Apr 2019 at 00:07, Andrew Purtell <an...@gmail.com>
wrote:

> There are two failure cases it looks like. And this looks like flakes.
>
> The wrong FS assertions are not something I see when I run these tests
> myself. I am not able to investigate something I can’t reproduce. What I
> suggest is since you can reproduce do a git bisect to find the commit that
> introduced the problem. Then we can revert it. As an alternative we can
> open a JIRA, report the problem, temporarily @ignore the test, and
> continue. This latter option only should be done if we are fairly confident
> it is a test only problem.
>
> The connect exceptions are interesting. I see these sometimes when the
> suite is executed, not this particular case, but when the failed test is
> executed by itself it always passes. It is possible some change to classes
> related to the minicluster or startup or shutdown timing are the cause, but
> it is test time flaky behavior. I’m not happy about this but it doesn’t
> actually fail the release because the failure is never repeatable when the
> test is run standalone.
>
> In general it would be great if some attention was paid to test
> cleanliness on branch-1. As RM I’m not in a position to insist that
> everything is perfect or there will never be another 1.x release, certainly
> not from branch-1. So, tests which fail repeatedly block a release IMHO but
> flakes do not.
>
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <an...@gmail.com>.
It looks like there are two failure cases, and both look like flakes.

The wrong FS assertions are not something I see when I run these tests myself. I am not able to investigate something I can’t reproduce. Since you can reproduce it, I suggest you do a git bisect to find the commit that introduced the problem. Then we can revert it. As an alternative we can open a JIRA, report the problem, temporarily @Ignore the test, and continue. This latter option should only be taken if we are fairly confident it is a test-only problem.
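
For reference, git bisect is a binary search over the commits between a known-good and a known-bad revision. The sketch below shows only the idea, assuming a linear history in which every commit after the first bad one also fails. The commit names and the is_bad predicate are hypothetical stand-ins for checking out a revision and running the failing test.

```python
def first_bad_commit(commits, is_bad):
    """Return the first failing commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. is_bad(commit) stands in for "check out this
    commit and run the failing test".
    """
    if len(commits) < 2:
        raise ValueError("need at least one good and one bad commit")
    lo, hi = 0, len(commits) - 1      # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid                  # the regression is at mid or earlier
        else:
            lo = mid                  # the regression is after mid
    return commits[hi]


# Hypothetical history in which the regression landed in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
culprit = first_bad_commit(history, lambda c: history.index(c) >= 5)
```

With n candidate commits this needs roughly log2(n) test runs, which is what makes bisecting practical even over a long range of commits.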

The connect exceptions are interesting. I sometimes see these when the whole suite is executed, though not in this particular test case, and when the failed test is executed by itself it always passes. It is possible that some change to classes related to the minicluster, or to startup or shutdown timing, is the cause, but it is test-time flaky behavior. I’m not happy about this, but it doesn’t actually fail the release because the failure is never repeatable when the test is run standalone.

In general it would be great if some attention were paid to test cleanliness on branch-1. As RM I’m not in a position to insist that everything be perfect, or there will never be another 1.x release, certainly not from branch-1. So, IMHO, tests which fail repeatedly block a release but flakes do not.


> On Apr 10, 2019, at 11:20 PM, Yu Li <ca...@gmail.com> wrote:
> 
> -1
> 
> Observed many UT failures when checking the source package (tried multiple
> rounds on two different environments, MacOs and Linux, got the same
> result), including (but not limited to):
> 
> TestBulkload:
> shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)
> Time elapsed: 0.083 s  <<< ERROR!
> java.lang.IllegalArgumentException: Wrong FS:
> file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29,
> expected: hdfs://localhost:55938
>        at
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
>        at
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
>        at
> org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)
> 
> TestStoreFile:
> testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)
> Time elapsed: 0.083 s  <<< ERROR!
> java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938
> failed on connection exception: java.net.ConnectException: Connection
> refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
>        at
> org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
>        at
> org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)
> 
> TestHFile:
> testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed:
> 0.08 s  <<< ERROR!
> java.net.ConnectException: Call From
> z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on
> connection exception: java.net.ConnectException: Connection refused; For
> more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
>        at
> org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> Caused by: java.net.ConnectException: Connection refused
>        at
> org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
> 
> TestBlocksScanned:
> testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)
> Time elapsed: 0.069 s  <<< ERROR!
> java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/
> hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08,
> expected: file:///
>        at
> org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
>        at
> org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)
> 
> And please let me know if any known issue I'm not aware of. Thanks.
> 
> Best Regards,
> Yu
> 

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
-1

Observed many UT failures when checking the source package (tried multiple
rounds on two different environments, macOS and Linux, and got the same
result), including (but not limited to):

TestBulkLoad:
shouldBulkLoadSingleFamilyHLog(org.apache.hadoop.hbase.regionserver.TestBulkLoad)  Time elapsed: 0.083 s  <<< ERROR!
java.lang.IllegalArgumentException: Wrong FS: file:/var/folders/t6/vch4nh357f98y1wlq09lbm7h0000gn/T/junit1805329913454564189/junit8020757893576011944/data/default/shouldBulkLoadSingleFamilyHLog/8f4a6b584533de2fd1bf3c398dfaac29, expected: hdfs://localhost:55938
        at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamiliesAndSpecifiedTableName(TestBulkLoad.java:246)
        at org.apache.hadoop.hbase.regionserver.TestBulkLoad.testRegionWithFamilies(TestBulkLoad.java:256)
        at org.apache.hadoop.hbase.regionserver.TestBulkLoad.shouldBulkLoadSingleFamilyHLog(TestBulkLoad.java:150)

TestStoreFile:
testCacheOnWriteEvictOnClose(org.apache.hadoop.hbase.regionserver.TestStoreFile)  Time elapsed: 0.083 s  <<< ERROR!
java.net.ConnectException: Call From localhost/127.0.0.1 to localhost:55938 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hbase.regionserver.TestStoreFile.writeStoreFile(TestStoreFile.java:1047)
        at org.apache.hadoop.hbase.regionserver.TestStoreFile.testCacheOnWriteEvictOnClose(TestStoreFile.java:908)

TestHFile:
testEmptyHFile(org.apache.hadoop.hbase.io.hfile.TestHFile)  Time elapsed: 0.08 s  <<< ERROR!
java.net.ConnectException: Call From z05f06378.sqa.zth.tbsite.net/11.163.183.195 to localhost:35529 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
        at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)
Caused by: java.net.ConnectException: Connection refused
        at org.apache.hadoop.hbase.io.hfile.TestHFile.testEmptyHFile(TestHFile.java:90)

TestBlocksScanned:
testBlocksScannedWithEncoding(org.apache.hadoop.hbase.regionserver.TestBlocksScanned)  Time elapsed: 0.069 s  <<< ERROR!
java.lang.IllegalArgumentException: Wrong FS: hdfs://localhost:35529/tmp/hbase-jueding.ly/hbase/data/default/TestBlocksScannedWithEncoding/a4a416cc3060d9820a621c294af0aa08, expected: file:///
        at org.apache.hadoop.hbase.regionserver.TestBlocksScanned._testBlocksScanned(TestBlocksScanned.java:90)
        at org.apache.hadoop.hbase.regionserver.TestBlocksScanned.testBlocksScannedWithEncoding(TestBlocksScanned.java:86)

Please let me know if these are known issues that I'm not aware of. Thanks.

Best Regards,
Yu


On Mon, 8 Apr 2019 at 11:38, Yu Li <ca...@gmail.com> wrote:

> The performance report LGTM, thanks! (and sorry for the lag due to
> Qingming Festival Holiday here in China)
>
> Still verifying the release, just some quick feedback: observed some
> incompatible changes in compatibility report including
> HBASE-21492/HBASE-21684 and worth a reminder in ReleaseNote.
>
> Irrelative but noticeable: the 1.4.9 release note URL is invalid on
> https://hbase.apache.org/downloads.html
>
> Best Regards,
> Yu
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
The performance report LGTM, thanks! (And sorry for the lag due to the Qingming
Festival holiday here in China.)

Still verifying the release, but some quick feedback: I observed some
incompatible changes in the compatibility report, including HBASE-21492 and
HBASE-21684, that are worth a reminder in the release notes.

Unrelated but worth noting: the 1.4.9 release note URL is invalid on
https://hbase.apache.org/downloads.html

Best Regards,
Yu


On Fri, 5 Apr 2019 at 08:45, Andrew Purtell <ap...@apache.org> wrote:

> The difference is basically noise per the usual YCSB evaluation. Small
> differences in workloads D and F (slightly worse) and workload E (slightly
> better) that do not indicate serious regression.
>
> Linux version 4.14.55-62.37.amzn1.x86_64
> c3.8xlarge x 5
> OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
> -Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
> -XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
> Hadoop 2.9.2
> Init: Load 100 M rows and snapshot
> Run: Delete table, clone and redeploy from snapshot, run 10 M operations
> Args: -threads 100 -target 50000
> Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
> => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
> 'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
> '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
> '0'}
>
>
> YCSB Workload A
>
> target 50k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 200592 200583
> [OVERALL], Throughput(ops/sec) 49852 49855
> [READ], AverageLatency(us) 544 559
> [READ], MinLatency(us) 267 292
> [READ], MaxLatency(us) 165631 185087
> [READ], 95thPercentileLatency(us) 738 742
> [READ], 99thPercentileLatency(us), 1877 1961
> [UPDATE], AverageLatency(us) 1370 1181
> [UPDATE], MinLatency(us) 702 646
> [UPDATE], MaxLatency(us) 180735 177279
> [UPDATE], 95thPercentileLatency(us) 1943 1652
> [UPDATE], 99thPercentileLatency(us) 3257 3085
>
> YCSB Workload B
>
> target 50k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 200599 200581
> [OVERALL], Throughput(ops/sec) 49850 49855
> [READ], AverageLatency(us),  454 471
> [READ], MinLatency(us) 203 213
> [READ], MaxLatency(us) 183423 174207
> [READ], 95thPercentileLatency(us) 563 599
> [READ], 99thPercentileLatency(us) 1360 1172
> [UPDATE], AverageLatency(us) 1064 1029
> [UPDATE], MinLatency(us) 746 726
> [UPDATE], MaxLatency(us) 163455 101631
> [UPDATE], 95thPercentileLatency(us) 1327 1157
> [UPDATE], 99thPercentileLatency(us) 2241 1898
>
> YCSB Workload C
>
> target 50k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 200541 200538
> [OVERALL], Throughput(ops/sec) 49865 49865
> [READ], AverageLatency(us) 332 327
> [READ], MinLatency(us) 175 179
> [READ], MaxLatency(us) 210559 170367
> [READ], 95thPercentileLatency(us) 410 396
> [READ], 99thPercentileLatency(us) 871 892
>
> YCSB Workload D
>
> target 50k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 200579 200562
> [OVERALL], Throughput(ops/sec) 49855 49859
> [READ], AverageLatency(us) 487 547
> [READ], MinLatency(us) 210 214
> [READ], MaxLatency(us) 192255 177535
> [READ], 95thPercentileLatency(us) 973 1529
> [READ], 99thPercentileLatency(us) 1836 2683
> [INSERT], AverageLatency(us) 1239 1152
> [INSERT], MinLatency(us) 807 788
> [INSERT], MaxLatency(us) 184575 148735
> [INSERT], 95thPercentileLatency(us) 1496 1243
> [INSERT], 99thPercentileLatency(us) 2965 2495
>
> YCSB Workload E
>
> target 10k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 100605 100568
> [OVERALL], Throughput(ops/sec) 9939 9943
> [SCAN], AverageLatency(us) 3548 2687
> [SCAN], MinLatency(us) 696 678
> [SCAN], MaxLatency(us) 1059839 238463
> [SCAN], 95thPercentileLatency(us) 8327 6791
> [SCAN], 99thPercentileLatency(us) 17647 14415
> [INSERT], AverageLatency(us) 2688 1555
> [INSERT], MinLatency(us) 887 815
> [INSERT], MaxLatency(us) 173311 154623
> [INSERT], 95thPercentileLatency(us) 4455 2571
> [INSERT], 99thPercentileLatency(us) 9303 5375
>
> YCSB Workload F
>
> target 50k/op/s 1.4.9 1.5.0
>
>
>
> [OVERALL], RunTime(ms) 200562 204178
> [OVERALL], Throughput(ops/sec) 49859 48976
> [READ], AverageLatency(us) 856 1137
> [READ], MinLatency(us) 262 257
> [READ], MaxLatency(us) 205567 222335
> [READ], 95thPercentileLatency(us) 2365 3475
> [READ], 99thPercentileLatency(us) 3099 4143
> [READ-MODIFY-WRITE], AverageLatency(us) 2559 2917
> [READ-MODIFY-WRITE], MinLatency(us) 1100 1034
> [READ-MODIFY-WRITE], MaxLatency(us) 208767 204799
> [READ-MODIFY-WRITE], 95thPercentileLatency(us) 5747 7627
> [READ-MODIFY-WRITE], 99thPercentileLatency(us) 7203 8919
> [UPDATE], AverageLatency(us) 1700 1777
> [UPDATE], MinLatency(us) 737 687
> [UPDATE], MaxLatency(us) 97983 94271
> [UPDATE], 95thPercentileLatency(us) 3377 4147
> [UPDATE], 99thPercentileLatency(us) 4147 4831
>
>
>
>
> --
> Best regards,
> Andrew
>
> Words like orphans lost among the crosstalk, meaning torn from truth's
> decrepit hands
>    - A23, Crosstalk
>

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Andrew Purtell <ap...@apache.org>.
The difference is basically noise per the usual YCSB evaluation. There are
small differences in workloads D and F (slightly worse) and workload E
(slightly better) that do not indicate a serious regression.

Linux version 4.14.55-62.37.amzn1.x86_64
c3.8xlarge x 5
OpenJDK Runtime Environment (build 1.8.0_181-shenandoah-b13)
-Xms20g -Xmx20g -XX:+UseG1GC -XX:+AlwaysPreTouch -XX:+UseNUMA
-XX:-UseBiasedLocking -XX:+ParallelRefProcEnabled
Hadoop 2.9.2
Init: Load 100 M rows and snapshot
Run: Delete table, clone and redeploy from snapshot, run 10 M operations
Args: -threads 100 -target 50000
Test table: {NAME => 'u', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY
=> 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING =>
'ROW_INDEX_V1', TTL => 'FOREVER', COMPRESSION => 'SNAPPY', MIN_VERSIONS =>
'0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE =>
'0'}


YCSB Workload A

target 50k ops/sec                      1.4.9    1.5.0
[OVERALL], RunTime(ms)                 200592   200583
[OVERALL], Throughput(ops/sec)          49852    49855
[READ], AverageLatency(us)                544      559
[READ], MinLatency(us)                    267      292
[READ], MaxLatency(us)                 165631   185087
[READ], 95thPercentileLatency(us)         738      742
[READ], 99thPercentileLatency(us)        1877     1961
[UPDATE], AverageLatency(us)             1370     1181
[UPDATE], MinLatency(us)                  702      646
[UPDATE], MaxLatency(us)               180735   177279
[UPDATE], 95thPercentileLatency(us)      1943     1652
[UPDATE], 99thPercentileLatency(us)      3257     3085

YCSB Workload B

target 50k ops/sec                      1.4.9    1.5.0
[OVERALL], RunTime(ms)                 200599   200581
[OVERALL], Throughput(ops/sec)          49850    49855
[READ], AverageLatency(us)                454      471
[READ], MinLatency(us)                    203      213
[READ], MaxLatency(us)                 183423   174207
[READ], 95thPercentileLatency(us)         563      599
[READ], 99thPercentileLatency(us)        1360     1172
[UPDATE], AverageLatency(us)             1064     1029
[UPDATE], MinLatency(us)                  746      726
[UPDATE], MaxLatency(us)               163455   101631
[UPDATE], 95thPercentileLatency(us)      1327     1157
[UPDATE], 99thPercentileLatency(us)      2241     1898

YCSB Workload C

target 50k ops/sec                      1.4.9    1.5.0
[OVERALL], RunTime(ms)                 200541   200538
[OVERALL], Throughput(ops/sec)          49865    49865
[READ], AverageLatency(us)                332      327
[READ], MinLatency(us)                    175      179
[READ], MaxLatency(us)                 210559   170367
[READ], 95thPercentileLatency(us)         410      396
[READ], 99thPercentileLatency(us)         871      892

YCSB Workload D

target 50k ops/sec                      1.4.9    1.5.0
[OVERALL], RunTime(ms)                 200579   200562
[OVERALL], Throughput(ops/sec)          49855    49859
[READ], AverageLatency(us)                487      547
[READ], MinLatency(us)                    210      214
[READ], MaxLatency(us)                 192255   177535
[READ], 95thPercentileLatency(us)         973     1529
[READ], 99thPercentileLatency(us)        1836     2683
[INSERT], AverageLatency(us)             1239     1152
[INSERT], MinLatency(us)                  807      788
[INSERT], MaxLatency(us)               184575   148735
[INSERT], 95thPercentileLatency(us)      1496     1243
[INSERT], 99thPercentileLatency(us)      2965     2495

YCSB Workload E

target 10k ops/sec                      1.4.9    1.5.0
[OVERALL], RunTime(ms)                 100605   100568
[OVERALL], Throughput(ops/sec)           9939     9943
[SCAN], AverageLatency(us)               3548     2687
[SCAN], MinLatency(us)                    696      678
[SCAN], MaxLatency(us)                1059839   238463
[SCAN], 95thPercentileLatency(us)        8327     6791
[SCAN], 99thPercentileLatency(us)       17647    14415
[INSERT], AverageLatency(us)             2688     1555
[INSERT], MinLatency(us)                  887      815
[INSERT], MaxLatency(us)               173311   154623
[INSERT], 95thPercentileLatency(us)      4455     2571
[INSERT], 99thPercentileLatency(us)      9303     5375

YCSB Workload F

target 50k ops/sec                                 1.4.9    1.5.0
[OVERALL], RunTime(ms)                            200562   204178
[OVERALL], Throughput(ops/sec)                     49859    48976
[READ], AverageLatency(us)                           856     1137
[READ], MinLatency(us)                               262      257
[READ], MaxLatency(us)                            205567   222335
[READ], 95thPercentileLatency(us)                   2365     3475
[READ], 99thPercentileLatency(us)                   3099     4143
[READ-MODIFY-WRITE], AverageLatency(us)             2559     2917
[READ-MODIFY-WRITE], MinLatency(us)                 1100     1034
[READ-MODIFY-WRITE], MaxLatency(us)               208767   204799
[READ-MODIFY-WRITE], 95thPercentileLatency(us)      5747     7627
[READ-MODIFY-WRITE], 99thPercentileLatency(us)      7203     8919
[UPDATE], AverageLatency(us)                        1700     1777
[UPDATE], MinLatency(us)                             737      687
[UPDATE], MaxLatency(us)                           97983    94271
[UPDATE], 95thPercentileLatency(us)                 3377     4147
[UPDATE], 99thPercentileLatency(us)                 4147     4831
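
To put the differences called out for workloads D, E and F into relative terms, the figures above can be run through a simple percent-change calculation. This calculation is not part of the original evaluation. The class name and the choice of rows are illustrative assumptions, while the numbers themselves are copied from the tables (1.4.9 first, 1.5.0 second).

```java
// Relative change between two benchmark figures. Illustrative helper only.
final class YcsbDelta {

    // Percent change from baseline to candidate; positive means the candidate value is larger.
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // A few rows copied from the tables above; the selection is arbitrary.
        String[] names = {
            "D [READ] 95thPercentileLatency(us)",
            "E [SCAN] AverageLatency(us)",
            "F [OVERALL] Throughput(ops/sec)",
            "F [READ] AverageLatency(us)",
        };
        double[][] values = {
            { 973, 1529 },
            { 3548, 2687 },
            { 49859, 48976 },
            { 856, 1137 },
        };
        for (int i = 0; i < names.length; i++) {
            double change = percentChange(values[i][0], values[i][1]);
            System.out.printf("%-36s %+7.1f%%%n", names[i], change);
        }
    }
}
```

For latency rows a positive result means 1.5.0 was slower; for the throughput row a negative result means 1.5.0 was slower.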


On Thu, Apr 4, 2019 at 1:14 AM Yu Li <ca...@gmail.com> wrote:

> Thanks for the efforts boss.
>
> Since it's a new minor release, do we have performance comparison report
> with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
> thanks!
>
> Best Regards,
> Yu
>
>
> On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org> wrote:
>
> > The fourth HBase 1.5.0 release candidate (RC3) is available for download
> at
> > https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
> > artifacts are available in the temporary repository
> > https://repository.apache.org/content/repositories/orgapachehbase-1292/
> >
> > The git tag corresponding to the candidate is '1.5.0RC3’ (b0bc7225c5).
> >
> > A detailed source and binary compatibility report for this release is
> > available for your review at
> >
> >
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> > .
> >
> > A list of the 115 issues resolved in this release can be found at
> > https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> > changelog of the last branch-1.4 release, 1.4.9.
> >
> > Please try out the candidate and vote +1/0/-1.
> >
> > The vote will be open for at least 72 hours. Unless objection I will try to
> > close it Friday April 12, 2019 if we have sufficient votes.
> >
> > Prior to making this announcement I made the following preflight checks:
> >
> >     RAT check passes (7u80)
> >     Unit test suite passes (7u80, 8u181)*
> >     Opened the UI in a browser, poked around
> >     LTT load 100M rows with 100% verification and 20% updates (8u181)
> >     ITBLL 1B rows with slowDeterministic monkey (8u181)
> >     ITBLL 1B rows with serverKilling monkey (8u181)
> >
> > There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
> > tests do not represent serious test failures that would prevent a release.
> >
> >
> > --
> > Best regards,
> > Andrew
> >
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: The fourth HBase 1.5.0 release candidate (RC3) is available

Posted by Yu Li <ca...@gmail.com>.
Thanks for the efforts boss.

Since it's a new minor release, do we have performance comparison report
with 1.4.9 as we did when releasing 1.4.0? If so, any reference? Many
thanks!

Best Regards,
Yu


On Thu, 4 Apr 2019 at 07:44, Andrew Purtell <ap...@apache.org> wrote:

> The fourth HBase 1.5.0 release candidate (RC3) is available for download at
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/ and Maven
> artifacts are available in the temporary repository
> https://repository.apache.org/content/repositories/orgapachehbase-1292/
>
> The git tag corresponding to the candidate is '1.5.0RC3’ (b0bc7225c5).
>
> A detailed source and binary compatibility report for this release is
> available for your review at
>
> https://dist.apache.org/repos/dist/dev/hbase/hbase-1.5.0RC3/compat-check-report.html
> .
>
> A list of the 115 issues resolved in this release can be found at
> https://s.apache.org/K4Wk . The 1.5.0 changelog is derived from the
> changelog of the last branch-1.4 release, 1.4.9.
>
> Please try out the candidate and vote +1/0/-1.
>
> The vote will be open for at least 72 hours. Unless objection I will try to
> close it Friday April 12, 2019 if we have sufficient votes.
>
> Prior to making this announcement I made the following preflight checks:
>
>     RAT check passes (7u80)
>     Unit test suite passes (7u80, 8u181)*
>     Opened the UI in a browser, poked around
>     LTT load 100M rows with 100% verification and 20% updates (8u181)
>     ITBLL 1B rows with slowDeterministic monkey (8u181)
>     ITBLL 1B rows with serverKilling monkey (8u181)
>
> There are known flaky tests. See HBASE-21904 and HBASE-21905. These flaky
> tests do not represent serious test failures that would prevent a release.
>
>
> --
> Best regards,
> Andrew
>