You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/08/04 02:09:13 UTC

[GitHub] [hbase] joshelser opened a new pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

joshelser opened a new pull request #2193:
URL: https://github.com/apache/hbase/pull/2193


   Testing done:
   ```
   create 'j', 'f'
   add_peer '1', ENDPOINT_CLASSNAME => 'org.apache.hadoop.hbase.replication.TestReplicationEndpoint$SleepingReplicationEndpointForTest'
   alter 'j', {NAME=>'f', REPLICATION_SCOPE=>1}
   (1..10000).each{|y| (1..100).each{|x| put 'j', "#{x}#{y}", 'f:q', "#{x}#{y}"}; sleep 0.5}
   ```
   
   And then to observe:
   1. set `replication.stats.thread.period.seconds` to `15` in hbase-site.xml and tail regionserver log
   2. `watch -n 15 "curl -s http://mizar.local:16030/jmx | jq '.beans[] | select(.name == \"Hadoop:service=HBase,name=RegionServer,sub=Replication\") | .\"source.walReaderEditsBufferUsage\"'"`
   
   Can see the buffer grow. When the puts run out, the buffer eventually drains.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465103092



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
##########
@@ -244,17 +248,22 @@ void addHFileRefsToQueue(TableName tableName, byte[] family, List<Pair<Path, Pat
 
     private final ReplicationSink replicationSink;
     private final ReplicationSourceManager replicationManager;
+    private final MetricsReplicationSourceSource globalMetricsSource;
 
     public ReplicationStatisticsTask(ReplicationSink replicationSink,
-        ReplicationSourceManager replicationManager) {
+        ReplicationSourceManager replicationManager, MetricsReplicationSourceSource globalMetricsSource) {
       this.replicationManager = replicationManager;
       this.replicationSink = replicationSink;
+      this.globalMetricsSource = globalMetricsSource;
     }
 
     @Override
     public void run() {
       printStats(this.replicationManager.getStats());
       printStats(this.replicationSink.getStats());
+
+      // Report how much data we've read off disk which is pending replication, across all sources
+      globalMetricsSource.setWALReaderEditsBufferBytes(replicationManager.getTotalBufferUsed().get());

Review comment:
       The rest of the global metrics are tied to some regular "workflow" of replication, e.g. when a new file is queued.
   
   We could piggy-back onto one of those, but I wasn't sure what would be an "appropriate" time in which to report this, so I thought a regular period would be better. Open to suggestions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668731763


   Thanks folks! Just pushed a couple more commits for cleanup on QA and wellington's suggestions.
   
   I'll merge when QA is happy.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465154338



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
##########
@@ -244,17 +248,22 @@ void addHFileRefsToQueue(TableName tableName, byte[] family, List<Pair<Path, Pat
 
     private final ReplicationSink replicationSink;
     private final ReplicationSourceManager replicationManager;
+    private final MetricsReplicationSourceSource globalMetricsSource;
 
     public ReplicationStatisticsTask(ReplicationSink replicationSink,
-        ReplicationSourceManager replicationManager) {
+        ReplicationSourceManager replicationManager, MetricsReplicationSourceSource globalMetricsSource) {
       this.replicationManager = replicationManager;
       this.replicationSink = replicationSink;
+      this.globalMetricsSource = globalMetricsSource;
     }
 
     @Override
     public void run() {
       printStats(this.replicationManager.getStats());
       printStats(this.replicationSink.getStats());
+
+      // Report how much data we've read off disk which is pending replication, across all sources
+      globalMetricsSource.setWALReaderEditsBufferBytes(replicationManager.getTotalBufferUsed().get());

Review comment:
       Pushed this down into each source and it's not too bad.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] asfgit closed pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #2193:
URL: https://github.com/apache/hbase/pull/2193


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668841506


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 31s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 15s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 35s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 51s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 33s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 19s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 40s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 35s |  hbase-server generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 33s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 145m 27s |  hbase-server in the patch failed.  |
   |  |   | 171m 57s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 2c92a83078cb 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | Default Java | 1.8.0_232 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk8-hadoop3-check/output/diff-javadoc-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/testReport/ |
   | Max. process+thread count | 4593 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668379724


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 28s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 39s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 36s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 23s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 32s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 38s |  hbase-server generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 34s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 140m 34s |  hbase-server in the patch failed.  |
   |  |   | 167m 11s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux d2a0a5f3deff 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 148c185486 |
   | Default Java | 1.8.0_232 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-jdk8-hadoop3-check/output/diff-javadoc-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/testReport/ |
   | Max. process+thread count | 4209 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668766824


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 34s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 49s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 54s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 26s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 13s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 13s |  the patch passed  |
   | -1 :x: |  shadedjars  |   2m 38s |  patch has 10 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 37s |  hbase-server generated 1 new + 28 unchanged - 0 fixed = 29 total (was 28)  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 34s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 153m 49s |  hbase-server in the patch failed.  |
   |  |   | 178m  5s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux cb42875b271d 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | Default Java | 1.8.0_232 |
   | shadedjars | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk8-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk8-hadoop3-check/output/diff-javadoc-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/testReport/ |
   | Max. process+thread count | 4770 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465100973



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       Once we chuck something into this usage, we have zero insight back to which source put it there. Wellington was working on this buffer tracking in HBASE-24813, but I don't think per-source tracking was "in scope"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465118725



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -76,4 +77,13 @@
   long getWALEditsRead();
   long getShippedOps();
   long getEditsFiltered();
+  /**
+   * Sets the total usage of memory used by edits in memory read from WALs.
+   * @param usage The memory used by edits in bytes
+   */
+  void setWALReaderEditsBufferBytes(long usage);

Review comment:
       Yeah, so the reason this was done this was is that we had no interface for global metrics, just a second implementation of the per-source interface.
   
   Stubbed in a new interface so that we can have "global-metrics-only" methods that don't pollute per-source metrics.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465182276



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationGlobalSourceSource.java
##########
@@ -1,253 +1,19 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-

Review comment:
       gotta put this back.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668870260


   Interesting. TestWALEntryStream timed out for me too. Will dig in.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-670271614


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 10s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 17s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 25s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 55s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 22s |  hbase-hadoop-compat in master failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  0s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 26s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 51s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 18s |  hbase-hadoop-compat in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 37s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 137m 52s |  hbase-server in the patch failed.  |
   |  |   | 167m 41s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 99c53a8bc42d 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f710d2d654 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/testReport/ |
   | Max. process+thread count | 3991 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668698173


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 36s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 18s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 27s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 22s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 13s |  hbase-hadoop-compat: The patch generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)  |
   | -0 :warning: |  checkstyle  |   1m  3s |  hbase-server: The patch generated 2 new + 7 unchanged - 0 fixed = 9 total (was 7)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 15s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 45s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  asflicense  |   0m 26s |  The patch generated 1 ASF License warnings.  |
   |  |   |  35m  3s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 815f6c25f8a8 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-hadoop-compat.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | asflicense | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-general-check/output/patch-asflicense-problems.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465204713



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationGlobalSourceSource.java
##########
@@ -1,253 +1,19 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
 package org.apache.hadoop.hbase.replication.regionserver;
 
-import org.apache.hadoop.metrics2.lib.MutableFastCounter;
-import org.apache.hadoop.metrics2.lib.MutableGaugeLong;
-import org.apache.hadoop.metrics2.lib.MutableHistogram;
 import org.apache.yetus.audience.InterfaceAudience;
 
 @InterfaceAudience.Private
-public class MetricsReplicationGlobalSourceSource implements MetricsReplicationSourceSource{
-  private static final String KEY_PREFIX = "source.";
-
-  private final MetricsReplicationSourceImpl rms;
-
-  private final MutableHistogram ageOfLastShippedOpHist;
-  private final MutableGaugeLong sizeOfLogQueueGauge;
-  private final MutableFastCounter logReadInEditsCounter;
-  private final MutableFastCounter walEditsFilteredCounter;
-  private final MutableFastCounter shippedBatchesCounter;
-  private final MutableFastCounter shippedOpsCounter;
-  private final MutableFastCounter shippedBytesCounter;
-  private final MutableFastCounter logReadInBytesCounter;
-  private final MutableFastCounter shippedHFilesCounter;
-  private final MutableGaugeLong sizeOfHFileRefsQueueGauge;
-  private final MutableFastCounter unknownFileLengthForClosedWAL;
-  private final MutableFastCounter uncleanlyClosedWAL;
-  private final MutableFastCounter uncleanlyClosedSkippedBytes;
-  private final MutableFastCounter restartWALReading;
-  private final MutableFastCounter repeatedFileBytes;
-  private final MutableFastCounter completedWAL;
-  private final MutableFastCounter completedRecoveryQueue;
-  private final MutableFastCounter failedRecoveryQueue;
-
-  public MetricsReplicationGlobalSourceSource(MetricsReplicationSourceImpl rms) {
-    this.rms = rms;
-
-    ageOfLastShippedOpHist = rms.getMetricsRegistry().getHistogram(SOURCE_AGE_OF_LAST_SHIPPED_OP);
-
-    sizeOfLogQueueGauge = rms.getMetricsRegistry().getGauge(SOURCE_SIZE_OF_LOG_QUEUE, 0L);
-
-    shippedBatchesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_BATCHES, 0L);
-
-    shippedOpsCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_OPS, 0L);
-
-    shippedBytesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_BYTES, 0L);
-
-    logReadInBytesCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_READ_IN_BYTES, 0L);
-
-    logReadInEditsCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_READ_IN_EDITS, 0L);
-
-    walEditsFilteredCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_EDITS_FILTERED, 0L);
-
-    shippedHFilesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_HFILES, 0L);
-
-    sizeOfHFileRefsQueueGauge =
-        rms.getMetricsRegistry().getGauge(SOURCE_SIZE_OF_HFILE_REFS_QUEUE, 0L);
-
-    unknownFileLengthForClosedWAL = rms.getMetricsRegistry()
-            .getCounter(SOURCE_CLOSED_LOGS_WITH_UNKNOWN_LENGTH, 0L);
-    uncleanlyClosedWAL = rms.getMetricsRegistry().getCounter(SOURCE_UNCLEANLY_CLOSED_LOGS, 0L);
-    uncleanlyClosedSkippedBytes = rms.getMetricsRegistry()
-            .getCounter(SOURCE_UNCLEANLY_CLOSED_IGNORED_IN_BYTES, 0L);
-    restartWALReading = rms.getMetricsRegistry().getCounter(SOURCE_RESTARTED_LOG_READING, 0L);
-    repeatedFileBytes = rms.getMetricsRegistry().getCounter(SOURCE_REPEATED_LOG_FILE_BYTES, 0L);
-    completedWAL = rms.getMetricsRegistry().getCounter(SOURCE_COMPLETED_LOGS, 0L);
-    completedRecoveryQueue = rms.getMetricsRegistry()
-            .getCounter(SOURCE_COMPLETED_RECOVERY_QUEUES, 0L);
-    failedRecoveryQueue = rms.getMetricsRegistry()
-            .getCounter(SOURCE_FAILED_RECOVERY_QUEUES, 0L);
-  }
-
-  @Override public void setLastShippedAge(long age) {
-    ageOfLastShippedOpHist.add(age);
-  }
-
-  @Override public void incrSizeOfLogQueue(int size) {
-    sizeOfLogQueueGauge.incr(size);
-  }
-
-  @Override public void decrSizeOfLogQueue(int size) {
-    sizeOfLogQueueGauge.decr(size);
-  }
-
-  @Override public void incrLogReadInEdits(long size) {
-    logReadInEditsCounter.incr(size);
-  }
-
-  @Override public void incrLogEditsFiltered(long size) {
-    walEditsFilteredCounter.incr(size);
-  }
-
-  @Override public void incrBatchesShipped(int batches) {
-    shippedBatchesCounter.incr(batches);
-  }
-
-  @Override public void incrOpsShipped(long ops) {
-    shippedOpsCounter.incr(ops);
-  }
-
-  @Override public void incrShippedBytes(long size) {
-    shippedBytesCounter.incr(size);
-  }
-
-  @Override public void incrLogReadInBytes(long size) {
-    logReadInBytesCounter.incr(size);
-  }
-
-  @Override public void clear() {
-  }
-
-  @Override
-  public long getLastShippedAge() {
-    return ageOfLastShippedOpHist.getMax();
-  }
-
-  @Override public void incrHFilesShipped(long hfiles) {
-    shippedHFilesCounter.incr(hfiles);
-  }
-
-  @Override
-  public void incrSizeOfHFileRefsQueue(long size) {
-    sizeOfHFileRefsQueueGauge.incr(size);
-  }
-
-  @Override
-  public void decrSizeOfHFileRefsQueue(long size) {
-    sizeOfHFileRefsQueueGauge.decr(size);
-  }
-
-  @Override
-  public int getSizeOfLogQueue() {
-    return (int)sizeOfLogQueueGauge.value();
-  }
-
-  @Override
-  public void incrUnknownFileLengthForClosedWAL() {
-    unknownFileLengthForClosedWAL.incr(1L);
-  }
-  @Override
-  public void incrUncleanlyClosedWALs() {
-    uncleanlyClosedWAL.incr(1L);
-  }
-  @Override
-  public void incrBytesSkippedInUncleanlyClosedWALs(final long bytes) {
-    uncleanlyClosedSkippedBytes.incr(bytes);
-  }
-  @Override
-  public void incrRestartedWALReading() {
-    restartWALReading.incr(1L);
-  }
-  @Override
-  public void incrRepeatedFileBytes(final long bytes) {
-    repeatedFileBytes.incr(bytes);
-  }
-  @Override
-  public void incrCompletedWAL() {
-    completedWAL.incr(1L);
-  }
-  @Override
-  public void incrCompletedRecoveryQueue() {
-    completedRecoveryQueue.incr(1L);
-  }
-  @Override
-  public void incrFailedRecoveryQueue() {
-    failedRecoveryQueue.incr(1L);
-  }
-  @Override
-  public void init() {
-    rms.init();
-  }
-
-  @Override
-  public void setGauge(String gaugeName, long value) {
-    rms.setGauge(KEY_PREFIX + gaugeName, value);
-  }
-
-  @Override
-  public void incGauge(String gaugeName, long delta) {
-    rms.incGauge(KEY_PREFIX + gaugeName, delta);
-  }
-
-  @Override
-  public void decGauge(String gaugeName, long delta) {
-    rms.decGauge(KEY_PREFIX + gaugeName, delta);
-  }
-
-  @Override
-  public void removeMetric(String key) {
-    rms.removeMetric(KEY_PREFIX + key);
-  }
-
-  @Override
-  public void incCounters(String counterName, long delta) {
-    rms.incCounters(KEY_PREFIX + counterName, delta);
-  }
-
-  @Override
-  public void updateHistogram(String name, long value) {
-    rms.updateHistogram(KEY_PREFIX + name, value);
-  }
-
-  @Override
-  public String getMetricsContext() {
-    return rms.getMetricsContext();
-  }
-
-  @Override
-  public String getMetricsDescription() {
-    return rms.getMetricsDescription();
-  }
-
-  @Override
-  public String getMetricsJmxContext() {
-    return rms.getMetricsJmxContext();
-  }
-
-  @Override
-  public String getMetricsName() {
-    return rms.getMetricsName();
-  }
-
-  @Override
-  public long getWALEditsRead() {
-    return this.logReadInEditsCounter.value();
-  }
-
-  @Override
-  public long getShippedOps() {
-    return this.shippedOpsCounter.value();
-  }
-
-  @Override
-  public long getEditsFiltered() {
-    return this.walEditsFilteredCounter.value();
-  }
+public interface MetricsReplicationGlobalSourceSource extends MetricsReplicationSourceSource {
+
+  public static final String SOURCE_WAL_READER_EDITS_BUFFER = "source.walReaderEditsBufferUsage";
+
+  /**
+   * Sets the total usage of memory used by edits in memory read from WALs.
+   * @param usage The memory used by edits in bytes
+   */
+  void setWALReaderEditsBufferBytes(long usage);

Review comment:
       I like it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
bharathv commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465216898



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       Oh, I see. Looks like you pushed the metrics update logic into the source, the patch looks clean now.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-670613317


   TestReplicationSource failure appears to be due to HBASE-24817 (#2198) 
   ```
   [INFO] Running org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource
   [ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 49.293 s <<< FAILURE! - in org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource
   [ERROR] org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter  Time elapsed: 1.641 s  <<< FAILURE!
   java.lang.AssertionError
   	at org.apache.hadoop.hbase.replication.regionserver.TestReplicationSource.testWALEntryFilter(TestReplicationSource.java:177)
   ```
   We end up filtering out an edit because it's empty, but we expect to not filter it because it's not a system table edit.
   
   I'll make the same mocking changes to prevent future pain in TestReplicationSource (like I did for TestWALEntryStream) in a follow-on, as well as fix this test. FYI @saintstack 
   
   Pushing this to make some progress.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465189517



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationGlobalSourceSource.java
##########
@@ -1,253 +1,19 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-

Review comment:
       oops!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465218113



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       yup! Just made another interface to isolate the new additions which cleans this up a little. I think your suggestion is still good for better, fine-grained tracking. However, since HBASE-20417, hopefully no one else runs into this ;)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668783695


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 28s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 22s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 43s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 20s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 45s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 19s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  2s |  hbase-server: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 17s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 48s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 25s |  The patch does not generate ASF License warnings.  |
   |  |   |  35m 37s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 88ef5b9d1868 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] bharathv commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
bharathv commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465146006



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       > Once we chuck something into this usage, we have zero insight back to which source put it there.
   
   I had the same question. I was wondering if a drill down by source would be helpful in addition to the global usage. Correct me if I'm wrong, looks like ReplicationSource class has access to the MetricsSource object.. we can just update the byte usage for that source? (and the global too at the same time). That way we can also get rid of the special logic to update one metric setWALReaderEditsBufferBytes(). Does that not work for some reason?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465212657



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java
##########
@@ -276,6 +274,8 @@ public Path getCurrentPath() {
   private boolean checkQuota() {
     // try not to go over total quota
     if (totalBufferUsed.get() > totalBufferQuota) {
+      LOG.warn("Can't read more edits from WAL as buffer usage {}B exceeds limit {}B",

Review comment:
       Sure, that's easy to add.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-670235171


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 35s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 38s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 17s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 29s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 27s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  5s |  hbase-server: The patch generated 4 new + 10 unchanged - 0 fixed = 14 total (was 10)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 20s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m  2s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 26s |  The patch does not generate ASF License warnings.  |
   |  |   |  37m  7s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 273d4a64f845 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / f710d2d654 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465101199



##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationEndpoint.java
##########
@@ -497,6 +497,33 @@ public boolean canReplicateToSameCluster() {
     }
   }
 
+  public static class SleepingReplicationEndpointForTest extends ReplicationEndpointForTest {

Review comment:
       Will do.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668838364


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 12s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 56s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 43s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-hadoop-compat in master failed.  |
   | -0 :warning: |  javadoc  |   0m 38s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  4s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 22s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 22s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 47s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-hadoop-compat in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 33s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 134m 41s |  hbase-server in the patch failed.  |
   |  |   | 162m 51s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 87b66f02ce48 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/testReport/ |
   | Max. process+thread count | 4261 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465154178



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       > looks like ReplicationSource class has access to the MetricsSource object.. we can just update the byte usage for that source? (and the global too at the same time). That way we can also get rid of the special logic to update one metric setWALReaderEditsBufferBytes()
   
   You are correct that we could do that. I wanted to keep this change scoped on "make what we currently have reportable". I am all for doing a per-source tracking in addition to the globally-scoped tracking. I'd rather just keep these two things separate :)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-670773715


   Filed https://issues.apache.org/jira/browse/HBASE-24834 to address the TestReplicationSource failure mentioned above.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465102036



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -76,4 +77,13 @@
   long getWALEditsRead();
   long getShippedOps();
   long getEditsFiltered();
+  /**
+   * Sets the total usage of memory used by edits in memory read from WALs.
+   * @param usage The memory used by edits in bytes
+   */
+  void setWALReaderEditsBufferBytes(long usage);

Review comment:
       IIRC, the type hierarchy is terrible and prevented this from being done in a more clean way (that wouldn't require an unsafe cast), but let me double-check that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] wchevreuil commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
wchevreuil commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r465159767



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationGlobalSourceSource.java
##########
@@ -1,253 +1,19 @@
-/**
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- *     http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
 package org.apache.hadoop.hbase.replication.regionserver;
 
-import org.apache.hadoop.metrics2.lib.MutableFastCounter;
-import org.apache.hadoop.metrics2.lib.MutableGaugeLong;
-import org.apache.hadoop.metrics2.lib.MutableHistogram;
 import org.apache.yetus.audience.InterfaceAudience;
 
 @InterfaceAudience.Private
-public class MetricsReplicationGlobalSourceSource implements MetricsReplicationSourceSource{
-  private static final String KEY_PREFIX = "source.";
-
-  private final MetricsReplicationSourceImpl rms;
-
-  private final MutableHistogram ageOfLastShippedOpHist;
-  private final MutableGaugeLong sizeOfLogQueueGauge;
-  private final MutableFastCounter logReadInEditsCounter;
-  private final MutableFastCounter walEditsFilteredCounter;
-  private final MutableFastCounter shippedBatchesCounter;
-  private final MutableFastCounter shippedOpsCounter;
-  private final MutableFastCounter shippedBytesCounter;
-  private final MutableFastCounter logReadInBytesCounter;
-  private final MutableFastCounter shippedHFilesCounter;
-  private final MutableGaugeLong sizeOfHFileRefsQueueGauge;
-  private final MutableFastCounter unknownFileLengthForClosedWAL;
-  private final MutableFastCounter uncleanlyClosedWAL;
-  private final MutableFastCounter uncleanlyClosedSkippedBytes;
-  private final MutableFastCounter restartWALReading;
-  private final MutableFastCounter repeatedFileBytes;
-  private final MutableFastCounter completedWAL;
-  private final MutableFastCounter completedRecoveryQueue;
-  private final MutableFastCounter failedRecoveryQueue;
-
-  public MetricsReplicationGlobalSourceSource(MetricsReplicationSourceImpl rms) {
-    this.rms = rms;
-
-    ageOfLastShippedOpHist = rms.getMetricsRegistry().getHistogram(SOURCE_AGE_OF_LAST_SHIPPED_OP);
-
-    sizeOfLogQueueGauge = rms.getMetricsRegistry().getGauge(SOURCE_SIZE_OF_LOG_QUEUE, 0L);
-
-    shippedBatchesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_BATCHES, 0L);
-
-    shippedOpsCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_OPS, 0L);
-
-    shippedBytesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_BYTES, 0L);
-
-    logReadInBytesCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_READ_IN_BYTES, 0L);
-
-    logReadInEditsCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_READ_IN_EDITS, 0L);
-
-    walEditsFilteredCounter = rms.getMetricsRegistry().getCounter(SOURCE_LOG_EDITS_FILTERED, 0L);
-
-    shippedHFilesCounter = rms.getMetricsRegistry().getCounter(SOURCE_SHIPPED_HFILES, 0L);
-
-    sizeOfHFileRefsQueueGauge =
-        rms.getMetricsRegistry().getGauge(SOURCE_SIZE_OF_HFILE_REFS_QUEUE, 0L);
-
-    unknownFileLengthForClosedWAL = rms.getMetricsRegistry()
-            .getCounter(SOURCE_CLOSED_LOGS_WITH_UNKNOWN_LENGTH, 0L);
-    uncleanlyClosedWAL = rms.getMetricsRegistry().getCounter(SOURCE_UNCLEANLY_CLOSED_LOGS, 0L);
-    uncleanlyClosedSkippedBytes = rms.getMetricsRegistry()
-            .getCounter(SOURCE_UNCLEANLY_CLOSED_IGNORED_IN_BYTES, 0L);
-    restartWALReading = rms.getMetricsRegistry().getCounter(SOURCE_RESTARTED_LOG_READING, 0L);
-    repeatedFileBytes = rms.getMetricsRegistry().getCounter(SOURCE_REPEATED_LOG_FILE_BYTES, 0L);
-    completedWAL = rms.getMetricsRegistry().getCounter(SOURCE_COMPLETED_LOGS, 0L);
-    completedRecoveryQueue = rms.getMetricsRegistry()
-            .getCounter(SOURCE_COMPLETED_RECOVERY_QUEUES, 0L);
-    failedRecoveryQueue = rms.getMetricsRegistry()
-            .getCounter(SOURCE_FAILED_RECOVERY_QUEUES, 0L);
-  }
-
-  @Override public void setLastShippedAge(long age) {
-    ageOfLastShippedOpHist.add(age);
-  }
-
-  @Override public void incrSizeOfLogQueue(int size) {
-    sizeOfLogQueueGauge.incr(size);
-  }
-
-  @Override public void decrSizeOfLogQueue(int size) {
-    sizeOfLogQueueGauge.decr(size);
-  }
-
-  @Override public void incrLogReadInEdits(long size) {
-    logReadInEditsCounter.incr(size);
-  }
-
-  @Override public void incrLogEditsFiltered(long size) {
-    walEditsFilteredCounter.incr(size);
-  }
-
-  @Override public void incrBatchesShipped(int batches) {
-    shippedBatchesCounter.incr(batches);
-  }
-
-  @Override public void incrOpsShipped(long ops) {
-    shippedOpsCounter.incr(ops);
-  }
-
-  @Override public void incrShippedBytes(long size) {
-    shippedBytesCounter.incr(size);
-  }
-
-  @Override public void incrLogReadInBytes(long size) {
-    logReadInBytesCounter.incr(size);
-  }
-
-  @Override public void clear() {
-  }
-
-  @Override
-  public long getLastShippedAge() {
-    return ageOfLastShippedOpHist.getMax();
-  }
-
-  @Override public void incrHFilesShipped(long hfiles) {
-    shippedHFilesCounter.incr(hfiles);
-  }
-
-  @Override
-  public void incrSizeOfHFileRefsQueue(long size) {
-    sizeOfHFileRefsQueueGauge.incr(size);
-  }
-
-  @Override
-  public void decrSizeOfHFileRefsQueue(long size) {
-    sizeOfHFileRefsQueueGauge.decr(size);
-  }
-
-  @Override
-  public int getSizeOfLogQueue() {
-    return (int)sizeOfLogQueueGauge.value();
-  }
-
-  @Override
-  public void incrUnknownFileLengthForClosedWAL() {
-    unknownFileLengthForClosedWAL.incr(1L);
-  }
-  @Override
-  public void incrUncleanlyClosedWALs() {
-    uncleanlyClosedWAL.incr(1L);
-  }
-  @Override
-  public void incrBytesSkippedInUncleanlyClosedWALs(final long bytes) {
-    uncleanlyClosedSkippedBytes.incr(bytes);
-  }
-  @Override
-  public void incrRestartedWALReading() {
-    restartWALReading.incr(1L);
-  }
-  @Override
-  public void incrRepeatedFileBytes(final long bytes) {
-    repeatedFileBytes.incr(bytes);
-  }
-  @Override
-  public void incrCompletedWAL() {
-    completedWAL.incr(1L);
-  }
-  @Override
-  public void incrCompletedRecoveryQueue() {
-    completedRecoveryQueue.incr(1L);
-  }
-  @Override
-  public void incrFailedRecoveryQueue() {
-    failedRecoveryQueue.incr(1L);
-  }
-  @Override
-  public void init() {
-    rms.init();
-  }
-
-  @Override
-  public void setGauge(String gaugeName, long value) {
-    rms.setGauge(KEY_PREFIX + gaugeName, value);
-  }
-
-  @Override
-  public void incGauge(String gaugeName, long delta) {
-    rms.incGauge(KEY_PREFIX + gaugeName, delta);
-  }
-
-  @Override
-  public void decGauge(String gaugeName, long delta) {
-    rms.decGauge(KEY_PREFIX + gaugeName, delta);
-  }
-
-  @Override
-  public void removeMetric(String key) {
-    rms.removeMetric(KEY_PREFIX + key);
-  }
-
-  @Override
-  public void incCounters(String counterName, long delta) {
-    rms.incCounters(KEY_PREFIX + counterName, delta);
-  }
-
-  @Override
-  public void updateHistogram(String name, long value) {
-    rms.updateHistogram(KEY_PREFIX + name, value);
-  }
-
-  @Override
-  public String getMetricsContext() {
-    return rms.getMetricsContext();
-  }
-
-  @Override
-  public String getMetricsDescription() {
-    return rms.getMetricsDescription();
-  }
-
-  @Override
-  public String getMetricsJmxContext() {
-    return rms.getMetricsJmxContext();
-  }
-
-  @Override
-  public String getMetricsName() {
-    return rms.getMetricsName();
-  }
-
-  @Override
-  public long getWALEditsRead() {
-    return this.logReadInEditsCounter.value();
-  }
-
-  @Override
-  public long getShippedOps() {
-    return this.shippedOpsCounter.value();
-  }
-
-  @Override
-  public long getEditsFiltered() {
-    return this.walEditsFilteredCounter.value();
-  }
+public interface MetricsReplicationGlobalSourceSource extends MetricsReplicationSourceSource {
+
+  public static final String SOURCE_WAL_READER_EDITS_BUFFER = "source.walReaderEditsBufferUsage";
+
+  /**
+   * Sets the total usage of memory used by edits in memory read from WALs.
+   * @param usage The memory used by edits in bytes
+   */
+  void setWALReaderEditsBufferBytes(long usage);

Review comment:
       Would it be worth explain this is actually the sum of edits read by all sources? Same single edit might had been read more than once, say, when having multiple peers set. 

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java
##########
@@ -276,6 +274,8 @@ public Path getCurrentPath() {
   private boolean checkQuota() {
     // try not to go over total quota
     if (totalBufferUsed.get() > totalBufferQuota) {
+      LOG.warn("Can't read more edits from WAL as buffer usage {}B exceeds limit {}B",

Review comment:
       Nit: Worth log for which peer the given source reader is sleeping?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668348477


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 42s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 18s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 30s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 20s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 14s |  hbase-hadoop-compat: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)  |
   | -0 :warning: |  checkstyle  |   1m  3s |  hbase-server: The patch generated 5 new + 7 unchanged - 0 fixed = 12 total (was 7)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 11s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 45s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 26s |  The patch does not generate ASF License warnings.  |
   |  |   |  35m  7s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux c7b262607d4c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 148c185486 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-hadoop-compat.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668759278


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  4s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m  6s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 48s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 20s |  hbase-hadoop-compat in master failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 58s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 23s |  the patch passed  |
   | -1 :x: |  shadedjars  |   2m 32s |  patch has 10 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-hadoop-compat in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   0m 34s |  hbase-hadoop-compat in the patch passed.  |
   | -1 :x: |  unit  | 136m 34s |  hbase-server in the patch failed.  |
   |  |   | 161m 43s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2193 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux ad535a321ba1 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / d2f5a5f27b |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | shadedjars | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-hadoop-compat.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/testReport/ |
   | Max. process+thread count | 4144 (vs. ulimit of 12500) |
   | modules | C: hbase-hadoop-compat hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2193/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-668872729


   > TestWALEntryStream timed out for me too. Will dig in.
   
   Ah, a mocking issue. Fixing.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] busbey commented on a change in pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
busbey commented on a change in pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#discussion_r464770288



##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSourceImpl.java
##########
@@ -314,4 +314,15 @@ public String getMetricsName() {
   @Override public long getEditsFiltered() {
     return this.walEditsFilteredCounter.value();
   }
+
+  @Override
+  public void setWALReaderEditsBufferBytes(long usage) {
+    //noop. Global limit, tracked globally. Do not need per-source metrics

Review comment:
       nit: wouldn't it still be useful to know if particular sources were eating up the global limit? or do we not have enough information tracked already to do that?

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationEndpoint.java
##########
@@ -497,6 +497,33 @@ public boolean canReplicateToSameCluster() {
     }
   }
 
+  public static class SleepingReplicationEndpointForTest extends ReplicationEndpointForTest {

Review comment:
       AFAICT this only gets used in manual testing? add a comment on doing that. something like your example for this PR.

##########
File path: hbase-server/src/test/java/org/apache/hadoop/hbase/replication/TestReplicationEndpoint.java
##########
@@ -497,6 +497,33 @@ public boolean canReplicateToSameCluster() {
     }
   }
 
+  public static class SleepingReplicationEndpointForTest extends ReplicationEndpointForTest {
+    private long duration;
+    public SleepingReplicationEndpointForTest() {
+      super();
+    }
+
+    @Override
+    public void init(Context context) throws IOException {
+      super.init(context);
+      if (this.ctx != null) {
+        duration = this.ctx.getConfiguration().getLong(
+            "test.sleep.replication.endpoint.duration.millis", 5000L);

Review comment:
       nit: `hbase.test.sleep.replication.endpoint.duration.millis`

##########
File path: hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/MetricsReplicationSourceSource.java
##########
@@ -76,4 +77,13 @@
   long getWALEditsRead();
   long getShippedOps();
   long getEditsFiltered();
+  /**
+   * Sets the total usage of memory used by edits in memory read from WALs.
+   * @param usage The memory used by edits in bytes
+   */
+  void setWALReaderEditsBufferBytes(long usage);

Review comment:
       if we only want the global metric, why add this here instead of just on `MetricsReplicationGlobalSourceSource`?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
##########
@@ -244,17 +248,22 @@ void addHFileRefsToQueue(TableName tableName, byte[] family, List<Pair<Path, Pat
 
     private final ReplicationSink replicationSink;
     private final ReplicationSourceManager replicationManager;
+    private final MetricsReplicationSourceSource globalMetricsSource;
 
     public ReplicationStatisticsTask(ReplicationSink replicationSink,
-        ReplicationSourceManager replicationManager) {
+        ReplicationSourceManager replicationManager, MetricsReplicationSourceSource globalMetricsSource) {
       this.replicationManager = replicationManager;
       this.replicationSink = replicationSink;
+      this.globalMetricsSource = globalMetricsSource;
     }
 
     @Override
     public void run() {
       printStats(this.replicationManager.getStats());
       printStats(this.replicationSink.getStats());
+
+      // Report how much data we've read off disk which is pending replication, across all sources
+      globalMetricsSource.setWALReaderEditsBufferBytes(replicationManager.getTotalBufferUsed().get());

Review comment:
       this feels odd. Why is it we only do updates for this one global metric here? why is it different then the other things tracked in the global metrics?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] joshelser commented on pull request #2193: HBASE-24779 Report on the WAL edit buffer usage/limit for replication

Posted by GitBox <gi...@apache.org>.
joshelser commented on pull request #2193:
URL: https://github.com/apache/hbase/pull/2193#issuecomment-670194319


   I got 99 problems and all of them are due to mocking. The second UT failure was also due to a mock returning a value of 0 instead of the `256*1000*1000` value which was previously getting pulled from the configuration. TestWALEntryStream is passing locally. Let's see what QA says now.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org