You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/07/23 13:13:41 UTC

[GitHub] [hbase] virajjasani opened a new pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

virajjasani opened a new pull request #2127:
URL: https://github.com/apache/hbase/pull/2127


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] saintstack commented on a change in pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#discussion_r459621260



##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
##########
@@ -1583,6 +1583,16 @@
     "hbase.regionserver.slowlog.systable.enabled";
   public static final boolean DEFAULT_SLOW_LOG_SYS_TABLE_ENABLED_KEY = false;
 
+  /**
+   * Number of rows in a batch operation above which a warning will be logged.
+   */
+  public static final String BATCH_ROWS_THRESHOLD_NAME = "hbase.rpc.rows.warning.threshold";
+
+  /**
+   * Default value of {@link #BATCH_ROWS_THRESHOLD_NAME}
+   */
+  public static final int BATCH_ROWS_THRESHOLD_DEFAULT = 5000;
+

Review comment:
       Ok. Two very disparate contexts. This is probably only case for HConstants.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -93,15 +94,21 @@
   private SourceFSConfigurationProvider provider;
   private WALEntrySinkFilter walEntrySinkFilter;
 
+  /**
+   * Row size threshold for multi requests above which a warning is logged
+   */
+  private final int rowSizeWarnThreshold;
+
   /**
    * Create a sink for replication
    * @param conf conf object
-   * @param stopper boolean to tell this thread to stop
    * @throws IOException thrown when HDFS goes bad or bad file name
    */
-  public ReplicationSink(Configuration conf, Stoppable stopper)
+  public ReplicationSink(Configuration conf)

Review comment:
       Ok




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#discussion_r459613685



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -403,13 +398,24 @@ public void stopReplicationSinkServices() {
    * Do the changes and handle the pool
    * @param tableName table to insert into
    * @param allRows list of actions
+   * @param batchRowSizeThreshold rowSize threshold for batch mutation
    */
-  private void batch(TableName tableName, Collection<List<Row>> allRows) throws IOException {
+  private void batch(TableName tableName, Collection<List<Row>> allRows, int batchRowSizeThreshold)
+      throws IOException {
     if (allRows.isEmpty()) {
       return;
     }
     AsyncTable<?> table = getConnection().getTable(tableName);
-    List<Future<?>> futures = allRows.stream().map(table::batchAll).collect(Collectors.toList());
+    List<Future<?>> futures = new ArrayList<>();
+    for (List<Row> rows : allRows) {
+      List<List<Row>> batchRows;
+      if (rows.size() > batchRowSizeThreshold) {
+        batchRows = Lists.partition(rows, batchRowSizeThreshold);
+      } else {
+        batchRows = Collections.singletonList(rows);
+      }
+      futures.addAll(batchRows.stream().map(table::batchAll).collect(Collectors.toList()));
+    }

Review comment:
       UT `testLargeEditsPutDelete()` as part of this PR will always ensure that this is handled.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663023627


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 37s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 42s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 34s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 42s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 30s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  4s |  hbase-server: The patch generated 1 new + 80 unchanged - 0 fixed = 81 total (was 80)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 38s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 49s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 26s |  The patch does not generate ASF License warnings.  |
   |  |   |  39m 22s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 698adefbb8b0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#discussion_r459605416



##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
##########
@@ -1583,6 +1583,16 @@
     "hbase.regionserver.slowlog.systable.enabled";
   public static final boolean DEFAULT_SLOW_LOG_SYS_TABLE_ENABLED_KEY = false;
 
+  /**
+   * Number of rows in a batch operation above which a warning will be logged.
+   */
+  public static final String BATCH_ROWS_THRESHOLD_NAME = "hbase.rpc.rows.warning.threshold";
+
+  /**
+   * Default value of {@link #BATCH_ROWS_THRESHOLD_NAME}
+   */
+  public static final int BATCH_ROWS_THRESHOLD_DEFAULT = 5000;
+

Review comment:
       We want `RsRpcServices` as well as `ReplicationSink` both to use these constants. Would you still recommend having duplicate copies?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -403,13 +398,24 @@ public void stopReplicationSinkServices() {
    * Do the changes and handle the pool
    * @param tableName table to insert into
    * @param allRows list of actions
+   * @param batchRowSizeThreshold rowSize threshold for batch mutation
    */
-  private void batch(TableName tableName, Collection<List<Row>> allRows) throws IOException {
+  private void batch(TableName tableName, Collection<List<Row>> allRows, int batchRowSizeThreshold)
+      throws IOException {
     if (allRows.isEmpty()) {
       return;
     }
     AsyncTable<?> table = getConnection().getTable(tableName);
-    List<Future<?>> futures = allRows.stream().map(table::batchAll).collect(Collectors.toList());
+    List<Future<?>> futures = new ArrayList<>();
+    for (List<Row> rows : allRows) {
+      List<List<Row>> batchRows;
+      if (rows.size() > batchRowSizeThreshold) {
+        batchRows = Lists.partition(rows, batchRowSizeThreshold);
+      } else {
+        batchRows = Collections.singletonList(rows);
+      }
+      futures.addAll(batchRows.stream().map(table::batchAll).collect(Collectors.toList()));
+    }

Review comment:
       That's true, they are handled in next iteration.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -93,15 +94,21 @@
   private SourceFSConfigurationProvider provider;
   private WALEntrySinkFilter walEntrySinkFilter;
 
+  /**
+   * Row size threshold for multi requests above which a warning is logged
+   */
+  private final int rowSizeWarnThreshold;
+
   /**
    * Create a sink for replication
    * @param conf conf object
-   * @param stopper boolean to tell this thread to stop
    * @throws IOException thrown when HDFS goes bad or bad file name
    */
-  public ReplicationSink(Configuration conf, Stoppable stopper)
+  public ReplicationSink(Configuration conf)

Review comment:
       Agree for stoppable usage, but here it is anyways unused :(




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani commented on a change in pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
virajjasani commented on a change in pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#discussion_r459606308



##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -403,13 +398,24 @@ public void stopReplicationSinkServices() {
    * Do the changes and handle the pool
    * @param tableName table to insert into
    * @param allRows list of actions
+   * @param batchRowSizeThreshold rowSize threshold for batch mutation
    */
-  private void batch(TableName tableName, Collection<List<Row>> allRows) throws IOException {
+  private void batch(TableName tableName, Collection<List<Row>> allRows, int batchRowSizeThreshold)
+      throws IOException {
     if (allRows.isEmpty()) {
       return;
     }
     AsyncTable<?> table = getConnection().getTable(tableName);
-    List<Future<?>> futures = allRows.stream().map(table::batchAll).collect(Collectors.toList());
+    List<Future<?>> futures = new ArrayList<>();
+    for (List<Row> rows : allRows) {
+      List<List<Row>> batchRows;
+      if (rows.size() > batchRowSizeThreshold) {
+        batchRows = Lists.partition(rows, batchRowSizeThreshold);
+      } else {
+        batchRows = Collections.singletonList(rows);
+      }
+      futures.addAll(batchRows.stream().map(table::batchAll).collect(Collectors.toList()));
+    }

Review comment:
       That's true, they are handled in next batch.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663095071


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 37s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 43s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 42s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 31s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 16s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 16s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 32s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 16s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 141m 14s |  hbase-server in the patch passed.  |
   |  |   | 169m 17s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 53a4726e0545 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/testReport/ |
   | Max. process+thread count | 5124 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663243730


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 38s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 44s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 20s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m  6s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 55s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 50s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 18s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 18s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m  7s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 39s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 210m 44s |  hbase-server in the patch passed.  |
   |  |   | 240m 36s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.9 Server=19.03.9 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux de83455c9947 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/testReport/ |
   | Max. process+thread count | 2984 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663210758


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 41s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  3s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 21s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 32s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   5m 49s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 20s |  hbase-common in master failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  2s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 31s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 31s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 47s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 18s |  hbase-common in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   1m 33s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 132m 38s |  hbase-server in the patch passed.  |
   |  |   | 162m 42s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 6900b87b01a5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/testReport/ |
   | Max. process+thread count | 4153 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] saintstack commented on a change in pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
saintstack commented on a change in pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#discussion_r459587451



##########
File path: hbase-common/src/main/java/org/apache/hadoop/hbase/HConstants.java
##########
@@ -1583,6 +1583,16 @@
     "hbase.regionserver.slowlog.systable.enabled";
   public static final boolean DEFAULT_SLOW_LOG_SYS_TABLE_ENABLED_KEY = false;
 
+  /**
+   * Number of rows in a batch operation above which a warning will be logged.
+   */
+  public static final String BATCH_ROWS_THRESHOLD_NAME = "hbase.rpc.rows.warning.threshold";
+
+  /**
+   * Default value of {@link #BATCH_ROWS_THRESHOLD_NAME}
+   */
+  public static final int BATCH_ROWS_THRESHOLD_DEFAULT = 5000;
+

Review comment:
       Why the move? HConstants is often considered an anti-pattern; better to have constants beside where they are used?

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -93,15 +94,21 @@
   private SourceFSConfigurationProvider provider;
   private WALEntrySinkFilter walEntrySinkFilter;
 
+  /**
+   * Row size threshold for multi requests above which a warning is logged
+   */
+  private final int rowSizeWarnThreshold;
+
   /**
    * Create a sink for replication
    * @param conf conf object
-   * @param stopper boolean to tell this thread to stop
    * @throws IOException thrown when HDFS goes bad or bad file name
    */
-  public ReplicationSink(Configuration conf, Stoppable stopper)
+  public ReplicationSink(Configuration conf)

Review comment:
       Is the Stoppable unused? Usually its a chain for a Service to pull on when it meets a condition it can't deal with... one that is so bad it wants to stop the process.

##########
File path: hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSink.java
##########
@@ -403,13 +398,24 @@ public void stopReplicationSinkServices() {
    * Do the changes and handle the pool
    * @param tableName table to insert into
    * @param allRows list of actions
+   * @param batchRowSizeThreshold rowSize threshold for batch mutation
    */
-  private void batch(TableName tableName, Collection<List<Row>> allRows) throws IOException {
+  private void batch(TableName tableName, Collection<List<Row>> allRows, int batchRowSizeThreshold)
+      throws IOException {
     if (allRows.isEmpty()) {
       return;
     }
     AsyncTable<?> table = getConnection().getTable(tableName);
-    List<Future<?>> futures = allRows.stream().map(table::batchAll).collect(Collectors.toList());
+    List<Future<?>> futures = new ArrayList<>();
+    for (List<Row> rows : allRows) {
+      List<List<Row>> batchRows;
+      if (rows.size() > batchRowSizeThreshold) {
+        batchRows = Lists.partition(rows, batchRowSizeThreshold);
+      } else {
+        batchRows = Collections.singletonList(rows);
+      }
+      futures.addAll(batchRows.stream().map(table::batchAll).collect(Collectors.toList()));
+    }

Review comment:
       For the edits that are beyond the limit, they get handled in next batch?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row size in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663135784


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |  11m 17s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  2s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m  1s |  master passed  |
   | +1 :green_heart: |  compile  |   1m 46s |  master passed  |
   | +1 :green_heart: |  shadedjars  |   6m 45s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 19s |  hbase-common in master failed.  |
   | -0 :warning: |  javadoc  |   0m 43s |  hbase-server in master failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 46s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 42s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 31s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 17s |  hbase-common in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 43s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m  3s |  hbase-common in the patch passed.  |
   | +1 :green_heart: |  unit  | 206m 23s |  hbase-server in the patch passed.  |
   |  |   | 250m 55s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.9 Server=19.03.9 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux bd72d7002f3c 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | Default Java | 2020-01-14 |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt |
   | javadoc | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/testReport/ |
   | Max. process+thread count | 3531 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] virajjasani closed pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
virajjasani closed pull request #2127:
URL: https://github.com/apache/hbase/pull/2127


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hbase] Apache-HBase commented on pull request #2127: HBASE-24757 : ReplicationSink should limit row count in batch mutation based on hbase.rpc.rows.warning.threshold

Posted by GitBox <gi...@apache.org>.
Apache-HBase commented on pull request #2127:
URL: https://github.com/apache/hbase/pull/2127#issuecomment-663154210


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 30s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ master Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 39s |  master passed  |
   | +1 :green_heart: |  checkstyle  |   1m 30s |  master passed  |
   | +1 :green_heart: |  spotbugs  |   2m 40s |  master passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 20s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  5s |  hbase-server: The patch generated 1 new + 80 unchanged - 0 fixed = 81 total (was 80)  |
   | +1 :green_heart: |  whitespace  |   0m  1s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 15s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   2m 59s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 27s |  The patch does not generate ASF License warnings.  |
   |  |   |  35m 42s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2127 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 516371c17c49 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | master / 8191fbdd7d |
   | checkstyle | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-common hbase-server U: . |
   | Console output | https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-2127/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org