You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@hbase.apache.org by GitBox <gi...@apache.org> on 2020/08/27 15:15:44 UTC

[GitHub] [hbase] shahrs87 opened a new pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

shahrs87 opened a new pull request #2322:
URL: https://github.com/apache/hbase/pull/2322


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682060240


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 39s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 25s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   2m  1s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 50s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 47s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 40s |  hbase-client: The patch generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9)  |
   | -0 :warning: |  checkstyle  |   1m 23s |  hbase-server: The patch generated 2 new + 6 unchanged - 0 fixed = 8 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  13m 15s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 41s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 23s |  The patch does not generate ASF License warnings.  |
   |  |   |  43m 48s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux a373ba141f7c 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 84 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r483976537



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Given that we have 4 +1s on using hbase.client.scanner.timeout.period, can we commit this patch ?  Also I have created https://issues.apache.org/jira/browse/HBASE-24983 for wrapping scan operation under operation timeout. 
   @apurtell  @saintstack  @bharathv  @virajjasani  @infraio  what do you guys think ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683209562


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 28s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 38s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 14s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  9s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  6s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 44s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 13s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  4s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 45s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 190m 24s |  hbase-server in the patch passed.  |
   |  |   | 224m 47s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 45eacaf79348 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/testReport/ |
   | Max. process+thread count | 3157 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682130416


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 38s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  5s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 32s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 42s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 57s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 32s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  9s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 44s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 44s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 55s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 30s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 42s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 31s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 135m 24s |  hbase-server in the patch passed.  |
   |  |   | 170m  4s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux f120bf79daa2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/testReport/ |
   | Max. process+thread count | 3501 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487274061



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486099032



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Seems not push the latest commit?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486879689



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] bharathv commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

bharathv commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478687341



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       @saintstack Rushabh and I had a quick offline chat about this PR. We were wondering what is the right timeout to use for this lock. In the client code path there are a bunch of time out configurations depending on the path we take and sometimes layered on top of each other.
   
   Specifically I was wondering if hbase.client.operation.timeout would be the right one to use for this. I understand that we are using the scanner timeout here because the call wraps a scanner with the same timeout. From a client standpoint though, scanner is just an implementation detail of locateRegion (root caller in this case) and that root caller should be wrapped with a general operation timeout rather than a timeout that is specific to the scanner.
   
   Not a big deal but I was just curious and would like to know your thoughts. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-690058861


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 29s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 30s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 51s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 46s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 50s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 33s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 33s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 45s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 43s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 208m 50s |  hbase-server in the patch passed.  |
   |  |   | 241m  6s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 7a9134442cd3 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 6a05eaf7d5 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/testReport/ |
   | Max. process+thread count | 2949 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-690019286


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 34s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 18s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 17s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 36s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 44s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 30s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 53s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 34s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 34s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 44s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 28s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 39s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 127m 18s |  hbase-server in the patch passed.  |
   |  |   | 160m 37s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 90704c29f96f 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 6a05eaf7d5 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/testReport/ |
   | Max. process+thread count | 3847 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-693092325


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 38s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 59s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 36s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 38s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 29s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 53s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 37s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 37s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 39s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 27s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 42s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 23s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 128m 31s |  hbase-server in the patch passed.  |
   |  |   | 161m 36s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux ebd1d9343ef6 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ddddd2a822 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/testReport/ |
   | Max. process+thread count | 4233 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r480197093



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > I am +1 to use operation timeout here.
   
   @infraio  the default value for operation timeout is 20 minutes. Do you want clients to wait for 20 minutes to acquire this lock ? 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r485216698



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > If callable.prepare throw the LockTimeoutException, 
   
   IIUC the code, callable.prepare will never throw LockTimeoutException. We throw LockTimeoutException [here](https://github.com/apache/hbase/blob/f3210c8908809fa154118454e6a23f57bc222db1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L866)  At this place we haven't created ReversedClientScanner object also. There is one retry loop [here](https://github.com/apache/hbase/blob/f3210c8908809fa154118454e6a23f57bc222db1/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L848) which will retry if exception is not TNFE or retries are not exhausted.
   @infraio  please correct me if I misunderstood anything.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] apurtell commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

apurtell commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479441606



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       This is really difficult to operate already, just look at us discussing it now. Scanner timeout? Operation timeout? Both! Neither! Make another! Let's not introduce another config option. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487349217



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r483859942



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Hi @infraio, 
   I am not able to understand your last comment. Could you please elaborate ? Thank you !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479713547



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionConfiguration.java
##########
@@ -117,6 +119,11 @@
 
     this.writeRpcTimeout = conf.getInt(HConstants.HBASE_RPC_WRITE_TIMEOUT_KEY,
         conf.getInt(HConstants.HBASE_RPC_TIMEOUT_KEY, HConstants.DEFAULT_HBASE_RPC_TIMEOUT));
+
+    this.scannerTimeoutPeriod = HBaseConfiguration.getInt(conf,

Review comment:
       Looks like HConstants.HBASE_REGIONSERVER_LEASE_PERIOD_KEY config property is deprecated after 0.96 release. So remove the deprecated config property altogether.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683383309


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 11s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 43s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 41s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 24s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 30s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 18s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 22s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 40s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 23s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 28s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 43s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 189m 41s |  hbase-server in the patch passed.  |
   |  |   | 225m 51s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux d0ceb2b023a6 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 54454f8de6 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/testReport/ |
   | Max. process+thread count | 2486 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486877715



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] saintstack commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

saintstack commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479792558



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       `My concern was from a user configuration POV, there 3 or 4 different timeouts a user has to configure (depending on which API they are using) to get a proper timeout behavior. Ideally it'd be nice if there is a single timeout at the root level that is an e-e timeout and probably other timeouts like scanner timeout etc for finer control depending on the need.`
   
   This is a critical concern. Our @petersomogyi was the last to throw himself against this rampart. The work is not yet finished. Yes, to a new issue.
   
   Here will wait on @infraio 's input.....




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r489018057



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       @infraio  could you please review again ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682251265


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   2m 37s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 20s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 12s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 38s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 36s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 31s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 55s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 36s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 36s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 38s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 27s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 40s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 22s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 129m 43s |  hbase-server in the patch passed.  |
   |  |   | 164m 37s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 66058022157b 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/testReport/ |
   | Max. process+thread count | 4077 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683383258


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   2m 51s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 23s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 30s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 25s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 33s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 23s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   6m 23s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 30s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 192m 40s |  hbase-server in the patch passed.  |
   |  |   | 225m 10s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 39a714920d58 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 54454f8de6 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/testReport/ |
   | Max. process+thread count | 3197 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487274061



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r482175115



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Let me explain my POV with one of our client example.
    We have an internal customer who wants a strict 15 seconds SLA for every operation. Since operation timeout is end to end timeout which includes all the retries, sleep within retries so we suggested them to use operation timeout as 15 seconds.
   Also we recommended them to set scanner timeout period to 7 seconds with  retries config (hbase.client.retries.number)  set to 2 .  There is some sleep interval between each attempt and we expect the call to complete within 15 seconds and if it doesn't then operation timeout will kick in and fail the call.
   But we found out that getRegionLocations call is not bounded by operation timeout.
   Now if we set the lock timeout to same as operation timeout, then in worst case scenario call will fail in 15 (lock timeout) + 15 (lock timeout 2nd try) + 1 (assuming sleep of 1 second) = 31 seconds
   If we set the lock timeout to same as scanner timeout, then in worst case scenario the call will fail in 7 (scanner timeout) + 7 (scanner timeout) + 1 (sleep between tries) = 16 seconds which is closer to SLA that we promised.
   Hope this makes sense. If still the community wants to go forward with operation timeout, I will change it to operation timeout.
   @infraio  @bharathv  @virajjasani  @saintstack  @apurtell 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682026914


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 44s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 19s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 57s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 39s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 17s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   1m  3s |  root in the patch failed.  |
   | -0 :warning: |  checkstyle  |   0m 30s |  hbase-client: The patch generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9)  |
   | -0 :warning: |  checkstyle  |   1m  6s |  hbase-server: The patch generated 3 new + 6 unchanged - 0 fixed = 9 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | -1 :x: |  hadoopcheck  |   1m 27s |  The patch causes 11 errors with Hadoop v3.1.2.  |
   | -1 :x: |  hadoopcheck  |   3m  2s |  The patch causes 11 errors with Hadoop v3.2.1.  |
   | -1 :x: |  spotbugs  |   0m 18s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 20s |  The patch does not generate ASF License warnings.  |
   |  |   |  19m 47s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 0a5ae5558b2e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | mvninstall | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/patch-mvninstall-root.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | hadoopcheck | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/patch-javac-3.1.2.txt |
   | hadoopcheck | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/patch-javac-3.2.1.txt |
   | spotbugs | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-general-check/output/patch-spotbugs-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] bharathv commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

bharathv commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479581668



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public

Review comment:
       need to be public?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > operation timeout is for 'whole operation end-to-end'
   
   In that case IMO the right fix is to extend the operation timeout to all public methods of HTable (may or may not be a part of this change). Looking at the code, we do it for some and we skip it for others. One of those methods is locateRegion() which is the root caller in this case. 
   
   My concern was from a user configuration POV, there 3 or 4 different timeouts a user has to configure (depending on which API they are using) to get a proper timeout behavior. Ideally it'd be nice if there is a single timeout at the root level that is an e-e timeout and probably other timeouts like scanner timeout etc for finer control depending on the need.
   
   Don't mean to block this PR, we can probably implement that as a separate change.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionConfiguration.java
##########
@@ -117,6 +119,11 @@
 
     this.writeRpcTimeout = conf.getInt(HConstants.HBASE_RPC_WRITE_TIMEOUT_KEY,
         conf.getInt(HConstants.HBASE_RPC_TIMEOUT_KEY, HConstants.DEFAULT_HBASE_RPC_TIMEOUT));
+
+    this.scannerTimeoutPeriod = HBaseConfiguration.getInt(conf,

Review comment:
       static getInt() is deprecated, switch to conf.getInt()?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r480240959



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public

Review comment:
       Maybe when I create a new PR for branch-1, I can re-use the existing exception class. Would that work ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478762775



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > We were wondering what is the right timeout to use for this lock.
   
   +1. The scanner timeout is not a good choice here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479865631



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       I am +1 to use operation timeout here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682214238


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 39s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 19s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 21s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   2m  7s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   4m  8s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 15s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m  2s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m 26s |  hbase-server: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  15m 41s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 54s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 23s |  The patch does not generate ASF License warnings.  |
   |  |   |  46m 44s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 94178e99e7fe 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] virajjasani commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

virajjasani commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r482199372



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       I just had an offline chat with @shahrs87 and I am +1 with keeping scanner timeout as config for Lock, the intention is to:
   1. not introduce more timeout configs and worsen timeout burden.
   2. use the one which can help with multiple retries (if allowed within SLA) rather than timeout which is available at entire operation level (rpc call + lock + locate region by scan + retry if required + and so on...).
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683371480


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 31s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 25s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 10s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 36s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   2m 57s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m  6s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 31s |  hbase-client: The patch generated 1 new + 9 unchanged - 0 fixed = 10 total (was 9)  |
   | -0 :warning: |  checkstyle  |   1m  5s |  hbase-server: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 23s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 21s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 25s |  The patch does not generate ASF License warnings.  |
   |  |   |  36m 16s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux cc6307366592 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 54454f8de6 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/5/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r484809224



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > then in worst case scenario call will fail in 15 (lock timeout) + 15 (lock timeout 2nd try) + 1 (assuming sleep of 1 second) = 31 seconds
   
   Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156. See https://github.com/apache/hbase/blob/branch-2/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RpcRetryingCallerImpl.java#L156




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479540958



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -863,13 +863,15 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
       }
       // Query the meta region
       long pauseBase = this.pause;
-      userRegionLock.lock();
+      takeUserRegionLock();
       try {
-        if (useCache) {// re-check cache after get lock
-          RegionLocations locations = getCachedLocation(tableName, row);
-          if (locations != null && locations.getRegionLocation(replicaId) != null) {
-            return locations;
-          }
+        // We don't need to check if useCache is enabled or not. Even if useCache is false

Review comment:
       On my local system I created patch for this on top of 1.3 branch but while porting the patch to branch-2, I missed applying this hunk. 
   @saintstack  @apurtell  Since you guys already +1 the previous diff, wanted to bring this change to your attention. Sorry for confusion.
   Cc @bharathv  @infraio 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682249409


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 27s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 19s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 38s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 27s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   5m 56s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  3s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 17s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 24s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   5m 57s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 18s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 130m 27s |  hbase-server in the patch passed.  |
   |  |   | 160m 47s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 8ff038eb464f 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/testReport/ |
   | Max. process+thread count | 3824 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/3/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r483046581



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Got your case. I thought the problem is if lock timeout, the 2nd retry need to consider this case and should not retry.
   
   Let me give a example about the rpc timeout (configed as 4 secs) and operation timeout(configed as 10 secs). The rpc timeout maybe reset when the remaining time is less than rpc timeout.
   1. rpc call with 4 secs timeout.
   2. retry call with 4 secs timeout.
   3. retry call with 2 secs timeout which equals 10 - 4 - 4.
   
   For this case, we need same guarantee. If the remaining time is zero, it should not to retry. I thought this is the right way to fix this problem. Thanks.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486877715



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r481152983



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Operation timout is not the best choice but better than scanner timeout. Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478701500



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import java.io.IOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public
+public class LockTimeoutException extends IOException {

Review comment:
       Fixed in latest commit. Please review again




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682027720


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 28s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 47s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 23s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m 24s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 59s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   1m  3s |  root in the patch failed.  |
   | -1 :x: |  compile  |   0m 18s |  hbase-server in the patch failed.  |
   | -0 :warning: |  javac  |   0m 18s |  hbase-server in the patch failed.  |
   | -1 :x: |  shadedjars  |   3m  6s |  patch has 11 errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   0m 57s |  the patch passed  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   0m 27s |  hbase-client in the patch failed.  |
   | -1 :x: |  unit  |   0m 18s |  hbase-server in the patch failed.  |
   |  |   |  21m 24s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 6f9472b7cb67 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 1.8.0_232 |
   | mvninstall | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-mvninstall-root.txt |
   | compile | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-compile-hbase-server.txt |
   | javac | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-compile-hbase-server.txt |
   | shadedjars | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-shadedjars.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-client.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk8-hadoop2-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/testReport/ |
   | Max. process+thread count | 78 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] saintstack commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

saintstack commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479460586



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       Good concern @bharathv .
   
   operation timeout is for 'whole operation end-to-end' per @shahrs87 . Here we are doing a sub-task so operation timeout doesn't seem right. Scanner timeout seems good; when it expires the scan will throw and we'll do the finally block anyways?
   
   What would you suggest @infraio ?
   
   I agree w/ @apurtell that last thing we need is new timeout ; client timeout is fraught as is.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r479713606



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public

Review comment:
       Actually I don't know. Since this will thrown back all the way to client so thought to make it public. But open for suggestions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] saintstack commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

saintstack commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-689982232


   There is one outstanding discussion above IIUC... need @infraio buy-off on this patch. HBASE-24983 is for concerns raised in here as a follow-on.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r480239457



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public

Review comment:
       > I just realized there is an exact same class LockTimeoutException under org.apache.hadoop.hbase.exceptions, switch to that?
   
   This class only exists in branch-1 and not in master/branch-2. :(




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478767046



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       @bharathv  @infraio  Thank you for your feedback. 
   
   > Specifically I was wondering if hbase.client.operation.timeout would be the right one to use for this. 
   
   I don't feel hbase.client.operation.timeout is the right choice here too. This config is meant for the whole end to end operation timeout which includes all layers of retries and the default value is 20 mins. If we use this timeout then we are not gaining anything.
   
   We can introduce a new config property (something like hbase.client.lock.timeout.period) and default it to something like 10 seconds. That way we don't depend on existing scanner/operation timeout periods. Let me know what you guys think. Thank you !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682033261


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   2m  1s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 18s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   5m 27s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 57s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   8m 28s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 36s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 48s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 21s |  Maven dependency ordering for patch  |
   | -1 :x: |  mvninstall  |   1m 52s |  root in the patch failed.  |
   | -1 :x: |  compile  |   0m 25s |  hbase-server in the patch failed.  |
   | -0 :warning: |  javac  |   0m 25s |  hbase-server in the patch failed.  |
   | -1 :x: |  shadedjars  |   4m 12s |  patch has 11 errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 32s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 50s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | -1 :x: |  unit  |   0m 38s |  hbase-client in the patch failed.  |
   | -1 :x: |  unit  |   0m 22s |  hbase-server in the patch failed.  |
   |  |   |  31m  2s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 29cdd4b7e3da 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | mvninstall | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-mvninstall-root.txt |
   | compile | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-compile-hbase-server.txt |
   | javac | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-compile-hbase-server.txt |
   | shadedjars | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-shadedjars.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-client.txt |
   | unit | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/testReport/ |
   | Max. process+thread count | 88 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/1/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487349217



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r485272081



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       bq. There is one retry loop here which will retry if exception is not TNFE or retries are not exhausted.
   Please check the code. The LockTimeoutException will be throwed out and terminate the loop. It is not in the try...catch code block.
   bq. IIUC the code, callable.prepare will never throw LockTimeoutException.
   callable.prepare will call locateRegionInMeta inside. So  locateRegionInMeta throw LockTimeoutException, then it will throw this exception out. 
   
   Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-689972315


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 40s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 19s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 49s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 42s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m 12s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 25s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   0m 33s |  hbase-client: The patch generated 2 new + 9 unchanged - 0 fixed = 11 total (was 9)  |
   | -0 :warning: |  checkstyle  |   1m  7s |  hbase-server: The patch generated 2 new + 6 unchanged - 0 fixed = 8 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  12m 10s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 30s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 24s |  The patch does not generate ASF License warnings.  |
   |  |   |  38m 55s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 918d4ac06fc0 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 6a05eaf7d5 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/6/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683164673


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 40s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 28s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 17s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 37s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   3m  0s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 13s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 14s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m  4s |  hbase-server: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  11m 16s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 19s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 26s |  The patch does not generate ASF License warnings.  |
   |  |   |  37m 24s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 97340b31eb0c 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 94 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r487273056



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  Thank you for your comment. Now I understand better what you mean. Let me update the PR by today. Thank you for being so patient with me.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  Thank you for your comment. Now I understand better what you mean. Let me update the PR by today. Thank you for being so patient with me.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  Thank you for your comment. Now I understand better what you mean. Let me update the PR by today. Thank you for being so patient with me.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  Thank you for your comment. Now I understand better what you mean. Let me update the PR by today. Thank you for being so patient with me.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Operation timeout is not the best choice too but better
   
   @infraio  In scan operation, there are 2 operations. One is to wait for lock and other is to wait for rpc to complete. On top of that we have retries. The problem we are trying to solve here is what is the timeout to use for lock. If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. So I am confused when you suggest to use operation timeout, are you suggesting to wait for operation timeout period while trying to get lock ?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you !

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @saintstack  Could you please chime in with your inputs. I think we are going back and forth on which timeout to use. Also I have created https://issues.apache.org/jira/browse/HBASE-24983 to wrap the whole scan operation within operation timeout but is outside the scope of this jira. Thank you ! Cc @SukumarMaddineni 

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  Thank you for your comment. Now I understand better what you mean. Let me update the PR by today. Thank you for being so patient with me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r481152014



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > the default value for operation timeout is 20 minutes. Do you want clients to wait for 20 minutes to acquire this lock ?
   
   The problem is not how long the default value. The problem here is to use the right timeout. And for the default 20min operation timeout, we set it to 20mins because it is easy for passing ITBLL. And in real producetion cluster, no user will use the default value if it serve for online service. 
   
   The guarantee of operation timeout is all table call (exclude scan) should return or throw exception when timeout. And when need to locate, locateRegionInMeta is one step of table call. So it should to keep this guarantee.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486884181



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-683208368


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 19s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 27s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 29s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 43s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 22s |  branch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 28s |  hbase-client in branch-2 failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in branch-2 failed.  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   4m 24s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 40s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 27s |  patch has no errors when building our shaded downstream artifacts.  |
   | -0 :warning: |  javadoc  |   0m 27s |  hbase-client in the patch failed.  |
   | -0 :warning: |  javadoc  |   0m 41s |  hbase-server in the patch failed.  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 38s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 183m 10s |  hbase-server in the patch passed.  |
   |  |   | 219m  4s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 51fc46fd89f3 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 2020-01-14 |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt |
   | javadoc | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/testReport/ |
   | Max. process+thread count | 2664 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/4/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] saintstack commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

saintstack commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478612123



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import java.io.IOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public
+public class LockTimeoutException extends IOException {

Review comment:
       Make this subclass HBaseIOException rather than IOException?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r482175115



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       We have an internal customer who wants a strict 15 seconds SLA for every operation. Since operation timeout is end to end timeout which includes all the retries, sleep within retries so we suggested them to use operation timeout as 15 seconds.
   Also we recommended them to set scanner timeout period to 7 seconds with  retries config (hbase.client.retries.number)  set to 2 .  There is some sleep interval between each attempt and we expect the call to complete within 15 seconds and if it doesn't then operation timeout will kick in and fail the call.
   But we found out that getRegionLocations call is not bounded by operation timeout.
   Now if we set the lock timeout to same as operation timeout, then in worst case scenario call will fail in 15 (lock timeout) + 15 (lock timeout 2nd try) + 1 (assuming sleep of 1 second) = 31 seconds
   If we set the lock timeout to same as scanner timeout, then in worst case scenario the call will fail in 7 (scanner timeout) + 7 (scanner timeout) + 1 (sleep between tries) = 16 seconds which is closer to SLA that we promised.
   Hope this makes sense. If still the community wants to go forward with operation timeout, I will change it to operation timeout.
   @infraio  @bharathv  @virajjasani  @saintstack  @apurtell 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r480476762



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       @saintstack  @apurtell  @infraio  We need to come to consensus on which timeout property to use ? IMHO don't think we should use operation timeout.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486558569



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > What is the resolution here @shahrs87 + @infraio ?
   
   @infraio  pointed out one bug in my patch. Fixed it yesterday. Waiting for his feedback. Thank you for being so patient !




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-693004287


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 57s |  Docker mode activated.  |
   ||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  No case conflicting files found.  |
   | +1 :green_heart: |  hbaseanti  |   0m  0s |  Patch does not have any anti-patterns.  |
   | +1 :green_heart: |  @author  |   0m  0s |  The patch does not contain any @author tags.  |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 33s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m  6s |  branch-2 passed  |
   | +1 :green_heart: |  checkstyle  |   1m 54s |  branch-2 passed  |
   | +1 :green_heart: |  spotbugs  |   4m  8s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 14s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 41s |  the patch passed  |
   | -0 :warning: |  checkstyle  |   1m 11s |  hbase-server: The patch generated 1 new + 6 unchanged - 0 fixed = 7 total (was 6)  |
   | +1 :green_heart: |  whitespace  |   0m  0s |  The patch has no whitespace issues.  |
   | +1 :green_heart: |  hadoopcheck  |  13m  9s |  Patch does not cause any errors with Hadoop 3.1.2 3.2.1.  |
   | +1 :green_heart: |  spotbugs  |   3m 56s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  asflicense  |   0m 22s |  The patch does not generate ASF License warnings.  |
   |  |   |  43m 36s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-general-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle |
   | uname | Linux 88e585e77023 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ddddd2a822 |
   | checkstyle | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt |
   | Max. process+thread count | 95 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12 |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-693097288


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   1m 31s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  6s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 34s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   4m 18s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 37s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   7m 25s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 17s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 47s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 34s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 34s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 22s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m 11s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 24s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 143m 52s |  hbase-server in the patch passed.  |
   |  |   | 179m 39s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux b55e94f65d69 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / ddddd2a822 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/testReport/ |
   | Max. process+thread count | 4040 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/7/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] bharathv commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

bharathv commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-693539789


   Please speak up if anyone has any objections with the current state of the patch, else I'll merge this by EOD today. Thanks.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486492130



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       @infraio  I moved the acquiring of lock inside try catch block. https://github.com/apache/hbase/blob/58cb1cdaba1dbf952647934be0f34b4bad2e71db/hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java#L868
   Also added a test case for that.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486075626



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       > Please check the code. The LockTimeoutException will be throwed out and terminate the loop. It is not in the try...catch code block.
   
   @infraio  you are right. Fixed that in latest commit. Please review again.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] saintstack commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

saintstack commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486063795



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +968,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  private void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();
+      if (!userRegionLock.tryLock(waitTime, TimeUnit.MILLISECONDS)) {

Review comment:
       What is the resolution here @shahrs87  + @infraio ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] bharathv commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

bharathv commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r480226179



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import org.apache.hadoop.hbase.HBaseIOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public

Review comment:
       I just realized there is an exact same class LockTimeoutException under org.apache.hadoop.hbase.exceptions, switch to that?
   
   > But open for suggestions.
   
   I was hoping that clients would rely on a generic HBaseIOException and marking this as private would give us more flexibility to remove/update etc in the future. But i think it doesn't matter if we switch to the above LockTimeout I was referring to.
   
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] shahrs87 commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

shahrs87 commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r478701500



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/LockTimeoutException.java
##########
@@ -0,0 +1,32 @@
+/**
+ *
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.hbase.client;
+
+import java.io.IOException;
+import org.apache.yetus.audience.InterfaceAudience;
+
+/*
+  Thrown whenever we are not able to get the lock within the specified wait time.
+ */
+@InterfaceAudience.Public
+public class LockTimeoutException extends IOException {

Review comment:
       Fixed in latest commit.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] bharathv merged pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

bharathv merged pull request #2322:
URL: https://github.com/apache/hbase/pull/2322


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] Apache-HBase commented on pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

Apache-HBase commented on pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#issuecomment-682123433


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |:----:|----------:|--------:|:--------|
   | +0 :ok: |  reexec  |   0m 38s |  Docker mode activated.  |
   | -0 :warning: |  yetus  |   0m  7s |  Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck  |
   ||| _ Prechecks _ |
   ||| _ branch-2 Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 18s |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |   3m 35s |  branch-2 passed  |
   | +1 :green_heart: |  compile  |   1m 21s |  branch-2 passed  |
   | +1 :green_heart: |  shadedjars  |   6m  1s |  branch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  2s |  branch-2 passed  |
   ||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 16s |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   3m 15s |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 26s |  the patch passed  |
   | +1 :green_heart: |  javac  |   1m 26s |  the patch passed  |
   | +1 :green_heart: |  shadedjars  |   7m 23s |  patch has no errors when building our shaded downstream artifacts.  |
   | +1 :green_heart: |  javadoc  |   1m  8s |  the patch passed  |
   ||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 16s |  hbase-client in the patch passed.  |
   | +1 :green_heart: |  unit  | 136m  6s |  hbase-server in the patch passed.  |
   |  |   | 167m 20s |   |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | Client=19.03.12 Server=19.03.12 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile |
   | GITHUB PR | https://github.com/apache/hbase/pull/2322 |
   | JIRA Issue | HBASE-24956 |
   | Optional Tests | javac javadoc unit shadedjars compile |
   | uname | Linux 0ebbb410c75a 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/hbase-personality.sh |
   | git revision | branch-2 / 85370f1443 |
   | Default Java | 1.8.0_232 |
   |  Test Results | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/testReport/ |
   | Max. process+thread count | 3832 (vs. ulimit of 12500) |
   | modules | C: hbase-client hbase-server U: . |
   | Console output | https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2322/2/console |
   | versions | git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) |
   | Powered by | Apache Yetus 0.11.1 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [hbase] infraio commented on a change in pull request #2322: [HBASE-24956] ConnectionManager#locateRegionInMeta waits for user region lock indefinitely.

Posted by GitBox <gi...@apache.org>.

infraio commented on a change in pull request #2322:
URL: https://github.com/apache/hbase/pull/2322#discussion_r486877715



##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       Why move the takeUserRegionLock to inside "try catch block"? The right fix is to use operation timeout......

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       I am -1 to use scanner timeout. It is too weird and confused. Operation timeout is not the best choice too but better. Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > Reread RpcRetryingCallerImpl#callWithRetries code. If callable.prepare throw the LockTimeoutException, it will not retry. There are a check about call duration and callTimeout in line 156.
   
   My point here is that your 15 seconds SLA case is not right. It is still meet your SLA even you use the operation timeout. I didn't mean that "move takeUserRegionLock  to try catch block". Thanks.

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?

##########
File path: hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java
##########
@@ -968,6 +974,19 @@ private RegionLocations locateRegionInMeta(TableName tableName, byte[] row, bool
     }
   }
 
+  void takeUserRegionLock() throws IOException {
+    try {
+      long waitTime = connectionConfig.getScannerTimeoutPeriod();

Review comment:
       > If we wait for operation timeout period and if it can't get the lock after the timeout, it will not have any time remaining for next attempts. 
   
   Yes. The guarantee is that the operation will fail or success within the "operation timeout". No remaining time to retry and failed the operation is acceptable. 
   
   > are you suggesting to wait for operation timeout period while trying to get lock 
   
   Yes. Use the operation timeout period when wait for lock, instead of the scanner timeout now.
   
   > I think we are going back and forth on which timeout to use.
   
   I thought my point is clearly since we start this discussion. I suggested that use operation timeout instead of scanner timeout. Then you give me a 15 seconds SLA example. Then I checked the code: use operation timeout can meet your SLA requirements, too. So why not use operation timeout?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org