You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/06/01 05:59:00 UTC

[jira] [Commented] (HDFS-17030) Limit wait time for getHAServiceState in ObserverReaderProxy

    [ https://issues.apache.org/jira/browse/HDFS-17030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17728228#comment-17728228 ] 

ASF GitHub Bot commented on HDFS-17030:
---------------------------------------

hadoop-yetus commented on PR #5700:
URL: https://github.com/apache/hadoop/pull/5700#issuecomment-1571402556

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 48s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  1s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +0 :ok: |  mvndep  |  16m 42s |  |  Maven dependency ordering for branch  |
   | +1 :green_heart: |  mvninstall  |  23m  6s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   5m 51s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  compile  |   5m 33s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  checkstyle  |   1m 18s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   2m 11s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   2m  5s |  |  trunk passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 57s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  25m 50s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +0 :ok: |  mvndep  |   0m 24s |  |  Maven dependency ordering for patch  |
   | +1 :green_heart: |  mvninstall  |   1m 54s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 45s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javac  |   5m 45s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   5m 31s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  javac  |   5m 31s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | -0 :warning: |  checkstyle  |   1m  9s | [/results-checkstyle-hadoop-hdfs-project.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/5/artifact/out/results-checkstyle-hadoop-hdfs-project.txt) |  hadoop-hdfs-project: The patch generated 1 new + 1 unchanged - 0 fixed = 2 total (was 1)  |
   | +1 :green_heart: |  mvnsite  |   1m 59s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   1m 28s |  |  the patch passed with JDK Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 57s |  |  the patch passed with JDK Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09  |
   | +1 :green_heart: |  spotbugs  |   5m 57s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m 47s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |   2m 18s |  |  hadoop-hdfs-client in the patch passed.  |
   | -1 :x: |  unit  | 226m 49s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/5/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 42s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 371m 59s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.datanode.TestDirectoryScanner |
   |   | hadoop.hdfs.server.namenode.ha.TestObserverNode |
   |   | hadoop.hdfs.TestRollingUpgrade |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/5/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/5700 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 3af1994e7412 4.15.0-206-generic #217-Ubuntu SMP Fri Feb 3 19:10:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 15a350d555a287f8a70a3438d73800a47efcf92e |
   | Default Java | Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.19+7-post-Ubuntu-0ubuntu120.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_362-8u372-ga~us1-0ubuntu1~20.04-b09 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/5/testReport/ |
   | Max. process+thread count | 2273 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs-client hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-5700/5/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Limit wait time for getHAServiceState in ObserverReaderProxy
> ------------------------------------------------------------
>
>                 Key: HDFS-17030
>                 URL: https://issues.apache.org/jira/browse/HDFS-17030
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.4.0
>            Reporter: Xing Lin
>            Assignee: Xing Lin
>            Priority: Minor
>              Labels: pull-request-available
>
> When namenode HA is enabled and a standby NN is not responsible, we have observed it would take a long time to serve a request, even though we have a healthy observer or active NN. 
> Basically, when a standby is down, the RPC client would (re)try to create socket connection to that standby for _ipc.client.connect.timeout_ _* ipc.client.connect.max.retries.on.timeouts_ before giving up. When we take a heap dump at a standby, the NN still accepts the socket connection but it won't send responses to these RPC requests and we would timeout after _ipc.client.rpc-timeout.ms._ This adds a significantly latency. For clusters at Linkedin, we set _ipc.client.rpc-timeout.ms_ to 120 seconds and thus a request takes more than 2 mins to complete when we take a heap dump at a standby. This has been causing user job failures. 
> We could set _ipc.client.rpc-timeout.ms to_ a smaller value when sending getHAServiceState requests in ObserverReaderProxy (for user rpc requests, we still use the original value from the config). However, that would double the socket connection between clients and the NN (which is a deal-breaker). 
> The proposal is to add a timeout on getHAServiceState() calls in ObserverReaderProxy and we will only wait for the timeout for an NN to respond its HA state. Once we pass that timeout, we will move on to probe the next NN. 
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org