You are viewing a plain text version of this content. The canonical link for it is here.
Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/08/05 16:49:00 UTC

[jira] [Commented] (HDFS-16697) Randomly setting “dfs.namenode.resource.checked.volumes.minimum” will always prevent safe mode from being turned off

    [ https://issues.apache.org/jira/browse/HDFS-16697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17575955#comment-17575955 ] 

ASF GitHub Bot commented on HDFS-16697:
---------------------------------------

hadoop-yetus commented on PR #4653:
URL: https://github.com/apache/hadoop/pull/4653#issuecomment-1206652668

   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 55s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  1s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  detsecrets  |   0m  0s |  |  detect-secrets was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 1 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  42m  4s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |   1m 40s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  compile  |   1m 31s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  checkstyle  |   1m 19s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 40s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 21s |  |  trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 46s |  |  trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 48s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  26m  3s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m 22s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 28s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javac  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |   1m 19s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  javac  |   1m 19s |  |  the patch passed  |
   | -1 :x: |  blanks  |   0m  0s | [/blanks-eol.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4653/3/artifact/out/blanks-eol.txt) |  The patch has 2 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  the patch passed  |
   | +1 :green_heart: |  mvnsite  |   1m 28s |  |  the patch passed  |
   | +1 :green_heart: |  javadoc  |   0m 56s |  |  the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1  |
   | +1 :green_heart: |  javadoc  |   1m 32s |  |  the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07  |
   | +1 :green_heart: |  spotbugs  |   3m 35s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  26m  0s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  | 340m 52s | [/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4653/3/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt) |  hadoop-hdfs in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  2s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 460m 16s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.hdfs.server.mover.TestMover |
   |   | hadoop.hdfs.TestDFSInotifyEventInputStreamKerberized |
   |   | hadoop.hdfs.TestRollingUpgradeRollback |
   |   | hadoop.hdfs.qjournal.TestSecureNNWithQJM |
   |   | hadoop.hdfs.TestRollingUpgrade |
   |   | hadoop.hdfs.qjournal.TestNNWithQJM |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4653/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/4653 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell detsecrets |
   | uname | Linux 194712d0a8d6 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 9ca81e12c62e111eccb8c4f8753edbef161d472a |
   | Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
   |  Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4653/3/testReport/ |
   | Max. process+thread count | 2116 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
   | Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4653/3/console |
   | versions | git=2.25.1 maven=3.6.3 spotbugs=4.2.2 |
   | Powered by | Apache Yetus 0.14.0 https://yetus.apache.org |
   
   
   This message was automatically generated.
   
   




> Randomly setting “dfs.namenode.resource.checked.volumes.minimum” will always prevent safe mode from being turned off
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-16697
>                 URL: https://issues.apache.org/jira/browse/HDFS-16697
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>    Affects Versions: 3.1.3
>         Environment: Linux version 4.15.0-142-generic (buildd@lgw01-amd64-039) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12))
> java version "1.8.0_162"
> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
> Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)
>            Reporter: Jingxuan Fu
>            Assignee: Jingxuan Fu
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.3
>
>          Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {code:java}
> <property>
>   <name>dfs.namenode.resource.checked.volumes.minimum</name>
>   <value>1</value>
>   <description>
>     The minimum number of redundant NameNode storage volumes required.
>   </description>
> </property>{code}
> I found that when setting the value of “dfs.namenode.resource.checked.volumes.minimum” is greater than the total number of storage volumes in the NameNode, it is always impossible to turn off the safe mode, and when in safe mode, the file system only accepts read data requests, but not delete, modify and other change requests, which is greatly limited by the function.
> The default value of the configuration item is 1, we set to 2 as an example for illustration, after starting hdfs logs and the client will throw the relevant reminders.
> {code:java}
> 2022-07-27 17:37:31,772 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: NameNode low on available disk space. Already in safe mode.
> 2022-07-27 17:37:31,772 INFO org.apache.hadoop.hdfs.StateChange: STATE* Safe mode is ON.
> Resources are low on NN. Please add or free up more resourcesthen turn off safe mode manually. NOTE:  If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off.
> {code}
> {code:java}
> org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot create directory /hdfsapi/test. Name node is in safe mode.
> Resources are low on NN. Please add or free up more resourcesthen turn off safe mode manually. NOTE:  If you turn off safe mode before adding resources, the NN will immediately return to safe mode. Use "hdfs dfsadmin -safemode leave" to turn safe mode off. NamenodeHostName:192.168.1.167
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.newSafemodeException(FSNamesystem.java:1468)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkNameNodeSafeMode(FSNamesystem.java:1455)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:3174)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:1145)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:714)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:527)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1036)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1000)
>         at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:928)
>         at java.base/java.security.AccessController.doPrivileged(Native Method)
>         at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2916){code}
> According to the prompt, it is believed that there is not enough resource space to meet the corresponding conditions to close safe mode, but after adding or releasing more resources and lowering the resource condition threshold "dfs.namenode.resource.du.reserved", it still fails to close safe mode and throws the same prompt .
> According to the source code, we know that if the NameNode has redundant storage volumes less than the "dfs.namenode.resource.checked.volumes.minimum" set the minimum number of redundant storage volumes will enter safe mode. After debugging, *we found that the current NameNode storage volumes are abundant resource space, but because the total number of NameNode storage volumes is less than the set value, so the number of NameNode storage volumes with redundancy space must also be less than the set value, resulting in always entering safe mode.*
> In summary, it is found that the configuration item lacks a condition check and an associated exception handling mechanism, which makes it impossible to find the root cause of the impact when a misconfiguration occurs.
> The solution I propose is to use Precondition.checkArgument() to check the value of the configuration item and throw a IllegalArgumentException and detailed error message when the value is greater than the number of  NameNode storage volumes to avoid the misconfiguration from affecting the subsequent operation of the program.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org