Posted to hdfs-dev@hadoop.apache.org by "JiangHua Zhu (Jira)" <ji...@apache.org> on 2021/04/16 07:24:00 UTC

[jira] [Created] (HDFS-15985) Incorrect sorting will cause failure to load an FsImage file

JiangHua Zhu created HDFS-15985:
-----------------------------------

             Summary: Incorrect sorting will cause failure to load an FsImage file
                 Key: HDFS-15985
                 URL: https://issues.apache.org/jira/browse/HDFS-15985
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: JiangHua Zhu


After HDFS-14617 or HDFS-14771 is applied, loading an fsimage file can fail with the following error:
2021-04-15 17:21:17,868 [293072]-INFO [main:FSImage@784]-Planning to load image: FSImageFile(file=/xxxx/hadoop/hdfs/namenode/current/fsimage_000000000xxxx, cpktTxId=000000000xxxx)
2021-04-15 17:25:53,288 [568492]-INFO [main:FSImageFormatPBINode$Loader@229]-Loading 725097952 INodes.
2021-04-15 17:25:53,289 [568493]-ERROR [main:FSImage@730]-Failed to load image from FSImageFile(file=/xxxx/hadoop/hdfs/namenode/current/fsimage_000000000xxxx, cpktTxId=000000000xxxx)
java.lang.IllegalStateException: GLOBAL: serial number 3 does not exist
at org.apache.hadoop.hdfs.server.namenode.SerialNumberMap.get(SerialNumberMap.java:85)
at org.apache.hadoop.hdfs.server.namenode.SerialNumberManager.getString(SerialNumberManager.java:121)
at org.apache.hadoop.hdfs.server.namenode.SerialNumberManager.getString(SerialNumberManager.java:125)
at org.apache.hadoop.hdfs.server.namenode.INodeWithAdditionalFields$PermissionStatusFormat.toPermissionStatus(INodeWithAdditionalFields.java:86)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadPermission(FSImageFormatPBINode.java:93)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeFile(FSImageFormatPBINode.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINode(FSImageFormatPBINode.java:280)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Loader.loadINodeSection(FSImageFormatPBINode.java:237)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.loadInternal(FSImageFormatProtobuf.java:237)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormatProtobuf$Loader.load(FSImageFormatProtobuf.java:176)
at org.apache.hadoop.hdfs.server.namenode.FSImageFormat$LoaderDelegator.load(FSImageFormat.java:226)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:937)
The failure was traced to the comparator used to sort the sections before they are loaded:
ArrayList<FileSummary.Section> sections = Lists.newArrayList(
    summary.getSectionsList());
Collections.sort(sections, new Comparator<FileSummary.Section>() {
  @Override
  public int compare(FileSummary.Section s1, FileSummary.Section s2) {
    SectionName n1 = SectionName.fromString(s1.getName());
    SectionName n2 = SectionName.fromString(s2.getName());
    if (n1 == null) {
      return n2 == null ? 0 : -1;
    } else if (n2 == null) {
      return -1;
    } else {
      return n1.ordinal() - n2.ordinal();
    }
  }
});
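When n1 != null and n2 == null, the comparator returns -1, the same value it returns in the reverse case (n1 == null and n2 != null). The two results contradict each other, so the sort order is no longer well defined and the sections can end up in the wrong order.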
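
The contradiction can be reproduced in isolation. The sketch below is not Hadoop code: the class ComparatorContractCheck, the reduced enum Name, and the string-based fromString helper are hypothetical stand-ins for SectionName and FileSummary.Section, and only the branch structure is copied from the comparator quoted above.

```java
import java.util.Comparator;

public class ComparatorContractCheck {
    // Hypothetical, reduced stand-in for SectionName; only ordinal order matters.
    public enum Name { NS_INFO, STRING_TABLE, INODE }

    // Stand-in for SectionName.fromString: null for an unrecognized name.
    public static Name fromString(String s) {
        for (Name n : Name.values()) {
            if (n.name().equals(s)) {
                return n;
            }
        }
        return null;
    }

    // Same branch structure as the comparator quoted in the report.
    public static final Comparator<String> REPORTED = (s1, s2) -> {
        Name n1 = fromString(s1);
        Name n2 = fromString(s2);
        if (n1 == null) {
            return n2 == null ? 0 : -1;
        } else if (n2 == null) {
            return -1; // same sign as the reverse case
        } else {
            return n1.ordinal() - n2.ordinal();
        }
    };

    public static void main(String[] args) {
        // "UNKNOWN" is an assumed name that maps to null.
        System.out.println(REPORTED.compare("INODE", "UNKNOWN")); // prints -1
        System.out.println(REPORTED.compare("UNKNOWN", "INODE")); // prints -1
    }
}
```

Both calls claim that their first argument sorts before the second, which a consistent comparator can never do for the same pair.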
Sections must be loaded in this order:
NS_INFO -> STRING_TABLE -> INODE
With the incorrect sort, they can instead be loaded in this order:
INODE -> NS_INFO -> STRING_TABLE

Loading INODE depends on STRING_TABLE already being loaded, which is why the lookup in SerialNumberManager fails with "serial number 3 does not exist".
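
One possible correction, shown only as a sketch and not necessarily the change committed for this issue, is to return 1 in the n2 == null branch so that the two null cases mirror each other. The class SectionOrderSketch, the reduced enum Name, and the string-based helper are hypothetical stand-ins; the real SectionName enum contains more entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SectionOrderSketch {
    // Hypothetical, reduced stand-in for SectionName, declared in load order.
    public enum Name { NS_INFO, STRING_TABLE, INODE }

    // Stand-in for SectionName.fromString: null for an unrecognized name.
    public static Name fromString(String s) {
        for (Name n : Name.values()) {
            if (n.name().equals(s)) {
                return n;
            }
        }
        return null;
    }

    // Unrecognized names sort first; recognized names sort by enum ordinal.
    public static final Comparator<String> BY_SECTION = (s1, s2) -> {
        Name n1 = fromString(s1);
        Name n2 = fromString(s2);
        if (n1 == null) {
            return n2 == null ? 0 : -1;
        } else if (n2 == null) {
            return 1; // mirror of the branch above
        } else {
            return Integer.compare(n1.ordinal(), n2.ordinal());
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    public static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        copy.sort(BY_SECTION);
        return copy;
    }

    public static void main(String[] args) {
        // "UNKNOWN" is an assumed name that maps to null.
        System.out.println(
            sorted(Arrays.asList("INODE", "UNKNOWN", "STRING_TABLE", "NS_INFO")));
        // prints [UNKNOWN, NS_INFO, STRING_TABLE, INODE]
    }
}
```

With this comparator STRING_TABLE always precedes INODE, regardless of where an unrecognized section appears in the input.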


