Posted to hdfs-issues@hadoop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/02/24 08:27:00 UTC

[jira] [Commented] (HDFS-16933) A race in SerialNumberMap will cause wrong ownership

    [ https://issues.apache.org/jira/browse/HDFS-16933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17693070#comment-17693070 ] 

ASF GitHub Bot commented on HDFS-16933:
---------------------------------------

ZanderXu opened a new pull request, #5430:
URL: https://github.com/apache/hadoop/pull/5430

   Jira: [HDFS-16933](https://issues.apache.org/jira/browse/HDFS-16933)
   
   With parallel fsimage loading enabled, we found that the NameNode intermittently ends up with wrong ownership after loading the same fsimage.
   
   After tracing the problem, we found what looks like a race in SerialNumberMap.
   ```
   public int get(T t) {
     if (t == null) {
       return 0;
     }
     Integer sn = t2i.get(t);
     if (sn == null) {
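       // Assume two threads call get() with different values of t:
       //   T1 with hbase
       //   T2 with hdfs
       // If T1 and T2 request a serial number at the same time, both can end up with the
       // same sn, e.g. 10 (for example, after another thread's getAndDecrement() rolls the counter back)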
       sn = current.getAndIncrement();
       if (sn > max) {
         current.getAndDecrement();
         throw new IllegalStateException(name + ": serial number map is full");
       }
       Integer old = t2i.putIfAbsent(t, sn);
       if (old != null) {
         current.getAndDecrement();
         return old;
       }
       // If T1 puts 10 -> hbase into i2t first, T2 then overwrites it with 10 -> hdfs.
       // INodes that belong to hbase are then resolved to the wrong owner, hdfs.
       i2t.put(sn, t);
     }
     return sn;
   } 
   ```
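
The comments above do not spell out how two threads can draw the same number from an atomic counter. One possible interleaving, replayed by hand on a single thread so that it is deterministic, is sketched below. It is an assumption about the mechanism, not something taken from a trace: the class name `SerialNumberRaceReplay`, the extra value `yarn`, and the starting counter value 10 are all hypothetical. The key step is the `getAndDecrement()` rollback, which hands out a number that another thread already holds.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical single-threaded replay of one interleaving; not Hadoop code. */
class SerialNumberRaceReplay {
  static String replay() {
    AtomicInteger current = new AtomicInteger(10); // starting value is an assumption
    ConcurrentHashMap<String, Integer> t2i = new ConcurrentHashMap<>();
    ConcurrentHashMap<Integer, String> i2t = new ConcurrentHashMap<>();

    // Threads A and B both miss on "yarn", so each draws a number.
    int snA = current.getAndIncrement();  // 10, counter is now 11
    int snB = current.getAndIncrement();  // 11, counter is now 12

    // T1 misses on "hbase" and draws the next number.
    int snT1 = current.getAndIncrement(); // 12, counter is now 13

    // A wins putIfAbsent for "yarn"; B loses and rolls the counter back.
    t2i.putIfAbsent("yarn", snA);
    i2t.put(snA, "yarn");
    if (t2i.putIfAbsent("yarn", snB) != null) {
      current.getAndDecrement();          // counter is now 12 again
    }

    // T2 misses on "hdfs" and draws 12, the number T1 already holds.
    int snT2 = current.getAndIncrement();

    // Both putIfAbsent calls succeed because the keys differ.
    t2i.putIfAbsent("hbase", snT1);
    t2i.putIfAbsent("hdfs", snT2);

    // T1 writes the reverse mapping first, then T2 overwrites it.
    i2t.put(snT1, "hbase");
    i2t.put(snT2, "hdfs");

    return "hbase=" + snT1 + ", hdfs=" + snT2 + ", i2t[" + snT1 + "]=" + i2t.get(snT1);
  }
}
```

Calling `SerialNumberRaceReplay.replay()` returns `hbase=12, hdfs=12, i2t[12]=hdfs`: both names map to serial number 12, and the reverse lookup for 12 yields only `hdfs`.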
   
   SerialNumberMap holds two mappings, t2i and i2t, and the two must be updated together in a thread-safe way.
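
A minimal sketch of one way to update both maps together, under stated assumptions: it is a simplified stand-in, not the Hadoop class and not necessarily the change made in this pull request. The class name `SerialNumberMapSketch`, the starting counter value of 1, and the reverse lookup `get(int)` are hypothetical. The idea is to allocate the number and write both maps inside one lock, so that no rollback is needed and no number can be handed out twice.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical, simplified stand-in for SerialNumberMap; not Hadoop code. */
class SerialNumberMapSketch<T> {
  private final String name;
  private final int max;
  private int current = 1; // guarded by "this"; the starting value is an assumption
  private final ConcurrentMap<T, Integer> t2i = new ConcurrentHashMap<>();
  private final ConcurrentMap<Integer, T> i2t = new ConcurrentHashMap<>();

  SerialNumberMapSketch(String name, int max) {
    this.name = name;
    this.max = max;
  }

  public int get(T t) {
    if (t == null) {
      return 0;
    }
    Integer sn = t2i.get(t); // lock-free fast path for values already registered
    if (sn == null) {
      synchronized (this) {
        sn = t2i.get(t); // re-check: another thread may have registered t meanwhile
        if (sn == null) {
          if (current > max) {
            throw new IllegalStateException(name + ": serial number map is full");
          }
          sn = current++; // allocation and both writes happen under the same lock
          i2t.put(sn, t); // write the reverse mapping first ...
          t2i.put(t, sn); // ... so a fast-path reader that sees sn also finds it in i2t
        }
      }
    }
    return sn;
  }

  /** Reverse lookup, included only to make the sketch testable. */
  public T get(int i) {
    return i == 0 ? null : i2t.get(i);
  }
}
```

The lock is taken only on the first sighting of a value, so repeated lookups of known owners stay lock-free. Whether that trade-off suits parallel fsimage loading would need to be measured.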




> A race in SerialNumberMap will cause wrong ownership
> ----------------------------------------------------
>
>                 Key: HDFS-16933
>                 URL: https://issues.apache.org/jira/browse/HDFS-16933
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: ZanderXu
>            Assignee: ZanderXu
>            Priority: Major
>
> If the NameNode enables parallel fsimage loading, a race in SerialNumberMap can cause wrong ownership for INodes.
> {code:java}
> public int get(T t) {
>   if (t == null) {
>     return 0;
>   }
>   Integer sn = t2i.get(t);
>   if (sn == null) {
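>     // Assume two threads call get() with different values of t:
>     //   T1 with hbase
>     //   T2 with hdfs
>     // If T1 and T2 request a serial number at the same time, both can end up with the
>     // same sn, e.g. 10 (for example, after another thread's getAndDecrement() rolls the counter back)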
>     sn = current.getAndIncrement();
>     if (sn > max) {
>       current.getAndDecrement();
>       throw new IllegalStateException(name + ": serial number map is full");
>     }
>     Integer old = t2i.putIfAbsent(t, sn);
>     if (old != null) {
>       current.getAndDecrement();
>       return old;
>     }
>     // If T1 puts 10 -> hbase into i2t first, T2 then overwrites it with 10 -> hdfs.
>     // INodes that belong to hbase are then resolved to the wrong owner, hdfs.
>     i2t.put(sn, t);
>   }
>   return sn;
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)