Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2006/04/21 19:08:05 UTC

[jira] Created: (HADOOP-158) dfs block ids sometimes collide

dfs block ids sometimes collide
-------------------------------

         Key: HADOOP-158
         URL: http://issues.apache.org/jira/browse/HADOOP-158
     Project: Hadoop
        Type: Bug

  Components: dfs  
    Reporter: Doug Cutting
     Fix For: 0.2


A random number generator is used to allocate block ids in dfs.  Sometimes a block id is allocated that is already used in the filesystem, which causes filesystem corruption.

A short-term fix is simply to check, when allocating a block id, whether any file is already using the newly generated id and, if it is, to generate another one.  Collisions are still possible under some rare conditions, but those are harder to fix and can wait, since this simple check handles the vast majority of collisions.
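
A minimal sketch of that check, in Java (illustrative names only; the actual namenode code differs):

    import java.util.HashSet;
    import java.util.Random;
    import java.util.Set;

    // Illustrative only: draw random 64-bit block ids, retrying until the
    // candidate does not collide with an id the namenode already knows about.
    class RandomBlockIdAllocator {
        private final Random random = new Random();
        private final Set<Long> idsInUse = new HashSet<Long>(); // stand-in for the block map

        synchronized long allocateBlockId() {
            long id;
            do {
                id = random.nextLong();
            } while (!idsInUse.add(id)); // add() returns false if the id was already present
            return id;
        }
    }

The rare remaining collisions are those the namenode cannot see at allocation time, e.g. ids held by blocks on datanodes that are currently down.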




[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413922 ] 

Sameer Paranjpye commented on HADOOP-158:
-----------------------------------------

Yes, random assignment of file-ids makes collisions more likely. However, collisions are possible even with sequential assignment, and if they are possible they need to be detected. Since collision detection code is needed with both random and sequential assignment, random assignment makes the system simpler: the namenode doesn't have to track a 'high watermark' file-id.

I don't think recently assigned file-ids that belong to incomplete files are a concern, since the namenode is aware of all file-ids in use, whether they belong to incomplete files or not.

Wrap-around before a file completes is not the only collision scenario. In the sequential assignment scheme, suppose the first million files in the system get the file-ids 0-999999. These files are archival data of some kind, so they are never deleted. Life goes on, lots of files are created and removed, and at any given time there are only a few million files total (complete + incomplete) in the system. At some point the system will have gone through a trillion file creation events, and the file-ids will wrap and start to collide with those first million files.




[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Yoram Arnon (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413910 ] 

Yoram Arnon commented on HADOOP-158:
------------------------------------

It could, but that would make upgrading an existing file system harder: one would have to "compress" the ids before upgrading, then keep track of the high-water-mark file-id. The upgrade wouldn't be as smooth as when selecting 'clean' ranges randomly, and would require metadata conversion; blocks on individual datanodes might even need to change their ids.
Alternatively, one could copy or move the entire filesystem to a new, clean one.
The selected method allows a seamless upgrade, requires no conversion, and works equally well across the whole 2^64 id address space.



[jira] Assigned: (HADOOP-158) dfs block ids sometimes collide

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-158?page=all ]

Sameer Paranjpye reassigned HADOOP-158:
---------------------------------------

    Assign To: Konstantin Shvachko



[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413909 ] 

Sameer Paranjpye commented on HADOOP-158:
-----------------------------------------

It can be sequential. In that case, the namenode would need to determine the lowest unused file-id at startup and start file-id assignments from that point. 

Even sequential allocation of file-ids should probably do the collision check, because you don't need a trillion files in the system before you wrap around; you only need a trillion file creation events. If you're doing the collision check in both schemes, random file-id assignment keeps things simpler.

The possibility of collision with sequential assignment of file-ids is very remote, but why expose ourselves? I'm probably being paranoid so ignore me on this one if you want.







[jira] Updated: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Sameer Paranjpye (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-158?page=all ]

Sameer Paranjpye updated HADOOP-158:
------------------------------------

        Summary: dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file  (was: dfs block ids sometimes collide)
    Fix Version: 0.4
                     (was: 0.3)
        Version: 0.1.0

Block id collisions have largely been addressed by the fix in HADOOP-146. The namenode first checks whether a randomly generated block id is already present before assigning it to a block.

Longer term, we should implement a scheme where the namenode allocates large block id ranges to files. When a file is created, the namenode generates a 5-byte integer, a range-id, for the file, checking for collisions and re-generating the range-id if necessary. Blocks for the new file are then assigned 8-byte block ids sequentially, where the high 5 bytes are the range-id and the low 3 bytes are the block number within the file. Blocks in a file thus get the ids rangeid-0, rangeid-1, ..., rangeid-(N-1), where N is the number of blocks in the file. This lets us assign up to a trillion ranges and lets each file grow to 0.5, 1, 2, ... petabytes depending on whether the block size is 32, 64, 128, ... MB. The scheme has the additional benefit of saving some memory per block at the namenode.
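
A sketch of that id layout (illustrative code; field widths as proposed above):

    // Illustrative layout: the high 5 bytes of a block id are a per-file
    // range-id, the low 3 bytes are the block's index within the file.
    class BlockIdRange {
        static final int  BLOCK_BITS = 24;        // 3 bytes: 2^24 blocks, i.e. 0.5 PB at 32 MB blocks
        static final long RANGE_IDS  = 1L << 40;  // 5 bytes: ~1.1 trillion distinct ranges

        static long blockId(long rangeId, int blockIndex) {
            assert 0 <= rangeId && rangeId < RANGE_IDS;
            assert 0 <= blockIndex && blockIndex < (1 << BLOCK_BITS);
            return (rangeId << BLOCK_BITS) | blockIndex;
        }

        static long rangeIdOf(long blockId) { return blockId >>> BLOCK_BITS; }
        static int blockIndexOf(long blockId) { return (int) (blockId & ((1L << BLOCK_BITS) - 1)); }
    }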

There is the scenario of a file being deleted while a node holding some of its blocks is down, the file's range-id being re-assigned to another file, and collisions appearing when the node later comes back. To get around this, the namenode tags each block in a file with the creation time of the file. When a collision occurs, the timestamps will differ; the most recent timestamp wins and the old blocks are discarded.
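
The tiebreak could look something like this (again an illustrative sketch, not the proposed implementation):

    // Illustrative: each block carries its file's creation time; on an id
    // collision the block with the newer stamp survives.
    class StampedBlock {
        final long id;
        final long fileCreationTime; // assigned by the namenode at file creation

        StampedBlock(long id, long fileCreationTime) {
            this.id = id;
            this.fileCreationTime = fileCreationTime;
        }

        /** Returns the block that wins the collision; the loser is discarded. */
        static StampedBlock resolve(StampedBlock a, StampedBlock b) {
            return a.fileCreationTime >= b.fileCreationTime ? a : b;
        }
    }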





[jira] Resolved: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko resolved HADOOP-158.
----------------------------------------

    Resolution: Duplicate



[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413893 ] 

Doug Cutting commented on HADOOP-158:
-------------------------------------

Why must the file-id part of the block id be random?  Can't that be sequential?



[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12413911 ] 

Doug Cutting commented on HADOOP-158:
-------------------------------------

I would think that random allocation would make collisions more likely, not less.  We always know which block ids are used by complete files.  The concern is only about block ids which have been recently allocated to a file, but the file is somehow not yet complete.  So, with sequential allocation, a collision can only happen if the probe key (the next block id to allocate) wraps all the way around before a file is completed, while with random allocation it can happen much more frequently.  We simply have to make sure that probe key increments are logged to the edits file along with other file system changes.  Am I missing something?
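
A sketch of what this describes, with hypothetical names (EditLog stands in for the namenode's edits file):

    import java.util.Set;

    // Sequential allocation: the next candidate id (the "probe key") is logged
    // so a restarted namenode never reissues ids handed to incomplete files.
    class SequentialBlockIdAllocator {
        interface EditLog { void logProbeKey(long next); } // hypothetical stand-in

        private final EditLog editLog;
        private long probeKey; // restored from the edits log at startup

        SequentialBlockIdAllocator(EditLog editLog, long restoredProbeKey) {
            this.editLog = editLog;
            this.probeKey = restoredProbeKey;
        }

        synchronized long allocateBlockId(Set<Long> idsInUse) {
            while (idsInUse.contains(probeKey)) {
                probeKey++; // after wrap-around, skip ids still held by complete files
            }
            long id = probeKey++;
            editLog.logProbeKey(probeKey); // persist the increment with other fs changes
            return id;
        }
    }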



[jira] Updated: (HADOOP-158) dfs block ids sometimes collide

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-158?page=all ]

Doug Cutting updated HADOOP-158:
--------------------------------

    Fix Version: 0.3
                     (was: 0.2)



[jira] Updated: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
     [ http://issues.apache.org/jira/browse/HADOOP-158?page=all ]

Doug Cutting updated HADOOP-158:
--------------------------------

    Fix Version: 0.5.0
                     (was: 0.4.0)



[jira] Commented: (HADOOP-158) dfs should allocate a random blockid range to a file, then assign ids sequentially to blocks in the file

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12505899 ] 

Doug Cutting commented on HADOOP-158:
-------------------------------------

> At some point, the system will have gone through a trillion file creation events [ ... ]

We generally aim for a block created per drive no more than every 100 milliseconds, so that transfer dominates seek.  With 10,000 nodes, each with four drives, that would give a maximum block creation rate of 400k/second (assuming a replication level of one).  At that rate it would take 100,000 years to exhaust all 64-bit block ids.  I wonder what version Hadoop will have then?
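
A quick check of that arithmetic (taking a year as roughly 3.15 x 10^7 seconds) suggests the figure is, if anything, conservative:

    \frac{2^{64}\ \text{ids}}{(4\times 10^{5}\ \text{ids/s})\,(3.15\times 10^{7}\ \text{s/yr})} \approx 1.5\times 10^{6}\ \text{years}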




[jira] Commented: (HADOOP-158) dfs block ids sometimes collide

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ http://issues.apache.org/jira/browse/HADOOP-158?page=comments#action_12377273 ] 

Doug Cutting commented on HADOOP-158:
-------------------------------------

Using the formulas in:

http://en.wikipedia.org/wiki/Birthday_paradox#Generalisation

I think it is very unlikely that, with 64-bit block ids and a decent random number generator, we are actually seeing collisions.  It seems more likely that the symptoms ascribed to duplicate block id allocations are the result of other bugs.  Still, it would be more comfortable not to rely on random block id allocation long-term.
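
For reference, the generalization being applied there: drawing n ids uniformly at random from d possible values collides with probability about

    p(n; d) \approx 1 - e^{-n^{2}/(2d)} \approx \frac{n^{2}}{2d} \quad \text{for } n \ll \sqrt{d}

so even an assumed n = 10^8 blocks against d = 2^64 ids gives p on the order of 2.7 x 10^-4 (the 10^8 figure is just an illustrative cluster size, not from the report).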
