Posted to common-dev@hadoop.apache.org by "Sameer Paranjpye (JIRA)" <ji...@apache.org> on 2007/07/26 21:24:03 UTC

[jira] Created: (HADOOP-1656) HDFS does not record the blocksize for a file

HDFS does not record the blocksize for a file
---------------------------------------------

                 Key: HADOOP-1656
                 URL: https://issues.apache.org/jira/browse/HADOOP-1656
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
    Affects Versions: 0.13.0
            Reporter: Sameer Paranjpye
            Assignee: Raghu Angadi
             Fix For: 0.15.0


The blocksize that a file is created with is not recorded by the Namenode. It is used only by the client when it writes the file. Invoking 'getBlockSize' merely returns the size of the first block. The Namenode should record the blocksize.
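
The reported behaviour can be illustrated with a small, self-contained model. This is not Hadoop code: the class FirstBlockSizeModel and its methods are hypothetical names, and the model only mimics the description above, where the client splits the file using the block size it was asked for and the reported value is simply the length of the first block. Returning 0 for an empty file is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model of the behaviour described in this issue; not Hadoop code.
final class FirstBlockSizeModel {

    // Split a file of fileLength bytes into block lengths, the way a client
    // writing with the requested block size would.
    static List<Long> splitIntoBlocks(long fileLength, long requestedBlockSize) {
        if (requestedBlockSize <= 0) {
            throw new IllegalArgumentException("block size must be positive");
        }
        List<Long> blocks = new ArrayList<>();
        long remaining = fileLength;
        while (remaining > 0) {
            long length = Math.min(remaining, requestedBlockSize);
            blocks.add(length);
            remaining -= length;
        }
        return blocks;
    }

    // Mimics the reported behaviour: the "block size" is whatever length
    // the first block happens to have (0 for an empty file in this model).
    static long reportedBlockSize(List<Long> blocks) {
        return blocks.isEmpty() ? 0L : blocks.get(0);
    }

    public static void main(String[] args) {
        long requested = 64L * 1024 * 1024;   // assumed block size chosen by the client
        long smallFile = 10L * 1024 * 1024;   // fits in a single block
        long largeFile = 200L * 1024 * 1024;  // spans several blocks

        // Single-block file: the answer is the file's own length, not the requested size.
        System.out.println(reportedBlockSize(splitIntoBlocks(smallFile, requested)));
        // Multi-block file: the first block is full, so the answer matches the requested size.
        System.out.println(reportedBlockSize(splitIntoBlocks(largeFile, requested)));
    }
}
```

In this model the requested size is recoverable only when the file has at least one full block, which is the gap the issue asks the Namenode to close.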

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize2.patch

Added unit tests to test blocksize.



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "Konstantin Shvachko (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519181 ] 

Konstantin Shvachko commented on HADOOP-1656:
---------------------------------------------

I think we should be careful introducing new persistent fields. It is really simple to add a field, 
but it may be hard to remove or support it in the future.
Why do we need the block size to be persistent? What is the use case?
Right now we create files with the default block size, at least that is what map-reduce does.
Block size is used by map-reduce to calculate splits. I don't know whether this patch will break the
semantics of generating splits.
On the other hand, if we store the block size per file, what are the semantics of that field?
Currently we have the flexibility to create blocks of different sizes within a file. Do we lose that flexibility
from now on? And if we don't, why do we need to store it?
This looks like one of those simple changes that can lead to big consequences.

Currently, if there is more than one block in the file, the block size is returned correctly. The problem is with one-block files only.
I'd propose one of the 3 variants in this case:
# keep it as it is: return the first block size;
# return the default block size;
# return -1 as the block size from the name-node, and let DFSClient return its default size further up to the application.
Most probably that will be the size this file was created with.
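
The three variants can be compared side by side in a short sketch. Everything here is illustrative: the class SingleBlockVariants, the Variant enum and the constant DEFAULT_BLOCK_SIZE are hypothetical names, and the 64 MB default is an assumed value rather than one taken from this thread.

```java
// Hypothetical sketch of the three proposed variants for one-block files; not Hadoop code.
final class SingleBlockVariants {

    enum Variant { FIRST_BLOCK_SIZE, DEFAULT_SIZE, MINUS_ONE_THEN_CLIENT_DEFAULT }

    static final long DEFAULT_BLOCK_SIZE = 64L * 1024 * 1024; // assumed default

    // What the name-node side would answer for a file whose only block
    // has the given length.
    static long nameNodeAnswer(Variant variant, long firstBlockLength) {
        switch (variant) {
            case FIRST_BLOCK_SIZE:
                return firstBlockLength;          // variant 1: keep current behaviour
            case DEFAULT_SIZE:
                return DEFAULT_BLOCK_SIZE;        // variant 2: always the default
            case MINUS_ONE_THEN_CLIENT_DEFAULT:
                return -1L;                       // variant 3: "unknown"
            default:
                throw new IllegalArgumentException("unknown variant: " + variant);
        }
    }

    // What the application would finally see: under variant 3 the client
    // replaces the -1 marker with its own default.
    static long clientAnswer(Variant variant, long firstBlockLength, long clientDefault) {
        long fromNameNode = nameNodeAnswer(variant, firstBlockLength);
        return fromNameNode < 0 ? clientDefault : fromNameNode;
    }

    public static void main(String[] args) {
        long firstBlock = 10L * 1024 * 1024;
        for (Variant variant : Variant.values()) {
            System.out.println(variant + " -> "
                + clientAnswer(variant, firstBlock, DEFAULT_BLOCK_SIZE));
        }
    }
}
```

In this model, none of the variants returns a non-default size that was requested at creation time unless the single block happens to be exactly that long.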

On a side note we need to deprecate getBlockSize() both in DFSClient and ClientProtocol because it is never used.
The correct way is to call getFileInfo().getBlockSize().



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "Chris Douglas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523112 ] 

Chris Douglas commented on HADOOP-1656:
---------------------------------------

I could only find one piece of code that could be a problem: dfs.FileDataServlet::pickSrcDatanode. For a file longer than the reported blocksize, it assumes a length of n*blocksize will return n blocks (unless it's a zero length file, when it asks for only one block). It will still work, but the number of blocks surveyed is no longer as claimed.

The only other case is irrelevant, in TestDFSShell where querying the blocksize of a zero-length file need only not throw. That it expects zero doesn't matter.
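
The assumption in the first point can be made concrete with a small model. This is not the FileDataServlet code: BlockRangeEstimate and blocksOverlappingPrefix are hypothetical names, and the block lengths are made-up values chosen to keep the arithmetic readable.

```java
// Hypothetical illustration of the "n * blocksize covers n blocks" assumption;
// not the FileDataServlet code.
final class BlockRangeEstimate {

    // Count the blocks that overlap the byte range [0, rangeLength).
    static int blocksOverlappingPrefix(long[] blockLengths, long rangeLength) {
        int count = 0;
        long offset = 0; // start offset of the current block
        for (long length : blockLengths) {
            if (offset >= rangeLength) {
                break; // this block starts at or past the end of the range
            }
            count++;
            offset += length;
        }
        return count;
    }

    public static void main(String[] args) {
        long blockSize = 100; // reported block size, kept small for readability
        int n = 3;            // number of blocks the caller intends to survey

        long[] uniform = {100, 100, 100, 100};  // every block has the reported size
        long[] mixed = {100, 40, 40, 100, 100}; // some blocks are shorter

        // With uniform blocks the range n * blockSize covers exactly n blocks.
        System.out.println(blocksOverlappingPrefix(uniform, n * blockSize));
        // With mixed block lengths the same range covers more blocks than n.
        System.out.println(blocksOverlappingPrefix(mixed, n * blockSize));
    }
}
```

The request still succeeds in both cases; only the number of blocks it covers differs from what the caller expected.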



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523657 ] 

Hairong Kuang commented on HADOOP-1656:
---------------------------------------

+1 The patch looks good.



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize.patch)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize4.patch

merged with latest trunk.



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Status: Patch Available  (was: Open)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize3.patch

A question came up during the review: the blocksize is really a heuristic, and the file system could still have blocks in the file whose size differs from the specified blocksize. This means that the specified block size is really the "preferred" block size.

HDFS now implements getPreferredBlockSize(). This preferred block size is persisted in the file system image. The FileSystem API has not changed.
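
A minimal sketch of the "preferred" semantics follows. It is a toy model, not Hadoop's inode class: PreferredBlockSizeModel and every method except the name getPreferredBlockSize() are hypothetical, and the model deliberately places no constraint on individual block lengths.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a file that remembers its preferred block size;
// not Hadoop's inode implementation.
final class PreferredBlockSizeModel {

    private final long preferredBlockSize;                   // fixed when the file is created
    private final List<Long> blockLengths = new ArrayList<>();

    PreferredBlockSizeModel(long preferredBlockSize) {
        if (preferredBlockSize <= 0) {
            throw new IllegalArgumentException("preferred block size must be positive");
        }
        this.preferredBlockSize = preferredBlockSize;
    }

    // Individual blocks may have any non-negative length; the preferred
    // size is a hint, not a constraint, in this model.
    void addBlock(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative block length");
        }
        blockLengths.add(length);
    }

    // Returns the value given at creation, independent of the actual blocks.
    long getPreferredBlockSize() {
        return preferredBlockSize;
    }

    List<Long> getBlockLengths() {
        return Collections.unmodifiableList(blockLengths);
    }

    public static void main(String[] args) {
        PreferredBlockSizeModel file = new PreferredBlockSizeModel(64L * 1024 * 1024);
        file.addBlock(10L * 1024 * 1024); // a single, short block

        System.out.println("preferred: " + file.getPreferredBlockSize());
        System.out.println("actual:    " + file.getBlockLengths());
    }
}
```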



[jira] Assigned: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur reassigned HADOOP-1656:
----------------------------------------

    Assignee: dhruba borthakur  (was: Raghu Angadi)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize3.patch)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize4.patch)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this.



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize.patch

The blockSize is a persistent attribute of the file. It is stored in the INode in the fsimage. It can be specified only at file creation time. HDFS makes every effort to chunk a file into blocks of the size specified by blockSize.
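
To show what persisting such an attribute involves, here is a sketch of a record that is written to a byte stream and read back. The record layout (path, replication, block size) and the class ImageRecordSketch are assumptions made for illustration; this is not the real fsimage format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical per-file record with an assumed layout; not the real fsimage format.
final class ImageRecordSketch {

    final String path;
    final short replication;
    final long blockSize; // set once at creation, never changed afterwards

    ImageRecordSketch(String path, short replication, long blockSize) {
        this.path = path;
        this.replication = replication;
        this.blockSize = blockSize;
    }

    // Write the fields in a fixed order.
    void write(DataOutputStream out) throws IOException {
        out.writeUTF(path);
        out.writeShort(replication);
        out.writeLong(blockSize);
    }

    // Read the fields back in the same order they were written.
    static ImageRecordSketch read(DataInputStream in) throws IOException {
        String path = in.readUTF();
        short replication = in.readShort();
        long blockSize = in.readLong();
        return new ImageRecordSketch(path, replication, blockSize);
    }

    // Serialize to memory and parse again, standing in for a save and reload.
    static ImageRecordSketch roundTrip(ImageRecordSketch record) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                record.write(out);
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return read(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ImageRecordSketch original =
            new ImageRecordSketch("/user/demo/file", (short) 3, 1024L * 1024);
        ImageRecordSketch reloaded = roundTrip(original);
        System.out.println(reloaded.path + " blockSize=" + reloaded.blockSize);
    }
}
```

Because the field is final and only set in the constructor, the sketch also reflects the rule that the value can be specified only when the file is created.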



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12519451 ] 

dhruba borthakur commented on HADOOP-1656:
------------------------------------------

The blocksize is a heuristic that HDFS uses to chunk up a file. HDFS makes every effort to chunk a file so that most chunks are of the size specified by blocksize. This means HDFS can still create blocks of a size other than the specified blocksize if it needs to (maybe in the case of appends). Another requirement is that if an application specified the blocksize while creating the file, it should have the ability to retrieve that *precise* value by invoking getBlockSize(). Given the above definition and requirements, proposals 1-3 above might not meet these needs. I do not see any way of achieving this other than persisting the blocksize attribute in the inode.







[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize2.patch)



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "Hadoop QA (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523755 ] 

Hadoop QA commented on HADOOP-1656:
-----------------------------------

-1, build or testing failed

2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12364784/blockSize6.patch against trunk revision r570983.

Test results:   http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/650/testReport/
Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/650/console

Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize5.patch

merged patch with latest trunk



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize2.patch)



[jira] Commented: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "Hairong Kuang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12523639 ] 

Hairong Kuang commented on HADOOP-1656:
---------------------------------------

The patch looks good except for some minor comments:
1. FSConstants.java
line 161: the comment should be updated for the new layout version.
2. FSEditLog.java
line 260: the error message of the IOException should be updated.
line 502: it uses toLogTimeStamp to convert a block size to a UTF8. It would be clearer to rename the method to something like toLogLong.
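
The layout-version remark presumably reflects that a new persistent field changes the on-disk format, so a reader has to know whether the field is present. The sketch below shows one way such a check could look. It is an assumption-laden illustration, not the patch: the class LayoutVersionSketch, the constant FIRST_VERSION_WITH_BLOCK_SIZE and the version numbers are hypothetical, and the convention that versions are negative and decrease with each format change is assumed here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical version-gated encoding of a block-size field; not the actual patch.
final class LayoutVersionSketch {

    // Assumed convention: versions are negative and decrease with each format change.
    static final int FIRST_VERSION_WITH_BLOCK_SIZE = -2; // made-up number

    static boolean hasBlockSize(int layoutVersion) {
        return layoutVersion <= FIRST_VERSION_WITH_BLOCK_SIZE;
    }

    // Write the field only when the target layout version includes it.
    static byte[] encode(int layoutVersion, long blockSize) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (hasBlockSize(layoutVersion)) {
                out.writeLong(blockSize);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Read the field when present; otherwise use the caller's fallback value.
    static long decode(int layoutVersion, byte[] record, long fallback) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(record))) {
            return hasBlockSize(layoutVersion) ? in.readLong() : fallback;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long fallback = 64L * 1024 * 1024; // assumed default for older records
        System.out.println(decode(-2, encode(-2, 4096L), fallback)); // field present
        System.out.println(decode(-1, encode(-1, 4096L), fallback)); // older layout
    }
}
```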



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment:     (was: blockSize5.patch)



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize6.patch

Incorporated all of Hairong's review comments.



[jira] Updated: (HADOOP-1656) HDFS does not record the blocksize for a file

Posted by "dhruba borthakur (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HADOOP-1656:
-------------------------------------

    Attachment: blockSize2.patch

Added a unit test to test blockSize for files.
