Posted to mapreduce-user@hadoop.apache.org by John Lilley <jo...@redpoint.net> on 2013/07/15 20:23:10 UTC

HDFS block storage

I was looking at the HDFS block storage and noticed a couple of things: (1) all block files are in a flat directory structure, and (2) there is a "meta" file for each block file.  This leads me to ask:
-- Where can I find good reading that describes this level of HDFS internals?
-- Is the flat storage sufficient to handle millions of blocks?
-- What is in the meta file?  Why isn't it just a header or trailer on the data file?
Thanks
John


Re: HDFS block storage

Posted by Bertrand Dechoux <de...@gmail.com>.
1) Right now, I would say JIRA and the code.
2) It is not really flat storage. It 'folds' itself when needed.
3) At least checksums. There are JIRAs about whether they should somehow be
stored in the block file itself.
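
To make point 3 more concrete, here is a minimal sketch of the general idea of keeping per-chunk checksums in a separate side file. It is not the actual HDFS meta file format: the 512-byte chunk size, the 4-byte header, and the function names build_meta and find_corrupt_chunks are all assumptions made for illustration.

```python
import struct
import zlib
from typing import List

# Assumed chunk size for this sketch; not read from any HDFS configuration.
BYTES_PER_CHECKSUM = 512


def build_meta(data: bytes, chunk_size: int = BYTES_PER_CHECKSUM) -> bytes:
    """Build a hypothetical side-file blob for a block.

    Layout (illustrative only): a 4-byte big-endian chunk size,
    followed by one 4-byte big-endian CRC32 per chunk of data.
    """
    parts = [struct.pack(">I", chunk_size)]
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        parts.append(struct.pack(">I", zlib.crc32(chunk)))
    return b"".join(parts)


def find_corrupt_chunks(data: bytes, meta: bytes) -> List[int]:
    """Return the indexes of chunks whose CRC32 no longer matches the meta blob."""
    (chunk_size,) = struct.unpack_from(">I", meta, 0)
    count = (len(meta) - 4) // 4
    stored = struct.unpack_from(">%dI" % count, meta, 4)
    bad = []
    for index, expected in enumerate(stored):
        chunk = data[index * chunk_size:(index + 1) * chunk_size]
        if zlib.crc32(chunk) != expected:
            bad.append(index)
    return bad
```

With this layout the data file is never rewritten when checksums are computed or verified, and a reader can locate the damaged chunk without rereading the whole block. Whether that trade-off is the reason for the separate file in HDFS is what the JIRAs below discuss.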

A beginning: https://issues.apache.org/jira/browse/HADOOP-1134
A (dead?) discussion: https://issues.apache.org/jira/browse/HDFS-2699

Regards

Bertrand



On Mon, Jul 15, 2013 at 8:23 PM, John Lilley <jo...@redpoint.net> wrote:

> I was looking at the HDFS block storage and noticed a couple things (1)
> all block files are in a flat directory structure (2) there is a “meta”
> file for each block file.  This leads me to ask:
>
> -- Where can I find good reading that describes this level of HDFS
> internals?
>
> -- Is the flat storage sufficient to handle millions of blocks?
>
> -- What is in the meta file?  Why isn’t it just a header or trailer on the
> data file?
>
> Thanks
>
> John



-- 
Bertrand Dechoux
