Posted to common-dev@hadoop.apache.org by "eric baldeschwieler (JIRA)" <ji...@apache.org> on 2006/11/26 01:09:03 UTC

[jira] Commented: (HADOOP-746) CRC computation and reading should move into a nested FileSystem

    [ http://issues.apache.org/jira/browse/HADOOP-746?page=comments#action_12452615 ] 
            
eric baldeschwieler commented on HADOOP-746:
--------------------------------------------

Interesting.

This seems like it would have advantages in how we manage temporary storage and the like.  BUT... I think HDFS needs to support CRCs for all files "under the covers".  I don't think we should rip that out and leave it to user code to invoke CRCs.

The FS needs CRCs to manage replication and validation, and it should have a uniform internal mechanism for them.  I don't know that these CRCs need to be user accessible, but I do think it is necessary that all blocks are checksummed in the same simple way.
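
To make "the same simple way" concrete, here is a minimal sketch of per-chunk checksumming using the JDK's java.util.zip.CRC32. The class name ChunkCrc, its method names, and the bytesPerChecksum parameter are hypothetical and chosen for illustration only; this is not the HDFS implementation, just one uniform scheme a file system could apply to every block so that a replica can be validated without user code being involved.

```java
import java.util.zip.CRC32;

public class ChunkCrc {
    // Compute one CRC32 per fixed-size chunk of the data.
    // bytesPerChecksum is a hypothetical tuning parameter.
    public static long[] checksums(byte[] data, int bytesPerChecksum) {
        if (bytesPerChecksum <= 0) {
            throw new IllegalArgumentException("bytesPerChecksum must be positive");
        }
        int chunks = (data.length + bytesPerChecksum - 1) / bytesPerChecksum;
        long[] sums = new long[chunks];
        CRC32 crc = new CRC32();
        for (int i = 0; i < chunks; i++) {
            int off = i * bytesPerChecksum;
            int len = Math.min(bytesPerChecksum, data.length - off);
            crc.reset();
            crc.update(data, off, len);
            sums[i] = crc.getValue();
        }
        return sums;
    }

    // Return the index of the first chunk whose checksum does not match
    // the expected value, or -1 if every chunk validates.
    public static int firstCorruptChunk(byte[] data, int bytesPerChecksum, long[] expected) {
        long[] actual = checksums(data, bytesPerChecksum);
        int n = Math.min(actual.length, expected.length);
        for (int i = 0; i < n; i++) {
            if (actual[i] != expected[i]) {
                return i;
            }
        }
        // A differing chunk count also counts as corruption.
        return actual.length == expected.length ? -1 : n;
    }
}
```

Because every block would be checksummed identically, a replica could be validated by recomputing the sums and comparing them, whether or not the values are ever exposed to users.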

Nor is that URL very friendly...  Also, nominally the URI prefix is supposed to specify the protocol / transport, right?  CRC seems like it belongs below the transport, not wrapping it.  Odd.

> CRC computation and reading should move into a nested FileSystem
> ----------------------------------------------------------------
>
>                 Key: HADOOP-746
>                 URL: http://issues.apache.org/jira/browse/HADOOP-746
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.8.0
>            Reporter: Owen O'Malley
>         Assigned To: Owen O'Malley
>
> Currently FileSystem provides both an interface and a mechanism for computing and checking crc files. I propose splitting the crc code into a nestable FileSystem that, like the PhasedFileSystem, has a backing FileSystem. Once the Paths are converted to URIs, this is fairly natural to express. To use crc files, your URIs will look like:
> crc://hdfs:%2f%2fhost1:8020/ which is a crc FileSystem with an underlying file system of hdfs://host1:8020
> This will allow users to use crc files where they make sense for their application/cluster and get rid of the "raw" methods.
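
For readers puzzling over the nested URI form quoted above, the following is a small sketch of how the crc://hdfs:%2f%2fhost1:8020/ string relates to the underlying hdfs://host1:8020 URI. The class NestedCrcUri and its wrap/unwrap helpers are hypothetical names used only for illustration; the issue does not say how the nesting would actually be parsed, and this sketch assumes that only the slashes of the inner URI are percent-encoded.

```java
public class NestedCrcUri {
    // Scheme prefix taken from the example in the issue description.
    private static final String PREFIX = "crc://";

    // Embed an underlying file system URI in the authority position,
    // percent-encoding its slashes so it stays a single component.
    public static String wrap(String innerUri) {
        return PREFIX + innerUri.replace("/", "%2f") + "/";
    }

    // Recover the underlying file system URI from a crc:// URI.
    public static String unwrap(String crcUri) {
        if (!crcUri.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not a crc URI: " + crcUri);
        }
        String rest = crcUri.substring(PREFIX.length());
        int slash = rest.indexOf('/');
        String authority = slash < 0 ? rest : rest.substring(0, slash);
        // Accept either case for the hex digits of the escape.
        return authority.replace("%2f", "/").replace("%2F", "/");
    }

    public static void main(String[] args) {
        String wrapped = wrap("hdfs://host1:8020");
        System.out.println(wrapped);
        System.out.println(unwrap(wrapped));
    }
}
```

The round trip shows the usability concern raised in the comment: the backing file system has to be escaped into the outer URI by hand, and the crc prefix occupies the slot that normally names the transport.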
