Posted to common-issues@hadoop.apache.org by "Gregory Farnum (JIRA)" <ji...@apache.org> on 2009/09/11 22:24:57 UTC

[jira] Updated: (HADOOP-6253) Add a Ceph FileSystem interface.

     [ https://issues.apache.org/jira/browse/HADOOP-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gregory Farnum updated HADOOP-6253:
-----------------------------------

    Status: Patch Available  (was: Open)

I've attached a patch which includes the CephFileSystem and IOStream classes, as well as package documentation. To actually use it you're going to need an installation of Ceph (ceph.newdream.net).
I have *not* included any unit tests: the code depends on the libhadoopceph shared library, and without a Ceph install testing seems sort of pointless -- about all I can see to do is verify that calling the methods throws an IOException because the filesystem is uninitialized. Still, most of the other filesystems came up with something, so if you have any suggestions for useful test cases, let me know and I can add them. :)
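
For illustration only, here is a rough sketch of the shape such an "uninitialized" check could take. It does not touch the real CephFileSystem or libhadoopceph: StubFileSystem, IoAction, and throwsIOException are hypothetical names made up for this example, and the stub only assumes that an uninitialized client rejects calls with an IOException.

```java
import java.io.IOException;

public class UninitializedCheckSketch {

    /** An operation that may fail with an IOException. */
    interface IoAction {
        void run() throws IOException;
    }

    /** Hypothetical stand-in for the client; NOT the CephFileSystem from the patch. */
    static class StubFileSystem {
        private boolean initialized = false;

        void initialize() {
            initialized = true;
        }

        boolean exists(String path) throws IOException {
            // Assumption: an uninitialized client refuses every call.
            if (!initialized) {
                throw new IOException("filesystem not initialized");
            }
            return false; // the stub holds no files
        }
    }

    /** Returns true if the action threw an IOException, false if it completed normally. */
    static boolean throwsIOException(IoAction action) {
        try {
            action.run();
            return false;
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        StubFileSystem fs = new StubFileSystem();
        // Before initialize(): the call is expected to be rejected.
        System.out.println("rejected before init: " + throwsIOException(() -> fs.exists("/tmp/a")));
        fs.initialize();
        // After initialize(): the same call should complete normally.
        System.out.println("rejected after init: " + throwsIOException(() -> fs.exists("/tmp/a")));
    }
}
```

A real test would replace the stub with the actual filesystem class and would probably live in whatever test framework the other filesystems use; the helper only exists to keep the try/catch out of each individual check.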

In very basic testing (~900MB and ~6GB worth of data), this patch with the current Ceph code is roughly equivalent in speed to HDFS when running a mapred job via the hadoop-examples jar from .20, using the default values for both systems; Ceph tends to be slightly faster in a put and slightly slower in the mapred (~3:35 versus ~3:20 on the 6GB test case). However, Ceph, while still highly experimental and in development, is a full filesystem with a Linux kernel client and a full userspace client; it also distinguishes itself from HDFS by having no single point of failure -- it uses a Paxos-based monitor cluster for managing state and multiple metadata servers instead of the single HDFS namenode (though of course you can also run the entire system on one machine).

> Add a Ceph FileSystem interface.
> --------------------------------
>
>                 Key: HADOOP-6253
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6253
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Gregory Farnum
>            Priority: Minor
>
> The experimental distributed filesystem Ceph does not have a single point of failure, and might be of use to some Hadoop users.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.