Posted to user@hadoop.apache.org by Julien Laurenceau <ju...@pepitedata.com> on 2019/05/21 07:47:54 UTC

Deploy HDFS (on openstack VM) over Ceph (bare metal) on same hardware

Hi,

I am designing an elastic rack made up of compute nodes, storage nodes and
OpenStack control nodes.

I plan to deploy:

   - Ceph on bare metal, with ceph-osd on the storage nodes and the monitor
   and metadata services on the control nodes.
   - OpenStack to orchestrate resources spanning the compute and storage
   nodes.
   - Hadoop workers on VMs instantiated on the storage nodes, using Ceph
   for storage through Cinder volumes.
   - Hadoop masters on VMs instantiated on the storage nodes, using Ceph
   for storage through Cinder volumes.
   - Spark, k8s and a bunch of tools on VMs instantiated on the compute
   nodes, using Ceph for storage through Cinder volumes. The workload is
   multi-tenant.

I am really concerned about layering HDFS on top of bare-metal Ceph on the
same physical nodes. This design is driven mainly by a lack of resources, and
it seems to me that it simply cannot reliably withstand load.
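
To put a rough number on that concern, here is a minimal sketch of the storage
overhead when HDFS replication is stacked on Ceph replication. It assumes the
stock defaults (HDFS dfs.replication = 3, Ceph replicated pool size = 3); I
have not settled on either value for this rack, and the 90 TB raw capacity and
the function names are purely illustrative.

```python
def raw_copies_per_logical_byte(hdfs_replication: int, ceph_pool_size: int) -> int:
    """Physical copies kept for each logical byte written to HDFS when
    every DataNode disk is itself a Ceph-backed (Cinder) volume."""
    # Each HDFS replica is a block on a Ceph volume, and Ceph replicates
    # that block again, so the two factors multiply.
    return hdfs_replication * ceph_pool_size


def usable_capacity_tb(raw_tb: float, hdfs_replication: int, ceph_pool_size: int) -> float:
    """Logical HDFS capacity left from a given raw disk capacity."""
    return raw_tb / raw_copies_per_logical_byte(hdfs_replication, ceph_pool_size)


if __name__ == "__main__":
    # Assumed defaults: HDFS replication 3, Ceph pool size 3.
    print(raw_copies_per_logical_byte(3, 3))   # 9
    # Hypothetical 90 TB of raw disk across the storage nodes.
    print(usable_capacity_tb(90.0, 3, 3))      # 10.0
```

Lowering dfs.replication to 1 and relying on Ceph alone would bring this back
to 3 copies, at the cost of HDFS no longer tolerating the loss of a DataNode VM
or its volume on its own. This only covers capacity; it says nothing about
network or disk contention between the two layers.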

I have not reached the stress-testing stage yet, but it seems to me that
running HDFS on bare metal alongside ceph-osd on bare metal, each on its own
set of nodes, is mandatory.

What pros and cons do you see in these designs?

Regards


PS:
https://superuser.com/questions/1437971/deploy-hdfs-on-openstack-vm-over-ceph-bare-metal-on-same-hardware