Posted to users@cloudstack.apache.org by Bryan Whitehead <dr...@megahappy.net> on 2013/03/21 00:31:48 UTC

CloudStack + GlusterFS over Infiniband

I've gotten some requests to give some idea of how to set up CloudStack
with GlusterFS and what kind of numbers can be expected. I'm working
on a more complete writeup, but thought I'd send something to the
mailing list first so I can get an understanding of what questions
people have.

Since I'm adding another (small) cluster to my zone I wanted to get
some hardware numbers out there and disk access speeds.

Hardware consists of two servers, each with the following config:
1x 6-core Xeon E5-1650 @ 3.2GHz (shows up as 12 CPUs in /proc/cpuinfo due to hyperthreading)
64GB RAM
RAID-10 of 4 SAS disks @ 3TB each
InfiniBand: Mellanox MT26428 @ 40Gb/sec

I get ~300MB/sec disk write speeds on the raw xfs-backed filesystem.
Command used:
dd if=/dev/zero of=/gluster/qcow/temp.$SIZE count=$SIZE bs=1M oflag=sync
SIZE is usually 20000 to 40000 (i.e. a 20-40GB test file) when I run my tests.
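(For example, with SIZE filled in as 20000 this is a 20GB synchronous write
test; the expanded command below isn't in the original mail, it's just the
one above with the variable substituted:)
dd if=/dev/zero of=/gluster/qcow/temp.20000 count=20000 bs=1M oflag=sync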
My xfs filesystem was built with these options:
mkfs.xfs -i size=512 /dev/vg_kvm/glust0

I mount the xfs volume with these options:
/dev/vg_kvm/glust0 /gluster/0 xfs defaults,inode64 0 0
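(Not from the original mail: to test the same mount by hand before adding
the fstab entry, the equivalent one-off command should be:)
mount -o inode64 /dev/vg_kvm/glust0 /gluster/0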

Here is the output of gluster volume info for my volume:
Volume Name: custqcow
Type: Replicate
Volume ID: d8d8570c-73ba-4b06-811e-2030d601cfaa
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 172.16.2.13:/gluster/0
Brick2: 172.16.2.14:/gluster/0
Options Reconfigured:
performance.io-thread-count: 64
nfs.disable: on
performance.least-prio-threads: 8
performance.normal-prio-threads: 32
performance.high-prio-threads: 64
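
(The commands used to create the volume aren't in the original mail; for
reference, a two-brick replicated volume with the options shown above could
be created and tuned roughly like this:)
gluster volume create custqcow replica 2 transport tcp 172.16.2.13:/gluster/0 172.16.2.14:/gluster/0
gluster volume start custqcow
gluster volume set custqcow nfs.disable on
gluster volume set custqcow performance.io-thread-count 64
gluster volume set custqcow performance.high-prio-threads 64
gluster volume set custqcow performance.normal-prio-threads 32
gluster volume set custqcow performance.least-prio-threads 8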

here is my mount entry in /etc/fstab:
172.16.2.13:custqcow /gluster/qcow2 glusterfs defaults,_netdev 0 0
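(Also not from the original mail: the same mount done by hand, assuming the
glusterfs fuse client is installed on the host, would be roughly:)
mount -t glusterfs 172.16.2.13:custqcow /gluster/qcow2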

After adding the gluster layer (fuse mount), write speeds per process are
at ~150MB/sec.
If I run the above dd command 3 times simultaneously I get ~100MB/sec per
dd. Adding more reduces the rate proportionally and evenly as the dd's
compete for IO over the glusterfs fuse mountpoint. In other words, while
one process with one filehandle cannot max out the underlying disks'
maximum speed, many processes collectively give me the same aggregate
speed through the gluster layer as the raw filesystem. I can easily get
full IO out of my underlying disks with many VMs running.
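(One way to reproduce the multi-writer test; this loop is not from the
original mail and assumes the fuse mountpoint /gluster/qcow2 from the fstab
entry above:)
for i in 1 2 3; do
  dd if=/dev/zero of=/gluster/qcow2/temp.$i count=20000 bs=1M oflag=sync &
done
wait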

here is output from mount on 1 of the boxes:
/dev/mapper/system-root on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /boot type ext4 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
/dev/mapper/vg_kvm-glust0 on /gluster/0 type xfs (rw,inode64)
172.16.2.13:custqcow on /gluster/qcow2 type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)

here is a df:
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/system-root
                       81G  1.6G   76G   3% /
tmpfs                  32G     0   32G   0% /dev/shm
/dev/sda1             485M   52M  408M  12% /boot
/dev/mapper/vg_kvm-glust0
                      4.0T   33M  4.0T   1% /gluster/0
172.16.2.13:custqcow  4.0T   33M  4.0T   1% /gluster/qcow2

NOTES: I have larger CloudStack clusters in production with similar
setups, but those volumes are Distributed-Replicate (6 bricks with replica
2). Native Infiniband/RDMA is currently extremely crappy in gluster - at
best I've been able to get 45MB/sec per process, with higher load.
Everything above is IPoIB. GlusterFS version is 3.3.1.
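(Not from the original mail: a 6-brick Distributed-Replicate volume with
replica 2 like the one described would be created along these lines; the
volume name and host/brick paths here are purely hypothetical:)
gluster volume create prodqcow replica 2 transport tcp \
  host1:/gluster/0 host2:/gluster/0 host3:/gluster/0 \
  host4:/gluster/0 host5:/gluster/0 host6:/gluster/0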

I run the cloud-agent and qemu-kvm on CentOS 6.3 (old cluster). This new
cluster is qemu-kvm on CentOS 6.4. Primary storage is a SharedMountPoint
pointing at /gluster/qcow2/images.

-Bryan

Re: CloudStack + GlusterFS over Infiniband

Posted by Bryan Whitehead <dr...@megahappy.net>.
I've had to put this cluster into production already, but I'll have
hardware for my lab at the end of this month or the beginning of April.

On Thu, Mar 21, 2013 at 5:09 AM, Jason Davis <sc...@gmail.com> wrote:
> Are you planning on including observations on IOPS and latency? Would be
> curious to see what performance penalty is incurred when you have a brick
> failure.
>
> I agree, having a writeup will be awesome. Thanks for your hard work!

Re: CloudStack + GlusterFS over Infiniband

Posted by Andreas Huser <ah...@7five-edv.de>.
Hi Bryan,

I take a slightly different approach to building GlusterFS + InfiniBand.
My first question is: why don't you use RDMA when you have InfiniBand?


My GlusterFS setup consists of two parts:
1.) Solaris ZFS storage subsystem
2.) GlusterFS server

ZFS adds extra features to your GlusterFS:
compression, thin provisioning, deduplication and space reclaim on the ZFS storage backend, to name only a few.

ZFS requirements:
- At least 2GB RAM per TB of storage
- A pool of more than one vdev, e.g. pool1 with 2+n raidz vdevs. Create at least two raidz vdevs of 3 HDDs each; every additional vdev adds IOPS, so you can grow in both capacity and performance (see the sketch after this list).
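(A sketch only, not from Andreas' mail: on Solaris/illumos a pool with two
3-disk raidz vdevs and the features he mentions could look roughly like
this; the pool, dataset and disk names are hypothetical:)
zpool create pool1 raidz c1t0d0 c1t1d0 c1t2d0 raidz c1t3d0 c1t4d0 c1t5d0
zfs create -V 2T pool1/gvol0
zfs set compression=on pool1/gvol0
zfs set dedup=on pool1/gvol0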

In short:
1.) Create a ZFS volume (zvol) and enable compression and dedup.
2.) Export this volume over SRP.
3.) Import this SRP volume on your GlusterFS server and format it as ext4 or ext3
(not xfs! I have reported a bug: https://bugzilla.redhat.com/show_bug.cgi?id=874348).
4.) Create a GlusterFS volume with transport rdma and tcp (see the sketch after this list).
5.) Mount the gluster volume on your hypervisor (KVM) over rdma.
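(Again just a rough sketch, not from Andreas' mail; the device, brick path
and volume name are hypothetical, and the server address simply reuses the
one from Bryan's mail for illustration. The SRP export/import itself is
left out since it depends on the Solaris target setup. The gluster side of
steps 3-5 could look something like:)
mkfs.ext4 /dev/sdX
mount /dev/sdX /bricks/gvol0
gluster volume create gvol0 transport tcp,rdma 172.16.2.13:/bricks/gvol0
gluster volume start gvol0
mount -t glusterfs 172.16.2.13:/gvol0.rdma /mnt/gvol0   # the .rdma suffix selects the rdma transport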

When you need more IOPS, add more vdevs to your ZFS backend, or add a ZIL device.

My question is not how many IOPS I get; my question is what the best way is to build flexible and performant storage for CloudStack.
I think this is a good way.

Many thanks,
Regards, Andreas


RE: CloudStack + GlusterFS over Infiniband

Posted by Clayton Weise <cw...@iswest.net>.
I would love to see how GlusterFS + Infiniband + CloudStack does when VMs are performing smaller sequential and non-sequential reads and writes in addition to the larger sequential operations observable with dd.  Thanks for posting this though, it's extremely valuable data.


Re: CloudStack + GlusterFS over Infiniband

Posted by Jason Davis <sc...@gmail.com>.
Are you planning on including observations on IOPS and latency? Would be
curious to see what performance penalty is incurred when you have a brick
failure.

I agree, having a writeup will be awesome. Thanks for your hard work!

Re: CloudStack + GlusterFS over Infiniband

Posted by Ahmad Emneina <ae...@gmail.com>.
No real questions here, just eager to check out the writeup. This seems insanely valuable to have out there for CloudStack users.