Posted to mapreduce-user@hadoop.apache.org by Colin Kincaid Williams <di...@uw.edu> on 2014/10/06 23:46:41 UTC

ext4 on a hadoop cluster datanodes

Hi,

I'm trying to figure out the ideal settings for using ext4 on Hadoop
cluster datanodes. The Hadoop site recommends setting the nodelalloc
option in the fstab. Is that still the preferred option?

I read elsewhere to disable the ext4 journal, and use data=writeback.

http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html

Finally, in some slides I read to use dir_index,sparse_super,extent when
creating the filesystem, and to mount with noatime and nodiratime.

http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud

Re: ext4 on a hadoop cluster datanodes

Posted by Travis <hc...@ghostar.org>.

On Tue, Oct 7, 2014 at 7:11 PM, Colin Kincaid Williams <di...@uw.edu>
wrote:

> Hi Travis,
>
> Are you using SSDs or spinning disks in your configuration?
>
We use 7200 rpm NL-SAS drives from various manufacturers: primarily
Seagate and Hitachi, plus (I believe) a handful of Fujitsu and Toshiba
drives from replacements over time.

I believe everything is 6 Gbps SAS, but I'd have to check again.

Travis

-- 
Travis Campbell
travis@ghostar.org

Re: ext4 on a hadoop cluster datanodes

Posted by Colin Kincaid Williams <di...@uw.edu>.

Hi Travis,

Are you using SSDs or spinning disks in your configuration?

Thanks,

Colin Williams

On Mon, Oct 6, 2014 at 3:09 PM, Travis <hc...@ghostar.org> wrote:

> For filesystem creation, we use the following with mkfs.ext4
>
> mkfs.ext4 -T largefile -m 1 -O dir_index,extent,sparse_super -L $HDFS_LABEL /dev/${DEV}1
>
> By default, mkfs creates way too many inodes, so we tune it a bit with the
> "largefile" option, which modifies the inode_ratio.  This gives us ~2
> million usable inodes on a 2TB filesystem.
>
> As well, by default, mkfs sets the block reserve to 5%, which wastes a
> fair amount of space, since this space is only accessible to the root
> user.  We tune this down to 1% at mkfs time, but you can use tune2fs at
> runtime to change it.
>
> I don't know that I would use writeback. This mode is problematic in the
> event of a crash because it can allow old data to exist on the FS, but with
> new metadata.  I consider this corruption.  Unless you know your
> environment to be super stable (meaning no OS or hardware-induced crashes)
> AND you have stable, UPS-backed power, I would steer clear of this.
>
> If you're looking for the utmost in filesystem performance, you're better
> off looking at the controller card you're using.  Right now, we're using
> LSI9207-8i and seeing an aggregate 1.6-1.8GBytes/sec throughput across 12
> drives in JBOD.  Our older LSI-based cards can only sustain maybe a quarter
> of that in the same disk configuration.
>
> Travis
>
> On Mon, Oct 6, 2014 at 4:46 PM, Colin Kincaid Williams <di...@uw.edu>
> wrote:
>
>> Hi,
>>
>> I'm trying to figure out what are more ideal settings for using ext4 on
>> hadoop cluster datanodes. From the hadoop site its recommended nodelalloc
>> option is chosen in the fstab. Is that still a preferred option?
>>
>> I read elsewhere to disable the ext4 journal, and use data=writeback.
>>
>> http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html
>>
>> Finally, in some slides i read to use dir_index,sparse_super,extent when
>> creating the filesystem, and mount noatime and nodiratime
>>
>>
>> http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud
>>
>
>
> --
> Travis Campbell
> travis@ghostar.org
>

Re: ext4 on a hadoop cluster datanodes

Posted by Travis <hc...@ghostar.org>.

On Wed, Nov 12, 2014 at 10:47 AM, Brian C. Huffman <
bhuffman@etinternational.com> wrote:

>  Would this set of ext4 parameters be ok for a 500GB HDFS data drive?
>

Probably?  It would depend on how many blocks you end up having stored from
HDFS on each specific ext4 mount point.  The largefile option heavily tunes
down the number of inodes that get created at mkfs time, so you might have
to tweak that.  I haven't seen any specific problems in my environment that
would indicate the options to be bad, horrible, or causing unexpected
explosions.
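
As a rough way to reason about the 500GB case, here is a minimal sketch
comparing the inodes the filesystem would have with the files HDFS would
need if the drive were filled with full-size blocks. Every number in it is
an assumption for illustration: the stock mke2fs.conf ratio of 1048576 bytes
per inode for -T largefile, a 128 MB HDFS block size, and two files per
block replica (the block file and its .meta checksum file).

```shell
#!/bin/sh
# Compare inodes available with files HDFS would need on one mount.
# All figures below are assumptions for illustration, not measurements.
disk_bytes=500000000000                # 500 GB drive, decimal bytes
inode_ratio=1048576                    # assumed -T largefile bytes-per-inode
block_bytes=$(( 128 * 1024 * 1024 ))   # assumed HDFS block size
files_per_block=2                      # block file plus its .meta file

inodes=$(( disk_bytes / inode_ratio ))
full_blocks=$(( disk_bytes / block_bytes ))
files_needed=$(( full_blocks * files_per_block ))

echo "inodes available:      $inodes"
echo "files for full blocks: $files_needed"

# Once the drive holds data, compare against real usage with:
#   df -i /path/to/hdfs/mount
```

With mostly full blocks the inode budget is far from tight; a workload
with many small files (and therefore many small blocks) would change that
picture, which is why checking df -i on a live datanode is worthwhile.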

Try it out and let us know!

Travis

-- 
Travis Campbell
travis@ghostar.org

Re: ext4 on a hadoop cluster datanodes

Posted by "Brian C. Huffman" <bh...@etinternational.com>.

Would this set of ext4 parameters be ok for a 500GB HDFS data drive?

Thanks,
Brian

On 10/06/2014 06:09 PM, Travis wrote:
> For filesystem creation, we use the following with mkfs.ext4
>
> mkfs.ext4 -T largefile -m 1 -O dir_index,extent,sparse_super -L $HDFS_LABEL /dev/${DEV}1
>
> By default, mkfs creates way too many inodes, so we tune it a bit with 
> the "largefile" option, which modifies the inode_ratio.  This gives us 
> ~2 million usable inodes on a 2TB filesystem.
>
> As well, by default, mkfs sets the block reserve to 5%, which wastes a 
> fair amount of space, since this space is only accessible to the root 
> user.  We tune this down to 1% at mkfs time, but you can use tune2fs 
> at runtime to change it.
>
> I don't know that I would use writeback. This mode is problematic in 
> the event of a crash because it can allow old data to exist on the FS, 
> but with new metadata.  I consider this corruption.  Unless you know 
> your environment to be super stable (meaning no OS or hardware-induced 
> crashes) AND you have stable, UPS-backed power, I would steer clear of 
> this.
>
> If you're looking for the utmost in filesystem performance, you're 
> better off looking at the controller card you're using.  Right now, 
> we're using LSI9207-8i and seeing an aggregate 1.6-1.8GBytes/sec 
> throughput across 12 drives in JBOD.  Our older LSI-based cards can 
> only sustain maybe a quarter of that in the same disk configuration.
>
> Travis
>
> On Mon, Oct 6, 2014 at 4:46 PM, Colin Kincaid Williams <discord@uw.edu> wrote:
>
>     Hi,
>
>     I'm trying to figure out what are more ideal settings for using
>     ext4 on hadoop cluster datanodes. From the hadoop site its
>     recommended nodelalloc option is chosen in the fstab. Is that
>     still a preferred option?
>
>     I read elsewhere to disable the ext4 journal, and use data=writeback.
>
>     http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html
>
>     Finally, in some slides i read to use
>     dir_index,sparse_super,extent when creating the filesystem, and
>     mount noatime and nodiratime
>
>     http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud
>
> -- 
> Travis Campbell
> travis@ghostar.org

Re: ext4 on a hadoop cluster datanodes

Posted by Travis <hc...@ghostar.org>.

For filesystem creation, we use the following with mkfs.ext4:

mkfs.ext4 -T largefile -m 1 -O dir_index,extent,sparse_super -L $HDFS_LABEL /dev/${DEV}1

By default, mkfs creates way too many inodes, so we tune it a bit with the
"largefile" option, which modifies the inode_ratio.  This gives us ~2
million usable inodes on a 2TB filesystem.
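
As a rough check on the ~2 million figure, here is a minimal sketch of the
arithmetic. It assumes the stock mke2fs.conf values (an inode_ratio of
1048576 bytes for -T largefile and 16384 for the default type) and a 2TB
drive counted as 2 * 10^12 bytes. The helper name estimate_inodes is only
for illustration; real counts differ slightly because mkfs rounds per
block group.

```shell
#!/bin/sh
# Rough inode estimate: one inode per inode_ratio bytes of filesystem.
# estimate_inodes is a hypothetical helper, not part of e2fsprogs.
estimate_inodes() {
    size_bytes=$1
    inode_ratio=$2
    echo $(( size_bytes / inode_ratio ))
}

fs_bytes=2000000000000   # assumed: a 2 TB drive, decimal bytes

# Assumed stock mke2fs.conf ratios: largefile = 1 MiB, default = 16 KiB.
echo "largefile: $(estimate_inodes "$fs_bytes" 1048576) inodes"
echo "default:   $(estimate_inodes "$fs_bytes" 16384) inodes"
```

Under those assumptions the largefile type yields roughly 1.9 million
inodes, against roughly 122 million for the default ratio.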

As well, by default, mkfs sets the block reserve to 5%, which wastes a fair
amount of space, since this space is only accessible to the root user.  We
tune this down to 1% at mkfs time, but you can use tune2fs at runtime to
change it.
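
To put numbers on the reserve, here is a small sketch of how much space
each percentage holds back on a 2TB drive, followed (in comments) by the
tune2fs invocation for an existing filesystem. The helper reserved_gb and
the device name /dev/sdb1 are placeholders, not anything from our setup.

```shell
#!/bin/sh
# Space held back for root at a given reserve percentage.
# reserved_gb is a hypothetical helper; sizes are in whole GB.
reserved_gb() {
    size_gb=$1
    percent=$2
    echo $(( size_gb * percent / 100 ))
}

echo "5% of 2000 GB: $(reserved_gb 2000 5) GB"
echo "1% of 2000 GB: $(reserved_gb 2000 1) GB"

# On an existing filesystem (run as root; /dev/sdb1 is a placeholder):
#   tune2fs -m 1 /dev/sdb1
#   tune2fs -l /dev/sdb1 | grep -i 'reserved block count'
```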

I don't know that I would use writeback. This mode is problematic in the
event of a crash because it can allow old data to exist on the FS, but with
new metadata.  I consider this corruption.  Unless you know your
environment to be super stable (meaning no OS or hardware-induced crashes)
AND you have stable, UPS-backed power, I would steer clear of this.
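
If you want to confirm that no datanode mount has ended up in writeback
mode, a check along these lines may help. It is only a sketch: the helper
has_writeback is hypothetical, and it sees only the options the kernel
lists in /proc/mounts, so a data mode set as a superblock default may not
show up there.

```shell
#!/bin/sh
# Classify a comma-separated mount option string.
# has_writeback is a hypothetical helper for illustration.
has_writeback() {
    case ",$1," in
        *,data=writeback,*) echo "writeback" ;;
        *)                  echo "ok" ;;
    esac
}

# /proc/mounts fields: device, mount point, fs type, options, ...
if [ -r /proc/mounts ]; then
    awk '$3 == "ext4" { print $2, $4 }' /proc/mounts |
    while read -r mount_point opts; do
        echo "$mount_point: $(has_writeback "$opts")"
    done
fi
```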

If you're looking for the utmost in filesystem performance, you're better
off looking at the controller card you're using.  Right now, we're using
an LSI 9207-8i and seeing an aggregate 1.6-1.8 GB/s of throughput across 12
drives in JBOD.  Our older LSI-based cards can only sustain maybe a quarter
of that in the same disk configuration.
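
To make the controller comparison concrete, the aggregate figures can be
turned into a per-drive share. This is a sketch of the arithmetic only: it
assumes 1 GB = 1000 MB, takes "a quarter" literally as 400-450 MB/s, and
uses a hypothetical helper named per_drive_mbs.

```shell
#!/bin/sh
# Per-drive share of an aggregate throughput figure, in whole MB/s.
# per_drive_mbs is a hypothetical helper.
per_drive_mbs() {
    aggregate_mbs=$1
    drives=$2
    echo $(( aggregate_mbs / drives ))
}

echo "newer card: $(per_drive_mbs 1600 12)-$(per_drive_mbs 1800 12) MB/s per drive"
echo "older card: $(per_drive_mbs 400 12)-$(per_drive_mbs 450 12) MB/s per drive"
```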

Travis

On Mon, Oct 6, 2014 at 4:46 PM, Colin Kincaid Williams <di...@uw.edu>
wrote:

> Hi,
>
> I'm trying to figure out what are more ideal settings for using ext4 on
> hadoop cluster datanodes. From the hadoop site its recommended nodelalloc
> option is chosen in the fstab. Is that still a preferred option?
>
> I read elsewhere to disable the ext4 journal, and use data=writeback.
>
> http://fenidik.blogspot.com/2010/03/ext4-disable-journal.html
>
> Finally, in some slides i read to use dir_index,sparse_super,extent when
> creating the filesystem, and mount noatime and nodiratime
>
>
> http://www.slideshare.net/leonsp/best-practices-for-deploying-hadoop-biginsights-in-the-cloud
>

-- 
Travis Campbell
travis@ghostar.org
