Posted to hdfs-dev@hadoop.apache.org by haosdent <ha...@gmail.com> on 2013/08/25 07:11:51 UTC

hsync is too slower than hflush

Hi, all. Hadoop has supported hsync, which calls the system's fsync, since 2.0.2. I have repeatedly tested the performance of hsync() and hflush(), and found that each hsync() call takes nearly 50ms while each hflush() call takes only 2ms. In this slide deck (http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage Page 18), the author mentions that hsync() is 2x slower than hflush(). So, is anything wrong? Thank you very much; I look forward to your help.

-- 
Best Regards,
Haosong Huang

Re: hsync is too slower than hflush

Posted by Andrew Wang <an...@cloudera.com>.

The DataNode is also syncing the checksum file, so the disk head very likely
has to move. There is rotational delay too.
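
To put a number on the rotational part, here is a small sketch. It assumes a spinning disk and uses the standard rule of thumb that the average rotational delay is half of one revolution. The drive speeds are hypothetical examples, not values from this thread.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector to come around: half a revolution."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


# Hypothetical drive speeds, chosen only for illustration.
for rpm in (5400, 7200, 15000):
    latency = avg_rotational_latency_ms(rpm)
    print(f"{rpm} RPM: ~{latency:.2f} ms average rotational latency")
```

On a 7200 RPM drive this alone is a little over 4ms per sync, before any head movement is counted.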


On Mon, Aug 26, 2013 at 7:30 AM, lei liu <li...@gmail.com> wrote:

> Hi all,
>
> The DataNode writes files sequentially, so I think the disk seek time should
> be very small. Why is the disk seek time 10ms? I think that is too long. Can
> we tune the Linux system configuration to reduce disk seek time?
>
>

Re: hsync is too slower than hflush

Posted by lei liu <li...@gmail.com>.

Hi all,

The DataNode writes files sequentially, so I think the disk seek time should
be very small. Why is the disk seek time 10ms? I think that is too long. Can
we tune the Linux system configuration to reduce disk seek time?


2013/8/26 haosdent <ha...@gmail.com>

> haha, thank you very much, I get it now.
>

Re: hsync is too slower than hflush

Posted by haosdent <ha...@gmail.com>.

haha, thank you very much, I get it now.

-- 
Best Regards,
Haosong Huang


On Monday, August 26, 2013 at 11:18 AM, Andrew Wang wrote:

> Ah, I forgot the checksum fsync, so two seeks. Even with 4k writes, 50ms
> still feels in the right ballpark. Best case it's ~20ms, still way slower
> than hflush.
> 
> It's also worth asking if there's other dirty data waiting for writeback,
> since I believe it can also get written out on an fsync.
> 
> hflush doesn't durably write to disk, so you're still in danger of losing
> data if there's a cluster-wide power outage. Because HDFS writes to two
> different racks, hflush still protects you from single-rack outages. Most
> people think this is good enough (I believe HBase by default runs with just
> hflush), but if you *really* want to be sure, pay the cost of hsync and do
> durable writes.
> 
> 

Re: hsync is too slower than hflush

Posted by Andrew Wang <an...@cloudera.com>.

Ah, I forgot the checksum fsync, so two seeks. Even with 4k writes, 50ms
still feels in the right ballpark. Best case it's ~20ms, still way slower
than hflush.
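
As a rough check of the two-seek estimate above, the sketch below adds a fixed seek cost per fsync to the transfer time for the payload. The function name is made up for illustration, and the 10ms seek and 100MB/s throughput are the round numbers used earlier in this thread, not measurements.

```python
def estimate_sync_ms(payload_bytes: int,
                     fsync_count: int = 2,
                     seek_ms: float = 10.0,
                     write_mb_per_s: float = 100.0) -> float:
    """Very rough model: one seek per fsync plus sequential transfer time."""
    transfer_ms = payload_bytes / (write_mb_per_s * 1_000_000) * 1000.0
    return fsync_count * seek_ms + transfer_ms


# 4 KiB payload, two fsyncs (data file and checksum file).
print(f"{estimate_sync_ms(4 * 1024):.2f} ms")
```

With a 4k payload the transfer term is negligible, so the total is dominated by the two seeks. Rotational delay and any other dirty data flushed by the same fsync would push a real measurement higher.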

It's also worth asking if there's other dirty data waiting for writeback,
since I believe it can also get written out on an fsync.

hflush doesn't durably write to disk, so you're still in danger of losing
data if there's a cluster-wide power outage. Because HDFS writes to two
different racks, hflush still protects you from single-rack outages. Most
people think this is good enough (I believe HBase by default runs with just
hflush), but if you *really* want to be sure, pay the cost of hsync and do
durable writes.


On Sun, Aug 25, 2013 at 7:44 PM, haosdent <ha...@gmail.com> wrote:

> In fact, I write just 4k per hsync. The DataNode writes both the checksum
> file and the data file when I hsync data to it. Each of them takes nearly
> 25ms, so an hsync call takes nearly 50ms. hflush, by contrast, is very fast:
> writing the checksum and the data takes about 1ms each. If an hsync takes
> 50ms, what is the point of using it? Or is my test method wrong?
>

Re: hsync is too slower than hflush

Posted by haosdent <ha...@gmail.com>.

In fact, I write just 4k per hsync. The DataNode writes both the checksum file and the data file when I hsync data to it. Each of them takes nearly 25ms, so an hsync call takes nearly 50ms. hflush, by contrast, is very fast: writing the checksum and the data takes about 1ms each. If an hsync takes 50ms, what is the point of using it? Or is my test method wrong?
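
A local analogy may help when sanity-checking a test like this. The sketch below is not an HDFS benchmark: it writes 4k chunks to a temporary file on the local filesystem and compares a plain flush() (data handed to the OS page cache) with flush() followed by os.fsync() (data forced to the device). The function name and the round count are arbitrary choices for illustration, and the timings depend entirely on the machine and filesystem.

```python
import os
import tempfile
import time


def time_writes(sync_to_disk: bool, rounds: int = 20,
                chunk: bytes = b"x" * 4096) -> float:
    """Return mean milliseconds per write, with or without an fsync each time."""
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        for _ in range(rounds):
            f.write(chunk)
            f.flush()                 # push Python's buffer to the OS page cache
            if sync_to_disk:
                os.fsync(f.fileno())  # ask the OS to force the data to the device
        elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / rounds


flush_ms = time_writes(sync_to_disk=False)
fsync_ms = time_writes(sync_to_disk=True)
print(f"flush only:    {flush_ms:.3f} ms per 4k write")
print(f"flush + fsync: {fsync_ms:.3f} ms per 4k write")
```

On a spinning disk the fsync variant is usually slower by a large factor, which is the same kind of gap as between hflush and hsync. On an SSD or an in-memory filesystem the difference can be much smaller.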

-- 
Best Regards,
Haosong Huang


On Monday, August 26, 2013 at 7:07 AM, Andrew Wang wrote:

> 50ms is believable. hsync makes each DN call fsync and wait for acks, so
> you'd expect at least a disk seek time (~10ms) with some extra time
> depending on how much unsync'd data is being written.
> 
> So, just as some back of the envelope math, assuming a disk that can write
> at 100MB/s:
> 
> 50ms - 10ms seek = 40ms writing time
> 100 MB/s * 40ms = 4MB
> 
> If you're hsync'ing every 4MB, 50ms would be exactly what I'd expect.
> 
> Best,
> Andrew
> 
> 

Re: hsync is too slower than hflush

Posted by Andrew Wang <an...@cloudera.com>.

50ms is believable. hsync makes each DN call fsync and wait for acks, so
you'd expect at least a disk seek time (~10ms) with some extra time
depending on how much unsync'd data is being written.

So, just as some back of the envelope math, assuming a disk that can write
at 100MB/s:

50ms - 10ms seek = 40ms writing time
100 MB/s * 40ms = 4MB

If you're hsync'ing every 4MB, 50ms would be exactly what I'd expect.

Best,
Andrew
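
The back-of-the-envelope estimate above can be reproduced in a few lines. This is only a sketch of that arithmetic: the function name is hypothetical, and the 10ms seek and 100MB/s write rate are the assumed round numbers from the message, with 1MB taken as 1,000,000 bytes.

```python
def bytes_written_in_budget(total_ms: float = 50.0,
                            seek_ms: float = 10.0,
                            write_mb_per_s: float = 100.0) -> float:
    """Bytes a disk could write in the time left after one seek."""
    writing_ms = total_ms - seek_ms
    return write_mb_per_s * 1_000_000 * (writing_ms / 1000.0)


megabytes = bytes_written_in_budget() / 1_000_000
print(f"~{megabytes:.1f} MB written in the remaining time")
```

Changing total_ms or seek_ms shows how sensitive the estimate is to the assumed seek cost.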

On Sat, Aug 24, 2013 at 10:11 PM, haosdent <ha...@gmail.com> wrote:

> Hi, all. Hadoop has supported hsync, which calls the system's fsync, since
> 2.0.2. I have repeatedly tested the performance of hsync() and hflush(), and
> found that each hsync() call takes nearly 50ms while each hflush() call
> takes only 2ms. In this slide deck (
> http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage Page 18),
> the author mentions that hsync() is 2x slower than hflush(). So, is
> anything wrong? Thank you very much; I look forward to your help.
>