Posted to user@zookeeper.apache.org by Adrian Hamza <ad...@hotmail.com> on 2015/10/21 18:49:03 UTC

Running zookeeper with journaling disabled on an ext4

Hi,
What's your opinion on running ZooKeeper with file system journaling disabled on an EXT4 file system?
We need every ounce of performance out of it. Given the ZooKeeper implementation (operations are logged, a majority quorum is required for a write to succeed, and checkpoints are swapped in only after they are complete), I feel that EXT4 journaling is redundant, and disabling it could improve performance. We have 5 servers in a ZooKeeper ensemble. Am I missing something, or should this be fine?
Thank you,
Adrian
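
The "checkpoints are swapped only after being complete" idea in the question can be illustrated with the usual write-to-temp-then-rename pattern. This is a minimal sketch of that general pattern, not ZooKeeper's actual snapshot code; the class name SnapshotSwap and the file names are hypothetical. Note that the final rename is a file system metadata operation, so how it behaves across a crash depends on the file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only (not ZooKeeper code).
final class SnapshotSwap {

    // Write the new checkpoint to a temporary sibling file, flush it,
    // and only then rename it over the target.
    static void writeAtomically(Path target, byte[] data) {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileChannel ch = FileChannel.open(tmp,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(data);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            // Ask the OS to flush file content and metadata to the device.
            ch.force(true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        try {
            // On POSIX file systems this is a rename(); whether an existing
            // target is replaced is implementation specific on other platforms.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small demonstration: write two versions, return what a reader sees.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("snapshot-demo");
            Path target = dir.resolve("snapshot.bin");
            writeAtomically(target, "v1".getBytes(StandardCharsets.UTF_8));
            writeAtomically(target, "v2".getBytes(StandardCharsets.UTF_8));
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

A reader of the target file sees either the old complete checkpoint or the new complete one, never a partially written file.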

RE: Running zookeeper with journaling disabled on an ext4

Posted by Adrian Hamza <ad...@hotmail.com>.
ZooKeeper fsyncs write requests to disk before acknowledging the writes as successful. Doesn't that prevent what you described from happening?

> From: ivank@apache.org
> Date: Wed, 21 Oct 2015 17:47:08 +0000
> Subject: Re: Running zookeeper with journaling disabled on an ext4
> To: user@zookeeper.apache.org
> 
> >
> > What's your opinion on running Zookeeper with file system journaling
> > disabled on an EXT4 file system?
> > We need every ounce of performance out of it, and I believe given the
> > Zookeeper implementation (operations are logged, majority quorum is
> > required for succeeding, checkpoints are swapped only after being complete)
> > I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could
> > improve performance. We have 5 servers in a Zookeeper ensemble. Am I
> > missing something or this should be fine?
> 
> I would recommend against it. With ext4 journalling disabled, a server
> crash will likely lose data, unless you have all write caching disabled. If
> you have all write caching disabled, I'd expect to see worse performance
> for all applications. It's not much safer than running in a ramdisk.
> 
> -Ivan

Re: Running zookeeper with journaling disabled on an ext4

Posted by Edward Capriolo <ed...@huffingtonpost.com>.
Here is why the setting does not matter, IMHO.

1. Fsync is virtualized in the cloud (Xen) anyway.
2. If one sync is happening, the others batch up anyway, right?
3. With a modest Kafka setup (10 nodes, 30-second offset commit, 100 clients),
your entire ZK cluster is in constant blocked iowait.

On Thursday, October 22, 2015, Ivan Kelly <iv...@apache.org> wrote:

> On Wed, Oct 21, 2015 at 9:44 PM Adrian Hamza <adrian.hamza@hotmail.com>
> wrote:
>
> > No fsync can lead to data loss as the data could be written only in the
> > OS cache but not on the disk. We do use fsync to prevent that and we are
> > not going to disable it.
> >
> And you're not seeing massive performance degradation without a journal?
> Perhaps the disk's write cache is hiding it. In any case, even if you
> fsync, without the fs journal you can have data loss/corruption in the case
> of power loss.
>
> -Ivan
>

Re: Running zookeeper with journaling disabled on an ext4

Posted by Ivan Kelly <iv...@apache.org>.
On Wed, Oct 21, 2015 at 9:44 PM Adrian Hamza <ad...@hotmail.com>
wrote:

> No fsync can lead to data loss as the data could be written only in the OS
> cache but not on the disk. We do use fsync to prevent that and we are not
> going to disable it.
>
And you're not seeing massive performance degradation without a journal?
Perhaps the disk's write cache is hiding it. In any case, even if you
fsync, without the fs journal you can have data loss/corruption in the case
of power loss.

-Ivan

RE: Running zookeeper with journaling disabled on an ext4

Posted by Adrian Hamza <ad...@hotmail.com>.
Not using fsync can lead to data loss, as the data could be written only to the OS cache but not to the disk. We do use fsync to prevent that, and we are not going to disable it.

> Date: Wed, 21 Oct 2015 14:00:57 -0400
> Subject: Re: Running zookeeper with journaling disabled on an ext4
> From: edward.capriolo@huffingtonpost.com
> To: user@zookeeper.apache.org
> 
> There used to be this setting in zk.
> 
> zookeeper.forceSync
> Using Kafka, I used to set this setting to false. The reason is I had a
> number of clients writing offsets to zk and the disk system was always
> sync-ing dirty pages. The system would be in super high IO wait.
> 
> I set this to false and disk usage basically dropped to 0.
> 
> On Wed, Oct 21, 2015 at 1:47 PM, Ivan Kelly <iv...@apache.org> wrote:
> 
> > >
> > > What's your opinion on running Zookeeper with file system journaling
> > > disabled on an EXT4 file system?
> > > We need every ounce of performance out of it, and I believe given the
> > > Zookeeper implementation (operations are logged, majority quorum is
> > > required for succeeding, checkpoints are swapped only after being
> > complete)
> > > I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could
> > > improve performance. We have 5 servers in a Zookeeper ensemble. Am I
> > > missing something or this should be fine?
> >
> > I would recommend against it. With ext4 journalling disabled, a server
> > crash will likely lose data, unless you have all write caching disabled. If
> > you have all write caching disabled, I'd expect to see worse performance
> > for all applications. It's not much safer than running in a ramdisk.
> >
> > -Ivan
> >

Re: Running zookeeper with journaling disabled on an ext4

Posted by Ivan Kelly <iv...@apache.org>.
On Wed, Oct 21, 2015 at 8:00 PM, Edward Capriolo
<ed...@huffingtonpost.com> wrote:
> There used to be this setting in zk.
>
> zookeeper.forceSync
> Using Kafka, I used to set this setting to false. The reason is I had a
> number of clients writing offsets to zk and the disk system was always
> sync-ing dirty pages. The system would be in super high IO wait.
>
> I set this to false and disk usage basically dropped to 0.

forceSync means use fsync for zk journal operations. Without this,
writes to ZooKeeper only go into the OS page cache, so a crash
of the node could cause data loss on that node. Even worse, if a node
crashes and reboots, it can accept two competing proposals and you can
theoretically end up with a split-brain system. This is why
durability is a requirement for all Paxos-like protocols.

-Ivan
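
To make the page-cache point concrete, here is a minimal Java sketch of appending a record with and without an explicit sync. It is an illustration under stated assumptions, not ZooKeeper's transaction log implementation: the class SyncedAppend, its forceSync parameter, and the file name are hypothetical. FileChannel.force is the standard JDK call for requesting that written data be flushed to the storage device.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper, for illustration only (not ZooKeeper code).
final class SyncedAppend {

    // Appends one record to the log and returns the resulting file size.
    // With forceSync == false the bytes may still sit only in the OS page
    // cache when this method returns.
    static long append(Path log, byte[] record, boolean forceSync) {
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                ch.write(buf);
            }
            if (forceSync) {
                // Request a flush of file content (not necessarily all
                // metadata) before the write is acknowledged.
                ch.force(false);
            }
            return ch.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small demonstration: two 3-byte records, the second one synced.
    static long demo() {
        try {
            Path log = Files.createTempFile("txn", ".log");
            append(log, new byte[] {1, 2, 3}, false);
            return append(log, new byte[] {4, 5, 6}, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Both calls return normally and the file size is the same either way; the difference only shows up after a crash, which is why the unsynced path looks so much cheaper in IO wait.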

Re: Running zookeeper with journaling disabled on an ext4

Posted by Edward Capriolo <ed...@huffingtonpost.com>.
There used to be this setting in zk.

zookeeper.forceSync
Using Kafka, I used to set this setting to false. The reason is I had a
number of clients writing offsets to zk and the disk system was always
sync-ing dirty pages. The system would be in super high IO wait.

I set this to false and disk usage basically dropped to 0.
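
For reference, zookeeper.forceSync is a Java system property. The sketch below shows one way such a flag could be interpreted. It assumes, following the ZooKeeper administrator documentation, that syncing is turned off by the value "no" and is on by default; the class ForceSyncFlag is hypothetical and this is not a copy of ZooKeeper's code. Under that assumption a value such as "false" would leave syncing enabled, so it is worth checking the exact value against the ZooKeeper version in use.

```java
// Hypothetical helper, for illustration only (not ZooKeeper code).
final class ForceSyncFlag {

    // Assumption: only the literal value "no" disables syncing;
    // a missing value means syncing stays on.
    static boolean isEnabled(String value) {
        String effective = (value == null) ? "yes" : value;
        return !"no".equals(effective);
    }

    // Reads the flag the way a JVM started with
    // -Dzookeeper.forceSync=no would expose it.
    static boolean fromSystemProperty() {
        return isEnabled(System.getProperty("zookeeper.forceSync"));
    }

    public static void main(String[] args) {
        System.out.println("forceSync enabled: " + fromSystemProperty());
    }
}
```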

On Wed, Oct 21, 2015 at 1:47 PM, Ivan Kelly <iv...@apache.org> wrote:

> >
> > What's your opinion on running Zookeeper with file system journaling
> > disabled on an EXT4 file system?
> > We need every ounce of performance out of it, and I believe given the
> > Zookeeper implementation (operations are logged, majority quorum is
> > required for succeeding, checkpoints are swapped only after being
> complete)
> > I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could
> > improve performance. We have 5 servers in a Zookeeper ensemble. Am I
> > missing something or this should be fine?
>
> I would recommend against it. With ext4 journalling disabled, a server
> crash will likely lose data, unless you have all write caching disabled. If
> you have all write caching disabled, I'd expect to see worse performance
> for all applications. It's not much safer than running in a ramdisk.
>
> -Ivan
>

Re: Running zookeeper with journaling disabled on an ext4

Posted by Ivan Kelly <iv...@apache.org>.
>
> What's your opinion on running Zookeeper with file system journaling
> disabled on an EXT4 file system?
> We need every ounce of performance out of it, and I believe given the
> Zookeeper implementation (operations are logged, majority quorum is
> required for succeeding, checkpoints are swapped only after being complete)
> I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could
> improve performance. We have 5 servers in a Zookeeper ensemble. Am I
> missing something or this should be fine?

I would recommend against it. With ext4 journalling disabled, a server
crash will likely lose data, unless you have all write caching disabled. If
you have all write caching disabled, I'd expect to see worse performance
for all applications. It's not much safer than running in a ramdisk.

-Ivan

RE: Running zookeeper with journaling disabled on an ext4

Posted by Adrian Hamza <ad...@hotmail.com>.
ZooKeeper fsyncs write requests to disk before acknowledging the
writes as successful. Doesn't that prevent what you described from
happening?

No, we don't use RAID. I don't know about the non-volatile cache; I would have to check on that.

> From: bvan@slac.stanford.edu
> To: user@zookeeper.apache.org
> Subject: Re: Running zookeeper with journaling disabled on an ext4
> Date: Wed, 21 Oct 2015 17:59:33 +0000
> 
> If 3 of the 5 servers were to shut down and the files on those servers were to become corrupted, you would assuredly have data loss.
> 
> Do you have your disks behind a RAID controller with non-volatile cache (aka write-back cache)?
> 
> Brian
> 
> 
> > On Oct 21, 2015, at 9:49 AM, Adrian Hamza <ad...@hotmail.com> wrote:
> > 
> > Hi,
> > What's your opinion on running Zookeeper with file system journaling disabled on an EXT4 file system?
> > We need every ounce of performance out of it, and I believe given the Zookeeper implementation (operations are logged, majority quorum is required for succeeding, checkpoints are swapped only after being complete) I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could improve performance. We have 5 servers in a Zookeeper ensemble. Am I missing something or this should be fine?
> > Thank you,
> > Adrian

Re: Running zookeeper with journaling disabled on an ext4

Posted by "Van Klaveren, Brian N." <bv...@slac.stanford.edu>.
If 3 of the 5 servers were to shut down and the files on those servers were to become corrupted, you would assuredly have data loss.

Do you have your disks behind a RAID controller with non-volatile cache (aka write-back cache)?

Brian
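
The "3 of the 5" figure follows from majority-quorum arithmetic. A small sketch, with a hypothetical QuorumMath class, for the 5-server ensemble described in the thread:

```java
// Hypothetical helper, for illustration only.
final class QuorumMath {

    // Smallest number of servers that forms a strict majority.
    static int quorumSize(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can be lost while a majority remains.
    static int toleratedFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }

    public static void main(String[] args) {
        int n = 5;
        System.out.println("quorum=" + quorumSize(n)
                + " tolerated=" + toleratedFailures(n));
    }
}
```

With 5 servers the quorum is 3 and 2 failures are tolerated. A write acknowledged by exactly 3 servers may exist only on those 3, so if all 3 lose or corrupt their files, the write is gone even though 2 servers are intact.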


> On Oct 21, 2015, at 9:49 AM, Adrian Hamza <ad...@hotmail.com> wrote:
> 
> Hi,
> What's your opinion on running Zookeeper with file system journaling disabled on an EXT4 file system?
> We need every ounce of performance out of it, and I believe given the Zookeeper implementation (operations are logged, majority quorum is required for succeeding, checkpoints are swapped only after being complete) I feel that EXT4 journaling is redundant. Disabling EXT4 journaling could improve performance. We have 5 servers in a Zookeeper ensemble. Am I missing something or this should be fine?
> Thank you,
> Adrian
>