Posted to dev@zookeeper.apache.org by Laxman <la...@huawei.com> on 2011/07/13 08:16:53 UTC

Does an abrupt kill corrupt the datadir?

When we stop ZooKeeper through "zkServer.sh stop", the script aborts the
ZooKeeper process using "kill -9".

    stop)
        echo -n "Stopping zookeeper ... "
        if [ ! -f "$ZOOPIDFILE" ]
        then
          echo "error: could not find file $ZOOPIDFILE"
          exit 1
        else
          $KILL -9 $(cat "$ZOOPIDFILE")
          rm "$ZOOPIDFILE"
          echo STOPPED
          exit 0
        fi
        ;;

This may corrupt the snapshot and transaction logs. Also, it's not
recommended to use "kill -9".

In the worst case, if the latest snapshots on all ZooKeeper nodes get
corrupted, there is a chance of data loss.

How about introducing a shutdown hook which will ensure ZooKeeper is
shut down gracefully when we call stop?

Note: This is just an observation; it was not found in a test.

--
Thanks,
Laxman


RE: Does an abrupt kill corrupt the datadir?

Posted by Laxman <la...@huawei.com>.
Hi Mahadev,

The shutdown hook was just a quick thought. Another approach would be to
send a plain kill (SIGTERM), which the process can catch and handle.
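
To illustrate what "catch and handle" means, here is a minimal shell sketch of a
process that cleans up when it receives SIGTERM. It is not ZooKeeper code:
run_worker and its marker-file argument are hypothetical names used only for
this demo. A SIGKILL ("kill -9") could not be trapped this way.

```shell
#!/usr/bin/env bash
# Hypothetical demo, not ZooKeeper code: a worker that handles SIGTERM.

run_worker() {
    local marker="$1"   # file the handler writes to, to prove cleanup ran

    # Install the SIGTERM handler. "$marker" is expanded now, when the
    # trap is defined. SIGKILL cannot be trapped like this.
    trap "echo 'cleaned up' > '$marker'; exit 0" TERM

    # Stand-in for real work. The shell runs the trap once the current
    # one-second sleep returns.
    while true; do
        sleep 1
    done
}
```

Starting it with `run_worker /tmp/marker &` and then sending `kill -TERM $!`
lets the handler run before the worker exits, whereas `kill -9 $!` would end it
without the marker file ever being written.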

My first look at the "kill -9" brought the following scenario to mind:
>In the worst case, if the latest snapshots on all ZooKeeper nodes get
>corrupted, there is a chance of data loss.

How can ZooKeeper deal with this scenario gracefully?

Also, I feel we should give the application a chance to shut down gracefully
before resorting to an abrupt shutdown.

http://en.wikipedia.org/wiki/SIGKILL

Because SIGKILL gives the process no opportunity to do cleanup operations on
terminating, in most system shutdown procedures an attempt is first made to
terminate processes using SIGTERM, before resorting to SIGKILL.

http://rackerhacker.com/2010/03/18/sigterm-vs-sigkill/

The application can determine what it wants to do once a SIGTERM is
received. While most applications will clean up their resources and stop,
some may not. An application may be configured to do something completely
different when a SIGTERM is received. Also, if the application is in a bad
state, such as waiting for disk I/O, it may not be able to act on the signal
that was sent.

Most system administrators will usually resort to the more abrupt signal
when an application doesn't respond to a SIGTERM.
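
The "SIGTERM first, SIGKILL as a last resort" pattern described in these
excerpts can be sketched in shell roughly as follows. This is only an
illustration under stated assumptions: stop_gracefully and its timeout
parameter are hypothetical and are not part of zkServer.sh.

```shell
#!/usr/bin/env bash
# Sketch only: stop_gracefully is a hypothetical helper, not zkServer.sh code.

stop_gracefully() {
    local pid="$1"
    local timeout="${2:-10}"   # assumed default: seconds to wait before SIGKILL
    local waited=0

    # Ask the process to terminate; fail if there is no such process.
    kill -TERM "$pid" 2>/dev/null || return 1

    # Signal 0 sends nothing; it only checks that the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "process $pid did not exit after SIGTERM, sending SIGKILL"
            kill -KILL "$pid" 2>/dev/null
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "process $pid stopped after SIGTERM"
}
```

In a stop script this might be called as `stop_gracefully "$(cat "$ZOOPIDFILE")" 10`
in place of the direct "kill -9". Whether the extra wait is worthwhile depends
on what the process actually does when it receives SIGTERM.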

-----Original Message-----
From: Mahadev Konar [mailto:mahadev@hortonworks.com] 
Sent: Wednesday, July 13, 2011 12:02 PM
To: dev@zookeeper.apache.org
Subject: Re: Does abrupt kill corrupts the datadir?

Hi Laxman,
  The servers take care of all the issues with data integrity, so a kill
-9 is OK. Shutdown hooks are tricky. Also, the best way to make sure
everything works reliably is to use kill -9 :).

Thanks
mahadev


Re: Does an abrupt kill corrupt the datadir?

Posted by Mahadev Konar <ma...@hortonworks.com>.
Hi Laxman,
  The servers take care of all the issues with data integrity, so a kill
-9 is OK. Shutdown hooks are tricky. Also, the best way to make sure
everything works reliably is to use kill -9 :).

Thanks
mahadev
