Posted to jcs-users@jakarta.apache.org by Ted Rice <tr...@netsuite.com> on 2006/08/02 19:07:35 UTC

Indexed Disk Cache (Key/Value Corruption)

We have recently deployed a small JCS backed cache and are experiencing some
symptoms that seem related to Keys retrieving the wrong Value from the Auxiliary
Indexed Disk Cache.

Our Setup is as follows.

1. JCS In-Memory running in same JVM as Application
2. Auxiliary Remote Cache on the same hardware, in a different JVM, with an
Indexed Disk Cache as its auxiliary.

What appears to happen is as follows:

1. After some period the main application is restarted clearing all in-memory 
items.
2. A Get is executed that hits the Remote Auxiliary's Disk Cache.
3. The Value returned from the Key does not match.

Are there any known cases where this would occur? Also, could an improper
shutdown of the Indexed Disk Cache result in the key file and value file
getting out of sync?

---------------------------------------------------------------------
To unsubscribe, e-mail: jcs-users-unsubscribe@jakarta.apache.org
For additional commands, e-mail: jcs-users-help@jakarta.apache.org


Re: Indexed Disk Cache (Key/Value Corruption)

Posted by Alistair Forbes <fo...@googlemail.com>.
Probably overcomplicated for most folks... but basically it uses the
"-shutdown" option, and I now wait 80 secs before killing any process. Seems
to help, but it's hard to reproduce the error, so I will wait and see what
happens.

        # ask the remote cache server to shut itself down
        java -cp "${CLASSPATH}" "${JCS_FACTORY}" -shutdown "/${JCS_CONFIG_FILE}"

         # wait for java to exit, kill it if it does not
         shutdown "$PROCESS_NAME"
         # kill the shell too
         kill "$(cat "$PIDFILE")"


# Wait up to TIMEOUT seconds (default 80, or $2) for processes matching $1
# to exit; anything still running after that is killed with kill -9.
function shutdown {
   TIMEOUT=80          # 80s default timeout
   [[ -n "$2" ]] && TIMEOUT=$2

   DURATION=0
   START=$(date '+%s')
   PROCS=$(ps -efwww | grep "$1" | grep -v grep | wc -l)

   while [[ $DURATION -lt $TIMEOUT ]] && [[ $PROCS -ne 0 ]]; do
      CURR=$(date '+%s')
      ((DURATION=CURR-START))

      PROCS=$(ps -efwww | grep "$1" | grep -v grep | wc -l)
      APROCS=$(ps -efwww | grep "$1" | grep -v grep | gawk '{ print $NF }' | sort -u)
      echo $1: waiting for $DURATION s, processes left: $PROCS \($APROCS\)
      sleep 1
   done

   if [[ $DURATION -ge $TIMEOUT ]] && [[ $PROCS -ne 0 ]]; then
      echo $1: $PROCS process\(es\) left, kill-9ing..
      [[ -f "$PIDFILE" ]] && kill -9 "$(cat "$PIDFILE")"
      # xargs -r: do not run kill at all if no PIDs are left
      ps -efwww | grep "$1" | grep -v grep | gawk '{ print $2 }' | xargs -r kill -9
   fi
}

Re: Indexed Disk Cache (Key/Value Corruption)

Posted by Ted Rice <tr...@netsuite.com>.
Thanks for the response.

Any information on how you were shutting down the remote server? For example,
by using the RemoteServerStub and calling shutdown?

Currently we are sending the process a SIGHUP to stop it and are in the
process of implementing a stop via the RemoteServerStub prior to that.

Will that shutdown process give us the best chance of not corrupting the disk
cache? How long do you usually wait before killing?



Re: Indexed Disk Cache (Key/Value Corruption)

Posted by Alistair Forbes <fo...@googlemail.com>.
I have had this a couple of times, and have seen one other person reporting
the same thing. Pretty damaging results!

I thought it may have had something to do with the remote cache not flushing
to disk before an exit. But I am not 100% sure. I now wait for longer before
killing any processes. If I get a bit more time I will try with 0 memory
(disk only cache) and see if the same thing happens.

Regards
Al


Re: Indexed Disk Cache (Key/Value Corruption)

Posted by Aaron Smuts <as...@yahoo.com>.
Yes. I saw this recently after a restart. The
shutdown routine is not finishing cleanly, and the
validation on startup is not strong enough. This can
only happen when the values are the same size, which
is not uncommon. We will need to add some additional
indicator that the disk cache shut down properly. On
startup it will need to look for this indicator before
using the data.

Aaron
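
Until such an indicator exists inside JCS itself, the same idea can be
approximated from a wrapper script. What follows is only a sketch and not
the fix Aaron describes: the marker file name, the function names, and the
assumption that the disk cache keeps its *.key and *.data files in a single
directory are all illustrative and would need checking against a real setup.

```shell
# Sketch only: marker name, function names and directory layout are assumptions.
MARKER_NAME=".clean_shutdown"   # hypothetical marker file name

# Call only after the cache JVM has exited on its own (not after kill -9).
# $1 = directory holding the disk cache files
mark_clean_shutdown() {
   touch "$1/$MARKER_NAME"
}

# Call before starting the cache JVM.
# If the marker is missing, the last shutdown is not known to be clean,
# so the key and data files are discarded instead of being trusted.
# $1 = directory holding the disk cache files
prepare_disk_cache() {
   if [ -f "$1/$MARKER_NAME" ]; then
      # consume the marker so a crash during this run is noticed next time
      rm -f "$1/$MARKER_NAME"
      echo "clean"
   else
      rm -f "$1"/*.key "$1"/*.data
      echo "discarded"
   fi
}
```

In a wrapper like the one earlier in this thread, mark_clean_shutdown would
be called only when the java process exits within the timeout, and
prepare_disk_cache just before the server is started. The trade-off is a
cold disk cache after any unclean stop, in exchange for never reading a
key file that may not match its data file.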
