Posted to oak-dev@jackrabbit.apache.org by Amit Jain <am...@ieee.org> on 2014/04/02 12:27:05 UTC

Question regarding missing _lastRev recovery - OAK-1295

Hi,

How should we expose the _lastRev recovery operation? It would need to check
the info of all cluster nodes and run recovery for those nodes that need it.

1. Either we have a scheduled job which checks all the nodes and runs the
recovery. What should be the interval to trigger the job?
2. Or, if we want it to run only when triggered manually, we expose an
appropriate MBean.


Thanks
Amit

Re: Question regarding missing _lastRev recovery - OAK-1295

Posted by Chetan Mehrotra <ch...@gmail.com>.
> The lease time is set to 1 minute. Would it be ok to check this every
> minute, from every node?

Adding to that, the default time intervals are:
- asyncDelay = 1 sec - Background operations are performed every 1 sec
per cluster node. If nothing changes, we would fire 1 query/sec/cluster
node to check the head revision.

- cluster lease time = 1 min - This is the time after which a cluster
lease would be renewed.

So we need to decide the time interval for the job that detects the
recovery condition.
Chetan Mehrotra


On Wed, Apr 2, 2014 at 4:31 PM, Amit Jain <am...@ieee.org> wrote:
> Hi,
>
>>> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
>>> sure this information is available, but remember we discussed this once.
>
> Yes, this case has been taken care of in the startup.
>
>>>  this check could be done in the
>>> background operations thread on a regular basis. probably depending on
>>> the lease interval.
>
> The lease time is set to 1 minute. Would it be ok to check this every
> minute, from every node?
>
> Thanks
> Amit
>
>
> On Wed, Apr 2, 2014 at 4:14 PM, Marcel Reutegger <mr...@adobe.com> wrote:
>
>> Hi,
>>
>> I think the recovery should be triggered automatically by the system when:
>>
>> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
>> sure this information is available, but remember we discussed this once.
>>
>> 2) a cluster node sees a lease timeout of another cluster node and initiates
>> the recovery for the failed cluster node. this check could be done in the
>> background operations thread on a regular basis. probably depending on
>> the lease interval.
>>
>> In addition it would probably also be useful to have the recovery operation
>> available as a command in oak-run. that way you can manually trigger it from
>> the command line. WDYT?
>>
>> Regards
>>  Marcel
>>
>> > How do we expose _lastRev recovery operation? This would need to check all
>> > the cluster nodes info and run recovery for those nodes which need recovery.
>> >
>> > 1. We either have a scheduled job which checks all the nodes and run the
>> > recovery. What should be the interval to trigger the job?
>> > 2. Or if we want it run only when triggered manually, then expose an
>> > appropriate MBean.
>> >
>> >
>> > Thanks
>> > Amit
>>

RE: Question regarding missing _lastRev recovery - OAK-1295

Posted by Marcel Reutegger <mr...@adobe.com>.
> The lease time is set to 1 minute. Would it be ok to check this every
> minute, from every node?

yes, that sounds reasonable and keeps the traffic low.

regards
 marcel
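
A check that runs once per lease interval could be sketched with a plain JDK
scheduler. This is only an illustration, not how Oak schedules its background
operations: the class RecoveryCheckScheduler and its methods are hypothetical
names, and the 1 minute interval is the lease time mentioned above.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: runs a recovery check at a fixed interval. */
class RecoveryCheckScheduler {

    // A single thread is enough; checks never overlap with each other.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    /**
     * Schedules the given check. The first run happens after one interval,
     * and each later run starts one interval after the previous run ended.
     * Note: if the check throws, the executor cancels all further runs,
     * so a real check should catch and log its own exceptions.
     */
    void start(Runnable check, long interval, TimeUnit unit) {
        executor.scheduleWithFixedDelay(check, interval, interval, unit);
    }

    /** Stops the scheduler and interrupts a check that is still running. */
    void stop() {
        executor.shutdownNow();
    }
}
```

With the intervals discussed here, every cluster node would call something
like start(check, 1, TimeUnit.MINUTES), where check is whatever code looks
for expired leases of other cluster nodes.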

Re: Question regarding missing _lastRev recovery - OAK-1295

Posted by Amit Jain <am...@ieee.org>.
Hi,

>> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
>> sure this information is available, but remember we discussed this once.

Yes, this case is already taken care of at startup.

>>  this check could be done in the
>> background operations thread on a regular basis. probably depending on
>> the lease interval.

The lease time is set to 1 minute. Would it be ok to check this every
minute, from every node?

Thanks
Amit


On Wed, Apr 2, 2014 at 4:14 PM, Marcel Reutegger <mr...@adobe.com> wrote:

> Hi,
>
> I think the recovery should be triggered automatically by the system when:
>
> 1) a cluster node starts up and sees it didn't shut down properly. I'm not
> sure this information is available, but remember we discussed this once.
>
> 2) a cluster node sees a lease timeout of another cluster node and initiates
> the recovery for the failed cluster node. this check could be done in the
> background operations thread on a regular basis. probably depending on
> the lease interval.
>
> In addition it would probably also be useful to have the recovery operation
> available as a command in oak-run. that way you can manually trigger it from
> the command line. WDYT?
>
> Regards
>  Marcel
>
> > How do we expose _lastRev recovery operation? This would need to check all
> > the cluster nodes info and run recovery for those nodes which need recovery.
> >
> > 1. We either have a scheduled job which checks all the nodes and run the
> > recovery. What should be the interval to trigger the job?
> > 2. Or if we want it run only when triggered manually, then expose an
> > appropriate MBean.
> >
> >
> > Thanks
> > Amit
>

RE: Question regarding missing _lastRev recovery - OAK-1295

Posted by Marcel Reutegger <mr...@adobe.com>.
Hi,

I think the recovery should be triggered automatically by the system when:

1) a cluster node starts up and sees it didn't shut down properly. I'm not
sure this information is available, but I remember we discussed this once.

2) a cluster node sees a lease timeout of another cluster node and initiates
the recovery for the failed cluster node. This check could be done in the
background operations thread on a regular basis, probably at an interval
that depends on the lease interval.
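
The lease timeout check in 2) could be sketched roughly as follows. This is
only an illustration under stated assumptions, not Oak's implementation: the
class LeaseCheck, the method findRecoveryCandidates, and the map from cluster
id to lease end time are hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: find other cluster nodes whose lease has expired. */
class LeaseCheck {

    /**
     * Returns the ids of all cluster nodes, other than our own, whose
     * lease end time (milliseconds since the epoch) lies before nowMillis.
     * Such nodes stopped renewing their lease and are recovery candidates.
     */
    static List<Integer> findRecoveryCandidates(Map<Integer, Long> leaseEndByClusterId,
                                                int ownClusterId,
                                                long nowMillis) {
        List<Integer> candidates = new ArrayList<>();
        for (Map.Entry<Integer, Long> entry : leaseEndByClusterId.entrySet()) {
            boolean isOtherNode = entry.getKey() != ownClusterId;
            boolean leaseExpired = entry.getValue() < nowMillis;
            if (isOtherNode && leaseExpired) {
                candidates.add(entry.getKey());
            }
        }
        return candidates;
    }
}
```

A node running the check would then run the _lastRev recovery for each
returned id. If every node runs the same check, two nodes may pick the same
failed node; the sketch does not address that.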

In addition, it would probably also be useful to have the recovery operation
available as a command in oak-run. That way you can manually trigger it from
the command line. WDYT?

Regards
 Marcel

> How do we expose _lastRev recovery operation? This would need to check all
> the cluster nodes info and run recovery for those nodes which need recovery.
> 
> 1. We either have a scheduled job which checks all the nodes and run the
> recovery. What should be the interval to trigger the job?
> 2. Or if we want it run only when triggered manually, then expose an
> appropriate MBean.
> 
> 
> Thanks
> Amit