Posted to user@hbase.apache.org by Gaojinchao <ga...@huawei.com> on 2011/10/25 06:33:27 UTC

A requirement to change time of the Hbase cluster.

Hi all,
We have a requirement to change the system time of an HBase cluster.
The scenario is that the cluster switches to a different NTP server (my customer may do this).
We plan to do it like this:
1. Stop the cluster.
2. Change the NTP server.
3. Start the cluster.
But the cluster may move to an NTP server whose clock is behind the old one.
In that case we find that the metadata can't be updated, because a newly added record is masked by the old record, which carries a later timestamp, and the cluster doesn't run normally.
I have a way to deal with this situation: before we update the metadata, we first read it and compare its timestamp with the system time.
If the system time is lower than that timestamp, the metadata update can use timestamp + 1 instead.
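
The rule proposed in this post can be sketched in a few lines of Java. This is only an illustration of the comparison described above, not HBase code: the class `MetaTimestamps` and the method `nextTimestamp` are hypothetical names, and reading the stored timestamp from the metadata table is left out.

```java
// Hypothetical helper illustrating the proposal; not part of HBase.
final class MetaTimestamps {
    private MetaTimestamps() {}

    /**
     * Chooses the timestamp for a new metadata edit.
     * If the system clock is behind the timestamp already stored,
     * use stored + 1 so the new edit is not hidden by the old one.
     */
    static long nextTimestamp(long existingTimestamp, long systemTimeMillis) {
        if (systemTimeMillis < existingTimestamp) {
            return existingTimestamp + 1;
        }
        return systemTimeMillis;
    }

    public static void main(String[] args) {
        // Clock moved backwards: the stored timestamp wins and is bumped by one.
        System.out.println(nextTimestamp(1000L, 900L));  // prints 1001
        // Clock is ahead of the stored timestamp: system time is used as usual.
        System.out.println(nextTimestamp(1000L, 2000L)); // prints 2000
    }
}
```

Note that the sketch only keeps a single cell moving forward; it does nothing about clocks that disagree between servers.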

Re: A requirement to change time of the Hbase cluster.

Posted by Doug Meil <do...@explorysmedical.com>.
+1.  Well stated Gary.

Doing anything else is asking for trouble, and it's preventable.






Re: A requirement to change time of the Hbase cluster.

Posted by Gary Helmling <gh...@gmail.com>.
At the same time, it might be simpler to get your customers/operators
to fix their ntp setups.

Not having synchronized clocks throughout the cluster will cause
problems in other areas as well.  It will make it very difficult to
correlate events in different server logs when troubleshooting
problems (you can't rely on the timestamp as a rough guideline).  The
Kerberos infrastructure used by Hadoop and HBase security also
requires synchronized clocks to operate correctly.  By default, if the
clock skew between two machines is greater than 5 minutes, it will
reject messages as invalid.  So you're likely to experience other
headaches even if you can coax HBase into operating the way you'd
like.
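
To make the five-minute tolerance concrete, here is a minimal Java sketch of a skew check. It is not how Kerberos is implemented; `ClockSkewCheck`, `withinTolerance` and `MAX_SKEW` are hypothetical names, and the only figure taken from the message above is the 5-minute default.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of a clock-skew tolerance check.
final class ClockSkewCheck {
    // Default tolerance mentioned in the thread: 5 minutes.
    static final Duration MAX_SKEW = Duration.ofMinutes(5);

    private ClockSkewCheck() {}

    /** Returns true when the two clocks differ by no more than maxSkew, in either direction. */
    static boolean withinTolerance(Instant local, Instant remote, Duration maxSkew) {
        Duration skew = Duration.between(local, remote).abs();
        return skew.compareTo(maxSkew) <= 0;
    }

    public static void main(String[] args) {
        Instant local = Instant.parse("2011-10-26T14:39:00Z");
        // Remote clock 4 minutes ahead: accepted.
        System.out.println(withinTolerance(local, local.plusSeconds(240), MAX_SKEW));  // prints true
        // Remote clock 6 minutes behind: rejected.
        System.out.println(withinTolerance(local, local.minusSeconds(360), MAX_SKEW)); // prints false
    }
}
```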



Re: A requirement to change time of the Hbase cluster.

Posted by Stack <st...@duboce.net>.
On Tue, Oct 25, 2011 at 7:36 PM, Gaojinchao <ga...@huawei.com> wrote:
> So we would like to add an option that makes metadata independent of system time, just as user data can use an arbitrary number as its timestamp.
> If we can do this, the impact of clock changes will be smaller. Could the cluster then work normally even without an NTP server?
> Can I open an issue for this? I will try to make a patch and share my ideas.
>

I suppose you could set an attribute on a table that says "use always
increasing version rather than timestamp".  You'd then have to keep
note, on a per-region basis, of the most recent version and, rather
than use system time, do a +1 per edit coming in.

I think hfile already records the version of the last edit added to
the file.  On open of a region, you'd look at all hfiles, figure out
the highest version, and then set your version machine to start at
highest-version +1.

It might not be that hard to add.  You'd have to check the code, but a
while back we made it so we get the version indirectly by going to an
EnvironmentEdge class.  You could add your 'always increasing version'
as an atomic long or something and then it would be available
throughout.

St.Ack
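
A rough Java sketch of the "always increasing version" idea from this message follows. It is an assumption-laden illustration, not a patch: `VersionSource` and `IncreasingVersionSource` are hypothetical names, the interface is a local stand-in for a pluggable clock in the spirit of EnvironmentEdge so that the example compiles without HBase on the classpath, and the highest version of each hfile is simply passed in as a number rather than read from file metadata.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for a pluggable source of versions; not the HBase EnvironmentEdge class.
interface VersionSource {
    long nextVersion();
}

// Per-region counter that hands out strictly increasing versions instead of system time.
final class IncreasingVersionSource implements VersionSource {
    private final AtomicLong last;

    /**
     * @param highestVersionPerFile highest version recorded in each store file
     *                              of the region (assumed to be known on region open)
     */
    IncreasingVersionSource(long... highestVersionPerFile) {
        long max = 0L;
        for (long v : highestVersionPerFile) {
            max = Math.max(max, v);
        }
        // The first version handed out will be max + 1.
        this.last = new AtomicLong(max);
    }

    @Override
    public long nextVersion() {
        // Atomic +1 per incoming edit, safe under concurrent writers.
        return last.incrementAndGet();
    }

    public static void main(String[] args) {
        VersionSource versions = new IncreasingVersionSource(41L, 97L, 12L);
        System.out.println(versions.nextVersion()); // prints 98
        System.out.println(versions.nextVersion()); // prints 99
    }
}
```

Seeding from the highest stored version is what keeps edits ordered across a region close and reopen, whatever the wall clock says at that moment.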

Re: A requirement to change time of the Hbase cluster.

Posted by Gaojinchao <ga...@huawei.com>.
Thanks for your reply.
In our application scenario, the equipment and network are not ours, and the operators may adjust some of the equipment and the network.
So there are some uncertainties that we cannot control.
So we would like to add an option that makes metadata independent of system time, just as user data can use an arbitrary number as its timestamp.
If we can do this, the impact of clock changes will be smaller. Could the cluster then work normally even without an NTP server?
Can I open an issue for this? I will try to make a patch and share my ideas.


Re: A requirement to change time of the Hbase cluster.

Posted by Michel Segel <mi...@hotmail.com>.
Maybe I'm missing something...

The purpose of using an ntp server is that your machines all have the same time. Also you would sync your ntp server clock to one of the global ntp servers so that you have an accurate clock for your network...

You shouldn't have to restart your cluster unless your clocks are all way off...

Sent from a remote device. Please excuse any typos...

Mike Segel


Re: A requirement to change time of the Hbase cluster.

Posted by Gaojinchao <ga...@huawei.com>.
Perhaps we should add an option to support incremental metadata timestamps: every timestamp would simply increase, so this data would not rely on the system time.
