Posted to user@hbase.apache.org by Jinsong Hu <ji...@hotmail.com> on 2010/03/17 00:06:14 UTC

can't read all hbase tables after hbase cluster IP netmask change

Hi, There:
  I migrated all of my HBase machines from network 10.110.24.0 to 10.110.8.0, and now I can't read the contents of the tables
any more. I run "bin/hbase shell" and then "scan 'TABLE_ABC'"; the log shows it is still trying to connect to region server 10.110.24.91, but that
machine's IP has changed to 10.110.8.91, so the request fails.
  Is there any way I can fix this issue so I can read the table contents again?


Jimmy

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Ryan Rawson <ry...@gmail.com>.
Hey,

HBase does in fact do those things you say that it does not.  There is
something else going on in here. You need to look at your logs and
possibly post some of them. Start with the master logs.





On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
> like I have to regenerate my HBase data again
> from my raw data.
>
> But it looks like this IP change issue should be documented, so people won't
> encounter it again.
>
> Another observation I have is that if one region server dies, the regions
> served by that server won't be accessible any
> more because of this binding to IP. The only way to resolve this is to build
> a new machine that takes over that IP and
> add it to the cluster.
>
> Unfortunately, HBase itself doesn't have HA built in. It would be nice
> if future versions of HBase added this high-availability support.
>
> Jimmy.
>

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Ryan Rawson <ry...@gmail.com>.
With the versioning of HBase, the old data is not overwritten, so that
is not conclusive proof of the problem. The tables you pasted hold
state data, and on restart that state is superseded by new records.  I have
copied the HBase tables from one cluster to another and restarted it
successfully as well.

I'm afraid we are going to need more context to help debug this issue.
 What happens during a start is that the old server assignments are
noticed to be bad (since there is no record of that server existing
anymore), and new assignments are made and recorded in ROOT.
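
The behavior described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not HBase code: `ToyCatalog` and `reassign_on_startup` are hypothetical names, and the round-robin choice of a new server is an arbitrary stand-in for whatever the master really does. It shows why an old address can still sit in the stored data while reads return the new one.

```python
import itertools


class ToyCatalog:
    """Toy stand-in for a catalog table (hypothetical, not the HBase API).

    Every write to a region's server cell is kept as a new version;
    older versions stay in storage, but reads return the newest one.
    """

    def __init__(self):
        self._versions = {}               # region name -> [(timestamp, "host:port")]
        self._clock = itertools.count(1)  # monotonically increasing fake timestamps

    def record(self, region, server):
        """Append a new version of the server cell for this region."""
        self._versions.setdefault(region, []).append((next(self._clock), server))

    def current(self, region):
        """Return the server from the newest version only."""
        return max(self._versions[region])[1]

    def history(self, region):
        """Return every stored server value, oldest first."""
        return [server for _, server in sorted(self._versions[region])]


def reassign_on_startup(catalog, regions, live_servers):
    """Give a live server to every region whose recorded server is gone."""
    live = sorted(live_servers)
    if not live:
        raise ValueError("no live servers to assign regions to")
    targets = itertools.cycle(live)  # assumption: simple round-robin placement
    for region in regions:
        if catalog.current(region) not in live_servers:
            catalog.record(region, next(targets))


catalog = ToyCatalog()
catalog.record(".META.,,1", "10.110.24.55:60020")  # assignment from before the move
reassign_on_startup(catalog, [".META.,,1"], {"10.110.8.55:60020"})
print(catalog.current(".META.,,1"))   # newest version
print(catalog.history(".META.,,1"))   # old address is still stored
```

In this model, finding 10.110.24.x in the stored history says nothing about which address a client will be handed; only the newest version matters.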

Another issue could be that your table is not entirely online.  Try
> enable "table_name"

at the hbase shell and see if that helps.

-ryan

On Tue, Mar 16, 2010 at 5:31 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Here is the content of the binary file
> /hbase/-ROOT-/70236052/info/110463219710538426. You can see the IP address
> 10.110.24.55 inside.
>
> This shows that HBase is saving the IP addresses of my cluster into the ROOT
> data file. When I run the scan command
> in the shell, the log says it was trying to connect to 10.110.24.xx
> machines.
>
> Jimmy

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Jinsong Hu <ji...@hotmail.com>.
Here is the content of the binary file 
/hbase/-ROOT-/70236052/info/110463219710538426. You can see the IP address
10.110.24.55 inside.

This shows that HBase is saving the IP addresses of my cluster into the ROOT 
data file. When I run the scan command
in the shell, the log says it was trying to connect to 10.110.24.xx 
machines.

Jimmy



DATABLK*# ? .META.,,1 inforegioninfo &?Ph= .META.,,1 .META. IS_ROOT false IS_META true MEMSTORE_FLUSHSIZE 16384
historian BLOOMFILTER false COMPRESSION NONE VERSIONS 2147483647 TTL 604800 BLOCKSIZE 8192 IN_MEMORY false BLOCKCACHE false
info BLOOMFILTER false COMPRESSION NONE VERSIONS 10 TTL 2147483647 BLOCKSIZE 8192 IN_MEMORY false BLOCKCACHE false I3
.META.,,1 infoserver 'J? ? 10.110.24.88:60020
.META.,,1 infoserver '0?}? 10.110.24.85:60020
.META.,,1 infoserver '0?{? 10.110.24.56:60020
.META.,,1 infoserver ' &^? 10.110.24.54:60020
.META.,,1 infoserver ' #?? 10.110.24.55:60020
.META.,,1 infoserver '  ?? 10.110.24.54:60020
.META.,,1 infoserver ' B?? 10.110.24.56:60020
.META.,,1 infoserver ' R?z 10.110.24.55:60020
.META.,,1 infoserver &?NN? 10.110.24.54:60020
.META.,,1 infoserver &?S?4 10.110.24.55:60020(
.META.,,1 infoserverstartcode 'J? ? 'J? (
.META.,,1 infoserverstartcode '0?}? '0?V?(
.META.,,1 infoserverstartcode '0?{? ' &?C(
.META.,,1 infoserverstartcode ' &^? ' &5?(
.META.,,1 infoserverstartcode ' #?? ' ?(
.META.,,1 infoserverstartcode '  ?? ' (
.META.,,1 infoserverstartcode ' B?? ' Q??(
.META.,,1 infoserverstartcode ' R?z ' Q??(
.META.,,1 infoserverstartcode &?NN? &?N&G(
.META.,,1 infoserverstartcode &?S?4 &?R??
MAJOR_COMPACTION_KEY ? MAX_SEQ_ID_KEY N ? hfile.AVG_KEY_LEN # hfile.AVG_VALUE_LEN !
hfile.COMPARATOR 2org.apache.hadoop.hbase.KeyValue$RootKeyComparator hfile.LASTKEY ( .META.,,1 infoserverstartcode &?S?4
IDXBLK)+ `# .META.,,1 inforegioninfo &?Ph= TRABLK"$ ` D `
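
For anyone who wants to repeat this inspection without reading the dump by eye, here is a minimal sketch that pulls host:port strings out of raw bytes and counts them per subnet. The function names are hypothetical, and the /24 masks are an assumption, since the thread gives only the network addresses. It is a text heuristic over the bytes, not an HFile parser, so it cannot tell which stored version is the newest.

```python
import ipaddress
import re
from collections import Counter

# One IPv4 octet (0-255) as a bytes pattern.
_OCTET = rb"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
# Dotted quad followed by ":port". Heuristic only: if the byte just before
# the address happens to be an ASCII digit, the first octet is ambiguous.
ADDR_RE = re.compile(rb"((?:" + _OCTET + rb"\.){3}" + _OCTET + rb"):(\d{1,5})")


def addresses_in(raw):
    """Return every (host, port) pair that looks like 'a.b.c.d:port' in raw bytes."""
    return [
        (match.group(1).decode("ascii"), int(match.group(2)))
        for match in ADDR_RE.finditer(raw)
    ]


def count_by_subnet(raw, subnets):
    """Count the addresses found in raw that fall inside each given subnet."""
    networks = [ipaddress.ip_network(subnet) for subnet in subnets]
    counts = Counter()
    for host, _port in addresses_in(raw):
        address = ipaddress.ip_address(host)
        for network in networks:
            if address in network:
                counts[str(network)] += 1
    return counts


# Two records shaped like the ones in the dump above.
sample = (
    b"\x00.META.,,1infoserver'J??10.110.24.88:60020"
    b"\x00.META.,,1infoserver&?S?410.110.24.55:60020("
)
print(addresses_in(sample))
# Assumed masks: the thread does not state them.
print(count_by_subnet(sample, ["10.110.24.0/24", "10.110.8.0/24"]))
```

A count of zero for the new subnet would only show that the file holds no address from it; it would not show which address the cluster is actually serving.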




--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Tuesday, March 16, 2010 5:15 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask 
change

> On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com> 
> wrote:
>> Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
>> like I have to regenerate my HBase data again
>> from my raw data.
>>
>
> We don't write machine names or IPs into the data.  You can take your
> data, copy it to another cluster altogether and it'll serve it without
> modification.
>
>>
>> Another observation I have is that if one region server dies, the regions
>> served by that server won't be accessible any
>> more because of this binding to IP. The only way to resolve this is to build
>> a new machine that takes over that IP and
>> add it to the cluster.
>>
>
> Your premise is incorrect.  The above does not hold at all.
>
> St.Ack
> 

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Stack <st...@duboce.net>.
On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
> like I have to regenerate my HBase data again
> from my raw data.
>

We don't write machine names or IPs into the data.  You can take your
data, copy it to another cluster altogether and it'll serve it without
modification.

>
> Another observation I have is that if one region server dies, the regions
> served by that server won't be accessible any
> more because of this binding to IP. The only way to resolve this is to build
> a new machine that takes over that IP and
> add it to the cluster.
>

Your premise is incorrect.  The above does not hold at all.

St.Ack

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Jinsong Hu <ji...@hotmail.com>.
Yes, DNS name resolution is working fine; it is not a DNS issue. It looks 
like I have to regenerate my HBase data again
from my raw data.

But it looks like this IP change issue should be documented, so people won't 
encounter it again.

Another observation I have is that if one region server dies, the regions 
served by that server won't be accessible any
more because of this binding to IP. The only way to resolve this is to build 
a new machine that takes over that IP and
add it to the cluster.

Unfortunately, HBase itself doesn't have HA built in. It would be nice 
if future versions of HBase added this high-availability support.

Jimmy.

--------------------------------------------------
From: "Ryan Rawson" <ry...@gmail.com>
Sent: Tuesday, March 16, 2010 4:28 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask 
change

> With the versioning of HBase, the old data is not overwritten so it
> may not be that issue.
>
> Can you check your DNS situation, see if the name to IP back to name
> resolution is working correctly for all nodes?
>
> -ryan
>

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Ryan Rawson <ry...@gmail.com>.
With the versioning of HBase, the old data is not overwritten so it
may not be that issue.

Can you check your DNS situation, see if the name to IP back to name
resolution is working correctly for all nodes?
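
One way to run that check on each node is sketched below. It is a minimal example, not a tool that ships with HBase: `check_round_trip` is a hypothetical helper, and the host name in the usage line is a placeholder. The resolver functions are passed as parameters so the logic can be exercised without a network; the defaults are the standard `socket` calls.

```python
import socket


def check_round_trip(hostname,
                     forward=socket.gethostbyname,
                     reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare the names.

    Returns a dict describing the result instead of raising, so the
    function can be mapped over a list of cluster nodes.
    """
    try:
        ip = forward(hostname)
        reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, ips)
    except OSError as exc:             # gaierror and herror are OSError subclasses
        return {"host": hostname, "ok": False, "error": str(exc)}
    # Strict comparison: a short name versus a fully qualified name is
    # reported as a mismatch, which is worth looking at by hand.
    matches = reverse_name.rstrip(".").lower() == hostname.rstrip(".").lower()
    return {"host": hostname, "ip": ip, "reverse": reverse_name, "ok": matches}


if __name__ == "__main__":
    # Placeholder host list; replace with the names of the cluster nodes.
    for node in ["localhost"]:
        print(check_round_trip(node))
```

A node that still resolves to a 10.110.24.x address, or whose new address resolves back to a different name, would be the first thing to fix before looking further.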

-ryan

On Tue, Mar 16, 2010 at 4:24 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, I restarted the cluster, and it doesn't help. It still tries to go to
> the old IP.
> I looked at the binary data in the /hbase/ROOT directory, and it turns out
> the old IP is in those data files.
>
> Jimmy

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Jinsong Hu <ji...@hotmail.com>.
Yes, I restarted the cluster, and it doesn't help. It still tries to go to 
the old IP.
I looked at the binary data in the /hbase/ROOT directory, and it turns out 
the old IP is in those data files.

Jimmy

--------------------------------------------------
From: "Ryan Rawson" <ry...@gmail.com>
Sent: Tuesday, March 16, 2010 4:16 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask 
change

> Have you restarted your cluster yet?  It looks like you changed the
> IPs of your machines, is that correct?

Re: can't read all hbase tables after hbase cluster IP netmask change

Posted by Ryan Rawson <ry...@gmail.com>.
Have you restarted your cluster yet?  It looks like you changed the
IPs of your machines, is that correct?

On Tue, Mar 16, 2010 at 4:06 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Hi, There:
>  I migrated all of my HBase machines from network 10.110.24.0 to 10.110.8.0, and now I can't read the contents of the tables
> any more. I run "bin/hbase shell" and then "scan 'TABLE_ABC'"; the log shows it is still trying to connect to region server 10.110.24.91, but that
> machine's IP has changed to 10.110.8.91, so the request fails.
>  Is there any way I can fix this issue so I can read the table contents again?
>
>
> Jimmy
>