Posted to user@hbase.apache.org by Jinsong Hu <ji...@hotmail.com> on 2010/03/17 00:06:14 UTC
can't read all hbase tables after hbase cluster IP netmask change
Hi there,
I migrated all of my HBase machines from network 10.110.24.0 to 10.110.8.0, and now I can't read the contents of the tables
any more. When I run "bin/hbase shell" and then "scan 'TABLE_ABC'", the log shows it is still trying to connect to region server 10.110.24.91, but that
machine's IP has changed to 10.110.8.91, so the request fails.
Is there any way I can fix this so I can read the table contents again?
Jimmy
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Ryan Rawson <ry...@gmail.com>.
Hey,
HBase does in fact do the things you say it does not. There is
something else going on here. You need to look at your logs and
possibly post some of them. Start with the master logs.
On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
> like I have to regenerate my HBase data again
> from my raw data.
>
> But it looks like this IP change issue should be documented, so people won't
> encounter it again.
>
> Another observation I have is that if one region server dies, the regions
> served by that server won't be accessible any
> more because of this binding to IP. The only way to resolve this is to build
> a new machine that takes over that IP and
> add it to the cluster.
>
> Unfortunately, HBase itself doesn't have HA built in. It would be nice
> if future versions of HBase added this high availability support.
>
> Jimmy.
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Ryan Rawson <ry...@gmail.com>.
With the versioning of HBase, the old data is not overwritten, so that
is not conclusive proof of the problem. The files you pasted hold
state data, and on restart that state is superseded by new records. I have
copied the HBase tables from one cluster to another and restarted it
successfully as well.
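To illustrate why an old address inside a store file is not conclusive, here is a minimal sketch of a cell that keeps several timestamped versions. It is a toy model, not HBase code: the class name, the timestamps, and the new address 10.110.8.55:60020 are assumptions made for illustration.

```python
class VersionedCell:
    """Toy model of a cell that keeps several timestamped versions."""

    def __init__(self, max_versions=10):
        self.max_versions = max_versions
        self._versions = {}  # timestamp -> value

    def put(self, timestamp, value):
        """Add a version; older versions stay until the retention limit is hit."""
        self._versions[timestamp] = value
        excess = len(self._versions) - self.max_versions
        if excess > 0:
            for ts in sorted(self._versions)[:excess]:
                del self._versions[ts]

    def latest(self):
        """Return the value with the newest timestamp (what a normal read sees)."""
        return self._versions[max(self._versions)]

    def all_values(self):
        """Return every retained value, oldest first (what a raw file dump shows)."""
        return [self._versions[ts] for ts in sorted(self._versions)]


cell = VersionedCell(max_versions=10)
cell.put(1, "10.110.24.55:60020")  # assignment recorded before the move
cell.put(2, "10.110.8.55:60020")   # hypothetical assignment recorded after restart
```

In this model a raw dump (`all_values()`) still shows the old address, while a read (`latest()`) returns the new one.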
I'm afraid we are going to need more context to help debug this issue.
What happens during a start is that the old server assignments are
noticed to be bad (since there is no record of those servers existing
any more), and new assignments are made and recorded in ROOT.
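The startup behaviour described above can be sketched roughly as follows. This is a simplified illustration and not the master's actual logic: the function name, the round-robin choice, the region name "TABLE_ABC,,1", and the new addresses are all assumptions.

```python
def reassign_stale_regions(assignments, live_servers):
    """Return a new region -> server map with stale assignments replaced.

    assignments: dict mapping region name to the recorded "host:port".
    live_servers: list of "host:port" strings for servers that checked in.
    """
    if not live_servers:
        raise ValueError("no live region servers to assign regions to")
    live = set(live_servers)
    updated = {}
    next_server = 0
    for region in sorted(assignments):
        server = assignments[region]
        if server in live:
            # The recorded server is still known, so keep the assignment.
            updated[region] = server
        else:
            # No record of that server any more: pick a live one (round-robin).
            updated[region] = live_servers[next_server % len(live_servers)]
            next_server += 1
    return updated


recorded = {
    ".META.,,1": "10.110.24.55:60020",
    "TABLE_ABC,,1": "10.110.24.91:60020",  # hypothetical region name
}
live = ["10.110.8.55:60020", "10.110.8.91:60020"]  # hypothetical new addresses
new_assignments = reassign_stale_regions(recorded, live)
```

If the cluster still contacts the old addresses after a restart, the logs should show whether a step like this ran at all.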
Another issue could be that your table is not entirely online. Try
> enable "table_name"
at the hbase shell and see if that helps.
-ryan
On Tue, Mar 16, 2010 at 5:31 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Here is the content of the binary file
> /hbase/-ROOT-/70236052/info/110463219710538426. You can see the IP address
> 10.110.24.55 inside.
>
> This shows that HBase is saving the IP addresses of my cluster into the ROOT
> data file. When I run the scan command
> under the shell, the log says it was trying to connect to 10.110.24.xx
> machines.
>
> Jimmy
>
>
>
> DATABLK*# ? .META.,,1 inforegioninfo &?Ph= .META.,,1 .META. IS_ROOT
> false IS_META true MEMSTORE_FLUSHSIZE 16384 historian BLOOMFILTER
> false COMPRESSION NONE VERSIONS2147483647 TTL 604800 BLOCKSIZE 8192
> IN_MEMORY false
> BLOCKCACHE false info BLOOMFILTER false COMPRESSION NONE VERSIONS 10 TTL
> 2147483647 BLOCKSIZE 8192 IN_MEMORY false
> BLOCKCACHE false I3 .META.,,1 infoserver 'J? ? 10.110.24.88:60020
> .META.,,1 infoserver '0?}? 10.110.24.85:60020 .META.,,1 infoserver '0?{?
> 10.110.24.56:60020 .META.,,1 infoserver ' &^? 10.110.24.54:60020
> .META.,,1 infoserver ' #?? 10.110.24.55:60020 .META.,,1 infoserver ' ??
> 10.110.24.54:60020 .META.,,1 infoserver ' B?? 10.110.24.56:60020
> .META.,,1 infoserver ' R?z 10.110.24.55:60020 .META.,,1 infoserver &?NN?
> 10.110.24.54:60020 .META.,,1 infoserver &?S?4 10.110.24.55:60020(
> .META.,,1 infoserverstartcode 'J? ? 'J? ( .META.,,1
> infoserverstartcode '0?}? '0?V?( .META.,,1 infoserverstartcode '0?{? '
> &?C( .META.,,1 infoserverstartcode ' &^? ' &5?( .META.,,1
> infoserverstartcode ' #?? ' ?( .META.,,1 infoserverstartcode ' ?? '
>
>>
>> ( .META.,,1 infoserverstartcode ' B?? ' Q??( .META.,,1
>> infoserverstartcode ' R?z ' Q??( .META.,,1 infoserverstartcode
>> &?NN? &?N&G( .META.,,1 infoserverstartcode &?S?4 &?R??
>> MAJOR_COMPACTION_KEY ? MAX_SEQ_ID_KEY N ? hfile.AVG_KEY_LEN #
>> hfile.AVG_VALUE_LEN ! hfile.COMPARATOR
>> 2org.apache.hadoop.hbase.KeyValue$RootKeyComparatorhfile.LASTKEY (
>> .META.,,1 infoserverstartcode &?S?4 IDXBLK)+ `# .META.,,1 inforegioninfo
>> &?Ph= TRABLK"$ ` D `
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Jinsong Hu <ji...@hotmail.com>.
Here is the content of the binary file
/hbase/-ROOT-/70236052/info/110463219710538426. You can see the IP address
10.110.24.55 inside.
This shows that HBase is saving the IP addresses of my cluster into the ROOT
data file. When I run the scan command
under the shell, the log says it was trying to connect to 10.110.24.xx
machines.
Jimmy
DATABLK*#? .META.,,1inforegioninfo&?Ph=.META.,,1.META.IS_ROOTfalseIS_METAtrueMEMSTORE_FLUSHSIZE16384 historianBLOOMFILTERfalseCOMPRESSIONNONEVERSIONS2147483647TTL604800 BLOCKSIZE8192 IN_MEMORYfalse
BLOCKCACHEfalseinfoBLOOMFILTERfalseCOMPRESSIONNONEVERSIONS10TTL
2147483647 BLOCKSIZE8192 IN_MEMORYfalse
BLOCKCACHEfalseI3 .META.,,1infoserver'J??10.110.24.88:60020.META.,,1infoserver'0?}?10.110.24.85:60020 .META.,,1infoserver'0?{?10.110.24.56:60020 .META.,,1infoserver'&^?10.110.24.54:60020 .META.,,1infoserver'#??10.110.24.55:60020 .META.,,1infoserver'??10.110.24.54:60020 .META.,,1infoserver'B??10.110.24.56:60020 .META.,,1infoserver'R?z10.110.24.55:60020 .META.,,1infoserver&?NN?10.110.24.54:60020 .META.,,1infoserver&?S?410.110.24.55:60020( .META.,,1infoserverstartcode'J??'J?( .META.,,1infoserverstartcode'0?}?'0?V?( .META.,,1infoserverstartcode'0?{?'&?C( .META.,,1infoserverstartcode'&^?'&5?( .META.,,1infoserverstartcode'#??'?( .META.,,1infoserverstartcode'??'
>( .META.,,1infoserverstartcode'B??'Q??(.META.,,1infoserverstartcode'R?z'Q??( .META.,,1infoserverstartcode&?NN?&?N&G( .META.,,1infoserverstartcode&?S?4&?R??MAJOR_COMPACTION_KEY?MAX_SEQ_ID_KEYN?hfile.AVG_KEY_LEN#hfile.AVG_VALUE_LEN!hfile.COMPARATOR2org.apache.hadoop.hbase.KeyValue$RootKeyComparatorhfile.LASTKEY( .META.,,1infoserverstartcode&?S?4IDXBLK)+`#
>.META.,,1inforegioninfo&?Ph=TRABLK"$`D`
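For anyone who wants to list the addresses in a dump like this without reading it by eye, here is a small sketch that pulls host:port strings with a given network prefix out of raw bytes. The function name is made up for illustration, and the sample bytes are a fragment copied from the dump above. It works on any bytes, for example a local copy of the file.

```python
import re


def addresses_with_prefix(data, prefix):
    """Return distinct "ip:port" strings in data whose IP starts with prefix.

    data: raw bytes, e.g. the contents of a local copy of a store file.
    prefix: dotted prefix of the network to look for, e.g. "10.110.24.".
    """
    pattern = re.compile(re.escape(prefix.encode("ascii")) + rb"\d{1,3}:\d{1,5}")
    found = []
    for match in pattern.finditer(data):
        text = match.group().decode("ascii")
        if text not in found:
            found.append(text)  # keep first-seen order, skip repeats
    return found


# Fragment of the dump above; note the stray "4" byte right before the first address.
sample = b"infoserver&?S?410.110.24.55:60020( .META.,,1infoserver'J??10.110.24.88:60020.META.,,1"
found = addresses_with_prefix(sample, "10.110.24.")
```

Anchoring the search on the known prefix avoids absorbing a binary byte that happens to be an ASCII digit. It cannot tell which of the addresses is the current version; it only shows what is present in the bytes.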
--------------------------------------------------
From: "Stack" <st...@duboce.net>
Sent: Tuesday, March 16, 2010 5:15 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask
change
> On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com>
> wrote:
>> Yes, DNS name resolution is working fine; it is not a DNS issue. It
>> looks like
>> I have to regenerate my HBase data again
>> from my raw data.
>>
>
> We don't write machine names or ips into the data. You can take your
> data, copy it to another cluster altogether and it'll serve it without
> modification.
>
>>
>> Another observation I have is that if one region server dies, the region
>> served by that server won't be accessible any
>> more because of this binding to IP. the only way to resolve this is to
>> build
>> a new machine that takes over that IP and
>> add it to cluster.
>>
>
> Your premise is incorrect. The above does not hold at all.
>
> St.Ack
>
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Stack <st...@duboce.net>.
On Tue, Mar 16, 2010 at 5:02 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
> like I have to regenerate my HBase data again
> from my raw data.
>
We don't write machine names or ips into the data. You can take your
data, copy it to another cluster altogether and it'll serve it without
modification.
>
> Another observation I have is that if one region server dies, the region
> served by that server won't be accessible any
> more because of this binding to IP. the only way to resolve this is to build
> a new machine that takes over that IP and
> add it to cluster.
>
Your premise is incorrect. The above does not hold at all.
St.Ack
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Jinsong Hu <ji...@hotmail.com>.
Yes, DNS name resolution is working fine; it is not a DNS issue. It looks
like I have to regenerate my HBase data again
from my raw data.
But it looks like this IP change issue should be documented, so people won't
encounter it again.
Another observation I have is that if one region server dies, the regions
served by that server won't be accessible any
more because of this binding to IP. The only way to resolve this is to build
a new machine that takes over that IP and
add it to the cluster.
Unfortunately, HBase itself doesn't have HA built in. It would be nice
if future versions of HBase added this high availability support.
Jimmy.
--------------------------------------------------
From: "Ryan Rawson" <ry...@gmail.com>
Sent: Tuesday, March 16, 2010 4:28 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask
change
> With the versioning of HBase, the old data is not overwritten so it
> may not be that issue.
>
> Can you check your DNS situation, see if the name to IP back to name
> resolution is working correctly for all nodes?
>
> -ryan
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Ryan Rawson <ry...@gmail.com>.
With the versioning of HBase, the old data is not overwritten so it
may not be that issue.
Can you check your DNS situation, see if the name to IP back to name
resolution is working correctly for all nodes?
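A minimal sketch of that round-trip check, using Python's standard socket module. The function name and the hostnames are hypothetical; the resolver arguments exist only so the check can be tried with stubs.

```python
import socket


def round_trip(hostname, forward=socket.gethostbyname, reverse=socket.gethostbyaddr):
    """Resolve hostname to an IP, resolve that IP back, and compare the names.

    Returns (ip, reverse_name, matches). Lookup errors from the resolver
    (for example socket.gaierror or socket.herror) are left to the caller.
    """
    ip = forward(hostname)
    reverse_name = reverse(ip)[0]  # gethostbyaddr returns (name, aliases, addresses)
    matches = reverse_name.lower().rstrip(".") == hostname.lower().rstrip(".")
    return ip, reverse_name, matches


# Example use on a real network (hostnames are placeholders):
# for host in ["rs54.example.com", "rs55.example.com"]:
#     print(host, round_trip(host))
```

Running it on every node, for every node's name, should show whether any machine still resolves to an address on the old network. A reverse lookup that returns a short name instead of the fully qualified one will be reported as a mismatch by this simple comparison.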
-ryan
On Tue, Mar 16, 2010 at 4:24 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Yes, I restarted the cluster, and it doesn't help. It still tries to go to
> the old IP.
> I looked at the binary data in the /hbase/ROOT directory, and it turns out
> the old IP is in those data files.
>
> Jimmy
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Jinsong Hu <ji...@hotmail.com>.
Yes, I restarted the cluster, and it doesn't help. It still tries to go to
the old IP.
I looked at the binary data in the /hbase/ROOT directory, and it turns out
the old IP is in those data files.
Jimmy
--------------------------------------------------
From: "Ryan Rawson" <ry...@gmail.com>
Sent: Tuesday, March 16, 2010 4:16 PM
To: <hb...@hadoop.apache.org>
Subject: Re: can't read all hbase tables after hbase cluster IP netmask
change
> Have you restarted your cluster yet? It looks like you changed the
> IPs of your machines, is that correct?
Re: can't read all hbase tables after hbase cluster IP netmask change
Posted by Ryan Rawson <ry...@gmail.com>.
Have you restarted your cluster yet? It looks like you changed the
IPs of your machines, is that correct?
On Tue, Mar 16, 2010 at 4:06 PM, Jinsong Hu <ji...@hotmail.com> wrote:
> Hi there,
> I migrated all of my HBase machines from network 10.110.24.0 to 10.110.8.0, and now I can't read the contents of the tables
> any more. When I run "bin/hbase shell" and then "scan 'TABLE_ABC'", the log shows it is still trying to connect to region server 10.110.24.91, but that
> machine's IP has changed to 10.110.8.91, so the request fails.
> Is there any way I can fix this so I can read the table contents again?
>
>
> Jimmy
>