Posted to user@hbase.apache.org by Rohit Nigam <rn...@decarta.com> on 2011/08/15 23:45:22 UTC

version mismatch exception

Hi Guys,

I changed the endkey of one of the records in the '.META.' table because
of the chaining issue we experienced, using a program which gets that
row and does a put so that the endkey could be changed. Now when I try
to view the record in the .META. table using the shell I get:

ERROR: org.apache.hadoop.io.VersionMismatchException: null

The master is also down and I can't bring it up; it is throwing this
exception:

2011-08-15 14:32:34,639 FATAL [master-doop10.dt.sv4.decarta.com:60000] master.HMaster(948): Unhandled exception. Starting shutdown.
A record version mismatch occured. Expecting v0, found v116
        at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
        at org.apache.hadoop.hbase.HRegionInfo.readFields(HRegionInfo.java:625)
        at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:105)
        at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
        at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
        at org.apache.hadoop.hbase.catalog.MetaReader.metaRowToRegionPairWithInfo(MetaReader.java:401)
        at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:1358)
        at org.apache.hadoop.hbase.master.AssignmentManager.processFailover(AssignmentManager.java:209)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:401)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)

Any help would be really appreciated. 

Thanks

Rohit


RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
Hi Guys

I was wondering if anybody can help me resolve this issue. I am not
able to bring up the HBase master; it throws this exception:

2011-08-15 15:59:59,722 FATAL [master-doop3.dt.sv4.decarta.com:60000] master.HMaster(948): Unhandled exception. Starting shutdown.
A record version mismatch occured. Expecting v0, found v116
        at org.apache.hadoop.io.VersionedWritable.readFields(VersionedWritable.java:46)
        at org.apache.hadoop.hbase.HRegionInfo.readFields(HRegionInfo.java:625)
        at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:105)
        at org.apache.hadoop.hbase.util.Writables.getWritable(Writables.java:75)
        at org.apache.hadoop.hbase.util.Writables.getHRegionInfo(Writables.java:119)
        at org.apache.hadoop.hbase.catalog.MetaReader.metaRowToRegionPair(MetaReader.java:379)
        at org.apache.hadoop.hbase.catalog.MetaReader$1.visit(MetaReader.java:181)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:265)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:237)
        at org.apache.hadoop.hbase.catalog.MetaReader.fullScan(MetaReader.java:192)
        at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:1222)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:398)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:283)
2011-08-15 15:59:59,724 INFO  [master-doop3.dt.sv4.decarta.com:60000] master.HMaster(991): Aborting
2011-08-15 15:59:59,726 INFO  [master-doop3.dt.sv4.decarta.com:60000] master.HMaster(571): Stopping infoServer
2011-08-15 15:59:59,726 INF

Any help would be appreciated.

Rohit



RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
Hi St.Ack
I was wondering if you could clarify our understanding of just changing the endkey of a record in the .META. table.

The way I am planning to do it is to get the HRegionInfo object for that existing region key from the .META. table, create a new HRegionInfo object with the updated endkey (start key and regionid being the same as in the result above), and do a put in the .META. table. I think if I just change the endkey and nothing else, it will not create a new row in the .META. table and will just update the existing row. Please confirm whether my theory is right.
Thanks
Rohit
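
A minimal sketch of why the put may or may not land on the existing row. It assumes the 0.90-era region name layout `<table>,<startKey>,<regionId>.<md5hex>.`, where the end key is not part of the name; that layout, the class name and the helper names are assumptions for illustration, not something stated in this thread. Under that assumption the .META. row key changes whenever the region id changes, which is consistent with Stack's remark that a freshly constructed hregioninfo "will have a different encoding to the original".

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class RegionNameSketch {

    // Hypothetical helper: builds "<table>,<startKey>,<regionId>.<md5hex>."
    // Note that the end key does not appear anywhere in the name.
    static String regionName(String table, String startKey, long regionId) {
        String prefix = table + "," + startKey + "," + regionId;
        return prefix + "." + md5Hex(prefix) + ".";
    }

    // Hex-encoded MD5 of the name prefix (assumed to be how the encoded name is derived).
    static String md5Hex(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(s.getBytes(StandardCharsets.UTF_8))) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String oldName = regionName("mytable", "row100", 1313000000000L);
        String sameId  = regionName("mytable", "row100", 1313000000000L);
        String newId   = regionName("mytable", "row100", 1313445922000L);
        System.out.println(oldName.equals(sameId)); // true: same row key in .META.
        System.out.println(oldName.equals(newId));  // false: a new row in .META.
    }
}
```

So the theory only holds if the new HRegionInfo really carries the same region id as the old one; a constructor that stamps a fresh id produces a different row key.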



-----Original Message-----
From: Rohit Nigam 
Sent: Wednesday, August 17, 2011 1:12 PM
To: Geoff Hendrey; 'Stack'; 'user@hbase.apache.org'
Cc: Search
Subject: RE: version mismatch exception

Hi St.Ack
The regions in the file system are good; all I am looking to do is change the end key of that region in the .META. table so that the chaining problem goes away. The way I am planning to do it is to get the HRegionInfo object for that existing region key from the .META. table, create a new HRegionInfo object with the updated endkey (start key and regionid being the same as in the result above), and do a put in the .META. table. I think if I just change the endkey and nothing else, it will not create a new row in the .META. table and will just update the existing row. Please confirm whether my theory is right.
Rohit

-----Original Message-----
From: Geoff Hendrey 
Sent: Wednesday, August 17, 2011 9:55 AM
To: 'Stack'; 'user@hbase.apache.org'
Subject: RE: version mismatch exception

Hi St.Ack,

Keying off of what you said: " Did you update the
info:regioninfo cell so it has a new hregioninfo with same start and
end row?   You know this makes a new region, rather than extend the
range of the previous region? (So the old region will be in the
filesystem still with the old data).  You should also make sure the
original region is closed before you delete the row from .META."

So I take it the best approach is to:
1) close the original region (the region whose .META.'s endkey we want to repoint)
2) delete the region's row from .META.
3) Put a new row into .META., the new row's hregioninfo having the desired endkey such that there is no more "hole" in .META.
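
To make the "hole" concrete: the regions of a table are expected to chain, each region's end key equal to the next region's start key. The following is a minimal sketch of that check over plain strings; the `Range` class and `findHoles` method are hypothetical stand-ins for the start and end keys held in each hregioninfo, not HBase API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MetaHoleSketch {

    // Hypothetical stand-in for the start/end keys of one region.
    static final class Range {
        final String start;
        final String end;
        Range(String start, String end) { this.start = start; this.end = end; }
    }

    // Expects regions sorted by start key. Reports every place where one
    // region's end key differs from the next region's start key.
    static List<String> findHoles(List<Range> sorted) {
        List<String> holes = new ArrayList<>();
        for (int i = 0; i + 1 < sorted.size(); i++) {
            String end = sorted.get(i).end;
            String nextStart = sorted.get(i + 1).start;
            if (!end.equals(nextStart)) {
                holes.add("[" + end + ", " + nextStart + ")");
            }
        }
        return holes;
    }

    public static void main(String[] args) {
        List<Range> regions = Arrays.asList(
                new Range("", "b"),
                new Range("b", "d"),   // ends at "d" ...
                new Range("f", ""));   // ... but the next region starts at "f"
        System.out.println(findHoles(regions)); // [[d, f)]
    }
}
```

Repairing the hole in this toy example would mean replacing the middle region with one whose end key is "f".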

I'm trying to nail down exactly the sequence of steps we should take so that we don't have to do scary manual surgery of -ROOT- and .META. like we did yesterday. We were pretty much fumbling around in the dark trying to figure out the structure of -ROOT-'s HDFS files, and same for .META. after our first failed attempt to update the endrow. We did figure it out, and removed the files from .META. and -ROOT- that prevented hbase from coming up. Our error was that instead of updating the endrow in .META. we inadvertently put a new row into .META. with default timestamp, and that basically shot everything to hell. I couldn't find docs on the structure of -ROOT- and .META. HDFS files, but we sort of pieced it together and were able to remove the newly created files in -ROOT- and .META. based on their creation times and grepping their content, after which hbase was able to come back up without error.

So, apologies for going slow on this, and really trying to exactly nail down the set of steps we should proceed with in order to avoid another self-inflicted corruption.

Best,
geoff


-----Original Message-----
From: Rohit Nigam 
Sent: Monday, August 15, 2011 8:46 PM
To: Stack; user@hbase.apache.org
Cc: Search
Subject: RE: version mismatch exception

So I got the info:regioninfo value from the .META. table for the key and did a put in the .META. table with the endkey changed. Yes, I did change the info:regioninfo cell, changing only the endkey. I didn't see the change because the master didn't come up and kept throwing the version mismatch exception. Somehow the version got changed; I have no idea how. I had to run add_table.rb for that table to restore everything so the master could come up. So my case is just to update the endkey of a row in the .META. table for a table region because the chain is broken. How do I do that so that this exception doesn't happen?
Rohit

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Monday, August 15, 2011 8:29 PM
To: user@hbase.apache.org
Cc: Search
Subject: Re: version mismatch exception

On Mon, Aug 15, 2011 at 2:45 PM, Rohit Nigam <rn...@decarta.com> wrote:
> I changed the endkey of one of the records in the '.META.' table because
> of the chaining issue we experienced ,  using the program which gets
> that row and does a put so that the  endkey could be changed , now when
> I try to view the record in .META. table using shell I get a
>

Tell me how you did this?  You removed the original row and replaced
it with another that has different end key?  Did you update the
info:regioninfo cell so it has a new hregioninfo with same start and
end row?   You know this makes a new region, rather than extend the
range of the previous region? (So the old region will be in the
filesystem still with the old data).  You should also make sure the
original region is closed before you delete the row from .META.


>
>
> ERROR: org.apache.hadoop.io.VersionMismatchException: null
>

Where did you get this from?


> 2011-08-15 14:32:34,639 FATAL [master-doop10.dt.sv4.decarta.com:60000]
> master.HMaster(948): Unhandled exception. Starting shutdown.
>
> A record version mismatch occured. Expecting v0, found v116
>

What were you doing when this happens?  It looks like we are
deserializing the wrong content?  Is that possible?

St.Ack
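
A minimal sketch of what a leading version-byte check does, using only java.io. It assumes, based on the stack trace frame in VersionedWritable.readFields, that deserialization begins by reading one byte and comparing it with the expected version; the class and method names here are hypothetical. The number 116 is the ASCII code for 't', so a cell beginning with plain text rather than a serialized hregioninfo would produce exactly this kind of message. That is one possible reading, in line with the "deserializing the wrong content" suspicion, not a confirmed diagnosis.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class VersionByteSketch {

    static final byte EXPECTED_VERSION = 0;

    // Reads the first byte of the cell and rejects it if it is not the expected version.
    static byte checkVersion(byte[] cell) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(cell))) {
            byte found = in.readByte();
            if (found != EXPECTED_VERSION) {
                throw new IllegalStateException(
                        "Expecting v" + EXPECTED_VERSION + ", found v" + found);
            }
            return found;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Text bytes written where a serialized record was expected.
        byte[] text = "table,start,123".getBytes(StandardCharsets.US_ASCII);
        try {
            checkVersion(text);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Expecting v0, found v116
        }
    }
}
```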

RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
Hi Guys
I would like to post the source code, following the steps described by St.Ack, for editing the ENDKEY of a record in the .META. table.

import java.util.logging.Level;
import java.util.logging.Logger;
import org.apache.hadoop.hbase.client.Get;
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HConstants;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.hbase.util.Writables;
import org.apache.hadoop.hbase.client.HBaseAdmin;


public class FixMetaTable {

    // Placeholder: replace with the full .META. row key (region name) of the region to fix.
    public static String regionNameKey = "RegionName";

    public static void main(String[] args) throws InterruptedException {
        try {
            System.out.println("Entering the Program to Edit .META. table");
            Configuration hConfig = HBaseConfiguration.create();
            hConfig.set("hbase.zookeeper.quorum", System.getProperty("zk"));

            HBaseAdmin admin = new HBaseAdmin(hConfig);
            
            HTable hTable = new HTable(hConfig, Bytes.toBytes(".META."));
            Get get = new Get(Bytes.toBytes(regionNameKey));
            Result result = hTable.get(get);
            byte[] bytes = result.getValue(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER);

            HRegionInfo closedRegion = Writables.getHRegionInfo(bytes);
            admin.closeRegion(closedRegion.getRegionName(), null);//. Close the existing region if open.
            System.out.println("Closed the Region " + closedRegion.getRegionNameAsString());




            HTable readTable = new HTable(hConfig, Bytes.toBytes(".META."));
            Get readGet = new Get(Bytes.toBytes(regionNameKey));
            Result readResult = readTable.get(readGet);
            byte[] readBytes = readResult.getValue(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER);

            HRegionInfo existingRegion = Writables.getHRegionInfo(readBytes); //Read the existing hregioninfo.

            System.out.println("Read the existing region info after closing " + existingRegion.getRegionNameAsString());

            // Reuse the table descriptor from the existing hregioninfo.
            HTableDescriptor descriptor = new HTableDescriptor(existingRegion.getTableDesc());
            // Only the end key should differ from the existing region; "STARTKEY" and "ENDKEY" are placeholders.
            HRegionInfo newRegion = new HRegionInfo(descriptor, Bytes.toBytes("STARTKEY"), Bytes.toBytes("ENDKEY"));

            byte[] value = Writables.getBytes(newRegion);

            // Insert the new entry in .META. using the new hregioninfo name as row key, with an
            // info:regioninfo cell whose contents is the serialized new hregioninfo.
            Put put = new Put(newRegion.getRegionName());
            put.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER, value);
            HTable metaTable = new HTable(hConfig, ".META.");
            metaTable.put(put);
            System.out.println("Put a new Region " + newRegion.getRegionNameAsString() + " End key is " + Bytes.toString(newRegion.getEndKey()));


            Delete del = new Delete(closedRegion.getRegionName());//Delete the original row from .META.
            metaTable.delete(del);

            System.out.println("Deleted the closed region " + closedRegion.getRegionNameAsString());

            admin.assign(newRegion.getRegionName(), true); //Assign the new region.
            System.out.println("Assigned the new region " + newRegion.getRegionNameAsString());

        } catch (IOException ex) {
            Logger.getLogger(FixMetaTable.class.getName()).log(Level.SEVERE, null, ex);
        }

    }
}

Before running this code, please take a copy of the region's files from the table's directory in the file system, using the encoded ID of the region being changed, which can be found in .META. Once the new region is created, copy the data back into the file system under the new encoded ID generated for the new region.

Thanks
Rohit

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Thursday, August 18, 2011 2:45 PM
To: Rohit Nigam
Cc: Geoff Hendrey; user@hbase.apache.org; Search
Subject: Re: version mismatch exception

What you think caused it?
St.Ack

On Thu, Aug 18, 2011 at 2:43 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Thanks St.Ack
> This really worked , was able to fix the hole .
> Thanks
> Rohit
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Thursday, August 18, 2011 11:29 AM
> To: Rohit Nigam
> Cc: Geoff Hendrey; user@hbase.apache.org; Search
> Subject: Re: version mismatch exception
>
> On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
>> Hi St.Ack
>> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.
>
> 1. Close the existing region if open.
> 2. Read the existing hregioninfo.
> 3. Use existing hregioninfo htabledescriptor and this construction,
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
> byte[], byte[]), to  make a new hregioninfo.   It will have a
> different encoding to the original.
> 4. Insert the new entry in .META. using new hregioninfo name as row
> key and add an info:regioninfo whose contents is the serialized new
> hregioninfo.
> 5. Delete the original row from .META.
> 6. Assign the new region.
>
> If you want the data from the old region in the new region, then you
> should copy any files in that are under the old entries directory into
> the new region directory (find the regions by using the encoded name;
> the encoded name is an attribute of hregioninfo).  After copying in
> the data, you'll need to reassign the region.  The files are only
> noticed on region open.
>
> St.Ack
>

RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
That is a mystery but would do some surgery on it.
Thanks
Rohit

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Thursday, August 18, 2011 2:45 PM
To: Rohit Nigam
Cc: Geoff Hendrey; user@hbase.apache.org; Search
Subject: Re: version mismatch exception

What you think caused it?
St.Ack

On Thu, Aug 18, 2011 at 2:43 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Thanks St.Ack
> This really worked , was able to fix the hole .
> Thanks
> Rohit
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Thursday, August 18, 2011 11:29 AM
> To: Rohit Nigam
> Cc: Geoff Hendrey; user@hbase.apache.org; Search
> Subject: Re: version mismatch exception
>
> On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
>> Hi St.Ack
>> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.
>
> 1. Close the existing region if open.
> 2. Read the existing hregioninfo.
> 3. Use existing hregioninfo htabledescriptor and this construction,
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
> byte[], byte[]), to  make a new hregioninfo.   It will have a
> different encoding to the original.
> 4. Insert the new entry in .META. using new hregioninfo name as row
> key and add an info:regioninfo whose contents is the serialized new
> hregioninfo.
> 5. Delete the original row from .META.
> 6. Assign the new region.
>
> If you want the data from the old region in the new region, then you
> should copy any files in that are under the old entries directory into
> the new region directory (find the regions by using the encoded name;
> the encoded name is an attribute of hregioninfo).  After copying in
> the data, you'll need to reassign the region.  The files are only
> noticed on region open.
>
> St.Ack
>

Re: version mismatch exception

Posted by Stack <st...@duboce.net>.
What you think caused it?
St.Ack

On Thu, Aug 18, 2011 at 2:43 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Thanks St.Ack
> This really worked , was able to fix the hole .
> Thanks
> Rohit
>
> -----Original Message-----
> From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
> Sent: Thursday, August 18, 2011 11:29 AM
> To: Rohit Nigam
> Cc: Geoff Hendrey; user@hbase.apache.org; Search
> Subject: Re: version mismatch exception
>
> On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
>> Hi St.Ack
>> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.
>
> 1. Close the existing region if open.
> 2. Read the existing hregioninfo.
> 3. Use existing hregioninfo htabledescriptor and this construction,
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
> byte[], byte[]), to  make a new hregioninfo.   It will have a
> different encoding to the original.
> 4. Insert the new entry in .META. using new hregioninfo name as row
> key and add an info:regioninfo whose contents is the serialized new
> hregioninfo.
> 5. Delete the original row from .META.
> 6. Assign the new region.
>
> If you want the data from the old region in the new region, then you
> should copy any files in that are under the old entries directory into
> the new region directory (find the regions by using the encoded name;
> the encoded name is an attribute of hregioninfo).  After copying in
> the data, you'll need to reassign the region.  The files are only
> noticed on region open.
>
> St.Ack
>

RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
Thanks St.Ack
This really worked , was able to fix the hole .
Thanks 
Rohit

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Thursday, August 18, 2011 11:29 AM
To: Rohit Nigam
Cc: Geoff Hendrey; user@hbase.apache.org; Search
Subject: Re: version mismatch exception

On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Hi St.Ack
> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.

1. Close the existing region if open.
2. Read the existing hregioninfo.
3. Use existing hregioninfo htabledescriptor and this construction,
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
byte[], byte[]), to  make a new hregioninfo.   It will have a
different encoding to the original.
4. Insert the new entry in .META. using new hregioninfo name as row
key and add an info:regioninfo whose contents is the serialized new
hregioninfo.
5. Delete the original row from .META.
6. Assign the new region.

If you want the data from the old region in the new region, then you
should copy any files in that are under the old entries directory into
the new region directory (find the regions by using the encoded name;
the encoded name is an attribute of hregioninfo).  After copying in
the data, you'll need to reassign the region.  The files are only
noticed on region open.

St.Ack

Re: version mismatch exception

Posted by Geoff Hendrey <gh...@decarta.com>.
This is awesome info!! Thank you!!

Sent from my iPhone

On Aug 18, 2011, at 11:29 AM, "Stack" <st...@duboce.net> wrote:

> On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
> > Hi St.Ack
> > The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.
> 
> 1. Close the existing region if open.
> 2. Read the existing hregioninfo.
> 3. Use existing hregioninfo htabledescriptor and this construction,
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
> byte[], byte[]), to  make a new hregioninfo.   It will have a
> different encoding to the original.
> 4. Insert the new entry in .META. using new hregioninfo name as row
> key and add an info:regioninfo whose contents is the serialized new
> hregioninfo.
> 5. Delete the original row from .META.
> 6. Assign the new region.
> 
> If you want the data from the old region in the new region, then you
> should copy any files in that are under the old entries directory into
> the new region directory (find the regions by using the encoded name;
> the encoded name is an attribute of hregioninfo).  After copying in
> the data, you'll need to reassign the region.  The files are only
> noticed on region open.
> 
> St.Ack

confirmed procedure for repairing hole in hbase metadata

Posted by Geoff Hendrey <gh...@decarta.com>.
Stack,

Thanks again for providing us this procedure. We had 5 holes in a big multi-terabyte table. We were able to repair them all and get back to work. Regarding your question of what caused the holes in the first place: we have no idea, unfortunately. Our load is read, write, and update and the rows contain about 1 MB of data in 1 column.

-geoff

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of Stack
Sent: Thursday, August 18, 2011 11:29 AM
To: Rohit Nigam
Cc: Geoff Hendrey; user@hbase.apache.org; Search
Subject: Re: version mismatch exception

On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Hi St.Ack
> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.

1. Close the existing region if open.
2. Read the existing hregioninfo.
3. Use existing hregioninfo htabledescriptor and this construction,
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
byte[], byte[]), to  make a new hregioninfo.   It will have a
different encoding to the original.
4. Insert the new entry in .META. using new hregioninfo name as row
key and add an info:regioninfo whose contents is the serialized new
hregioninfo.
5. Delete the original row from .META.
6. Assign the new region.

If you want the data from the old region in the new region, then you
should copy any files in that are under the old entries directory into
the new region directory (find the regions by using the encoded name;
the encoded name is an attribute of hregioninfo).  After copying in
the data, you'll need to reassign the region.  The files are only
noticed on region open.

St.Ack

Re: version mismatch exception

Posted by Stack <st...@duboce.net>.
On Wed, Aug 17, 2011 at 1:12 PM, Rohit Nigam <rn...@decarta.com> wrote:
> Hi St.Ack
> The region in the file System are good, all I am looking is to change the end key of that region in the .META. table so that chaining problem goes away .The way I am planning to do is to get the HRegionInfo object for that existing region key from the .META. table . Create a new HRegionInfo obj with the updated endkey , start key and regionid being the same as from the result above and do a put in the .META. table. I think I just change the endkey and nothing else it will not create  a new row in .META. table and would just update the existing row. Please confirm if my theory is right.

1. Close the existing region if open.
2. Read the existing hregioninfo.
3. Use existing hregioninfo htabledescriptor and this construction,
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/HRegionInfo.html#HRegionInfo(org.apache.hadoop.hbase.HTableDescriptor,
byte[], byte[]), to  make a new hregioninfo.   It will have a
different encoding to the original.
4. Insert the new entry in .META. using new hregioninfo name as row
key and add an info:regioninfo whose contents is the serialized new
hregioninfo.
5. Delete the original row from .META.
6. Assign the new region.

If you want the data from the old region in the new region, then you
should copy any files in that are under the old entries directory into
the new region directory (find the regions by using the encoded name;
the encoded name is an attribute of hregioninfo).  After copying in
the data, you'll need to reassign the region.  The files are only
noticed on region open.

St.Ack
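
A small sketch of how the old and new region directories mentioned above might be located from their encoded names. It assumes a filesystem layout of `<hbase.rootdir>/<table>/<encodedRegionName>`; that layout, the root directory "/hbase", the table name and both encoded names are illustrative assumptions and should be checked against the actual cluster before copying anything.

```java
public class RegionDirSketch {

    // Assumed layout: <rootDir>/<table>/<encodedRegionName>
    static String regionDir(String rootDir, String table, String encodedName) {
        // Drop a single trailing slash so the parts join cleanly.
        String root = rootDir.endsWith("/")
                ? rootDir.substring(0, rootDir.length() - 1)
                : rootDir;
        return root + "/" + table + "/" + encodedName;
    }

    public static void main(String[] args) {
        // Hypothetical encoded names; the real ones come from each hregioninfo.
        String oldDir = regionDir("/hbase", "mytable", "oldEncodedName");
        String newDir = regionDir("/hbase", "mytable", "newEncodedName");
        System.out.println("copy store files from " + oldDir + " to " + newDir);
    }
}
```

As noted above, the copied files are only picked up when the region is opened, so the region needs to be reassigned after the copy.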

RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
Hi St.Ack
The regions in the file system are good; all I am looking to do is change the end key of that region in the .META. table so that the chaining problem goes away. The way I am planning to do it is to get the HRegionInfo object for that existing region key from the .META. table, create a new HRegionInfo object with the updated endkey (start key and regionid being the same as in the result above), and do a put in the .META. table. I think if I just change the endkey and nothing else, it will not create a new row in the .META. table and will just update the existing row. Please confirm whether my theory is right.
Rohit


Re: version mismatch exception

Posted by Stack <st...@duboce.net>.
On Wed, Aug 17, 2011 at 9:54 AM, Geoff Hendrey <gh...@decarta.com> wrote:
> So I take it the best approach is to:
> 1) close the original region (the region whose .META.'s endkey we want to repoint)
> 2) delete the region's row from .META.
> 3) Put a new row into .META., the new row's hregioninfo having the desired endkey such that there is no more "hole" in .META.
>

This looks good.


> I'm trying to nail down exactly the sequence of steps we should take so that we don't have to do scary manual surgery of -ROOT- and .META. like we did yesterday. We were pretty much fumbling around in the dark trying to figure out the structure of -ROOT-'s HDFS files, and same for .META. after our first failed attempt to update the endrow. We did figure it out, and removed the files from .META. and -ROOT- that prevented hbase from coming up. Our error was that instead of updating the endrow in .META. we inadvertently put a new row into .META. with default timestamp, and that basically shot everything to hell. I couldn't find docs on the structure of -ROOT- and .META. HDFS files, but we sort of pieced it together and were able to remove the newly created files in -ROOT- and .META. based on their creation times and grepping their content, after which hbase was able to come back up without error.
>
> So, apologies for going slow on this, and really trying to exactly nail down the set of steps we should proceed with in order to avoid another self-inflicted corruption.
>

Ugh.  You shouldn't have to do this.  We owe better tools here.

Have you considered flushing .META. and -ROOT- and then copying them
aside, so that if you make an error next time around you could restart
atop the copies?

On the format of the .META. and -ROOT- hfiles, they are the same as any
other hfiles.  The format of the key in these tables could do with some
description, though, in that the 'startrow' is a region name (in -ROOT-
the startrow is the meta region name, which itself contains a startrow
that is the name of a user-space region; we should write this up).
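
To make the nesting described above concrete, here is a minimal sketch in
plain Java. It assumes a simplified region-name layout of
<table>,<start key>,<region id>; any suffix a particular HBase version may
append is left out. The class name, the helper regionName, and the sample
table, key, and id values are all hypothetical and exist only for
illustration.

```java
public class MetaRowKeySketch {

    // Assumed, simplified layout: <table>,<start key>,<region id>.
    // This is an illustration, not the HBase implementation.
    static String regionName(String table, String startKey, long regionId) {
        return table + "," + startKey + "," + regionId;
    }

    public static void main(String[] args) {
        // Row key of a user-space region as it would appear in .META.
        // (hypothetical table name, start key, and region id).
        String userRegion = regionName("mytable", "row500", 1313456789000L);

        // Row key of a .META. region as it would appear in -ROOT-:
        // its "start key" is itself the name of a user-space region.
        String metaRegion = regionName(".META.", userRegion, 1L);

        System.out.println(userRegion);
        System.out.println(metaRegion);
    }
}
```

Note that under this assumed layout the end key is not part of the row key at
all; it is carried only in the serialized hregioninfo stored in
info:regioninfo.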

St.Ack

RE: version mismatch exception

Posted by Geoff Hendrey <gh...@decarta.com>.
Hi St.Ack,

Keying off of what you said: " Did you update the
info:regioninfo cell so it has a new hregioninfo with same start and
end row?   You know this makes a new region, rather than extend the
range of the previous region? (So the old region will be in the
filesystem still with the old data).  You should also make sure the
original region is closed before you delete the row from .META."

So I take it the best approach is to:
1) close the original region (the region whose .META.'s endkey we want to repoint)
2) delete the region's row from .META.
3) Put a new row into .META., the new row's hregioninfo having the desired endkey such that there is no more "hole" in .META.
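
For reference, the "hole" mentioned in step 3 can be pictured with a small
stand-alone sketch. It does not talk to HBase; it only checks that a list of
(start key, end key) pairs, already sorted by start key, forms an unbroken
chain. The class RegionChainCheck, the Range type, and the sample keys are
hypothetical. The empty string stands for the open start of the first region
and the open end of the last one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RegionChainCheck {

    // Simplified stand-in for one .META. entry: only the key range is kept.
    static final class Range {
        final String startKey;
        final String endKey;

        Range(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    // Returns one message per break in the chain. Assumes the list is
    // sorted by start key, as rows in .META. are for a single table.
    static List<String> findBreaks(List<Range> sorted) {
        List<String> breaks = new ArrayList<>();
        if (sorted.isEmpty()) {
            return breaks;
        }
        if (!sorted.get(0).startKey.isEmpty()) {
            breaks.add("first region does not start at the empty key");
        }
        for (int i = 0; i + 1 < sorted.size(); i++) {
            String end = sorted.get(i).endKey;
            String nextStart = sorted.get(i + 1).startKey;
            if (!end.equals(nextStart)) {
                breaks.add("end key '" + end + "' does not match next start key '"
                        + nextStart + "'");
            }
        }
        if (!sorted.get(sorted.size() - 1).endKey.isEmpty()) {
            breaks.add("last region does not end at the empty key");
        }
        return breaks;
    }

    public static void main(String[] args) {
        // Broken chain: the first region ends at row500, the next starts at row700.
        List<Range> broken = Arrays.asList(
                new Range("", "row500"),
                new Range("row700", ""));
        System.out.println(findBreaks(broken));

        // Repaired chain: the first region's end key now meets the next start key.
        List<Range> repaired = Arrays.asList(
                new Range("", "row700"),
                new Range("row700", ""));
        System.out.println(findBreaks(repaired));
    }
}
```

Running a check of this kind against the key ranges read from .META. before
and after the edit is one way to confirm the hole is gone without touching
any files by hand.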

I'm trying to nail down exactly the sequence of steps we should take so that we don't have to do scary manual surgery of -ROOT- and .META. like we did yesterday. We were pretty much fumbling around in the dark trying to figure out the structure of -ROOT-'s HDFS files, and same for .META. after our first failed attempt to update the endrow. We did figure it out, and removed the files from .META. and -ROOT- that prevented hbase from coming up. Our error was that instead of updating the endrow in .META. we inadvertently put a new row into .META. with default timestamp, and that basically shot everything to hell. I couldn't find docs on the structure of -ROOT- and .META. HDFS files, but we sort of pieced it together and were able to remove the newly created files in -ROOT- and .META. based on their creation times and grepping their content, after which hbase was able to come back up without error.

So, apologies for going slow on this, and really trying to exactly nail down the set of steps we should proceed with in order to avoid another self-inflicted corruption.

Best,
geoff



Re: version mismatch exception

Posted by Stack <st...@duboce.net>.
On Mon, Aug 15, 2011 at 8:46 PM, Rohit Nigam <rn...@decarta.com> wrote:
> So I got the info:regioninfo value from the .META. table for the key and did a put in the .META. table with the change in the end key. Yes, I did change the info:regioninfo cell, changing just the end key. I didn't see the change, as the master didn't come up and kept throwing the version mismatch exception.
> Somehow the version got changed, no idea how. I had to run add_table.rb for that table to restore the whole thing so that the master could come up. So my case is to just update the end key of a row in the .META. table for a table region because the chain is broken. How do I do that so that this exception doesn't happen?
>


My guess is that you did not put a serialized HRegionInfo up into
.META.?  Is that possible?
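
One observation that fits this guess: 116 is the ASCII code for the letter
't', so "found v116" is what a reader would report if the first byte of the
cell were plain text rather than a version byte. The sketch below is a
stand-alone illustration of that effect and does not use the Hadoop or HBase
classes. The class VersionByteCheck, the method checkVersion, and the sample
bytes are hypothetical; the only thing taken from the log is that the reader
expected version 0.

```java
import java.nio.charset.StandardCharsets;

public class VersionByteCheck {

    // The log above says the reader expected v0.
    static final byte EXPECTED_VERSION = 0;

    // Mimics a versioned reader: the first byte of the cell is taken as the
    // version and compared with the expected one.
    static String checkVersion(byte[] cell) {
        byte found = cell[0];
        if (found != EXPECTED_VERSION) {
            return "Expecting v" + EXPECTED_VERSION + ", found v" + found;
        }
        return "ok";
    }

    public static void main(String[] args) {
        // A cell that starts with the version byte, as a versioned writer
        // would produce, followed by some payload bytes.
        byte[] payload = "payload".getBytes(StandardCharsets.US_ASCII);
        byte[] serialized = new byte[payload.length + 1];
        serialized[0] = EXPECTED_VERSION;
        System.arraycopy(payload, 0, serialized, 1, payload.length);
        System.out.println(checkVersion(serialized)); // prints: ok

        // A cell holding plain text (hypothetical value): the first byte is
        // 't', which is 116 in ASCII.
        byte[] text = "table,startkey,1313456789000".getBytes(StandardCharsets.US_ASCII);
        System.out.println(checkVersion(text)); // prints: Expecting v0, found v116
    }
}
```

This does not prove what was written to the cell; it only shows that a value
beginning with the character 't' would produce exactly the numbers seen in
the exception.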
St.Ack

RE: version mismatch exception

Posted by Rohit Nigam <rn...@decarta.com>.
So I got the info:regioninfo value from the .META. table for the key and did a put in the .META. table with the change in the end key. Yes, I did change the info:regioninfo cell, changing just the end key. I didn't see the change, as the master didn't come up and kept throwing the version mismatch exception. Somehow the version got changed, no idea how. I had to run add_table.rb for that table to restore the whole thing so that the master could come up. So my case is to just update the end key of a row in the .META. table for a table region because the chain is broken. How do I do that so that this exception doesn't happen?
Rohit


Re: version mismatch exception

Posted by Stack <st...@duboce.net>.
On Mon, Aug 15, 2011 at 2:45 PM, Rohit Nigam <rn...@decarta.com> wrote:
> I changed the endkey of one of the records in the '.META.' table because
> of the chaining issue we experienced ,  using the program which gets
> that row and does a put so that the  endkey could be changed , now when
> I try to view the record in .META. table using shell I get a
>

Tell me how you did this?  You removed the original row and replaced
it with another that has different end key?  Did you update the
info:regioninfo cell so it has a new hregioninfo with same start and
end row?   You know this makes a new region, rather than extend the
range of the previous region? (So the old region will be in the
filesystem still with the old data).  You should also make sure the
original region is closed before you delete the row from .META.


>
>
> ERROR: org.apache.hadoop.io.VersionMismatchException: null
>

Where did you get this from?


> 2011-08-15 14:32:34,639 FATAL [master-doop10.dt.sv4.decarta.com:60000]
> master.HMaster(948): Unhandled exception. Starting shutdown.
>
> A record version mismatch occured. Expecting v0, found v116
>

What were you doing when this happened?  It looks like we are
deserializing the wrong content.  Is that possible?

St.Ack