Posted to user@hbase.apache.org by Jorn Argelo - Ephorus <Jo...@ephorus.com> on 2011/12/07 09:27:57 UTC

CopyTable to remote cluster runs OK but doesn't copy anything

Hi all,

 

I'm trying to copy a table from one cluster to another cluster but this
does not seem to do what I expect it to do. The Map/Reduce job runs
successfully as you can see below, but it's not actually copying
anything to the remote cluster. It almost looks as if it's not parsing
the --peer.adr option and is just copying the data within the same cluster.
At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same"
warning would suggest that.
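For reference, the --peer.adr value is a ZooKeeper ensemble spec of the form <quorum>:<client-port>:<znode-parent>. A quick way to sanity-check the pieces of the address used below (plain string handling in the shell, not HBase itself):

```shell
# --peer.adr has the shape <zk-quorum>:<client-port>:<znode-parent>
PEER="hbase-test1:2181:/hbase"
echo "$PEER" | cut -d: -f1    # ZK quorum of the target cluster
echo "$PEER" | cut -d: -f2    # ZK client port
echo "$PEER" | cut -d: -f3-   # parent znode (f3- keeps any later colons)
```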

 

Both clusters are running CDH3u1 and are both fully distributed,
although hbase-test1 is a single physical server running all the
components of a fully distributed setup. The source cluster I am running
the job from is a small 10-node cluster. Note that on hbase-test1 the
target table already exists, with the same column families as in the
source cluster.

 

Does anybody have any idea what I'm doing wrong? Or maybe I found a bug?
Someone on Stack Overflow reported the same issue
(http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one-hbase-cluster-to-another-cluster)
but nobody has responded to it.

 

Thanks,

Jorn

 

 

$ hbase org.apache.hadoop.hbase.mapreduce.CopyTable
--peer.adr=hbase-test1:2181:/hbase chunk

11/12/07 08:52:24 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011 16:48
GMT

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:host.name=namenode1

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.6.0_26

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Sun Microsystems Inc.

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=<lots of jars, snipped out to prevent spam>

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=/usr/lib/hbase/bin/../lib/native/Linux-amd
64-64:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/tmp

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:java.compiler=<NA>

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:os.name=Linux

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:os.arch=amd64

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:os.version=2.6.32-33-server

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:user.name=mapred

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:user.home=/usr/lib/hadoop-0.20

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
environment:user.dir=/usr/lib/hadoop-0.20

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
connection, connectString=hbase-test1:2181 sessionTimeout=10000
watcher=hconnection

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection
to server hbase-test1/10.30.10.10:2181

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
established to hbase-test1/10.30.10.10:2181, initiating session

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
complete on server hbase-test1/10.30.10.10:2181, sessionid =
0x134126b2d250040, negotiated timeout = 10000

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Lookedup root
region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
Implementation@105691e; hsa=hbase-test1:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
.META.,,1.1028785192 is hbase-test1:60020

11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
row=chunk,,00000000000000 for max=10 rows

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,,1323181597686.f527a21a31a39559a2f4cbd034d286a7. is
hbase-test1:60020

11/12/07 08:52:25 INFO mapreduce.TableOutputFormat: Created table
instance for chunk

11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
connection, connectString=z01:2181,zk02:2181,zk03:2181
sessionTimeout=10000 watcher=hconnection

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket connection
to server zk02/10.30.4.93:2181

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
established to zk02/10.30.4.93:2181, initiating session

11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
complete on server zzk02/10.30.4.93:2181, sessionid = 0x233922a0b320c81,
negotiated timeout = 10000

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Lookedup root
region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
Implementation@3c3a1834; hsa=datanode1:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
.META.,,1.1028785192 is datanode1:60020

11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
row=chunk,,00000000000000 for max=10 rows

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,,1323179990451.a243a485325744b9eedd8da2106712b6. is
datanode3:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,array_for_lithiumion_battery_anode_material,1323179990451.6161771e
cadc7a45acd28afbfca88a09. is datanode3:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,cytotoxic_elements_were_leached_from_the,1323179991855.5cf99f0425d
9e0e9fd41ac9645d65b93. is datanode3:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,generation_cephalosporin_just_before_biopsy,1323179991855.db0d0df2
c2e076bf01c75ae6ac200436. is datanode3:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,linked_with_the_2008_great_wenchuan,1323179964329.32ee9e359e50582b
f1b419396c9aa8ad. is datanode2:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,pca_with_gabor_decomposition_offered_several,1323179964329.ed0b5be
c77229df3b6a5a08c117db355. is datanode2:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,see_34\xE2\x80\x93_43however_when_many_partial,1323179993570.cf10c
9b3a05b1b26437c674af2b61cfc. is datanode3:60020

11/12/07 08:52:25 DEBUG
client.HConnectionManager$HConnectionImplementation: Cached location for
chunk,the_wheelsground_contact_between_vehicle_tire,1323179993570.ee5baf
fa6cdce4ced50c6a9c10beca75. is datanode3:60020

11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting at
row=chunk,,00000000000000 for max=2147483647 rows

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 0 -> datanode3:,array_for_lithiumion_battery_anode_material

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 1 ->
datanode3:array_for_lithiumion_battery_anode_material,cytotoxic_elements
_were_leached_from_the

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 2 ->
datanode3:cytotoxic_elements_were_leached_from_the,generation_cephalospo
rin_just_before_biopsy

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 3 ->
datanode3:generation_cephalosporin_just_before_biopsy,linked_with_the_20
08_great_wenchuan

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 4 ->
datanode2:linked_with_the_2008_great_wenchuan,pca_with_gabor_decompositi
on_offered_several

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 5 ->
datanode2:pca_with_gabor_decomposition_offered_several,see_34\xE2\x80\x9
3_43however_when_many_partial

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 6 ->
datanode3:see_34\xE2\x80\x93_43however_when_many_partial,the_wheelsgroun
d_contact_between_vehicle_tire

11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits: split
-> 7 -> datanode3:the_wheelsground_contact_between_vehicle_tire,

11/12/07 08:52:25 INFO mapred.JobClient: Running job:
job_201111021158_0026

11/12/07 08:52:26 INFO mapred.JobClient:  map 0% reduce 0%

11/12/07 08:56:22 INFO mapred.JobClient:  map 12% reduce 0%

11/12/07 08:56:53 INFO mapred.JobClient:  map 25% reduce 0%

11/12/07 08:59:05 INFO mapred.JobClient:  map 37% reduce 0%

11/12/07 08:59:51 INFO mapred.JobClient:  map 50% reduce 0%

11/12/07 09:00:31 INFO mapred.JobClient:  map 62% reduce 0%

11/12/07 09:00:35 INFO mapred.JobClient:  map 75% reduce 0%

11/12/07 09:00:43 INFO mapred.JobClient:  map 87% reduce 0%

11/12/07 09:01:02 INFO mapred.JobClient:  map 100% reduce 0%

11/12/07 09:01:03 INFO mapred.JobClient: Job complete:
job_201111021158_0026

11/12/07 09:01:03 INFO mapred.JobClient: Counters: 13

11/12/07 09:01:03 INFO mapred.JobClient:   Job Counters 

11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=3306288

11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0

11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
maps waiting after reserving slots (ms)=0

11/12/07 09:01:03 INFO mapred.JobClient:     Rack-local map tasks=8

11/12/07 09:01:03 INFO mapred.JobClient:     Launched map tasks=13

11/12/07 09:01:03 INFO mapred.JobClient:     Data-local map tasks=5

11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0

11/12/07 09:01:03 INFO mapred.JobClient:   FileSystemCounters

11/12/07 09:01:03 INFO mapred.JobClient:     HDFS_BYTES_READ=1254

11/12/07 09:01:03 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=523502

11/12/07 09:01:03 INFO mapred.JobClient:   Map-Reduce Framework

11/12/07 09:01:03 INFO mapred.JobClient:     Map input records=26892941

11/12/07 09:01:03 INFO mapred.JobClient:     Spilled Records=0

11/12/07 09:01:03 INFO mapred.JobClient:     Map output records=26892941

11/12/07 09:01:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1254

 

 


RE: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Jorn Argelo - Ephorus <Jo...@ephorus.com>.
Hi all,

To follow up on this:
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication is
showing exactly the same behaviour as CopyTable.

Jorn

-----Original message-----
From: Jorn Argelo - Ephorus [mailto:Jorn.Argelo@ephorus.com]
Sent: Thursday, December 8, 2011 9:59
To: user@hbase.apache.org
Subject: RE: CopyTable to remote cluster runs OK but doesn't copy
anything

Hi Jon / J-D,

Yeah, I had a bunch of additional stuff in my classpath which we needed
for other M/R jobs:
/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*

I tried just removing /etc/zookeeper from the classpath but then I still
had the same result. After removing that whole line from the classpath I
ended up with a working CopyTable. I could see that the MapReduce job
was now caching jars in /tmp which it didn't do before.
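The fix described here amounts to trimming the extra entries out of the classpath line in hadoop-env.sh. As a sketch (the HADOOP_CLASSPATH variable name and CDH3-style paths are the usual convention, not copied from the poster's actual file):

```shell
# hadoop-env.sh (sketch, CDH3-style layout)
# Before: /etc/zookeeper put a zoo.cfg on every task's classpath, so the
# HBase client in each map task connected to the local cluster's ZK:
# export HADOOP_CLASSPATH=/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*

# After: drop the line entirely (as described above), so the job's own
# HBase configuration wins:
unset HADOOP_CLASSPATH
```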

Maybe it's worthwhile to add this info to HBASE-4614? Let me know if
there's any way I can assist with testing.

Thanks a lot for your support.

Jorn

-----Original message-----
From: Jonathan Hsieh [mailto:jon@cloudera.com]
Sent: Wednesday, December 7, 2011 19:09
To: user@hbase.apache.org
Subject: Re: CopyTable to remote cluster runs OK but doesn't copy
anything

Jorn,

I recently ran into this problem. CopyTable actually is copying data to
the same instance of the table, likely because an HBase client in the
MR job is picking up its settings from a zoo.cfg file.

Have you added `hbase classpath` to your hadoop-env.sh file? Can you
check if zoo.cfg (possibly as /etc/zookeeper/* in CDH) is in the class
path of the task trackers?

If it is, you may want to remove it from there and then add the ZK
settings to your hbase-site.xml file.
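One quick way to do that check, given a task tracker's classpath string (e.g. copied from `ps` output or from hadoop-env.sh), is to split it on ':' and look for ZooKeeper entries. An illustrative sketch, using the kind of classpath discussed in this thread:

```shell
# Split a classpath on ':' and flag entries that could hold a zoo.cfg.
# (Illustrative: paste in the real classpath of your TaskTracker process.)
TT_CLASSPATH="/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/lib/*"
echo "$TT_CLASSPATH" | tr ':' '\n' | grep zookeeper
```

Here the grep prints /etc/zookeeper, which is exactly the entry that would shadow the job's intended ZK quorum.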

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans
<jd...@apache.org>wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Re: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Jonathan Hsieh <jo...@cloudera.com>.
Jorn,

Did you restart MapReduce (the task trackers in particular) after the
change? Hopefully when you restart you can check the TTs to make sure
that /etc/zookeeper/* is not in their class path.

Jon.

> > >
> > > 11/12/07 08:52:25 DEBUG
> > > client.HConnectionManager$HConnectionImplementation: Cached location
> for
> > >
> chunk,see_34\xE2\x80\x93_43however_when_many_partial,1323179993570.cf10c
> > > 9b3a05b1b26437c674af2b61cfc. is datanode3:60020
> > >
> > > 11/12/07 08:52:25 DEBUG
> > > client.HConnectionManager$HConnectionImplementation: Cached location
> for
> > >
> chunk,the_wheelsground_contact_between_vehicle_tire,1323179993570.ee5baf
> > > fa6cdce4ced50c6a9c10beca75. is datanode3:60020
> > >
> > > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting
> at
> > > row=chunk,,00000000000000 for max=2147483647 rows
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 0 -> datanode3:,array_for_lithiumion_battery_anode_material
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 1 ->
> > >
> datanode3:array_for_lithiumion_battery_anode_material,cytotoxic_elements
> > > _were_leached_from_the
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 2 ->
> > >
> datanode3:cytotoxic_elements_were_leached_from_the,generation_cephalospo
> > > rin_just_before_biopsy
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 3 ->
> > >
> datanode3:generation_cephalosporin_just_before_biopsy,linked_with_the_20
> > > 08_great_wenchuan
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 4 ->
> > >
> datanode2:linked_with_the_2008_great_wenchuan,pca_with_gabor_decompositi
> > > on_offered_several
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 5 ->
> > >
> datanode2:pca_with_gabor_decomposition_offered_several,see_34\xE2\x80\x9
> > > 3_43however_when_many_partial
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 6 ->
> > >
> datanode3:see_34\xE2\x80\x93_43however_when_many_partial,the_wheelsgroun
> > > d_contact_between_vehicle_tire
> > >
> > > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
> split
> > > -> 7 -> datanode3:the_wheelsground_contact_between_vehicle_tire,
> > >
> > > 11/12/07 08:52:25 INFO mapred.JobClient: Running job:
> > > job_201111021158_0026
> > >
> > > 11/12/07 08:52:26 INFO mapred.JobClient:  map 0% reduce 0%
> > >
> > > 11/12/07 08:56:22 INFO mapred.JobClient:  map 12% reduce 0%
> > >
> > > 11/12/07 08:56:53 INFO mapred.JobClient:  map 25% reduce 0%
> > >
> > > 11/12/07 08:59:05 INFO mapred.JobClient:  map 37% reduce 0%
> > >
> > > 11/12/07 08:59:51 INFO mapred.JobClient:  map 50% reduce 0%
> > >
> > > 11/12/07 09:00:31 INFO mapred.JobClient:  map 62% reduce 0%
> > >
> > > 11/12/07 09:00:35 INFO mapred.JobClient:  map 75% reduce 0%
> > >
> > > 11/12/07 09:00:43 INFO mapred.JobClient:  map 87% reduce 0%
> > >
> > > 11/12/07 09:01:02 INFO mapred.JobClient:  map 100% reduce 0%
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient: Job complete:
> > > job_201111021158_0026
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient: Counters: 13
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:   Job Counters
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:
> SLOTS_MILLIS_MAPS=3306288
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > > reduces waiting after reserving slots (ms)=0
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > > maps waiting after reserving slots (ms)=0
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Rack-local map tasks=8
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Launched map tasks=13
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Data-local map tasks=5
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:   FileSystemCounters
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     HDFS_BYTES_READ=1254
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:
> FILE_BYTES_WRITTEN=523502
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:   Map-Reduce Framework
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Map input
> records=26892941
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Spilled Records=0
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     Map output
> records=26892941
> > >
> > > 11/12/07 09:01:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1254
> > >
> > >
> > >
> > >
> > >
> >
>
>
>
> --
> // Jonathan Hsieh (shay)
> // Software Engineer, Cloudera
> // jon@cloudera.com
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

RE: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Stuti Awasthi <st...@hcl.com>.
Hi,
I have also tried CopyTable between different clusters and it worked fine for me. I set the hbase.zookeeper.quorum property in the HBase conf file. I used Hadoop 0.20.2.
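For reference, a minimal hbase-site.xml fragment for pinning the client to a specific ZooKeeper quorum looks roughly like this (the hostnames are placeholders, not values taken from this thread):

```xml
<!-- hbase-site.xml: tell the HBase client which ZooKeeper quorum to use,
     instead of letting a stray zoo.cfg on the classpath decide it -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk01.example.com,zk02.example.com,zk03.example.com</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```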

Thanks

-----Original Message-----
From: Jorn Argelo - Ephorus [mailto:Jorn.Argelo@ephorus.com]
Sent: Thursday, December 08, 2011 2:29 PM
To: user@hbase.apache.org
Subject: RE: CopyTable to remote cluster runs OK but doesn't copy anything

Hi Jon / J-D,

Yeah, I had a bunch of additional stuff in my classpath which we needed for other M/R jobs:
/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*

I tried just removing /etc/zookeeper from the classpath but then I still had the same result. After removing that whole line from the classpath I ended up with a working CopyTable. I could see that the MapReduce job was now caching jars in /tmp which it didn't do before.

Maybe it's worthwhile to add this info to HBASE-4614? Let me know if there's any way I can assist with testing.

Thanks a lot for your support.

Jorn

-----Original Message-----
From: Jonathan Hsieh [mailto:jon@cloudera.com]
Sent: Wednesday, December 7, 2011 7:09 PM
To: user@hbase.apache.org
Subject: Re: CopyTable to remote cluster runs OK but doesn't copy anything

Jorn,

I recently ran into this problem. CopyTable is actually copying the data back into the same instance of the table, likely because an HBase client in the MR job is picking up its settings from a zoo.cfg file.

Have you added `hbase classpath` to your hadoop-env.sh file? Can you check whether zoo.cfg (possibly as /etc/zookeeper/* in CDH) is on the classpath of the task trackers?

If it is, you may want to remove it from there and add the ZK settings to your hbase-site.xml file instead.

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans
<jd...@apache.org> wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614
>


RE: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Jorn Argelo - Ephorus <Jo...@ephorus.com>.
Hi Jon / J-D,

Yeah, I had a bunch of additional stuff in my classpath which we needed for other M/R jobs:
/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/hadoop-0.20/lib/*:/usr/lib/zookeeper/*:/usr/lib/zookeeper/lib/*

I tried removing just /etc/zookeeper from the classpath, but that gave the same result. After removing the whole line from the classpath I ended up with a working CopyTable. I could see that the MapReduce job was now caching jars in /tmp, which it didn't do before.

Maybe it's worthwhile to add this info to HBASE-4614? Let me know if there's any way I can assist with testing.
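To make the classpath check concrete, here is a small sketch. The CP value below just mirrors the classpath quoted above; on a real node you would feed in the output of `hbase classpath` or the task tracker's launch command instead:

```shell
#!/bin/sh
# Sketch: split a colon-separated classpath and flag ZooKeeper entries
# that could shadow CopyTable's --peer.adr setting.
# CP is illustrative; in practice use the actual classpath of the
# task trackers (e.g. the output of `hbase classpath`).
CP="/etc/zookeeper:/etc/hadoop-0.20/conf:/usr/lib/hadoop-0.20/*:/usr/lib/zookeeper/*"
echo "$CP" | tr ':' '\n' | grep zookeeper
```

Any line this prints (here /etc/zookeeper and /usr/lib/zookeeper/*) is a place where a zoo.cfg could be picked up ahead of the job's own ZK settings.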

Thanks a lot for your support.

Jorn

-----Original Message-----
From: Jonathan Hsieh [mailto:jon@cloudera.com]
Sent: Wednesday, December 7, 2011 7:09 PM
To: user@hbase.apache.org
Subject: Re: CopyTable to remote cluster runs OK but doesn't copy anything

Jorn,

I recently ran into this problem. CopyTable is actually copying the data back into the same instance of the table, likely because an HBase client in the MR job is picking up its settings from a zoo.cfg file.

Have you added `hbase classpath` to your hadoop-env.sh file? Can you check whether zoo.cfg (possibly as /etc/zookeeper/* in CDH) is on the classpath of the task trackers?

If it is, you may want to remove it from there and add the ZK settings to your hbase-site.xml file instead.

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans
<jd...@apache.org> wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614
>
> On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus
> <Jo...@ephorus.com> wrote:
> > Hi all,
> >
> >
> >
> > I'm trying to copy a table from one cluster to another cluster but
this
> > does not seem to do what I expect it to do. The Map/Reduce job runs
> > successfully as you can see below, but it's not actually copying
> > anything to the remote cluster. It almost looks as if it's not
parsing
> > the --peer.adr option and just copies the data inside the same
cluster.
> > At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the
same"
> > warning would suggest that.
> >
> >
> >
> > Both clusters are running CHD3U1 and are both fully distributed,
> > although hbase-test1 is a single physical server running all
components
> > for a fully distributed setup. The source cluster where I am running
the
> > job from is a small 10 node cluster. Note that on hbase-test1 the
target
> > table already exists with the same column families as in the source
> > cluster.
> >
> >
> >
> > Does anybody have any idea what I'm doing wrong? Or maybe I found a
bug?
> > There's another guy at stackoverflow reporting the same issue
> >
(http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one
> > -hbase-cluster-to-another-cluster) but nobody responded on that.
> >
> >
> >
> > Thanks,
> >
> > Jorn
> >
> >
> >
> >
> >
> > $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> > --peer.adr=hbase-test1:2181:/hbase chunk
> >
> > 11/12/07 08:52:24 WARN mapred.JobClient: Use GenericOptionsParser
for
> > parsing the arguments. Applications should implement Tool for the
same.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:zookeeper.version=3.3.3-cdh3u1--1, built on 07/18/2011
16:48
> > GMT
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:host.name=namenode1
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.version=1.6.0_26
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.vendor=Sun Microsystems Inc.
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.26/jre
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.class.path=<lots of jars, snipped out to prevent
spam>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> >
environment:java.library.path=/usr/lib/hbase/bin/../lib/native/Linux-amd
> > 64-64:/usr/lib/hadoop-0.20/lib/native/Linux-amd64-64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.io.tmpdir=/tmp
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:java.compiler=<NA>
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.name=Linux
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.arch=amd64
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:os.version=2.6.32-33-server
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.name=mapred
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.home=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Client
> > environment:user.dir=/usr/lib/hadoop-0.20
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
> > connection, connectString=hbase-test1:2181 sessionTimeout=10000
> > watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket
connection
> > to server hbase-test1/10.30.10.10:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
> > established to hbase-test1/10.30.10.10:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
> > complete on server hbase-test1/10.30.10.10:2181, sessionid =
> > 0x134126b2d250040, negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Lookedup root
> > region location,
> >
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
> > Implementation@105691e; hsa=hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> > .META.,,1.1028785192 is hbase-test1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting
at
> > row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> > chunk,,1323181597686.f527a21a31a39559a2f4cbd034d286a7. is
> > hbase-test1:60020
> >
> > 11/12/07 08:52:25 INFO mapreduce.TableOutputFormat: Created table
> > instance for chunk
> >
> > 11/12/07 08:52:25 INFO zookeeper.ZooKeeper: Initiating client
> > connection, connectString=z01:2181,zk02:2181,zk03:2181
> > sessionTimeout=10000 watcher=hconnection
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Opening socket
connection
> > to server zk02/10.30.4.93:2181
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Socket connection
> > established to zk02/10.30.4.93:2181, initiating session
> >
> > 11/12/07 08:52:25 INFO zookeeper.ClientCnxn: Session establishment
> > complete on server zzk02/10.30.4.93:2181, sessionid =
0x233922a0b320c81,
> > negotiated timeout = 10000
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Lookedup root
> > region location,
> >
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnection
> > Implementation@3c3a1834; hsa=datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> > .META.,,1.1028785192 is datanode1:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting
at
> > row=chunk,,00000000000000 for max=10 rows
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> > chunk,,1323179990451.a243a485325744b9eedd8da2106712b6. is
> > datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,array_for_lithiumion_battery_anode_material,1323179990451.6161771e
> > cadc7a45acd28afbfca88a09. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,cytotoxic_elements_were_leached_from_the,1323179991855.5cf99f0425d
> > 9e0e9fd41ac9645d65b93. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,generation_cephalosporin_just_before_biopsy,1323179991855.db0d0df2
> > c2e076bf01c75ae6ac200436. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,linked_with_the_2008_great_wenchuan,1323179964329.32ee9e359e50582b
> > f1b419396c9aa8ad. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,pca_with_gabor_decomposition_offered_several,1323179964329.ed0b5be
> > c77229df3b6a5a08c117db355. is datanode2:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,see_34\xE2\x80\x93_43however_when_many_partial,1323179993570.cf10c
> > 9b3a05b1b26437c674af2b61cfc. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG
> > client.HConnectionManager$HConnectionImplementation: Cached location
for
> >
chunk,the_wheelsground_contact_between_vehicle_tire,1323179993570.ee5baf
> > fa6cdce4ced50c6a9c10beca75. is datanode3:60020
> >
> > 11/12/07 08:52:25 DEBUG client.MetaScanner: Scanning .META. starting
at
> > row=chunk,,00000000000000 for max=2147483647 rows
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 0 -> datanode3:,array_for_lithiumion_battery_anode_material
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 1 ->
> >
datanode3:array_for_lithiumion_battery_anode_material,cytotoxic_elements
> > _were_leached_from_the
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 2 ->
> >
datanode3:cytotoxic_elements_were_leached_from_the,generation_cephalospo
> > rin_just_before_biopsy
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 3 ->
> >
datanode3:generation_cephalosporin_just_before_biopsy,linked_with_the_20
> > 08_great_wenchuan
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 4 ->
> >
datanode2:linked_with_the_2008_great_wenchuan,pca_with_gabor_decompositi
> > on_offered_several
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 5 ->
> >
datanode2:pca_with_gabor_decomposition_offered_several,see_34\xE2\x80\x9
> > 3_43however_when_many_partial
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 6 ->
> >
datanode3:see_34\xE2\x80\x93_43however_when_many_partial,the_wheelsgroun
> > d_contact_between_vehicle_tire
> >
> > 11/12/07 08:52:25 DEBUG mapreduce.TableInputFormatBase: getSplits:
split
> > -> 7 -> datanode3:the_wheelsground_contact_between_vehicle_tire,
> >
> > 11/12/07 08:52:25 INFO mapred.JobClient: Running job:
> > job_201111021158_0026
> >
> > 11/12/07 08:52:26 INFO mapred.JobClient:  map 0% reduce 0%
> >
> > 11/12/07 08:56:22 INFO mapred.JobClient:  map 12% reduce 0%
> >
> > 11/12/07 08:56:53 INFO mapred.JobClient:  map 25% reduce 0%
> >
> > 11/12/07 08:59:05 INFO mapred.JobClient:  map 37% reduce 0%
> >
> > 11/12/07 08:59:51 INFO mapred.JobClient:  map 50% reduce 0%
> >
> > 11/12/07 09:00:31 INFO mapred.JobClient:  map 62% reduce 0%
> >
> > 11/12/07 09:00:35 INFO mapred.JobClient:  map 75% reduce 0%
> >
> > 11/12/07 09:00:43 INFO mapred.JobClient:  map 87% reduce 0%
> >
> > 11/12/07 09:01:02 INFO mapred.JobClient:  map 100% reduce 0%
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Job complete:
> > job_201111021158_0026
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient: Counters: 13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Job Counters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:
SLOTS_MILLIS_MAPS=3306288
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > reduces waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Total time spent by all
> > maps waiting after reserving slots (ms)=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Rack-local map tasks=8
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Launched map tasks=13
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Data-local map tasks=5
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   FileSystemCounters
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     HDFS_BYTES_READ=1254
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:
FILE_BYTES_WRITTEN=523502
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:   Map-Reduce Framework
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map input
records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Spilled Records=0
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     Map output
records=26892941
> >
> > 11/12/07 09:01:03 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1254
> >
> >
> >
> >
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Re: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Jonathan Hsieh <jo...@cloudera.com>.
Jorn,

I recently ran into this problem.  CopyTable is actually copying the data
to the same instance of the table, likely because an HBase client in the MR
job is picking up its ZooKeeper settings from a zoo.cfg file.

Have you added `hbase classpath` to your hadoop-env.sh file?  Can you check
whether zoo.cfg (possibly under /etc/zookeeper/* in CDH) is on the classpath
of the task trackers?
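
If it helps, one way to eyeball a classpath for ZooKeeper config entries is
to split it on ':' (the string below is illustrative; on a real node you
would feed in the output of `hbase classpath` instead):

```shell
# Illustrative check: split a classpath on ':' and flag ZooKeeper entries.
# Replace the example string with the real output of `hbase classpath`.
CP="/usr/lib/hbase/conf:/etc/zookeeper:/usr/lib/hadoop-0.20/conf"
echo "$CP" | tr ':' '\n' | grep -i zookeeper
```

If that prints a directory containing zoo.cfg, the client in the MR task can
pick it up ahead of the --peer.adr setting.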

If it is, you may want to remove it from there and add the ZK settings to
your hbase-site.xml file instead.
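
A minimal sketch of what those ZK settings could look like in hbase-site.xml
(the hostnames here are placeholders, substitute your own quorum):

```xml
<!-- Hypothetical quorum entries; substitute your own ZooKeeper hosts -->
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>zk01,zk02,zk03</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
```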

Jon.

On Wed, Dec 7, 2011 at 9:31 AM, Jean-Daniel Cryans <jd...@apache.org> wrote:

> It would most likely be this bug:
> https://issues.apache.org/jira/browse/HBASE-4614
>
> On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus
> <Jo...@ephorus.com> wrote:
> > Hi all,
> >
> >
> >
> > I'm trying to copy a table from one cluster to another cluster but this
> > does not seem to do what I expect it to do. The Map/Reduce job runs
> > successfully as you can see below, but it's not actually copying
> > anything to the remote cluster. It almost looks as if it's not parsing
> > the --peer.adr option and just copies the data inside the same cluster.
> > At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
> > parsing the arguments. Applications should implement Tool for the same"
> > warning would suggest that.
> >
> >
> >
> > Both clusters are running CHD3U1 and are both fully distributed,
> > although hbase-test1 is a single physical server running all components
> > for a fully distributed setup. The source cluster where I am running the
> > job from is a small 10 node cluster. Note that on hbase-test1 the target
> > table already exists with the same column families as in the source
> > cluster.
> >
> >
> >
> > Does anybody have any idea what I'm doing wrong? Or maybe I found a bug?
> > There's another guy at stackoverflow reporting the same issue
> > (http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one
> > -hbase-cluster-to-another-cluster) but nobody responded on that.
> >
> >
> >
> > Thanks,
> >
> > Jorn
> >
> >
> >
> >
> >
> > $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> > --peer.adr=hbase-test1:2181:/hbase chunk
> >
> > <log output snipped; it duplicates the original message>
> >
>



-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

Re: CopyTable to remote cluster runs OK but doesn't copy anything

Posted by Jean-Daniel Cryans <jd...@apache.org>.
It would most likely be this bug:
https://issues.apache.org/jira/browse/HBASE-4614

On Wed, Dec 7, 2011 at 12:27 AM, Jorn Argelo - Ephorus
<Jo...@ephorus.com> wrote:
> Hi all,
>
>
>
> I'm trying to copy a table from one cluster to another cluster but this
> does not seem to do what I expect it to do. The Map/Reduce job runs
> successfully as you can see below, but it's not actually copying
> anything to the remote cluster. It almost looks as if it's not parsing
> the --peer.adr option and just copies the data inside the same cluster.
> At least, the "WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same"
> warning would suggest that.
>
>
>
> Both clusters are running CHD3U1 and are both fully distributed,
> although hbase-test1 is a single physical server running all components
> for a fully distributed setup. The source cluster where I am running the
> job from is a small 10 node cluster. Note that on hbase-test1 the target
> table already exists with the same column families as in the source
> cluster.
>
>
>
> Does anybody have any idea what I'm doing wrong? Or maybe I found a bug?
> There's another guy at stackoverflow reporting the same issue
> (http://stackoverflow.com/questions/7952213/how-to-copy-a-table-from-one
> -hbase-cluster-to-another-cluster) but nobody responded on that.
>
>
>
> Thanks,
>
> Jorn
>
>
>
>
>
> $ hbase org.apache.hadoop.hbase.mapreduce.CopyTable
> --peer.adr=hbase-test1:2181:/hbase chunk
>
> <log output snipped; it duplicates the original message>
>