Posted to dev@hbase.apache.org by Adrian Sandulescu <sa...@gmail.com> on 2013/10/09 22:32:03 UTC

exportSnapshot MISSING_FILES

Hello everyone,

Apologies if this isn't the appropriate place to ask this question.


What I am trying to do is perfect a method of exporting snapshots to S3 for
backup purposes. (I don't have another HDFS cluster on hand to back up to, and S3
seems like a more fault-tolerant option anyway.)

Exporting works great simply by using the S3 path with -copy-to.
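
For context, here is a minimal sketch of how the export can be driven from
Java via ToolRunner (the snapshot name, bucket, and keys below are placeholders):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
    import org.apache.hadoop.util.ToolRunner;

    public class ExportSnapshotToS3 {
        public static void main(String[] args) throws Exception {
            Configuration config = HBaseConfiguration.create();
            // Placeholder bucket and credentials; the keys are embedded in the
            // URL the same way they show up in the logs below.
            String copyTo = "s3n://ACCESS_KEY:SECRET_KEY@hbase-export/hbase";
            int ret = ToolRunner.run(config, new ExportSnapshot(), new String[] {
                    "-snapshot", "my_snapshot",
                    "-copy-to", copyTo,
                    "-mappers", "4" });
            System.exit(ret);
        }
    }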

To import the snapshot back into HDFS from S3, I override Hadoop's
configuration to point the root directory and defaultFS at the S3 path.
This also works great, but only when using the s3n:// protocol (which is
subject to S3's 5 GB per-file limit).


Here is the output from the two attempts:

From a successful s3n:// import:

2013-10-08 14:57:04,816 INFO
org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy file
input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
2013-10-08 14:57:04,965 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening
's3n://AKIAIWNEBAESDM4DKBGA:UEFai8K1IBrjOKlXB2hbIbTFDJ8apubuA01LCc02@hbase-export/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1'
for reading
2013-10-08 14:57:05,039 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1'
for reading at position '0'
2013-10-08 14:57:05,299 INFO
org.apache.hadoop.hbase.snapshot.ExportSnapshot: Skip copy
v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
to hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1,
same file.
2013-10-08 14:57:05,300 INFO
org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy completed for
input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1

From a failed s3:// import:

2013-10-08 15:27:21,810 INFO
org.apache.hadoop.hbase.snapshot.ExportSnapshot: copy file
input=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
output=hdfs://mycluster:8020/hbase/.archive/campaign_digital_ad_time_dev/d9e6cff519bdd232d7a7f8eb676d92ee/v/6912d21c32754e7cb34ac85952a67ce1
2013-10-08 15:27:21,834 ERROR
org.apache.hadoop.hbase.snapshot.ExportSnapshot: Unable to open source
file=v/campaign_digital_ad_time_dev=d9e6cff519bdd232d7a7f8eb676d92ee-6912d21c32754e7cb34ac85952a67ce1
java.io.IOException: No such file.
	at org.apache.hadoop.fs.s3.S3FileSystem.checkFile(S3FileSystem.java:181)
	at org.apache.hadoop.fs.s3.S3FileSystem.open(S3FileSystem.java:246)
	at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.tryOpen(FileLink.java:289)
	at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.<init>(FileLink.java:120)
	at org.apache.hadoop.hbase.io.FileLink$FileLinkInputStream.<init>(FileLink.java:111)
	at org.apache.hadoop.hbase.io.FileLink.open(FileLink.java:390)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.openSourceFile(ExportSnapshot.java:302)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:175)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:146)
	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:95)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
	at org.apache.hadoop.mapred.Child.main(Child.java:262)



Here are the variables I set for import:

            // Override the Hadoop configuration to point at S3.
            // fs.default.name is the older key and fs.defaultFS its newer
            // equivalent, so both are set.
            config.set("fs.default.name", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
            config.set("fs.defaultFS", s3protocol + accessKey + ":" + accessSecret + "@" + bucketName);
            config.set("fs.s3.awsAccessKeyId", accessKey);
            config.set("fs.s3.awsSecretAccessKey", accessSecret);
            config.set("hbase.tmp.dir", "/tmp/hbase-${user.name}");
            config.set("hbase.rootdir", s3Url);

Has anyone seen this before?
I'm trying to debug it myself, but I'm having a pretty hard time (I'm not a
developer).

Thank you,
Adrian