Posted to dev@hbase.apache.org by "Stephen Yuan Jiang (JIRA)" <ji...@apache.org> on 2015/05/06 01:05:59 UTC
[jira] [Created] (HBASE-13625) Use HDFS for HFileOutputFormat2 partitioner's path
Stephen Yuan Jiang created HBASE-13625:
------------------------------------------
Summary: Use HDFS for HFileOutputFormat2 partitioner's path
Key: HBASE-13625
URL: https://issues.apache.org/jira/browse/HBASE-13625
Project: HBase
Issue Type: Bug
Components: mapreduce
Affects Versions: 2.0.0, 1.1.0, 1.2.0
Reporter: Stephen Yuan Jiang
Assignee: Stephen Yuan Jiang
HBASE-13010 changed the hard-coded '/tmp' in the HFileOutputFormat2 partitioner's path to 'hadoop.tmp.dir'. This breaks unit tests on Windows.
{code}
static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
...
// create the partitions file
- FileSystem fs = FileSystem.get(job.getConfiguration());
- Path partitionsPath = new Path("/tmp", "partitions_" + UUID.randomUUID());
+ FileSystem fs = FileSystem.get(conf);
+ Path partitionsPath = new Path(conf.get("hadoop.tmp.dir"), "partitions_" + UUID.randomUUID());
{code}
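On Windows, 'hadoop.tmp.dir' resolves to a local path with a drive letter (e.g. C:/...). The unit tests run against a mini DFS cluster, so FileSystem.get(conf) returns a DistributedFileSystem, and DistributedFileSystem.getPathName() rejects the ':' in the drive letter before any data is written. A minimal sketch of the failure mode (run on Windows; the fs.defaultFS URI and local path below are illustrative, not taken from the test setup):
{code}
import java.util.UUID;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PartitionsPathRepro {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:8020");  // default FS is HDFS
    conf.set("hadoop.tmp.dir", "C:/hbase-server/target/test-data/hadoop_tmp");

    FileSystem fs = FileSystem.get(conf);               // DistributedFileSystem
    Path partitionsPath =
        new Path(conf.get("hadoop.tmp.dir"), "partitions_" + UUID.randomUUID());
    // Throws IllegalArgumentException ("... is not a valid DFS filename"):
    // the drive-letter ':' is not allowed in an HDFS path, and the check in
    // getPathName() fires before the create() RPC is ever issued.
    fs.create(partitionsPath);
  }
}
{code}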
Here is the exception from one of the unit tests when run on Windows (from branch-1.1). The ':' in the drive letter is an invalid character in a DFS filename:
{code}
java.lang.IllegalArgumentException: Pathname /C:/hbase-server/target/test-data/d25e2228-8959-43ee-b413-4fa69cdb8032/hadoop_tmp/partitions_fb96c0a0-41e6-4964-a391-738cb761ee3e from C:/hbase-server/target/test-data/d25e2228-8959-43ee-b413-4fa69cdb8032/hadoop_tmp/partitions_fb96c0a0-41e6-4964-a391-738cb761ee3e is not a valid DFS filename.
	at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:197)
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:106)
	at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448)
	at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:444)
	at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387)
	at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909)
	at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:1074)
	at org.apache.hadoop.io.SequenceFile$RecordCompressWriter.<init>(SequenceFile.java:1374)
	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:275)
	at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:297)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.writePartitions(HFileOutputFormat2.java:335)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configurePartitioner(HFileOutputFormat2.java:593)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:440)
	at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2.configureIncrementalLoad(HFileOutputFormat2.java:405)
	at org.apache.hadoop.hbase.mapreduce.ImportTsv.createSubmittableJob(ImportTsv.java:539)
	at org.apache.hadoop.hbase.mapreduce.ImportTsv.run(ImportTsv.java:720)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hbase.mapreduce.TestImportTsv.doMROnTableTest(TestImportTsv.java:313)
	at org.apache.hadoop.hbase.mapreduce.TestImportTsv.testBulkOutputWithoutAnExistingTable(TestImportTsv.java:168)
{code}
The proposed fix is to introduce a configuration property that points to an HDFS directory for the partitions file, so the path is valid on the cluster's default FileSystem regardless of the local OS.
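A minimal sketch of such a fix, assuming a new property (the name 'hbase.fs.tmp.dir' and its default below are illustrative, not necessarily the committed patch); the surrounding method structure mirrors configurePartitioner as it appears in the stack trace above:
{code}
static void configurePartitioner(Job job, List<ImmutableBytesWritable> splitPoints)
    throws IOException {
  Configuration conf = job.getConfiguration();
  // Sketch: pick a directory on the default (HDFS) FileSystem instead of the
  // node-local 'hadoop.tmp.dir'; the property name and default are assumed.
  String hbaseTmpFsDir = conf.get("hbase.fs.tmp.dir",
      "/user/" + System.getProperty("user.name") + "/hbase-staging");
  FileSystem fs = FileSystem.get(conf);
  Path partitionsPath = fs.makeQualified(
      new Path(hbaseTmpFsDir, "partitions_" + UUID.randomUUID()));
  fs.deleteOnExit(partitionsPath);
  writePartitions(conf, partitionsPath, splitPoints);

  // Hand the partitions file to the TotalOrderPartitioner as before.
  job.setPartitionerClass(TotalOrderPartitioner.class);
  TotalOrderPartitioner.setPartitionFile(conf, partitionsPath);
}
{code}
With a qualified HDFS path (e.g. hdfs://namenode/user/<user>/hbase-staging/partitions_<uuid>), the drive-letter problem goes away because the path no longer depends on the local filesystem layout.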
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)