Posted to user@hbase.apache.org by Geoff Hendrey <gh...@decarta.com> on 2011/03/03 06:08:04 UTC

follow-up question on TotalOrderPartitioner

I upgraded my client to 0.90.1 per the suggestion (although the
server is still 0.89). I no longer get a NullPointerException when I try
to use TotalOrderPartitioner. However, I cannot get the
TotalOrderPartitioner to actually create the partition file, even though
the message "hadoopbackport.InputSampler: Using 64 samples" is printed,
which indicates my custom sampler is running and generating partition
points. Can someone take a quick look at my code below and tell me what
I'm missing? I use a fully qualified path name, and I even set
"total.order.partitioner.path" explicitly. All my println statements
indicate the partitions file has been written, but no such file exists.
Even more strangely, when the mapreduce job starts, it complains that
"File _partition.lst does not exist" (even though I've explicitly told
it to use a file named "partitions-file" as opposed to the default
"_partition.lst").

 

        Path input = FileInputFormat.getInputPaths(job)[0];
        input = input.makeQualified(input.getFileSystem(config));
        Path partitionFilePath = new Path(input, "partitions-file");
        TotalOrderPartitioner.setPartitionFile(config, partitionFilePath);
        job.getConfiguration().set("total.order.partitioner.path",
            partitionFilePath.toString());
        job.setPartitionerClass(TotalOrderPartitioner.class);
        System.out.println("TotalOrderPartitioner thinks its partition file is: "
            + TotalOrderPartitioner.getPartitionFile(config));
        job.setNumReduceTasks(100);
        InputSampler.Sampler randomSampler =
            new RandomKeySampler<Text, HitList>(100);
        InputSampler.writePartitionFile(job, randomSampler);
        System.out.println("wrote partition file: "
            + TotalOrderPartitioner.getPartitionFile(config));

 

Any help greatly appreciated since I spent the day looking through
TotalOrderPartitioner and can't find what I'm doing wrong. Thanks!

 

-geoff

        


RE: follow-up question on TotalOrderPartitioner

Posted by Geoff Hendrey <gh...@decarta.com>.
Oops... I may have spotted the problem. I was accidentally reading the
*mapred* TotalOrderPartitioner source code. It looks as though the
hadoopbackport TotalOrderPartitioner uses
"mapreduce.totalorderpartitioner.path" as the job configuration
parameter name, not the "total.order.partitioner.path" I set by hand.
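
To make the suspected mismatch concrete, here is a minimal sketch that
does not use Hadoop at all. It only models a lookup that reads one key
and falls back to a default, which would be enough to produce the "File
_partition.lst does not exist" complaint. The two key names and the
default file name come from this thread; the class name, the
resolvePartitionFile helper, the Map standing in for the job
configuration, and the "/input/partitions-file" path are hypothetical.
Whether this fully explains the missing file is an assumption, not
something confirmed against the hadoopbackport source.

```java
import java.util.HashMap;
import java.util.Map;

public class PartitionKeyDemo {
    // Key the hadoopbackport TotalOrderPartitioner appears to read (per this thread).
    static final String BACKPORT_KEY = "mapreduce.totalorderpartitioner.path";
    // Key set by hand in the original code, taken from the *mapred* source.
    static final String MAPRED_KEY = "total.order.partitioner.path";
    // Default file name that shows up in the error message.
    static final String DEFAULT_FILE = "_partition.lst";

    // Hypothetical stand-in for the partitioner's lookup:
    // read exactly one key, otherwise fall back to the default name.
    static String resolvePartitionFile(Map<String, String> conf) {
        String value = conf.get(BACKPORT_KEY);
        return value != null ? value : DEFAULT_FILE;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // Setting only the mapred-style key is invisible to the lookup.
        conf.put(MAPRED_KEY, "/input/partitions-file");
        System.out.println(resolvePartitionFile(conf));

        // Setting the key the lookup actually reads takes effect.
        conf.put(BACKPORT_KEY, "/input/partitions-file");
        System.out.println(resolvePartitionFile(conf));
    }
}
```

If this is what is happening, going through the setPartitionFile and
getPartitionFile methods of whichever TotalOrderPartitioner class is
actually imported, rather than hard-coding a key string, should keep the
writer and the reader in agreement about the parameter name.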
