Posted to common-dev@hadoop.apache.org by "Julien Serdaru (JIRA)" <ji...@apache.org> on 2013/05/01 03:23:13 UTC
[jira] [Created] (HADOOP-9530) DBInputSplit creates one invalid range on Oracle
Julien Serdaru created HADOOP-9530:
--------------------------------------
Summary: DBInputSplit creates one invalid range on Oracle
Key: HADOOP-9530
URL: https://issues.apache.org/jira/browse/HADOOP-9530
Project: Hadoop Common
Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Julien Serdaru
The DBInputFormat on Oracle does not create valid ranges.
The getSplits method of DBInputFormat, line 263, is as follows:
split = new DBInputSplit(i * chunkSize, (i * chunkSize)
+ chunkSize);
So the first split will have a start value of 0 (0*chunkSize).
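To make the ranges concrete, here is a minimal sketch of the arithmetic in the quoted line. The class name, the ranges helper and the way chunkSize is derived (row count divided by number of chunks) are assumptions for illustration only, not the actual Hadoop code.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: mirrors the start/end arithmetic of the quoted line.
class SplitRangeSketch {

    // Hypothetical helper returning {start, end} pairs for each split.
    // Assumption: chunkSize is the row count divided by the number of chunks.
    static List<long[]> ranges(long count, int chunks) {
        long chunkSize = count / chunks;
        List<long[]> out = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            // Same expression as in the report: start = i * chunkSize
            out.add(new long[] { i * chunkSize, (i * chunkSize) + chunkSize });
        }
        return out;
    }

    public static void main(String[] args) {
        // With 100 rows and 4 chunks, the first range starts at 0.
        for (long[] r : ranges(100, 4)) {
            System.out.println(r[0] + " .. " + r[1]);
        }
    }
}
```

With 100 rows and 4 chunks this yields the starts 0, 25, 50 and 75, so the first split always has a start value of 0.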
However, line 84 of OracleDBRecordReader is as follows:
if (split.getLength() > 0 && split.getStart() > 0){
Since the start value of the first range is equal to 0, the condition is false for that split and we skip the block that partitions the input set. As a result, one of the map tasks will process the entire data set rather than its partition.
I'm assuming the fix is trivial and would involve removing the second check (split.getStart() > 0) from the if condition.
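The effect of the two conditions can be sketched as follows. This is not the OracleDBRecordReader source: the class, the method names and the ROWNUM wrapping are hypothetical, and the wrapping uses the common Oracle paging idiom purely to show what is lost when the block is skipped.

```java
// Illustrative only: compares the reported condition with the proposed one.
class OraclePagingSketch {

    // Condition as quoted in the report: false whenever start == 0.
    static boolean pagesAsReported(long start, long length) {
        return length > 0 && start > 0;
    }

    // Condition with the proposed fix: the start check is dropped.
    static boolean pagesWithProposedFix(long start, long length) {
        return length > 0;
    }

    // Hypothetical query wrapper (assumption, not Hadoop's actual SQL).
    // When paged is false the base query is returned unchanged, so the
    // reader would scan every row of the input.
    static String wrap(String base, long start, long length, boolean paged) {
        if (!paged) {
            return base;
        }
        return "SELECT * FROM (SELECT a.*, ROWNUM rno FROM ( " + base
            + " ) a WHERE ROWNUM <= " + (start + length)
            + " ) WHERE rno > " + start;
    }

    public static void main(String[] args) {
        String base = "SELECT id FROM t ORDER BY id";
        // First split (start = 0): unrestricted under the reported condition.
        System.out.println(wrap(base, 0, 25, pagesAsReported(0, 25)));
        // Same split with the proposed condition: restricted to its range.
        System.out.println(wrap(base, 0, 25, pagesWithProposedFix(0, 25)));
    }
}
```

Under the reported condition the first split runs the unrestricted base query, while splits with a non-zero start are limited to their range. With the start check removed, every non-empty split is limited.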