Posted to mapreduce-issues@hadoop.apache.org by "Rushabh S Shah (JIRA)" <ji...@apache.org> on 2017/11/01 15:41:00 UTC
[jira] [Created] (MAPREDUCE-6996) FileInputFormat#getBlockIndex should include file name in the exception.
Rushabh S Shah created MAPREDUCE-6996:
-----------------------------------------
Summary: FileInputFormat#getBlockIndex should include file name in the exception.
Key: MAPREDUCE-6996
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6996
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.6.0
Reporter: Rushabh S Shah
Priority: Minor
{code:title=FileInputFormat.java|borderStyle=solid}
// Some comments here
protected int getBlockIndex(BlockLocation[] blkLocations,
                            long offset) {
  ...
  ...
  BlockLocation last = blkLocations[blkLocations.length - 1];
  long fileLength = last.getOffset() + last.getLength() - 1;
  throw new IllegalArgumentException("Offset " + offset +
                                     " is outside of file (0.." +
                                     fileLength + ")");
}
{code}
When the file is open for writing, both {{last.getLength()}} and {{last.getOffset()}} are zero, so {{fileLength}} evaluates to -1 and we see the following exception stack trace.
{noformat}
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:288)
Caused by: java.lang.IllegalArgumentException: Offset 0 is outside of file (0..-1)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:453)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:413)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
... 18 more
{noformat}
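To make the {{(0..-1)}} range in the message concrete, here is a minimal, self-contained sketch of the same arithmetic. The {{Block}} class is a hypothetical stand-in for {{BlockLocation}} that models only offset and length, and {{outsideOfFileMessage}} is an illustrative helper, not part of Hadoop.

```java
public class OpenFileOffsetDemo {
    // Hypothetical stand-in for org.apache.hadoop.fs.BlockLocation;
    // only the offset and length are modeled here.
    public static final class Block {
        final long offset;
        final long length;

        public Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Builds the same message text as the getBlockIndex excerpt in this report.
    public static String outsideOfFileMessage(Block[] blocks, long offset) {
        Block last = blocks[blocks.length - 1];
        long fileLength = last.offset + last.length - 1;
        return "Offset " + offset + " is outside of file (0.." + fileLength + ")";
    }

    public static void main(String[] args) {
        // A file still open for writing, as described in this report:
        // the last block reports offset 0 and length 0.
        Block[] openFile = { new Block(0L, 0L) };
        // prints "Offset 0 is outside of file (0..-1)"
        System.out.println(outsideOfFileMessage(openFile, 0L));
    }
}
```

With both values at zero, the upper bound is 0 + 0 - 1 = -1, and nothing in the message identifies the file.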
It's difficult to debug which file was open, because the message does not name it.
So I am creating this ticket to include the file name in the exception.
Since {{FileInputFormat#getBlockIndex}} is protected, we can't change the signature of that method to add the file name to its arguments.
The only way I can think to fix this is:
{code:title=FileInputFormat.java|borderStyle=solid}
public InputSplit[] getSplits(JobConf job, int numSplits)
    throws IOException {
  ...
  ...
  for (FileStatus file: files) {
    Path path = file.getPath();
    long length = file.getLen();
    if (length != 0) {
      FileSystem fs = path.getFileSystem(job);
      BlockLocation[] blkLocations;
      if (file instanceof LocatedFileStatus) {
        blkLocations = ((LocatedFileStatus) file).getBlockLocations();
      } else {
        blkLocations = fs.getFileBlockLocations(file, 0, length);
      }
      if (isSplitable(fs, path)) {
        long blockSize = file.getBlockSize();
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        long bytesRemaining = length;
        while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
          String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
              length-bytesRemaining, splitSize, clusterMap);
          splits.add(makeSplit(path, length-bytesRemaining, splitSize,
              splitHosts[0], splitHosts[1]));
          bytesRemaining -= splitSize;
        }
        if (bytesRemaining != 0) {
          String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations, length
              - bytesRemaining, bytesRemaining, clusterMap);
          splits.add(makeSplit(path, length - bytesRemaining, bytesRemaining,
              splitHosts[0], splitHosts[1]));
        }
      } else {
        String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,0,length,clusterMap);
        splits.add(makeSplit(path, 0, length, splitHosts[0], splitHosts[1]));
      }
    } else {
      //Create empty hosts array for zero length files
      splits.add(makeSplit(path, 0, length, new String[0]));
    }
  }
{code}
Wrap the above code chunk in a try-catch block, catch {{IllegalArgumentException}}, and check whether the message is {{Offset 0 is outside of file (0..-1)}}.
If it is, add the file name to the message and rethrow the {{IllegalArgumentException}}.
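The catch-and-rethrow idea described above could look roughly like the following self-contained sketch. It is not a patch against Hadoop: {{getBlockIndexStub}} is a hypothetical stand-in that always fails the way the stack trace shows, {{blockIndexFor}} and the " for file " suffix are illustrative choices, and the path is passed as a plain {{String}} rather than a Hadoop {{Path}}.

```java
public class RethrowWithPathSketch {
    // The message text reported in this ticket.
    static final String OPEN_FILE_MESSAGE = "Offset 0 is outside of file (0..-1)";

    // Hypothetical stand-in for the failing getBlockIndex call; it always throws
    // with the same message format as the excerpt in this report.
    public static int getBlockIndexStub(long offset) {
        throw new IllegalArgumentException(
            "Offset " + offset + " is outside of file (0..-1)");
    }

    // Wraps the per-file work; on the specific failure described in the ticket,
    // rethrows the same exception type with the file name appended.
    public static int blockIndexFor(String path, long offset) {
        try {
            return getBlockIndexStub(offset);
        } catch (IllegalArgumentException e) {
            if (OPEN_FILE_MESSAGE.equals(e.getMessage())) {
                // Keep the original exception as the cause so its stack trace survives.
                throw new IllegalArgumentException(
                    e.getMessage() + " for file " + path, e);
            }
            // Any other IllegalArgumentException is passed through unchanged.
            throw e;
        }
    }

    public static void main(String[] args) {
        try {
            blockIndexFor("/data/part-00000", 0L);
        } catch (IllegalArgumentException e) {
            // prints "Offset 0 is outside of file (0..-1) for file /data/part-00000"
            System.out.println(e.getMessage());
        }
    }
}
```

Matching on the exact message text is brittle if the wording in {{getBlockIndex}} ever changes, but it leaves the protected method's signature untouched, which is the constraint stated in this ticket.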