Posted to mapreduce-issues@hadoop.apache.org by "Sunil Govindan (JIRA)" <ji...@apache.org> on 2018/11/23 12:01:02 UTC
[jira] [Updated] (MAPREDUCE-6996) FileInputFormat#getBlockIndex should include file name in the exception.
[ https://issues.apache.org/jira/browse/MAPREDUCE-6996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sunil Govindan updated MAPREDUCE-6996:
--------------------------------------
Target Version/s: 3.3.0 (was: 3.2.0)
Bulk update: moved all 3.2.0 non-blocker issues, please move back if it is a blocker.
> FileInputFormat#getBlockIndex should include file name in the exception.
> ------------------------------------------------------------------------
>
> Key: MAPREDUCE-6996
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6996
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Rushabh S Shah
> Priority: Minor
> Labels: newbie++
>
> {code:title=FileInputFormat.java|borderStyle=solid}
> // Some comments here
> protected int getBlockIndex(BlockLocation[] blkLocations,
>                             long offset) {
>   ...
>   ...
>   // No block contains the offset: report the valid range using the last block.
>   BlockLocation last = blkLocations[blkLocations.length - 1];
>   long fileLength = last.getOffset() + last.getLength() - 1;
>   throw new IllegalArgumentException("Offset " + offset +
>                                      " is outside of file (0.." +
>                                      fileLength + ")");
> }
> {code}
> When the file is open for writing, {{last.getLength()}} and {{last.getOffset()}} are both zero, so {{fileLength}} evaluates to -1 and we see the following exception stack trace.
> {noformat}
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:288)
> Caused by: java.lang.IllegalArgumentException: Offset 0 is outside of file (0..-1)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getBlockIndex(FileInputFormat.java:453)
> at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:413)
> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:265)
> ... 18 more
> {noformat}
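The arithmetic behind the "(0..-1)" range can be reproduced outside Hadoop. The following is only a minimal sketch: Block is a hypothetical stand-in for BlockLocation that models nothing but offset and length, and the lookup loop is an assumption, since the ticket elides that part of the method.

```java
final class BlockIndexDemo {
    // Hypothetical stand-in for org.apache.hadoop.fs.BlockLocation.
    static final class Block {
        final long offset;
        final long length;
        Block(long offset, long length) {
            this.offset = offset;
            this.length = length;
        }
    }

    // Returns the index of the block containing the offset, or throws
    // with the same message format as the snippet in the ticket.
    static int getBlockIndex(Block[] blocks, long offset) {
        // Assumed lookup loop (elided in the ticket).
        for (int i = 0; i < blocks.length; i++) {
            long start = blocks[i].offset;
            if (start <= offset && offset < start + blocks[i].length) {
                return i;
            }
        }
        Block last = blocks[blocks.length - 1];
        long fileLength = last.offset + last.length - 1; // 0 + 0 - 1 = -1
        throw new IllegalArgumentException("Offset " + offset
            + " is outside of file (0.." + fileLength + ")");
    }

    public static void main(String[] args) {
        // A single block with offset 0 and length 0, as described for a
        // file that is still open for writing.
        Block[] openFile = { new Block(0, 0) };
        try {
            getBlockIndex(openFile, 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
            // prints "Offset 0 is outside of file (0..-1)"
        }
    }
}
```

Nothing in the message identifies the file, which is the gap this ticket asks to close.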
> It's difficult to debug which file was open.
> So I am creating this ticket to include the file name in the exception.
> Since {{FileInputFormat#getBlockIndex}} is protected, we can't change the signature of that method to add the file name to its arguments.
> The only way I can think of to fix this is:
> {code:title=FileInputFormat.java|borderStyle=solid}
> public InputSplit[] getSplits(JobConf job, int numSplits)
>     throws IOException {
>   ...
>   ...
>   for (FileStatus file: files) {
>     Path path = file.getPath();
>     long length = file.getLen();
>     if (length != 0) {
>       FileSystem fs = path.getFileSystem(job);
>       BlockLocation[] blkLocations;
>       if (file instanceof LocatedFileStatus) {
>         blkLocations = ((LocatedFileStatus) file).getBlockLocations();
>       } else {
>         blkLocations = fs.getFileBlockLocations(file, 0, length);
>       }
>       if (isSplitable(fs, path)) {
>         long blockSize = file.getBlockSize();
>         long splitSize = computeSplitSize(goalSize, minSize, blockSize);
>         long bytesRemaining = length;
>         while (((double) bytesRemaining)/splitSize > SPLIT_SLOP) {
>           String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,
>               length-bytesRemaining, splitSize, clusterMap);
>           splits.add(makeSplit(path, length-bytesRemaining, splitSize,
>               splitHosts[0], splitHosts[1]));
>           bytesRemaining -= splitSize;
>         }
>         if (bytesRemaining != 0) {
>           String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations, length
>               - bytesRemaining, bytesRemaining, clusterMap);
>           splits.add(makeSplit(path, length - bytesRemaining, bytesRemaining,
>               splitHosts[0], splitHosts[1]));
>         }
>       } else {
>         String[][] splitHosts = getSplitHostsAndCachedHosts(blkLocations,0,length,clusterMap);
>         splits.add(makeSplit(path, 0, length, splitHosts[0], splitHosts[1]));
>       }
>     } else {
>       //Create empty hosts array for zero length files
>       splits.add(makeSplit(path, 0, length, new String[0]));
>     }
>   }
>   ...
> }
> {code}
> Wrap the above code chunk in a try-catch block that catches {{IllegalArgumentException}} and checks for the message {{Offset 0 is outside of file (0..-1)}}.
> If the message matches, add the file name and rethrow the {{IllegalArgumentException}}.
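The catch-and-rethrow idea could look roughly like the sketch below. It is not a patch against Hadoop: computeSplits is a hypothetical stand-in for the per-file split logic, the path "/data/open.log" and the " for file " suffix are made up for illustration, and placing the try-catch inside the per-file loop is an assumption made so that the path is in scope when the exception is caught.

```java
import java.util.List;

final class RethrowWithFileName {
    static final String OPEN_FILE_MESSAGE =
        "Offset 0 is outside of file (0..-1)";

    // Hypothetical stand-in for the per-file split computation. It fails
    // for one made-up path to simulate a file that is open for writing.
    static void computeSplits(String path) {
        if (path.equals("/data/open.log")) {
            throw new IllegalArgumentException(OPEN_FILE_MESSAGE);
        }
    }

    static void getSplits(List<String> paths) {
        for (String path : paths) {
            try {
                computeSplits(path);
            } catch (IllegalArgumentException e) {
                if (OPEN_FILE_MESSAGE.equals(e.getMessage())) {
                    // Append the file name and keep the original as cause
                    // so the original stack trace is preserved.
                    throw new IllegalArgumentException(
                        e.getMessage() + " for file " + path, e);
                }
                throw e; // Unrelated failures pass through unchanged.
            }
        }
    }
}
```

Matching on the exact message string is brittle, since it only covers the offset 0 case and breaks if the wording ever changes. It is shown here only because it is the approach the description proposes.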