Posted to mapreduce-issues@hadoop.apache.org by "Hudson (JIRA)" <ji...@apache.org> on 2014/07/21 23:54:41 UTC
[jira] [Commented] (MAPREDUCE-5756)
CombineFileInputFormat.getSplits() including directories in its results
[ https://issues.apache.org/jira/browse/MAPREDUCE-5756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069372#comment-14069372 ]
Hudson commented on MAPREDUCE-5756:
-----------------------------------
SUCCESS: Integrated in Hadoop-trunk-Commit #5926 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5926/])
MAPREDUCE-5756. CombineFileInputFormat.getSplits() including directories in its results. Contributed by Jason Dere (jlowe: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1612400)
* /hadoop/common/trunk/hadoop-mapreduce-project/CHANGES.txt
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/CombineFileInputFormat.java
* /hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient/src/test/java/org/apache/hadoop/mapreduce/lib/input/TestCombineFileInputFormat.java
> CombineFileInputFormat.getSplits() including directories in its results
> -----------------------------------------------------------------------
>
> Key: MAPREDUCE-5756
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-5756
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Jason Dere
> Assignee: Jason Dere
> Fix For: 3.0.0, 2.6.0
>
> Attachments: MAPREDUCE-5756.1.patch, MAPREDUCE-5756.2.patch
>
>
> Trying to track down HIVE-6401, where we see some "is not a file" errors because getSplits() is giving us directories. I believe the culprit is FileInputFormat.listStatus():
> {code}
> if (recursive && stat.isDirectory()) {
>   addInputPathRecursively(result, fs, stat.getPath(),
>       inputFilter);
> } else {
>   result.add(stat);
> }
> {code}
> This seems to allow directories to be added to the results when recursive is false. Is this meant to return directories? If not, I think it should look like this:
> {code}
> if (stat.isDirectory()) {
>   if (recursive) {
>     addInputPathRecursively(result, fs, stat.getPath(),
>         inputFilter);
>   }
> } else {
>   result.add(stat);
> }
> {code}
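The difference between the two branch structures quoted above can be seen in a small standalone sketch. It does not use the Hadoop API: Entry is a hypothetical stand-in for FileStatus, and addRecursively, listOriginal and listProposed are illustrative names, with addRecursively only a simplified assumption about what addInputPathRecursively does (no path filter). It illustrates the reporter's proposal only and is not the committed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ListStatusSketch {
    // Hypothetical stand-in for FileStatus: a path, a directory flag, and children.
    static final class Entry {
        final String path;
        final boolean directory;
        final List<Entry> children;

        Entry(String path, boolean directory, Entry... children) {
            this.path = path;
            this.directory = directory;
            this.children = Arrays.asList(children);
        }
    }

    // Simplified assumption of a recursive walk: collect only files under dir.
    static void addRecursively(List<String> result, Entry dir) {
        for (Entry child : dir.children) {
            if (child.directory) {
                addRecursively(result, child);
            } else {
                result.add(child.path);
            }
        }
    }

    // Branching as in the first quoted snippet: when recursive is false,
    // a directory falls through to the else branch and is added.
    static List<String> listOriginal(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<>();
        for (Entry stat : stats) {
            if (recursive && stat.directory) {
                addRecursively(result, stat);
            } else {
                result.add(stat.path);
            }
        }
        return result;
    }

    // Branching as in the second quoted snippet: directories are never added;
    // they are descended into only when recursive is true.
    static List<String> listProposed(List<Entry> stats, boolean recursive) {
        List<String> result = new ArrayList<>();
        for (Entry stat : stats) {
            if (stat.directory) {
                if (recursive) {
                    addRecursively(result, stat);
                }
            } else {
                result.add(stat.path);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> stats = Arrays.asList(
            new Entry("/in/a.txt", false),
            new Entry("/in/sub", true, new Entry("/in/sub/b.txt", false)));

        System.out.println(listOriginal(stats, false)); // [/in/a.txt, /in/sub]
        System.out.println(listProposed(stats, false)); // [/in/a.txt]
        System.out.println(listProposed(stats, true));  // [/in/a.txt, /in/sub/b.txt]
    }
}
```

With recursive set to false, only the first variant returns the directory /in/sub, which matches the kind of entry that would later trigger an "is not a file" error.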
--
This message was sent by Atlassian JIRA
(v6.2#6252)