Posted to mapreduce-issues@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2012/05/30 08:08:23 UTC

[jira] [Commented] (MAPREDUCE-3193) FileInputFormat doesn't read files recursively in the input path dir

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13285410#comment-13285410 ] 

Harsh J commented on MAPREDUCE-3193:
------------------------------------

Thanks for the patches. This should get in soon, because it is a wide divide between the old and new API feature sets.

I have a few questions though:

bq. MAPREDUCE-1501 added this behaviour to the old API. Can you change your patch to share code and tests so that both the old and new API behave in the same way? Also, the old configuration parameter should be deprecated, but still supported in the new API.

Given that both APIs are now supported, do we really need the deprecation? Will the new name apply to both? Are other properties handled in the same way today?
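
To make the deprecation question concrete, here is a minimal sketch in plain Java of a lookup that prefers a new property name and falls back to a deprecated one. It uses `java.util.Properties` rather than Hadoop's `Configuration` (which has its own deprecation mechanism), the class and method names are hypothetical, the old key name is an assumption about what MAPREDUCE-1501 introduced, and the new key is only the name proposed later in this comment.

```java
import java.util.Properties;

/** Sketch only: resolves a boolean flag under a new key, falling back to a deprecated key. */
class RecursiveFlagLookup {
    // Name proposed in this comment; not necessarily what the final patch uses.
    static final String NEW_KEY = "mapreduce.input.fileinputformat.input.dir.recursive";
    // Assumed old-API key from MAPREDUCE-1501; verify against the actual patch.
    static final String OLD_KEY = "mapred.input.dir.recursive";

    static boolean isRecursive(Properties conf) {
        String value = conf.getProperty(NEW_KEY);
        if (value == null) {
            // The new key is unset, so honour the deprecated key if it is present.
            value = conf.getProperty(OLD_KEY);
            if (value != null) {
                System.err.println("WARN: " + OLD_KEY + " is deprecated; use " + NEW_KEY);
            }
        }
        // parseBoolean(null) is false, so the flag defaults to non-recursive.
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OLD_KEY, "true");
        System.out.println(isRecursive(conf));
    }
}
```

With this shape, jobs that still set only the old name keep working, while an explicitly set new name always wins.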

For example, I see the following reuse in the old API:

{code}
public static final String NUM_INPUT_FILES =
    org.apache.hadoop.mapreduce.lib.input.FileInputFormat.NUM_INPUT_FILES;
{code}

This patch, however, does not make a similar change in mapred.lib, even after adding the deprecation marker. Can this be done here too?

{quote}
+      <groupId>org.apache.hadoop</groupId>
+      <artifactId>hadoop-hdfs</artifactId>
{quote}

Can the test not be done with just the local file system (LFS)? We can avoid the dependency if so. Similarly, a LocalJobRunner (LJRunner) test instead of an MR cluster would be great too, if that is alright.

bq. mapreduce.input.fileinputformat.readinputfilesrecursively

The last part could still be improved, I think. (Nit: It's not reading recursively, just listing that way.) Perhaps "mapreduce.input.fileinputformat.input.dir.recursive" is simpler?
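
To illustrate the nit about listing versus reading, the following is a rough sketch of recursive input listing on the local file system using only `java.nio.file`. It is not the patch's code and does not use Hadoop's `FileSystem` API; `InputDirListing`, `listInputFiles`, and `createSampleTree` are hypothetical names, and skipping subdirectories in the non-recursive case is a simplification chosen for the sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: lists input files, optionally descending into subdirectories. */
class InputDirListing {
    /** Returns the regular files under dir; subdirectories are visited only if recursive is true. */
    static List<Path> listInputFiles(Path dir, boolean recursive) {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry)) {
                    if (recursive) {
                        files.addAll(listInputFiles(entry, true));
                    }
                    // Non-recursive: the directory is skipped (a simplification of this sketch).
                } else {
                    files.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds <tmp>/r1/r2/input.txt, mirroring the example in the issue description. */
    static Path createSampleTree() {
        try {
            Path root = Files.createTempDirectory("mr3193");
            Path nested = Files.createDirectories(root.resolve("r1").resolve("r2"));
            Files.write(nested.resolve("input.txt"), "hello\n".getBytes());
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = createSampleTree();
        System.out.println(listInputFiles(root, false).size());
        System.out.println(listInputFiles(root, true));
    }
}
```

Only the listing step changes with the flag; each returned file is then read exactly as before, which is why a name ending in "input.dir.recursive" describes the behaviour more accurately than "readinputfilesrecursively".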
                
> FileInputFormat doesn't read files recursively in the input path dir
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3193
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3193
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1, mrv2
>    Affects Versions: 1.0.2, 0.23.2, 2.0.0-alpha, 3.0.0
>            Reporter: Ramgopal N
>            Assignee: Devaraj K
>         Attachments: MAPREDUCE-3193-1.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193-2.patch, MAPREDUCE-3193.patch, MAPREDUCE-3193.security.patch
>
>
> java.io.FileNotFoundException is thrown if the input file is more than one folder level deep, and the job fails.
> Example: Input file is /r1/r2/input.txt
