Posted to mapreduce-dev@hadoop.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2011/07/27 10:26:09 UTC

[jira] [Resolved] (MAPREDUCE-2715) submitAndMonitorJob() doesn't play nice with MultipleOutputFile

     [ https://issues.apache.org/jira/browse/MAPREDUCE-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved MAPREDUCE-2715.
--------------------------------

    Resolution: Not A Problem

Geoffrey, having looked at the current stable release, 0.22, and trunk, I think we can close this as 'Not A Problem': the output directory check comes purely from the OutputFormat class instance itself, not from submitAndMonitorJob(). You should be able to remove that check yourself in your MultipleOutputFormat derivative by overriding the checkOutputSpecs method, as pointed out before.
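
To illustrate the override, here is a minimal, self-contained sketch of the pattern. It deliberately does not use Hadoop's classes: StrictOutputFormat and DatePartitionedOutputFormat are hypothetical stand-ins, and the check takes a java.nio.file.Path so the example runs without a cluster. In the old org.apache.hadoop.mapred API, the method you would override in a MultipleOutputFormat subclass is checkOutputSpecs(FileSystem ignored, JobConf job); whatever it should verify instead is up to your job.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Small demo: an existing, non-empty base directory is rejected by the
// default-style check but accepted by the overriding subclass.
class CheckOutputSpecsDemo {
    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("base-output");
        Files.createFile(base.resolve("2011-07-26"));

        try {
            new StrictOutputFormat().checkOutputSpecs(base);
            System.out.println("strict check: accepted");
        } catch (IOException e) {
            System.out.println("strict check: rejected");
        }

        new DatePartitionedOutputFormat().checkOutputSpecs(base);
        System.out.println("relaxed check: accepted");
    }
}

// Hypothetical stand-in for the base output format (NOT Hadoop's class).
// It mimics the default behaviour: an existing output directory is an error.
class StrictOutputFormat {
    public void checkOutputSpecs(Path outputDir) throws IOException {
        if (outputDir == null) {
            throw new IOException("Output directory not set.");
        }
        if (Files.exists(outputDir)) {
            throw new IOException("Output directory " + outputDir + " already exists");
        }
    }
}

// Hypothetical derivative that relaxes the check by overriding it.
// An existing directory is allowed; only a missing path setting or a
// path that exists but is not a directory is rejected.
class DatePartitionedOutputFormat extends StrictOutputFormat {
    @Override
    public void checkOutputSpecs(Path outputDir) throws IOException {
        if (outputDir == null) {
            throw new IOException("Output directory not set.");
        }
        if (Files.exists(outputDir) && !Files.isDirectory(outputDir)) {
            throw new IOException("Output path " + outputDir + " is not a directory");
        }
    }
}
```

With the check relaxed, the job itself becomes responsible for not clobbering files already present under the base path, for example by generating a distinct per-day file name.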

In case that doesn't resolve it for you, do reopen!

> submitAndMonitorJob() doesn't play nice with MultipleOutputFile
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-2715
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2715
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Geoffrey Young
>
> part of submitAndMonitorJob() balks if the output directory currently exists but is non-empty:
>   "Error launching job , Output path already exists : "
> this logic actually conflicts with the ideas behind MultipleOutputFile, where the output file path is calculated later on.
> it would be really nice to remove the restriction for non-empty output directories in submitAndMonitorJob() so that MultipleOutputFile becomes more useful - as it stands now, I can't, for example, specify a base output path then use MultipleOutputFile to partition by date on a daily basis.
> thanks.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira