Posted to dev@crunch.apache.org by "Robert Chu (JIRA)" <ji...@apache.org> on 2012/09/28 01:51:07 UTC

[jira] [Created] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Robert Chu created CRUNCH-82:
--------------------------------

             Summary: Support OutputFormats not based on FileOutputFormat when writing to multiple targets.
                 Key: CRUNCH-82
                 URL: https://issues.apache.org/jira/browse/CRUNCH-82
             Project: Crunch
          Issue Type: Improvement
    Affects Versions: 0.4.0
            Reporter: Robert Chu


Currently, writing to multiple targets with OutputFormats that aren't subclasses of FileOutputFormat fails because Crunch expects these targets to write to HDFS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Robert Chu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465285#comment-13465285 ] 

Robert Chu commented on CRUNCH-82:
----------------------------------

Oops, didn't mean to leave that in.
                

[jira] [Updated] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Robert Chu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chu updated CRUNCH-82:
-----------------------------

    Attachment: multipleoutputs.patch

Here's a fix for some of these issues. Targets that use non-HDFS-based OutputFormats will still have to call:

FileOutputFormat.setOutputPath

in their configureForMapReduce method. Writing to multiple targets goes through the MultipleOutputs library, so no OutputFormat is set on the job itself, and job submission falls back to TextOutputFormat. Job submission then validates the output path set in the job's configuration, and that validation fails unless the above method has been called.
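
The failure mode described above can be illustrated with a small, self-contained sketch. It does not use Hadoop or Crunch classes: the configuration is a plain Map, and the key OUTPUT_DIR_KEY, the class OutputPathCheckSketch, and the methods checkOutputSpecs, configureNonFileTarget, and submits are hypothetical stand-ins invented for illustration. In a real target, the equivalent of configureNonFileTarget would be the FileOutputFormat.setOutputPath call inside configureForMapReduce.

```java
import java.util.HashMap;
import java.util.Map;

public class OutputPathCheckSketch {
    // Hypothetical configuration key; the real key used by Hadoop is not shown here.
    static final String OUTPUT_DIR_KEY = "sketch.output.dir";

    // Stand-in for the validation a file-based default OutputFormat
    // performs at job submission: it requires an output path.
    static void checkOutputSpecs(Map<String, String> conf) {
        if (conf.get(OUTPUT_DIR_KEY) == null) {
            throw new IllegalStateException("Output directory not set.");
        }
    }

    // Stand-in for what a non-file target would do in configureForMapReduce:
    // set an output path even though its own OutputFormat never writes there.
    static void configureNonFileTarget(Map<String, String> conf, String placeholderPath) {
        conf.put(OUTPUT_DIR_KEY, placeholderPath);
    }

    // Returns true if the simulated submission passes validation.
    static boolean submits(Map<String, String> conf) {
        try {
            checkOutputSpecs(conf);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // No output path set: the simulated validation rejects the job.
        System.out.println(submits(conf)); // prints "false"
        // After the target sets a placeholder path, validation passes.
        configureNonFileTarget(conf, "/tmp/crunch-placeholder");
        System.out.println(submits(conf)); // prints "true"
    }
}
```

The placeholder path exists only to satisfy the submission-time check; the sketch makes no claim about what the attached patch itself changes.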
                

[jira] [Resolved] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Robert Chu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chu resolved CRUNCH-82.
------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0

Pushed to master.
                

[jira] [Commented] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465273#comment-13465273 ] 

Josh Wills commented on CRUNCH-82:
----------------------------------

Code looks good, but you need to drop the pom.xml changes.
                

[jira] [Updated] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Robert Chu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Chu updated CRUNCH-82:
-----------------------------

    Attachment: multipleoutputs.patch
    

[jira] [Commented] (CRUNCH-82) Support OutputFormats not based on FileOutputFormat when writing to multiple targets.

Posted by "Josh Wills (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/CRUNCH-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13465286#comment-13465286 ] 

Josh Wills commented on CRUNCH-82:
----------------------------------

+1
                