Posted to issues@beam.apache.org by "Ismaël Mejía (Jira)" <ji...@apache.org> on 2019/11/12 10:46:00 UTC

[jira] [Comment Edited] (BEAM-8569) Support Hadoop 3 on Beam

    [ https://issues.apache.org/jira/browse/BEAM-8569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972233#comment-16972233 ] 

Ismaël Mejía edited comment on BEAM-8569 at 11/12/19 10:45 AM:
---------------------------------------------------------------

It seems we have two kinds of modules that depend on Hadoop. The first group depends on it directly, like hadoop-common, hadoop-file-system and hadoop-format. At the Beam level we can ‘easily’ guarantee that these work by just providing the required dependency and validating that new commits do not break any of the tests.

The second group are modules that depend on Hadoop transitively, for example HBaseIO, HCatalogIO and even ParquetIO. For those we would really have to play catch-up once they are ready.


> Support Hadoop 3 on Beam
> ------------------------
>
>                 Key: BEAM-8569
>                 URL: https://issues.apache.org/jira/browse/BEAM-8569
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-hadoop-file-system, io-java-hadoop-format, runner-spark
>            Reporter: Ismaël Mejía
>            Priority: Minor
>
> It seems that Hadoop 3 in production is finally happening. CDH supports it in its latest version, and Spark 3 will include support for Hadoop 3 too.
> This is an uber ticket to cover the changes to the codebase required to ensure compatibility with Hadoop 3.x.
> Hadoop dependencies in Beam are mostly provided and the APIs are compatible up to a point, but we may need changes in the CI to test that new changes work on both Hadoop 2 and Hadoop 3 until we decide to remove support for Hadoop 2.


