Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/09/18 18:16:00 UTC

[jira] [Commented] (FLINK-10365) Consolidate shaded Hadoop classes for filesystems

    [ https://issues.apache.org/jira/browse/FLINK-10365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16619508#comment-16619508 ] 

ASF GitHub Bot commented on FLINK-10365:
----------------------------------------

StephanEwen opened a new pull request #6714:  [FLINK-10365] [FLINK-10366] [s3] Create common bases for File System implementations
URL: https://github.com/apache/flink/pull/6714
 
 
   ## What is the purpose of the change
   
   We currently have three bundled/shaded filesystem connectors that build on top of Hadoop's classes. More will probably come when we add more bundled file system connector libraries, for example for GCS. Each of them rebuilds the shaded Hadoop module, including creating the relocated config, adapting native code loading, etc.
   
   Similarly, there is a lot of code coming for the S3 connectors that will be shared between the Hadoop- and Presto-based implementations.
   
   This PR creates common base projects for shaded Hadoop and common S3 functionality to be reused.
   
   ## Brief change log
   
     - Creates the `flink-fs-hadoop-shaded` module and factors out the shaded Hadoop FS classes from the shaded S3 file systems into that module.
     - Bumps the Hadoop dependency to 3.1 to get access to newer connectors and better/later utilities. Adjusts the shading of the Hadoop configuration.
     - Creates an S3 base module `flink-s3-fs-base` as the common denominator for the Hadoop- and Presto-based implementations
     - Adjusts the Hadoop-based S3 connector to use the common denominator module
     - Adjusts the Presto-based S3 adapter to use the common denominator module
     - Consolidates shared classes for S3 in `flink-s3-fs-base` module
     - Upgrades the build script shading checks to new patterns.
   
   I put each change in a separate commit for easier review.
   
   ## Verifying this change
   
   This change reworks and upgrades dependencies; it does not change functionality.
   The existing integration test cases and end-to-end tests still cover the existing functionality.
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): **yes**
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
     - The serializers: **no**
     - The runtime per-record code paths (performance sensitive): **no**
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
     - The S3 file system connector: **yes**
   
   ## Documentation
   
     - Does this pull request introduce a new feature? **no**
     - If yes, how is the feature documented? **not applicable**
   



> Consolidate shaded Hadoop classes for filesystems
> -------------------------------------------------
>
>                 Key: FLINK-10365
>                 URL: https://issues.apache.org/jira/browse/FLINK-10365
>             Project: Flink
>          Issue Type: Improvement
>          Components: FileSystem
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.7.0
>
>
> We currently have three bundled/shaded filesystem connectors that build on top of Hadoop's classes. More will probably come. Each of them rebuilds the shaded Hadoop module, including creating the relocated config, adapting native code loading, etc.
> We should factor that out into a single base project to avoid duplicating work.


