You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Satish Saley (Jira)" <ji...@apache.org> on 2020/08/14 17:12:00 UTC
[jira] [Commented] (FLINK-11086) Add support for Hadoop 3
[ https://issues.apache.org/jira/browse/FLINK-11086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17177948#comment-17177948 ]
Satish Saley commented on FLINK-11086:
--------------------------------------
Hi [~rmetzger], was there a reason for not shading hadoop-common [https://github.com/apache/flink/commit/e1e7d7f7ecc080c850a264021bf1b20e3d27d373#diff-e7b798a682ee84ab804988165e99761cR38-R44] ? This is leaking lots of classes such as guava and causing issues in our flink application.
I see that hadoop-common classes were shaded in earlier versions [https://mvnrepository.com/artifact/org.apache.flink/flink-s3-fs-hadoop/1.9.0]
> Add support for Hadoop 3
> ------------------------
>
> Key: FLINK-11086
> URL: https://issues.apache.org/jira/browse/FLINK-11086
> Project: Flink
> Issue Type: New Feature
> Components: Deployment / YARN
> Reporter: Sebastian Klemke
> Assignee: Robert Metzger
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
>
> All builds using maven 3.2.5 on commithash ed8ff14ed39d08cd319efe75b40b9742a2ae7558.
> Attempted builds:
> - mvn clean install -Dhadoop.version=3.0.3
> - mvn clean install -Dhadoop.version=3.1.1
> Integration tests with Hadoop input format datasource fail. Example stack trace, taken from hadoop.version 3.1.1 build:
> {code:java}
> testJobCollectionExecution(org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase) Time elapsed: 0.275 sec <<< ERR
> OR!
> java.lang.NoClassDefFoundError: org/apache/flink/hadoop/shaded/com/google/re2j/PatternSyntaxException
> at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at org.apache.hadoop.fs.Globber.doGlob(Globber.java:210)
> at org.apache.hadoop.fs.Globber.glob(Globber.java:149)
> at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:2085)
> at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:269)
> at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:239)
> at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:325)
> at org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatBase.createInputSplits(HadoopInputFormatBase.java:150)
> at org.apache.flink.api.java.hadoop.mapred.HadoopInputFormatBase.createInputSplits(HadoopInputFormatBase.java:58)
> at org.apache.flink.api.common.operators.GenericDataSourceBase.executeOnCollections(GenericDataSourceBase.java:225)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeDataSource(CollectionExecutor.java:219)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:155)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeUnaryOperator(CollectionExecutor.java:229)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:149)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:131)
> at org.apache.flink.api.common.operators.CollectionExecutor.executeDataSink(CollectionExecutor.java:182)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:158)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:131)
> at org.apache.flink.api.common.operators.CollectionExecutor.execute(CollectionExecutor.java:115)
> at org.apache.flink.api.java.CollectionEnvironment.execute(CollectionEnvironment.java:38)
> at org.apache.flink.test.util.CollectionTestEnvironment.execute(CollectionTestEnvironment.java:52)
> at org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase.internalRun(WordCountMapredITCase.java:121)
> at org.apache.flink.test.hadoopcompatibility.mapred.WordCountMapredITCase.testProgram(WordCountMapredITCase.java:71)
> {code}
> Maybe hadoop 3.x versions could be added to test matrix as well?
--
This message was sent by Atlassian Jira
(v8.3.4#803005)