Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/01/29 14:00:48 UTC

[GitHub] [flink] azagrebin commented on a change in pull request #10932: [FLINK-15614][docs] Consolidate Hadoop documentation

URL: https://github.com/apache/flink/pull/10932#discussion_r372393162
 
 

 ##########
 File path: docs/ops/deployment/hadoop.md
 ##########
 @@ -38,13 +38,18 @@ Referencing the HDFS configuration in the [Flink configuration]({{ site.baseurl
 
 Another way to provide the Hadoop configuration is to have it on the class path of the Flink process, see more details below.
 
-## Adding Hadoop Classpaths
+## Providing Hadoop classes
 
-The required classes to use Hadoop should be available in the `lib/` folder of the Flink installation
-(on all machines running Flink) unless Flink is built with [Hadoop shaded dependencies]({{ site.baseurl }}/flinkDev/building.html#pre-bundled-versions).
+In order to use Hadoop features (e.g., YARN, HDFS) it is necessary to provide Flink with the required Hadoop classes,
+as these are not bundled by default.
 
-If putting the files into the directory is not possible, Flink also respects
-the `HADOOP_CLASSPATH` environment variable to add Hadoop jar files to the classpath.
+This can be done in 2 ways:
+* Adding the Hadoop classpath to Flink
 
 Review comment:
   Are no dependency clashes expected when only `HADOOP_CLASSPATH` is exported?
   In other words, why is relocation needed for `/lib` but not for `HADOOP_CLASSPATH`?
   
   I also liked the previous idea of stating that the first way is the recommended one, and that option 2 should be used only if problems occur (with examples of such problems). Is that still the case?
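
For context, here is a minimal sketch of what the first option (exporting `HADOOP_CLASSPATH`) might look like in practice. It assumes the `hadoop` launcher script is installed and on the `PATH` of the machine that starts the Flink processes. The guard around it is only illustrative and is not part of the documentation under review:

```shell
# Assumption: the `hadoop` launcher is on PATH on the machine starting Flink.
if command -v hadoop >/dev/null 2>&1; then
  # `hadoop classpath` prints the class path of the local Hadoop installation.
  export HADOOP_CLASSPATH="$(hadoop classpath)"
else
  # Illustrative fallback: leave the variable untouched and report the problem.
  echo "hadoop not found on PATH; HADOOP_CLASSPATH left unchanged" >&2
fi
```

The sketch says nothing about whether class conflicts can occur with this approach, which is the open question above.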
