Posted to issues@flink.apache.org by aljoscha <gi...@git.apache.org> on 2018/02/20 09:52:26 UTC

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/5531

    [FLINK-8668] Document how to set HADOOP_CLASSPATH for Flink

    R: @zentol @StephanEwen 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink jira-8668-doc-hadoop-classpath

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5531.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5531
    
----
commit 10f6a53bcd85da126746a8cdeec97cce31c013f0
Author: Aljoscha Krettek <al...@...>
Date:   2018-02-20T09:51:27Z

    [FLINK-8668] Document how to set HADOOP_CLASSPATH for Flink

----


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169268952
  
    --- Diff: docs/ops/config.md ---
    @@ -82,6 +82,26 @@ prefix that is checked against the fully qualified class name. By default, this
     If you want to change this setting you have to make sure to also include the default patterns in
     your list of patterns if you want to keep that default behaviour.
     
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable should be exported on all machines
    +that are running Flink components.
    +
    +When running on YARN, this is usually not a problem because the components
    +running inside YARN will be started with the Hadoop classpaths anyways but it
    --- End diff --
    
    remove `anyways` and replace it with a comma.


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169600042
  
    --- Diff: docs/ops/deployment/hadoop.md ---
    @@ -0,0 +1,47 @@
    +---
    +title:  "Hadoop Integration"
    +nav-title: Hadoop Integration
    +nav-parent_id: deployment
    +nav-pos: 8
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* This will be replaced by the TOC
    +{:toc}
    +
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable must be exported on all machines
    +that are running Flink components.
    +
    +When running on YARN, this is usually not a problem because the components
    +running inside YARN will be started with the Hadoop classpaths, but it can
    +happen that the Hadoop dependencies must be in the classpath when submitting a
    +job to YARN. For this, it's usually enough to do a
    +
    +```
    +export HADOOP_CLASSPATH=`hadoop classpath`
    --- End diff --
    
    add `<` `>`? 


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169606158
  
    --- Diff: docs/ops/deployment/hadoop.md ---
    @@ -0,0 +1,47 @@
    +---
    +title:  "Hadoop Integration"
    +nav-title: Hadoop Integration
    +nav-parent_id: deployment
    +nav-pos: 8
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* This will be replaced by the TOC
    +{:toc}
    +
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable must be exported on all machines
    +that are running Flink components.
    +
    +When running on YARN, this is usually not a problem because the components
    +running inside YARN will be started with the Hadoop classpaths, but it can
    +happen that the Hadoop dependencies must be in the classpath when submitting a
    +job to YARN. For this, it's usually enough to do a
    +
    +```
    +export HADOOP_CLASSPATH=`hadoop classpath`
    --- End diff --
    
    It's the `hadoop` binary, invoked with `classpath` as its argument.
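
To illustrate the point, here is a rough sketch of a slightly more defensive variant of the documented command. It assumes a Bash shell, and the function name `set_hadoop_classpath` is made up for this example; it is not part of Flink or Hadoop. The only external command it relies on is `hadoop classpath`, the same one the docs use.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name invented for illustration): export
# HADOOP_CLASSPATH from the output of `hadoop classpath`, but only if
# a `hadoop` command can actually be found.
set_hadoop_classpath() {
    if ! command -v hadoop >/dev/null 2>&1; then
        echo "hadoop not found on PATH; HADOOP_CLASSPATH left unchanged" >&2
        return 1
    fi

    # $(...) is the modern equivalent of the backticks used in the docs:
    # it runs `hadoop classpath` and captures what the command prints.
    local cp
    cp="$(hadoop classpath)" || return 1

    export HADOOP_CLASSPATH="$cp"
}
```

Calling `set_hadoop_classpath` in the shell that later launches the Flink client should have the same effect as the one-line `export` in the diff, with the difference that a missing `hadoop` binary produces an error message instead of silently exporting an empty value.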


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169269041
  
    --- Diff: docs/ops/config.md ---
    @@ -82,6 +82,26 @@ prefix that is checked against the fully qualified class name. By default, this
     If you want to change this setting you have to make sure to also include the default patterns in
     your list of patterns if you want to keep that default behaviour.
     
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable should be exported on all machines
    --- End diff --
    
    replace "should" with "must"?


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha closed the pull request at:

    https://github.com/apache/flink/pull/5531


---

[GitHub] flink issue #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH for Fli...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on the issue:

    https://github.com/apache/flink/pull/5531
  
    @zentol Yes, I struggled with where exactly to put this. I think I will just create a "Hadoop" page under "Clusters&Operations" that has only this section for now. WDYT?


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169287058
  
    --- Diff: docs/ops/config.md ---
    @@ -82,6 +82,26 @@ prefix that is checked against the fully qualified class name. By default, this
     If you want to change this setting you have to make sure to also include the default patterns in
     your list of patterns if you want to keep that default behaviour.
     
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable should be exported on all machines
    --- End diff --
    
    fixing


---

[GitHub] flink pull request #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH ...

Posted by aljoscha <gi...@git.apache.org>.
Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5531#discussion_r169287038
  
    --- Diff: docs/ops/config.md ---
    @@ -82,6 +82,26 @@ prefix that is checked against the fully qualified class name. By default, this
     If you want to change this setting you have to make sure to also include the default patterns in
     your list of patterns if you want to keep that default behaviour.
     
    +## Configuring Flink with Hadoop Classpaths
    +
    +Flink will use the environment variable `HADOOP_CLASSPATH` to augment the
    +classpath that is used when starting Flink components such as the Client,
    +JobManager, or TaskManager. Most Hadoop distributions and cloud environments
    +will not set this variable by default so if the Hadoop classpath should be
    +picked up by Flink the environment variable should be exported on all machines
    +that are running Flink components.
    +
    +When running on YARN, this is usually not a problem because the components
    +running inside YARN will be started with the Hadoop classpaths anyways but it
    --- End diff --
    
    fixing


---

[GitHub] flink issue #5531: [FLINK-8668] Document how to set HADOOP_CLASSPATH for Fli...

Posted by zentol <gi...@git.apache.org>.
Github user zentol commented on the issue:

    https://github.com/apache/flink/pull/5531
  
    We may instead want to add a whole new page under "Clusters&Operations" for Hadoop-related topics.


---