Posted to issues@flink.apache.org by rmetzger <gi...@git.apache.org> on 2015/01/25 16:06:56 UTC

[GitHub] flink pull request: [FLINK-1433] Add HADOOP_CLASSPATH to start scr...

GitHub user rmetzger opened a pull request:

    https://github.com/apache/flink/pull/337

    [FLINK-1433] Add HADOOP_CLASSPATH to start scripts

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rmetzger/flink FLINK-1433

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/337.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #337
    
----
commit b9d1140df9b3232be53c105636f370f1d11aca37
Author: Robert Metzger <rm...@apache.org>
Date:   2015-01-25T15:05:20Z

    [FLINK-1433] Add HADOOP_CLASSPATH to start scripts

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---


Posted by mxm <gi...@git.apache.org>.
Github user mxm commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71618846
  
    Looks good to merge.
    
    Like Robert said, the `HADOOP_CLASSPATH` is used to add third-party libraries. From `hadoop-env.sh`:
    
        # Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
        for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
          if [ "$HADOOP_CLASSPATH" ]; then
            export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
          else
            export HADOOP_CLASSPATH=$f
          fi
        done
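To make the appending behaviour concrete, here is a minimal, self-contained run of the same loop with two hypothetical jar paths (the real script globs `$HADOOP_HOME/contrib/capacity-scheduler/*.jar` instead):

```shell
# Hypothetical stand-in paths; hadoop-env.sh globs the contrib
# directory instead of listing jars explicitly.
HADOOP_CLASSPATH=""
for f in /tmp/jar-a.jar /tmp/jar-b.jar; do
  # Append with a ':' separator, unless the variable is still empty.
  if [ "$HADOOP_CLASSPATH" ]; then
    HADOOP_CLASSPATH="$HADOOP_CLASSPATH:$f"
  else
    HADOOP_CLASSPATH="$f"
  fi
done
echo "$HADOOP_CLASSPATH"
```

The empty-check matters only for the first iteration; it avoids a leading `:` in the resulting classpath.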


---


Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71548405
  
    Okay, if it is an auxiliary classpath then it should be fine.


---


Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71412124
  
    Does this mean that we now have multiple Hadoop dependencies on the classpath: the ones in Flink's lib directory plus the ones in the `HADOOP_CLASSPATH` variable?


---


Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71429315
  
    It depends on what the user does with the `HADOOP_CLASSPATH`.
    In my understanding, it is meant as a variable for adding third-party jar files to Hadoop. Hadoop's own jar files are added to the `CLASSPATH` variable in the `libexec/hadoop-config.sh` script; there you see variables like `HADOOP_COMMON_LIB_JARS_DIR`, `HDFS_LIB_JARS_DIR`, `YARN_LIB_JARS_DIR`, etc. being added to the `CLASSPATH`. In the very last step, the script adds the `HADOOP_CLASSPATH` variable (by default at the end of the classpath, but there is an option to put it in front instead).
    
    I found that we need this on Google Compute Engine's Hadoop deployment. It has Google Storage configured by default, but that currently doesn't work in non-YARN setups because the Google Storage jar is not on our classpath. On these clusters, the `HADOOP_CLASSPATH` variable contains the path to the storage jar.
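As a hedged sketch of the idea behind the change (the variable names and paths below are illustrative placeholders, not the actual code in Flink's start scripts), the scripts can append a non-empty `HADOOP_CLASSPATH` to the classpath they assemble, which is how a connector jar such as the Google Storage one would be picked up:

```shell
# Illustrative values only; FLINK_CLASSPATH and both jar paths are
# placeholders, not the names actually used by Flink's start scripts.
FLINK_CLASSPATH="/opt/flink/lib/flink-dist.jar"
HADOOP_CLASSPATH="/opt/gcs/gcs-connector.jar"   # e.g. preset by the cluster deployment

# Append HADOOP_CLASSPATH only when it is set and non-empty, so the
# scripts behave unchanged on clusters that do not define it.
if [ -n "$HADOOP_CLASSPATH" ]; then
  FLINK_CLASSPATH="$FLINK_CLASSPATH:$HADOOP_CLASSPATH"
fi
echo "$FLINK_CLASSPATH"
```

Guarding on `-n` keeps the resulting classpath free of a trailing `:` when `HADOOP_CLASSPATH` is unset.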


---


Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/337


---


Posted by rmetzger <gi...@git.apache.org>.
Github user rmetzger commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71631049
  
    Merging it.


---


Posted by fhueske <gi...@git.apache.org>.
Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/337#issuecomment-71378388
  
    LGTM


---