Posted to issues@mesos.apache.org by "Alan Braithwaite (JIRA)" <ji...@apache.org> on 2015/09/26 19:10:04 UTC

[jira] [Updated] (MESOS-3527) HDFS HA fails outside of docker context

     [ https://issues.apache.org/jira/browse/MESOS-3527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Braithwaite updated MESOS-3527:
------------------------------------
    Description: 
I'm using Spark with the Mesos driver.

When I pass an `hdfs://<namespace>/path` URL for the Spark application, the fetcher attempts to download the jar files outside the Spark context (the Docker container, in this case).  The problem is that the core-site.xml and hdfs-site.xml configs exist only inside the container; the host machine does not have the HDFS configuration needed to connect to the HA cluster.

Currently, I'm not aware of any alternative way to access an HA Hadoop cluster besides going through the Hadoop client.
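For context, `hdfsha` is a logical HA nameservice, not a resolvable hostname, which is why the host-side fetch dies with `UnknownHostException`. Resolving it requires hdfs-site.xml entries along these lines (namenode hostnames here are hypothetical placeholders, not my actual setup), and in my case these exist only inside the container:

```xml
<!-- hdfs-site.xml fragment needed to resolve the 'hdfsha' nameservice.
     Namenode hostnames below are hypothetical. -->
<property>
  <name>dfs.nameservices</name>
  <value>hdfsha</value>
</property>
<property>
  <name>dfs.ha.namenodes.hdfsha</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfsha.nn1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hdfsha.nn2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hdfsha</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```

Without this on the host, the Hadoop client has no way to map `hdfsha` to the namenodes, and falls back to treating it as a DNS name.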

{code}
I0926 06:34:19.346851 18851 fetcher.cpp:214] Fetching URI 'hdfs://hdfsha/tmp/spark-job.jar'
I0926 06:34:19.622860 18851 fetcher.cpp:99] Fetching URI 'hdfs://hdfsha/tmp/spark-job.jar' using Hadoop Client
I0926 06:34:19.622936 18851 fetcher.cpp:109] Downloading resource from 'hdfs://hdfsha/tmp/spark-job.jar' to '/state/var/lib/mesos/slaves/20150602-065056-269165578-5050-17724-S12/frameworks/20150914-102037-285942794-5050-31214-0029/executors/driver-20150926063418-0002/runs/9953ae1b-9387-489f-8645-5472d9c5eacf/spark-job.jar'
E0926 06:34:20.814858 18851 fetcher.cpp:113] HDFS copyToLocal failed: /usr/local/hadoop/bin/hadoop fs -copyToLocal 'hdfs://hdfsha/tmp/spark-job.jar' '/state/var/lib/mesos/slaves/20150602-065056-269165578-5050-17724-S12/frameworks/20150914-102037-285942794-5050-31214-0029/executors/driver-20150926063418-0002/runs/9953ae1b-9387-489f-8645-5472d9c5eacf/spark-job.jar'
-copyToLocal: java.net.UnknownHostException: hdfsha
Usage: hadoop fs [generic options] -copyToLocal [-p] [-ignoreCrc] [-crc] <src> ... <localdst>
Failed to fetch: hdfs://hdfsha/tmp/spark-job.jar
{code}

The code in question:
https://github.com/apache/mesos/blob/fbb12a52969710fe69c309c83db0a5441dbea886/src/launcher/fetcher.cpp#L92-L114
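The logic there boils down to shelling out to the Hadoop CLI on the host. A minimal sketch of the behavior (hypothetical helper, not the actual Mesos source; paths taken from the log above) that shows why only the host's HDFS configuration matters:

```python
# Sketch of the fetcher's hdfs:// handling (hypothetical helper, not the
# actual Mesos C++ code): it builds and runs a 'hadoop fs -copyToLocal'
# command on the HOST, so only the host's core-site.xml/hdfs-site.xml
# are consulted when resolving the 'hdfsha' nameservice.
def build_copy_to_local(hadoop_bin, uri, local_path):
    """Return the command line the fetcher would execute for an hdfs:// URI."""
    return [hadoop_bin, "fs", "-copyToLocal", uri, local_path]

cmd = build_copy_to_local(
    "/usr/local/hadoop/bin/hadoop",
    "hdfs://hdfsha/tmp/spark-job.jar",
    "/tmp/spark-job.jar")
print(" ".join(cmd))
```

Since the command runs outside the container, pointing the executable at the container's HADOOP_CONF_DIR is not an option without also exposing those configs on the host.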


> HDFS HA fails outside of docker context
> ---------------------------------------
>
>                 Key: MESOS-3527
>                 URL: https://issues.apache.org/jira/browse/MESOS-3527
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Alan Braithwaite
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)