Posted to user@spark.apache.org by Guillaume Pitel <gu...@exensa.com> on 2014/01/04 21:54:37 UTC

Troubles with the Spark-EC2 stuff

Hi,

I'm taking my first steps on EC2 (using the 0.8.1 binary for CDH4) and I've run
into some problems. The first is that once the cluster is created, the script
cannot find it again for login, destroy, and so on. Not a big deal, since I can
do that manually, but it's annoying.

The second problem is not really related to Spark but to HDFS/MapReduce. I want
to run a hadoop distcp from S3 to the local ephemeral HDFS. The distcp fails
because MapReduce is not running.

Questions:

- Does anyone have advice on a better way to copy from S3 to HDFS, or a way to
make distcp work?
- Any idea why spark-ec2 cannot find its clusters again?

Thanks in advance for any experience and advice!

Guillaume
-- 
eXenSa

	
*Guillaume PITEL, Président*
+33(0)6 25 48 86 80 / +33(0)9 70 44 67 53

eXenSa S.A.S. <http://www.exensa.com/>
41, rue Périer - 92120 Montrouge - FRANCE
Tel +33(0)1 84 16 36 77 / Fax +33(0)9 72 28 37 05


Re: Troubles with the Spark-EC2 stuff

Posted by Patrick Wendell <pw...@gmail.com>.
Look in /root/mapreduce. This is different for hadoop2 clusters because
mapreduce is now distributed as a separate project.
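
Combining this pointer with the bin/start-mapred.sh mentioned in the quoted reply below, the choice of start script might be sketched as follows. This is only an illustration under assumptions the thread does not confirm: that the script sits at /root/mapreduce/bin/start-mapred.sh, and that the /root/mapreduce directory is present only on Hadoop 2 clusters. The function name and its optional root argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: print which MapReduce start script a spark-ec2 cluster
# would probably need. Nothing is started here.
# Assumption (not verified in the thread): Hadoop 2 clusters keep the
# script under <root>/mapreduce/bin, other clusters use start-all.sh.

mapred_start_script() {
  # $1: directory to inspect (hypothetical parameter, defaults to /root)
  root="${1:-/root}"
  if [ -d "$root/mapreduce" ]; then
    echo "$root/mapreduce/bin/start-mapred.sh"
  else
    echo "$root/ephemeral-hadoop/bin/start-all.sh"   # path as quoted in this thread
  fi
}

# Show the candidate script; run it by hand once the path is confirmed.
mapred_start_script /root
```

Printing the path rather than executing it keeps the sketch safe to try on a machine where the layout differs from what is assumed here.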


On Sat, Jan 4, 2014 at 2:04 PM, Guillaume Pitel <gu...@exensa.com> wrote:

>  Hi,
>
> Thanks, that wasn't actually the problem, but your suggestion helped me find
> it. I had started the cluster with Hadoop v2, which does not seem to include
> mapred (though I think it's still possible to have it). Now I'm wondering
> whether there's a distcp that uses YARN...
>
> And by the way, one just needs to run bin/start-mapred.sh
>
> Thanks again
>
> Guillaume
>
>
>
>  For the second problem, just start Hadoop MapReduce before running
> distcp:
>
>  /root/ephemeral-hadoop/bin/start-all.sh
>
>
>

Re: Troubles with the Spark-EC2 stuff

Posted by Guillaume Pitel <gu...@exensa.com>.
Hi,

Thanks, that wasn't actually the problem, but your suggestion helped me find it.
I had started the cluster with Hadoop v2, which does not seem to include mapred
(though I think it's still possible to have it). Now I'm wondering whether
there's a distcp that uses YARN...

And by the way, one just needs to run bin/start-mapred.sh

Thanks again

Guillaume



> For the second problem, just start Hadoop MapReduce before running distcp:
>
> /root/ephemeral-hadoop/bin/start-all.sh
>
>



Re: Troubles with the Spark-EC2 stuff

Posted by Josh Rosen <ro...@gmail.com>.
For the second problem, just start Hadoop MapReduce before running distcp:

/root/ephemeral-hadoop/bin/start-all.sh
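
The suggestion above could be sketched as a two-step script: start the daemons, then run the copy. Treat it as an illustration rather than a tested recipe. The bucket name, the destination path, and the run helper are made up; the s3n:// scheme assumes S3 credentials are already configured for Hadoop; and hadoop is assumed to be on the PATH. By default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: start MapReduce, then copy from S3 into the ephemeral HDFS.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Hypothetical helper: print the command, execute it only when DRY_RUN=0.
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

copy_from_s3() {
  # $1: source URI (assumed s3n://, credentials already configured)
  # $2: destination path on the default filesystem (the ephemeral HDFS)
  run /root/ephemeral-hadoop/bin/start-all.sh &&   # path as given in this thread
    run hadoop distcp "$1" "$2"
}

# Example with a made-up bucket and destination:
copy_from_s3 s3n://my-bucket/input /input
```

Setting DRY_RUN=0 in the environment makes the same script execute the commands; the && ensures distcp is attempted only if the start script succeeds.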




On Sat, Jan 4, 2014 at 12:54 PM, Guillaume Pitel <guillaume.pitel@exensa.com> wrote:

>  Hi,
>
> I'm taking my first steps on EC2 (using the 0.8.1 binary for CDH4) and I've
> run into some problems. The first is that once the cluster is created, the
> script cannot find it again for login, destroy, and so on. Not a big deal,
> since I can do that manually, but it's annoying.
>
> The second problem is not really related to Spark but to HDFS/MapReduce. I
> want to run a hadoop distcp from S3 to the local ephemeral HDFS. The
> distcp fails because MapReduce is not running.
>
> Questions:
>
> - Does anyone have advice on a better way to copy from S3 to HDFS, or a way
> to make distcp work?
> - Any idea why spark-ec2 cannot find its clusters again?
>
> Thanks in advance for any experience and advice!
>
> Guillaume