Posted to user@spark.apache.org by Blind Faith <pe...@gmail.com> on 2014/11/08 23:20:18 UTC

Does Spark work on multicore systems?

I am a Spark newbie and I use Python (PySpark). I am trying to run a
program on a 64-core system, but no matter what I do, it always uses one
core. It doesn't matter whether I run it with "spark-submit --master
local[64] run.sh" or call x.repartition(64) on an RDD in my code; the
Spark program always uses one core. Has anyone successfully run Spark
programs on multicore processors? Can someone give me a very simple
example that properly runs on all cores of a multicore system?

Re: Does Spark work on multicore systems?

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Try adding the following entry to your conf/spark-defaults.conf file:

spark.cores.max 64
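
For reference, a minimal PySpark sketch of the equivalent programmatic
configuration (assumed 1.x-era API; as I understand it, spark.cores.max
mainly applies when running against a standalone or Mesos cluster, while the
local[N] master string is what controls the thread count on a single machine):

# Hedged sketch: set the master and core cap in code instead of spark-defaults.conf
from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("multicore-test")      # hypothetical app name
        .setMaster("local[64]")            # 64 local worker threads
        .set("spark.cores.max", "64"))     # total-core cap (standalone/Mesos clusters)
sc = SparkContext(conf=conf)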

Thanks
Best Regards

On Sun, Nov 9, 2014 at 3:50 AM, Blind Faith <pe...@gmail.com>
wrote:

> I am a Spark newbie and I use Python (PySpark). I am trying to run a
> program on a 64-core system, but no matter what I do, it always uses one
> core. It doesn't matter whether I run it with "spark-submit --master
> local[64] run.sh" or call x.repartition(64) on an RDD in my code; the
> Spark program always uses one core. Has anyone successfully run Spark
> programs on multicore processors? Can someone give me a very simple
> example that properly runs on all cores of a multicore system?
>

Re: Does Spark work on multicore systems?

Posted by Sonal Goyal <so...@gmail.com>.
Also, the level of parallelism is affected by how big your input is.
Could this be the problem in your case?
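
A quick hedged sketch of how to check and raise the partition count in
PySpark (the file name and numbers below are hypothetical; getNumPartitions
assumes Spark 1.1+):

# Small inputs can end up with only one or two partitions, which pins the job to few cores.
from pyspark import SparkContext

sc = SparkContext("local[64]", "parallelism-check")

rdd = sc.textFile("input.txt", minPartitions=64)     # ask for at least 64 splits
print(rdd.getNumPartitions())

data = sc.parallelize(range(100000), numSlices=64)   # explicit slice count for in-memory data
print(data.getNumPartitions())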

On Sunday, November 9, 2014, Aaron Davidson <il...@gmail.com> wrote:

> oops, meant to cc userlist too
>
> On Sat, Nov 8, 2014 at 3:13 PM, Aaron Davidson <ilikerps@gmail.com> wrote:
>
>> The default local master is "local[*]", which should use all cores on
>> your system. So you should be able to just do "./bin/pyspark" and
>> "sc.parallelize(range(1000)).count()" and see that all your cores were used.
>>
>> On Sat, Nov 8, 2014 at 2:20 PM, Blind Faith <person.of.book@gmail.com> wrote:
>>
>>> I am a Spark newbie and I use Python (PySpark). I am trying to run a
>>> program on a 64-core system, but no matter what I do, it always uses one
>>> core. It doesn't matter whether I run it with "spark-submit --master
>>> local[64] run.sh" or call x.repartition(64) on an RDD in my code; the
>>> Spark program always uses one core. Has anyone successfully run Spark
>>> programs on multicore processors? Can someone give me a very simple
>>> example that properly runs on all cores of a multicore system?
>>>
>>
>>
>

-- 
Best Regards,
Sonal
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>

Re: Does Spark work on multicore systems?

Posted by Aaron Davidson <il...@gmail.com>.
oops, meant to cc userlist too

On Sat, Nov 8, 2014 at 3:13 PM, Aaron Davidson <il...@gmail.com> wrote:

> The default local master is "local[*]", which should use all cores on your
> system. So you should be able to just do "./bin/pyspark" and
> "sc.parallelize(range(1000)).count()" and see that all your cores were used.
>
> On Sat, Nov 8, 2014 at 2:20 PM, Blind Faith <pe...@gmail.com>
> wrote:
>
>> I am a Spark newbie and I use Python (PySpark). I am trying to run a
>> program on a 64-core system, but no matter what I do, it always uses one
>> core. It doesn't matter whether I run it with "spark-submit --master
>> local[64] run.sh" or call x.repartition(64) on an RDD in my code; the
>> Spark program always uses one core. Has anyone successfully run Spark
>> programs on multicore processors? Can someone give me a very simple
>> example that properly runs on all cores of a multicore system?
>>
>
>
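
For completeness, a minimal sketch of the check described above, assuming a
plain "./bin/pyspark" shell (the master defaults to local[*], so sc already
exists and should use all cores):

# Run inside the pyspark shell; watch CPU usage while count() executes.
print(sc.defaultParallelism)            # usually equals the number of cores
rdd = sc.parallelize(range(1000))
print(rdd.getNumPartitions())           # partitions spread across local threads
print(rdd.count())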