Posted to user@spark.apache.org by Alexey Romanchuk <al...@gmail.com> on 2014/09/25 08:30:46 UTC

How to increase number of Active Stages

Hello!

I run a local Spark cluster with 64 cores in total and am migrating data
from protobuf to Parquet. After consolidating a number of protobuf files into
one big Parquet file, I save it to HDFS. The save takes a long time and uses
only 1 core.

To make the migration faster, I start many migration tasks in parallel.
After some time I have 8 active stages saving files and only 8 cores
in use (see screenshot). Is there any way to increase the maximum number of
active stages?

Thanks
[image: Inline image 2]

Re: How to increase number of Active Stages

Posted by Alexey Romanchuk <al...@gmail.com>.
Hey Akhil!

Thanks for the reply. Yes, I have checked the docs on the official site. I
need to save exactly one partition and just want to increase the number of
active tasks.

On Thu, Sep 25, 2014 at 1:43 PM, Akhil Das <ak...@sigmoidanalytics.com>
wrote:

> Have a look at http://spark.apache.org/docs/1.0.0/tuning.html
> One thing you can try is to increase the number of partitions so that it
> is >= the number of cores.
>
> Thanks
> Best Regards

Re: How to increase number of Active Stages

Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Have a look at http://spark.apache.org/docs/1.0.0/tuning.html
One thing you can try is to increase the number of partitions so that it
is >= the number of cores.

Thanks
Best Regards
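
The partition advice above follows from how Spark schedules work: each
partition of a stage becomes one task, and each task occupies
spark.task.cpus cores (1 by default). The following is a minimal sketch of
that arithmetic in plain Python, not Spark code. The function name
busy_cores and its parameters are made up for illustration, and the model
deliberately ignores executor boundaries and scheduling delays. Under those
assumptions it shows why eight single-partition save stages would keep only
8 of 64 cores busy.

```python
def busy_cores(partitions_per_stage, total_cores, cpus_per_task=1):
    """Rough upper bound on cores in use across the active stages.

    partitions_per_stage: one entry per active stage, giving the number
    of partitions (and therefore tasks) that stage still has to run.
    This is a simplified model, not a Spark API.
    """
    # One task per partition, summed over all active stages.
    pending_tasks = sum(partitions_per_stage)
    # How many tasks fit on the cluster at the same time.
    task_slots = total_cores // cpus_per_task
    running_tasks = min(pending_tasks, task_slots)
    return running_tasks * cpus_per_task


# Eight active stages, each saving exactly one partition, on 64 cores.
print(busy_cores([1] * 8, total_cores=64))   # 8

# A single stage whose data is split into 64 partitions.
print(busy_cores([64], total_cores=64))      # 64
```

In this model, a job that must write exactly one partition can never use
more than one core for the save, so the only ways to occupy more cores are
more partitions per job or more jobs running at once.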
