Posted to dev@carbondata.apache.org by Manhua Jiang <ma...@apache.org> on 2020/04/02 12:30:53 UTC

Carbon over-use cluster resources

Hi All,
Recently, I found that Carbon over-uses cluster resources. In general, Carbon's workflow does not behave like a common Spark task, which does one small piece of work in one thread; instead, each Carbon task has its own internal parallelism logic.

For example,
1. Launch Carbon with --num-executors=1 but set carbon.number.of.cores.while.loading=10;
2. For a no_sort table with multi-block input, say N Iterator<CarbonRowBatch> inputs, Carbon will start N tasks in parallel. In each task, CarbonFactDataHandlerColumnar uses model.getNumberOfCores() (let's say C) threads in its ProducerPool, so in total N*C threads are launched. ==> This is the case that makes me treat this as a serious problem: too many threads prevent the executor from sending heartbeats, and it gets killed.

So, the over-use comes from how thread pools are used.

This affects the cluster's overall resource usage and may lead to misleading performance results.

I hope you keep this in mind when fixing bugs or writing new code.
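The thread multiplication described above can be sketched as follows. This is an illustration of the pattern only, with invented class and variable names, not the actual CarbonData code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration: each task builds its own producer pool of size C, so a
// single executor JVM ends up with N*C live worker threads at once.
public class ThreadMultiplication {

    // returns how many worker threads are alive at the same time
    static int countConcurrentThreads(int numTasks, int coresWhileLoading)
            throws InterruptedException {
        int total = numTasks * coresWhileLoading;
        CountDownLatch allStarted = new CountDownLatch(total);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger live = new AtomicInteger();

        ExecutorService[] pools = new ExecutorService[numTasks];
        for (int t = 0; t < numTasks; t++) {
            // each "task" creates its own ProducerPool-like thread pool
            pools[t] = Executors.newFixedThreadPool(coresWhileLoading);
            for (int c = 0; c < coresWhileLoading; c++) {
                pools[t].submit(() -> {
                    live.incrementAndGet();   // this producer thread is now running
                    allStarted.countDown();
                    try { release.await(); } catch (InterruptedException ignored) { }
                });
            }
        }
        allStarted.await();                   // all N*C threads are alive at once
        int result = live.get();
        release.countDown();
        for (ExecutorService p : pools) p.shutdownNow();
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        // N = 4 parallel tasks, C = carbon.number.of.cores.while.loading = 10
        System.out.println("concurrent worker threads: "
                + countConcurrentThreads(4, 10)); // prints 40
    }
}
```

With --num-executors=1 on a one-vCore container, those 40 threads all compete for the same core, which matches the heartbeat starvation described above.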

Re: Carbon over-use cluster resources

Posted by Manhua Jiang <ma...@apache.org>.
Hi Ajantha,
Looking at this problem from the opposite direction, Carbon may also waste resources if users do not set the properties correctly.

What about the case of concurrent loading?

So first of all, we need to figure out where executor services are used and how many there are. If we keep the one-node-one-task logic, we need to bound the overall number of running threads within a task.

Then, a little thinking:
Is a global executor service possible? That may introduce dependencies between the different steps of loading.
Are multiple executor services, one per loading step (or similar), possible? Could those executor services change size? (For example, once local-sort is done, most threads could work on writing and none on input reading and converting.)
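If the per-step pools could change size, one sketch of the idea might look like the following. This is purely hypothetical; the class and method names are invented for illustration and do not exist in CarbonData:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a node-wide pool per loading step whose size can be
// changed at runtime, e.g. shifting threads from sorting to writing once
// the local-sort step finishes.
public final class ResizableStepPool {
    private final ThreadPoolExecutor executor;

    public ResizableStepPool(int threads) {
        executor = new ThreadPoolExecutor(threads, threads,
                60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
    }

    // ThreadPoolExecutor requires core <= max at all times, so the order of
    // the two setters depends on whether we grow or shrink.
    public void resize(int threads) {
        if (threads > executor.getMaximumPoolSize()) {
            executor.setMaximumPoolSize(threads);
            executor.setCorePoolSize(threads);
        } else {
            executor.setCorePoolSize(threads);
            executor.setMaximumPoolSize(threads);
        }
    }

    public int size() { return executor.getCorePoolSize(); }

    public void shutdown() { executor.shutdown(); }

    public static void main(String[] args) {
        ResizableStepPool pool = new ResizableStepPool(10);
        pool.resize(4); // e.g. local-sort finished, shrink this step's pool
        System.out.println("pool size now: " + pool.size()); // prints 4
        pool.shutdown();
    }
}
```

The total across all step pools would still need to be capped at the configured per-node budget for this to prevent over-use.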

BTW, do you know why the configuration "carbon.number.of.cores.while.loading" was introduced?




On 2020/04/15 13:54:50, Ajantha Bhat <aj...@gmail.com> wrote: 
> Hi Manhua,
> 
> For No sort and Local sort only, we don't follow the Spark task launch logic;
> we have our own logic of one node, one task. And inside that task we can
> control resources by configuration (carbon.number.of.cores.while.loading).
> 
> As you pointed out in the above mail, *N * C is controlled by configuration*
> and the default value of C is 2.
> *I see a cluster over-use problem only if it is configured badly.*
> 
> Do you have any suggestion to change the design? Feel free to raise a
> discussion and work on it.
> 
> Thanks,
> Ajantha
> 
> On Tue, Apr 14, 2020 at 6:06 PM Liang Chen <ch...@gmail.com> wrote:
> 
> > OK, thank you feedbacked this issue, let us look into it.
> >
> > Regards
> > Liang
> >
> >
> > Manhua Jiang wrote
> > > Hi All,
> > > Recently, I found carbon over-use cluster resources. Generally the design
> > > of carbon work flow does not act as common spark task which only do one
> > > small work in one thread, but the task has its mind/logic.
> > >
> > > For example,
> > > 1.launch carbon with --num-executors=1 but set
> > > carbon.number.of.cores.while.loading=10;
> > > 2.no_sort table with multi-block input, N Iterator
> > > <CarbonRowBatch>
> > >  for example, carbon will start N tasks in parallel. And in each task the
> > > CarbonFactDataHandlerColumnar has model.getNumberOfCores() (let's say C)
> > > in ProducerPool. Totally launch N*C threads; ==>This is the case makes me
> > > take this as serious problem. To many threads stucks the executor to send
> > > heartbeat and be killed.
> > >
> > > So, the over-use is related to usage of threadpool.
> > >
> > > This would affect the cluster overall resource usage and may lead to
> > wrong
> > > performance results.
> > >
> > > I hope this get your notice while fixing or writing new codes.
> >
> >
> >
> >
> >
> > --
> > Sent from:
> > http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
> >
> 

Re: Carbon over-use cluster resources

Posted by Manhua Jiang <ma...@apache.org>.
Hi Vishal,
What you said, "1 core launching 2 threads", could be the view from the system level, right?
In YARN mode, what the application gets are vCores, so Carbon should not treat one as a physical core.

On 2020/04/16 16:15:23, Kumar Vishal <ku...@gmail.com> wrote: 
> Hi Manhua,
> In addition to what Ajantha said: all the configurations are exposed to the
> user.
> And by default the number of threads is 2, so launching 2 threads on 1 core is okay.
> 
> -Regards
> Kumar Vishal
> 

Re: Carbon over-use cluster resources

Posted by Kumar Vishal <ku...@gmail.com>.
Hi Manhua,
In addition to what Ajantha said: all the configurations are exposed to the
user.
And by default the number of threads is 2, so launching 2 threads on 1 core is okay.

-Regards
Kumar Vishal

On Wed, 15 Apr 2020 at 9:55 PM, Ajantha Bhat <aj...@gmail.com> wrote:

> Hi Manhua,
>
> For No sort and Local sort only, we don't follow the Spark task launch logic;
> we have our own logic of one node, one task. And inside that task we can
> control resources by configuration (carbon.number.of.cores.while.loading).
> 
> As you pointed out in the above mail, *N * C is controlled by configuration*
> and the default value of C is 2.
> *I see a cluster over-use problem only if it is configured badly.*
> 
> Do you have any suggestion to change the design? Feel free to raise a
> discussion and work on it.
>
> Thanks,
> Ajantha
>

Re: Carbon over-use cluster resources

Posted by Ajantha Bhat <aj...@gmail.com>.
Hi Manhua,

For No sort and Local sort only, we don't follow the Spark task launch logic;
we have our own logic of one node, one task. And inside that task we can
control resources by configuration (carbon.number.of.cores.while.loading).

As you pointed out in the above mail, *N * C is controlled by configuration*
and the default value of C is 2.
*I see a cluster over-use problem only if it is configured badly.*

Do you have any suggestion to change the design? Feel free to raise a
discussion and work on it.

Thanks,
Ajantha

On Tue, Apr 14, 2020 at 6:06 PM Liang Chen <ch...@gmail.com> wrote:

> OK, thank you for reporting this issue; let us look into it.
>
> Regards
> Liang
>

Re: Carbon over-use cluster resources

Posted by 爱在西元前 <37...@qq.com>.
Liang, Carbon's current implementation does indeed have quite a few places where temporary threads are created for concurrent processing. We will look into it internally first and decide how to respond.
Also, Carbon is still led by Kun; a few of us are working on it together with him.




------------------ Original ------------------
From: "chenliang6136" <chenliang6136@gmail.com>
Date: Tue, Apr 14, 2020 08:36 PM
To: "dev" <dev@carbondata.apache.org>

Subject: Re: Carbon over-use cluster resources



OK, thank you for reporting this issue; let us look into it.

Regards
Liang



Re: Carbon over-use cluster resources

Posted by Liang Chen <ch...@gmail.com>.
OK, thank you for reporting this issue; let us look into it.

Regards
Liang


Manhua Jiang wrote
> Hi All,
> Recently, I found that Carbon over-uses cluster resources. In general, Carbon's workflow
> does not behave like a common Spark task, which does one small piece of work in one
> thread; instead, each Carbon task has its own internal parallelism logic.
> 
> For example,
> 1. Launch Carbon with --num-executors=1 but set
> carbon.number.of.cores.while.loading=10;
> 2. For a no_sort table with multi-block input, say N Iterator
> <CarbonRowBatch>
> inputs, Carbon will start N tasks in parallel. In each task,
> CarbonFactDataHandlerColumnar uses model.getNumberOfCores() (let's say C)
> threads in its ProducerPool, so in total N*C threads are launched. ==> This is the case
> that makes me treat this as a serious problem: too many threads prevent the executor
> from sending heartbeats, and it gets killed.
> 
> So, the over-use comes from how thread pools are used.
> 
> This affects the cluster's overall resource usage and may lead to misleading
> performance results.
> 
> I hope you keep this in mind when fixing bugs or writing new code.





--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: Carbon over-use cluster resources

Posted by David CaiQiang <da...@gmail.com>.
Hi Manhua,
  Currently no_sort reuses the loading flow of local_sort. That is not a good
solution and leads to the situation you mentioned. In my opinion, we need to
adjust the loading flow of no_sort, perhaps to eventually work like global_sort.

  In addition, the producer-consumer pattern in data encoding and compression
can also be optimized for no_sort and global_sort: perhaps just prefetch one
page and process it, instead of using a thread pool.



-----
Best Regards
David Cai
--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/