Posted to dev@carbondata.apache.org by xuchuanyin <xu...@hust.edu.cn> on 2018/10/15 13:03:27 UTC

[Proposal] Proposal to change default value of two parameters for data loading

Hi, all:

About a year ago, we introduced the 'multiple dirs for temp data' feature to
solve the disk hotspot problem in data loading.

This feature lets carbon randomly pick one of the local directories
configured in yarn-local-dirs whenever it writes temp files to disk (for
example, sort temp files and fact data files).
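
The random selection described above can be sketched roughly as follows. This
is only an illustration of the idea, not CarbonData's actual implementation:
the class name `TempDirPicker`, the comma-separated input format, and the
example paths are all assumptions made for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

/**
 * Hypothetical helper for illustration only; not a CarbonData class.
 * Spreads temp-file writes across several local directories by picking
 * one of them uniformly at random for each write.
 */
final class TempDirPicker {
    private final List<String> localDirs;
    private final Random random;

    TempDirPicker(String commaSeparatedDirs, Random random) {
        // Split the configured value into individual, non-empty paths.
        this.localDirs = Arrays.stream(commaSeparatedDirs.split(","))
                .map(String::trim)
                .filter(dir -> !dir.isEmpty())
                .collect(Collectors.toList());
        if (localDirs.isEmpty()) {
            throw new IllegalArgumentException("no local directories configured");
        }
        this.random = random;
    }

    /** Returns one of the configured directories, chosen uniformly at random. */
    String pick() {
        return localDirs.get(random.nextInt(localDirs.size()));
    }

    public static void main(String[] args) {
        // Example paths are made up; real values would come from the YARN local dirs.
        TempDirPicker picker =
                new TempDirPicker("/data1/tmp,/data2/tmp,/data3/tmp", new Random());
        for (int i = 0; i < 5; i++) {
            System.out.println("temp file " + i + " -> " + picker.pick());
        }
    }
}
```

Because each write picks its directory independently, no single disk receives
all the temp files of a load, which is what relieves the hotspot.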

After about a year of use in production environments, this feature has proven
to be effective and correct. So I propose enabling the related parameters by
default.

The related parameters are

1. `carbon.use.local.dir`: currently `false` by default; we will change the
default to `true`.

2. `carbon.use.multiple.temp.dir`: currently `false` by default; we will
change the default to `true`.



--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/

Re: [Proposal] Proposal to change default value of two parameters for data loading

Posted by Jacky Li <ja...@qq.com>.
+1




Re: [Proposal] Proposal to change default value of two parameters for data loading

Posted by xuchuanyin <xu...@hust.edu.cn>.
Yes, it needs further modification to meet that requirement: an additional
property is needed, in which multiple directories can be configured.




Re: [Proposal] Proposal to change default value of two parameters for data loading

Posted by xm_zzc <44...@qq.com>.
Hi chuanyin:
  +1 for this. One question: these two parameters only take effect in on-yarn
mode, right? Is it possible to configure a path other than /tmp when the user
runs the app outside of on-yarn mode?


