Posted to dev@carbondata.apache.org by 岑玉海 <ce...@163.com> on 2017/10/30 05:13:35 UTC

Re: [DISCUSS] Change task distribution mechanism

+1






Best regards!
Yuhai Cen


On 2017-10-30 13:07, Jacky Li <ja...@qq.com> wrote:
Hi All,

Currently, in CarbonScanRDD in the carbondata Spark integration module, carbon overrides Spark's task distribution mechanism. This was required in older versions of carbon because in the V1 and V2 formats the blocklet size in the file is small, so distributing Spark tasks according to the number of blocklets improves task parallelism.

However, this feature is not required for the V3 format. The blocklet size is now much bigger, so the feature brings little benefit and makes the code very complex. Furthermore, it is not good to manipulate even the executor allocation in the carbon layer.

So I suggest removing this feature.

Regards,
Jacky Li