Posted to dev@carbondata.apache.org by lianganping <li...@163.com> on 2018/11/09 02:04:52 UTC

[DISCUSSION] refining usage of numberofcores in CarbonProperties

1. Many places call the 'getNumOfCores' function of CarbonProperties, which
returns the number of cores configured for loading.
2. So if we reuse this value in scenarios like 'query' or 'compaction', it
is confusing (see the sketch just below).
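
For illustration, a simplified sketch of such a shared getter (illustrative
only, not the exact CarbonData source; NUM_CORES_DEFAULT_VAL is assumed to
hold the documented default of "2"):

    // Sketch of a single shared getter: every caller, including
    // compaction and query code paths, ends up reading the property
    // that was meant for loading.
    def getNumOfCores(): Int = {
      try {
        Integer.parseInt(
          CarbonProperties.getInstance().getProperty(
            CarbonCommonConstants.NUM_CORES_LOADING,
            CarbonCommonConstants.NUM_CORES_DEFAULT_VAL))
      } catch {
        case _: NumberFormatException =>
          // fall back to the default when the configured value is not a number
          Integer.parseInt(CarbonCommonConstants.NUM_CORES_DEFAULT_VAL)
      }
    }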

When loading data, if the property 'carbon.number.of.cores.while.loading' is
unset, the number of loading cores does not stay at the default value;
instead it is derived from the Spark conf and written back into
CarbonProperties, which is a singleton. The code:
    // get the value of 'spark.executor.cores' from spark conf, default value is 1
    val sparkExecutorCores =
      sparkSession.sparkContext.conf.get("spark.executor.cores", "1")
    // get the value of 'carbon.number.of.cores.while.loading' from carbon properties,
    // default value is the value of 'spark.executor.cores'
    val numCoresLoading =
      try {
        CarbonProperties.getInstance()
            .getProperty(CarbonCommonConstants.NUM_CORES_LOADING, sparkExecutorCores)
      } catch {
        case exc: NumberFormatException =>
          LOGGER.error("Configured value for property " +
              CarbonCommonConstants.NUM_CORES_LOADING +
              " is wrong. Falling back to the default value " + sparkExecutorCores)
          sparkExecutorCores
      }
    // update the property with new value
    carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_LOADING, numCoresLoading)

So should the 'cores' value for compaction or partition be handled in the
same way?

I have opened PR 2907 <https://github.com/apache/carbondata/pull/2907> for this.




Re: [DISCUSSION] refining usage of numberofcores in CarbonProperties

Posted by xuchuanyin <xu...@hust.edu.cn>.
I think you are talking about 2 problems.

The first is that we have several configurations about numberOfCores, such
as the cores for Loading/Compaction/AlterPartition. Currently they all use
the same method to get the configured value, which means they are all
actually using the numberOfCoresWhileLoading. Your PR#2907 fixes this
problem (a sketch of this direction follows).
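
In other words, each operation should have its own lookup, roughly like this
(a hedged sketch only; the method names, the getConfiguredCores helper, and
the NUM_CORES_COMPACTING/NUM_CORES_ALT_PARTITION constants are illustrative
and may differ from what PR#2907 actually does):

    // Illustrative sketch -- one getter per operation instead of one for all.
    def getNumberOfLoadingCores(): Int =
      getConfiguredCores(CarbonCommonConstants.NUM_CORES_LOADING)

    def getNumberOfCompactingCores(): Int =
      getConfiguredCores(CarbonCommonConstants.NUM_CORES_COMPACTING)

    def getNumberOfAltPartitionCores(): Int =
      getConfiguredCores(CarbonCommonConstants.NUM_CORES_ALT_PARTITION)

    // hypothetical helper: parse the configured value, fall back on bad input
    private def getConfiguredCores(propertyName: String): Int = {
      val default = CarbonCommonConstants.NUM_CORES_DEFAULT_VAL
      try {
        Integer.parseInt(getProperty(propertyName, default))
      } catch {
        case _: NumberFormatException => Integer.parseInt(default)
      }
    }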

But there is another problem: for data loading, the document says it will
use 2 cores by default if the numberOfCoresWhileLoading is not configured.
But actually in our code, we are using the numberOfCoresForCurrentExecutor
as the default value. This problem is not yet resolved in your PR#2907.

To fix the second problem, my suggestions are as below:

1. You can update the document for the default value of
numOfCoresWhileLoading. The default value will be 'spark.executor.cores'.

2. At the same time, I think you should also optimize the default value of
numOfCoresWhileCompaction. To keep the behavior consistent, the default
value should also be 'spark.executor.cores'. This requires modifying the
document as well as the code (see the sketch after this list).
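
For instance, the loading-side snippet quoted in the first mail could be
mirrored for compaction roughly as follows (a sketch, assuming a
NUM_CORES_COMPACTING constant for 'carbon.number.of.cores.while.compacting';
not the final code):

    // Sketch: default the compaction cores to 'spark.executor.cores',
    // mirroring the loading path quoted earlier in this thread.
    val sparkExecutorCores =
      sparkSession.sparkContext.conf.get("spark.executor.cores", "1")
    val numCoresCompacting =
      CarbonProperties.getInstance()
          .getProperty(CarbonCommonConstants.NUM_CORES_COMPACTING, sparkExecutorCores)
    // write the resolved value back so subsequent readers see it
    carbonProperty.addProperty(CarbonCommonConstants.NUM_CORES_COMPACTING, numCoresCompacting)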




Re: [DISCUSSION] refining usage of numberofcores in CarbonProperties

Posted by xuchuanyin <xu...@hust.edu.cn>.
In addition to the last mail: the numCoresOfAlterPartition can be handled
similarly.

Please remember to fix these in another PR, not in PR#2907.


