Posted to dev@carbondata.apache.org by Zhangshunyu <gi...@git.apache.org> on 2016/07/01 08:19:49 UTC

[GitHub] incubator-carbondata pull request #8: Record load performance statistics(con...

GitHub user Zhangshunyu opened a pull request:

    https://github.com/apache/incubator-carbondata/pull/8

    Record load performance statistics(configurable)

    1) A user-configurable parameter, "time.statistics.open", determines whether statistics are recorded and calculated during data loading. The default value is false.
    2) We define a dummy util that does nothing when the statistics do not need to be recorded.
    
    For example, if we set "time.statistics.open" to "true" and run CarbonExample, the output is as follows:
    
    Data load request has been received for table default.t3
    ============================== lru Cache Load Cost Time: 0.003(s)
    ]==========**********Compress One Node For One Thread pool-24-thread-1 Cost: 0.017
    ]========== TIME STATISTICS PartitionID: 0==========
    ]STATISTICS ->Raw data IO cost(load csv to dataFrame and generate block dict): 0.354(s)
    ]STATISTICS ->Distinct Value IO cost(global dict shuffle and write dict file): 0.153(s)
    ]STATISTICS ->  |_maximum distinct column shuffle time: 0.046(s)
    ]STATISTICS ->  |_maximum distinct column write dict file time: 0.087(s)
    ]STATISTICS ->Total cost of gen surrogate key, sort and write to temp files: 0.45(s)
    ]STATISTICS ->  |_read cost of raw csv file: 0.313(s)
    ]STATISTICS ->  |_cost of transform to surrogate key: 0.353(s)
    ]STATISTICS ->  |_io cost(sort rows and write to temp file): 0.32(s)
    ]STATISTICS ->IO cost(tansform to MDK, compress and write fact file): 0.69(s)
    ]==============================
    ]========== BLOCK_INFO ==========
    ]BLOCK_INFO ->Node host: localhost
    ]BLOCK_INFO ->The block count in this node: 1
    ]==============================
    Data load is successful for default.t3
    +-------+------+
    |country|amount|
    +-------+------+
    | france|   101|
    |  china|   849|
    +-------+------+
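
The two points in the description (a switch that turns recording on or off, and a dummy util that does nothing) resemble a null-object arrangement. The following is a minimal sketch under stated assumptions, not the code from this pull request: the names `LoadStatistics`, `NoOpLoadStatistics`, `RecordingLoadStatistics` and `LoadStatisticsSketch` are hypothetical, and a JVM system property stands in for the project's own property lookup.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; this is not the code from the pull request.
final class LoadStatisticsSketch {
  // Pick the implementation from the switch value.
  static LoadStatistics create(boolean enabled) {
    return enabled ? new RecordingLoadStatistics() : new NoOpLoadStatistics();
  }

  public static void main(String[] args) {
    // Assumption: a system property stands in for the real configuration source.
    boolean enabled =
        Boolean.parseBoolean(System.getProperty("time.statistics.open", "false"));
    LoadStatistics stats = create(enabled);

    long start = System.nanoTime();
    // ... the load step being timed would run here ...
    stats.record("example stage", (System.nanoTime() - start) / 1e9);

    System.out.println(stats.snapshot());
  }
}

interface LoadStatistics {
  void record(String stage, double seconds);

  Map<String, Double> snapshot();
}

// Used when statistics are disabled: every call is a no-op.
final class NoOpLoadStatistics implements LoadStatistics {
  @Override
  public void record(String stage, double seconds) {
    // Intentionally empty.
  }

  @Override
  public Map<String, Double> snapshot() {
    return new LinkedHashMap<>();
  }
}

// Used when statistics are enabled: accumulates seconds per stage.
final class RecordingLoadStatistics implements LoadStatistics {
  private final Map<String, Double> costs = new LinkedHashMap<>();

  @Override
  public synchronized void record(String stage, double seconds) {
    costs.merge(stage, seconds, Double::sum);
  }

  @Override
  public synchronized Map<String, Double> snapshot() {
    return new LinkedHashMap<>(costs);
  }
}
```

With this shape the loading code calls `record` unconditionally, and the cost of a disabled switch is one empty method call per stage.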

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/Zhangshunyu/incubator-carbondata stat71

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-carbondata/pull/8.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #8
    
----
commit 6f86351cf7108d04708c9f1771fe1a3e18cb8af9
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-01T12:17:57Z

    record statitics during data loading

commit 753f192e777482c1082b6dcf91a6fb5585cdf93c
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T02:51:26Z

    rebase630

commit b1421f0329a28b9f18ad8afe1bc9d78acb307601
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T07:32:28Z

    new structure

commit e868cda3f48d0e0de1fda2325b44791dafce581f
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T08:07:45Z

    style

commit 903798149076dddc5be7e8991ab2a61e11226579
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T09:53:53Z

    style

commit 1a3b750ee12e796c2bba19ba486da77f2bfc9922
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T10:01:09Z

    style

commit c0da49d7ebf90150ad1cef3168968cd63b5561a7
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T10:07:18Z

    style

commit ccf96a3576df6c1b2dd703038d579893a5a3bf26
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-06-30T10:15:51Z

    style

commit 1f3ea224addfc46d59130872cb2fd1ab8a844b8c
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T01:04:39Z

    modify the peramerter

commit 4d8ac7cf30bb08a50600bcabb7c62bcf21ca975b
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T03:16:30Z

    remove set for nopar stat

commit 5938bfce96061eaf309d903ccc7f9686bc0a68a5
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T03:21:08Z

    style

commit bd92401605ee3dedf971fef0b8fab008cea4b720
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T07:31:35Z

    remove tree set

commit 708686761b0bd3dc1c0c97f94140692ce8ca9a97
Author: Zhangshunyu <zh...@huawei.com>
Date:   2016-07-01T07:38:49Z

    fix style

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298991
  
    --- Diff: core/src/main/java/org/carbondata/core/util/CarbonTimeStatisticsFactory.java ---
    @@ -0,0 +1,45 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.carbondata.core.util;
    +
    +import org.carbondata.core.constants.CarbonCommonConstants;
    +
    +public class CarbonTimeStatisticsFactory {
    +  private static String timeStatisticsUtilType;
    +
    +  static {
    +    CarbonTimeStatisticsFactory.updateTimeStatisticsUtilStatus();
    +  }
    +
    +  private static void updateTimeStatisticsUtilStatus() {
    +    timeStatisticsUtilType = CarbonProperties.getInstance()
    +        .getProperty(CarbonCommonConstants.TIME_STAT_OPEN,
    +            CarbonCommonConstants.TIME_STAT_OPEN_DEFAULT);
    +  }
    +
    +  public static AbstractTimeStatisticsUtil getTimeStatisticsUtil() {
    +    switch (timeStatisticsUtilType.toLowerCase()) {
    --- End diff --
    
    No need to check every time. You can create the instance in the static block.
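
One way to read this suggestion is to resolve the implementation once, when the class is initialized, and return the cached instance afterwards. The sketch below assumes that reading; `StaticInitFactorySketch` and its nested `Stats` interface are hypothetical names, and a JVM system property stands in for `CarbonProperties`.

```java
// Hypothetical names; a sketch of one reading of the review comment.
final class StaticInitFactorySketch {
  interface Stats {
    boolean isRecording();
  }

  // Resolved once during class initialization and never re-evaluated.
  private static final Stats INSTANCE;

  static {
    // Assumption: a system property stands in for the real configuration source.
    boolean enabled =
        Boolean.parseBoolean(System.getProperty("time.statistics.open", "false"));
    INSTANCE = enabled ? () -> true : () -> false;
  }

  // No per-call switch: callers always receive the cached instance.
  static Stats getInstance() {
    return INSTANCE;
  }

  public static void main(String[] args) {
    System.out.println(getInstance().isRecording());
  }
}
```

The trade-off is that a change to the property after the class has been initialized is not picked up.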



[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298468
  
    --- Diff: core/src/main/java/org/carbondata/core/constants/CarbonCommonConstants.java ---
    @@ -415,6 +415,14 @@
        */
       public static final String AGGREAGATE_COLUMNAR_KEY_BLOCK_DEFAULTVALUE = "true";
       /**
    +   * TIME_STAT_UTIL_TYPE
    +   */
    +  public static final String TIME_STAT_OPEN = "time.statistics.open";
    --- End diff --
    
    The name is hard to understand. Better to change it to `enable.data.loading.statistics`.



[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298587
  
    --- Diff: core/src/main/java/org/carbondata/core/util/CarbonTimeStatisticsFactory.java ---
    @@ -0,0 +1,45 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.carbondata.core.util;
    +
    +import org.carbondata.core.constants.CarbonCommonConstants;
    +
    +public class CarbonTimeStatisticsFactory {
    +  private static String timeStatisticsUtilType;
    +
    +  static {
    +    CarbonTimeStatisticsFactory.updateTimeStatisticsUtilStatus();
    +  }
    +
    +  private static void updateTimeStatisticsUtilStatus() {
    +    timeStatisticsUtilType = CarbonProperties.getInstance()
    +        .getProperty(CarbonCommonConstants.TIME_STAT_OPEN,
    +            CarbonCommonConstants.TIME_STAT_OPEN_DEFAULT);
    +  }
    +
    +  public static AbstractTimeStatisticsUtil getTimeStatisticsUtil() {
    +    switch (timeStatisticsUtilType.toLowerCase()) {
    +      case "false":
    +        return CarbonTimeStatisticsDummyUtil.getInstance();
    +      case "true":
    +        return CarbonTimeStatisticsUtil.getInstance();
    +      default:
    +        throw new UnsupportedOperationException("Not supported statistics type");
    --- End diff --
    
    Don't throw an exception. Just log it and use the default value.
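
A minimal sketch of the log-and-fall-back behaviour suggested here, assuming the default is "false" as stated in the pull request description. `LenientSwitchSketch` and `parseSwitch` are hypothetical names, and `java.util.logging` stands in for whatever logger the project actually uses.

```java
import java.util.Locale;
import java.util.logging.Logger;

// Hypothetical names; java.util.logging is a stand-in for the project's logger.
final class LenientSwitchSketch {
  private static final Logger LOG =
      Logger.getLogger(LenientSwitchSketch.class.getName());

  // Assumption: mirrors the default value "false" from the description.
  static final boolean DEFAULT_ENABLED = false;

  // Returns the parsed switch, or the default when the value is missing or unrecognized.
  static boolean parseSwitch(String raw) {
    if (raw == null) {
      return DEFAULT_ENABLED;
    }
    switch (raw.trim().toLowerCase(Locale.ROOT)) {
      case "true":
        return true;
      case "false":
        return false;
      default:
        // Log the bad value instead of throwing, then fall back.
        LOG.warning("Unrecognized value '" + raw
            + "' for time.statistics.open; using default " + DEFAULT_ENABLED);
        return DEFAULT_ENABLED;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseSwitch("TRUE"));
    System.out.println(parseSwitch("yes"));
  }
}
```

A misconfigured value then costs one warning in the log rather than a failed data load.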

