Posted to dev@carbondata.apache.org by Zhangshunyu <gi...@git.apache.org> on 2016/07/01 08:19:49 UTC
[GitHub] incubator-carbondata pull request #8: Record load performance statistics(con...
GitHub user Zhangshunyu opened a pull request:
https://github.com/apache/incubator-carbondata/pull/8
Record load performance statistics(configurable)
1) A user-configurable parameter, "time.statistics.open", determines whether statistics are recorded and calculated during data loading. The default value is false.
2) We defined a dummy util that does nothing, for use when the statistics do not need to be recorded.
For example, with "time.statistics.open" set to "true", running CarbonExample prints the following messages:
Data load request has been received for table default.t3
============================== lru Cache Load Cost Time: 0.003(s)
]==========**********Compress One Node For One Thread pool-24-thread-1 Cost: 0.017
]========== TIME STATISTICS PartitionID: 0==========
]STATISTICS ->Raw data IO cost(load csv to dataFrame and generate block dict): 0.354(s)
]STATISTICS ->Distinct Value IO cost(global dict shuffle and write dict file): 0.153(s)
]STATISTICS -> |_maximum distinct column shuffle time: 0.046(s)
]STATISTICS -> |_maximum distinct column write dict file time: 0.087(s)
]STATISTICS ->Total cost of gen surrogate key, sort and write to temp files: 0.45(s)
]STATISTICS -> |_read cost of raw csv file: 0.313(s)
]STATISTICS -> |_cost of transform to surrogate key: 0.353(s)
]STATISTICS -> |_io cost(sort rows and write to temp file): 0.32(s)
]STATISTICS ->IO cost(tansform to MDK, compress and write fact file): 0.69(s)
]==============================
]========== BLOCK_INFO ==========
]BLOCK_INFO ->Node host: localhost
]BLOCK_INFO ->The block count in this node: 1
]==============================
Data load is successful for default.t3
+-------+------+
|country|amount|
+-------+------+
| france| 101|
| china| 849|
+-------+------+
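The design described in points 1) and 2) above (a configuration switch plus a do-nothing util) can be sketched roughly as follows. This is only an illustration and not the code in this pull request: every class and method name here (`LoadStatsRecorder`, `NoOpLoadStatsRecorder`, `TimingLoadStatsRecorder`, `LoadStatsRecorders.forFlag`) is hypothetical, and the flag is passed in as a plain string instead of being read from the real configuration.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; none of these names come from the pull request.
class LoadStatisticsSketch {
  public static void main(String[] args) {
    // In the real project the flag would come from "time.statistics.open".
    LoadStatsRecorder recorder = LoadStatsRecorders.forFlag("true");
    long start = System.nanoTime();
    // ... a data loading step would run here ...
    recorder.record("read cost of raw csv file", System.nanoTime() - start);
    System.out.print(recorder.report());
  }
}

// Callers only see this interface, so they never branch on the flag.
interface LoadStatsRecorder {
  void record(String step, long nanos);
  String report();
}

// The "dummy util": every call is a no-op, so disabled statistics cost nothing.
final class NoOpLoadStatsRecorder implements LoadStatsRecorder {
  public void record(String step, long nanos) { }
  public String report() { return ""; }
}

// Accumulates elapsed time per step, in insertion order.
final class TimingLoadStatsRecorder implements LoadStatsRecorder {
  private final Map<String, Long> costs = new LinkedHashMap<>();

  public synchronized void record(String step, long nanos) {
    costs.merge(step, nanos, Long::sum);
  }

  public synchronized String report() {
    StringBuilder sb = new StringBuilder();
    for (Map.Entry<String, Long> e : costs.entrySet()) {
      sb.append(String.format(Locale.ROOT, "STATISTICS ->%s: %.3f(s)%n",
          e.getKey(), e.getValue() / 1e9));
    }
    return sb.toString();
  }
}

final class LoadStatsRecorders {
  // Anything other than "true" (case-insensitive) selects the no-op recorder.
  static LoadStatsRecorder forFlag(String flag) {
    return Boolean.parseBoolean(flag)
        ? new TimingLoadStatsRecorder()
        : new NoOpLoadStatsRecorder();
  }
}
```

With this shape, the loading code calls `record(...)` unconditionally, and the choice between real and dummy behaviour is made in one place.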
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Zhangshunyu/incubator-carbondata stat71
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-carbondata/pull/8.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #8
----
commit 6f86351cf7108d04708c9f1771fe1a3e18cb8af9
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-01T12:17:57Z
record statitics during data loading
commit 753f192e777482c1082b6dcf91a6fb5585cdf93c
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T02:51:26Z
rebase630
commit b1421f0329a28b9f18ad8afe1bc9d78acb307601
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T07:32:28Z
new structure
commit e868cda3f48d0e0de1fda2325b44791dafce581f
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T08:07:45Z
style
commit 903798149076dddc5be7e8991ab2a61e11226579
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T09:53:53Z
style
commit 1a3b750ee12e796c2bba19ba486da77f2bfc9922
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T10:01:09Z
style
commit c0da49d7ebf90150ad1cef3168968cd63b5561a7
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T10:07:18Z
style
commit ccf96a3576df6c1b2dd703038d579893a5a3bf26
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-06-30T10:15:51Z
style
commit 1f3ea224addfc46d59130872cb2fd1ab8a844b8c
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-07-01T01:04:39Z
modify the peramerter
commit 4d8ac7cf30bb08a50600bcabb7c62bcf21ca975b
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-07-01T03:16:30Z
remove set for nopar stat
commit 5938bfce96061eaf309d903ccc7f9686bc0a68a5
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-07-01T03:21:08Z
style
commit bd92401605ee3dedf971fef0b8fab008cea4b720
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-07-01T07:31:35Z
remove tree set
commit 708686761b0bd3dc1c0c97f94140692ce8ca9a97
Author: Zhangshunyu <zh...@huawei.com>
Date: 2016-07-01T07:38:49Z
fix style
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298991
--- Diff: core/src/main/java/org/carbondata/core/util/CarbonTimeStatisticsFactory.java ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.carbondata.core.util;
+
+import org.carbondata.core.constants.CarbonCommonConstants;
+
+public class CarbonTimeStatisticsFactory {
+ private static String timeStatisticsUtilType;
+
+ static {
+ CarbonTimeStatisticsFactory.updateTimeStatisticsUtilStatus();
+ }
+
+ private static void updateTimeStatisticsUtilStatus() {
+ timeStatisticsUtilType = CarbonProperties.getInstance()
+ .getProperty(CarbonCommonConstants.TIME_STAT_OPEN,
+ CarbonCommonConstants.TIME_STAT_OPEN_DEFAULT);
+ }
+
+ public static AbstractTimeStatisticsUtil getTimeStatisticsUtil() {
+ switch (timeStatisticsUtilType.toLowerCase()) {
--- End diff --
There is no need to check this on every call. You can create the instance once in the static block.
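A minimal sketch of this suggestion, under stated assumptions: `StatsUtil`, `DummyStatsUtil`, `RealStatsUtil` and `StatsFactory` are hypothetical stand-ins for the classes in the diff, and `System.getProperty` stands in for `CarbonProperties`, which is not reproduced here. The point is only that the flag is read and the instance chosen once, at class initialization.

```java
// Hypothetical stand-ins; not the classes from the pull request.
class StaticInitFactorySketch {
  public static void main(String[] args) {
    // Both calls return the same pre-built instance.
    System.out.println(StatsFactory.get() == StatsFactory.get());
  }
}

interface StatsUtil {
  boolean isRecording();
}

final class DummyStatsUtil implements StatsUtil {
  public boolean isRecording() { return false; }
}

final class RealStatsUtil implements StatsUtil {
  public boolean isRecording() { return true; }
}

final class StatsFactory {
  // Chosen once when the class is initialized, then reused on every call.
  private static final StatsUtil INSTANCE;

  static {
    // Assumption: a JVM system property replaces the project's own config lookup.
    String flag = System.getProperty("time.statistics.open", "false");
    INSTANCE = "true".equalsIgnoreCase(flag) ? new RealStatsUtil() : new DummyStatsUtil();
  }

  private StatsFactory() { }

  static StatsUtil get() {
    return INSTANCE;  // no switch, no string comparison per call
  }
}
```

The trade-off is that a later change to the property is not picked up without reloading the class, which seems acceptable for a flag that is set before loading starts.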
---
[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298468
--- Diff: core/src/main/java/org/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -415,6 +415,14 @@
*/
public static final String AGGREAGATE_COLUMNAR_KEY_BLOCK_DEFAULTVALUE = "true";
/**
+ * TIME_STAT_UTIL_TYPE
+ */
+ public static final String TIME_STAT_OPEN = "time.statistics.open";
--- End diff --
The name is not easy to understand. Better to rename it to `enable.data.loading.statistics`.
---
[GitHub] incubator-carbondata pull request #8: [CARBONDATA-30] Record load performanc...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/incubator-carbondata/pull/8#discussion_r69298587
--- Diff: core/src/main/java/org/carbondata/core/util/CarbonTimeStatisticsFactory.java ---
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.carbondata.core.util;
+
+import org.carbondata.core.constants.CarbonCommonConstants;
+
+public class CarbonTimeStatisticsFactory {
+ private static String timeStatisticsUtilType;
+
+ static {
+ CarbonTimeStatisticsFactory.updateTimeStatisticsUtilStatus();
+ }
+
+ private static void updateTimeStatisticsUtilStatus() {
+ timeStatisticsUtilType = CarbonProperties.getInstance()
+ .getProperty(CarbonCommonConstants.TIME_STAT_OPEN,
+ CarbonCommonConstants.TIME_STAT_OPEN_DEFAULT);
+ }
+
+ public static AbstractTimeStatisticsUtil getTimeStatisticsUtil() {
+ switch (timeStatisticsUtilType.toLowerCase()) {
+ case "false":
+ return CarbonTimeStatisticsDummyUtil.getInstance();
+ case "true":
+ return CarbonTimeStatisticsUtil.getInstance();
+ default:
+ throw new UnsupportedOperationException("Not supported statistics type");
--- End diff --
Don't throw an exception here. Just log it and use the default value.
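One way to read this suggestion is sketched below. The names `FallbackParseSketch`, `parseEnabled` and `DEFAULT_ENABLED` are hypothetical, and `java.util.logging` is used only to keep the example self-contained; the project's own logger would take its place.

```java
import java.util.logging.Logger;

// Hypothetical sketch; not the code from the pull request.
class FallbackParseSketch {
  private static final Logger LOG =
      Logger.getLogger(FallbackParseSketch.class.getName());

  // Mirrors the documented default of "false" for time.statistics.open.
  static final boolean DEFAULT_ENABLED = false;

  // Accepts "true"/"false" in any case; anything else is logged and
  // replaced by the default instead of raising an exception.
  static boolean parseEnabled(String raw) {
    if (raw != null) {
      String value = raw.trim();
      if ("true".equalsIgnoreCase(value)) {
        return true;
      }
      if ("false".equalsIgnoreCase(value)) {
        return false;
      }
    }
    LOG.warning("Invalid value '" + raw + "' for time.statistics.open;"
        + " using default " + DEFAULT_ENABLED);
    return DEFAULT_ENABLED;
  }

  public static void main(String[] args) {
    System.out.println(parseEnabled("TRUE"));
    System.out.println(parseEnabled("yes"));
  }
}
```

A misconfigured value then degrades to "statistics off" with a warning in the log, so a typo in the property cannot fail a data load.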
---