Posted to issues@carbondata.apache.org by jackylk <gi...@git.apache.org> on 2018/04/20 10:08:37 UTC
[GitHub] carbondata pull request #2197: [WIP] Add Profiler output in EXPLAIN command
GitHub user jackylk opened a pull request:
https://github.com/apache/carbondata/pull/2197
[WIP] Add Profiler output in EXPLAIN command
Be sure to complete the following checklist to help us incorporate
your contribution quickly and easily:
- [ ] Any interfaces changed?
- [ ] Any backward compatibility impacted?
- [ ] Document update required?
- [ ] Testing done
Please provide details on
- Whether new unit test cases have been added or why no new tests are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance test report.
- Any additional information to help reviewers in testing this change.
- [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/jackylk/incubator-carbondata profiler
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/carbondata/pull/2197.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #2197
----
commit 028106ffed51bc0b8fa3239dd1505df685d1e60c
Author: Jacky Li <ja...@...>
Date: 2018-04-20T08:02:32Z
support profiler in EXPLAIN
commit 8ce5194b594a1be60b1a22116777df21bc47d477
Author: Jacky Li <ja...@...>
Date: 2018-04-20T10:06:17Z
add test
----
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5511/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/carbondata/pull/2197
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5512/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183286495
--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
@@ -654,6 +656,7 @@ public boolean isScanRequired(FilterResolverIntf filterExp) {
startIndex++;
}
}
+ ExplainCollector.setTotalBlocklets(numBlocklets);
--- End diff --
This sets the blocklet count for just one datamap. Where are you summing all blocklets?
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4312/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4565/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by QiangCai <gi...@git.apache.org>.
Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183196989
--- Diff: dev/findbugs-exclude.xml ---
@@ -40,7 +40,7 @@
<Bug pattern="MS_MUTABLE_ARRAY"/>
</Match>
<Match>
- <Class name="org.apache.carbondata.core.scan.expression.ExpressionResult"/>
+ <Class name="org.apache.carbondata.core.scan.filterExpression.ExpressionResult"/>
--- End diff --
This change to the match is not required.
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2197
retest this please
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
retest this please
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4066/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4303/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5266/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5309/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2197
retest this please
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5317/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4496/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4579/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by xuchuanyin <gi...@git.apache.org>.
Github user xuchuanyin commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183199499
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateTableSelection.scala ---
@@ -379,6 +382,68 @@ class TestPreAggregateTableSelection extends SparkQueryTest with BeforeAndAfterA
checkAnswer(df, Seq(Row(10,10.0)))
}
+ test("explain query") {
+ sql("explain select name,sum(age) from mainTable where name = 'a' group by name").show(false)
+ var rows = sql("explain select name,sum(age) from mainTable where name = 'a' group by name").collect()
+ assertResult(
+ """== CarbonData Profiler ==
+ |Query rewrite based on DataMap:
+ | - agg1 (preaggregate)
+ |Table Scan on maintable_agg1
+ | - filter: (maintable_name <> null and maintable_name = a)
+ | - pruned by main index
+ | - all blocklets: 1
+ | - skipped blocklets: 1
+ |""".stripMargin)(rows(0).getString(0))
+
+ rows = sql("explain select name,sum(age) from mainTable group by name").collect()
+ assertResult(
+ """== CarbonData Profiler ==
+ |Query rewrite based on DataMap:
+ | - agg1 (preaggregate)
+ |Table Scan on maintable_agg1
+ | - filter: None
+ | - all blocklets: 1
+ | - skipped blocklets: 0
+ |""".stripMargin)(rows(0).getString(0))
+ }
+
+ test("explain query with lucene datamap") {
--- End diff --
Better to move this test case to the lucene module, or add a separate test suite for the profiler.
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183285491
--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+ private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+ private List<String> olapDataMapProviders = new ArrayList<>();
+ private List<String> olapDataMapNames = new ArrayList<>();
+
+ // mapping of table name to pruning info
+ private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+ public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+ Objects.requireNonNull(dataMapProvider);
+ Objects.requireNonNull(dataMapName);
+ olapDataMapProviders.add(dataMapProvider);
+ olapDataMapNames.add(dataMapName);
+ }
+
+ public static boolean enabled() {
+ return explainProfiler.get() != null;
+ }
+
+ public static void setup() {
+ explainProfiler.set(new ExplainCollector());
+ }
+
+ public static ExplainCollector get() {
+ return explainProfiler.get();
+ }
+
+ public static void addPruningInfo(String tableName) {
+ if (enabled()) {
+ ExplainCollector profiler = get();
+ if (!profiler.scans.containsKey(tableName)) {
+ profiler.scans.put(tableName, new TablePruningInfo());
+ }
+ }
+ }
+
+ public static void setFilterStatement(String filterStatement) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setFilterStatement(filterStatement);
+ }
+ }
+
+ public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void setTotalBlocklets(int totalBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setTotalBlocklets(totalBlocklets);
+ }
+ }
+
+ /**
+ * Return the current TablePruningInfo (It is the last one in the map, since it is in
+ * single thread)
+ */
+ private static TablePruningInfo getCurrentTablePruningInfo() {
--- End diff --
How can you make sure that you are adding information to the right table? I think it is better to pass tableName and get the `TablePruningInfo` for that table.
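A minimal sketch of the keyed lookup suggested here, assuming a simplified stand-in for `TablePruningInfo`. The names `PruningInfo` and `KeyedCollectorSketch` and the method signatures below are hypothetical and are not the PR's actual code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for TablePruningInfo.
class PruningInfo {
  private String filterStatement;

  void setFilterStatement(String filterStatement) {
    this.filterStatement = filterStatement;
  }

  String getFilterStatement() {
    return filterStatement;
  }
}

// Hypothetical collector that looks up pruning info by table name
// instead of relying on the iteration order of the map.
class KeyedCollectorSketch {
  private final Map<String, PruningInfo> scans = new ConcurrentHashMap<>();

  // Register the table if it has no entry yet.
  void addPruningInfo(String tableName) {
    scans.computeIfAbsent(tableName, name -> new PruningInfo());
  }

  // The caller names the table explicitly, so the update cannot land
  // on another table's entry. Unknown tables are ignored.
  void setFilterStatement(String tableName, String filterStatement) {
    PruningInfo info = scans.get(tableName);
    if (info != null) {
      info.setFilterStatement(filterStatement);
    }
  }

  // Returns null when the table is unknown or no filter was recorded.
  String getFilterStatement(String tableName) {
    PruningInfo info = scans.get(tableName);
    return info == null ? null : info.getFilterStatement();
  }
}
```

With this shape, every recording method would take the table name (or a table object) as its first argument, at the cost of threading that argument through each call site.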
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183200584
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/integration/spark/testsuite/preaggregate/TestPreAggregateTableSelection.scala ---
@@ -379,6 +382,68 @@ class TestPreAggregateTableSelection extends SparkQueryTest with BeforeAndAfterA
checkAnswer(df, Seq(Row(10,10.0)))
}
+ test("explain query") {
+ sql("explain select name,sum(age) from mainTable where name = 'a' group by name").show(false)
+ var rows = sql("explain select name,sum(age) from mainTable where name = 'a' group by name").collect()
+ assertResult(
+ """== CarbonData Profiler ==
+ |Query rewrite based on DataMap:
+ | - agg1 (preaggregate)
+ |Table Scan on maintable_agg1
+ | - filter: (maintable_name <> null and maintable_name = a)
+ | - pruned by main index
+ | - all blocklets: 1
+ | - skipped blocklets: 1
+ |""".stripMargin)(rows(0).getString(0))
+
+ rows = sql("explain select name,sum(age) from mainTable group by name").collect()
+ assertResult(
+ """== CarbonData Profiler ==
+ |Query rewrite based on DataMap:
+ | - agg1 (preaggregate)
+ |Table Scan on maintable_agg1
+ | - filter: None
+ | - all blocklets: 1
+ | - skipped blocklets: 0
+ |""".stripMargin)(rows(0).getString(0))
+ }
+
+ test("explain query with lucene datamap") {
--- End diff --
I have moved it to LuceneFineGrainDataMapSuite
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183287439
--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+ private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+ private List<String> olapDataMapProviders = new ArrayList<>();
+ private List<String> olapDataMapNames = new ArrayList<>();
+
+ // mapping of table name to pruning info
+ private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+ public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+ Objects.requireNonNull(dataMapProvider);
+ Objects.requireNonNull(dataMapName);
+ olapDataMapProviders.add(dataMapProvider);
+ olapDataMapNames.add(dataMapName);
+ }
+
+ public static boolean enabled() {
+ return explainProfiler.get() != null;
+ }
+
+ public static void setup() {
+ explainProfiler.set(new ExplainCollector());
+ }
+
+ public static ExplainCollector get() {
+ return explainProfiler.get();
+ }
+
+ public static void addPruningInfo(String tableName) {
+ if (enabled()) {
+ ExplainCollector profiler = get();
+ if (!profiler.scans.containsKey(tableName)) {
+ profiler.scans.put(tableName, new TablePruningInfo());
+ }
+ }
+ }
+
+ public static void setFilterStatement(String filterStatement) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setFilterStatement(filterStatement);
+ }
+ }
+
+ public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void setTotalBlocklets(int totalBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setTotalBlocklets(totalBlocklets);
+ }
+ }
+
+ /**
+ * Return the current TablePruningInfo (It is the last one in the map, since it is in
+ * single thread)
+ */
+ private static TablePruningInfo getCurrentTablePruningInfo() {
--- End diff --
This works because ExplainCollector is used when `queryExecution.toRdd.partitions` is invoked in CarbonExplainCommand. Relations are processed one by one, so it behaves like a stack: the last TablePruningInfo belongs to the current table.
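Treating the last entry as the current one depends on the map iterating in insertion order, which `ConcurrentHashMap` does not guarantee. A minimal sketch of the same "take the last value" loop, assuming a `LinkedHashMap` is swapped in (this is not what the diff does; the class name `LastEntrySketch` is hypothetical):

```java
import java.util.LinkedHashMap;

// Hypothetical helper: walks an insertion-ordered map and keeps the last value.
class LastEntrySketch {
  // Returns the most recently inserted value, or null when the map is empty.
  static <V> V lastInserted(LinkedHashMap<String, V> map) {
    V last = null;
    for (V value : map.values()) {
      last = value;
    }
    return last;
  }
}
```

Note that re-inserting an existing key into a `LinkedHashMap` does not move it to the end, so this only tracks the current table if each table is registered once.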
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:
https://github.com/apache/carbondata/pull/2197
retest this please
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4274/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5401/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4129/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5470/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4136/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5251/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4613/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183284986
--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+ private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+ private List<String> olapDataMapProviders = new ArrayList<>();
+ private List<String> olapDataMapNames = new ArrayList<>();
+
+ // mapping of table name to pruning info
+ private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+ public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+ Objects.requireNonNull(dataMapProvider);
+ Objects.requireNonNull(dataMapName);
+ olapDataMapProviders.add(dataMapProvider);
+ olapDataMapNames.add(dataMapName);
+ }
+
+ public static boolean enabled() {
+ return explainProfiler.get() != null;
+ }
+
+ public static void setup() {
+ explainProfiler.set(new ExplainCollector());
+ }
+
+ public static ExplainCollector get() {
+ return explainProfiler.get();
+ }
+
+ public static void addPruningInfo(String tableName) {
+ if (enabled()) {
+ ExplainCollector profiler = get();
+ if (!profiler.scans.containsKey(tableName)) {
+ profiler.scans.put(tableName, new TablePruningInfo());
+ }
+ }
+ }
+
+ public static void setFilterStatement(String filterStatement) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setFilterStatement(filterStatement);
+ }
+ }
+
+ public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void setTotalBlocklets(int totalBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setTotalBlocklets(totalBlocklets);
+ }
+ }
+
+ /**
+ * Return the current TablePruningInfo (It is the last one in the map, since it is in
+ * single thread)
+ */
+ private static TablePruningInfo getCurrentTablePruningInfo() {
+ Iterator<TablePruningInfo> iterator = explainProfiler.get().scans.values().iterator();
+ TablePruningInfo output = null;
+ while (iterator.hasNext()) {
+ output = iterator.next();
+ }
+ return output;
+ }
+
+ public static void remove() {
+ explainProfiler.remove();
--- End diff --
fixed
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5479/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4258/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/4495/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183200529
--- Diff: dev/findbugs-exclude.xml ---
@@ -40,7 +40,7 @@
<Bug pattern="MS_MUTABLE_ARRAY"/>
</Match>
<Match>
- <Class name="org.apache.carbondata.core.scan.expression.ExpressionResult"/>
+ <Class name="org.apache.carbondata.core.scan.filterExpression.ExpressionResult"/>
--- End diff --
fixed
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183286607
--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+ private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+ private List<String> olapDataMapProviders = new ArrayList<>();
+ private List<String> olapDataMapNames = new ArrayList<>();
+
+ // mapping of table name to pruning info
+ private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+ public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+ Objects.requireNonNull(dataMapProvider);
+ Objects.requireNonNull(dataMapName);
+ olapDataMapProviders.add(dataMapProvider);
+ olapDataMapNames.add(dataMapName);
+ }
+
+ public static boolean enabled() {
+ return explainProfiler.get() != null;
+ }
+
+ public static void setup() {
+ explainProfiler.set(new ExplainCollector());
+ }
+
+ public static ExplainCollector get() {
+ return explainProfiler.get();
+ }
+
+ public static void addPruningInfo(String tableName) {
+ if (enabled()) {
+ ExplainCollector profiler = get();
+ if (!profiler.scans.containsKey(tableName)) {
+ profiler.scans.put(tableName, new TablePruningInfo());
+ }
+ }
+ }
+
+ public static void setFilterStatement(String filterStatement) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setFilterStatement(filterStatement);
+ }
+ }
+
+ public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void setTotalBlocklets(int totalBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setTotalBlocklets(totalBlocklets);
+ }
+ }
+
+ /**
+ * Return the current TablePruningInfo (It is the last one in the map, since it is in
+ * single thread)
+ */
+ private static TablePruningInfo getCurrentTablePruningInfo() {
--- End diff --
OK, I will add a CarbonTable parameter to all functions in this class.
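
For illustration, here is a minimal sketch of what passing the table into each call might look like. It is not the code from this PR: a plain String table name stands in for CarbonTable, and KeyedExplainCollectorSketch, PruningInfo and getTotalBlocklets are hypothetical names. The point is that each call looks up its own entry by key instead of taking "the last one in the map", which ConcurrentHashMap cannot provide reliably because it does not preserve insertion order.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class KeyedExplainCollectorSketch {

  /** Hypothetical stand-in for TablePruningInfo; only one field is modelled. */
  static final class PruningInfo {
    private int totalBlocklets;
  }

  // One collector per thread, as in the quoted ExplainCollector.
  private static final ThreadLocal<KeyedExplainCollectorSketch> COLLECTOR = new ThreadLocal<>();

  // Table name -> pruning info. Entries are looked up by key, never by position.
  private final Map<String, PruningInfo> scans = new ConcurrentHashMap<>();

  public static void setup() {
    COLLECTOR.set(new KeyedExplainCollectorSketch());
  }

  public static boolean enabled() {
    return COLLECTOR.get() != null;
  }

  public static void remove() {
    COLLECTOR.remove();
  }

  /** Assumption: the caller knows which table it is pruning and passes its name. */
  public static void setTotalBlocklets(String tableName, int totalBlocklets) {
    if (enabled()) {
      PruningInfo info =
          COLLECTOR.get().scans.computeIfAbsent(tableName, k -> new PruningInfo());
      info.totalBlocklets = totalBlocklets;
    }
  }

  /** Returns -1 when the collector is disabled or the table was never recorded. */
  public static int getTotalBlocklets(String tableName) {
    if (!enabled()) {
      return -1;
    }
    PruningInfo info = COLLECTOR.get().scans.get(tableName);
    return info == null ? -1 : info.totalBlocklets;
  }
}
```

With this shape, a query that scans two tables keeps two independent entries, and recording for one table cannot land on the other.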
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5465/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183286723
--- Diff: core/src/main/java/org/apache/carbondata/core/indexstore/blockletindex/BlockletDataMap.java ---
@@ -654,6 +656,7 @@ public boolean isScanRequired(FilterResolverIntf filterExp) {
startIndex++;
}
}
+ ExplainCollector.setTotalBlocklets(numBlocklets);
--- End diff --
OK. I will change setTotalBlocklets to sum the counts.
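
A minimal sketch of the summing behaviour, assuming the recording call can run more than once per query (for example once per data map instance), so that overwriting would keep only the last count. TotalBlockletsSketch, addTotalBlocklets and getTotalBlocklets are hypothetical names used for illustration, not the methods in this PR.

```java
public class TotalBlockletsSketch {

  // One counter holder per thread; null means profiling is off.
  private static final ThreadLocal<TotalBlockletsSketch> CURRENT = new ThreadLocal<>();

  private int totalBlocklets;

  public static void setup() {
    CURRENT.set(new TotalBlockletsSketch());
  }

  public static void remove() {
    CURRENT.remove();
  }

  /** Accumulates instead of overwriting, so repeated calls add up. */
  public static void addTotalBlocklets(int numBlocklets) {
    TotalBlockletsSketch collector = CURRENT.get();
    if (collector != null) {
      collector.totalBlocklets += numBlocklets;
    }
  }

  /** Returns 0 when profiling is off or nothing has been recorded yet. */
  public static int getTotalBlocklets() {
    TotalBlockletsSketch collector = CURRENT.get();
    return collector == null ? 0 : collector.totalBlocklets;
  }
}
```

Because the holder is thread-local and only touched by its own thread, a plain int field is enough here; no atomic type is needed.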
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5435/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4166/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4154/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5334/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4234/
---
[GitHub] carbondata issue #2197: [WIP] Add Profiler output in EXPLAIN command
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5240/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4106/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4086/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4178/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4346/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4072/
---
[GitHub] carbondata pull request #2197: [CARBONDATA-2371] Add Profiler output in EXPL...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/2197#discussion_r183283756
--- Diff: core/src/main/java/org/apache/carbondata/core/profiler/ExplainCollector.java ---
@@ -0,0 +1,146 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.profiler;
+
+import java.util.ArrayList;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+
+/**
+ * An information collector used for EXPLAIN command, to print out
+ * SQL rewrite and pruning information
+ */
+@InterfaceAudience.Internal
+public class ExplainCollector {
+
+ private static final ThreadLocal<ExplainCollector> explainProfiler = new ThreadLocal<>();
+
+ private List<String> olapDataMapProviders = new ArrayList<>();
+ private List<String> olapDataMapNames = new ArrayList<>();
+
+ // mapping of table name to pruning info
+ private Map<String, TablePruningInfo> scans = new ConcurrentHashMap<>();
+
+ public void recordMatchedOlapDataMap(String dataMapProvider, String dataMapName) {
+ Objects.requireNonNull(dataMapProvider);
+ Objects.requireNonNull(dataMapName);
+ olapDataMapProviders.add(dataMapProvider);
+ olapDataMapNames.add(dataMapName);
+ }
+
+ public static boolean enabled() {
+ return explainProfiler.get() != null;
+ }
+
+ public static void setup() {
+ explainProfiler.set(new ExplainCollector());
+ }
+
+ public static ExplainCollector get() {
+ return explainProfiler.get();
+ }
+
+ public static void addPruningInfo(String tableName) {
+ if (enabled()) {
+ ExplainCollector profiler = get();
+ if (!profiler.scans.containsKey(tableName)) {
+ profiler.scans.put(tableName, new TablePruningInfo());
+ }
+ }
+ }
+
+ public static void setFilterStatement(String filterStatement) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setFilterStatement(filterStatement);
+ }
+ }
+
+ public static void recordDefaultDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterDefaultPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordCGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterCGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void recordFGDataMapPruning(DataMapSchema dataMapSchema, int numBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setNumBlockletsAfterFGPruning(dataMapSchema, numBlocklets);
+ }
+ }
+
+ public static void setTotalBlocklets(int totalBlocklets) {
+ if (enabled()) {
+ TablePruningInfo scan = getCurrentTablePruningInfo();
+ scan.setTotalBlocklets(totalBlocklets);
+ }
+ }
+
+ /**
+ * Return the current TablePruningInfo (It is the last one in the map, since it is in
+ * single thread)
+ */
+ private static TablePruningInfo getCurrentTablePruningInfo() {
+ Iterator<TablePruningInfo> iterator = explainProfiler.get().scans.values().iterator();
+ TablePruningInfo output = null;
+ while (iterator.hasNext()) {
+ output = iterator.next();
+ }
+ return output;
+ }
+
+ public static void remove() {
+ explainProfiler.remove();
--- End diff --
Better to check enabled() here as well.
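
A minimal sketch of a guarded remove(), paired with a try/finally so the thread-local is always cleared after an EXPLAIN run. CollectorLifecycleSketch and withCollector are hypothetical names, and a StringBuilder stands in for the collector state. Note that ThreadLocal.remove() is already safe to call when no value is set, so the guard is a readability and consistency choice rather than a correctness requirement.

```java
import java.util.function.Supplier;

public class CollectorLifecycleSketch {

  // StringBuilder is a placeholder for the real collector object.
  private static final ThreadLocal<StringBuilder> COLLECTOR = new ThreadLocal<>();

  public static boolean enabled() {
    return COLLECTOR.get() != null;
  }

  public static void setup() {
    COLLECTOR.set(new StringBuilder());
  }

  /** Guarded in the same way as the other static methods. */
  public static void remove() {
    if (enabled()) {
      COLLECTOR.remove();
    }
  }

  /** Runs body with the collector enabled and always clears it afterwards. */
  public static <T> T withCollector(Supplier<T> body) {
    setup();
    try {
      return body.get();
    } finally {
      remove();
    }
  }
}
```

Clearing in a finally block matters on pooled threads: a collector left behind by a failed query would otherwise make enabled() return true for the next, unrelated query on the same thread.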
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5286/
---
[GitHub] carbondata issue #2197: [WIP] Add Profiler output in EXPLAIN command
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4053/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/2197
LGTM
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Failed with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/5355/
---
[GitHub] carbondata issue #2197: [CARBONDATA-2371] Add Profiler output in EXPLAIN com...
Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:
https://github.com/apache/carbondata/pull/2197
Build Success with Spark 2.2.1, Please check CI http://88.99.58.216:8080/job/ApacheCarbonPRBuilder/4347/
---