You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/07/27 00:52:14 UTC

[GitHub] [pinot] SabrinaZhaozyf opened a new pull request #7210: Support EXPLAIN PLAN

SabrinaZhaozyf opened a new pull request #7210:
URL: https://github.com/apache/pinot/pull/7210


   This PR adds support for EXPLAIN PLAN feature.
   
   Here is the design doc that we shared the following design doc with the open source community a few weeks ago: https://docs.google.com/document/d/1lBMGcR65JY_8iIUCA1iCazzn0LzTVxMJlPzFIsZ7TRk/edit?usp=sharing
   (Please request access if you don't have it already.)
   Here is the link to the issue: https://github.com/apache/incubator-pinot/issues/6978
   
   - Added end-to-end tests for different kinds of queries (including query rewrite operations, compile time invocation, etc.)
   - As explained in the design doc, 2 kinds of output formats are supported (tabular and non-tabular).
   
   TODOs:
   1. Add support for queries that use StarTree index.
   2. Add support for stretch goal outlined in the design doc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
kishoreg commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-887134222


   > > Thanks for the PR. can you provide read-only access for everyone by default?
   > 
   > Unfortunately, I created the document using my LinkedIn account so it won't let me share access by default. I will give commenter access when people request.
   
   Can you please use your personal account to recreate and reshare the link. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on a change in pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
kishoreg commented on a change in pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#discussion_r677048347



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/explain/ExplainPlanUtils.java
##########
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.core.query.explain;
+
+import java.util.HashSet;
+import java.util.List;
+import org.apache.pinot.common.request.context.predicate.Predicate;
+import org.apache.pinot.spi.config.table.FieldConfig;
+import org.apache.pinot.spi.config.table.IndexingConfig;
+
+
+/**
+ * Helper methods for processing EXPLAIN PLAN queries
+ */
+public class ExplainPlanUtils {
+
+  private ExplainPlanUtils() {
+  }
+
+  public static String getIndexUsed(String column, Predicate.Type predicateType, IndexingConfig indexingConfig,

Review comment:
       this is duplicating the logic and we have a lot of optimizations on when to use and when not to use indexes. This will create more confusion for the user trying and also huge overhead on the engineers having to maintain the logic in two places.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#discussion_r677038965



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/request/PinotQuery.java
##########
@@ -155,34 +155,34 @@ public short getThriftFieldId() {
   public static final java.util.Map<_Fields, org.apache.thrift.meta_data.FieldMetaData> metaDataMap;
   static {
     java.util.Map<_Fields, org.apache.thrift.meta_data.FieldMetaData> tmpMap = new java.util.EnumMap<_Fields, org.apache.thrift.meta_data.FieldMetaData>(_Fields.class);
-    tmpMap.put(_Fields.VERSION, new org.apache.thrift.meta_data.FieldMetaData("version", org.apache.thrift.TFieldRequirementType.OPTIONAL, 

Review comment:
       Need to undo these changes. Might have gotten added earlier when style guideline was not imported I think




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-896470893


   Update - We had met and discussed that directly doing phase 2 work outlined in the design doc is better. So we are reworking the PR (most likely will send new PR and close this).
   
   Let me see if I can open the current document for comment access to all. If no luck, I will copy the document to a new one and share from my gmail account.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] amrishlal commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
amrishlal commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-894584129


   > Late to this discussion. I have the same question on why additional constructs are needed under `explain` package. Shouldn't it be easier to send a bit along with the request to the servers for skipping the execution and just collecting the stats back to the broker?
   
   The server-side implementation is currently being worked on, but given the large potential scope of the EXPLAIN plan functionality (logical plan, physical plan, run-time costs, plus possible profiling information), I don't think that even the server-side implementation would be considered close to "done" in any sense. IMHO, the best case scenario is probably a BETA cut that should be integrated into the codebase (possibly under an execution flag) sooner than later so that we can start iterating upon it. Anything that is merged at this point will drastically change if / when we get to the point of sending EXPLAIN query to all the servers for collecting profiling cost information, etc.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mayankshriv commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
mayankshriv commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-894869783


   > > > Thanks for the PR. can you provide read-only access for everyone by default?
   > > 
   > > 
   > > Unfortunately, I created the document using my LinkedIn account so it won't let me share access by default. I will give commenter access when people request.
   > 
   > Can you please use your personal account to recreate and reshare the link.
   
   +1, please reshare a link from an account that has read access for all (by default).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-951329113


   Based on the discussion, @amrishlal  created a new PR  https://github.com/apache/pinot/pull/7568 to directly do phase 2 work 
   Closing this


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] yupeng9 commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
yupeng9 commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-894567690


   Late to this discussion. I have the same question on why additional constructs are needed under `explain` package. Shouldn't it be easier to send a bit along with the request to the servers for skipping the execution and just collecting the stats back to the broker?
   
   @chenboat wrote a [proposal](https://docs.google.com/document/d/1K3ZgA_2O99ChKD8LffVw0tHQhT1u6A9pj4xiiXzSgzQ/edit) on this earlier this year. I'm curious to see his thoughts on this.
   
   Lastly, can you grant access to the design doc? I requested the access a couple of days ago.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on a change in pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
kishoreg commented on a change in pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#discussion_r677041475



##########
File path: pinot-common/src/main/java/org/apache/pinot/common/request/context/ExpressionContext.java
##########
@@ -106,7 +107,11 @@ public int hashCode() {
   public String toString() {
     switch (_type) {
       case LITERAL:
-        return '\'' + _value + '\'';
+        if (NumberUtils.isCreatable(_value)) {

Review comment:
       why did we change this?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] SabrinaZhaozyf commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
SabrinaZhaozyf commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-887130819


   > Thanks for the PR. can you provide read-only access for everyone by default?
   
   Unfortunately, I created the document using my LinkedIn account so it won't let me share access by default. I will give commenter access when people request.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
kishoreg commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-887128776


   Thanks for the PR. can you provide read-only access for everyone by default?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-891171181


   @kishoreg we have had PRs before that needed cleanups later. JSON support and Range index support comes to mind. https://github.com/apache/pinot/pull/5240. I am sure there are others where we added support so that it satisfies a need now, and we clean up things under the hood later. Some of them are still pending (and that is ok, people will get to it when they can). In fact, in some cases the design documents came in after the first PR (and/or was not circulated widely). But we have allowed it in the past with some understanding  that the feature was a timely one for the community.
   
   In this case, a design document was floated appropriately along with the issue, and a month's time was provided for the community to provide feedback. The phasing of the project was also clearly mentioned in there. 
   
   We know that a cleanup is needed. It is also clear that @siddharthteotia  has committed to do the cleanup (and also in a timely manner). I think we should avoid blocking PRs in such cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on a change in pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on a change in pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#discussion_r677624261



##########
File path: pinot-core/src/main/java/org/apache/pinot/core/query/explain/ExplainPlanUtils.java
##########
@@ -0,0 +1,90 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.pinot.core.query.explain;
+
+import java.util.HashSet;
+import java.util.List;
+import org.apache.pinot.common.request.context.predicate.Predicate;
+import org.apache.pinot.spi.config.table.FieldConfig;
+import org.apache.pinot.spi.config.table.IndexingConfig;
+
+
+/**
+ * Helper methods for processing EXPLAIN PLAN queries
+ */
+public class ExplainPlanUtils {
+
+  private ExplainPlanUtils() {
+  }
+
+  public static String getIndexUsed(String column, Predicate.Type predicateType, IndexingConfig indexingConfig,

Review comment:
       @kishoreg 
   
   As outlined in the design doc, the first phase focuses on building the plan at the broker from PinotQuery -> QueryContext. Since we walk the QueryContext (which is the same thing what server does), there shouldn't be any difference there.
   
   Coming to indexes used etc, at this point there is hardly any optimization today
   
   - Server builds the same plan for all segments (sorted index eval is done before inv index before scan etc)
   - The only thing that can change is the ordering of children under AND which should not matter to the output of EXPLAIN PLAN since the parent-child relationship does not change. 
   
   So the indexes that are configured today in table config are used - which is what we have done in Phase 1 implementation phase mentioned in the design doc. 
   
   Now if in future things change and more smarts are added to the server where it can potentially come up with drastically different plan for each segment, we had planned for phase 2 (not yet implemented). As part of that phase, the idea is to talk to one/few of the servers (and one/few segments to get an approximation plan) and get that information as part of response from server to broker and use the already implemented plan builder (in this PR) on the broker to put together final plan tree which will be sent to the user. We are still in the process of implementing that. But what we have here in the PR is full functional, well-tested phase 1 implementation. 




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] kishoreg commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
kishoreg commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-891188205


   @mcvsubbu the two implementations are very different and I would not consider that as clean up. This is introducing a very different way of implementing Explain and is not along the lines of the final implementation. What are we gaining by adding a lot of code that will be deleted later?
   
   I understand that there was a lot of effort involved by the intern to implement this. I am happy to sit down and help implement it the right way. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] jackjlli commented on a change in pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
jackjlli commented on a change in pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#discussion_r678507134



##########
File path: pinot-broker/src/main/java/org/apache/pinot/broker/requesthandler/BaseBrokerRequestHandler.java
##########
@@ -498,6 +517,36 @@ private BrokerResponseNative handleSQLRequest(long requestId, String query, Json
     return brokerResponse;
   }
 
+  private BrokerResponseNative processExplainPlanQuery
+      (BrokerRequest brokerRequest, BrokerRequest offlineBrokerRequest, BrokerRequest realtimeBrokerRequest,
+          String offlineTableName, String realtimeTableName) {
+      QueryContext queryContext = BrokerRequestToQueryContextConverter.convert(brokerRequest);

Review comment:
       I found out that `queryContext` is also converted in `reduceOnDataTable` method of `BrokerReduceService` class on broker. Maybe we should move the conversion upfront so that the same object can be reused?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia closed pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia closed pull request #7210:
URL: https://github.com/apache/pinot/pull/7210


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] siddharthteotia commented on pull request #7210: Support EXPLAIN PLAN

Posted by GitBox <gi...@apache.org>.
siddharthteotia commented on pull request #7210:
URL: https://github.com/apache/pinot/pull/7210#issuecomment-887147283


   > I will review the design doc once I get the access. High level comment.
   > 
   > This is a lot of code and it looks like we are creating a completely different code path for explain. This can get outdated and out of sync. We should have extended the operator interface to add explain implementation.
   > 
   > I will review the design doc and add more comments. But please hold off on merging this
   
   @kishoreg  We had shared the design doc a month ago on the mailing list and then proceeded with implementation. 
   
   Vast majority of LOCs are coming from exhaustive functional test class file. 
   
   Please spend some time reviewing. Happy to address feedback. We can also meet to discuss. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org