You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2022/07/17 12:01:37 UTC

[GitHub] [drill] Leon-WTF opened a new pull request, #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Leon-WTF opened a new pull request, #2599:
URL: https://github.com/apache/drill/pull/2599

   # [DRILL-4232](https://issues.apache.org/jira/browse/DRILL-4232): Support for EXCEPT and INTERSECT set operator
   
   ## Description
   Can have hash set operator and sorted set operator, only implement hash version in this PR.
   Compute number of left-input duplicates(numLeft) and number of left-input duplicates(numRight) for each same tuple:
   INTERSECT: if numRight > 0 and numLeft > 0, output one tuple
   INTERSECT ALL: if numRight > 0 and numLeft > 0, output min(numLeft,numRight) tuples
   EXCEPT: if numRight = 0 and numLeft > 0, output one tuple
   EXCEPT ALL: if numRight = 0 and numLeft > 0, output numLeft - numRight tuples
   
   ## Documentation
   TODO
   
   ## Testing
   TODO


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "vvysotskyi (via GitHub)" <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1097000816


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   But output values should be already distinct after the execution of except operator, so the aggregate will do nothing.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1096965727


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   @vvysotskyi It's for performance, if the data cardinality is high, aggregate before except may not reduce many data, if the data after except left are few, aggregate after except will only handle few data which is faster than before except. This may be choosen by statistics info + CBO automaticlly in the future.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1279921703

   > @Leon-WTF I don't know if you saw this or not, but #2602 has been merged. We are actually running on Calcite 1.32 now!
   
   @cgivre Sorry for the late response. Yes, I have merged that in. I'm almost done and I'm adding ut.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "vvysotskyi (via GitHub)" <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1103702656


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   Drill doesn't use streaming aggregate for distinct calls, so only hash agg should be enough.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1193267131

   > @Leon-WTF You might want to wait until #2602 is merged before you continue this.
   
   #2602 is very cool, I will focus on implementing the operator firstly, and adapt to the new Calcite when it's merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1229353429

   @Leon-WTF Is this ready for review?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1096965727


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   It's for performance, if the data cardinality is high, aggregate before except may not reduce many data, if the data after except left are few, aggregate after except will only handle few data which is faster than before except.



##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   @vvysotskyi It's for performance, if the data cardinality is high, aggregate before except may not reduce many data, if the data after except left are few, aggregate after except will only handle few data which is faster than before except.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "cgivre (via GitHub)" <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1424152751

   Hey @Leon-WTF Any chance you could address @vvysotskyi 's comments soon.   This is one of the last PRs slated to be merged for the next release.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1425212100

   > Hey @Leon-WTF Any chance you could address @vvysotskyi 's comments soon. This is one of the last PRs slated to be merged for the next release.
   
   I will do it by this weekend, is that ok?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "cgivre (via GitHub)" <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1425760414

   > 
   
   @Leon-WTF This weekend would be great!  Very excited to get this merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1089851904


##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   Sure, I have added that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "vvysotskyi (via GitHub)" <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1096581680


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));

Review Comment:
   I'm not sure whether this check would work properly in some cases. For example, the volcano planner will use RelSet to wrap nodes, and perhaps there are some other cases. Instead, I propose using `RelMetadataQuery.getUniqueKeys()` to ensure that input columns have unique values, and if it is so, do not add aggregate. It calls methods from `org.apache.calcite.rel.metadata.RelMdUniqueKeys` for specific node types and should handle more cases than existing checks. In this case, the `isAggAdded` field wouldn't be required.



##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   Do we need to add aggregate on top of except?



##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillSetOpRule.java:
##########
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.Convention;
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.InvalidRelException;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.rel.core.SetOp;
+import org.apache.calcite.rel.logical.LogicalIntersect;
+import org.apache.calcite.rel.logical.LogicalMinus;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.slf4j.Logger;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+/**
+ * Rule that converts {@link LogicalIntersect} or {@link LogicalMinus} to
+ * {@link DrillIntersectRel} or {@link DrillExceptRel}.
+ */
+public class DrillSetOpRule extends RelOptRule {
+  public static final List<RelOptRule> INSTANCES = Arrays.asList(
+      new DrillSetOpRule(RelOptHelper.any(LogicalIntersect.class, Convention.NONE), "DrillIntersectRelRule"),
+      new DrillSetOpRule(RelOptHelper.any(LogicalMinus.class, Convention.NONE), "DrillExceptRelRule")
+  );
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillSetOpRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final SetOp setOp = call.rel(0);
+    final RelTraitSet traits = setOp.getTraitSet().plus(DrillRel.DRILL_LOGICAL);
+    final List<RelNode> convertedInputs = new ArrayList<>();
+    for (RelNode input : setOp.getInputs()) {
+      RelNode convertedInput = convert(input, input.getTraitSet().plus(DrillRel.DRILL_LOGICAL).simplify());
+      convertedInputs.add(convertedInput);
+    }
+    try {
+      if (setOp instanceof LogicalMinus) {

Review Comment:
   We could create and pass a specific rel factory to the place where the rule is created and use it here instead of checking which type of node is.



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {
+  private static final String EMPTY_DIR_NAME = "empty_directory";
+  private static final String SLICE_TARGET_DEFAULT = "alter session reset `planner.slice_target`";
+
+  @BeforeClass
+  public static void setupTestFiles() throws Exception {
+    startCluster(ClusterFixture.builder(dirTestWatcher));
+    dirTestWatcher.copyResourceToRoot(Paths.get("multilevel", "parquet"));
+    dirTestWatcher.makeTestTmpSubDir(Paths.get(EMPTY_DIR_NAME));
+  }
+
+  @Test
+  public void TestExceptionWithSchemaLessDataSource() {
+    boolean exceptionEncountered = true;
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+    try {
+      testBuilder()
+        .sqlQuery("select * from cp.`%s` intersect select * from cp.`%s`", root, root)
+        .unOrdered()
+        .baselineColumns("a", "b")
+        .baselineValues(1, 1)
+        .go();
+      exceptionEncountered = false;

Review Comment:
   instead of the flag please use the `Assert.fail("reason")` method here.



##########
logical/src/main/java/org/apache/drill/common/logical/data/visitors/LogicalVisitor.java:
##########
@@ -62,7 +64,10 @@
     public RETURN visitRunningAggregate(RunningAggregate runningAggregate, EXTRA value) throws EXCEP;
     public RETURN visitTransform(Transform transform, EXTRA value) throws EXCEP;
     public RETURN visitUnion(Union union, EXTRA value) throws EXCEP;
-    public RETURN visitWindow(Window window, EXTRA value) throws EXCEP;
+    public RETURN visitExcept(Except except, EXTRA value) throws EXCEP;
+    public RETURN visitIntersect(Intersect intersect, EXTRA value) throws EXCEP;
+
+  public RETURN visitWindow(Window window, EXTRA value) throws EXCEP;

Review Comment:
   Please fix the indentation here.



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+import static org.junit.Assert.assertTrue;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {
+  private static final String EMPTY_DIR_NAME = "empty_directory";
+  private static final String SLICE_TARGET_DEFAULT = "alter session reset `planner.slice_target`";
+
+  @BeforeClass
+  public static void setupTestFiles() throws Exception {
+    startCluster(ClusterFixture.builder(dirTestWatcher));
+    dirTestWatcher.copyResourceToRoot(Paths.get("multilevel", "parquet"));
+    dirTestWatcher.makeTestTmpSubDir(Paths.get(EMPTY_DIR_NAME));
+  }
+
+  @Test
+  public void TestExceptionWithSchemaLessDataSource() {
+    boolean exceptionEncountered = true;
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+    try {
+      testBuilder()
+        .sqlQuery("select * from cp.`%s` intersect select * from cp.`%s`", root, root)
+        .unOrdered()
+        .baselineColumns("a", "b")
+        .baselineValues(1, 1)
+        .go();
+      exceptionEncountered = false;
+    } catch (Exception ex) {
+      assertTrue(ex.getMessage(),

Review Comment:
   `assertThat` + `containsString` will show more informative messages in the case of failures.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1097007805


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   It's not distinct for left table after except operator.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1415544424

   @vvysotskyi Hi, any more comments on this PR?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1102305963


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   One more question about moving converting rule to physical phase, I need to add physical agg rel node, so needs to add both hash(distribute by all keys/single key) and stream agg, right?



##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   @vvysotskyi One more question about moving converting rule to physical phase, I need to add physical agg rel node, so needs to add both hash(distribute by all keys/single key) and stream agg, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1057326323


##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/common/HashTable.java:
##########
@@ -98,6 +98,10 @@ void setup(HashTableConfig htConfig, BufferAllocator allocator, VectorContainer
    */
   int probeForKey(int incomingRowIdx, int hashCode) throws SchemaChangeException;
 
+  int getNum(int currentIndex);

Review Comment:
   Please rename this method to clarify that it holds the count of records for a specific key and add JavaDoc.



##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java:
##########
@@ -188,10 +192,13 @@ public HashAggBatch(HashAggregate popConfig, RecordBatch incoming, FragmentConte
     wasKilled = false;
 
     final int numGrpByExprs = popConfig.getGroupByExprs().size();
-    comparators = Lists.newArrayListWithExpectedSize(numGrpByExprs);
-    for (int i=0; i<numGrpByExprs; i++) {
-      // nulls are equal in group by case
-      comparators.add(Comparator.IS_NOT_DISTINCT_FROM);
+    isDynamicColumns = numGrpByExprs == 1 && SchemaPath.DYNAMIC_STAR.equals(popConfig.getGroupByExprs().get(0).getRef().getRootSegment().getPath());

Review Comment:
   It is possible to have both dynamic star and explicit columns projection, like
   ```sql
   SELECT *, col1, col2 from tab1
   ```



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   Could you please add more tests that check several batches? It could be done using the `UNION ALL` operator. Also, it would be interesting to see cases when the first batch of one side is empty, and so on.
   One more scenario is to check how it behaves with complex types. It is fine if not supported, but we should be sure that we have the correct error message and error handling.



##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillSetOpRel.java:
##########
@@ -0,0 +1,79 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.linq4j.Ord;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.InvalidRelException;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.common.logical.data.LogicalOperator;
+import org.apache.drill.common.logical.data.Union;
+import org.apache.drill.exec.planner.common.DrillSetOpRelBase;
+
+import java.util.List;
+
+/**
+ * SetOp implemented in Drill.
+ */
+public class DrillSetOpRel extends DrillSetOpRelBase implements DrillRel {
+  private boolean isAggAdded;
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+                       List<RelNode> inputs, SqlKind kind, boolean all, boolean checkCompatibility, boolean isAggAdded) throws InvalidRelException {
+    super(cluster, traits, inputs, kind, all, checkCompatibility);
+    this.isAggAdded = isAggAdded;
+  }
+
+  public DrillSetOpRel(RelOptCluster cluster, RelTraitSet traits,
+                       List<RelNode> inputs, SqlKind kind, boolean all, boolean checkCompatibility) throws InvalidRelException {
+    super(cluster, traits, inputs, kind, all, checkCompatibility);
+    this.isAggAdded = false;
+  }
+
+  public boolean isAggAdded() {
+    return isAggAdded;
+  }
+
+  public void setAggAdded(boolean aggAdded) {
+    isAggAdded = aggAdded;

Review Comment:
   Rel nodes should be effectively immutable; rules shouldn't modify rel nodes but create new ones.



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {
+  private static final String EMPTY_DIR_NAME = "empty_directory";
+  private static final String SLICE_TARGET_DEFAULT = "alter session reset `planner.slice_target`";
+
+  @BeforeClass
+  public static void setupTestFiles() throws Exception {
+    startCluster(ClusterFixture.builder(dirTestWatcher));
+    dirTestWatcher.copyResourceToRoot(Paths.get("multilevel", "parquet"));
+    dirTestWatcher.makeTestTmpSubDir(Paths.get(EMPTY_DIR_NAME));
+  }
+
+  @Test
+  public void testIntersect() throws Exception {
+    String query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) intersect select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(2, 2)
+      .baselineValues(1, 1)
+      .build().run();
+
+    query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) intersect all select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(2, 2)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .build().run();
+  }
+
+  @Test
+  public void testExcept() throws Exception {
+    String query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    String aggAbovePattern = ".*Screen.*Agg.*SetOp.*";
+    String aggBelowPattern = ".*SetOp.*Agg.*Values.*";
+
+    queryBuilder()
+      .sql(query)
+      .planMatcher()
+      .include(aggAbovePattern)
+      .exclude(aggBelowPattern)
+      .match(true);
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(4, 4)
+      .baselineValues(3, 4)
+      .build().run();
+
+    try {
+      client.alterSession(ExecConstants.EXCEPT_ADD_AGG_BELOW_KEY, true);
+      query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except select a, b from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+      queryBuilder()
+        .sql(query)
+        .planMatcher()
+        .include(aggBelowPattern)
+        .exclude(aggAbovePattern)
+        .match(true);
+
+      testBuilder()
+        .sqlQuery(query)
+        .unOrdered()
+        .baselineColumns("a", "b")
+        .baselineValues(4, 4)
+        .baselineValues(3, 4)
+        .build().run();
+    } finally {
+      client.resetSession(ExecConstants.EXCEPT_ADD_AGG_BELOW_KEY);
+    }
+
+    query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except all select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(4, 4)
+      .baselineValues(4, 4)
+      .baselineValues(3, 4)
+      .baselineValues(2, 2)
+      .build().run();
+  }
+
+
+  @Test
+  public void testOverJoin() throws Exception {
+    String query =
+      "select n1.n_nationkey from cp.`tpch/nation.parquet` n1 inner join cp.`tpch/region.parquet` r1 on n1.n_regionkey = r1.r_regionkey where n1.n_nationkey in (1, 2, 3, 4) " +
+      "except " +
+      "select n2.n_nationkey from cp.`tpch/nation.parquet` n2 inner join cp.`tpch/region.parquet` r2 on n2.n_regionkey = r2.r_regionkey where n2.n_nationkey in (3, 4)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey")
+      .baselineValues(1)
+      .baselineValues(2)
+      .build().run();
+  }
+
+  @Test
+  public void testOverAgg() throws Exception {
+    String query = "select n1.n_regionkey from cp.`tpch/nation.parquet` n1 group by n1.n_regionkey except " +
+      "select r1.r_regionkey from cp.`tpch/region.parquet` r1 where r1.r_regionkey in (0, 1) group by r1.r_regionkey";
+
+    String excludePattern = "Screen.*Agg.*SetOp";
+    queryBuilder()
+      .sql(query)
+      .planMatcher()
+      .exclude(excludePattern)
+      .match(true);
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey")
+      .baselineValues(2)
+      .baselineValues(3)
+      .baselineValues(4)
+      .build().run();
+  }
+
+  @Test
+  public void testChain() throws Exception {
+    String query = "select n_regionkey from cp.`tpch/nation.parquet` intersect " +
+      "select r_regionkey from cp.`tpch/region.parquet` intersect " +
+      "select n_nationkey from cp.`tpch/nation.parquet` where n_nationkey in (1,2) intersect " +
+      "select c_custkey from cp.`tpch/customer.parquet` where c_custkey < 5";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey")
+      .baselineValues(1)
+      .baselineValues(2)
+      .build().run();
+  }
+
+  @Test
+  public void testSameColumn() throws Exception {
+    String query = "select n_nationkey, n_regionkey from cp.`tpch/nation.parquet` where n_regionkey = 1 intersect all select r_regionkey, r_regionkey from cp.`tpch/region.parquet` where r_regionkey = 1";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey")
+      .baselineValues(1, 1)
+      .build().run();
+
+    query = "select n_regionkey, n_regionkey from cp.`tpch/nation.parquet` where n_regionkey = 1 except all select r_regionkey, r_regionkey from cp.`tpch/region.parquet` where r_regionkey = 1";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey", "n_regionkey0")
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .build().run();
+  }
+
+  @Test
+  public void testTwoStringColumns() throws Exception {
+    String query = "select r_comment, r_regionkey from cp.`tpch/region.parquet` except select n_name, n_nationkey from cp.`tpch/nation.parquet`";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("r_comment", "r_regionkey")
+      .baselineValues("lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to ", 0)
+      .baselineValues("hs use ironic, even requests. s", 1)
+      .baselineValues("ges. thinly even pinto beans ca", 2)
+      .baselineValues("ly final courts cajole furiously final excuse", 3)
+      .baselineValues("uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl", 4)
+      .build().run();
+  }
+
+
+  @Test
+  public void testConstantLiterals() throws Exception {
+    String query = "(select 'CONST' as LiteralConstant, 1 as NumberConstant, n_nationkey from cp.`tpch/nation.parquet`) " +
+      "intersect " +
+      "(select 'CONST', 1, r_regionkey from cp.`tpch/region.parquet`)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("LiteralConstant", "NumberConstant", "n_nationkey")
+      .baselineValues("CONST", 1, 0)
+      .baselineValues("CONST", 1, 1)
+      .baselineValues("CONST", 1, 2)
+      .baselineValues("CONST", 1, 3)
+      .baselineValues("CONST", 1, 4)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testViewExpandableStar() throws Exception {
+    try {
+      run("use dfs.tmp");
+      run("create view nation_view as select n_nationkey, n_name from (values(4,'4'), (2,'2'), (4,'4'), (1,'1'), (3,'4'), (2,'2'), (1,'1')) t(n_nationkey, n_name)");
+      run("create view region_view as select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)");
+
+      String query1 = "(select * from dfs.tmp.`nation_view`) " +
+        "except " +
+        "(select * from dfs.tmp.`region_view`) ";
+
+      String query2 =  "(select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)) " +
+        "intersect " +
+        "(select * from dfs.tmp.`nation_view`)";
+
+      testBuilder()
+        .sqlQuery(query1)
+        .unOrdered()
+        .baselineColumns("n_nationkey", "n_name")
+        .baselineValues(4, "4")
+        .baselineValues(3, "4")
+        .build().run();
+
+      testBuilder()
+        .sqlQuery(query2)
+        .unOrdered()
+        .baselineColumns("r_regionkey", "r_name")
+        .baselineValues(1, "1")
+        .baselineValues(2, "2")
+        .build().run();
+    } finally {
+      run("drop view if exists nation_view");
+      run("drop view if exists region_view");
+    }
+  }
+
+  @Test
+  public void testDiffDataTypesAndModes() throws Exception {
+    try {
+      run("use dfs.tmp");
+      run("create view nation_view as select n_nationkey, n_name from (values(4,'4'), (2,'2'), (4,'4'), (1,'1'), (3,'4'), (2,'2'), (1,'1')) t(n_nationkey, n_name)");
+      run("create view region_view as select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)");
+
+
+      String t1 = "(select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name))";
+      String t2 = "(select * from nation_view)";
+      String t3 = "(select * from region_view)";
+      String t4 = "(select store_id, full_name from cp.`employee.json` limit 5)";
+
+      String query1 = t1 + " intersect all " + t2 + " intersect all " + t3 + " except all " + t4;
+
+      testBuilder()
+        .sqlQuery(query1)
+        .unOrdered()
+        .baselineColumns("r_regionkey", "r_name")
+        .baselineValues(1, "1")
+        .baselineValues(1, "1")
+        .baselineValues(2, "2")
+        .build().run();
+    } finally {
+      run("drop view if exists nation_view");
+      run("drop view if exists region_view");
+    }
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDistinctOverUnionAllwithFullyQualifiedColumnNames() throws Exception {
+    String query = "select distinct sq.x1 " +
+      "from " +
+      "((select n_regionkey as a1 from cp.`tpch/nation.parquet`) " +
+      "intersect all " +
+      "(select r_regionkey as a2 from cp.`tpch/region.parquet`)) as sq(x1)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("x1")
+      .baselineValues(0)
+      .baselineValues(1)
+      .baselineValues(2)
+      .baselineValues(3)
+      .baselineValues(4)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testContainsColumnAndNumericConstant() throws Exception {
+    String query = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet`) " +
+      "intersect " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet`)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .baselineValues(1, 1, "ARGENTINA")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testEmptySides() throws Exception {
+    String query1 = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet` limit 0) " +
+      "intersect " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet`)";
+
+    String query2 = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet` where n_nationkey = 1) " +
+      "except " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet` limit 0)";
+
+    testBuilder()
+      .sqlQuery(query1)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .expectsEmptyResultSet()
+      .build().run();
+
+    testBuilder()
+      .sqlQuery(query2)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .baselineValues(1, 1, "ARGENTINA")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testAggregationOnIntersectOperator() throws Exception {
+    String root = "/store/text/data/t.json";
+
+    testBuilder()
+      .sqlQuery("(select calc1, max(b1) as `max`, min(b1) as `min`, count(c1) as `count` " +
+        "from (select a1 + 10 as calc1, b1, c1 from cp.`%s` " +
+        "intersect all select a1 + 10 as diff1, b1 as diff2, c1 as diff3 from cp.`%s`) " +
+        "group by calc1 order by calc1)", root, root)
+      .ordered()
+      .baselineColumns("calc1", "max", "min", "count")
+      .baselineValues(10L, 2L, 1L, 5L)
+      .baselineValues(20L, 5L, 3L, 5L)
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("(select calc1, min(b1) as `min`, max(b1) as `max`, count(c1) as `count` " +
+        "from (select a1 + 10 as calc1, b1, c1 from cp.`%s` " +
+        "intersect all select a1 + 10 as diff1, b1 as diff2, c1 as diff3 from cp.`%s`) " +
+        "group by calc1 order by calc1)", root, root)
+      .ordered()
+      .baselineColumns("calc1", "min", "max", "count")
+      .baselineValues(10L, 1L, 2L, 5L)
+      .baselineValues(20L, 3L, 5L, 5L)
+      .build().run();
+  }
+
+  @Test(expected = UserException.class)
+  public void testImplicitCastingFailure() throws Exception {
+    String rootInt = "/store/json/intData.json";
+    String rootBoolean = "/store/json/booleanData.json";
+
+    run("(select key from cp.`%s` " +
+      "intersect all " +
+      "select key from cp.`%s` )", rootInt, rootBoolean);
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDateAndTimestampJson() throws Exception {
+    String rootDate = "/store/json/dateData.json";
+    String rootTimpStmp = "/store/json/timeStmpData.json";
+
+    testBuilder()
+      .sqlQuery("(select max(key) as key from cp.`%s` " +
+        "except all select key from cp.`%s`)", rootDate, rootTimpStmp)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues("2011-07-26")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+        "except select max(key) as key from cp.`%s`", rootTimpStmp, rootDate)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues("2015-03-26 19:04:55.542")
+      .baselineValues("2015-03-26 19:04:55.543")
+      .baselineValues("2015-03-26 19:04:55.544")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testOneInputContainsAggFunction() throws Exception {
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+
+    testBuilder()
+      .sqlQuery("select * from ((select max(c1) as ct from (select columns[0] c1 from cp.`%s`)) \n" +
+        "intersect all (select columns[0] c2 from cp.`%s`)) order by ct limit 3", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select * from ((select columns[0] ct from cp.`%s`)\n" +
+        "intersect all (select max(c1) as c2 from (select columns[0] c1 from cp.`%s`))) order by ct limit 3", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select * from ((select max(c1) as ct from (select columns[0] c1 from cp.`%s`))\n" +
+        "intersect all (select max(c1) as c2 from (select columns[0] c1 from cp.`%s`))) order by ct", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testUnionInputsGroupByOnCSV() throws Exception {
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+
+    testBuilder()
+      .sqlQuery("select * from \n" +
+          "((select columns[0] as col0 from cp.`%s` t1 \n" +
+          "where t1.columns[0] = 66) \n" +
+          "intersect all \n" +
+          "(select columns[0] c2 from cp.`%s` t2 \n" +
+          "where t2.columns[0] is not null \n" +
+          "group by columns[0])) \n" +
+          "group by col0",
+        root, root)
+      .unOrdered()
+      .baselineColumns("col0")
+      .baselineValues("66")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDiffTypesAtPlanning() throws Exception {
+    testBuilder()
+      .sqlQuery("select count(c1) as ct from (select cast(r_regionkey as int) c1 from cp.`tpch/region.parquet`) " +
+        "intersect (select cast(r_regionkey as int) + 1 c2 from cp.`tpch/region.parquet`)")
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 5)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testRightEmptyJson() throws Exception {
+    String rootEmpty = "/project/pushdown/empty.json";
+    String rootSimple = "/store/json/booleanData.json";
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "intersect all " +
+          "select key from cp.`%s`",
+        rootSimple,
+        rootEmpty)
+      .unOrdered()
+      .baselineColumns("key")
+      .expectsEmptyResultSet()
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "except all " +
+          "select key from cp.`%s`",
+        rootSimple,
+        rootEmpty)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues(true)
+      .baselineValues(false)
+      .build().run();
+  }
+
+  @Test
+  public void testUnionAllLeftEmptyJson() throws Exception {

Review Comment:
   Intersect



##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/aggregate/HashAggBatch.java:
##########
@@ -188,10 +192,13 @@ public HashAggBatch(HashAggregate popConfig, RecordBatch incoming, FragmentConte
     wasKilled = false;
 
     final int numGrpByExprs = popConfig.getGroupByExprs().size();
-    comparators = Lists.newArrayListWithExpectedSize(numGrpByExprs);
-    for (int i=0; i<numGrpByExprs; i++) {
-      // nulls are equal in group by case
-      comparators.add(Comparator.IS_NOT_DISTINCT_FROM);
+    isDynamicColumns = numGrpByExprs == 1 && SchemaPath.DYNAMIC_STAR.equals(popConfig.getGroupByExprs().get(0).getRef().getRootSegment().getPath());

Review Comment:
   Let's use the same approach as for union all - allow using dynamic star only if the datasource has a defined schema, otherwise, throw an error. In this case, star expansion will be done during planning, and there is no need to handle star columns here.



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {
+  private static final String EMPTY_DIR_NAME = "empty_directory";
+  private static final String SLICE_TARGET_DEFAULT = "alter session reset `planner.slice_target`";
+
+  @BeforeClass
+  public static void setupTestFiles() throws Exception {
+    startCluster(ClusterFixture.builder(dirTestWatcher));
+    dirTestWatcher.copyResourceToRoot(Paths.get("multilevel", "parquet"));
+    dirTestWatcher.makeTestTmpSubDir(Paths.get(EMPTY_DIR_NAME));
+  }
+
+  @Test
+  public void testIntersect() throws Exception {
+    String query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) intersect select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(2, 2)
+      .baselineValues(1, 1)
+      .build().run();
+
+    query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) intersect all select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(2, 2)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .build().run();
+  }
+
+  @Test
+  public void testExcept() throws Exception {
+    String query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    String aggAbovePattern = ".*Screen.*Agg.*SetOp.*";
+    String aggBelowPattern = ".*SetOp.*Agg.*Values.*";
+
+    queryBuilder()
+      .sql(query)
+      .planMatcher()
+      .include(aggAbovePattern)
+      .exclude(aggBelowPattern)
+      .match(true);
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(4, 4)
+      .baselineValues(3, 4)
+      .build().run();
+
+    try {
+      client.alterSession(ExecConstants.EXCEPT_ADD_AGG_BELOW_KEY, true);
+      query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except select a, b from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+      queryBuilder()
+        .sql(query)
+        .planMatcher()
+        .include(aggBelowPattern)
+        .exclude(aggAbovePattern)
+        .match(true);
+
+      testBuilder()
+        .sqlQuery(query)
+        .unOrdered()
+        .baselineColumns("a", "b")
+        .baselineValues(4, 4)
+        .baselineValues(3, 4)
+        .build().run();
+    } finally {
+      client.resetSession(ExecConstants.EXCEPT_ADD_AGG_BELOW_KEY);
+    }
+
+    query = "select * from (values(4,4), (2,2), (4,4), (1,1), (3,4), (2,2), (1,1)) t(a,b) except all select * from (values(1,1), (1,1), (2,2), (3,3)) t(a,b)";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("a", "b")
+      .baselineValues(4, 4)
+      .baselineValues(4, 4)
+      .baselineValues(3, 4)
+      .baselineValues(2, 2)
+      .build().run();
+  }
+
+
+  @Test
+  public void testOverJoin() throws Exception {
+    String query =
+      "select n1.n_nationkey from cp.`tpch/nation.parquet` n1 inner join cp.`tpch/region.parquet` r1 on n1.n_regionkey = r1.r_regionkey where n1.n_nationkey in (1, 2, 3, 4) " +
+      "except " +
+      "select n2.n_nationkey from cp.`tpch/nation.parquet` n2 inner join cp.`tpch/region.parquet` r2 on n2.n_regionkey = r2.r_regionkey where n2.n_nationkey in (3, 4)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey")
+      .baselineValues(1)
+      .baselineValues(2)
+      .build().run();
+  }
+
+  @Test
+  public void testOverAgg() throws Exception {
+    String query = "select n1.n_regionkey from cp.`tpch/nation.parquet` n1 group by n1.n_regionkey except " +
+      "select r1.r_regionkey from cp.`tpch/region.parquet` r1 where r1.r_regionkey in (0, 1) group by r1.r_regionkey";
+
+    String excludePattern = "Screen.*Agg.*SetOp";
+    queryBuilder()
+      .sql(query)
+      .planMatcher()
+      .exclude(excludePattern)
+      .match(true);
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey")
+      .baselineValues(2)
+      .baselineValues(3)
+      .baselineValues(4)
+      .build().run();
+  }
+
+  @Test
+  public void testChain() throws Exception {
+    String query = "select n_regionkey from cp.`tpch/nation.parquet` intersect " +
+      "select r_regionkey from cp.`tpch/region.parquet` intersect " +
+      "select n_nationkey from cp.`tpch/nation.parquet` where n_nationkey in (1,2) intersect " +
+      "select c_custkey from cp.`tpch/customer.parquet` where c_custkey < 5";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey")
+      .baselineValues(1)
+      .baselineValues(2)
+      .build().run();
+  }
+
+  @Test
+  public void testSameColumn() throws Exception {
+    String query = "select n_nationkey, n_regionkey from cp.`tpch/nation.parquet` where n_regionkey = 1 intersect all select r_regionkey, r_regionkey from cp.`tpch/region.parquet` where r_regionkey = 1";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey")
+      .baselineValues(1, 1)
+      .build().run();
+
+    query = "select n_regionkey, n_regionkey from cp.`tpch/nation.parquet` where n_regionkey = 1 except all select r_regionkey, r_regionkey from cp.`tpch/region.parquet` where r_regionkey = 1";
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_regionkey", "n_regionkey0")
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .baselineValues(1, 1)
+      .build().run();
+  }
+
+  @Test
+  public void testTwoStringColumns() throws Exception {
+    String query = "select r_comment, r_regionkey from cp.`tpch/region.parquet` except select n_name, n_nationkey from cp.`tpch/nation.parquet`";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("r_comment", "r_regionkey")
+      .baselineValues("lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to ", 0)
+      .baselineValues("hs use ironic, even requests. s", 1)
+      .baselineValues("ges. thinly even pinto beans ca", 2)
+      .baselineValues("ly final courts cajole furiously final excuse", 3)
+      .baselineValues("uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl", 4)
+      .build().run();
+  }
+
+
+  @Test
+  public void testConstantLiterals() throws Exception {
+    String query = "(select 'CONST' as LiteralConstant, 1 as NumberConstant, n_nationkey from cp.`tpch/nation.parquet`) " +
+      "intersect " +
+      "(select 'CONST', 1, r_regionkey from cp.`tpch/region.parquet`)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("LiteralConstant", "NumberConstant", "n_nationkey")
+      .baselineValues("CONST", 1, 0)
+      .baselineValues("CONST", 1, 1)
+      .baselineValues("CONST", 1, 2)
+      .baselineValues("CONST", 1, 3)
+      .baselineValues("CONST", 1, 4)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testViewExpandableStar() throws Exception {
+    try {
+      run("use dfs.tmp");
+      run("create view nation_view as select n_nationkey, n_name from (values(4,'4'), (2,'2'), (4,'4'), (1,'1'), (3,'4'), (2,'2'), (1,'1')) t(n_nationkey, n_name)");
+      run("create view region_view as select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)");
+
+      String query1 = "(select * from dfs.tmp.`nation_view`) " +
+        "except " +
+        "(select * from dfs.tmp.`region_view`) ";
+
+      String query2 =  "(select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)) " +
+        "intersect " +
+        "(select * from dfs.tmp.`nation_view`)";
+
+      testBuilder()
+        .sqlQuery(query1)
+        .unOrdered()
+        .baselineColumns("n_nationkey", "n_name")
+        .baselineValues(4, "4")
+        .baselineValues(3, "4")
+        .build().run();
+
+      testBuilder()
+        .sqlQuery(query2)
+        .unOrdered()
+        .baselineColumns("r_regionkey", "r_name")
+        .baselineValues(1, "1")
+        .baselineValues(2, "2")
+        .build().run();
+    } finally {
+      run("drop view if exists nation_view");
+      run("drop view if exists region_view");
+    }
+  }
+
+  @Test
+  public void testDiffDataTypesAndModes() throws Exception {
+    try {
+      run("use dfs.tmp");
+      run("create view nation_view as select n_nationkey, n_name from (values(4,'4'), (2,'2'), (4,'4'), (1,'1'), (3,'4'), (2,'2'), (1,'1')) t(n_nationkey, n_name)");
+      run("create view region_view as select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name)");
+
+
+      String t1 = "(select r_regionkey, r_name from (values(1,'1'), (1,'1'), (2,'2'), (3,'3')) t(r_regionkey, r_name))";
+      String t2 = "(select * from nation_view)";
+      String t3 = "(select * from region_view)";
+      String t4 = "(select store_id, full_name from cp.`employee.json` limit 5)";
+
+      String query1 = t1 + " intersect all " + t2 + " intersect all " + t3 + " except all " + t4;
+
+      testBuilder()
+        .sqlQuery(query1)
+        .unOrdered()
+        .baselineColumns("r_regionkey", "r_name")
+        .baselineValues(1, "1")
+        .baselineValues(1, "1")
+        .baselineValues(2, "2")
+        .build().run();
+    } finally {
+      run("drop view if exists nation_view");
+      run("drop view if exists region_view");
+    }
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDistinctOverUnionAllwithFullyQualifiedColumnNames() throws Exception {
+    String query = "select distinct sq.x1 " +
+      "from " +
+      "((select n_regionkey as a1 from cp.`tpch/nation.parquet`) " +
+      "intersect all " +
+      "(select r_regionkey as a2 from cp.`tpch/region.parquet`)) as sq(x1)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("x1")
+      .baselineValues(0)
+      .baselineValues(1)
+      .baselineValues(2)
+      .baselineValues(3)
+      .baselineValues(4)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testContainsColumnAndNumericConstant() throws Exception {
+    String query = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet`) " +
+      "intersect " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet`)";
+
+    testBuilder()
+      .sqlQuery(query)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .baselineValues(1, 1, "ARGENTINA")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testEmptySides() throws Exception {
+    String query1 = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet` limit 0) " +
+      "intersect " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet`)";
+
+    String query2 = "(select n_nationkey, n_regionkey, n_name from cp.`tpch/nation.parquet` where n_nationkey = 1) " +
+      "except " +
+      "(select 1, n_regionkey, 'ARGENTINA' from cp.`tpch/nation.parquet` limit 0)";
+
+    testBuilder()
+      .sqlQuery(query1)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .expectsEmptyResultSet()
+      .build().run();
+
+    testBuilder()
+      .sqlQuery(query2)
+      .unOrdered()
+      .baselineColumns("n_nationkey", "n_regionkey", "n_name")
+      .baselineValues(1, 1, "ARGENTINA")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testAggregationOnIntersectOperator() throws Exception {
+    String root = "/store/text/data/t.json";
+
+    testBuilder()
+      .sqlQuery("(select calc1, max(b1) as `max`, min(b1) as `min`, count(c1) as `count` " +
+        "from (select a1 + 10 as calc1, b1, c1 from cp.`%s` " +
+        "intersect all select a1 + 10 as diff1, b1 as diff2, c1 as diff3 from cp.`%s`) " +
+        "group by calc1 order by calc1)", root, root)
+      .ordered()
+      .baselineColumns("calc1", "max", "min", "count")
+      .baselineValues(10L, 2L, 1L, 5L)
+      .baselineValues(20L, 5L, 3L, 5L)
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("(select calc1, min(b1) as `min`, max(b1) as `max`, count(c1) as `count` " +
+        "from (select a1 + 10 as calc1, b1, c1 from cp.`%s` " +
+        "intersect all select a1 + 10 as diff1, b1 as diff2, c1 as diff3 from cp.`%s`) " +
+        "group by calc1 order by calc1)", root, root)
+      .ordered()
+      .baselineColumns("calc1", "min", "max", "count")
+      .baselineValues(10L, 1L, 2L, 5L)
+      .baselineValues(20L, 3L, 5L, 5L)
+      .build().run();
+  }
+
+  @Test(expected = UserException.class)
+  public void testImplicitCastingFailure() throws Exception {
+    String rootInt = "/store/json/intData.json";
+    String rootBoolean = "/store/json/booleanData.json";
+
+    run("(select key from cp.`%s` " +
+      "intersect all " +
+      "select key from cp.`%s` )", rootInt, rootBoolean);
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDateAndTimestampJson() throws Exception {
+    String rootDate = "/store/json/dateData.json";
+    String rootTimpStmp = "/store/json/timeStmpData.json";
+
+    testBuilder()
+      .sqlQuery("(select max(key) as key from cp.`%s` " +
+        "except all select key from cp.`%s`)", rootDate, rootTimpStmp)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues("2011-07-26")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+        "except select max(key) as key from cp.`%s`", rootTimpStmp, rootDate)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues("2015-03-26 19:04:55.542")
+      .baselineValues("2015-03-26 19:04:55.543")
+      .baselineValues("2015-03-26 19:04:55.544")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testOneInputContainsAggFunction() throws Exception {
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+
+    testBuilder()
+      .sqlQuery("select * from ((select max(c1) as ct from (select columns[0] c1 from cp.`%s`)) \n" +
+        "intersect all (select columns[0] c2 from cp.`%s`)) order by ct limit 3", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select * from ((select columns[0] ct from cp.`%s`)\n" +
+        "intersect all (select max(c1) as c2 from (select columns[0] c1 from cp.`%s`))) order by ct limit 3", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select * from ((select max(c1) as ct from (select columns[0] c1 from cp.`%s`))\n" +
+        "intersect all (select max(c1) as c2 from (select columns[0] c1 from cp.`%s`))) order by ct", root, root)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues("99")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testUnionInputsGroupByOnCSV() throws Exception {
+    String root = "/multilevel/csv/1994/Q1/orders_94_q1.csv";
+
+    testBuilder()
+      .sqlQuery("select * from \n" +
+          "((select columns[0] as col0 from cp.`%s` t1 \n" +
+          "where t1.columns[0] = 66) \n" +
+          "intersect all \n" +
+          "(select columns[0] c2 from cp.`%s` t2 \n" +
+          "where t2.columns[0] is not null \n" +
+          "group by columns[0])) \n" +
+          "group by col0",
+        root, root)
+      .unOrdered()
+      .baselineColumns("col0")
+      .baselineValues("66")
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDiffTypesAtPlanning() throws Exception {
+    testBuilder()
+      .sqlQuery("select count(c1) as ct from (select cast(r_regionkey as int) c1 from cp.`tpch/region.parquet`) " +
+        "intersect (select cast(r_regionkey as int) + 1 c2 from cp.`tpch/region.parquet`)")
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 5)
+      .build().run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testRightEmptyJson() throws Exception {
+    String rootEmpty = "/project/pushdown/empty.json";
+    String rootSimple = "/store/json/booleanData.json";
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "intersect all " +
+          "select key from cp.`%s`",
+        rootSimple,
+        rootEmpty)
+      .unOrdered()
+      .baselineColumns("key")
+      .expectsEmptyResultSet()
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "except all " +
+          "select key from cp.`%s`",
+        rootSimple,
+        rootEmpty)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues(true)
+      .baselineValues(false)
+      .build().run();
+  }
+
+  @Test
+  public void testUnionAllLeftEmptyJson() throws Exception {
+    final String rootEmpty = "/project/pushdown/empty.json";
+    final String rootSimple = "/store/json/booleanData.json";
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "intersect all " +
+          "select key from cp.`%s`",
+        rootEmpty,
+        rootSimple)
+      .unOrdered()
+      .baselineColumns("key")
+      .expectsEmptyResultSet()
+      .build().run();
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "except all " +
+          "select key from cp.`%s`",
+        rootEmpty,
+        rootSimple)
+      .unOrdered()
+      .baselineColumns("key")
+      .expectsEmptyResultSet()
+      .build().run();
+  }
+
+  @Test
+  public void testUnionAllBothEmptyJson() throws Exception {
+    final String rootEmpty = "/project/pushdown/empty.json";
+
+    final List<Pair<SchemaPath, TypeProtos.MajorType>> expectedSchema = Lists.newArrayList();
+    final TypeProtos.MajorType majorType = TypeProtos.MajorType.newBuilder()
+      .setMinorType(TypeProtos.MinorType.INT)
+      .setMode(TypeProtos.DataMode.OPTIONAL)
+      .build();
+    expectedSchema.add(Pair.of(SchemaPath.getSimplePath("key"), majorType));
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "intersect all " +
+          "select key from cp.`%s`",
+        rootEmpty,
+        rootEmpty)
+      .schemaBaseLine(expectedSchema)
+      .build()
+      .run();
+  }
+
+  @Test
+  public void testRightEmptyDataBatch() throws Exception {
+    String rootSimple = "/store/json/booleanData.json";
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` " +
+          "except all " +
+          "select key from cp.`%s` where 1 = 0",
+        rootSimple,
+        rootSimple)
+      .unOrdered()
+      .baselineColumns("key")
+      .baselineValues(true)
+      .baselineValues(false)
+      .build().run();
+  }
+
+  @Test
+  public void testLeftEmptyDataBatch() throws Exception {
+    String rootSimple = "/store/json/booleanData.json";
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` where 1 = 0 " +
+          "except all " +
+          "select key from cp.`%s`",
+        rootSimple,
+        rootSimple)
+      .unOrdered()
+      .baselineColumns("key")
+      .expectsEmptyResultSet()
+      .build()
+      .run();
+  }
+
+  @Test
+  public void testBothEmptyDataBatch() throws Exception {
+    String rootSimple = "/store/json/booleanData.json";
+
+    final List<Pair<SchemaPath, TypeProtos.MajorType>> expectedSchema = Lists.newArrayList();
+    final TypeProtos.MajorType majorType = TypeProtos.MajorType.newBuilder()
+      .setMinorType(TypeProtos.MinorType.BIT) // field "key" is boolean type
+      .setMode(TypeProtos.DataMode.OPTIONAL)
+      .build();
+    expectedSchema.add(Pair.of(SchemaPath.getSimplePath("key"), majorType));
+
+    testBuilder()
+      .sqlQuery("select key from cp.`%s` where 1 = 0 " +
+          "intersect all " +
+          "select key from cp.`%s` where 1 = 0",
+        rootSimple,
+        rootSimple)
+      .schemaBaseLine(expectedSchema)
+      .build()
+      .run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testInListOnIntersect() throws Exception {
+    String query = "select n_nationkey \n" +
+      "from (select n1.n_nationkey from cp.`tpch/nation.parquet` n1 inner join cp.`tpch/region.parquet` r1 on n1.n_regionkey = r1.r_regionkey \n" +
+      "intersect \n" +
+      "select n2.n_nationkey from cp.`tpch/nation.parquet` n2 inner join cp.`tpch/region.parquet` r2 on n2.n_regionkey = r2.r_regionkey) \n" +
+      "where n_nationkey in (1, 2)";
+
+    // Validate the plan
+    final String[] expectedPlan = {"Project.*\n" +
+      ".*SetOp\\(all=\\[false\\], kind=\\[INTERSECT\\]\\).*\n" +
+      ".*Project.*\n" +
+      ".*HashJoin.*\n" +
+      ".*SelectionVectorRemover.*\n" +
+      ".*Filter.*\n" +
+      ".*Scan.*columns=\\[`n_regionkey`, `n_nationkey`\\].*\n" +
+      ".*Scan.*columns=\\[`r_regionkey`\\].*\n" +
+      ".*Project.*\n" +
+      ".*HashJoin.*\n" +
+      ".*SelectionVectorRemover.*\n" +
+      ".*Filter.*\n" +
+      ".*Scan.*columns=\\[`n_regionkey`, `n_nationkey`\\].*\n" +
+      ".*Scan.*columns=\\[`r_regionkey`\\].*"};
+    queryBuilder()
+      .sql(query)
+      .planMatcher()
+      .include(expectedPlan)
+      .match();
+
+    // Validate the result
+    testBuilder()
+      .sqlQuery(query)
+      .ordered()
+      .baselineColumns("n_nationkey")
+      .baselineValues(1)
+      .baselineValues(2)
+      .build()
+      .run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testIntersectWith() throws Exception {
+    final String query1 = "WITH year_total \n" +
+      "     AS (SELECT c.r_regionkey    customer_id,\n" +
+      "                1 year_total\n" +
+      "         FROM   cp.`tpch/region.parquet` c\n" +
+      "         Intersect ALL \n" +
+      "         SELECT c.r_regionkey    customer_id, \n" +
+      "                1 year_total\n" +
+      "         FROM   cp.`tpch/region.parquet` c) \n" +
+      "SELECT count(t_s_secyear.customer_id) as ct \n" +
+      "FROM   year_total t_s_firstyear, \n" +
+      "       year_total t_s_secyear, \n" +
+      "       year_total t_w_firstyear, \n" +
+      "       year_total t_w_secyear \n" +
+      "WHERE  t_s_secyear.customer_id = t_s_firstyear.customer_id \n" +
+      "       AND t_s_firstyear.customer_id = t_w_secyear.customer_id \n" +
+      "       AND t_s_firstyear.customer_id = t_w_firstyear.customer_id \n" +
+      "       AND CASE \n" +
+      "             WHEN t_w_firstyear.year_total > 0 THEN t_w_secyear.year_total \n" +
+      "             ELSE NULL \n" +
+      "           END > -1";
+
+    final String query2 = "WITH year_total \n" +
+      "     AS (SELECT c.r_regionkey    customer_id,\n" +
+      "                1 year_total\n" +
+      "         FROM   cp.`tpch/region.parquet` c\n" +
+      "         Intersect ALL \n" +
+      "         SELECT c.r_regionkey    customer_id, \n" +
+      "                1 year_total\n" +
+      "         FROM   cp.`tpch/region.parquet` c) \n" +
+      "SELECT count(t_w_firstyear.customer_id) as ct \n" +
+      "FROM   year_total t_w_firstyear, \n" +
+      "       year_total t_w_secyear \n" +
+      "WHERE  t_w_firstyear.year_total = t_w_secyear.year_total \n" +
+      " AND t_w_firstyear.year_total > 0 and t_w_secyear.year_total > 0";
+
+    final String query3 = "WITH year_total_1\n" +
+      "             AS (SELECT c.r_regionkey    customer_id,\n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/region.parquet` c\n" +
+      "                 Intersect ALL \n" +
+      "                 SELECT c.r_regionkey    customer_id, \n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/region.parquet` c) \n" +
+      "             , year_total_2\n" +
+      "             AS (SELECT c.n_nationkey    customer_id,\n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/nation.parquet` c\n" +
+      "                 Intersect ALL \n" +
+      "                 SELECT c.n_nationkey    customer_id, \n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/nation.parquet` c) \n" +
+      "        SELECT count(t_w_firstyear.customer_id) as ct\n" +
+      "        FROM   year_total_1 t_w_firstyear,\n" +
+      "               year_total_2 t_w_secyear\n" +
+      "        WHERE  t_w_firstyear.year_total = t_w_secyear.year_total\n" +
+      "           AND t_w_firstyear.year_total > 0 and t_w_secyear.year_total > 0";
+
+    final String query4 = "WITH year_total_1\n" +
+      "             AS (SELECT c.n_regionkey    customer_id,\n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/nation.parquet` c\n" +
+      "                 Intersect ALL \n" +
+      "                 SELECT c.r_regionkey    customer_id, \n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/region.parquet` c), \n" +
+      "             year_total_2\n" +
+      "             AS (SELECT c.n_regionkey    customer_id,\n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/nation.parquet` c\n" +
+      "                 Intersect ALL \n" +
+      "                 SELECT c.r_regionkey    customer_id, \n" +
+      "                        1 year_total\n" +
+      "                 FROM   cp.`tpch/region.parquet` c) \n" +
+      "        SELECT count(t_w_firstyear.customer_id) as ct \n" +
+      "        FROM   year_total_1 t_w_firstyear,\n" +
+      "               year_total_2 t_w_secyear\n" +
+      "        WHERE  t_w_firstyear.year_total = t_w_secyear.year_total\n" +
+      "         AND t_w_firstyear.year_total > 0 and t_w_secyear.year_total > 0";
+
+    testBuilder()
+      .sqlQuery(query1)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 5)
+      .build()
+      .run();
+
+    testBuilder()
+      .sqlQuery(query2)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 25)
+      .build()
+      .run();
+
+    testBuilder()
+      .sqlQuery(query3)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 125)
+      .build()
+      .run();
+
+    testBuilder()
+      .sqlQuery(query4)
+      .ordered()
+      .baselineColumns("ct")
+      .baselineValues((long) 25)
+      .build()
+      .run();
+  }
+
+  @Test
+  @Category(UnlikelyTest.class)
+  public void testDrill4147_1() throws Exception {
+    final String l = "/multilevel/parquet/1994";
+    final String r = "/multilevel/parquet/1995";
+
+    final String query = String.format("SELECT o_custkey FROM dfs.`%s` \n" +
+      "Except All SELECT o_custkey FROM dfs.`%s`", l, r);
+
+    // Validate the plan
+    final String[] expectedPlan = {"UnionExchange.*\n",
+      ".*SetOp"};
+
+    try {
+      client.alterSession(ExecConstants.SLICE_TARGET, 1);
+      queryBuilder()
+        .sql(query)
+        .planMatcher()
+        .include(expectedPlan)
+        .match();
+
+      testBuilder()
+        .optionSettingQueriesForBaseline(SLICE_TARGET_DEFAULT)
+        .unOrdered()
+        .sqlQuery(query)
+        .sqlBaselineQuery(query)
+        .build()
+        .run();
+    } finally {
+      client.resetSession(ExecConstants.SLICE_TARGET);
+    }
+  }
+
+  @Test// group-by on top of set op
+  public void testDrill4147_2() throws Exception {

Review Comment:
   Please fix test name



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1264537373

   @Leon-WTF I don't know if you saw this or not, but #2602 has been merged.  We are actually running on Calcite 1.32 now!  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1280190592

   > HI @Leon-WTF I saw that you added unit tests. Is this PR ready for review? If not, what is remaining?
   
   @cgivre I think it's just still need more ut to test more completely and fix bugs while adding ut. By the way, do we have any plan to release drill 2.0? I will try to catch that plan.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
Leon-WTF commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1229365001

   > @Leon-WTF Is this ready for review?
   
   @cgivre Not yet, I'm handling the EXCEPT case, it needs to remove the duplicate records for probe side, I'm trying to add an Agg phase after setop phase. The agg phase needs a flag to indicate that it needs to group by all columns as the columns can not be known when doing the plan. Any suggestion on this?
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1280002423

   HI @Leon-WTF 
   I saw that you added unit tests.  Is this PR ready for review?  If not, what is remaining?  


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "vvysotskyi (via GitHub)" <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1097057876


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   Oh, ok, if it is specific to our implementation of except operator, aggregation added here possibly could be removed by other Calcite rules which assume that results would be already distinct.
   
   I think it would be better to add an aggregation when converting it to physical rel nodes.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1097148776


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   oh, yes, I will refactor that.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1103724674


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   > Drill doesn't use streaming aggregate for distinct calls, so only hash agg should be enough.
   
   @vvysotskyi I see it checks aggregate.containsDistinctCall() in StreamAggPrule, but It will generate steam agg for sql like "select a,b,c from foo group by a,b,c".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] vvysotskyi commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "vvysotskyi (via GitHub)" <gi...@apache.org>.
vvysotskyi commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1083511030


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillSetOpRel.java:
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.linq4j.Ord;
+import org.apache.calcite.plan.RelOptCluster;
+import org.apache.calcite.plan.RelTraitSet;
+import org.apache.calcite.rel.InvalidRelException;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.common.logical.data.LogicalOperator;
+import org.apache.drill.common.logical.data.Union;
+import org.apache.drill.exec.planner.common.DrillSetOpRelBase;
+
+import java.util.List;
+
+/**
+ * SetOp implemented in Drill.
+ */
+public class DrillSetOpRel extends DrillSetOpRelBase implements DrillRel {

Review Comment:
   I think instead of having a common class for the intersect and except, it would be better to differ them as it is done in Calcite and extend their implementations, so it will help to remove defining custom `estimateRowCount` methods and allow using more optimizations designed for these operators, like push down to JDBC and so on.



##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   It is good that it is handled in the code, but it could also be fine to have a test that verifies it works as expected, so we will be sure that no future changes will break it.



##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/setop/HashSetOpProbeTemplate.java:
##########
@@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.setop;
+
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.physical.impl.common.HashPartition;
+import org.apache.drill.exec.physical.impl.join.HashJoinHelper;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatch.IterOutcome;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.IntVector;
+import org.apache.drill.exec.vector.ValueVector;
+
+import java.util.ArrayList;
+
+import static org.apache.drill.exec.record.JoinBatchMemoryManager.LEFT_INDEX;
+
+public class HashSetOpProbeTemplate implements HashSetOpProbe {

Review Comment:
   It looks like it has a lot of code copied from HashAggTemplate. Is it possible to move out common code to avoid duplicating it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1088599209


##########
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/setop/HashSetOpProbeTemplate.java:
##########
@@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.physical.impl.setop;
+
+import org.apache.calcite.sql.SqlKind;
+import org.apache.drill.exec.exception.SchemaChangeException;
+import org.apache.drill.exec.physical.impl.common.HashPartition;
+import org.apache.drill.exec.physical.impl.join.HashJoinHelper;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.record.RecordBatch.IterOutcome;
+import org.apache.drill.exec.record.VectorContainer;
+import org.apache.drill.exec.record.VectorWrapper;
+import org.apache.drill.exec.vector.IntVector;
+import org.apache.drill.exec.vector.ValueVector;
+
+import java.util.ArrayList;
+
+import static org.apache.drill.exec.record.JoinBatchMemoryManager.LEFT_INDEX;
+
+public class HashSetOpProbeTemplate implements HashSetOpProbe {

Review Comment:
   It's more like HashJoinProbeTemplate, I have refactored it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1255460340

   Hi @Leon-WTF how is this coming?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1289163116

   > > HI @Leon-WTF I saw that you added unit tests. Is this PR ready for review? If not, what is remaining?
   > 
   > @cgivre I think it's just still need more ut to test more completely and fix bugs while adding ut. By the way, do we have any plan to release drill 2.0? I will try to catch that plan.
   
   We are getting ready to release Drill 1.20.3.  Once that's done, I'd like to start discussions around Drill 2.0.  There have been a lot of major work in Drill and I'd like to see that getting used. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1070523002


##########
exec/java-exec/src/test/java/org/apache/drill/TestSetOp.java:
##########
@@ -0,0 +1,1093 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill;
+
+import org.apache.drill.exec.planner.physical.PlannerSettings;
+import org.apache.drill.exec.record.BatchSchema;
+import org.apache.drill.exec.record.BatchSchemaBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.apache.commons.lang3.tuple.Pair;
+import org.apache.drill.categories.OperatorTest;
+import org.apache.drill.categories.SqlTest;
+import org.apache.drill.categories.UnlikelyTest;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.io.BufferedWriter;
+import java.io.File;
+import java.io.FileWriter;
+import java.nio.file.Paths;
+import java.util.List;
+
+@Category({SqlTest.class, OperatorTest.class})
+public class TestSetOp extends ClusterTest {

Review Comment:
   first empty batch will be skipped in sniffNonEmptyBatch



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232 Support for EXCEPT and INTERSECT set operator

Posted by GitBox <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1190414440

   @Leon-WTF You might want to wait until https://github.com/apache/drill/pull/2602 is merged before you continue this. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1097007805


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   It's not distinct for left table after except operator. I choosed to reuse an aggregate operator to do the distinct.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] Leon-WTF commented on a diff in pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "Leon-WTF (via GitHub)" <gi...@apache.org>.
Leon-WTF commented on code in PR #2599:
URL: https://github.com/apache/drill/pull/2599#discussion_r1103724674


##########
exec/java-exec/src/main/java/org/apache/drill/exec/planner/logical/DrillAddAggForExceptRule.java:
##########
@@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.logical;
+
+import org.apache.calcite.plan.RelOptRule;
+import org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.calcite.plan.RelOptRuleOperand;
+import org.apache.calcite.plan.hep.HepRelVertex;
+import org.apache.calcite.rel.RelNode;
+import org.apache.calcite.util.ImmutableBitSet;
+import org.apache.calcite.util.trace.CalciteTrace;
+import org.apache.curator.shaded.com.google.common.collect.ImmutableList;
+import org.apache.drill.exec.planner.physical.PrelUtil;
+import org.slf4j.Logger;
+
+import static org.apache.drill.exec.ExecConstants.EXCEPT_ADD_AGG_BELOW;
+
+/**
+ * Rule that try to add agg for Except set op.
+ */
+public class DrillAddAggForExceptRule extends RelOptRule {
+  public static final RelOptRule INSTANCE = new DrillAddAggForExceptRule(RelOptHelper.any(DrillExceptRel.class), "DrillAddAggForExceptRule");
+  protected static final Logger tracer = CalciteTrace.getPlannerTracer();
+
+  public DrillAddAggForExceptRule(RelOptRuleOperand operand, String description) {
+    super(operand, description);
+  }
+
+  @Override
+  public boolean matches(RelOptRuleCall call) {
+    DrillExceptRel drillExceptRel = call.rel(0);
+    return !drillExceptRel.all && !drillExceptRel.isAggAdded() && !findAggRel(drillExceptRel.getInput(0));
+  }
+
+  private boolean findAggRel(RelNode relNode) {
+    if (relNode instanceof HepRelVertex) {
+      return findAggRel(((HepRelVertex) relNode).getCurrentRel());
+    }
+    if (relNode instanceof DrillAggregateRel) {
+      return true;
+    }
+    if (relNode.getInputs().size() == 1 && relNode.getInput(0) != null) {
+      return findAggRel(relNode.getInput(0));
+    }
+    return false;
+  }
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final DrillExceptRel drillExceptRel = call.rel(0);
+    boolean addAggBelow = PrelUtil.getPlannerSettings(call.getPlanner()).getOptions().getOption(EXCEPT_ADD_AGG_BELOW);
+    if (addAggBelow) {
+      RelNode aggNode = new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.getInput(0),
+        ImmutableBitSet.range(0, drillExceptRel.getInput(0).getRowType().getFieldList().size()), ImmutableList.of(), ImmutableList.of());
+      call.transformTo(drillExceptRel.copy(ImmutableList.of(aggNode, drillExceptRel.getInput(1)), true));
+    } else {
+      call.transformTo(new DrillAggregateRel(drillExceptRel.getCluster(), drillExceptRel.getTraitSet(), drillExceptRel.copy(true),

Review Comment:
   > Drill doesn't use streaming aggregate for distinct calls, so only hash agg should be enough.
   
   I see it checks aggregate.containsDistinctCall() in StreamAggPrule, but It will generate steam agg for sql like "select a,b,c from foo group by a,b,c".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre commented on pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "cgivre (via GitHub)" <gi...@apache.org>.
cgivre commented on PR #2599:
URL: https://github.com/apache/drill/pull/2599#issuecomment-1427120271

   @Leon-WTF Thanks for this.   @vvysotskyi Thanks for the review.  Merging now!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [drill] cgivre merged pull request #2599: DRILL-4232: Support for EXCEPT and INTERSECT set operator

Posted by "cgivre (via GitHub)" <gi...@apache.org>.
cgivre merged PR #2599:
URL: https://github.com/apache/drill/pull/2599


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@drill.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org