You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Harish Butani <rh...@gmail.com> on 2013/11/20 19:04:22 UTC

Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/
-----------------------------------------------------------

Review request for hive and Ashutosh Chauhan.


Bugs: HIVE-5614
    https://issues.apache.org/jira/browse/HIVE-5614


Repository: hive-git


Description
-------

support for subquery predicates in having clause. SubTask of HIVE-784


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
  ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
  ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/15718/diff/


Testing
-------

added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q


Thanks,

Harish Butani


Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

Posted by Ashutosh Chauhan <ha...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/#review29328
-----------------------------------------------------------

Ship it!


Ship It!

- Ashutosh Chauhan


On Nov. 23, 2013, 12:14 a.m., Harish Butani wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15718/
> -----------------------------------------------------------
> 
> (Updated Nov. 23, 2013, 12:14 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-5614
>     https://issues.apache.org/jira/browse/HIVE-5614
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> support for subquery predicates in having clause. SubTask of HIVE-784
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
>   ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
>   ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15718/diff/
> 
> 
> Testing
> -------
> 
> added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q
> 
> 
> Thanks,
> 
> Harish Butani
> 
>


Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

Posted by Harish Butani <rh...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/
-----------------------------------------------------------

(Updated Nov. 23, 2013, 12:14 a.m.)


Review request for hive and Ashutosh Chauhan.


Changes
-------

Thanks for the feedback.
attempted to address all the feedback.


Bugs: HIVE-5614
    https://issues.apache.org/jira/browse/HIVE-5614


Repository: hive-git


Description
-------

support for subquery predicates in having clause. SubTask of HIVE-784


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
  ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
  ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
  ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
  ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
  ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/15718/diff/


Testing
-------

added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q


Thanks,

Harish Butani


Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

Posted by Harish Butani <rh...@gmail.com>.

> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 131
> > <https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line131>
> >
> >     It will be good to add a comment about various fields in Conjunct class.

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 138
> > <https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line138>
> >
> >     Can this constructor be package protected instead of public ?

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 203
> > <https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line203>
> >
> >     It will be good to add a comment, how behavior of ConjunctAnalyzer changes when forHavingClause = true instead of false.

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java, line 298
> > <https://reviews.apache.org/r/15718/diff/1/?file=388900#file388900line298>
> >
> >     Should this exception needs to be propagated up the stack. At the least, we should have LOG.warn() message here.

this is not an error. This is only a check if the expression is in the OuterQuery RR


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1916
> > <https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1916>
> >
> >     It will be good to add a comment here along the lines of
> >      there could be a subq in having clause, if so we need to generate subq plan followed by semi-join.

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java, line 1926
> > <https://reviews.apache.org/r/15718/diff/1/?file=388901#file388901line1926>
> >
> >     It will be good to add a comment how this new boolean changes behavior of this method.

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/subquery_in_having.q, line 25
> > <https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line25>
> >
> >     It will be good to add a test which has a subq in both where clause as well as having clause

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/subquery_in_having.q, line 63
> > <https://reviews.apache.org/r/15718/diff/1/?file=388903#file388903line63>
> >
> >     Same comment w.r.t map-join on. Also, if we support over clause in subq, it will be good to have a test for that.

done


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/queries/clientpositive/subquery_notexists_having.q, line 46
> > <https://reviews.apache.org/r/15718/diff/1/?file=388904#file388904line46>
> >
> >     It will be good to add a negative test where subq and outer query both uses same table alias. It seems in such cases we may generate incorrect results, so we should disable those.

this is not just for having. Added a jira HIVE-5874 to give a better error for this case.


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/subquery_in_having.q.out, line 57
> > <https://reviews.apache.org/r/15718/diff/1/?file=388907#file388907line57>
> >
> >     In this plan, we are first computing outq, then subq and then doing left semi-join on resultset of those two. As we discussed efficient way for this is to push filter conditions in subq to outer query to cut-down the output generated by outq. Though, I am not sure whether its better to do it in optimizer phase via Transformer or right here. Either ways, I think thats an optimization which we can do as a follow-up.

agreed


> On Nov. 22, 2013, 7:02 p.m., Ashutosh Chauhan wrote:
> > ql/src/test/results/clientpositive/subquery_notexists_having.q.out, line 149
> > <https://reviews.apache.org/r/15718/diff/1/?file=388909#file388909line149>
> >
> >     First expression in this filter is redundant. Thats not strictly required. However, since there is an active work going on for constant folding optimization, this may get optimized way via that optimization. Either way, this can be done in follow-up.

(1=1) is added as a placeholder. Yes it should be removed.


- Harish


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/#review29298
-----------------------------------------------------------


On Nov. 23, 2013, 12:14 a.m., Harish Butani wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15718/
> -----------------------------------------------------------
> 
> (Updated Nov. 23, 2013, 12:14 a.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-5614
>     https://issues.apache.org/jira/browse/HIVE-5614
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> support for subquery predicates in having clause. SubTask of HIVE-784
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
>   ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
>   ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15718/diff/
> 
> 
> Testing
> -------
> 
> added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q
> 
> 
> Thanks,
> 
> Harish Butani
> 
>


Re: Review Request 15718: HIVE-5614: Subquery support: allow subquery expressions in having clause

Posted by Ashutosh Chauhan <ha...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15718/#review29298
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
<https://reviews.apache.org/r/15718/#comment56454>

    It will be good to add a comment about various fields in Conjunct class.



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
<https://reviews.apache.org/r/15718/#comment56455>

    Can this constructor be package protected instead of public ?



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
<https://reviews.apache.org/r/15718/#comment56456>

    protected instead of public?



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
<https://reviews.apache.org/r/15718/#comment56457>

    It will be good to add a comment, how behavior of ConjunctAnalyzer changes when forHavingClause = true instead of false.



ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java
<https://reviews.apache.org/r/15718/#comment56458>

    Should this exception needs to be propagated up the stack. At the least, we should have LOG.warn() message here.



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/15718/#comment56452>

    It will be good to add a comment here along the lines of
     there could be a subq in having clause, if so we need to generate subq plan followed by semi-join.



ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
<https://reviews.apache.org/r/15718/#comment56459>

    It will be good to add a comment how this new boolean changes behavior of this method.



ql/src/test/queries/clientpositive/subquery_in_having.q
<https://reviews.apache.org/r/15718/#comment56453>

    It will be good to add a test which has a subq in both where clause as well as having clause



ql/src/test/queries/clientpositive/subquery_in_having.q
<https://reviews.apache.org/r/15718/#comment56448>

    Same comment w.r.t map-join on. Also, if we support over clause in subq, it will be good to have a test for that.



ql/src/test/queries/clientpositive/subquery_notexists_having.q
<https://reviews.apache.org/r/15718/#comment56449>

    It will be good to add a negative test where subq and outer query both uses same table alias. It seems in such cases we may generate incorrect results, so we should disable those.



ql/src/test/results/clientpositive/subquery_in_having.q.out
<https://reviews.apache.org/r/15718/#comment56450>

    In this plan, we are first computing outq, then subq and then doing left semi-join on resultset of those two. As we discussed efficient way for this is to push filter conditions in subq to outer query to cut-down the output generated by outq. Though, I am not sure whether its better to do it in optimizer phase via Transformer or right here. Either ways, I think thats an optimization which we can do as a follow-up.



ql/src/test/results/clientpositive/subquery_notexists_having.q.out
<https://reviews.apache.org/r/15718/#comment56451>

    First expression in this filter is redundant. Thats not strictly required. However, since there is an active work going on for constant folding optimization, this may get optimized way via that optimization. Either way, this can be done in follow-up.


- Ashutosh Chauhan


On Nov. 20, 2013, 6:04 p.m., Harish Butani wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15718/
> -----------------------------------------------------------
> 
> (Updated Nov. 20, 2013, 6:04 p.m.)
> 
> 
> Review request for hive and Ashutosh Chauhan.
> 
> 
> Bugs: HIVE-5614
>     https://issues.apache.org/jira/browse/HIVE-5614
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> support for subquery predicates in having clause. SubTask of HIVE-784
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QB.java fa111cc 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/QBSubQuery.java 3e8215d 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 7979873 
>   ql/src/test/queries/clientpositive/subquery_exists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_in_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notexists_having.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/subquery_notin_having.q PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_exists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_in_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_multiinsert.q.out 8dfb485 
>   ql/src/test/results/clientpositive/subquery_notexists_having.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/subquery_notin_having.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/15718/diff/
> 
> 
> Testing
> -------
> 
> added new tests: subquery_in_having.q, subquery_notin_having.q, subquery_exists_having.q, subquery_notexists_having.q
> 
> 
> Thanks,
> 
> Harish Butani
> 
>