Posted to dev@impala.apache.org by "Feng, Guangyuan" <gu...@intel.com> on 2016/06/15 09:33:31 UTC

Investigation of IMPALA-3678

Hi,

I'm investigating IMPALA-3678 and would like to work on it. Some findings follow:

[TEST SQL]:
select xx.o_orderkey from (
(select o_orderkey from orders x order by o_orderkey desc limit 15)
union
(select o_orderkey from tpch_parquet.orders_pq x order by o_orderkey desc limit 15)) xx
left join lineitem l
on xx.o_orderkey = l.l_orderkey where xx.o_orderkey > 0

This bug appears in the phase that creates the plan tree. More precisely, while the UNION node is being created,
the predicate (xx.o_orderkey > 0), owned by the LEFT JOIN clause, is propagated to the UNION clause;
at the same time, a cloned predicate is added to the global conjuncts. Then, at SortNode.java:92,
assignConjuncts(analyzer) can match the new predicate and add it to conjuncts_, so the next statement,
Preconditions.checkState(conjuncts_.isEmpty()), fails.
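
The failure mode described above can be pictured with a small toy model. This is only a sketch under stated assumptions: ToyAnalyzer, ToySortNode and initFails are hypothetical names invented for illustration, and none of this is Impala's planner code. It shows only the pattern "collect every unassigned conjunct, then require that none was collected", which trips as soon as a cloned predicate has been registered globally.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these classes are hypothetical and are not Impala's planner code.
class ConjunctAssignmentSketch {
  /** Stand-in for a global list of conjuncts that no plan node owns yet. */
  static class ToyAnalyzer {
    final List<String> unassigned = new ArrayList<>();

    void register(String conjunct) {
      unassigned.add(conjunct);
    }
  }

  /** Stand-in for a sort node that must never end up owning a conjunct. */
  static class ToySortNode {
    final List<String> conjuncts = new ArrayList<>();

    // Mimics the pattern from the message: pick up every unassigned conjunct,
    // then insist that nothing was picked up.
    void init(ToyAnalyzer analyzer) {
      conjuncts.addAll(analyzer.unassigned);
      analyzer.unassigned.clear();
      if (!conjuncts.isEmpty()) {
        throw new IllegalStateException("sort node received conjuncts: " + conjuncts);
      }
    }
  }

  /** Returns true if initializing the toy sort node trips the state check. */
  static boolean initFails(boolean clonedPredicateRegistered) {
    ToyAnalyzer analyzer = new ToyAnalyzer();
    if (clonedPredicateRegistered) {
      // Plays the role of the clone created while the predicate is propagated.
      analyzer.register("o_orderkey > 0");
    }
    try {
      new ToySortNode().init(analyzer);
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println("without clone, check fails: " + initFails(false));
    System.out.println("with clone, check fails: " + initFails(true));
  }
}
```

In this toy, the check passes while the global list is empty and fails once the cloned predicate is registered, which is the same shape as the checkState failure reported for the test SQL.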

My rough idea for a fix is to add the LEFT JOIN predicates to globalState_.ojClauseByConjunct, or to do some other tricks in
assignConjuncts(analyzer).

Could anyone help clarify whether this is going in the right direction?
Thanks.

Re: Investigation of IMPALA-3678

Posted by Alex Behm <al...@cloudera.com>.
Discussion is being continued on the JIRA/CR.

On Fri, Jun 17, 2016 at 9:28 AM, Dimitris Tsirogiannis <
dtsirogiannis@cloudera.com> wrote:

> You're more than welcome to submit a patch with your tricks. When Alex or
> Marcel get back from their PTO, they can go through the code review and see
> if it is the right solution.
>
> Thanks
> Dimitris
>
> On Fri, Jun 17, 2016 at 12:59 AM, Feng, Guangyuan <
> guangyuan.feng@intel.com> wrote:
>
>> Thanks for your answering.
>>
>> I'm afraid this simpler case's processing behavior is much different from
>> the mixed SQL with UNION and LEFT JOIN,
>> because it will pass if (!canMigrateConjuncts(inlineViewRef)), but the
>> mixed won't. I must make sure these difference
>> would not impact on too much.
>>
>> Actually, the test SQL works well, if LEFT JOIN was replaced with RIGHT
>> JOIN, and it won't pass IF condition to do migration,
>> so will the one with LEFT JOIN. So, I think the new test SQL are more
>> worth to explore.
>>
>> Whatever, I have examined them all, and now, I still stand on the side of
>> my proposal to do some tricks in globalState_ or Analyzer.
>> getUnassignedConjuncts(...).
>>
>> What do you think?
>>
>> Thanks.
>> Feng, Guangyuan
>>
>> -----Original Message-----
>> From: Dimitris Tsirogiannis [mailto:dtsirogiannis@cloudera.com]
>> Sent: Thursday, June 16, 2016 5:41 AM
>> To: Tim Armstrong <ta...@cloudera.com>
>> Cc: dev@impala.incubator.apache.org; Alex Behm <al...@cloudera.com>;
>> mmokhtar@cloudera.com; Wang, Youwei A <yo...@intel.com>; Zheng,
>> Kai <ka...@intel.com>
>> Subject: Re: Investigation of IMPALA-3678
>>
>> I am not sure I understand the proposed solution but let me explain what
>> the problem is here. The problem at a high level is that the predicate
>> xx.o_orderkey > 0 cannot be pushed to the operands of the union statement
>> because of the order by limit clause. You may want to see how the planner
>> handles a simpler case (e.g. explain select * from (select a, b from foo
>> order by b limit 1) x where x.b > 1) and get some idea of how this can be
>> solved.
>>
>> Thanks
>> Dimitris
>>
>> On Wed, Jun 15, 2016 at 1:11 PM, Tim Armstrong <ta...@cloudera.com>
>> wrote:
>>
>> > Dimitris and Alex are the local experts on that code, so maybe they
>> > will have a better idea of what the correct solution is.
>> >
>> >
>> > On Wed, Jun 15, 2016 at 2:33 AM, Feng, Guangyuan
>> > <guangyuan.feng@intel.com
>> > > wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm investigating the issue of IMPALA-3678, and wanted to work on it.
>> >> Some findings as follow:
>> >>
>> >> [TEST SQL]:
>> >> select xx.o_orderkey from (
>> >> (select o_orderkey from orders x order by o_orderkey desc limit 15)
>> >> union (select o_orderkey from tpch_parquet.orders_pq x order by
>> >> o_orderkey desc limit 15)) xx left join lineitem l on xx.o_orderkey =
>> >> l.l_orderkey where xx.o_orderkey > 0
>> >>
>> >> This bug appears in the phrase of creating the plan tree. More
>> >> precisely, while creating UNION node, the predicate(xx.o_orderkey >
>> >> 0), owned by LEFT JOIN clause, will be propagated to UNION clause, at
>> >> the same time, a cloned predicate will be added into the global
>> >> conjuncts. Then, on line SortNode.java:92,
>> >> assignConjuncts(analyzer) could match the new predicate and add it to
>> >> conjuncts_, obviously, the next statement
>> >> Preconditions.checkState(conjuncts_.isEmpty()) fails.
>> >>
>> >> My rough idea to fix it is, add LEFT JOIN predicates into
>> >> globalState_.ojClauseByConjunct, or do some other tricks in
>> >> assignConjuncts(analyzer).
>> >>
>> >> Would anyone help clarify if it's not going in the right way?
>> >> Thanks.
>> >>
>> >
>> >
>>
>
>

Re: Investigation of IMPALA-3678

Posted by Dimitris Tsirogiannis <dt...@cloudera.com>.
You're more than welcome to submit a patch with your tricks. When Alex or
Marcel get back from their PTO, they can go through the code review and see
if it is the right solution.

Thanks
Dimitris

RE: Investigation of IMPALA-3678

Posted by "Feng, Guangyuan" <gu...@intel.com>.
Thanks for your answer.

I'm afraid the simpler case is processed quite differently from the mixed SQL with UNION and LEFT JOIN,
because it passes the check if (!canMigrateConjuncts(inlineViewRef)), but the mixed one doesn't. I must make sure this difference
does not have too much impact.

Actually, the test SQL works well if LEFT JOIN is replaced with RIGHT JOIN, and it does not pass the IF condition for migration,
the same as the one with LEFT JOIN. So I think the new test SQL is more worth exploring.

In any case, I have examined them all, and I still stand by my proposal to do some tricks in globalState_ or Analyzer.getUnassignedConjuncts(...).

What do you think?

Thanks.
Feng, Guangyuan

Re: Investigation of IMPALA-3678

Posted by Dimitris Tsirogiannis <dt...@cloudera.com>.
I am not sure I understand the proposed solution but let me explain what
the problem is here. The problem at a high level is that the predicate
xx.o_orderkey > 0 cannot be pushed to the operands of the union statement
because of the order by limit clause. You may want to see how the planner
handles a simpler case (e.g. explain select * from (select a, b from foo
order by b limit 1) x where x.b > 1) and get some idea of how this can be
solved.
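
To make the point about ORDER BY ... LIMIT concrete, here is a minimal sketch using the simpler example above (order by b limit 1, then b > 1). It is a toy under stated assumptions: plain Java lists stand in for query operators, and the class and method names are hypothetical, not planner code. It shows that filtering after the limit and filtering before the limit can return different rows, which is why the predicate cannot simply be pushed below the limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy illustration only; plain Java lists stand in for query operators.
class TopNFilterOrderSketch {
  /** Models ORDER BY b LIMIT limit (ascending order). */
  static List<Integer> orderByLimit(List<Integer> rows, int limit) {
    List<Integer> sorted = new ArrayList<>(rows);
    Collections.sort(sorted);
    return new ArrayList<>(sorted.subList(0, Math.min(limit, sorted.size())));
  }

  /** Models WHERE b > bound. */
  static List<Integer> whereGreaterThan(List<Integer> rows, int bound) {
    List<Integer> out = new ArrayList<>();
    for (int b : rows) {
      if (b > bound) {
        out.add(b);
      }
    }
    return out;
  }

  /** What the query asks for: apply the limit first, then the predicate. */
  static List<Integer> filterAfterLimit(List<Integer> rows, int limit, int bound) {
    return whereGreaterThan(orderByLimit(rows, limit), bound);
  }

  /** What pushing the predicate below the limit would compute instead. */
  static List<Integer> filterBeforeLimit(List<Integer> rows, int limit, int bound) {
    return orderByLimit(whereGreaterThan(rows, bound), limit);
  }

  public static void main(String[] args) {
    List<Integer> b = Arrays.asList(3, 1, 2);
    System.out.println(filterAfterLimit(b, 1, 1));  // prints []
    System.out.println(filterBeforeLimit(b, 1, 1)); // prints [2]
  }
}
```

With b values 3, 1, 2, the limit keeps only the row with b = 1, which the predicate then rejects, so the correct result is empty; pushing the predicate down would instead return the row with b = 2.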

Thanks
Dimitris

Re: Investigation of IMPALA-3678

Posted by Tim Armstrong <ta...@cloudera.com>.
Dimitris and Alex are the local experts on that code, so maybe they will
have a better idea of what the correct solution is.
