You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2014/11/07 00:07:36 UTC

[jira] [Created] (DRILL-1649) JSON : Joining 2 sub-queries (one of them uses flatten) fails with "Hash Join doe not support schema changes"

Rahul Challapalli created DRILL-1649:
----------------------------------------

             Summary: JSON : Joining 2 sub-queries (one of them uses flatten) fails with "Hash Join doe not support schema changes" 
                 Key: DRILL-1649
                 URL: https://issues.apache.org/jira/browse/DRILL-1649
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill, Storage - JSON
            Reporter: Rahul Challapalli


git.commit.id.abbrev=60aa446

I am running this test against Jason's branch which has some fixes to a few flatten issues.  

The below query fails 
{code}
select event_info.uid, transaction_info.trans_id, event_info.event.evnt_id
from (
 select userinfo.transaction.trans_id trans_id, max(userinfo.event.event_time) max_event_time
 from (
     select uid, flatten(events) event, flatten(transactions) transaction from `json_kvgenflatten/single-user-transactions.json`
 ) userinfo
 where userinfo.transaction.trans_time >= userinfo.event.event_time
 group by userinfo.transaction.trans_id
) transaction_info
inner join
(
 select uid, flatten(events) event
 from `json_kvgenflatten/single-user-transactions.json`
) event_info
on transaction_info.max_event_time = event_info.event.event_time;
{code}

The problem still persists even if I create views on top of each sub-query and the join them

{code}
create view v1 as 
select userinfo.transaction.trans_id trans_id, max(userinfo.event.event_time) max_event_time
 from (
     select uid, flatten(events) event, flatten(transactions) transaction from `json_kvgenflatten/single-user-transactions.json`
 ) userinfo 
 where userinfo.transaction.trans_time >= userinfo.event.event_time 
 group by userinfo.transaction.trans_id;
 
create view v2 as select uid, flatten(events) event
 from `json_kvgenflatten/single-user-transactions.json`;
 
select v2.uid, v1.trans_id, v2.event.evnt_id
from v1 inner join v2 
on v1.max_event_time = v2.event.event_time;
{code}

However if I create 2 files with the exact data from the outputs of the 2 sub-queries and try to join them the everything works fine.

I attached the data, and the error log files. Let me know if you need anything



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)