You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hive.apache.org by "Lunagariya, Dhaval " <dh...@citi.com> on 2016/06/08 12:22:11 UTC
Need Your Inputs For Below Scenario
Hey folks,
Need your help.
Input Table1:
Column1
Column2
Column3
Column4
Column5
Column6
Column7
A
B1
B2
B3(NULL)
B4
B5
B6
Input Table2:
Column1
Column2
B1
D1
B2
D2
B3
D3
B4
D4
B5
D5
B6
D6
Output:
Column1
Column2
Column3
A
B1
D1
A
B2
D2
A
B4
D4
A
B5
D5
A
B6
D6
Here B3 is skipped because B3 is NULL.
What is the efficient way to get above result using Hive?
Regards,
Dhaval
RE: Need Your Inputs For Below Scenario
Posted by "Markovitz, Dudu" <dm...@paypal.com>.
Explode + joins
----------------------------------------------------------------------------------------------------
-- bash
----------------------------------------------------------------------------------------------------
mkdir t1
mkdir t2
cat>t1/data.txt
A B1 B2 B4 B5 B6
cat>t2/data.txt
B1 D1
B2 D2
B3 D3
B4 D4
B5 D5
B6 D6
hdfs dfs -put t1 t2 /tmp
----------------------------------------------------------------------------------------------------
-- hive
----------------------------------------------------------------------------------------------------
create external table t1
(
Column1 string
,Column2 string
,Column3 string
,Column4 string
,Column5 string
,Column6 string
,Column7 string
)
row format delimited
fields terminated by '\t'
location '/tmp/t1'
;
create external table t2
(
Column1 string
,Column2 string
)
row format delimited
fields terminated by '\t'
location '/tmp/t2'
;
Theoretically I would have written the query like this -
select t1.Column1
,t1_unpivot.val
,t2.Column2
from t1
lateral view explode (array(Column2,Column3,Column4,Column5,Column6,Column7)) t1_unpivot as val
join t2
on t2.Column1 =
t1_unpivot.val
;
Unfortunately, this syntax is not supported
FAILED: SemanticException [Error 10085]: Line 7:32 JOIN with a LATERAL VIEW is not supported 'val'
As a work-around I'm nesting the "lateral view'
select t1.Column1
,t1.val
,t2.Column2
from (select t1.Column1
,t1_unpivot.val
from t1
lateral view explode (array(Column2,Column3,Column4,Column5,Column6,Column7)) t1_unpivot as val
)
as t1
join t2
on t2.Column1 =
t1.val
;
A B1 D1
A B2 D2
A B4 D4
A B5 D5
A B6 D6
From: Lunagariya, Dhaval [mailto:dhaval.lunagariya@citi.com]
Sent: Wednesday, June 08, 2016 6:25 PM
To: 'user@hive.apache.org' <us...@hive.apache.org>
Cc: 'er.dcpatel@gmail.com' <er...@gmail.com>
Subject: RE: Need Your Inputs For Below Scenario
Here Table2 is very large table and contains lakhs of rows.
From: Lunagariya, Dhaval [CCC-OT]
Sent: Wednesday, June 08, 2016 5:52 PM
To: user@hive.apache.org<ma...@hive.apache.org>
Subject: Need Your Inputs For Below Scenario
Hey folks,
Need your help.
Input Table1:
Column1
Column2
Column3
Column4
Column5
Column6
Column7
A
B1
B2
B3(NULL)
B4
B5
B6
Input Table2:
Column1
Column2
B1
D1
B2
D2
B3
D3
B4
D4
B5
D5
B6
D6
Output:
Column1
Column2
Column3
A
B1
D1
A
B2
D2
A
B4
D4
A
B5
D5
A
B6
D6
Here B3 is skipped because B3 is NULL.
What is the efficient way to get above result using Hive?
Regards,
Dhaval
RE: Need Your Inputs For Below Scenario
Posted by "Lunagariya, Dhaval " <dh...@citi.com>.
Here Table2 is very large table and contains lakhs of rows.
From: Lunagariya, Dhaval [CCC-OT]
Sent: Wednesday, June 08, 2016 5:52 PM
To: user@hive.apache.org
Subject: Need Your Inputs For Below Scenario
Hey folks,
Need your help.
Input Table1:
Column1
Column2
Column3
Column4
Column5
Column6
Column7
A
B1
B2
B3(NULL)
B4
B5
B6
Input Table2:
Column1
Column2
B1
D1
B2
D2
B3
D3
B4
D4
B5
D5
B6
D6
Output:
Column1
Column2
Column3
A
B1
D1
A
B2
D2
A
B4
D4
A
B5
D5
A
B6
D6
Here B3 is skipped because B3 is NULL.
What is the efficient way to get above result using Hive?
Regards,
Dhaval