You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@hive.apache.org by "Feng Yuan (JIRA)" <ji...@apache.org> on 2015/11/30 03:29:11 UTC
[jira] [Commented] (HIVE-12530) Merge join in mutiple subsquence
join and a mapjoin in it in mr model
[ https://issues.apache.org/jira/browse/HIVE-12530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15031251#comment-15031251 ]
Feng Yuan commented on HIVE-12530:
----------------------------------
[~vikram.dixit],hi could u please look at this issue?
its similar to HIVE-9832,i think,thank you!
> Merge join in mutiple subsquence join and a mapjoin in it in mr model
> ---------------------------------------------------------------------
>
> Key: HIVE-12530
> URL: https://issues.apache.org/jira/browse/HIVE-12530
> Project: Hive
> Issue Type: Bug
> Components: CLI
> Affects Versions: 1.2.1
> Reporter: Feng Yuan
> Fix For: 2.00
>
>
> sample hql:
> select A.state_date,
> A.customer,
> A.channel_2,
> A.id,
> A.pid,
> A.type,
> A.pv,
> A.uv,
> A.visits,
> if(C.stay_visits is null,0,C.stay_visits) as stay_visits,
> A.stay_time,
> if(B.bounce is null,0,B.bounce) as bounce
> from
> (select a.state_date,
> a.customer,
> b.url as channel_2,
> b.id,
> b.pid,
> b.type,
> count(1) as pv,
> count(distinct a.gid) uv,
> count(distinct a.session_id) as visits,
> sum(a.stay_time) as stay_time
> from
> ( select state_date,
> customer,
> gid,
> session_id,
> ep,
> stay_time
> from bdi_fact.mid_pageview_dt0
> where l_date ='$v_date'
> )a
> join
> (select l_date as state_date ,
> url,
> id,
> pid,
> type,
> cid
> from bdi_fact.frequency_channel
> where l_date ='$v_date'
> and type ='2'
> and dr='0'
> )b
> on a.customer=b.cid
> where a.ep rlike b.url
> group by a.state_date, a.customer, b.url,b.id,b.pid,b.type
> )A
>
> left outer join
> ( select
> c.state_date ,
> c.customer ,
> d.url as channel_2,
> d.id,
> sum(pagedepth) as bounce
> from
> ( select
> t1.state_date ,
> t1.customer ,
> t1.session_id,
> t1.ep,
> t2.pagedepth
> from
> ( select
> state_date ,
> customer ,
> session_id,
> exit_url as ep
> from ods.mid_session_enter_exit_dt0
> where l_date ='$v_date'
> )t1
> join
> ( select
> state_date ,
> customer ,
> session_id,
> pagedepth
> from ods.mid_session_action_dt0
> where l_date ='$v_date'
> and pagedepth='1'
> )t2
> on t1.customer=t2.customer
> and t1.session_id=t2.session_id
> )c
> join
> (select *
> from bdi_fact.frequency_channel
> where l_date ='$v_date'
> and type ='2'
> and dr='0'
> )d
> on c.customer=d.cid
> where c.ep rlike d.url
> group by c.state_date,c.customer,d.url,d.id
> )B
> on
> A.customer=B.customer
> and A.channel_2=B.channel_2
> and A.id=B.id
> left outer join
> (
> select e.state_date,
> e.customer,
> f.url as channel_2,
> f.id,
> f.pid,
> f.type,
> count(distinct e.session_id) as stay_visits
> from
> ( select state_date,
> customer,
> gid,
> session_id,
> ep,
> stay_time
> from bdi_fact.mid_pageview_dt0
> where l_date ='$v_date'
> )e
> join
> (select l_date as state_date,
> url,
> id,
> pid,
> type,
> cid
> from bdi_fact.frequency_channel
> where l_date ='$v_date'
> and type ='2'
> and dr='0'
> )f
> on e.customer=f.cid
> where e.ep rlike f.url
> and e.stay_time is not null
> and e.stay_time <>'0'
> group by e.state_date, e.customer, f.url,f.id,f.pid,f.type
> )C
> on
> A.customer=C.customer
> and A.channel_2=C.channel_2
> and A.id=C.id
> and A.pid=C.pid
> and A.type=C.type
> where A.customer='Cdianyingwang' and A.channel_2='http://www.1905.com/film/filmnews/jk/' and A.id='127';"
> exception:
> 2015-11-26 15:10:53,607 Stage-14 map = 67%, reduce = 0%, Cumulative CPU 8.65 sec
> 2015-11-26 15:11:23,193 Stage-14 map = 100%, reduce = 0%, Cumulative CPU 8.65 sec
> MapReduce Total cumulative CPU time: 8 seconds 650 msec
> Ended Job = job_1448437287773_0686 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1448437287773_0686_m_000001 (and more) from job job_1448437287773_0686
> Task with the most failures(4):
> -----
> Task ID:
> task_1448437287773_0686_m_000001
> URL:
> http://0.0.0.0:8088/taskdetails.jsp?jobid=job_1448437287773_0686&tipid=task_1448437287773_0686_m_000001
> -----
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:232)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.exec.MapJoinOperator.closeOp(MapJoinOperator.java:317)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:598)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:610)
> at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:205)
> ... 8 more
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)