Posted to dev@tez.apache.org by Rohini Palaniswamy <ro...@gmail.com> on 2014/03/08 08:41:09 UTC

Fwd: [jira] [Commented] (TEZ-917) NPE when executing running via a custom edge

Hi,
   Could you guys tell us what the Hive team is using custom edges for?

Regards,
Rohini

---------- Forwarded message ----------
From: Siddharth Seth (JIRA) <ji...@apache.org>
Date: Thu, Mar 6, 2014 at 10:24 AM
Subject: [jira] [Commented] (TEZ-917) NPE when executing running via a
custom edge
To: issues@tez.incubator.apache.org



    [
https://issues.apache.org/jira/browse/TEZ-917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922841#comment-13922841]

Siddharth Seth commented on TEZ-917:
------------------------------------

Scratch that. Looking at the trace, this is likely a race in the Hive
custom edge plugin.

> NPE when executing running via a custom edge
> --------------------------------------------
>
>                 Key: TEZ-917
>                 URL: https://issues.apache.org/jira/browse/TEZ-917
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>
> Reported by [~vikram.dixit]. Likely a race in event routing.
> {code}
> java.lang.NullPointerException
>   at org.apache.hadoop.hive.ql.exec.tez.CustomPartitionEdge.getNumSourceTaskPhysicalOutputs(CustomPartitionEdge.java:55)
>   at org.apache.tez.dag.app.dag.impl.Edge.getSourceSpec(Edge.java:183)
>   at org.apache.tez.dag.app.dag.impl.VertexImpl.getOutputSpecList(VertexImpl.java:2371)
>   at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.createRemoteTaskSpec(TaskAttemptImpl.java:518)
>   at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransition.transition(TaskAttemptImpl.java:1038)
>   at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl$ScheduleTaskattemptTransition.transition(TaskAttemptImpl.java:1027)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>   at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>   at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>   at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:721)
>   at org.apache.tez.dag.app.dag.impl.TaskAttemptImpl.handle(TaskAttemptImpl.java:105)
>   at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:1432)
>   at org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher.handle(DAGAppMaster.java:1417)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:173)
>   at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:106)
>   at java.lang.Thread.run(Thread.java:695)
> 2014-03-03 14:55:56,519 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Exiting, bbye
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Re: [jira] [Commented] (TEZ-917) NPE when executing running via a custom edge

Posted by Siddharth Seth <ss...@apache.org>.
Minimal details - Vikram / Gunther should be able to provide more.
At the moment Hive is using this to implement Bucketed Map Joins, where
one side of the join does not need to be pre-bucketed.

A simple 2 table example:
Table 1 is pre-bucketed.
Table 2 is not - so it will be bucketed dynamically during execution.

Table 1 determines the number of tasks, and the distribution of work to
individual tasks. A single bucket may span multiple tasks. Depending on
the task distribution, buckets generated by Table 2 are routed to the
correct set of tasks (belonging to the appropriate bucket). Custom
Edge/VertexManagers are used since this isn't a standard routing pattern.
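
The routing described above can be sketched roughly as follows. This is
only an illustration of the idea, not Hive's CustomPartitionEdge or any
Tez API: the class BucketRoutingSketch, its methods, and the
hashCode-modulo bucketing are all assumptions made up for the example
(Hive's real bucketing hash may differ).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch; not part of Hive or Tez. */
final class BucketRoutingSketch {
    /** Bucket id -> indices of the tasks that process that bucket of Table 1. */
    private final Map<Integer, List<Integer>> bucketToTasks = new TreeMap<>();
    private final int numBuckets;

    BucketRoutingSketch(int numBuckets) {
        if (numBuckets <= 0) {
            throw new IllegalArgumentException("numBuckets must be positive");
        }
        this.numBuckets = numBuckets;
    }

    /** Records that a task reads (part of) the given pre-bucketed Table 1 bucket. */
    void assignTask(int bucket, int taskIndex) {
        if (bucket < 0 || bucket >= numBuckets) {
            throw new IllegalArgumentException("bucket out of range: " + bucket);
        }
        bucketToTasks.computeIfAbsent(bucket, b -> new ArrayList<>()).add(taskIndex);
    }

    /** Dynamically buckets a Table 2 join key (assumed scheme: hashCode modulo bucket count). */
    int bucketFor(Object joinKey) {
        return Math.floorMod(joinKey.hashCode(), numBuckets);
    }

    /** All tasks that must receive Table 2 data for this bucket; may be more than one. */
    List<Integer> destinationTasks(int bucket) {
        List<Integer> tasks = bucketToTasks.get(bucket);
        if (tasks == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(new ArrayList<>(tasks));
    }
}
```

With two buckets, where bucket 0 spans tasks 0 and 1 and bucket 1 is
handled by task 2, a Table 2 row whose key falls into bucket 0 has to be
delivered to both task 0 and task 1. That fan-out to a bucket-dependent
set of tasks is what makes this differ from a plain scatter-gather or
broadcast pattern.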

Thanks
- Sid

