Posted to issues@tez.apache.org by "Bikas Saha (JIRA)" <ji...@apache.org> on 2014/07/06 22:05:34 UTC

[jira] [Commented] (TEZ-1257) Error on empty partition when using OnFileUnorderedKVOutput and ShuffledMergedInput

    [ https://issues.apache.org/jira/browse/TEZ-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053215#comment-14053215 ] 

Bikas Saha commented on TEZ-1257:
---------------------------------

I am wondering whether OnFileUnorderedKVOutput and ShuffledMergedInput are a correct/efficient combination. If the output is not sorted, what happens in the merge phase of the input, where it tries to sort-merge input chunks that are expected to be sorted? Should we be using ShuffledUnorderedKVInput when we use OnFileUnorderedKVOutput?
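
To make the concern concrete, here is a minimal sketch of a generic k-way merge. It is not the Tez merger; the class and method names (MergeSketch, kWayMerge) are made up for illustration. It only shows that a merge which assumes each chunk is already sorted will emit unsorted output when that assumption does not hold.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical illustration only; this is NOT the merge code used by ShuffledMergedInput.
public class MergeSketch {

    // Generic k-way merge: repeatedly emits the smallest *current head* among the runs.
    // The result is globally sorted only if every run is itself sorted.
    static List<Integer> kWayMerge(List<List<Integer>> runs) {
        // Each heap entry is {value, runIndex, positionInRun}, ordered by value.
        PriorityQueue<int[]> heap =
            new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        for (int r = 0; r < runs.size(); r++) {
            if (!runs.get(r).isEmpty()) {
                heap.add(new int[] {runs.get(r).get(0), r, 0});
            }
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] head = heap.poll();
            out.add(head[0]);
            List<Integer> run = runs.get(head[1]);
            int next = head[2] + 1;
            if (next < run.size()) {
                // Advance within the same run; later elements are assumed to be >= head.
                heap.add(new int[] {run.get(next), head[1], next});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sorted runs: the merge yields a sorted result.
        System.out.println(kWayMerge(Arrays.asList(
            Arrays.asList(1, 4, 7), Arrays.asList(2, 3, 9))));
        // Unsorted runs (as an unordered output might produce): the result is not sorted.
        System.out.println(kWayMerge(Arrays.asList(
            Arrays.asList(7, 1, 4), Arrays.asList(3, 9, 2))));
    }
}
```

With sorted runs the sketch prints [1, 2, 3, 4, 7, 9]; with the unsorted runs it prints [3, 7, 1, 4, 9, 2]. Whether the real merged input fails, silently misorders, or just does wasted work in this situation is exactly the open question.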

> Error on empty partition when using OnFileUnorderedKVOutput and ShuffledMergedInput
> -----------------------------------------------------------------------------------
>
>                 Key: TEZ-1257
>                 URL: https://issues.apache.org/jira/browse/TEZ-1257
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rohini Palaniswamy
>
> Encountering exception
> {code}
> org.apache.tez.dag.api.TezUncheckedException: Path component must start with: attempt InputAttemptIdentifier [inputIdentifier=InputIdentifier [inputIndex=0], attemptNumber=0, pathComponent=]
>         at org.apache.tez.runtime.library.common.InputAttemptIdentifier.<init>(InputAttemptIdentifier.java:45)
>         at org.apache.tez.runtime.library.common.InputAttemptIdentifier.<init>(InputAttemptIdentifier.java:51)
>         at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.processDataMovementEvent(ShuffleInputEventHandler.java:81)
>         at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvent(ShuffleInputEventHandler.java:66)
>         at org.apache.tez.runtime.library.common.shuffle.impl.ShuffleInputEventHandler.handleEvents(ShuffleInputEventHandler.java:59)
> {code}
> This is because UnorderedPartitionedKVWriter does not set the pathComponent when all partitions are empty:
> {code}
> if (emptyPartitions.cardinality() != numPartitions) {
>   // Populate payload only if at least 1 partition has data
>   payloadBuidler.setHost(host);
>   payloadBuidler.setPort(shufflePort);
>   payloadBuidler.setPathComponent(outputContext.getUniqueIdentifier());
> }
> {code}
> The combination of OnFileUnorderedKVOutput and ShuffledMergedInput otherwise works fine when there are no empty partitions.
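
The interaction described in the issue can be reproduced in isolation with a small stand-alone sketch. It does not use any Tez classes: EmptyPartitionSketch, producePathComponent, validatePathComponent and the identifier "attempt_example_0" are hypothetical stand-ins. It assumes the unset payload field arrives at the consumer as an empty string, which matches the "pathComponent=" shown in the exception message.

```java
import java.util.BitSet;

// Hypothetical stand-alone illustration; none of these names are Tez APIs.
public class EmptyPartitionSketch {

    // Mirrors the quoted producer guard: the path component is populated
    // only if at least one partition has data.
    static String producePathComponent(BitSet emptyPartitions, int numPartitions,
                                       String uniqueId) {
        if (emptyPartitions.cardinality() != numPartitions) {
            return uniqueId;
        }
        // Assumption: an unset field reaches the consumer as an empty string.
        return "";
    }

    // Stand-in for the consumer-side check implied by the stack trace.
    static void validatePathComponent(String pathComponent) {
        if (pathComponent == null || !pathComponent.startsWith("attempt")) {
            throw new IllegalStateException(
                "Path component must start with: attempt, got '" + pathComponent + "'");
        }
    }

    public static void main(String[] args) {
        int numPartitions = 3;

        // One empty partition out of three: the path component is set and accepted.
        BitSet someEmpty = new BitSet(numPartitions);
        someEmpty.set(1);
        validatePathComponent(
            producePathComponent(someEmpty, numPartitions, "attempt_example_0"));
        System.out.println("partial data: accepted");

        // All partitions empty: the path component is left unset and the check fails.
        BitSet allEmpty = new BitSet(numPartitions);
        allEmpty.set(0, numPartitions);
        try {
            validatePathComponent(
                producePathComponent(allEmpty, numPartitions, "attempt_example_0"));
        } catch (IllegalStateException e) {
            System.out.println("all empty: rejected (" + e.getMessage() + ")");
        }
    }
}
```

In this sketch the failure occurs only when every partition is empty, since that is the only case in which the guard skips setPathComponent. A fix would need either the producer to always set the path component or the consumer to tolerate its absence for empty outputs; the sketch does not suggest which the project should choose.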



--
This message was sent by Atlassian JIRA
(v6.2#6252)