You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tez.apache.org by "Cyrille Chépélov (JIRA)" <ji...@apache.org> on 2015/05/29 09:47:18 UTC

[jira] [Commented] (TEZ-2505) PipelinedSorter attempts to compare keys from EOF'd streams

    [ https://issues.apache.org/jira/browse/TEZ-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14564361#comment-14564361 ] 

Cyrille Chépélov commented on TEZ-2505:
---------------------------------------

the pre-JIRA discussion is here: https://groups.google.com/forum/#!topic/cascading-user/ve77GORr370 

> PipelinedSorter attempts to compare keys from EOF'd streams
> -----------------------------------------------------------
>
>                 Key: TEZ-2505
>                 URL: https://issues.apache.org/jira/browse/TEZ-2505
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>         Environment: Scalding 0.13.1+PR1220 ; Cascading-3.0.0-wip-118 ; scala 2.11.6 ; java openjdk 1.8.0_45-internal ; Debian linux 8 (stable); Intel(R) Core(TM) i7-3770 (amd64)
>            Reporter: Cyrille Chépélov
>
> When attempting to run the same multi-DAG application (that worked fine under the same environment except Cascading-3.0.0-wip-115 and tez 0.6.1), one of the early, and simplest DAG crashes on the PipelinedSorter.
> The stack at the crash site looks like:
> {code}
> 2015-05-28 11:52:47,120 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
> cascading.CascadingException: unable to compare stream elements in position: 0
>         at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:164)
>         at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:38)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:230)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:272)
>         at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:164)
>         at cascading.flow.tez.stream.element.OldOutputCollector.collect(OldOutputCollector.java:51)
>         at cascading.tap.hadoop.util.MeasuredOutputCollector.collect(MeasuredOutputCollector.java:69)
>         at cascading.flow.tez.stream.element.TezCoGroupGate.wrapGroupingAndCollect(TezCoGroupGate.java:193)
>         at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:103)
>         at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:45)
>         at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
>         at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
>         at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
>         at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:49)
>         at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:47)
>         at scala.collection.Iterator$class.foreach(Iterator.scala:750)
>         at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
>         at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:47)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
>         at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
>         at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
>         at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
>         at com.twitter.scalding.MapFunction.operate(Operations.scala:60)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
>         at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
>         at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
>         at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
>         at cascading.operation.Identity$1.operate(Identity.java:124)
>         at cascading.operation.Identity.operate(Identity.java:150)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
>         at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
>         at cascading.flow.stream.element.SourceStage.map(SourceStage.java:110)
>         at cascading.flow.stream.element.SourceStage.run(SourceStage.java:66)
>         at cascading.flow.tez.stream.element.TezSourceStage.run(TezSourceStage.java:95)
>         at cascading.flow.tez.FlowProcessor.run(FlowProcessor.java:165)
>         at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
>         at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
>         at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: cascading.CascadingException: unable to read element from underlying stream
>         at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:82)
>         at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:33)
>         at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:160)
>         ... 55 more
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:197)
>         at java.io.DataInputStream.readFully(DataInputStream.java:169)
>         at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:125)
>         at cascading.tuple.hadoop.io.HadoopTupleInputStream.readString(HadoopTupleInputStream.java:75)
>         at cascading.tuple.hadoop.io.HadoopTupleInputStream.readType(HadoopTupleInputStream.java:85)
>         at cascading.tuple.hadoop.io.HadoopTupleInputStream.getNextElement(HadoopTupleInputStream.java:52)
>         at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:77)
>         ... 57 more
> {code}
> with an apparently random variation at the top of stack which is
> {code}
> 2015-05-28 13:10:13,459 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
> cascading.CascadingException: java.io.EOFException
>         at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:42)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
>         at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
>         at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
>         at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
> {code}
> all running TezChildren fail with the same stacks (either variant) at the same time on the same node, which is straight off a HDFS-backed CSV file.
> The cascading.tuple.hadoop.util.TupleComparator#compare code at the top of stack has been in use in a MAPREDUCE context for over 2.5 years; first analysis with [~cwensel] (who successfully reproduced the issue without scalding) points towards an issue on tez side.
> as a workaround, it is possible to run with {code:scala}"tez.runtime.sorter.class" -> "LEGACY"{code}, but this is impractical in the long run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)