You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@tez.apache.org by "Cyrille Chépélov (JIRA)" <ji...@apache.org> on 2015/05/29 09:42:18 UTC
[jira] [Updated] (TEZ-2505) PipelinedSorter attempts to compare
keys from EOF'd streams
[ https://issues.apache.org/jira/browse/TEZ-2505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cyrille Chépélov updated TEZ-2505:
----------------------------------
Environment:
Scalding 0.13.1+PR1220 ; Cascading-3.0.0-wip-118 ; scala 2.11.6 ; java openjdk 1.8.0_45-internal ; Debian linux 8 (stable); Intel(R) Core(TM) i7-3770 (amd64)
The stack at the crash site looks like:
{code}
2015-05-28 11:52:47,120 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
cascading.CascadingException: unable to compare stream elements in position: 0
at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:164)
at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:38)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:230)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:272)
at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:164)
at cascading.flow.tez.stream.element.OldOutputCollector.collect(OldOutputCollector.java:51)
at cascading.tap.hadoop.util.MeasuredOutputCollector.collect(MeasuredOutputCollector.java:69)
at cascading.flow.tez.stream.element.TezCoGroupGate.wrapGroupingAndCollect(TezCoGroupGate.java:193)
at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:103)
at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:45)
at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:49)
at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:47)
at scala.collection.Iterator$class.foreach(Iterator.scala:750)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:47)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at com.twitter.scalding.MapFunction.operate(Operations.scala:60)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
at cascading.operation.Identity$1.operate(Identity.java:124)
at cascading.operation.Identity.operate(Identity.java:150)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
at cascading.flow.stream.element.SourceStage.map(SourceStage.java:110)
at cascading.flow.stream.element.SourceStage.run(SourceStage.java:66)
at cascading.flow.tez.stream.element.TezSourceStage.run(TezSourceStage.java:95)
at cascading.flow.tez.FlowProcessor.run(FlowProcessor.java:165)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: cascading.CascadingException: unable to read element from underlying stream
at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:82)
at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:33)
at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:160)
... 55 more
Caused by: java.io.EOFException
at java.io.DataInputStream.readFully(DataInputStream.java:197)
at java.io.DataInputStream.readFully(DataInputStream.java:169)
at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:125)
at cascading.tuple.hadoop.io.HadoopTupleInputStream.readString(HadoopTupleInputStream.java:75)
at cascading.tuple.hadoop.io.HadoopTupleInputStream.readType(HadoopTupleInputStream.java:85)
at cascading.tuple.hadoop.io.HadoopTupleInputStream.getNextElement(HadoopTupleInputStream.java:52)
at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:77)
... 57 more
{code}
with an apparently random variation at the top of stack which is
{code}
2015-05-28 13:10:13,459 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
cascading.CascadingException: java.io.EOFException
at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:42)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
{code}
all running TezChildren fail with the same stacks (either variant) at the same time on the same node, which is straight off a HDFS-backed CSV file.
The cascading.tuple.hadoop.util.TupleComparator#compare code at the top of stack has been in use in a MAPREDUCE context for over 2.5 years; first analysis with [~cwensel] (who successfully reproduced the issue without scalding) points towards an issue on tez side.
as a workaround, it is possible to run with {code:scala}"tez.runtime.sorter.class" -> "LEGACY"{code}, but this is impractical in the long run.
was:Scalding 0.13.1+PR1220 ; Cascading-3.0.0-wip-118 ; scala 2.11.6 ; java openjdk 1.8.0_45-internal ; Debian linux 8 (stable); Intel(R) Core(TM) i7-3770 (amd64)
> PipelinedSorter attempts to compare keys from EOF'd streams
> -----------------------------------------------------------
>
> Key: TEZ-2505
> URL: https://issues.apache.org/jira/browse/TEZ-2505
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Environment: Scalding 0.13.1+PR1220 ; Cascading-3.0.0-wip-118 ; scala 2.11.6 ; java openjdk 1.8.0_45-internal ; Debian linux 8 (stable); Intel(R) Core(TM) i7-3770 (amd64)
> The stack at the crash site looks like:
> {code}
> 2015-05-28 11:52:47,120 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
> cascading.CascadingException: unable to compare stream elements in position: 0
> at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:164)
> at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:38)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
> at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
> at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.sort(PipelinedSorter.java:230)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.collect(PipelinedSorter.java:311)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter.write(PipelinedSorter.java:272)
> at org.apache.tez.runtime.library.output.OrderedPartitionedKVOutput$1.write(OrderedPartitionedKVOutput.java:164)
> at cascading.flow.tez.stream.element.OldOutputCollector.collect(OldOutputCollector.java:51)
> at cascading.tap.hadoop.util.MeasuredOutputCollector.collect(MeasuredOutputCollector.java:69)
> at cascading.flow.tez.stream.element.TezCoGroupGate.wrapGroupingAndCollect(TezCoGroupGate.java:193)
> at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:103)
> at cascading.flow.hadoop.stream.HadoopGroupGate.receive(HadoopGroupGate.java:45)
> at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
> at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
> at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
> at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:49)
> at com.twitter.scalding.FlatMapFunction$$anonfun$operate$2.apply(Operations.scala:47)
> at scala.collection.Iterator$class.foreach(Iterator.scala:750)
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1202)
> at com.twitter.scalding.FlatMapFunction.operate(Operations.scala:47)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
> at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
> at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
> at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
> at com.twitter.scalding.MapFunction.operate(Operations.scala:60)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
> at cascading.flow.stream.element.FunctionEachStage$1.collect(FunctionEachStage.java:81)
> at cascading.tuple.TupleEntryCollector.safeCollect(TupleEntryCollector.java:145)
> at cascading.tuple.TupleEntryCollector.add(TupleEntryCollector.java:133)
> at cascading.operation.Identity$1.operate(Identity.java:124)
> at cascading.operation.Identity.operate(Identity.java:150)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:100)
> at cascading.flow.stream.element.FunctionEachStage.receive(FunctionEachStage.java:40)
> at cascading.flow.stream.element.SourceStage.map(SourceStage.java:110)
> at cascading.flow.stream.element.SourceStage.run(SourceStage.java:66)
> at cascading.flow.tez.stream.element.TezSourceStage.run(TezSourceStage.java:95)
> at cascading.flow.tez.FlowProcessor.run(FlowProcessor.java:165)
> at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:337)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:179)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:171)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:171)
> at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:167)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: cascading.CascadingException: unable to read element from underlying stream
> at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:82)
> at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:33)
> at cascading.tuple.hadoop.util.DeserializerComparator.compareTuples(DeserializerComparator.java:160)
> ... 55 more
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:197)
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> at org.apache.hadoop.io.WritableUtils.readString(WritableUtils.java:125)
> at cascading.tuple.hadoop.io.HadoopTupleInputStream.readString(HadoopTupleInputStream.java:75)
> at cascading.tuple.hadoop.io.HadoopTupleInputStream.readType(HadoopTupleInputStream.java:85)
> at cascading.tuple.hadoop.io.HadoopTupleInputStream.getNextElement(HadoopTupleInputStream.java:52)
> at cascading.tuple.hadoop.util.TupleElementComparator.compare(TupleElementComparator.java:77)
> ... 57 more
> {code}
> with an apparently random variation at the top of stack which is
> {code}
> 2015-05-28 13:10:13,459 ERROR [TezChild] element.TrapHandler: caught Throwable, no trap available, rethrowing
> cascading.CascadingException: java.io.EOFException
> at cascading.tuple.hadoop.util.TupleComparator.compare(TupleComparator.java:42)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compareKeys(PipelinedSorter.java:669)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.compare(PipelinedSorter.java:684)
> at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:99)
> at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
> at org.apache.tez.runtime.library.common.sort.impl.PipelinedSorter$SortSpan.sort(PipelinedSorter.java:631)
> {code}
> all running TezChildren fail with the same stacks (either variant) at the same time on the same node, which is straight off a HDFS-backed CSV file.
> The cascading.tuple.hadoop.util.TupleComparator#compare code at the top of stack has been in use in a MAPREDUCE context for over 2.5 years; first analysis with [~cwensel] (who successfully reproduced the issue without scalding) points towards an issue on tez side.
> as a workaround, it is possible to run with {code:scala}"tez.runtime.sorter.class" -> "LEGACY"{code}, but this is impractical in the long run.
> Reporter: Cyrille Chépélov
>
> When attempting to run the same multi-DAG application (that worked fine under the same environment except Cascading-3.0.0-wip-115 and tez 0.6.1), one of the early, and simplest DAG crashes on the PipelinedSorter.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)