Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/02/09 04:41:18 UTC

[jira] [Updated] (TEZ-3104) Tez fails on Bzip2 intermediate output format

     [ https://issues.apache.org/jira/browse/TEZ-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3104:
---------------------------------
    Attachment: TEZ-3104.1.patch

Attaching sample patch.

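Tez reuses CodecPool inside fetcher threads but does not initialize the native bits before starting those threads. This creates contention between the threads during initialization, and in some of them the native bits are never loaded. Those fetchers appear to end up with BZip2DummyDecompressor, which matches the UnsupportedOperationException in the stack trace quoted below.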

In addition, the job also succeeds when I set mapreduce.reduce.shuffle.parallelcopies=1, since a single fetcher thread is trivially in sync with itself.
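
To illustrate the idea of initializing before the fetcher threads start, here is a minimal, self-contained sketch. It is not the content of TEZ-3104.1.patch, which is not reproduced in this message. LazyNativeCodec, ensureLoaded, and runFetchers are hypothetical stand-ins invented for the example and are not Hadoop or Tez APIs. The only point is the ordering: the one-time load runs on the calling thread before the pool is created, so every fetcher observes the loaded state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class WarmUpBeforeFetchers {

    /** Hypothetical stand-in for a codec whose native support is loaded lazily. */
    static final class LazyNativeCodec {
        private static volatile boolean nativeLoaded = false;

        /** Stand-in for the one-time native initialization (assumption, not a Hadoop call). */
        static synchronized void ensureLoaded() {
            if (!nativeLoaded) {
                nativeLoaded = true;
            }
        }

        static boolean isNativeLoaded() {
            return nativeLoaded;
        }
    }

    /**
     * Starts parallelCopies fetcher tasks and returns how many of them
     * saw the native support as loaded.
     */
    static int runFetchers(int parallelCopies) throws Exception {
        // Warm up on the calling thread BEFORE any fetcher thread exists.
        LazyNativeCodec.ensureLoaded();

        ExecutorService pool = Executors.newFixedThreadPool(parallelCopies);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < parallelCopies; i++) {
                // Each fetcher only reads the already-initialized state.
                Callable<Boolean> fetcher = LazyNativeCodec::isNativeLoaded;
                results.add(pool.submit(fetcher));
            }
            int loaded = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    loaded++;
                }
            }
            return loaded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runFetchers(30) + " of 30 fetchers saw the codec loaded");
    }
}
```

The sketch uses 30 fetchers to mirror the parallelcopies=30 setting from the reproduction command. With the warm-up call in place, the result does not depend on how the fetcher threads are scheduled.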

> Tez fails on Bzip2 intermediate output format
> ---------------------------------------------
>
>                 Key: TEZ-3104
>                 URL: https://issues.apache.org/jira/browse/TEZ-3104
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>         Attachments: TEZ-3104.1.patch
>
>
> HADOOP_CLASSPATH="$TEZ_CONF_DIR:$TEZ_HOME/*:$TEZ_HOME/lib/*" yarn jar /home/gs/tez/current/tez-tests-*.jar mrrsleep -Dmapreduce.reduce.log.level=TRACE -Dtez.task.log.level=TRACE -Dtez.runtime.compress=true -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.BZip2Codec -Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec -Dmapred.output.compression.codec=org.apache.hadoop.io.compress.BZip2Codec -Dmapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec -Dmapreduce.reduce.shuffle.parallelcopies=30 -m 100 -ir 10 -r 100
> {noformat}
> 2016-02-09 02:31:36,605 [ERROR] [ShuffleAndMergeRunner {map}] |orderedgrouped.Shuffle|: map: ShuffleRunner failed with error
> org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in fetcher {map} #16
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:360)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:337)
> 	at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.UnsupportedOperationException
> 	at org.apache.hadoop.io.compress.bzip2.BZip2DummyDecompressor.decompress(BZip2DummyDecompressor.java:32)
> 	at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:91)
> 	at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> 	at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:192)
> 	at org.apache.tez.runtime.library.common.sort.impl.IFile$Reader.readToMemory(IFile.java:626)
> 	at org.apache.tez.runtime.library.common.shuffle.ShuffleUtils.shuffleToMemory(ShuffleUtils.java:113)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyMapOutput(FetcherOrderedGrouped.java:502)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:279)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:169)
> 	at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.run(FetcherOrderedGrouped.java:184)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)