You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Abhinay (JIRA)" <ji...@apache.org> on 2019/03/08 15:58:00 UTC

[jira] [Commented] (SPARK-25928) NoSuchMethodError net.jpountz.lz4.LZ4BlockInputStream.(Ljava/io/InputStream;Z)V

    [ https://issues.apache.org/jira/browse/SPARK-25928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16787996#comment-16787996 ] 

Abhinay commented on SPARK-25928:
---------------------------------

We are running into the same issue with EMR 5.20.0 with Oozie 5.0.0 and Spark 2.4.0. The workaround seemed to be working for now but it does require some additional setup either converting to Snappy or deleting the lz 1.4.0 in Oozie share lib. 

 

The feedback from AWS support team was it was more of an upstream issue with Spark. Can we know when this issue will be resolved?

 

> NoSuchMethodError net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
> -------------------------------------------------------------------------------------
>
>                 Key: SPARK-25928
>                 URL: https://issues.apache.org/jira/browse/SPARK-25928
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Submit
>    Affects Versions: 2.3.1
>         Environment: EMR 5.17 which is using oozie 5.0.0 and spark 2.3.1
>            Reporter: Jerry Chabot
>            Priority: Major
>
> I am not sure if this is an Oozie problem, a Spark problem or a user error. It is blocking our upcoming release.
> We are upgrading from Amazon's EMR 5.7 to EMR 5.17. The version changes are:
>     oozie 4.3.0 -> 5.0.0
>      spark 2.1.1 -> 2.3.1
> All our Oozie/Spark jobs were working in EMR 5.7. After ugprading, some of our jobs which use a spark action are failing with the NoSuchMethod as shown further in the description. It seems like conflicting classes.
> I noticed the spark share lib directory has two versions of the LZ4 jar.
> sudo -u hdfs hadoop fs -ls /user/oozie/share/lib/lib_20181029182704/spark/*lz*
>  -rw-r--r--   3 oozie oozie      79845 2018-10-29 18:27 /user/oozie/share/lib/lib_20181029182704/spark/compress-lzf-1.0.3.jar
>  -rw-r--r--   3 hdfs  oozie     236880 2018-11-01 18:22 /user/oozie/share/lib/lib_20181029182704/spark/lz4-1.3.0.jar
>  -rw-r--r--   3 oozie oozie     370119 2018-10-29 18:27 /user/oozie/share/lib/lib_20181029182704/spark/lz4-java-1.4.0.jar
> But, both of these jars have the constructor LZ4BlockInputStream(java/io/InputStream). The spark/jars directory has only lz4-java-1.4.0.jar. share lib seems to be getting it from the /usr/lib/oozie/oozie-sharelib.tar.gz.
> Unfortunately, my team member that knows most about Spark is on vacation. Does anyone have any suggestions on how best to troubleshoot this problem?
> Here is the strack trace.
>       diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, ip-172-27-113-49.ec2.internal, executor 2): java.lang.NoSuchMethodError: net.jpountz.lz4.LZ4BlockInputStream.<init>(Ljava/io/InputStream;Z)V
>         at org.apache.spark.io.LZ4CompressionCodec.compressedInputStream(CompressionCodec.scala:122)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$6.apply(TorrentBroadcast.scala:304)
>         at scala.Option.map(Option.scala:146)
>         at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:304)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1$$anonfun$apply$2.apply(TorrentBroadcast.scala:235)
>         at scala.Option.getOrElse(Option.scala:121)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:211)
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1346)
>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:207)
>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
>         at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
>         at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:86)
>         at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>         at org.apache.spark.scheduler.Task.run(Task.scala:109)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org