Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/02/26 16:29:45 UTC

[GitHub] LucaCanali opened a new pull request #23898: [SPARK-26995][K8S] Running Spark in Docker image with Alpine Linux 3.9.0 throws errors when using snappy

URL: https://github.com/apache/spark/pull/23898
 
 
   Running Spark in Docker image with Alpine Linux 3.9.0 throws errors when using snappy. 
   
   The issue can be reproduced for example as follows: `Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` 
   The key part of the error stack is: `SparkException: Task failed while writing rows. .... Caused by: java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so: Error loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by /tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so)` 
   
   The source of the error appears to be that libsnappyjava.so requires ld-linux-x86-64.so.2 and looks for it in /lib, while in Alpine Linux 3.9.0 with libc6-compat version 1.1.20-r3, ld-linux-x86-64.so.2 is located in /lib64.
   Note: this issue is not present with Alpine Linux 3.8 and libc6-compat version 1.1.19-r10.
   
   ## What changes were proposed in this pull request?
   
   A possible workaround proposed with this PR is to modify the Dockerfile by adding a symbolic link between /lib and /lib64 so that ld-linux-x86-64.so.2 can be found in /lib. This is probably not the cleanest solution, but I have observed that this is effectively what Alpine Linux 3.8.1 (a version not affected by the issue reported here) already does.
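   The workaround can be sketched as the following Dockerfile fragment. This is a sketch, not the exact PR diff: the ordering relative to the package installation step and the surrounding Dockerfile layout are assumptions.
   
   ```dockerfile
   # Sketch (assumed placement): create the symlink before libc6-compat is
   # installed, so that ld-linux-x86-64.so.2, which lands under /lib64 on
   # Alpine 3.9.0, is also reachable via /lib where libsnappyjava.so looks.
   RUN ln -s /lib /lib64
   ```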
   
   ## How was this patch tested?
   
   Manually tested by running a simple workload with spark-shell, with Docker running on a client machine and Spark running on a Kubernetes cluster.
   The test workload is: `Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org