Posted to user@spark.apache.org by Russell Jurney <ru...@gmail.com> on 2016/09/27 21:47:03 UTC

Parquet compression jars not found - both snappy and lzo - PySpark 2.0.0

In PySpark 2.0.0, despite adding the snappy and lzo jars to my spark.jars path, I
get errors saying these classes can't be found when I save to a Parquet
file. I tried switching from the default snappy codec to lzo and added that jar
as well, but I get the same error.
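
For reference, here is a stripped-down sketch of what I'm running (jar paths
are placeholders; the real code and the full stack trace are in the gist
linked below):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("parquet_compression_test")
        # snappy/lzo jars added here, same as in my real job
        .config("spark.jars", "/path/to/snappy-java.jar,/path/to/hadoop-lzo.jar")
        .getOrCreate()
    )

    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

    # fails with a class-not-found error for the codec, whether I keep the
    # default snappy codec or switch to lzo explicitly
    df.write.option("compression", "snappy").parquet("/tmp/compression_test.parquet")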

What am I to do?

I can't figure out what other steps to take. Note that this problem appeared
when I upgraded from Spark 1.6 to 2.0.0. Other than spark.jars, what can I do
to give Parquet access to these jars? Why doesn't PySpark handle this itself,
since Parquet support is included in Spark? Is this a bug?
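
For what it's worth, the only other class path settings I know of are the ones
below, from the Spark configuration docs. I haven't confirmed whether they are
the right fix here, and I believe the driver-side setting may need to go in
spark-defaults.conf or be passed as --driver-class-path at launch rather than
being set from the builder:

    # untested alternatives (paths are placeholders)
    spark = (
        SparkSession.builder
        .config("spark.executor.extraClassPath", "/path/to/snappy-java.jar:/path/to/hadoop-lzo.jar")
        .config("spark.driver.extraClassPath", "/path/to/snappy-java.jar:/path/to/hadoop-lzo.jar")
        .getOrCreate()
    )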

A gist of my code and the error is here:
https://gist.github.com/rjurney/6783d19397cf3b4b88af3603d6e256bd
-- 
Russell Jurney twitter.com/rjurney russell.jurney@gmail.com relato.io