Posted to user@spark.apache.org by Lian Jiang <ji...@gmail.com> on 2018/11/15 23:46:43 UTC
spark in jupyter cannot find a class in a jar
I am using Spark in Jupyter as follows:
import findspark
findspark.init()
from pyspark import SQLContext, SparkContext
sc = SparkContext.getOrCreate()  # reuse the running SparkContext, or create one
sqlCtx = SQLContext(sc)
df = sqlCtx.read.parquet("oci://mybucket@mytenant/myfile.parquet")
The error is:
Py4JJavaError: An error occurred while calling o198.parquet.
: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "oci"
I have put oci-hdfs-full-2.7.2.0.jar, which defines the oci filesystem, on all
namenodes and datanodes of the Hadoop cluster. I have also set the following
environment variable:
export PYSPARK_SUBMIT_ARGS="--master yarn --deploy-mode client
pyspark-shell --driver-cores 8 --driver-memory 20g --num-executors 2
--executor-cores 6 --executor-memory 30g --jars
/mnt/data/hdfs/oci-hdfs-full-2.7.2.0.jar --conf
spark.executor.extraClassPath=/mnt/data/hdfs/oci-hdfs-full-2.7.2.0.jar
--conf spark.driver.extraClassPath=/mnt/data/hdfs/oci-hdfs-full-2.7.2.0.jar"
Any idea why this still happens? Thanks for any clue.
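
One thing that may be worth checking, offered as an untested guess: spark-submit normally treats everything after the first non-option argument as application arguments, and in the export above pyspark-shell appears before --jars and the --conf settings, so those options might never reach Spark. Below is a minimal sketch that builds the same value with pyspark-shell as the last token. build_submit_args is a hypothetical helper, not part of PySpark, and whether this resolves the "oci" error on this cluster is an assumption.

```python
import os
import shlex

# Jar path taken from the export above.
JAR = "/mnt/data/hdfs/oci-hdfs-full-2.7.2.0.jar"


def build_submit_args(options):
    """Join spark-submit options and append pyspark-shell as the final token.

    Hypothetical helper for illustration only; not a PySpark API.
    """
    return " ".join(shlex.quote(opt) for opt in options) + " pyspark-shell"


# Same options as in the export above, with nothing placed after pyspark-shell.
options = [
    "--master", "yarn",
    "--deploy-mode", "client",
    "--driver-cores", "8",
    "--driver-memory", "20g",
    "--num-executors", "2",
    "--executor-cores", "6",
    "--executor-memory", "30g",
    "--jars", JAR,
    "--conf", "spark.executor.extraClassPath=" + JAR,
    "--conf", "spark.driver.extraClassPath=" + JAR,
]

# This must be set before the first SparkContext is created in the notebook,
# because the variable is read when the JVM gateway is launched.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(options)
print(os.environ["PYSPARK_SUBMIT_ARGS"])
```

If the kernel already has a running SparkContext, it would need to be restarted for the new value to take effect.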