Posted to user@livy.apache.org by Lian Jiang <ji...@gmail.com> on 2018/08/17 19:49:08 UTC

specify external jar for livy2.spark interpreter

Hi,

I have a Hadoop cluster installed using an HDP 3.0 / Ambari 2.7 blueprint. I
have copied the Oracle OCI HDFS connector jar (downloaded from
https://github.com/oracle/oci-hdfs-connector/releases) to
/usr/hdp/3.0.0.0-1634/livy2/repl_2.11-jars/
on the namenodes and datanodes, following
https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/using-zeppelin/zeppelin-using.pdf
.
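
(For reference, on each node this was just a copy of the connector jar into
the Livy REPL jars directory, something like the line below; the exact jar
name depends on which connector release you downloaded:

cp oci-hdfs-full-<version>.jar /usr/hdp/3.0.0.0-1634/livy2/repl_2.11-jars/
)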

When using the livy2.spark interpreter in Zeppelin 0.8.0 (which uses
yarn-cluster mode), I got mixed results.

This works, which means the OCI HDFS connector jar is imported successfully:
%livy2.spark
import com.oracle.bmc.hdfs._

This fails:
%livy2.spark
val df = spark.read.parquet("oci://mybucket@mytenancy/my.parquet")

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "oci"
  at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3332)
  at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3352)
  at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:124)
  at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3403)
  at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3371)
  at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:477)
  at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
  at org.apache.spark.sql.execution.datasources.DataSource$.org$apache$spark$sql$execution$datasources$DataSource$$checkAndGlobPathIfNecessary(DataSource.scala:705)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:389)
  at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$15.apply(DataSource.scala:389)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.immutable.List.flatMap(List.scala:344)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:388)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:622)
  at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:606)
  ... 60 elided

Any ideas? Much appreciated!

Re: specify external jar for livy2.spark interpreter

Posted by Lian Jiang <ji...@gmail.com>.
It turned out that I needed to set livy.spark.driver.extraClassPath and
livy.spark.executor.extraClassPath for the livy2 interpreter in Zeppelin, so
that the jar lands on the driver and executor JVM classpaths; the
repl_2.11-jars directory only makes it visible to the interpreter REPL, which
is why the import succeeded while the FileSystem lookup failed.
Thanks.
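
For reference, a minimal sketch of the two properties in the livy2
interpreter settings, assuming the jar stays in the directory from my first
mail (adjust the paths to wherever the connector jar actually lives on your
cluster):

livy.spark.driver.extraClassPath=/usr/hdp/3.0.0.0-1634/livy2/repl_2.11-jars/*
livy.spark.executor.extraClassPath=/usr/hdp/3.0.0.0-1634/livy2/repl_2.11-jars/*

After restarting the livy2 interpreter, the failing read from my first mail

%livy2.spark
val df = spark.read.parquet("oci://mybucket@mytenancy/my.parquet")

now succeeds.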
