Posted to dev@toree.apache.org by "fxoSa (JIRA)" <ji...@apache.org> on 2017/12/04 09:15:00 UTC
[jira] [Commented] (TOREE-457) Spark context seems corrupted after
loading Kafka libraries
[ https://issues.apache.org/jira/browse/TOREE-457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276493#comment-16276493 ]
fxoSa commented on TOREE-457:
-----------------------------
On Google Cloud Platform I set up a Jupyter notebook on a Google Dataproc cluster (Spark 2.2.0). After the Jupyter installation, I set up Toree as follows:
sudo /opt/conda/bin/pip install https://dist.apache.org/repos/dist/dev/incubator/toree/0.2.0/snapshots/dev1/toree-pip/toree-0.2.0.dev1.tar.gz
sudo /opt/conda/bin/jupyter toree install --sys-prefix --spark_home=//usr/lib/spark
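If the goal is Spark Streaming with Kafka, an alternative worth trying (a sketch based on the Toree installer's documented options, not something verified on this cluster) is to pass the Kafka package to Spark at kernel-install time via `--spark_opts`, so the dependency is on the classpath before the kernel starts instead of being loaded at runtime with `%AddDeps`:

{code:java}
sudo /opt/conda/bin/jupyter toree install --sys-prefix \
  --spark_home=/usr/lib/spark \
  --spark_opts="--packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0"
{code}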
> Spark context seems corrupted after loading Kafka libraries
> -----------------------------------------------------------
>
> Key: TOREE-457
> URL: https://issues.apache.org/jira/browse/TOREE-457
> Project: TOREE
> Issue Type: Bug
> Components: Kernel
> Reporter: fxoSa
> Priority: Minor
>
> I am trying to set up a Jupyter notebook (Apache Toree, Scala) to access Kafka logs from Spark Streaming.
> First I add dependencies using AddDeps:
>
> {code:java}
> %AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0
> Marking org.apache.spark:spark-streaming-kafka-0-10_2.11:2.2.0 for download
> Preparing to fetch from:
> -> file:/tmp/toree_add_deps8235567186565695423/
> -> https://repo1.maven.org/maven2
> -> New file at /tmp/toree_add_deps8235567186565695423/https/repo1.maven.org/maven2/org/apache/spark/spark-streaming-kafka-0-10_2.11/2.2.0/spark-streaming-kafka-0-10_2.11-2.2.0.jar
> {code}
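> As a side note (an assumption, not confirmed as the cause here): `%AddDeps` fetches only the named jar unless the `--transitive` flag is given, so missing transitive dependencies can also surface as odd compile errors. The transitive form would be:
> {code:java}
> %AddDeps org.apache.spark spark-streaming-kafka-0-10_2.11 2.2.0 --transitive
> {code}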
> After that I am able to successfully import part of the necessary libraries:
> {code:java}
> import org.apache.spark.SparkConf
> import org.apache.spark.streaming._
> import org.apache.spark.streaming.kafka010._
> {code}
> However, the code fails when I try to create the streaming context:
> {code:java}
> val ssc = new StreamingContext(sc, Seconds(2))
> Name: Compile Error
> Message: <console>:38: error: overloaded method constructor StreamingContext with alternatives:
> (path: String,sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext)org.apache.spark.streaming.StreamingContext <and>
> (path: String,hadoopConf: org.apache.hadoop.conf.Configuration)org.apache.spark.streaming.StreamingContext <and>
> (conf: org.apache.spark.SparkConf,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext <and>
> (sparkContext: org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext,batchDuration: org.apache.spark.streaming.Duration)org.apache.spark.streaming.StreamingContext
> cannot be applied to (org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.org.apache.spark.SparkContext, org.apache.spark.streaming.Duration)
> val ssc = new StreamingContext(sc, Seconds(2))
> ^
> StackTrace:
> {code}
> I have tried it in the Jupyter Docker image
> https://github.com/jupyter/docker-stacks/tree/master/all-spark-notebook
> and in a Spark cluster set up on Google Cloud Platform, with the same results.
> Thanks
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)