Posted to user@spark.apache.org by William Kupersanin <wk...@gmail.com> on 2016/09/29 20:23:53 UTC

Setting conf options in jupyter

Hello,

I am trying to figure out how to correctly set config options in Jupyter
when I am already provided a SparkContext and a HiveContext. I need to
increase a couple of memory allocations. My program dies indicating that I
am trying to call methods on a stopped SparkContext. I thought I had
created a new one with the new conf, so I am not sure why this happens.

My code is as follows:

import time

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext
from pyspark.sql import SQLContext

conf = (SparkConf()
        .set("spark.yarn.executor.memoryOverhead", "4096")
        .set("spark.kryoserializer.buffer.max.mb", "1024"))

sc.stop()
sc = SparkContext(conf=conf)
sqlContext2 = SQLContext.getOrCreate(sc)
starttime = time.time()
sampledate = "20160913"
networkdf = sqlContext2.read.json("/sp/network/" + sampledate + "/03/*")


An error occurred while calling o144.json.
: java.lang.IllegalStateException: Cannot call methods on a stopped
SparkContext.
This stopped SparkContext was created at:....