You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@zeppelin.apache.org by "Piotr Nestorow (JIRA)" <ji...@apache.org> on 2017/02/23 11:18:44 UTC
[jira] [Created] (ZEPPELIN-2156) Paragraph with PySpark streaming -
running job cannot be canceled
Piotr Nestorow created ZEPPELIN-2156:
----------------------------------------
Summary: Paragraph with PySpark streaming - running job cannot be canceled
Key: ZEPPELIN-2156
URL: https://issues.apache.org/jira/browse/ZEPPELIN-2156
Project: Zeppelin
Issue Type: Bug
Components: pySpark
Affects Versions: 0.7.0
Environment: Linux Ubuntu
Reporter: Piotr Nestorow
In a 'spark.pyspark' paragraph a StreamingContext to a Kafka stream is created.
The paragraph is started and while the job is running the spark context produces correct output from the code.
The problem is the job cannot be stopped in the Zeppelin web interface.
Installed Kafka version: kafka_2.11-0.8.2.2
Spark Kafka jar: spark-streaming-kafka-0-8_2.11-2.1.0.jar
Zeppelin: zeppelin-0.7.0-bin-all
Tried:
1. Paragraph Cancel ( || button ) has no effect.
2. Zeppelin Job view Stop All has no effect
3. Another paragraph with
%spark.pyspark
ssc.stop(stopSparkContext=false, stopGracefully=true)
is started by stays in 'Pending'
4. Restarting the 'spark' interpreter stops the job
The example logic:
%spark.pyspark
import sys
import json
from pyspark import SparkContext
from pyspark.streaming import StreamingContext(II())
from pyspark.streaming.kafka import KafkaUtils
zkQuorum, topic, interval = ('localhost:2181', 'airport', 60)
ssc = StreamingContext(sc, interval)
kvs = KafkaUtils.createStream(ssc, zkQuorum, "spark-streaming-consumer", {topic: 1})
parsed = kvs.map(lambda (k, v): json.loads(v))
summed = parsed.\
filter(lambda event: 'kind' in event and event['kind']=='gate').\
map(lambda event: ('count_all', int(event['value']['passengers']))).\
reduceByKey(lambda x,y: x + y).\
map(lambda x: {'sum': x[0], "passengers": x[1]})
summed.pprint()
ssc.start()
ssc.awaitTermination()
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)