Posted to issues@spark.apache.org by "Saurabh Bidwai (JIRA)" <ji...@apache.org> on 2017/09/07 06:36:00 UTC
[jira] [Commented] (SPARK-8337) KafkaUtils.createDirectStream for
python is lacking API/feature parity with the Scala/Java version
[ https://issues.apache.org/jira/browse/SPARK-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156559#comment-16156559 ]
Saurabh Bidwai commented on SPARK-8337:
---------------------------------------
For this call I'm getting the following error:
kstream = KafkaUtils.createDirectStream(ssc, topics = ['twitterstream'], kafkaParams = {"metadata.broker.list": ["dn1001:6667","dn2001:6667","dn3001:6667","dn4001:6667"]})
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-10-62bb7226120a> in <module>()
----> 1 streamer(sc)
<ipython-input-3-99af708ed717> in streamer(sc)
5 pwords = load_wordlist("/home/daasuser/twitter/kafkatweets/Twitter-Sentiment-Analysis-Using-Spark-Streaming-And-Kafka/Dataset/positive.txt")
6 nwords = load_wordlist("/home/daasuser/twitter/kafkatweets/Twitter-Sentiment-Analysis-Using-Spark-Streaming-And-Kafka/Dataset/negative.txt")
----> 7 counts = stream(ssc, pwords, nwords, 600)
8 make_plot(counts)
<ipython-input-9-be266104bdd8> in stream(ssc, pwords, nwords, duration)
1 def stream(ssc, pwords, nwords, duration):
----> 2 kstream = KafkaUtils.createDirectStream(ssc, topics = ['twitterstream'], kafkaParams = {"metadata.broker.list": ["dn1001:6667","dn2001:6667","dn3001:6667","dn4001:6667"]})
3 tweets = kstream.map(lambda x: x[1].encode("utf-8", "ignore"))
4
5 # Each element of tweets will be the text of a tweet.
/usr/hdp/current/spark-client/python/lib/pyspark.zip/pyspark/streaming/kafka.py in createDirectStream(ssc, topics, kafkaParams, fromOffsets, keyDecoder, valueDecoder, messageHandler)
150 if 'ClassNotFoundException' in str(e.java_exception):
151 KafkaUtils._printErrorMsg(ssc.sparkContext)
--> 152 raise e
153
154 stream = DStream(jstream, ssc, ser).map(func)
Py4JJavaError: An error occurred while calling o40.loadClass.
: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
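A note on the trace above: a ClassNotFoundException for org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper generally means the Spark Streaming Kafka integration jar is not on the driver classpath. The usual remedy is to supply it to spark-submit with --jars or --packages, using the artifact that matches the installed Spark and Scala versions (not known from this report). Separately, Kafka expects metadata.broker.list as a single comma-separated string, so the Python list passed in the call above would probably fail once the class does load. A minimal sketch of building that value, where build_kafka_params is a hypothetical helper and not part of pyspark:

```python
def build_kafka_params(brokers):
    """Return a kafkaParams dict with the brokers joined into one string.

    Hypothetical helper for illustration; it is not a pyspark API.
    """
    return {"metadata.broker.list": ",".join(brokers)}


# Broker host:port pairs copied from the call in the comment above.
brokers = ["dn1001:6667", "dn2001:6667", "dn3001:6667", "dn4001:6667"]
kafka_params = build_kafka_params(brokers)
print(kafka_params["metadata.broker.list"])
```

The resulting dict could then be passed as kafkaParams to KafkaUtils.createDirectStream, assuming the Kafka integration jar has been made available to the driver.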
> KafkaUtils.createDirectStream for python is lacking API/feature parity with the Scala/Java version
> --------------------------------------------------------------------------------------------------
>
> Key: SPARK-8337
> URL: https://issues.apache.org/jira/browse/SPARK-8337
> Project: Spark
> Issue Type: Bug
> Components: DStreams, PySpark
> Affects Versions: 1.4.0
> Reporter: Amit Ramesh
> Priority: Critical
>
> See the following thread for context.
> http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Spark-1-4-Python-API-for-getting-Kafka-offsets-in-direct-mode-tt12714.html