Posted to issues@spark.apache.org by "Saurabh Bidwai (JIRA)" <ji...@apache.org> on 2017/09/07 06:36:00 UTC

[jira] [Commented] (SPARK-8337) KafkaUtils.createDirectStream for python is lacking API/feature parity with the Scala/Java version

    [ https://issues.apache.org/jira/browse/SPARK-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156559#comment-16156559 ] 

Saurabh Bidwai commented on SPARK-8337:
---------------------------------------

For this call I'm getting the following error:

kstream = KafkaUtils.createDirectStream(ssc, topics = ['twitterstream'], kafkaParams = {"metadata.broker.list": ["dn1001:6667","dn2001:6667","dn3001:6667","dn4001:6667"]})

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<ipython-input-10-62bb7226120a> in <module>()
----> 1 streamer(sc)

<ipython-input-3-99af708ed717> in streamer(sc)
      5     pwords = load_wordlist("/home/daasuser/twitter/kafkatweets/Twitter-Sentiment-Analysis-Using-Spark-Streaming-And-Kafka/Dataset/positive.txt")
      6     nwords = load_wordlist("/home/daasuser/twitter/kafkatweets/Twitter-Sentiment-Analysis-Using-Spark-Streaming-And-Kafka/Dataset/negative.txt")
----> 7     counts = stream(ssc, pwords, nwords, 600)
      8     make_plot(counts)

<ipython-input-9-be266104bdd8> in stream(ssc, pwords, nwords, duration)
      1 def stream(ssc, pwords, nwords, duration):
----> 2     kstream = KafkaUtils.createDirectStream(ssc, topics = ['twitterstream'], kafkaParams = {"metadata.broker.list": ["dn1001:6667","dn2001:6667","dn3001:6667","dn4001:6667"]})
      3     tweets = kstream.map(lambda x: x[1].encode("utf-8", "ignore"))
      4 
      5     # Each element of tweets will be the text of a tweet.

/usr/hdp/current/spark-client/python/lib/pyspark.zip/pyspark/streaming/kafka.py in createDirectStream(ssc, topics, kafkaParams, fromOffsets, keyDecoder, valueDecoder, messageHandler)
    150             if 'ClassNotFoundException' in str(e.java_exception):
    151                 KafkaUtils._printErrorMsg(ssc.sparkContext)
--> 152             raise e
    153 
    154         stream = DStream(jstream, ssc, ser).map(func)

Py4JJavaError: An error occurred while calling o40.loadClass.
: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
	at py4j.Gateway.invoke(Gateway.java:259)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:209)
	at java.lang.Thread.run(Thread.java:745)
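
A note on reading the trace above: the ClassNotFoundException for KafkaUtilsPythonHelper usually indicates that the Spark Streaming Kafka library is not on the classpath (the kafka.py frame shows _printErrorMsg being called for exactly this case), so the job likely needs that library supplied, for example through spark-submit's --packages or --jars options. Separately, Kafka normally expects "metadata.broker.list" as one comma-separated string rather than a Python list. The following is only a minimal sketch under those assumptions; build_kafka_params is a hypothetical helper, not part of pyspark, and the createDirectStream call is left as a comment because it needs a live StreamingContext and the Kafka jar.

```python
# Broker host:port pairs taken from the call in the report above.
brokers = ["dn1001:6667", "dn2001:6667", "dn3001:6667", "dn4001:6667"]


def build_kafka_params(broker_list):
    """Hypothetical helper: join brokers into a single comma-separated
    string, the usual form for Kafka's "metadata.broker.list" setting."""
    return {"metadata.broker.list": ",".join(broker_list)}


kafka_params = build_kafka_params(brokers)
print(kafka_params)

# With the Kafka library on the classpath, the call would then look like
# (assumption: ssc is an existing StreamingContext):
# kstream = KafkaUtils.createDirectStream(
#     ssc, topics=["twitterstream"], kafkaParams=kafka_params)
```

Whether the string form alone resolves the problem is not confirmed here; the missing class has to be addressed first, since the exception is raised before the parameters are used.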

> KafkaUtils.createDirectStream for python is lacking API/feature parity with the Scala/Java version
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-8337
>                 URL: https://issues.apache.org/jira/browse/SPARK-8337
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams, PySpark
>    Affects Versions: 1.4.0
>            Reporter: Amit Ramesh
>            Priority: Critical
>
> See the following thread for context.
> http://apache-spark-developers-list.1001551.n3.nabble.com/Re-Spark-1-4-Python-API-for-getting-Kafka-offsets-in-direct-mode-tt12714.html


