Posted to users@zeppelin.apache.org by kant kodali <ka...@gmail.com> on 2017/04/23 19:11:10 UTC

Re: Zeppelin Spark Streaming Twitter Stuck

@Chaoran Yu Yeah, I don't think it's a dependency issue. You wouldn't be
able to call the methods if you were missing dependencies.

I am in a similar boat: I am trying to get Spark Streaming and Zeppelin to
work together, except that I have my own indirect receiver (not the direct
stream). That Twitter example is pretty old. I am using Spark 2.1.0. I
wonder if I should call streamingContext.awaitTermination from Zeppelin? I
only run the following, and streamingContext is stopped every time after
these lines execute. I am still debugging to see what is going on, and
will let you know once I have something working.

jsonDStream.foreachRDD(rdd => println(rdd.count()));
streamingContext.start();
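
On the awaitTermination question, here is a minimal sketch rather than a confirmed fix. awaitTermination() blocks until the context stops, so in a notebook paragraph the bounded variant awaitTerminationOrTimeout may be easier to live with. The sketch assumes `sc` is the SparkContext that Zeppelin provides; the queue-backed stream, the sample JSON strings, and the 10-second timeout are illustrative stand-ins, not part of the original setup.

```scala
import scala.collection.mutable
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Assumption: `sc` is the SparkContext provided by Zeppelin (or spark-shell).
val ssc = new StreamingContext(sc, Seconds(1))

// Stand-in input: a queue holding one RDD, used instead of a custom receiver.
val queue = mutable.Queue[RDD[String]](
  sc.parallelize(Seq("""{"id": 1}""", """{"id": 2}""")))
val lines = ssc.queueStream(queue)

// Print the record count of each batch on the driver.
lines.foreachRDD(rdd => println(rdd.count()))
ssc.start()

// Wait at most 10 seconds (arbitrary value) instead of blocking indefinitely.
// Returns true if the context stopped within the timeout, false otherwise.
val stopped = ssc.awaitTerminationOrTimeout(10000L)
println(s"Context stopped within timeout: $stopped")
```

The context is left running when the paragraph returns, so later paragraphs can still query whatever the stream produces.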

On Thu, Mar 30, 2017 at 6:47 AM, Chaoran Yu <yu...@gmail.com> wrote:

>
> I think I’ve added required artifacts in interpreter dependency settings.
> If there are artifacts missing, wouldn’t I see errors either in Zeppelin
> notebook UI or /logs folder? But I didn’t. For example, if I missed a
> Twitter artifact, I would have gotten a ClassNotFound error. Instead, what
> I saw was the notebook started running but never made any progress.
>
>
> On Mar 30, 2017, at 3:59 AM, Raffaele S <r....@gmail.com> wrote:
>
> You might have to add the relative artifacts manually in the interpreter
> page.
>
> Raffaele
>
>
>
> 2017-03-30 4:12 GMT+02:00 Chaoran Yu <yu...@gmail.com>:
>
>> Hello,
>>
>> Has anybody got Spark Streaming Twitter example to work in Zeppelin? When
>> I started the streaming context with ssc.start(), the Zeppelin paragraph
>> seemed to have started but it got stuck there. The top right corner of the
>> paragraph says “RUNNING 0%”
>>
>> I think this is a problem with Spark Streaming + Zeppelin, rather than
>> one with Twitter example in particular. Because I’ve tried my own simple
>> streaming tests but got the same thing: Stuck in “RUNNING 0%” status
>> forever.
>>
>> I also tried to stop the streaming context with ssc.stop() in a new
>> paragraph but it won’t execute i.e. remain in “PENDING” status. In fact,
>> any new code won’t execute in a new paragraph. I had to restart Zeppelin to
>> get me out of this situation. Zeppelin logs didn’t reveal any errors either.
>>
>> Could anyone help me here?
>>
>> Thank you,
>> Chaoran Yu

Re: Zeppelin Spark Streaming Twitter Stuck

Posted by kant kodali <ka...@gmail.com>.
@Chaoran Yu I finally got it working. Here is my code. I usually code in
Java, but I tried to convert it to Scala below.

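import org.apache.spark.SparkContext
import org.apache.spark.streaming._
import spark.implicits._

// Hello is a class from my own jar. Note that sc.getConf returns a copy of
// the configuration, so this does not alter the already-running context.
val sparkConf = sc.getConf
sparkConf.setJars(SparkContext.jarOfClass(classOf[Hello]).toSeq)

// The constructor takes a Duration, not a bare number: 1000 ms batches.
val streamingContext = new StreamingContext(sc, Milliseconds(1000))

// receiver is my indirect receiver, defined elsewhere.
val jsonDStream = streamingContext.receiverStream(receiver)

jsonDStream.foreachRDD { rdd =>
    // Re-register the latest batch as a temp view on every interval.
    val jsonDF = spark.read.json(rdd)
    jsonDF.createOrReplaceTempView("jsondf")
}
streamingContext.start()

Then, in a separate paragraph:

%sql select * from jsondf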
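
Spark allows only one active StreamingContext per JVM, so rerunning the streaming paragraph after a change generally requires stopping the previous context first. A minimal sketch of a cleanup paragraph follows, assuming the context was started in the same Spark interpreter process and that the interpreter is still able to run new paragraphs. It stops only the streaming context, so the shared SparkContext and any temp views stay available.

```scala
import org.apache.spark.streaming.StreamingContext

// Look up whichever streaming context is currently active in this JVM, if any,
// and stop it while leaving the notebook's shared SparkContext running.
StreamingContext.getActive().foreach { active =>
  active.stop(stopSparkContext = false, stopGracefully = true)
}
```

If no context is active, the paragraph does nothing, so it should be safe to run before every restart of the stream.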
