Posted to user@spark.apache.org by Michael Campbell <mi...@gmail.com> on 2014/06/07 03:50:48 UTC

New user streaming question

I've been playing with Spark and streaming and have a question about stream
outputs.  The symptom is that I don't get any.

I have run spark-shell and everything behaves as I expect, but when I run the
word-count example with streaming, it *works* in that things happen and
there are no errors, but I never get any output.

Am I understanding how it is supposed to work correctly?  Is the
DStream.print() method supposed to print the output for every (micro)batch
of the streamed data?  If that's the case, I'm not seeing it.

I'm using the "netcat" example and the StreamingContext uses the network to
read words, but as I said, nothing comes out.

I tried changing the .print() to .saveAsTextFiles(), and I AM getting a
file, but nothing is in it other than a "_temporary" subdir.

I'm sure I'm confused here, but not sure where.  Help?
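
For anyone reproducing the netcat setup described above, here is a minimal sketch of a stand-in data source: a small TCP server that sends newline-terminated words to the first client that connects. It is plain Python and not part of Spark. The names serve_words and WORDS, the sample words, and the default interval are all hypothetical choices for illustration.

```python
import socket
import threading
import time

# Hypothetical sample data; replace with whatever words you want to stream.
WORDS = ["spark", "streaming", "hello", "world"]


def serve_words(host="127.0.0.1", port=0, interval=0.1, count=None, ready=None):
    """Accept one client and send it newline-terminated words.

    port=0 lets the OS pick a free port; `ready`, if given, is called with
    the port actually bound. count=None means "send until the client leaves".
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        if ready is not None:
            ready(server.getsockname()[1])
        conn, _ = server.accept()  # blocks until a client connects
        with conn:
            sent = 0
            while count is None or sent < count:
                word = WORDS[sent % len(WORDS)]
                conn.sendall((word + "\n").encode("utf-8"))
                sent += 1
                time.sleep(interval)
```

Calling serve_words(port=9999) would stand in for the netcat listener, assuming the streaming job connects to localhost:9999. Because the server logs nothing itself, connecting to it with any TCP client first is a quick way to confirm that data is actually flowing before looking at the Spark side.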

Re: New user streaming question

Posted by Michael Campbell <mi...@gmail.com>.
Thanks all - I still don't know what the underlying problem is, but I KIND
OF got it working by dumping my random-words stuff to a file and pointing
spark streaming to that.  So it's not "Streaming", as such, but I got
output.

More investigation to follow =)
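
A minimal sketch of the file-based workaround, assuming the job reads from a directory-watching source such as textFileStream. Such sources expect each file to appear in the watched directory atomically, so the helper writes to a staging directory first and then renames the file into place. The function name drop_batch and its parameters are hypothetical, and the exact setup used above is not shown in this thread.

```python
import os
import tempfile


def drop_batch(words, watched_dir, staging_dir, name):
    """Write words to a staging file, then rename it into watched_dir.

    The rename is atomic only when both directories are on the same
    filesystem, so keep the staging directory next to the watched one.
    """
    os.makedirs(watched_dir, exist_ok=True)
    os.makedirs(staging_dir, exist_ok=True)

    # Write the complete file outside the watched directory first.
    fd, tmp_path = tempfile.mkstemp(dir=staging_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")

    # Move the finished file into place in a single step.
    final_path = os.path.join(watched_dir, name)
    os.replace(tmp_path, final_path)
    return final_path
```

Each call produces one new file, for example drop_batch(["hello", "world"], "in", "staging", "batch-0001.txt"). Giving every file a fresh name matters, since a directory-watching source typically picks up new files rather than changes to existing ones.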



Re: New user streaming question

Posted by Gino Bustelo <lb...@gmail.com>.
I would make sure that your workers are running. It is very difficult to tell from the console dribble whether you simply have no data or the workers have disassociated from the master.

Gino B.


Re: New user streaming question

Posted by Jeremy Lee <un...@gmail.com>.
Yup, when it's running, DStream.print() will print out a timestamped block
for every time step, even if the block is empty. (for v1.0.0, which I have
running in the other window)

If you're not getting that, I'd guess the stream hasn't started up
properly.
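
To make the expected behaviour concrete, here is a plain-Python simulation of it. It is not Spark code: it groups timestamped words into fixed intervals and renders one block per interval, including intervals with no data. The function name format_batches, its parameters, and the exact block layout are assumptions made for illustration; the layout only approximates what print() shows.

```python
def format_batches(records, batch_ms, start_ms, end_ms):
    """Render one timestamped block per interval, even for empty intervals.

    records is a list of (timestamp_ms, word) pairs. Each interval covers
    [batch_start, batch_start + batch_ms) and is labelled with its end time.
    """
    lines = []
    for batch_start in range(start_ms, end_ms, batch_ms):
        batch_end = batch_start + batch_ms

        # Count the words whose timestamps fall inside this interval.
        counts = {}
        for ts, word in records:
            if batch_start <= ts < batch_end:
                counts[word] = counts.get(word, 0) + 1

        # The header is emitted unconditionally, so an empty interval
        # still produces a visible block.
        lines.append("-" * 43)
        lines.append(f"Time: {batch_end} ms")
        lines.append("-" * 43)
        for word in sorted(counts):
            lines.append(f"({word},{counts[word]})")
        lines.append("")
    return "\n".join(lines)
```

For example, print(format_batches([(100, "a"), (150, "a"), (2100, "b")], 1000, 0, 3000)) renders three blocks, the middle one containing only its header. The diagnostic point is the same as above: seeing headers with no rows means batches are running but no data is arriving, while seeing no headers at all suggests the stream never started.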




-- 
Jeremy Lee  BCompSci(Hons)
  The Unorthodox Engineers