Posted to dev@storm.apache.org by "Matthias J. Sax" <mj...@apache.org> on 2016/01/19 15:46:03 UTC

Emitting to non-declared output stream

Hi,

Currently, I am using Storm 0.9.3. For first tests on a new topology, I
use LocalCluster. It happened to me that I emitted tuples to an output
stream that I never declared (and thus never connected to). For this, I
would expect an error message in the log. However, I don't get anything,
which makes debugging very hard.

What do you think about it? Should I open a JIRA for it?

For real cluster deployment, I think the overhead of checking the output
stream ID is too large and one can easily see the problem in the UI --
the non-declared output streams that get tuples show up there. However,
for LocalCluster, there is no UI and an error log message would be nice.


-Matthias


Re: Emitting to non-declared output stream

Posted by "Matthias J. Sax" <mj...@apache.org>.
Hi,

I opened a PR for this two weeks ago. Would love to get some feedback
about it: https://github.com/apache/storm/pull/1031

-Matthias



Re: Emitting to non-declared output stream

Posted by Bobby Evans <ev...@yahoo-inc.com.INVALID>.
I think this is something that we should be able to handle efficiently in all cases.
https://github.com/apache/storm/blob/master/storm-core/src/clj/org/apache/storm/daemon/task.clj#L120-L167
creates the task function that is used for routing tuples to the correct downstream component.  Not surprisingly, emit-direct ignores the stream (which might make things more difficult for that case), but for the normal emit,
https://github.com/apache/storm/blob/master/storm-core/src/clj/org/apache/storm/daemon/task.clj#L153
looks up the grouper by the stream id.  My guess is that we are getting a null/nil back and the fast-list-iter is skipping over everything, when we should be able to do something with that null and throw an exception.

If it does look like we cannot do it without adding a lot of code to the critical path, then go ahead and do a config to turn it on/off.
 - Bobby 
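
To illustrate the idea discussed above, here is a minimal, self-contained Java sketch. It is not Storm's actual code (the real routing lives in task.clj); the class StreamRouter, its methods, and the strict flag are hypothetical names made up for this example. It only models the suspected behavior: a lookup by stream id that yields null for an undeclared stream, which is either silently treated as "no targets" or turned into an exception when a config-style switch is on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the per-task routing table (NOT a Storm class).
class StreamRouter {
    // Maps a declared stream id to the ids of its downstream components.
    private final Map<String, List<String>> streamToTargets = new HashMap<>();
    // Assumed on/off switch, mirroring the "config to turn it on/off" idea.
    private final boolean strict;

    StreamRouter(boolean strict) {
        this.strict = strict;
    }

    // Registers a downstream component for a declared stream.
    void declare(String streamId, String targetComponent) {
        streamToTargets
            .computeIfAbsent(streamId, k -> new ArrayList<>())
            .add(targetComponent);
    }

    // Returns the targets for a stream id. For an undeclared stream the map
    // lookup yields null: in lenient mode the tuple is silently dropped
    // (empty target list), in strict mode an exception is thrown instead.
    List<String> route(String streamId) {
        List<String> targets = streamToTargets.get(streamId);
        if (targets == null) {
            if (strict) {
                throw new IllegalArgumentException(
                    "Unknown stream ID: " + streamId);
            }
            return Collections.emptyList();
        }
        return targets;
    }

    public static void main(String[] args) {
        StreamRouter lenient = new StreamRouter(false);
        lenient.declare("default", "bolt-a");
        // Undeclared stream: nothing is routed and nothing is reported.
        System.out.println(lenient.route("typo").size());

        StreamRouter strictRouter = new StreamRouter(true);
        strictRouter.declare("default", "bolt-a");
        try {
            strictRouter.route("typo");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the check costs one null comparison on a lookup that happens anyway, which is the reason to hope it can stay on the critical path; whether that holds for the real code would have to be measured there.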
