Posted to dev@storm.apache.org by Ankur Garg <an...@gmail.com> on 2015/09/11 12:51:43 UTC

Significance of boolean direct Output Fields declare (boolean direct, Fields fields)

Hi,

Looking at the OutputFieldsDeclarer interface, I see there is an overloaded
declare method that takes a boolean flag, direct.

If I use the plain declare(Fields fields) method, it sets this
boolean flag to false.

I am not sure how Storm interprets this flag internally while
processing with spouts and bolts.

Can somebody explain the significance of this flag?

Thanks
Ankur

Re: Significance of boolean direct Output Fields declare (boolean direct, Fields fields)

Posted by Ankur Garg <an...@gmail.com>.
Thanks Matthias, this clears things up for me.



Re: Significance of boolean direct Output Fields declare (boolean direct, Fields fields)

Posted by "Matthias J. Sax" <mj...@apache.org>.
Hi Ankur,

If you declare a direct stream (setting the flag to true), you need to
emit tuples via the

  collector.emitDirect(...)

methods (collector.emit(...) is not allowed for direct streams). Those
methods require you to specify the ID of the consumer task that should
receive the tuple.

Furthermore, when connecting a consumer to a direct stream, you need to
specify

  builder.setBolt(....).directGrouping(...)

No other grouping is allowed on a direct stream.

The advantage of direct streams is that you have fine-grained control
over the data distribution from producer to consumer. You can implement
any imaginable distribution pattern. Of course, direct streams are much
more difficult to handle. For example, you need to know the task IDs of
the subscribed consumers (those can be looked up on the TopologyContext
given in Bolt.prepare).
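
As a rough illustration of such a custom distribution pattern, here is a
minimal sketch in plain Java with no Storm dependency. The class name
DirectRouting, the method chooseTask, and the task IDs 7, 8, 9 are all
made up for this example. The idea is that the first consumer task is
reserved for priority tuples and all other tuples are spread over the
remaining tasks by key. In a real bolt, the list of task IDs would
presumably come from the TopologyContext (for example via
getComponentTasks(componentId)), and the chosen ID would be passed to
collector.emitDirect(taskId, ...).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: picks the consumer task that receives a tuple on a direct stream. */
public class DirectRouting {

    /**
     * Chooses the target task ID for one tuple.
     *
     * consumerTaskIds stands in for the task IDs of the subscribed consumer;
     * in Storm these would be looked up on the TopologyContext in prepare().
     */
    public static int chooseTask(List<Integer> consumerTaskIds, String key, boolean priority) {
        if (consumerTaskIds.isEmpty()) {
            throw new IllegalArgumentException("no consumer tasks subscribed");
        }
        // Priority tuples always go to the first task; with a single task
        // there is nothing to choose from, so it takes everything.
        if (priority || consumerTaskIds.size() == 1) {
            return consumerTaskIds.get(0);
        }
        // Everything else is spread over the remaining tasks by key.
        // floorMod keeps the index non-negative for negative hash codes.
        int others = consumerTaskIds.size() - 1;
        int index = 1 + Math.floorMod(key.hashCode(), others);
        return consumerTaskIds.get(index);
    }

    public static void main(String[] args) {
        List<Integer> consumerTaskIds = Arrays.asList(7, 8, 9); // made-up task IDs
        String[] keys = {"alpha", "beta", "alpha"};
        for (String key : keys) {
            int taskId = chooseTask(consumerTaskIds, key, false);
            // In a real bolt, this is where collector.emitDirect(taskId, ...) would go.
            System.out.println(key + " -> task " + taskId);
        }
        System.out.println("priority -> task " + chooseTask(consumerTaskIds, "alpha", true));
    }
}
```

The same key always maps to the same task as long as the list of task IDs
does not change, which is an assumption of this sketch.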

-Matthias

