Posted to user@crunch.apache.org by venkatesh kavuluri <ve...@gmail.com> on 2015/02/13 01:12:46 UTC

call DoFn#increment from helper class

Hi All,

I am trying to figure out the best way to call the increment method (for
counters) from a helper class invoked from DoFn#process.

I have something like this.

class A extends DoFn<S, T> {

    Foo foo = new Foo(); // helper class

    @Override
    public void process(S input, Emitter<T> emitter) {
        foo.processAndEmitEvents(bar, emitter);
    }
}

I want to call the DoFn#increment method from Foo#processAndEmitEvents. Is
there a better way to go about it than having "Foo" extend "A" just to
get hold of DoFn#increment?

Thanks for the help.

Re: call DoFn#increment from helper class

Posted by venkatesh kavuluri <ve...@gmail.com>.
Thanks, Micah. We are only going to use MRPipeline, and passing
the TaskInputOutputContext enabled me to implement counters in the helper
class.

On Thu, Feb 12, 2015 at 4:27 PM, Micah Whitacre <mk...@gmail.com>
wrote:

> Venkatesh,
>   There isn't really a great way to do this without exposing your
> implementation of DoFn to the runtime container you are running in.  The
> DoFn.increment method hides whether the "increment" call alters a
> MapReduce counter or a Spark one.  If you know that you will only ever
> run in a MapReduce world, then you could pull the TaskInputOutputContext
> and increment a counter using that, but that seems like an abstraction
> leak of a concept only known to your selected Pipeline implementation.
>
> Micah

Re: call DoFn#increment from helper class

Posted by Micah Whitacre <mk...@gmail.com>.
Venkatesh,
  There isn't really a great way to do this without exposing your
implementation of DoFn to the runtime container you are running in.  The
DoFn.increment method hides whether the "increment" call alters a
MapReduce counter or a Spark one.  If you know that you will only ever
run in a MapReduce world, then you could pull the TaskInputOutputContext
and increment a counter using that, but that seems like an abstraction
leak of a concept only known to your selected Pipeline implementation.
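
One way to keep the helper unaware of the runtime is to have it depend on
a small counter interface rather than on DoFn or on any Hadoop class. The
sketch below is only an illustration under assumptions: CounterSink,
EventHelper, and MapCounterSink are hypothetical names that are not part
of Crunch, and the in-memory sink exists only so the helper can be
exercised outside a pipeline.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: the only thing the helper knows about counters.
@FunctionalInterface
interface CounterSink {
    void increment(String groupName, String counterName);
}

// Hypothetical helper with no dependency on DoFn or on any Hadoop class.
class EventHelper {
    private final CounterSink counters;

    EventHelper(CounterSink counters) {
        this.counters = counters;
    }

    // Counts "valid" and returns true for a non-blank event;
    // counts "rejected" and returns false otherwise.
    boolean accept(String event) {
        boolean ok = event != null && !event.trim().isEmpty();
        counters.increment("events", ok ? "valid" : "rejected");
        return ok;
    }
}

// In-memory sink used only to exercise the helper outside a pipeline.
class MapCounterSink implements CounterSink {
    final Map<String, Long> counts = new HashMap<>();

    @Override
    public void increment(String groupName, String counterName) {
        counts.merge(groupName + "/" + counterName, 1L, Long::sum);
    }
}
```

Inside a DoFn subclass, the sink could be a lambda that forwards to the
protected increment(), for example
(group, name) -> increment(group, name). Because Crunch serializes DoFn
instances, the helper would probably need to be created in initialize()
or be serializable itself.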

Micah


Re: call DoFn#increment from helper class

Posted by venkatesh kavuluri <ve...@gmail.com>.
Thanks, Josh. I followed the former approach, passing
the TaskInputOutputContext, and was able to implement the job
counters.

On Thu, Feb 12, 2015 at 4:19 PM, Josh Wills <jw...@cloudera.com> wrote:

> You could pass the TaskInputOutputContext to foo and call its Counter
> methods directly, something like:
>
> @Override
> public void process (S input, Emitter<T> emitter) {
>     foo.processAndEmitEvents(bar, emitter, getContext());
> }
>
> or you could add a public inc() method to your A class, pass "this" to
> processAndEmitEvents, and have that public inc() method call the protected
> increment() method.
>
> J
>
> --
> Director of Data Science
> Cloudera <http://www.cloudera.com>
> Twitter: @josh_wills <http://twitter.com/josh_wills>
>

Re: call DoFn#increment from helper class

Posted by Josh Wills <jw...@cloudera.com>.
You could pass the TaskInputOutputContext to foo and call its Counter
methods directly, something like:

@Override
public void process (S input, Emitter<T> emitter) {
    foo.processAndEmitEvents(bar, emitter, getContext());
}

or you could add a public inc() method to your A class, pass "this" to
processAndEmitEvents, and have that public inc() method call the protected
increment() method.
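
The second option could look roughly like the sketch below. To keep it
compilable on its own, DoFn and Emitter here are minimal stand-ins written
for this example, not the real Crunch classes; the counter names, the
comma-splitting logic, and the counterValue() hook are likewise
assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for Crunch's Emitter, only so the sketch compiles on its own.
interface Emitter<T> {
    void emit(T value);
}

// Stand-in for Crunch's DoFn: increment() is protected, as in the thread.
abstract class DoFn<S, T> {
    private final Map<String, Long> counters = new HashMap<>();

    public abstract void process(S input, Emitter<T> emitter);

    protected void increment(String groupName, String counterName) {
        counters.merge(groupName + "/" + counterName, 1L, Long::sum);
    }

    // Inspection hook for this sketch only; not part of Crunch.
    long counterValue(String groupName, String counterName) {
        return counters.getOrDefault(groupName + "/" + counterName, 0L);
    }
}

// Helper class: receives the DoFn subclass and calls its public inc().
class Foo {
    void processAndEmitEvents(String bar, Emitter<String> emitter, A fn) {
        for (String event : bar.split(",")) {
            if (event.isEmpty()) {
                fn.inc("events", "empty");
            } else {
                emitter.emit(event);
                fn.inc("events", "emitted");
            }
        }
    }
}

class A extends DoFn<String, String> {
    private final Foo foo = new Foo();

    @Override
    public void process(String input, Emitter<String> emitter) {
        foo.processAndEmitEvents(input, emitter, this);
    }

    // Public wrapper that exposes the protected increment() to the helper.
    public void inc(String groupName, String counterName) {
        increment(groupName, counterName);
    }
}
```

This keeps the helper independent of the MapReduce context, at the cost
of coupling Foo to the concrete class A.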

J





-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>