Posted to user@hive.apache.org by Frank Luo <jl...@merkleinc.com> on 2015/04/22 06:01:36 UTC

MapredContext not available when tez enabled

We have a UDF that collects some counts during Hive execution. It worked fine until Tez was enabled.

A bit of digging shows that the GenericUDF#configure method is not called. In this case, is it possible to get counters through other means, or do we have to implement a counter concept ourselves?

Thanks in advance

RE: MapredContext not available when tez enabled

Posted by Frank Luo <jl...@merkleinc.com>.
Cause found. 

I had "... limit 5" in the query. Once I take it out, the query runs fine. 

RE: MapredContext not available when tez enabled

Posted by Frank Luo <jl...@merkleinc.com>.
Gopal, 

Here is basically my code. I can clearly see that configure() is not called, and the Javadoc on GenericUDF#configure reads: "This is only called in runtime of MapRedTask." Also, based on my observation, the query is not executed as an M/R job, because YARN monitoring knows nothing about it. It seems the query is executed locally (forgive me for knowing very little about how Tez works internally).

=========================================== 

public class GenericUDFTracker extends GenericUDF {
    // LOGGER, elementOI0 and elementOI1 are declared in code omitted from this post.
    private MapredContext context;

    @Override
    public void configure(MapredContext aContext) {
        super.configure(aContext);
        context = aContext;
        LOGGER.info("configure setting context:{}", context); // this is never executed
    }

    @Override
    public Object evaluate(DeferredObject[] arguments)
        throws HiveException {
        LOGGER.info(ArrayUtils.toString(arguments));

        String arg0 = elementOI0.getPrimitiveJavaObject(arguments[0].get());
        String arg1 = elementOI1.getPrimitiveJavaObject(arguments[1].get());
        LOGGER.info("context = {}", context); // prints 'context = null'
        Reporter reporter = context.getReporter(); // NPE is thrown here because context is null
        LOGGER.info("reporter = {}", reporter);
        ...
    }
}
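
One way to keep a UDF like the one above from failing when configure() is never invoked might be to treat the context as optional and fall back to local counting. The sketch below only illustrates that guard pattern with the standard library. CounterSink, SafeCounterTracker and SafeCounterDemo are hypothetical stand-ins invented for this example; they are not Hive or Hadoop classes, and a real UDF would use MapredContext and Reporter in their place.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for whatever reports counters to the engine.
// This is NOT Hive's MapredContext or Hadoop's Reporter.
interface CounterSink {
    void increment(String group, String name, long amount);
}

// Sketch of a tracker that tolerates configure() never being called.
class SafeCounterTracker {
    private CounterSink sink; // stays null if configure() is skipped
    private final Map<String, Long> localCounts = new HashMap<>();

    // Plays the role of GenericUDF#configure: the engine may or may not call it.
    void configure(CounterSink aSink) {
        this.sink = aSink;
    }

    // Plays the role of the counting step inside evaluate().
    void count(String group, String name) {
        if (sink != null) {
            // A sink was provided, so report the increment to it.
            sink.increment(group, name, 1L);
        } else {
            // No sink: keep a local tally instead of throwing an NPE.
            localCounts.merge(group + "." + name, 1L, Long::sum);
        }
    }

    // Returns the local tally for a counter, or 0 if it was never counted locally.
    long localCount(String group, String name) {
        return localCounts.getOrDefault(group + "." + name, 0L);
    }
}

class SafeCounterDemo {
    public static void main(String[] args) {
        SafeCounterTracker tracker = new SafeCounterTracker();
        tracker.count("tracker", "rows"); // configure() was not called: counted locally
        System.out.println("local count = " + tracker.localCount("tracker", "rows"));

        tracker.configure((group, name, amount) ->
            System.out.println("reported " + group + "." + name + " += " + amount));
        tracker.count("tracker", "rows"); // now goes to the sink
    }
}
```

The local tally is only visible inside the process that ran the function, so it is a fallback for not crashing rather than a replacement for engine-level counters.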

Re: MapredContext not available when tez enabled

Posted by Gopal Vijayaraghavan <go...@apache.org>.
 
> A bit digging shows that GenericUDF#configure method was not called. So
>in this case, is it possible to get counters through other means, or we
>have to implement Counter concept ourselves?

You should be getting a TezContext object there (which inherits from
MapredContext).

And the method should get called depending on a needConfigure() check - if
it is not getting called, that is very strange.

Cheers,
Gopal