Posted to issues@hive.apache.org by "John LeBrun (JIRA)" <ji...@apache.org> on 2019/06/10 13:48:00 UTC

[jira] [Updated] (HIVE-21853) NPE in StatsUtils.getWritableSize() when value passed in is null

     [ https://issues.apache.org/jira/browse/HIVE-21853?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John LeBrun updated HIVE-21853:
-------------------------------
    Attachment: HIVE-21853.patch

> NPE in StatsUtils.getWritableSize() when value passed in is null
> ----------------------------------------------------------------
>
>                 Key: HIVE-21853
>                 URL: https://issues.apache.org/jira/browse/HIVE-21853
>             Project: Hive
>          Issue Type: Bug
>         Environment: Hortonworks 
>  * Ambari version 2.7.3.0
>  * HDP stack version 3.1
>  * HDP stack repo version 3.1.0.0
>  * stack vdf version 3.1.0.0-78
>            Reporter: John LeBrun
>            Priority: Major
>         Attachments: HIVE-21853.patch, HIVE21853.java
>
>
> The getWritableSize(ObjectInspector oi, Object value) method in the org.apache.hadoop.hive.ql.stats.StatsUtils class fails with an NPE when its second parameter (Object value) is null.
> Attached is a patch with a unit test and a fix.
> The issue was originally found when running a UDF query against a Hortonworks cluster with HDP 3.1 running Hive 3.1.0. It occurs when the UDF is executed against a cluster using the Tez execution engine.
> Beeline Hive configuration:
> set hive.execution.engine=tez;
> set hive.fetch.task.conversion=none;
> Attached is sample code with an implementation of a simple UDF that reproduces the behavior.
> Steps to reproduce,
> on a Hortonworks cluster with HDP 3.1 deployed:
> -start a Beeline Hive session
> -set the Hive configuration shown above
> -add the jar containing the UDF from the sample code
> -create a table containing one string column
>     create table tmptable(col1 string);
>     insert into table tmptable values ('somestring');
> -create function bugUdf as 'BugUDF';
> -select bugUdf from tmptable;
> This results in a NullPointerException similar to the following:
> ql.Driver ()) - FAILED: NullPointerException null
> java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getWritableSize(StatsUtils.java:1373) 
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfStruct(StatsUtils.java:1356) 
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getSizeOfComplexTypes(StatsUtils.java:1212) 
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getAvgColLenOf(StatsUtils.java:1140) 
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatisticsFromExpression(StatsUtils.java:1584) 
> at org.apache.hadoop.hive.ql.stats.StatsUtils.getColStatisticsFromExprMap(StatsUtils.java:1424) 
> at org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory$SelectStatsRule.process(StatsRulesProcFactory.java:196) 
> at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:90) 
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatchAndReturn(DefaultGraphWalker.java:105) 
> at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:89) 
> at org.apache.hadoop.hive.ql.lib.LevelOrderWalker.startWalking(LevelOrderWalker.java:122) 
> at org.apache.hadoop.hive.ql.optimizer.stats.annotation.AnnotateWithStatistics.transform(AnnotateWithStatistics.java:78) 
> at org.apache.hadoop.hive.ql.parse.TezCompiler.runStatsAnnotation(TezCompiler.java:397) 
> at org.apache.hadoop.hive.ql.parse.TezCompiler.optimizeOperatorPlan(TezCompiler.java:161) 
> at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:148) 
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:12443) 
> at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:358) 
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:285) 
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:664) 
> at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1863) 
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1810) 
> at org.apache.hadoop.hive.ql.Driver.compileAndRespond(Driver.java:1805) 
> at org.apache.hadoop.hive.ql.reexec.ReExecDriver.compileAndRespond(ReExecDriver.java:126) 
> at org.apache.hive.service.cli.operation.SQLOperation.prepare(SQLOperation.java:197) 
> at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:262) 
> at org.apache.hive.service.cli.operation.Operation.run(Operation.java:247) 
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal(HiveSessionImpl.java:541) 
> at org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync(HiveSessionImpl.java:527) 
> at org.apache.hive.service.cli.CLIService.executeStatementAsync(CLIService.java:315) 
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:562) 
> at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1557) 
> at org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1542) 
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) 
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
> at java.lang.Thread.run(Thread.java:748)
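
The trace above shows getSizeOfStruct handing a field value to getWritableSize, which then dereferences it. The following is a minimal, self-contained sketch of one plausible guard, assuming a null value should simply contribute zero bytes to the size estimate. It does not use Hive's classes and is not the attached patch: NullSafeSizeSketch, sizeOfValue, and sizeOfStruct are hypothetical names chosen for illustration, and the real fix may handle null differently.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the size-estimation path in the stack trace.
// None of these names exist in Hive; they only illustrate the null guard.
public class NullSafeSizeSketch {

    // Estimates the size in bytes of a single value.
    // Assumption: a null value contributes 0 bytes instead of throwing.
    public static long sizeOfValue(Object value) {
        if (value == null) {
            return 0L; // guard: never dereference a null value
        }
        if (value instanceof String) {
            return ((String) value).getBytes(StandardCharsets.UTF_8).length;
        }
        if (value instanceof Integer) {
            return Integer.BYTES;
        }
        if (value instanceof Long) {
            return Long.BYTES;
        }
        return 0L; // types not covered by this sketch contribute nothing
    }

    // Sums the per-field estimates, analogous to sizing a struct field by field.
    public static long sizeOfStruct(Object[] fields) {
        long total = 0L;
        for (Object field : fields) {
            total += sizeOfValue(field);
        }
        return total;
    }

    public static void main(String[] args) {
        // A struct with a null field, like the one the UDF appears to produce.
        Object[] row = {"somestring", null, 42};
        System.out.println(sizeOfStruct(row)); // prints 14 (10 + 0 + 4)
    }
}
```

Returning zero keeps statistics annotation from failing at compile time; whether zero is the right estimate for a null field is a design choice the actual patch may make differently.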



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)