Posted to issues@hive.apache.org by "Marta Kuczora (JIRA)" <ji...@apache.org> on 2017/08/01 08:01:00 UTC

[jira] [Commented] (HIVE-16845) INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE

    [ https://issues.apache.org/jira/browse/HIVE-16845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16108547#comment-16108547 ] 

Marta Kuczora commented on HIVE-16845:
--------------------------------------

I checked the failing tests; they are not related to this patch.

> INSERT OVERWRITE a table with dynamic partitions on S3 fails with NPE
> ---------------------------------------------------------------------
>
>                 Key: HIVE-16845
>                 URL: https://issues.apache.org/jira/browse/HIVE-16845
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.1
>            Reporter: Marta Kuczora
>            Assignee: Marta Kuczora
>         Attachments: HIVE-16845.1.patch, HIVE-16845.2.patch, HIVE-16845.3.patch, HIVE-16845.4.patch
>
>
> *How to reproduce*
> - Create a partitioned table on S3:
> {noformat}
> CREATE EXTERNAL TABLE s3table(user_id string COMMENT '', event_name string COMMENT '') PARTITIONED BY (reported_date string, product_id int) LOCATION 's3a://<bucket name>'; 
> {noformat}
> - Create a temp table:
> {noformat}
> create table tmp_table (id string, name string, date string, pid int) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile;
> {noformat}
> - Load the following rows into the tmp table (an example LOAD DATA command is sketched after these steps):
> {noformat}
> u1	value1	2017-04-10	10000
> u2	value2	2017-04-10	10000
> u3	value3	2017-04-10	10001
> {noformat}
> - Set the following parameters (example SET commands are shown after these steps):
> -- hive.exec.dynamic.partition.mode=nonstrict
> -- mapreduce.input.fileinputformat.split.maxsize=10
> -- hive.blobstore.optimizations.enabled=true
> -- hive.blobstore.use.blobstore.as.scratchdir=false
> -- hive.merge.mapfiles=true
> - Insert the rows from the temp table into the S3 table:
> {noformat}
> INSERT OVERWRITE TABLE s3table
> PARTITION (reported_date, product_id)
> SELECT
>   t.id as user_id,
>   t.name as event_name,
>   t.date as reported_date,
>   t.pid as product_id
> FROM tmp_table t;
> {noformat}
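> The rows can be loaded, for example, with a LOAD DATA statement; the local file path below is only a placeholder for wherever the tab-separated rows are saved:
> {noformat}
> -- /tmp/tmp_table_data.txt is a hypothetical path to a local file containing the tab-separated rows above
> LOAD DATA LOCAL INPATH '/tmp/tmp_table_data.txt' INTO TABLE tmp_table;
> {noformat}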
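> The parameters listed above can be set in the Hive session before running the insert, for example:
> {noformat}
> SET hive.exec.dynamic.partition.mode=nonstrict;
> SET mapreduce.input.fileinputformat.split.maxsize=10;
> SET hive.blobstore.optimizations.enabled=true;
> SET hive.blobstore.use.blobstore.as.scratchdir=false;
> SET hive.merge.mapfiles=true;
> {noformat}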
> An NPE will occur with the following stack trace:
> {noformat}
> 2017-05-08 21:32:50,607 ERROR org.apache.hive.service.cli.operation.Operation: [HiveServer2-Background-Pool: Thread-184028]: Error running hive query: 
> org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.ConditionalTask. null
> at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:400)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:239)
> at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:88)
> at org.apache.hive.service.cli.operation.SQLOperation$3$1.run(SQLOperation.java:293)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
> at org.apache.hive.service.cli.operation.SQLOperation$3.run(SQLOperation.java:306)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.generateActualTasks(ConditionalResolverMergeFiles.java:290)
> at org.apache.hadoop.hive.ql.plan.ConditionalResolverMergeFiles.getTasks(ConditionalResolverMergeFiles.java:175)
> at org.apache.hadoop.hive.ql.exec.ConditionalTask.execute(ConditionalTask.java:81)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:214)
> at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:100)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1977)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1690)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1422)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1206)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1201)
> at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:237)
> ... 11 more 
> {noformat}


