Posted to dev@hive.apache.org by "Chris Veregge (Jira)" <ji...@apache.org> on 2020/05/19 17:59:00 UTC
[jira] [Created] (HIVE-23511) percentile_approx throws error when using CTAS statement
Chris Veregge created HIVE-23511:
------------------------------------
Summary: percentile_approx throws error when using CTAS statement
Key: HIVE-23511
URL: https://issues.apache.org/jira/browse/HIVE-23511
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.1.0
Environment: [vereggcadmin@ip-10-40-51-103 ~]$ hive --version
Hive 2.1.0-amzn-0
Subversion git://ip-10-169-254-27/workspace/workspace/bigtop.release-rpm-5.2.0/build/hive/rpm/BUILD/apache-hive-2.1.0-amzn-0-src -r 418fa8c602f2a4b153c1a89806305f6b5a27a524
Compiled by ec2-user on Wed Nov 16 03:10:37 UTC 2016
From source with checksum 64a5b18bfaf894a6b2f1cd14a0654e92
Reporter: Chris Veregge
CTAS (CREATE TABLE AS SELECT) statements appear to fail when percentile_approx is called with an array of floating-point literals as its second argument.
Here is example code that demonstrates the issue.
This statement works:
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample;
But wrapping the same query in a CTAS statement results in an error:
create table ptile_table as
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample;
FAILED: UDFArgumentTypeException The second argument must be a constant, but array<double> was passed instead.
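A possible workaround, which I have not verified on 2.1.0-amzn-0, is to avoid passing an array as the second argument and instead call percentile_approx once per percentile with a scalar literal, then rebuild the array. This is only a sketch: it assumes the failure is specific to the array-constant form, and the table name ptile_table_workaround is hypothetical.

```sql
-- Untested sketch: each call receives a scalar double literal rather than
-- array(0.1,0.5,0.9), so the "must be a constant" check may not be triggered.
-- ptile_table_workaround is a hypothetical name used only for illustration.
create table ptile_table_workaround as
select
  array(
    percentile_approx(num, 0.1),
    percentile_approx(num, 0.5),
    percentile_approx(num, 0.9)
  ) as ptile
from sample;
```

If this form succeeds, it would suggest the problem lies in how the array literal is treated as a constant under CTAS rather than in percentile_approx itself.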
Here is verbose log output, including the statement that creates the table "sample", which is just a single column of float values:
Logging initialized using configuration in file:/etc/hive/conf.dist/hive-log4j2.properties Async: false
set hive.cli.print.header=true
set hive.resultset.use.unique.column.names=false
set hive.exec.parallel=false
set hive.groupby.orderby.position.alias = true
set mapreduce.job.reduce.slowstart.completedmaps = 0.95
set hive.execution.engine=tez
set hive.tez.auto.reducer.parallelism=true
set hive.default.fileformat=orc
set hive.default.fileformat.managed=orc
create table if not exists sample as
select rand() as num
from ucp.dim_date limit 100
OK
Time taken: 0.99 seconds
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample
Query ID = vereggcadmin_20200519172814_e2cabf47-d8e4-45a9-b5c5-87e323ee8668
Total jobs = 1
Launching Job 1 out of 1
Waiting for Tez session and AM to be ready...
Status: Running (Executing on YARN cluster with App id application_1577992969986_117744)
Map 1: 0/1 Reducer 2: 0/1
Map 1: 0/1 Reducer 2: 0/1
Map 1: 0(+1)/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 0/1
Map 1: 1/1 Reducer 2: 0(+1)/1
Map 1: 1/1 Reducer 2: 1/1
OK
ptile
[0.0539687133111435,0.5168283485290134,0.8464088546353761]
Time taken: 14.694 seconds, Fetched: 1 row(s)
create table ptile_table as
select
percentile_approx(num,array(0.1,0.5,0.9)) as ptile
from sample
FAILED: UDFArgumentTypeException The second argument must be a constant, but array<double> was passed instead.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)