Posted to issues@orc.apache.org by "László Bodor (Jira)" <ji...@apache.org> on 2019/10/14 09:09:00 UTC

[jira] [Resolved] (ORC-558) ORC-540 changes result in orc_remove_col.q

     [ https://issues.apache.org/jira/browse/ORC-558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor resolved ORC-558.
------------------------------
    Resolution: Won't Fix

> ORC-540 changes result in orc_remove_col.q
> ------------------------------------------
>
>                 Key: ORC-558
>                 URL: https://issues.apache.org/jira/browse/ORC-558
>             Project: ORC
>          Issue Type: Bug
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>         Attachments: ORC-558-hive.patch, orc_remove_cols_hive_logs.txt
>
>
> After ORC-540, all result rows disappear from the output of Hive's orc_remove_cols.q qtest.
> Reproduction (locally):
> 1. Reset ORC to 1.5.6 and cherry-pick ORC-540.
> 2. Build ORC, then build Hive (Hive depends on ORC 1.5.6, so no pom.xml change is needed).
> 3. Run the Hive qtest: mvn test -Dtest.output.overwrite=true -pl itests/qtest -Dtest=TestCliDriver -Dqfile=orc_remove_cols.q
> {code}
> Client Execution succeeded but contained differences (error code = 1) after executing orc_remove_cols.q
> 62a63,72
> > -1073279343 today
> > -1073051226 today
> > -1072910839 today
> > -1072081801 today
> > -1072076362 today
> > -1071480828 today
> > -1071363017 today
> > -1070883071 today
> > -1070551679 today
> > -1069736047 today
> 72a83,92
> > -1073279343 tomorrow
> > -1073051226 tomorrow
> > -1072910839 tomorrow
> > -1072081801 tomorrow
> > -1072076362 tomorrow
> > -1071480828 tomorrow
> > -1071363017 tomorrow
> > -1070883071 tomorrow
> > -1070551679 tomorrow
> > -1069736047 tomorrow
> {code}
> The qtest file containing the SELECT statements:
> {code}
> --! qt:dataset:alltypesorc
> set hive.vectorized.execution.enabled=false;
> SET hive.exec.schema.evolution=false;
> set hive.fetch.task.conversion=more;
> set hive.mapred.mode=nonstrict;
> CREATE TABLE orc_partitioned(a INT, b STRING) partitioned by (ds string) STORED AS ORC;
> insert into table orc_partitioned partition (ds = 'today') select cint, cstring1 from alltypesorc where cint is not null order by cint limit 10;
> insert into table orc_partitioned partition (ds = 'tomorrow') select cint, cstring1 from alltypesorc where cint is not null order by cint limit 10;
> -- Use the old change the SERDE trick to avoid ORC DDL checks... and remove a column on the end.
> ALTER TABLE orc_partitioned SET SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe';
> ALTER TABLE orc_partitioned REPLACE COLUMNS (a int);
> ALTER TABLE orc_partitioned SET SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde';
> SELECT * FROM orc_partitioned WHERE ds = 'today';
> SELECT * FROM orc_partitioned WHERE ds = 'tomorrow';
> {code}
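
As an aid for reading the diff quoted in the description, here is a minimal Python sketch. It assumes the output is in the classic "normal" diff format, where a header such as 62a63,72 means that lines 63-72 of the second file would have to be added after line 62 of the first. The helper name added_rows_by_partition is hypothetical and is not part of Hive or ORC. It counts the rows that appear only in the second file, grouped by the trailing partition value. The sample input is abbreviated from the log above.

```python
import re
from collections import Counter

# Header of an "add" hunk in normal diff format, e.g. "62a63,72".
ADD_HUNK = re.compile(r"^(\d+)a(\d+)(?:,(\d+))?$")


def added_rows_by_partition(diff_text):
    """Count rows present only in the second file, keyed by the last column."""
    counts = Counter()
    in_add_hunk = False
    for line in diff_text.splitlines():
        line = line.strip()
        if ADD_HUNK.match(line):
            in_add_hunk = True
        elif in_add_hunk and line.startswith("> "):
            # "> -1073279343 today" -> partition value "today"
            counts[line.split()[-1]] += 1
        else:
            in_add_hunk = False
    return dict(counts)


# Abbreviated sample (assumption: the same shape as the qtest diff above).
sample = """62a63,72
> -1073279343 today
> -1073051226 today
72a83,92
> -1073279343 tomorrow
"""

print(added_rows_by_partition(sample))
```

Run against the full diff in the description, this kind of count would be expected to show ten rows per partition, which matches the ten-row inserts in the qtest. That reading assumes the second file is the expected output, as the description implies.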



--
This message was sent by Atlassian Jira
(v8.3.4#803005)