Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/02/20 03:37:00 UTC

[jira] [Created] (IMPALA-11136) Improve readability of ORC date out of range warnings

Quanlong Huang created IMPALA-11136:
---------------------------------------

             Summary: Improve readability of ORC date out of range warnings
                 Key: IMPALA-11136
                 URL: https://issues.apache.org/jira/browse/IMPALA-11136
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Quanlong Huang
            Assignee: Pranav Yogi Lodha


When selecting from {{date_tbl}}, I got the following warning:
{code:java}
[localhost:21050] functional_orc_def> select * from date_tbl order by id_col;
Query: select * from date_tbl order by id_col
Query submitted at: 2022-02-20 11:19:36 (Coordinator: http://quanlong-OptiPlex-BJ:25000)
Query progress can be monitored at: http://quanlong-OptiPlex-BJ:25000/query_plan?query_id=a14cc5049351c48a:703197c000000000
+--------+------------+------------+
| id_col | date_col   | date_part  |
+--------+------------+------------+
| 0      | NULL       | 0001-01-01 |
| 1      | 0001-12-29 | 0001-01-01 |
| 2      | 0001-12-30 | 0001-01-01 |
| 3      | 1400-01-08 | 0001-01-01 |
| 4      | 2017-11-28 | 0001-01-01 |
| 5      | 9999-12-31 | 0001-01-01 |
| 6      | NULL       | 0001-01-01 |
| 10     | 2017-11-28 | 1399-06-27 |
| 11     | NULL       | 1399-06-27 |
| 12     | 2018-12-31 | 1399-06-27 |
| 20     | 0001-06-19 | 2017-11-27 |
| 21     | 0001-06-20 | 2017-11-27 |
| 22     | 0001-06-21 | 2017-11-27 |
| 23     | 0001-06-22 | 2017-11-27 |
| 24     | 0001-06-23 | 2017-11-27 |
| 25     | 0001-06-24 | 2017-11-27 |
| 26     | 0001-06-25 | 2017-11-27 |
| 27     | 0001-06-26 | 2017-11-27 |
| 28     | 0001-06-27 | 2017-11-27 |
| 29     | 2017-11-28 | 2017-11-27 |
| 30     | 9999-12-01 | 9999-12-31 |
| 31     | 9999-12-31 | 9999-12-31 |
+--------+------------+------------+
WARNINGS: ORC file 'hdfs://localhost:20500/test-warehouse/managed/date_tbl_orc_def/date_part=0001-01-01/base_0000005/bucket_00000_0' column '8' contains an out of range date. The valid date range is 0001-01-01..9999-12-31.
{code}
This table has only 3 columns, so it is unclear to users what column '8' refers to. It is actually the ORC type id of the column, not the column index in the table schema. The two differ here because the table is ACID-enabled.
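
To illustrate where the 8 comes from: ORC numbers types by a pre-order walk of the file schema (root = 0), and Hive ACID files wrap the user columns in a {{row}} struct that follows five metadata fields. The sketch below is illustrative Python, not Impala code. The function name is hypothetical, the column types (int, date) are assumed, and {{date_part}} is assumed to be absent from the file because it is a partition column. Only struct nesting is handled.

```python
def assign_type_ids(schema, prefix="", ids=None):
    """Return {type_id: dotted field path} using pre-order numbering (root = 0)."""
    if ids is None:
        ids = {}
    # The current node takes the next free id before any of its children.
    ids[len(ids)] = prefix or "<root>"
    if isinstance(schema, dict):  # a struct: number its fields depth-first
        for field, child in schema.items():
            child_path = f"{prefix}.{field}" if prefix else field
            assign_type_ids(child, child_path, ids)
    return ids


# File schema of a Hive ACID ORC file; the user columns live under "row".
# Column types here are assumptions made for the example.
acid_file_schema = {
    "operation": "int",
    "originalTransaction": "bigint",
    "bucket": "int",
    "rowId": "bigint",
    "currentTransaction": "bigint",
    "row": {"id_col": "int", "date_col": "date"},
}

type_ids = assign_type_ids(acid_file_schema)
print(type_ids[8])
```

Under these assumptions, type id 8 resolves to {{row.date_col}}, which matches the only date column stored in the file.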

The ORC scanner's warnings for out-of-range timestamps have the same issue.

The Parquet scanner produces a better warning that includes the column name. The ORC scanner should do the same.
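
A rough sketch of what the improved message could look like, assuming the scanner has some mapping from ORC type id to table column name. This is illustrative Python rather than the backend's C++; the function name, its parameters, and the fallback wording are all hypothetical, and the file path is shortened to its last component. The base message text is taken from the warning above.

```python
def out_of_range_warning(filename, type_id, names_by_type_id,
                         kind="date", valid_range="0001-01-01..9999-12-31"):
    """Build the warning text, preferring the column name over the raw type id."""
    name = names_by_type_id.get(type_id)
    if name is None:
        # No name known: at least say that the number is an ORC type id.
        column = f"with ORC type id {type_id}"
    else:
        column = f"'{name}'"
    return (f"ORC file '{filename}' column {column} contains an out of range "
            f"{kind}. The valid {kind} range is {valid_range}.")


# Hypothetical mapping for the ACID file in this report.
msg = out_of_range_warning("bucket_00000_0", 8, {7: "id_col", 8: "date_col"})
print(msg)
```

With the name available, the message refers to column 'date_col' instead of column '8'. The same helper could cover the timestamp warning by passing a different {{kind}} and {{valid_range}}.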


