Posted to issues-all@impala.apache.org by "WangSheng (Jira)" <ji...@apache.org> on 2020/07/17 11:25:00 UTC

[jira] [Updated] (IMPALA-9967) Scan orc failed when table contains timestamp column

     [ https://issues.apache.org/jira/browse/IMPALA-9967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

WangSheng updated IMPALA-9967:
------------------------------
    Description: 
Recently, while testing Impala queries against an ORC table, I found that the scan fails when the table contains a timestamp column. Here is the exception:

{code:java}
I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f7200000002] Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
    @          0x1c9f753  impala::Status::Status()
    @          0x27aa049  impala::HdfsOrcScanner::ProcessFileTail()
    @          0x27a7fb3  impala::HdfsOrcScanner::Open()
    @          0x27365fe  impala::HdfsScanNodeBase::CreateAndOpenScannerHelper()
    @          0x28cb379  impala::HdfsScanNode::ProcessSplit()
    @          0x28caa7d  impala::HdfsScanNode::ScannerThread()
    @          0x28c9de5  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
    @          0x28cc19e  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x2053333  boost::function0<>::operator()()
    @          0x2675d93  impala::Thread::SuperviseThread()
    @          0x267dd30  boost::_bi::list5<>::operator()<>()
    @          0x267dc54  boost::_bi::bind_t<>::operator()()
    @          0x267dc15  boost::detail::thread_data<>::run()
    @          0x3e3c3c1  thread_proxy
    @     0x7f32360336b9  start_thread
    @     0x7f3232bfe41c  clone
I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] 68436a6e0883be84:53877f7200000002] Error preparing scanner for scan range hdfs://localhost:20500/test-warehouse/iceberg_test/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc(0:582). Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
{code}

When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data was generated by Spark.

  was:
Recently, while testing Impala queries against an ORC table, I found that the scan fails when the table contains a timestamp column. Here is the exception:

{code:java}
I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f7200000002] Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/iceberg_test/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
    @          0x1c9f753  impala::Status::Status()
    @          0x27aa049  impala::HdfsOrcScanner::ProcessFileTail()
    @          0x27a7fb3  impala::HdfsOrcScanner::Open()
    @          0x27365fe  impala::HdfsScanNodeBase::CreateAndOpenScannerHelper()
    @          0x28cb379  impala::HdfsScanNode::ProcessSplit()
    @          0x28caa7d  impala::HdfsScanNode::ScannerThread()
    @          0x28c9de5  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
    @          0x28cc19e  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
    @          0x2053333  boost::function0<>::operator()()
    @          0x2675d93  impala::Thread::SuperviseThread()
    @          0x267dd30  boost::_bi::list5<>::operator()<>()
    @          0x267dc54  boost::_bi::bind_t<>::operator()()
    @          0x267dc15  boost::detail::thread_data<>::run()
    @          0x3e3c3c1  thread_proxy
    @     0x7f32360336b9  start_thread
    @     0x7f3232bfe41c  clone
I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] 68436a6e0883be84:53877f7200000002] Error preparing scanner for scan range hdfs://localhost:20500/test-warehouse/iceberg_test/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc(0:582). Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
{code}

When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data was generated by Spark.


> Scan orc failed when table contains timestamp column
> ----------------------------------------------------
>
>                 Key: IMPALA-9967
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9967
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 4.0
>            Reporter: WangSheng
>            Priority: Minor
>
> Recently, while testing Impala queries against an ORC table, I found that the scan fails when the table contains a timestamp column. Here is the exception:
> {code:java}
> I0717 08:31:47.179124 78759 status.cc:129] 68436a6e0883be84:53877f7200000002] Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
>     @          0x1c9f753  impala::Status::Status()
>     @          0x27aa049  impala::HdfsOrcScanner::ProcessFileTail()
>     @          0x27a7fb3  impala::HdfsOrcScanner::Open()
>     @          0x27365fe  impala::HdfsScanNodeBase::CreateAndOpenScannerHelper()
>     @          0x28cb379  impala::HdfsScanNode::ProcessSplit()
>     @          0x28caa7d  impala::HdfsScanNode::ScannerThread()
>     @          0x28c9de5  _ZZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS_18ThreadResourcePoolEENKUlvE_clEv
>     @          0x28cc19e  _ZN5boost6detail8function26void_function_obj_invoker0IZN6impala12HdfsScanNode22ThreadTokenAvailableCbEPNS3_18ThreadResourcePoolEEUlvE_vE6invokeERNS1_15function_bufferE
>     @          0x2053333  boost::function0<>::operator()()
>     @          0x2675d93  impala::Thread::SuperviseThread()
>     @          0x267dd30  boost::_bi::list5<>::operator()<>()
>     @          0x267dc54  boost::_bi::bind_t<>::operator()()
>     @          0x267dc15  boost::detail::thread_data<>::run()
>     @          0x3e3c3c1  thread_proxy
>     @     0x7f32360336b9  start_thread
>     @     0x7f3232bfe41c  clone
> I0717 08:31:47.325670 78759 hdfs-scan-node.cc:490] 68436a6e0883be84:53877f7200000002] Error preparing scanner for scan range hdfs://localhost:20500/test-warehouse/iceberg_test/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc(0:582). Encountered parse error in tail of ORC file hdfs://localhost:20500/test-warehouse/orc_scanner_test/00031-31-ac3cccf1-3ce7-40c6-933c-4fbd7bd57550-00000.orc: Unknown type kind
> {code}
> When I remove the timestamp column from the table and regenerate the test data, the query succeeds. By the way, my test data was generated by Spark.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
