Posted to commits@hudi.apache.org by "Sagar Sumit (Jira)" <ji...@apache.org> on 2022/09/08 13:20:00 UTC

[jira] [Commented] (HUDI-3391) presto and hive beeline fails to read MOR table w/ 2 or more array fields

    [ https://issues.apache.org/jira/browse/HUDI-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17601819#comment-17601819 ] 

Sagar Sumit commented on HUDI-3391:
-----------------------------------

I attempted to reproduce the issue with the latest Presto master. It is reproducible: queries on the *_ro table run fine, but snapshot queries on the *_rt table fail with an error. The cause is incorrect instantiation of the realtime file split in Presto. Locally, I upgraded the Hudi version in Presto to 0.12.0, and with that I can no longer reproduce the issue. There is a PR out: [https://github.com/prestodb/presto/pull/18209]

We can close this ticket.

> presto and hive beeline fails to read MOR table w/ 2 or more array fields
> -------------------------------------------------------------------------
>
>                 Key: HUDI-3391
>                 URL: https://issues.apache.org/jira/browse/HUDI-3391
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: dependencies, reader-core, trino-presto
>            Reporter: sivabalan narayanan
>            Assignee: Sagar Sumit
>            Priority: Blocker
>             Fix For: 0.12.1
>
>   Original Estimate: 4h
>  Remaining Estimate: 4h
>
> We have an issue reported by a user [here|https://github.com/apache/hudi/issues/2657]. It looks like with 0.10.0 or later, Spark datasource reads work, but Hive beeline does not. Querying through spark.sql (Hive table) works as well.
> Another related ticket: [https://github.com/apache/hudi/issues/3834#issuecomment-997307677]
>  
> Steps that I tried:
> [https://gist.github.com/nsivabalan/fdb8794104181f93b9268380c7f7f079]
> From beeline, you will encounter the exception below:
> {code:java}
> Failed with exception java.io.IOException:org.apache.hudi.org.apache.avro.SchemaParseException: Can't redefine: array {code}
> All linked tickets state that upgrading parquet to 1.11.0 or greater should work. We need to try it out with the latest master and go from there.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)