Posted to dev@sqoop.apache.org by "Mahesh Balakrishnan (Jira)" <ji...@apache.org> on 2020/10/30 21:12:00 UTC

[jira] [Commented] (SQOOP-3311) Importing as ORC file to support full ACID Hive tables

    [ https://issues.apache.org/jira/browse/SQOOP-3311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223903#comment-17223903 ] 

Mahesh Balakrishnan commented on SQOOP-3311:
--------------------------------------------

[~dvoros], all of this covers import, but what about export? I know this Jira is an old one, but there is a need to export ACID ORC tables. That currently does not work with hcat, and since export has no hive-table option, it will never work.

> Importing as ORC file to support full ACID Hive tables
> ------------------------------------------------------
>
>                 Key: SQOOP-3311
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3311
>             Project: Sqoop
>          Issue Type: New Feature
>          Components: hive-integration
>            Reporter: Daniel Voros
>            Assignee: Daniel Voros
>            Priority: Major
>
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default. This will probably result in increased usage of ACID tables and the need to support importing into ACID tables with Sqoop.
> Currently the only table format supporting full ACID tables is ORC.
> The easiest and most effective way to support importing into these tables would be to write out files as ORC and keep using LOAD DATA as we do for all other Hive tables (supported since HIVE-17361).
> A workaround could be to create the table as textfile (as before) and then CTAS from that. This would push the responsibility of creating the ORC format to Hive. However, it would result in writing every record twice: once in text format and once in ORC.
> Note that ORC is only necessary for full ACID tables. Insert-only (a.k.a. micromanaged) ACID tables can use an arbitrary file format.
> Supporting full ACID tables would also be the first step in making "lastmodified" incremental imports work with Hive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)