Posted to dev@hive.apache.org by "Ashutosh Chauhan (JIRA)" <ji...@apache.org> on 2012/12/19 16:51:13 UTC

[jira] [Updated] (HIVE-3106) Add option to make multi inserts more atomic

     [ https://issues.apache.org/jira/browse/HIVE-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-3106:
-----------------------------------

    Fix Version/s: 0.10.0
    
> Add option to make multi inserts more atomic
> --------------------------------------------
>
>                 Key: HIVE-3106
>                 URL: https://issues.apache.org/jira/browse/HIVE-3106
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Kevin Wilfong
>            Assignee: Kevin Wilfong
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3106.1.patch.txt, HIVE-3106.2.patch.txt
>
>
> Currently, with multi-insert queries, as soon as the output of one of the inserts is ready, the move task associated with that insert is run, creating the table/partition.  However, if concurrency is enabled, the lock on this table/partition is not released until the entire query finishes, which can be much later.
> This causes issues if, for example, a user is waiting for an output of the multi-insert query that is created long before the other outputs, and is checking for its existence using the metastore's Thrift methods (get_table/get_partition).  In that case, the user will run their query that uses the output, and it will time out trying to acquire the lock on the table/partition.
> If every move task depends on the parents of all the other move tasks, no move task runs until all of the outputs are ready, so the output creation will be much closer to atomic, relieving this problem.
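
To illustrate the dependency change described above, here is a minimal sketch in Python. It is a toy model, not Hive's code: the Task class, make_moves_share_parents, and run_order are hypothetical names made up for this example, and the scheduler simply runs the first ready task in list order.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)  # compare tasks by identity, not by field values
class Task:
    name: str
    parents: list = field(default_factory=list)  # tasks that must finish first


def make_moves_share_parents(move_tasks):
    """Make every move task depend on the parents of all move tasks."""
    shared = []
    for move in move_tasks:
        for parent in move.parents:
            if parent not in shared:
                shared.append(parent)
    for move in move_tasks:
        move.parents = list(shared)


def run_order(tasks):
    """Toy scheduler: repeatedly run the first task whose parents are done."""
    done, order, pending = set(), [], list(tasks)
    while pending:
        ready = next(
            (t for t in pending if all(p.name in done for p in t.parents)), None
        )
        if ready is None:
            raise ValueError("dependency cycle")
        pending.remove(ready)
        done.add(ready.name)
        order.append(ready.name)
    return order


def build_plan():
    """Two inserts: each stage produces an output that a move task publishes."""
    stage_a, stage_b = Task("stage_a"), Task("stage_b")
    move_a = Task("move_a", [stage_a])
    move_b = Task("move_b", [stage_b])
    return [stage_a, move_a, stage_b, move_b], [move_a, move_b]


# Current behaviour: move_a can run long before stage_b has finished.
tasks, moves = build_plan()
before = run_order(tasks)

# Proposed behaviour: both moves wait for both stages.
tasks, moves = build_plan()
make_moves_share_parents(moves)
after = run_order(tasks)

print(before)
print(after)
```

In this sketch the first order publishes move_a's output before stage_b has even started, while the second delays both moves until both stages are done, so the outputs appear close together and the window in which a visible table/partition is still locked shrinks.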

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira