Posted to dev@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2017/10/02 23:26:00 UTC

[jira] [Created] (HIVE-17675) verify SMB join with multiple inserts

Sergey Shelukhin created HIVE-17675:
---------------------------------------

             Summary: verify SMB join with multiple inserts
                 Key: HIVE-17675
                 URL: https://issues.apache.org/jira/browse/HIVE-17675
             Project: Hive
          Issue Type: Bug
            Reporter: Sergey Shelukhin


Hive has a family of joins that operate on sorted and bucketed tables. As far as I know, at least one of them (possibly all) relies on the table already being sorted, rather than sorting it itself.
If one runs an insert on such a table more than once without a merge, every bucket would have 2+ files that are individually sorted, but globally the table would no longer be sorted.
Would these joins still work, or disable themselves correctly, in this case, or could they produce incorrect results? We might need a q file to verify this.
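
To illustrate the concern, here is a minimal toy sketch in Python, not Hive code. It assumes a bucket that holds two individually sorted files (the names file_1 and file_2 and the key values are made up) and a naive merge join that trusts its inputs to be sorted. How Hive actually reads multiple files per bucket is exactly what this issue asks to verify, so treat this only as a picture of the failure mode.

```python
import heapq


def is_sorted(keys):
    """Return True if keys are in non-decreasing order."""
    return all(a <= b for a, b in zip(keys, keys[1:]))


def merge_join(left, right):
    """Naive merge join on unique keys; assumes both inputs are sorted."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return out


# Hypothetical contents of one bucket after two separate inserts:
# each file is sorted on its own.
file_1 = [1, 5, 9]
file_2 = [2, 3, 7]

# The same bucket of the other join side, fully sorted.
other_side = [2, 3, 5, 7]

# Reading the files back to back gives input that is not globally sorted.
concatenated = file_1 + file_2

# A k-way merge of the sorted files restores the global order.
merged = list(heapq.merge(file_1, file_2))

print(is_sorted(concatenated))               # False
print(merge_join(concatenated, other_side))  # [5]
print(merge_join(merged, other_side))        # [2, 3, 5, 7]
```

In this toy case the join over the concatenated files silently drops three of the four matching keys, which is the kind of incorrect result a q file would need to rule out.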


