Posted to issues@hive.apache.org by "Sergey Shelukhin (JIRA)" <ji...@apache.org> on 2017/02/28 01:39:45 UTC

[jira] [Assigned] (HIVE-16051) MM tables: skewjoin test fails

     [ https://issues.apache.org/jira/browse/HIVE-16051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergey Shelukhin reassigned HIVE-16051:
---------------------------------------

    Assignee: Sergey Shelukhin

> MM tables: skewjoin test fails
> ------------------------------
>
>                 Key: HIVE-16051
>                 URL: https://issues.apache.org/jira/browse/HIVE-16051
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> {noformat}
> set hive.optimize.skewjoin = true;
> set hive.skewjoin.key = 2;
> set hive.optimize.metadataonly=false;
> CREATE TABLE dest_j1(key INT, value STRING) STORED AS TEXTFILE tblproperties ("transactional"="true", "transactional_properties"="insert_only");
> FROM src src1 JOIN src src2 ON (src1.key = src2.key)
> INSERT OVERWRITE TABLE dest_j1 SELECT src1.key, src2.value;
> select count(distinct key) from dest_j1;
> {noformat}
> The results differ between the MM table and a non-MM table.
> This probably has something to do with how skewjoin handles files. However, judging by the MM/debugging logs, there are no suspicious deletes, and everything looks the same in both cases: all the logging for the skewjoin row containers and related code is identical between the two runs (except for the numbers/GUIDs; the number of files, the paths, etc. are all the same). So it is not clear what is going on. A dfs dump could probably answer this question, but it currently does not work for me on q files.
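>
> For comparison, a non-MM baseline of the same query could be sketched as follows. This is only an illustration: the table name dest_j1_plain is hypothetical and does not come from the test; the only intended difference from the script above is that the transactional table properties are omitted.
> {noformat}
> set hive.optimize.skewjoin = true;
> set hive.skewjoin.key = 2;
> set hive.optimize.metadataonly=false;
> -- dest_j1_plain is a hypothetical name; no tblproperties, so this is a regular (non-MM) table
> CREATE TABLE dest_j1_plain(key INT, value STRING) STORED AS TEXTFILE;
> FROM src src1 JOIN src src2 ON (src1.key = src2.key)
> INSERT OVERWRITE TABLE dest_j1_plain SELECT src1.key, src2.value;
> -- compare this count against the one returned for dest_j1
> select count(distinct key) from dest_j1_plain;
> {noformat}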



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)