Posted to issues@hive.apache.org by "BELUGA BEHR (JIRA)" <ji...@apache.org> on 2017/06/14 01:06:00 UTC

[jira] [Commented] (HIVE-16868) Query Hint For Primary Key / Foreign Key Joins

    [ https://issues.apache.org/jira/browse/HIVE-16868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048565#comment-16048565 ] 

BELUGA BEHR commented on HIVE-16868:
------------------------------------

I'm not sure whether it already does this, but the EXPLAIN plan should also indicate when this PK/FK optimization is being used.

> Query Hint For Primary Key / Foreign Key Joins
> ----------------------------------------------
>
>                 Key: HIVE-16868
>                 URL: https://issues.apache.org/jira/browse/HIVE-16868
>             Project: Hive
>          Issue Type: New Feature
>          Components: Physical Optimizer
>    Affects Versions: 2.1.1, 3.0.0
>            Reporter: BELUGA BEHR
>            Priority: Minor
>
> {code:title=org.apache.hadoop.hive.ql.stats.StatsUtils.java|borderStyle=solid}
>   /**
>    * Based on the provided column statistics and number of rows, this method infers if the column
>    * can be primary key. It checks if the difference between the min and max value is equal to
>    * number of rows specified.
>    * @param numRows - number of rows
>    * @param colStats - column statistics
>    */
>   public static void inferAndSetPrimaryKey(long numRows, List<ColStatistics> colStats) {
>     if (colStats != null) {
>       for (ColStatistics cs : colStats) {
>         if (cs != null && cs.getCountDistint() >= numRows) {
>           cs.setPrimaryKey(true);
>         }
>         else if (cs != null && cs.getRange() != null && cs.getRange().minValue != null &&
>             cs.getRange().maxValue != null) {
>           if (numRows ==
>               ((cs.getRange().maxValue.longValue() - cs.getRange().minValue.longValue()) + 1)) {
>             cs.setPrimaryKey(true);
>           }
>         }
>       }
>     }
>   }
> {code}
> This code is likely to miss many primary-key scenarios: once users delete rows from a table, the key range no longer matches the row count, so the range check fails. For example:
> {code}
> PK Values: 1,2,4
> Range = (max - min) + 1 = (4 - 1) + 1 = 4
> Rows = 3 (not equal to 4, so the column is not inferred as a PK)
> {code}
> Allow a query hint that lets the user declare a join as a PK-FK relationship.
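
To make the failure mode in the description concrete, here is a minimal standalone sketch of the range branch of the check. It does not use Hive's classes: PrimaryKeyRangeCheck, looksLikePrimaryKeyByRange and check are hypothetical names chosen for illustration, and only the comparison itself is taken from the quoted StatsUtils code.

```java
import java.util.Arrays;

public class PrimaryKeyRangeCheck {

    // Same comparison as the range branch of the quoted inferAndSetPrimaryKey:
    // the column is treated as a primary key only when the row count equals
    // (max - min) + 1, i.e. the keys form a dense run with no gaps.
    static boolean looksLikePrimaryKeyByRange(long numRows, long minValue, long maxValue) {
        return numRows == (maxValue - minValue) + 1;
    }

    // Hypothetical helper: derives row count, min and max from a non-empty
    // array of key values and applies the range check.
    static boolean check(long[] keys) {
        long min = Arrays.stream(keys).min().getAsLong();
        long max = Arrays.stream(keys).max().getAsLong();
        return looksLikePrimaryKeyByRange(keys.length, min, max);
    }

    public static void main(String[] args) {
        // Dense keys: 4 rows, range (4 - 1) + 1 = 4, so the check passes.
        System.out.println(check(new long[] {1, 2, 3, 4})); // true

        // Row with key 3 deleted: 3 rows, range still 4, so the check fails
        // even though every remaining value is unique.
        System.out.println(check(new long[] {1, 2, 4})); // false
    }
}
```

In the second case the column is still a valid primary key, but the range branch cannot see that. In the quoted code, such a column is only flagged if the distinct-count branch (getCountDistint() >= numRows) fires first.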


