Posted to dev@hive.apache.org by "BELUGA BEHR (JIRA)" <ji...@apache.org> on 2017/06/09 14:21:18 UTC
[jira] [Created] (HIVE-16868) Query Hint For Primary Key / Foreign Key Joins
BELUGA BEHR created HIVE-16868:
----------------------------------
Summary: Query Hint For Primary Key / Foreign Key Joins
Key: HIVE-16868
URL: https://issues.apache.org/jira/browse/HIVE-16868
Project: Hive
Issue Type: New Feature
Components: Physical Optimizer
Affects Versions: 2.1.1, 3.0.0
Reporter: BELUGA BEHR
Priority: Minor
{code:title=org.apache.hadoop.hive.ql.stats.StatsUtils.java|borderStyle=solid}
/**
 * Based on the provided column statistics and number of rows, this method infers if the column
 * can be primary key. It checks if the difference between the min and max value is equal to
 * number of rows specified.
 * @param numRows - number of rows
 * @param colStats - column statistics
 */
public static void inferAndSetPrimaryKey(long numRows, List<ColStatistics> colStats) {
  if (colStats != null) {
    for (ColStatistics cs : colStats) {
      if (cs != null && cs.getCountDistint() >= numRows) {
        cs.setPrimaryKey(true);
      } else if (cs != null && cs.getRange() != null && cs.getRange().minValue != null &&
          cs.getRange().maxValue != null) {
        if (numRows ==
            ((cs.getRange().maxValue.longValue() - cs.getRange().minValue.longValue()) + 1)) {
          cs.setPrimaryKey(true);
        }
      }
    }
  }
}
{code}
This code is likely to miss many primary key columns. Users may delete rows from their tables over time, and once a key value is missing, the min/max range of the column no longer equals the row count, so the range check fails.
{code}
PK Values: 1,2,4
Range = (max - min) + 1 = (4 - 1) + 1 = 4
Rows = 3
{code}
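The miss can be reproduced with a minimal standalone sketch of the range comparison. The class and method names here are hypothetical and are not part of Hive. Only the `(max - min) + 1 == numRows` test mirrors the range branch of the quoted `inferAndSetPrimaryKey`, and the distinct-count branch is deliberately left out.

```java
import java.util.Arrays;

public class PkRangeCheck {
    // Hypothetical stand-in for the range branch only: the column is treated
    // as a primary key when (max - min) + 1 equals the number of rows.
    // Assumes a non-empty array; an empty one would throw NoSuchElementException.
    public static boolean looksLikePrimaryKeyByRange(long[] values) {
        long min = Arrays.stream(values).min().getAsLong();
        long max = Arrays.stream(values).max().getAsLong();
        long numRows = values.length;
        return numRows == (max - min) + 1;
    }

    public static void main(String[] args) {
        // Dense keys: range (3 - 1) + 1 = 3 matches 3 rows.
        System.out.println(looksLikePrimaryKeyByRange(new long[] {1, 2, 3})); // true
        // Key 3 was deleted: range (4 - 1) + 1 = 4 does not match 3 rows.
        System.out.println(looksLikePrimaryKeyByRange(new long[] {1, 2, 4})); // false
    }
}
```

The second call corresponds to the example above: the values are still unique, but the gap left by the deleted row makes the range check reject the column.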
Proposal: add a query hint that lets the user declare that a join is a PK-FK relationship.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)