Posted to common-dev@hadoop.apache.org by "Edward Yoon (JIRA)" <ji...@apache.org> on 2007/12/01 09:15:43 UTC
[jira] Updated: (HADOOP-2021) Sort Join Implementation
[ https://issues.apache.org/jira/browse/HADOOP-2021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Edward Yoon updated HADOOP-2021:
--------------------------------
Status: Open (was: Patch Available)
Canceling the patch; it does not appear to have been registered.
> Sort Join Implementation
> ------------------------
>
> Key: HADOOP-2021
> URL: https://issues.apache.org/jira/browse/HADOOP-2021
> Project: Hadoop
> Issue Type: Sub-task
> Components: contrib/hbase
> Affects Versions: 0.14.1
> Environment: all environments
> Reporter: Edward Yoon
> Assignee: Edward Yoon
> Priority: Minor
> Fix For: 0.16.0
>
> Attachments: 2021_v01.patch, 2021_v02.txt, 2021_v04.patch, 2021_v05, 2021_v05.patch, 2021_v06.patch
>
>
> If we don't have an index for a domain in the join, we can still improve on the nested-loop join by using a sort join.
> {code}
> R1 = table('movieLog_table');
> R2 = table('stockCompany_info');
> result = R1.join(R1.studioName = R2.corporation) and R2;
> {code}
> ----
> {code}
> r1
>         a    b    c
> ======================
> row1    a1   b1   c1
> row2    a2   b2   c2
> row3    a1   b3   c3
>
> r2
>         e    f
> ==================
> row1    e1   a1
> row2    e2   f2
> row3    e3   f3
> row4    e4   a1
> row5    e5   a2
>
> r1 = table('r1');
> r2 = table('r2');
> r3 = r1.join(r1.a = r2.f) and r2;
> ---------------------------------------------
> temp table T : Sorted set by "f"
>         row
> =============
> a1      row:row1
>         row:row4
> a2      row:row5
> f2      row:row2
> f3      row:row3
> ---------------------
> r3
>             r1.row  a    b    c    r2.row  e    f
> ===================================================
> row1.row1   row1    a1   b1   c1   row1    e1   a1
> row1.row4   row1    a1   b1   c1   row4    e4   a1
> row2.row5   row2    a2   b2   c2   row5    e5   a2
> row3.row1   row3    a1   b3   c3   row1    e1   a1
> row3.row4   row3    a1   b3   c3   row4    e4   a1
> {code}
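
For illustration only, the example in the description can be sketched in plain Python. This is not the attached patch and does not use any HBase API. The function names (build_sorted_index, sort_join) and the dict-of-dicts table layout are assumptions made for this sketch. It builds the temp table T by sorting r2 on "f", then scans r1 in row order and binary-searches T for each value of "a".

```python
from bisect import bisect_left
from itertools import groupby

# In-memory stand-ins for the example tables (assumed layout: row key -> columns).
r1 = {
    "row1": {"a": "a1", "b": "b1", "c": "c1"},
    "row2": {"a": "a2", "b": "b2", "c": "c2"},
    "row3": {"a": "a1", "b": "b3", "c": "c3"},
}
r2 = {
    "row1": {"e": "e1", "f": "a1"},
    "row2": {"e": "e2", "f": "f2"},
    "row3": {"e": "e3", "f": "f3"},
    "row4": {"e": "e4", "f": "a1"},
    "row5": {"e": "e5", "f": "a2"},
}


def build_sorted_index(table, column):
    """Return [(value, [row keys])] sorted by value, like temp table T."""
    pairs = sorted((cols[column], key) for key, cols in table.items())
    return [
        (value, [key for _, key in group])
        for value, group in groupby(pairs, key=lambda pair: pair[0])
    ]


def sort_join(left, left_col, right, right_col):
    """Equi-join left and right on left_col = right_col via a sorted index."""
    index = build_sorted_index(right, right_col)
    values = [value for value, _ in index]
    result = {}
    for left_key in sorted(left):
        left_cols = left[left_key]
        # Binary search replaces the inner loop of a nested-loop join.
        pos = bisect_left(values, left_cols[left_col])
        if pos < len(values) and values[pos] == left_cols[left_col]:
            for right_key in index[pos][1]:
                # Hypothetical result key format, mirroring "row1.row4" above.
                result[f"{left_key}.{right_key}"] = {**left_cols, **right[right_key]}
    return result


T = build_sorted_index(r2, "f")
r3 = sort_join(r1, "a", r2, "f")
for key, cols in r3.items():
    print(key, cols)
```

Sorting r2 costs O(m log m) once and each lookup costs O(log m), compared with O(n * m) comparisons for the nested-loop join. The sketch merges columns by name, so it assumes the two tables have no column names in common.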
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.