Posted to dev@hive.apache.org by "Liyin Tang (JIRA)" <ji...@apache.org> on 2010/11/02 03:29:26 UTC
[jira] Updated: (HIVE-1754) Remove JDBM component from Map Join
[ https://issues.apache.org/jira/browse/HIVE-1754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Liyin Tang updated HIVE-1754:
-----------------------------
Status: Patch Available (was: Open)
This patch makes the following changes:
1) Removes JDBM from Hive.
2) Stores all of the small table's data in an in-memory hashtable.
3) Adds a lightweight RowContainer: MapJoinRowContainer.
4) Optimizes MapJoinObjectKey. If there are only one or two join keys, MapJoinSingleKey or MapJoinDoubleKeys is used instead of MapJoinObjectKey.
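
To illustrate the idea behind item 4, here is a minimal sketch of key specialization in Java. It is not taken from the patch: the class names JoinKey, SingleKey, DoubleKeys and ObjectKey and the factory method of() are hypothetical stand-ins, and the real MapJoinSingleKey / MapJoinDoubleKeys classes may be structured quite differently. The assumption is that one or two join keys can be held in plain fields, which avoids allocating and walking an array for each hashtable lookup.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical stand-ins for illustration only; NOT the classes from the Hive patch.
abstract class JoinKey {
    // Pick the cheapest representation for the given number of join keys.
    static JoinKey of(Object... keys) {
        switch (keys.length) {
            case 1:  return new SingleKey(keys[0]);
            case 2:  return new DoubleKeys(keys[0], keys[1]);
            default: return new ObjectKey(keys);
        }
    }
}

// One join key: a single field, no array.
final class SingleKey extends JoinKey {
    private final Object k;
    SingleKey(Object k) { this.k = k; }
    @Override public boolean equals(Object o) {
        return o instanceof SingleKey && Objects.equals(k, ((SingleKey) o).k);
    }
    @Override public int hashCode() { return Objects.hashCode(k); }
}

// Two join keys: two fields, no array.
final class DoubleKeys extends JoinKey {
    private final Object k1;
    private final Object k2;
    DoubleKeys(Object k1, Object k2) { this.k1 = k1; this.k2 = k2; }
    @Override public boolean equals(Object o) {
        if (!(o instanceof DoubleKeys)) return false;
        DoubleKeys other = (DoubleKeys) o;
        return Objects.equals(k1, other.k1) && Objects.equals(k2, other.k2);
    }
    @Override public int hashCode() {
        return 31 * Objects.hashCode(k1) + Objects.hashCode(k2);
    }
}

// General case: any number of join keys, stored in a defensive copy of the array.
final class ObjectKey extends JoinKey {
    private final Object[] keys;
    ObjectKey(Object[] keys) { this.keys = keys.clone(); }
    @Override public boolean equals(Object o) {
        return o instanceof ObjectKey && Arrays.equals(keys, ((ObjectKey) o).keys);
    }
    @Override public int hashCode() { return Arrays.hashCode(keys); }
}
```

With this sketch, JoinKey.of("a") yields a SingleKey and JoinKey.of(1, "x") a DoubleKeys, and either can be used directly as a java.util.HashMap key because equals() and hashCode() are consistent with each other.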
> Remove JDBM component from Map Join
> -----------------------------------
>
> Key: HIVE-1754
> URL: https://issues.apache.org/jira/browse/HIVE-1754
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Affects Versions: 0.6.0, 0.7.0
> Reporter: Liyin Tang
> Assignee: Liyin Tang
> Fix For: 0.7.0
>
> Attachments: Hive-1754.patch
>
>
> Right now, JDBM is the major performance bottleneck.
> As the small table grows, the PUT and GET operations take up most of the execution time.
> Map Join is designed to load the data of small table into memory.
> If the data is too large to hold in memory, then there is no need to use the map join strategy.
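
For readers unfamiliar with the strategy described above, the following is a rough Java sketch of a map join over an in-memory hashtable. It is only an illustration under simplifying assumptions, not Hive's implementation: the class InMemoryMapJoin and its methods build() and probe() are hypothetical names, rows are modeled as two-element String arrays (join key, value), and only an inner join is shown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a map join; NOT Hive code.
final class InMemoryMapJoin {
    // Build phase: load every row of the small table into an in-memory
    // hashtable, grouping the values by join key (row[0]).
    static Map<String, List<String>> build(List<String[]> smallTable) {
        Map<String, List<String>> table = new HashMap<>();
        for (String[] row : smallTable) {
            table.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return table;
    }

    // Probe phase: stream the big table and look each row's key up in the
    // hashtable. Rows without a match are dropped (inner join).
    static List<String> probe(Map<String, List<String>> table, List<String[]> bigTable) {
        List<String> out = new ArrayList<>();
        for (String[] row : bigTable) {
            List<String> matches = table.get(row[0]);
            if (matches == null) {
                continue;
            }
            for (String match : matches) {
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }
}
```

The whole small table must fit in memory for the build phase to succeed, which is the premise of the issue description: if it does not fit, map join is the wrong strategy, so a disk-backed store such as JDBM adds cost without adding value.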