Posted to dev@hive.apache.org by "Remus Rusanu (JIRA)" <ji...@apache.org> on 2013/10/03 16:17:41 UTC

[jira] [Updated] (HIVE-4850) Implement vectorized JOIN operators

     [ https://issues.apache.org/jira/browse/HIVE-4850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Remus Rusanu updated HIVE-4850:
-------------------------------

    Attachment: HIVE-4850.2.patch

This is a working implementation based on current trunk. It is simpler than the .1 patch in that it delegates the JOIN entirely to the row-mode MapJoinOperator. The vectorized operator literally calls the row-mode implementation for each row in the input batch and collects the rows that the row-mode operator forwards into the output batch. This is not as bad as it seems, because the JOIN operator has to resort to row-mode operations anyway: the small tables (hash tables) are row-mode (objects and object-inspectors). By delegating the entire join logic to row mode we piggyback on the correctness of the existing implementation. I do plan to come up with a fully vectorized implementation, but that would require changes to hash table creation and serialization. Note that the filtering and key evaluation of the big table *does* use vectorized operators. Row mode applies only to the key hash table lookup and to the JOIN logic.
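
To make the delegation pattern concrete, here is a minimal, self-contained sketch. It is not code from the patch and does not use Hive's real classes: RowModeJoin, VectorizedJoin, RowSink and the string row format are all hypothetical stand-ins, and only an inner join against an in-memory hash table is modeled. The wrapper walks the columnar input batch row by row, hands each row to the row-mode join, and gathers whatever that join forwards into fixed-size output batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VectorizedJoinSketch {

    /** Receives the rows that the row-mode join forwards downstream. */
    interface RowSink {
        void forward(long key, String bigValue, String smallValue);
    }

    /** Hypothetical stand-in for a row-mode map join (inner join only). */
    static final class RowModeJoin {
        private final Map<Long, List<String>> smallTable;

        RowModeJoin(Map<Long, List<String>> smallTable) {
            this.smallTable = smallTable;
        }

        void processRow(long key, String bigValue, RowSink sink) {
            List<String> matches = smallTable.get(key);
            if (matches == null) {
                return; // no match: an inner join emits nothing
            }
            for (String smallValue : matches) {
                sink.forward(key, bigValue, smallValue);
            }
        }
    }

    /** Hypothetical vectorized wrapper: columnar input, delegated join, batched output. */
    static final class VectorizedJoin implements RowSink {
        private final RowModeJoin delegate;
        private final int outputCapacity;
        private List<String> outputBatch = new ArrayList<>();
        final List<List<String>> flushedBatches = new ArrayList<>();

        VectorizedJoin(RowModeJoin delegate, int outputCapacity) {
            this.delegate = delegate;
            this.outputCapacity = outputCapacity;
        }

        /** Calls the row-mode join once per row of the input batch. */
        void processBatch(long[] keys, String[] values, int size) {
            for (int i = 0; i < size; i++) {
                delegate.processRow(keys[i], values[i], this);
            }
        }

        /** Collects each forwarded row; emits the output batch when it fills up. */
        @Override
        public void forward(long key, String bigValue, String smallValue) {
            outputBatch.add(key + "," + bigValue + "," + smallValue);
            if (outputBatch.size() == outputCapacity) {
                flush();
            }
        }

        /** Emits a partially filled output batch, if any. */
        void flush() {
            if (!outputBatch.isEmpty()) {
                flushedBatches.add(outputBatch);
                outputBatch = new ArrayList<>();
            }
        }
    }

    public static List<List<String>> demo() {
        Map<Long, List<String>> small = new HashMap<>();
        small.put(1L, Arrays.asList("a", "b"));
        small.put(3L, Arrays.asList("c"));
        VectorizedJoin join = new VectorizedJoin(new RowModeJoin(small), 2);
        join.processBatch(new long[] {1L, 2L, 3L}, new String[] {"x", "y", "z"}, 3);
        join.flush();
        return join.flushedBatches;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[1,x,a, 1,x,b], [3,z,c]]
    }
}
```

In this sketch the per-row cost sits entirely in processRow, which mirrors the point above: the hash table probe is row-oriented either way, so the wrapper only adds the loop over the batch and the copy into the output batch.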

> Implement vectorized JOIN operators
> -----------------------------------
>
>                 Key: HIVE-4850
>                 URL: https://issues.apache.org/jira/browse/HIVE-4850
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>         Attachments: HIVE-4850.1.patch, HIVE-4850.2.patch
>
>
> Easysauce



--
This message was sent by Atlassian JIRA
(v6.1#6144)