Posted to dev@hive.apache.org by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org> on 2009/06/01 19:17:07 UTC

[jira] Commented: (HIVE-417) Implement Indexing in Hive

    [ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715153#action_12715153 ] 

Joydeep Sen Sarma commented on HIVE-417:
----------------------------------------

- Are we going to have one index file per HDFS file, or one per partition?

A related question is how this is going to interact with sampling. (I think the sampling predicate is currently optimized out for bucketed tables, although I'm not entirely sure.)

I would love to see the API for invoking the index.
- Ideally we would like to be able to plug in different indexing schemes, and that includes map-side joins: the hashmap storing the smaller table can be seen as an index on that table. It would seem that a map-side join based on tables loaded into jdbm could be replaced by one based on tables with the indices proposed here (thereby making index-based joins almost trivial).
- We should let people plug in their own indices, since it's quite likely that over time there will be multiple indexing efforts on Hadoop files.
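
To make the pluggability point concrete, here is a rough sketch of what such an API could look like. Every name in it (TableIndex, HashIndex, join) is hypothetical and made up for illustration; none of it comes from Hive or from the attached patch. It only shows that the in-memory hashmap a map-side join builds over the smaller table fits behind the same lookup interface that a file-backed index could implement.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these names are illustrative assumptions,
// not Hive classes and not part of the HIVE-417 patch.
final class PluggableIndexSketch {

    /** Assumed plug-in point: any indexing scheme maps a key to its matching rows. */
    interface TableIndex<K, R> {
        void add(K key, R row);
        List<R> lookup(K key);
    }

    /** In-memory hash index: the structure a map-side join builds for the smaller table. */
    static final class HashIndex<K, R> implements TableIndex<K, R> {
        private final Map<K, List<R>> rowsByKey = new HashMap<>();

        @Override
        public void add(K key, R row) {
            rowsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }

        @Override
        public List<R> lookup(K key) {
            // Keys with no rows yield an empty list rather than null.
            return rowsByKey.getOrDefault(key, Collections.emptyList());
        }
    }

    /** Joins the larger side against whatever TableIndex covers the smaller side. */
    static List<String> join(List<String[]> bigRows, TableIndex<String, String> smallIndex) {
        List<String> out = new ArrayList<>();
        for (String[] row : bigRows) { // row = {joinKey, value}
            for (String match : smallIndex.lookup(row[0])) {
                out.add(row[0] + ":" + row[1] + ":" + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TableIndex<String, String> idx = new HashIndex<>();
        idx.add("k1", "small-a");
        idx.add("k1", "small-b");

        List<String[]> big = new ArrayList<>();
        big.add(new String[] {"k1", "big-x"});
        big.add(new String[] {"k2", "big-y"}); // no match in the index, so it is dropped

        System.out.println(join(big, idx));
    }
}
```

Because join only depends on the interface, swapping HashIndex for some other implementation (for example one backed by jdbm or by an on-disk index file) would not change the join code, which is the property being asked for above.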

> Implement Indexing in Hive
> --------------------------
>
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.4.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch
>
>
> Implement indexing on Hive so that lookup and range queries are efficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.