Posted to user@hive.apache.org by Santosh Achhra <sa...@gmail.com> on 2013/01/11 07:19:14 UTC

Hive Index

Hello Hive Users,

I have created an index on a table with deferred rebuild.

1) After the index is created, I see two tables:
           a) MAIN_TABLE
           b) MAIN_TABLE_INDEX
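
For reference, an index with deferred rebuild is usually declared along the lines sketched below. The index name MAIN_TABLE_F1_IDX, the indexed column F1, and the 'COMPACT' handler are assumptions for illustration, not the exact statement used here. With WITH DEFERRED REBUILD the index table starts out empty and is only populated by a later ALTER INDEX ... REBUILD.

```sql
-- Sketch only: MAIN_TABLE_F1_IDX, F1 and 'COMPACT' are hypothetical choices.
CREATE INDEX MAIN_TABLE_F1_IDX
ON TABLE MAIN_TABLE (F1)
AS 'COMPACT'
WITH DEFERRED REBUILD
IN TABLE MAIN_TABLE_INDEX;

-- Deferred rebuild means the index table holds no rows until this is run.
ALTER INDEX MAIN_TABLE_F1_IDX ON MAIN_TABLE REBUILD;

-- Lists the indexes defined on the base table and their index tables.
SHOW FORMATTED INDEX ON MAIN_TABLE;
```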

Now when I run EXPLAIN on a query against the table, I see that it uses a
table scan and not an index scan:

        MAIN_TABLE
*          TableScan*
            alias: MAIN_TABLE
            Filter Operator
              predicate:
                  expr: (trim(F1) = 'V1')
                  type: boolean

Is this expected? Are there any additional steps that should
be done after the index is created?

I found the following text in the archives, which is from Mark:

INSERT OVERWRITE DIRECTORY '/tmp/indexes/x'
SELECT `_bucketname`, `_offsets` FROM default__t_x__ where j='and';
(The name default__t_x__ can be found in the output of step 2. Also,
/tmp/indexes directory needs to exist in HDFS. You can substitute this
to be any pre-existing directory in HDFS)
SET hive.index.compact.file=/tmp/indexes/x;
SET hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
SELECT a, count(*) from t where j='and' group by a;
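
As an alternative to the manual steps quoted above, Hive also has settings that let the optimizer consult an index by itself. The sketch below is hedged: the two properties exist in Hive releases that support indexes, but whether the plan actually changes depends on the Hive version, the index type, and the shape of the predicate. The query uses a bare comparison on F1 as an assumption; it is not the trim(F1) predicate from the plan shown earlier.

```sql
-- Let the optimizer rewrite filters to use an available index.
SET hive.optimize.index.filter=true;

-- Minimum input size (in bytes) before a compact index is considered;
-- 0 here is an illustrative value so that small tables qualify too.
SET hive.optimize.index.filter.compact.minsize=0;

-- Re-check the plan; the column and literal are taken from the plan above,
-- the bare-column predicate is an assumption.
EXPLAIN SELECT * FROM MAIN_TABLE WHERE F1 = 'V1';
```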


Also, do joins on the indexed keys use the indexes that were created?



Good wishes, always!
Santosh