Posted to user@hive.apache.org by Rutuja Raghoji <Ru...@persistent.co.in> on 2009/09/21 10:55:30 UTC

Indexing in hive

Hi,

This question is about the indexing patch described at the following link.

https://issues.apache.org/jira/browse/HIVE-678

I applied the patch (hive-678-2009-07-25) to Hive revision 796926 and built the code.

After that, I created a small table, pokes, with two fields, foo and bar, and created an index of type PROJECTION as follows.

hive> CREATE TABLE pokes(foo int , bar string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

hive> CREATE INDEX pokes_index1 TYPE PROJECTION ON TABLE pokes1(foo,bar);
OK
Time taken: 0.159 seconds

hive> select a.* from pokes_index1 a;
OK
Time taken: 0.073 seconds

As the last query shows, I cannot see the contents of the index. Is there any way to see the contents of the index table?

Second, the link mentions three kinds of indexes: Projection, Summary, and Compact. Could somebody explain the scenarios in which each of them can be used?
DISCLAIMER
==========
This e-mail may contain privileged and confidential information which is the property of Persistent Systems Ltd. It is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, you are not authorized to read, retain, copy, print, distribute or use this message. If you have received this communication in error, please notify the sender and delete all copies of this message. Persistent Systems Ltd. does not accept any liability for virus infected mails.

Re: Indexing in hive

Posted by He Yongqiang <he...@software.ict.ac.cn>.
After the "create index" command, you can run "update index index_name". The
index is actually a table itself, so you can see its contents and perform
other normal table operations on it.

Thanks,
Yongqiang
On 2009-09-21 at 5:15 PM, "Rutuja Raghoji" <Ru...@persistent.co.in>
wrote:

> Hi,
>  Thanks for the fast reply. I have tried hard to find the index contents
> and their location. Is there any way to see the contents of the index?
> 
> 
> 
> Regards,
> Rutuja



RE: Indexing in hive

Posted by Rutuja Raghoji <Ru...@persistent.co.in>.
Hi,
 Thanks for the fast reply. I have tried hard to find the index contents and their location. Is there any way to see the contents of the index?



Regards,
Rutuja


From: He Yongqiang [heyongqiang@software.ict.ac.cn]
Sent: Monday, September 21, 2009 2:41 PM
To: hive-user@hadoop.apache.org
Subject: Re: Indexing in hive

Hi Rutuja,
  Sorry, that patch is not ready yet. I will resume work on it next
week. Sorry for the inconvenience. I will send you an email when a ready
patch is attached. Thanks for your attention.

"Projection, Summary and Compact"
Projection: I think we may consider dropping it, because in many of my
recent experiments, projection did not speed things up much.
Summary records one index entry for every record.
Compact records only one index entry for all records with the same
index key.

For example, suppose the table contains these entries:
aa, abc, adc
aa, abcd, dd

If we want to use the first column as the index key,
the summary index will look like:
aa, offset-of-first-record
aa, offset-of-second-record

The compact index will look like:
aa, offset-of-first-record | offset-of-second-record

Thanks,
yongqiang

Re: Indexing in hive

Posted by He Yongqiang <he...@software.ict.ac.cn>.
Hi Rutuja,
  Sorry, that patch is not ready yet. I will resume work on it next
week. Sorry for the inconvenience. I will send you an email when a ready
patch is attached. Thanks for your attention.

"Projection, Summary and Compact"
Projection: I think we may consider dropping it, because in many of my
recent experiments, projection did not speed things up much.
Summary records one index entry for every record.
Compact records only one index entry for all records with the same
index key.

For example, suppose the table contains these entries:
aa, abc, adc
aa, abcd, dd

If we want to use the first column as the index key,
the summary index will look like:
aa, offset-of-first-record
aa, offset-of-second-record

The compact index will look like:
aa, offset-of-first-record | offset-of-second-record
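
The difference between the two layouts can be sketched in a few lines of
Python. This is only an illustration of the idea, not the patch's actual
implementation: the function names are hypothetical, and the byte offsets
(0 and 11) are made-up values standing in for "offset-of-first-record" and
"offset-of-second-record".

```python
from collections import defaultdict


def build_summary_index(records, key_col=0):
    """Summary layout: one (key, offset) entry for every record."""
    return [(row[key_col], offset) for offset, row in records]


def build_compact_index(records, key_col=0):
    """Compact layout: one entry per distinct key, listing all its offsets."""
    grouped = defaultdict(list)
    for offset, row in records:
        grouped[row[key_col]].append(offset)
    return dict(grouped)


# (offset, row) pairs; the offsets are invented for this example.
records = [
    (0, ("aa", "abc", "adc")),
    (11, ("aa", "abcd", "dd")),
]

summary = build_summary_index(records)
compact = build_compact_index(records)

print(summary)  # two entries, both with key "aa"
print(compact)  # a single entry for key "aa" holding both offsets
```

With many repeated keys, the compact layout stores each key once, while
the summary layout grows by one entry per record.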

Thanks,
yongqiang