Posted to user@hive.apache.org by André Hacker <an...@gmail.com> on 2014/08/05 09:23:18 UTC

Difference between Hive and HCat table?

Hi,

a very simple question: Is there a difference between a table in Hive and a
table in HCat?
In other words: Can I create a table in Hive that is invisible in HCat, or
vice versa?
(Assuming that Hive and HCat point to the same metastore)

From my understanding, HCat is just a wrapper around the Hive metastore, so
there should be no major difference. This is in line with my experience: If
I create a table via the Hive CLI, it will be shown in HCat too when
running hcat -e "show tables;". And vice versa.

I ask because some online documentation makes me feel that I have to run my
DDL in HCat to make it visible there. At least I didn't find documents that
say that I can use either Hive CLI or HCat.

Thanks,

André Hacker

Re: Difference between Hive and HCat table?

Posted by Peyman Mohajerian <mo...@gmail.com>.
Other tools, e.g. Pig, can access HCat to find out what the schema is. At
Teradata we look up the metadata directly from HCat and then read the data
in parallel from HDFS, rather than taking the slower route through HiveServer2.
So HCat is an important tool for vendors who want to import/export data to
Hadoop without a direct dependency on Hive.


On Sat, Aug 9, 2014 at 10:04 AM, André Hacker <an...@gmail.com>
wrote:

> Thank you Andrew and Lefty, that helped a lot with clarification.
>
> So the link tells me that, assuming a single metastore, everything done in
> the HCat CLI will be reflected in Hive CLI and vice versa, but there are
> some features exclusively available in Hive CLI and a few others
> exclusively in HCat CLI (only table groups/permissions as far as I can see).
>
> From my user perspective it still looks a bit redundant to distinguish
> these two CLIs. However, I understand that there are reasons to distinguish
> HCat, which is a very generic metadata layer, and Hive, which is one (the
> most popular one) of many engines running on HCat. The fact that HCat is
> bundled with Hive and at the same time separated was always a bit confusing
> to me, so I wanted to see if I missed something.
>
> So thanks again, this section in the documentation was just what I was
> looking for.
>
> André

Re: Difference between Hive and HCat table?

Posted by André Hacker <an...@gmail.com>.
Thank you Andrew and Lefty, that helped a lot with clarification.

So the link tells me that, assuming a single metastore, everything done in
the HCat CLI will be reflected in Hive CLI and vice versa, but there are
some features exclusively available in Hive CLI and a few others
exclusively in HCat CLI (only table groups/permissions as far as I can see).
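
The group/permission feature mentioned above is exposed through the `-g` and `-p` options of the `hcat` command, as described in the HCatalog CLI documentation. A small sketch of building such a call follows; the helper name, table name, group, and permission string are hypothetical placeholders, not values from this thread.

```python
def hcat_create_with_perms(ddl, group=None, perms=None):
    """Build an hcat command line, optionally setting group and permissions.

    -g sets the group of the table being created, -p its permissions
    (for example "rwxr-x---"); -e passes the DDL string to execute.
    """
    cmd = ["hcat"]
    if group is not None:
        cmd += ["-g", group]
    if perms is not None:
        cmd += ["-p", perms]
    cmd += ["-e", ddl]
    return cmd


# Hypothetical values, for illustration only.
cmd = hcat_create_with_perms(
    "CREATE TABLE web_logs (line STRING);", group="analysts", perms="rwxr-x---"
)
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where `hcat` is installed; building the list separately keeps the quoting of the DDL string out of the shell.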

From my user perspective it still looks a bit redundant to distinguish
these two CLIs. However, I understand that there are reasons to distinguish
HCat, which is a very generic metadata layer, and Hive, which is one (the
most popular one) of many engines running on HCat. The fact that HCat is
bundled with Hive and at the same time separated was always a bit confusing
to me, so I wanted to see if I missed something.

So thanks again, this section in the documentation was just what I was
looking for.

André
On 05.08.2014 at 21:32, "Lefty Leverenz" <le...@gmail.com> wrote:

> Perhaps this documentation will help: HCatalog CLI -- Hive CLI
> <https://cwiki.apache.org/confluence/display/Hive/HCatalog+CLI#HCatalogCLI-HiveCLI>.
>
> Also note the section that follows it, which begins "HCatalog supports all
> Hive Data Definition Language except those operations that require running
> a MapReduce job."
>
> -- Lefty

Re: Difference between Hive and HCat table?

Posted by Lefty Leverenz <le...@gmail.com>.
Perhaps this documentation will help: HCatalog CLI -- Hive CLI
<https://cwiki.apache.org/confluence/display/Hive/HCatalog+CLI#HCatalogCLI-HiveCLI>.

Also note the section that follows it, which begins "HCatalog supports all
Hive Data Definition Language except those operations that require running
a MapReduce job."

-- Lefty


On Tue, Aug 5, 2014 at 5:00 AM, Andrew Mains <an...@kontagent.com>
wrote:

> André,
>
> To my knowledge, your understanding is correct--given that both Hive and
> HCatalog are pointing to the same metastore instance, all HCatalog table
> operations should be reflected in Hive, and vice versa. You should be able
> to use the Hive CLI and hcat interchangeably to execute your DDL.
>
> Andrew

Re: Difference between Hive and HCat table?

Posted by Andrew Mains <an...@kontagent.com>.
André,

To my knowledge, your understanding is correct--given that both Hive and
HCatalog are pointing to the same metastore instance, all HCatalog table
operations should be reflected in Hive, and vice versa. You should be able
to use the Hive CLI and hcat interchangeably to execute your DDL.
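
A minimal sketch of how that interchangeability could be checked from a script, assuming both `hive` and `hcat` are on the PATH and configured against the same metastore. The table name `hcat_visibility_test` and all helper names are hypothetical; only the `-e` option, which both CLIs accept for passing a command string, comes from the thread and the CLI documentation.

```python
import shutil
import subprocess

# Hypothetical table name used only for this illustration.
TABLE = "hcat_visibility_test"


def hive_create_cmd(table):
    """Command that creates a table through the Hive CLI."""
    return ["hive", "-e", f"CREATE TABLE IF NOT EXISTS {table} (id INT);"]


def hcat_show_tables_cmd():
    """Command that lists tables through the HCatalog CLI."""
    return ["hcat", "-e", "SHOW TABLES;"]


def table_listed(show_tables_output, table):
    """Return True if `table` appears as its own line in SHOW TABLES output."""
    return table in (line.strip() for line in show_tables_output.splitlines())


def check_visibility(table=TABLE):
    """Create a table via Hive, then look for it via hcat.

    Returns None when either CLI is missing, so the sketch can be
    imported on a machine without Hive installed.
    """
    if shutil.which("hive") is None or shutil.which("hcat") is None:
        return None
    subprocess.run(hive_create_cmd(table), check=True)
    result = subprocess.run(
        hcat_show_tables_cmd(), check=True, capture_output=True, text=True
    )
    return table_listed(result.stdout, table)
```

Calling `check_visibility()` on a configured cluster should return `True` if the two CLIs really share a metastore; swapping the roles of the two commands would test the opposite direction.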

Andrew

On 8/5/14, 12:23 AM, André Hacker wrote:
> Hi,
>
> a very simple question: Is there a difference between a table in Hive 
> and a table in HCat?
> In other words: Can I create a table in Hive that is invisible in 
> HCat, or vice versa?
> (Assuming that Hive and HCat point to the same metastore)
>
> From my understanding, HCat is just a wrapper around the Hive 
> metastore, so there should be no major difference. This is in line 
> with my experience: If I create a table via the Hive CLI, it will be 
> shown in HCat too when running hcat -e "show tables;". And vice versa.
>
> I ask because some online documentation makes me feel that I have to 
> run my DDL in HCat to make it visible there. At least I didn't find 
> documents that say that I can use either Hive CLI or HCat.
>
> Thanks,
>
> André Hacker
>