Posted to user@impala.apache.org by "Petter von Dolwitz (Hem)" <pe...@gmail.com> on 2017/04/12 12:48:26 UTC

Hiding databases, tables and views from Impala

Hi,

We work in an environment where we use both Hive and Impala, and where
these two tools share the same metastore. Some of the tables are only usable
from Hive since they are backed by a file format that is not supported by
Impala.

If a user tries to access such a table from Impala, he/she gets the
message:

AnalysisException: Failed to load metadata for table: 'mydb.mytable' CAUSED
BY: TableLoadingException: Unrecognized table type for table: mydb.mytable

We are looking for a way to hide from Impala entire DBs and/or tables and
views that are only usable from Hive. Does anyone know a way to accomplish
this?

I would guess a way to support it (if not already present) is to add a db
or table property signaling this fact. Another way would be to configure
Sentry to achieve the same thing, but we are not sure how to do this.

Any help is appreciated!

Br,
Petter

Re: Hiding databases, tables and views from Impala

Posted by "Petter von Dolwitz (Hem)" <pe...@gmail.com>.
Thank you for the comments!

@Brock
> I can see both sides of this:
>
> A. Users see table and wonder "why cannot I query this table"
> B. Users don't see table and wonder why this table doesn't show up despite
> repeatedly invalidating metadata
>
> I prefer A.

I think you formulated option B a bit harshly. In a multi-tenant cluster
with a lot of Hive and Impala tables, I would be pleased as a user to only
be presented with the tables that are relevant in an Impala context when
using the Impala tool. In the same cluster, an INVALIDATE METADATA command
is to me more of an administrator command, not something a user would have
to worry about. You are assuming that all users know about all available
metastore tables, whereas in my opinion an Impala user should only care
about and know of the subset of tables that are queryable in Impala.

As an administrator I might share your opinion, but as a user option A only
causes confusion.

A less intrusive behaviour would be for a property to be applied to the
table (or database) at creation time, signalling that this table is not
intended for Impala. Impala would then check this property when it builds
the DB/table tree. This would be a more controlled behaviour, initiated by
an administrator rather than a decision taken by Impala.
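
The property check proposed above could look roughly like the following
sketch. It is not Impala code: the property name `impala.hidden`, the
`Table` class and the `visible_tables` function are all hypothetical, and
the thread does not establish that any such table property exists.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Minimal stand-in for a metastore table entry (hypothetical)."""
    db: str
    name: str
    properties: dict = field(default_factory=dict)


# Hypothetical property name; not an actual Impala or Hive setting.
HIDE_PROPERTY = "impala.hidden"


def visible_tables(tables):
    """Return only the tables that are not flagged as hidden from Impala."""
    return [
        t for t in tables
        if t.properties.get(HIDE_PROPERTY, "false").lower() != "true"
    ]


catalog = [
    Table("mydb", "mytable", {HIDE_PROPERTY: "true"}),  # Hive-only table
    Table("mydb", "parquet_table"),                     # no flag set
]
names = [t.name for t in visible_tables(catalog)]
print(names)
```

In this sketch the decision to hide a table stays with whoever sets the
property, which is the administrator-controlled behaviour described above.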

@Alex
> You could use Sentry to configure something like a Hive Role and an Impala
> Role. In Impala, users that do not have privileges on a table will not see
> that table. A user could have one of the roles or both. Users that have
> both roles would still see that error. Not sure if this approach works in
> your case.

Thank you for the suggestion. I have to give this some more thought, but my
initial comments are as follows. We do use Sentry heavily and our users
are granted access to DBs based on their needs/classification. However, a
user can use both Hive and Impala and as such he/she should have both the
Impala role and the Hive role at all times. We do not have a group of users
that are only Impala users. If we had, then I think your suggestion would be
applicable. As it is now, we would have to combine user rights with rights
that are given to a tool, but I think Impala and Hive impersonate the end
user, so the information about which tool is in use is gone when querying
Sentry. I might be all wrong here and, as said, I will check further on
this.
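
To illustrate the concern, here is a small model of role-based visibility.
It assumes, as Alex describes, that a user sees a table when any of the
user's roles grants a privilege on it. The role names, table names and the
`visible_tables` function are invented for illustration; this is not
Sentry's API.

```python
# Hypothetical role -> granted tables mapping; names are illustrative only.
ROLE_GRANTS = {
    "hive_role": {"mydb.mytable", "mydb.parquet_table"},
    "impala_role": {"mydb.parquet_table"},
}


def visible_tables(user_roles):
    """Union of the tables granted to any of the user's roles."""
    visible = set()
    for role in user_roles:
        visible |= ROLE_GRANTS.get(role, set())
    return visible


impala_only = visible_tables(["impala_role"])
both_roles = visible_tables(["hive_role", "impala_role"])
print(sorted(impala_only))
print(sorted(both_roles))
```

In this model a user holding both roles still has `mydb.mytable` in the
visible set, which is the situation described above.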

Thanks,
Petter

2017-04-12 19:16 GMT+02:00 Brock Noland <br...@phdata.io>:

> "It's an interesting question whether Impala should show tables that are
> guaranteed to be unusable at all"
>
> I can see both sides of this:
>
> A. Users see table and wonder "why cannot I query this table"
> B. Users don't see table and wonder why this table doesn't show up despite
> repeatedly invalidating metadata
>
> I prefer A.

Re: Hiding databases, tables and views from Impala

Posted by Brock Noland <br...@phdata.io>.
"It's an interesting question whether Impala should show tables that are
guaranteed to be unusable at all"

I can see both sides of this:

A. Users see table and wonder "why cannot I query this table"
B. Users don't see table and wonder why this table doesn't show up despite
repeatedly invalidating metadata

I prefer A.

On Wed, Apr 12, 2017 at 11:55 AM, Alexander Behm <al...@cloudera.com>
wrote:

> It's an interesting question whether Impala should show tables that are
> guaranteed to be unusable at all. Doing that is a little tricky on the
> technical side, but it might be worth considering.

Re: Hiding databases, tables and views from Impala

Posted by Alexander Behm <al...@cloudera.com>.
It's an interesting question whether Impala should show tables that are
guaranteed to be unusable at all. Doing that is a little tricky on the
technical side, but it might be worth considering.

On Wed, Apr 12, 2017 at 9:51 AM, Alexander Behm <al...@cloudera.com>
wrote:

> I'm not aware of a table property that would have this effect.
>
> You could use Sentry to configure something like a Hive Role and an Impala
> Role. In Impala, users that do not have privileges on a table will not see
> that table. A user could have one of the roles or both. Users that have
> both roles would still see that error. Not sure if this approach works in
> your case.
>
> Alex

Re: Hiding databases, tables and views from Impala

Posted by Alexander Behm <al...@cloudera.com>.
I'm not aware of a table property that would have this effect.

You could use Sentry to configure something like a Hive Role and an Impala
Role. In Impala, users that do not have privileges on a table will not see
that table. A user could have one of the roles or both. Users that have
both roles would still see that error. Not sure if this approach works in
your case.

Alex

On Wed, Apr 12, 2017 at 5:48 AM, Petter von Dolwitz (Hem) <
petter.von.dolwitz@gmail.com> wrote:

> Hi,
>
> We work in an environment where we use both Hive and Impala, and where
> these two tools share the same metastore. Some of the tables are only usable
> from Hive since they are backed by a file format that is not supported by
> Impala.
>
> If a user tries to access such a table from Impala, he/she gets the
> message:
>
> AnalysisException: Failed to load metadata for table: 'mydb.mytable'
> CAUSED BY: TableLoadingException: Unrecognized table type for table:
> mydb.mytable
>
> We are looking for a way to hide from Impala entire DBs and/or tables and
> views that are only usable from Hive. Does anyone know a way to accomplish
> this?
>
> I would guess a way to support it (if not already present) is to add a db
> or table property signaling this fact. Another way would be to configure
> Sentry to achieve the same thing, but we are not sure how to do this.
>
> Any help is appreciated!
>
> Br,
> Petter