Posted to user@hive.apache.org by Prasad Chakka <pc...@facebook.com> on 2009/07/14 06:58:09 UTC

Re: Is there any way to share directories between different users in Hive?

Then the question is: why not have everyone share the metadata and let Hadoop enforce the file system permissions?


________________________________
From: Min Zhou <co...@gmail.com>
Reply-To: <hi...@hadoop.apache.org>
Date: Mon, 13 Jul 2009 21:55:59 -0700
To: <hi...@hadoop.apache.org>
Subject: Re: Is there any way to share directories between different users in Hive?

That's what I do too. But the table we want to share is partitioned by date, so its metadata changes every day, and loading it each time is tedious.

On Tue, Jul 14, 2009 at 12:51 PM, Prasad Chakka <pc...@facebook.com> wrote:
One way is to create an external table pointing to the original table's location.
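
A minimal sketch of the external-table suggestion above, written as a small Python helper that only builds the HiveQL text. The table name `shared_logs`, the columns, the `ds` partition column, and the path `/analytics/shared_logs` are hypothetical and not taken from this thread. Dropping an external table removes only its metadata, not the files under its location, which is what makes it a reasonable way to expose another user's data.

```python
def external_table_ddl(name, columns, partition_cols, location):
    """Build a CREATE EXTERNAL TABLE statement over an existing directory.

    columns and partition_cols are lists of (column_name, hive_type) pairs.
    All names passed in by the caller are illustrative assumptions.
    """
    cols = ", ".join(f"{col} {typ}" for col, typ in columns)
    ddl = f"CREATE EXTERNAL TABLE {name} ({cols})"
    if partition_cols:
        parts = ", ".join(f"{col} {typ}" for col, typ in partition_cols)
        ddl += f" PARTITIONED BY ({parts})"
    # LOCATION points at the directory that already holds the owner's data.
    ddl += f" LOCATION '{location}'"
    return ddl


ddl = external_table_ddl(
    "shared_logs",                          # hypothetical table name
    [("uid", "BIGINT"), ("url", "STRING")],  # hypothetical columns
    [("ds", "STRING")],                     # hypothetical date partition
    "/analytics/shared_logs",               # hypothetical HDFS path
)
print(ddl)
```

For a partitioned table, creating the external table does not by itself register the existing partitions in the second metastore; each one still has to be added there, which is the daily chore raised in the reply.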

________________________________
From: Min Zhou <coderplay@gmail.com>
Reply-To: <hive-user@hadoop.apache.org>
Date: Mon, 13 Jul 2009 21:44:38 -0700
To: hive-user <hive-user@hadoop.apache.org>
Subject: Is there any way to share directories between different users in Hive?


Hi all,

We've set up an environment for multiple users to use Hive. A separate MySQL database for storing metadata was created for each user, so that each user's execution is completely isolated. But sometimes we need to share a table with a certain group of users. How can we achieve that? Is there an alternative deployment you would recommend if our approach can't support it?

Thanks in advance!
Min
--
My research interests are distributed systems, parallel computing and bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com






Re: Is there any way to share directories between different users in Hive?

Posted by Prasad Chakka <pc...@facebook.com>.
The metastore has support for databases, but the query language doesn't support them yet. Each database could have a separate root (base) directory; users could have their own databases and would only be able to drop tables there. Adding database (or schema, as one could call it) support to the QL would solve a lot of these sharing problems.


________________________________
From: Frederick Oko <fr...@gmail.com>
Reply-To: <hi...@hadoop.apache.org>
Date: Tue, 14 Jul 2009 02:01:50 -0700
To: <hi...@hadoop.apache.org>
Subject: Re: Is there any way to share directories between different users in Hive?

Without the HIVE-493 behavior, and with shared, partitioned base table data, this kind of schema metadata duplication gets uglier. While we had multiple schemas, in general most users fell into a single schema, and it would definitely be preferable to have permissioned views within the single schema's metadata, if only to reduce the clutter of user tables in, e.g., 'show tables'. We had disabled general availability of DROP (a paranoid code tweak), and you have to use HDFS permissions to protect against insert overwrites.

Also, metadata aside, from an HDFS resource management perspective it would have been nice if user-created tables defaulted to a different root directory than the schema root, e.g. shared tables in /analytics and user tables in /analytics_usr/userx. External table support allows for this now, but using 'external' is a user behavior that is not inherently enforced. I suppose in your per-user schema model you could technically specify the root as /analytics_usr/userx and create the shared tables as external, but such a config is unwieldy. Actually, without this HDFS scoping, and using only Hive schema scoping, don't you have the possibility of blind conflicts, i.e. two users can create the same table without realizing it, and whether or not they pollute each other depends solely on HDFS permissions?



Re: Is there any way to share directories between different users in Hive?

Posted by Frederick Oko <fr...@gmail.com>.
Without the HIVE-493 behavior, and with shared, partitioned base table data,
this kind of schema metadata duplication gets uglier. While we had multiple
schemas, in general most users fell into a single schema, and it would
definitely be preferable to have permissioned views within the single
schema's metadata, if only to reduce the clutter of user tables in, e.g.,
'show tables'. We had disabled general availability of DROP (a paranoid code
tweak), and you have to use HDFS permissions to protect against insert
overwrites.

Also, metadata aside, from an HDFS resource management perspective it would
have been nice if user-created tables defaulted to a different root
directory than the schema root, e.g. shared tables in /analytics and user
tables in /analytics_usr/userx. External table support allows for this now,
but using 'external' is a user behavior that is not inherently enforced. I
suppose in your per-user schema model you could technically specify the root
as /analytics_usr/userx and create the shared tables as external, but such a
config is unwieldy. Actually, without this HDFS scoping, and using only Hive
schema scoping, don't you have the possibility of blind conflicts, i.e. two
users can create the same table without realizing it, and whether or not
they pollute each other depends solely on HDFS permissions?
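
The blind-conflict concern can be illustrated with a small Python sketch. It assumes, hypothetically, that a table's directory is simply the root directory plus the table name, and that the two layouts are the /analytics and /analytics_usr/userx roots mentioned above. The user names and table name are made up, and this is only a model of the path arithmetic, not of how Hive itself resolves locations.

```python
def table_location(table, user, per_user_root):
    """Return the directory a new table would land in under each layout."""
    if per_user_root:
        # HDFS scoping: every user gets a private root directory.
        return f"/analytics_usr/{user}/{table}"
    # Schema scoping only: separate metadata, one shared root directory.
    return f"/analytics/{table}"


def colliding_locations(requests, per_user_root):
    """Map each directory claimed by more than one user to those users.

    requests is a list of (user, table) pairs, one per CREATE TABLE.
    """
    claimed = {}
    for user, table in requests:
        location = table_location(table, user, per_user_root)
        claimed.setdefault(location, set()).add(user)
    return {loc: sorted(users) for loc, users in claimed.items() if len(users) > 1}


requests = [("alice", "clicks"), ("bob", "clicks")]  # hypothetical users
print(colliding_locations(requests, per_user_root=False))
print(colliding_locations(requests, per_user_root=True))
```

With a single shared root, both users' `clicks` tables resolve to the same directory even though neither sees the other's table in their own metadata; with per-user roots the paths differ and no collision is reported.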

On Jul 13, 2009 10:18 PM, "Min Zhou" <co...@gmail.com> wrote:

Hmm, HIVE-493 is okay, since there is no authentication subsystem yet. I was
considering using something like crontab to automatically add partitions for
our tables.

Thanks, Prasad!


Re: Is there any way to share directories between different users in Hive?

Posted by Min Zhou <co...@gmail.com>.
Hmm, HIVE-493 is okay, since there is no authentication subsystem yet. I was
considering using something like crontab to automatically add partitions for
our tables.

Thanks, Prasad!
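
One way the crontab idea might look, as a hedged Python sketch: a script run once a day that emits one ALTER TABLE ... ADD PARTITION statement per date. It assumes a Hive version that supports that statement, a partition column named `ds`, and partition directories laid out as `ds=YYYY-MM-DD` under the table's base path. The table name and path are hypothetical.

```python
from datetime import date, timedelta


def add_partition_stmt(table, ds, base_location):
    """Build the statement that registers one existing date partition."""
    return (
        f"ALTER TABLE {table} ADD PARTITION (ds='{ds}') "
        f"LOCATION '{base_location}/ds={ds}'"
    )


def statements_for_range(table, base_location, start, end):
    """Return one statement per day from start to end, inclusive."""
    stmts = []
    day = start
    while day <= end:
        stmts.append(add_partition_stmt(table, day.isoformat(), base_location))
        day += timedelta(days=1)
    return stmts


# Hypothetical table and path; a cron job would normally pass today's date.
stmts = statements_for_range(
    "shared_logs", "/analytics/shared_logs", date(2009, 7, 13), date(2009, 7, 15)
)
for stmt in stmts:
    print(stmt + ";")
```

The printed statements could be written to a file and passed to the Hive CLI (for example with `hive -f`) from the cron entry, once for each metastore that needs to see the new partition.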

On Tue, Jul 14, 2009 at 1:05 PM, Prasad Chakka <pc...@facebook.com> wrote:

>  You could fix https://issues.apache.org/jira/browse/HIVE-493 and enable
> it through an option. But keep in mind that using this will raise the
> issues that I mentioned in that JIRA. I personally would not use this
> option, because it will create headaches down the road, but if you don't
> care then it should be fine. We used to do this some time ago, but it was
> taken out due to the issues mentioned there. You could bring that code
> back (disabled by default).

Re: Is there any way to share directories between different users in Hive?

Posted by Min Zhou <co...@gmail.com>.
If we share the metadata with everyone, tables created by one user may be
dropped by another accidentally.

On Tue, Jul 14, 2009 at 1:01 PM, Min Zhou <co...@gmail.com> wrote:

> The fact is that only a few tables need to be shared; users usually create
> their own tables for their jobs.

Re: Is there any way to share directories between different users in Hive?

Posted by Prasad Chakka <pc...@facebook.com>.
You could fix https://issues.apache.org/jira/browse/HIVE-493 and enable it through an option. But keep in mind that using this will raise the issues that I mentioned in that JIRA. I personally would not use this option, because it will create headaches down the road, but if you don't care then it should be fine. We used to do this some time ago, but it was taken out due to the issues mentioned there. You could bring that code back (disabled by default).


________________________________
From: Min Zhou <co...@gmail.com>
Reply-To: <hi...@hadoop.apache.org>
Date: Mon, 13 Jul 2009 22:01:44 -0700
To: <hi...@hadoop.apache.org>
Subject: Re: Is there any way to share directories between different users in Hive?

The fact is that only a few tables need to be shared; users usually create their own tables for their jobs.




Re: Is there any way to share directories between different users in Hive?

Posted by Min Zhou <co...@gmail.com>.
The fact is that only a few tables need to be shared; users usually create
their own tables for their jobs.


On Tue, Jul 14, 2009 at 12:58 PM, Prasad Chakka <pc...@facebook.com> wrote:

>  Then the question is: why not have everyone share the metadata and let
> Hadoop enforce the file system permissions?