Posted to user@phoenix.apache.org by Simon Mottram <Si...@cucumber.co.nz> on 2019/09/03 00:34:45 UTC

Multi-Tenancy and shared records

Hi

I'm working on a project where we have a combination of very sparse
data columns with the added headache of multi-tenancy.  HBase looks great
for the back end, but I need to check that we can support the customer's
multi-tenancy requirements.

There are two that I'm struggling to find a definitive answer for. Any
info most gratefully received.

Shared Data
===========
Each record in the table must be secured, but a record may belong to
multiple tenants.  Think 'shared' data.

So for example if you had 3 records

record1, some secret data
record2, some other secret data 
record3, data? what data.

We need 
user1 to be able to see record1 and record2
user2 to be able to see record2 and record3

From what I see in the multi-tenancy docs, the tenant_id field is a
VARCHAR.  Can it hold multiple values?

The actual 'multiple tenant' value would be set at creation and very
rarely (if ever) changed, but I couldn't guarantee immutability
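
To pin down the requirement above, here is a minimal sketch in plain Java.
It is only an in-memory model of the desired visibility rules, not Phoenix
or HBase code; the class name SharedRecords and the visibleTo() helper are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SharedRecords {
    // Hypothetical model: record id -> set of users allowed to read it.
    static final Map<String, Set<String>> READERS = new LinkedHashMap<>();
    static {
        READERS.put("record1", new HashSet<>(Arrays.asList("user1")));
        READERS.put("record2", new HashSet<>(Arrays.asList("user1", "user2")));
        READERS.put("record3", new HashSet<>(Arrays.asList("user2")));
    }

    // Returns the ids of all records the given user may see, in insertion order.
    static List<String> visibleTo(String user) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : READERS.entrySet()) {
            if (entry.getValue().contains(user)) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        System.out.println(visibleTo("user1")); // [record1, record2]
        System.out.println(visibleTo("user2")); // [record2, record3]
    }
}
```

The point of the model is that access is a set of readers per record, which
a single tenant_id value per row cannot express directly.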


Enforced Security
=================
Can you prevent access without TenantId?  Otherwise if someone just
edits the connection info they can sidestep all the multi-tenancy
features.   Our users include scientific types who will want to connect
directly using JDBC/Python/Other so we need to be sure to lock this
data down.

Of course they also want 'admin' types who CAN see everything =)  Is there
a special connection that allows non-tenanted access, or would we need a
multi-tenant key that always contains a master tenant id (yuck)?

If not possible I guess we have to look at doing something at the HBase
level.
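
The concern can be made concrete with a toy model in plain Java. This is
not Phoenix code: it simply assumes, as the replies in this thread
describe, that the tenant id is no more than a leading key column chosen
by the client. The names TenantScopeSketch and scan() are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class TenantScopeSketch {
    // Rows keyed as "<tenantId>|<recordId>", mimicking a tenant id that
    // leads the primary key.
    static final NavigableMap<String, String> TABLE = new TreeMap<>();
    static {
        TABLE.put("tenantA|record1", "some secret data");
        TABLE.put("tenantA|record2", "some other secret data");
        TABLE.put("tenantB|record3", "data? what data.");
    }

    // tenantId == null models a connection opened without any tenant id.
    // Nothing here checks who the caller is: the filter is purely whatever
    // value the client chose to pass in.
    static List<String> scan(String tenantId) {
        List<String> keys = new ArrayList<>();
        for (String key : TABLE.keySet()) {
            if (tenantId == null || key.startsWith(tenantId + "|")) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(scan("tenantA")); // only tenantA's rows
        System.out.println(scan("tenantB")); // caller just picks another tenant
        System.out.println(scan(null));      // no tenant id: every row
    }
}
```

In this model the tenant id narrows a scan but authenticates nobody, which
is why real enforcement has to come from a layer that knows the caller's
identity.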

Best Regards

Simon

Re: Multi-Tenancy and shared records

Posted by Simon Mottram <Si...@cucumber.co.nz>.
Thanks for the response

I'm pretty much tied to Phoenix 4.14.2-HBase-1.4 as we are using Amazon
EMR.

Looks like I can get the table using the same deprecated method that
Phoenix uses, where tableName comes from the kvPair in the original
fragment:

HTableInterface hTable = conn.getQueryServices().getTable(tableName);
hTable.batch(mutationList);

This saves the data without having to do a JDBC commit().  So it
'looks' like it works, which brings me to the next problem: how to
confirm!

Time to inspect the HBase API...

That's a fight for another day

Thanks again.


On Tue, 2019-10-15 at 23:37 -0400, Billy Watson wrote:
> I ran into this too with other code. Make sure you’re on the same
> API. HBase 2’s APIs changed heavily so you may have to do some
> googling for docs to convert the above code into something usable in
> your version of the HBase API. 
> 
> Also for your original problem, I’m not sure if Apache Ranger
> supports row-level yet in their HBase plugin but you can certainly
> add that functionality to make what you’re talking about a LOT easier
> to maintain. 
> 
> Good luck,
> 
> Billy Watson
> 
> On Tue, Oct 15, 2019 at 23:05 Simon Mottram <
> Simon.Mottram@cucumber.co.nz> wrote:
> > Hi Ankit
> > 
> > Getting stuck into this, but I am having trouble finding out how to
> > persist the ACL mutations
> > 
> > The updates to the mutations aren't being persisted as far as I can
> > tell.  I see in your code you are using htable.batch().
> > 
> > I'm struggling to find a way to that object,  I can get a PTable from
> > the connection using the byte[] tablename but PTable doesn't have the
> > batch() method.
> > 
> > I'm also unclear on how the htable.batch() method works with
> > connection.commit().  Is commit required?
> > 
> > I have a horrible feeling that the mutations require an Hbase
> > connection rather than Phoenix and I have to go direct to Hbase API
> > 
> > Best Regards and thanks for the help
> > 
> > Simon
> > 
> > 
> > On Wed, 2019-09-04 at 19:06 -0700, Ankit Singhal wrote:
> > > >> would it be best to use the HBase API for creating the data.
> > > yes, you can use HBase API but you need to ensure that Phoenix Data
> > > type APIs are used to convert your column values into bytes and also
> > > while creating a composite key (if applicable).
> > > otherwise you would not be able to read data from Phoenix when using
> > > different data types other than varchar or unsigned_bigint.
> > > 
> > > >> The sparse nature of the data means that I will be constantly
> > > >> adding new columns, not sure if Phoenix would have a problem with
> > > >> that.
> > > Phoenix supports dynamic columns so you should not have a problem
> > > with that.
> > > 
> > > Regards,
> > > Ankit Singhal
> > > 
> > > On Wed, Sep 4, 2019 at 6:24 PM Simon Mottram <
> > > Simon.Mottram@cucumber.co.nz> wrote:
> > > > Hi Ankit
> > > > 
> > > > Thats very useful, many thanks.
> > > > 
> > > > Before I dive into using Phoenix (which has given me a torrid time
> > > > over the last few days!), is using Phoenix the best option given
> > > > that I'm doing some low level access to Cell information, or would
> > > > it be best to use the HBase API for creating the data.
> > > > 
> > > > We would of course use Phoenix for querying the tables, I'm just
> > > > wondering if the import of data would be better handled via the
> > > > native HBase API.
> > > > 
> > > > I think I only need to set labels or use the ACL system, everything
> > > > else should be straight forward.  The sparse nature of the data
> > > > means that I will be constantly adding new columns, not sure if
> > > > Phoenix would have a problem with that.
> > > > 
> > > > Best Regards
> > > > 
> > > > Simon
> > > > 
> > > > On Tue, 2019-09-03 at 16:30 -0700, Ankit Singhal wrote:
> > > > > >> If not possible I guess we have to look at doing something at
> > > > > >> the HBase level.
> > > > > As Josh said, it's not yet supported in Phoenix, Though you may
> > > > > try using cell-level security of HBase with some Phoenix internal
> > > > > API and let us know if it works for you.
> > > > > Sharing a sample code if you wanna try.
> > > > > 
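> > > > > /**
> > > > >  * Do writes using cell based ACLs.
> > > > >  * Needs PhoenixConnection (org.apache.phoenix.jdbc), QueryUtil
> > > > >  * (org.apache.phoenix.util), Pair (org.apache.hadoop.hbase.util),
> > > > >  * Mutation and HTableInterface (org.apache.hadoop.hbase.client).
> > > > >  */
> > > > > Properties props = new Properties();
> > > > > // conf = HBase Configuration
> > > > > PhoenixConnection conn = (PhoenixConnection)
> > > > >         QueryUtil.getConnection(props, conf);
> > > > > conn.setAutoCommit(false);
> > > > > conn.createStatement().executeUpdate("<your upsert>");
> > > > > // Pull the pending (uncommitted) mutations out of the connection.
> > > > > final Iterator<Pair<byte[], List<Mutation>>> iterator =
> > > > >         conn.getMutationState().toMutations(false);
> > > > > while (iterator.hasNext()) {
> > > > >         Pair<byte[], List<Mutation>> kvPair = iterator.next();
> > > > >         byte[] tableName = kvPair.getFirst();
> > > > >         List<Mutation> mutationList = kvPair.getSecond();
> > > > >         for (Mutation mutation : mutationList) {
> > > > >                 // perms is a user -> Permission map
> > > > >                 mutation.setACL(perms);
> > > > >         }
> > > > >         // htable was not defined in the original fragment; on
> > > > >         // HBase 1.x it can be looked up via the query services.
> > > > >         HTableInterface htable =
> > > > >                 conn.getQueryServices().getTable(tableName);
> > > > >         htable.batch(mutationList);
> > > > > }
> > > > > // Drop the Phoenix-side state; the writes went directly via HBase.
> > > > > conn.rollback();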
> > > > > 
> > > > > On Tue, Sep 3, 2019 at 3:19 PM Simon Mottram <
> > > > > Simon.Mottram@cucumber.co.nz> wrote:
> > > > > > Hi Josh
> > > > > > 
> > > > > > Thought as much, thanks very much for taking the time to
> > > > > > respond.
> > > > > > 
> > > > > > Appreciated
> > > > > > 
> > > > > > Simon
> > > > > > 
> > > > > > On Tue, 2019-09-03 at 11:19 -0400, Josh Elser wrote:
> > > > > > > Hi Simon,
> > > > > > > 
> > > > > > > Phoenix does not provide any authorization/security layers on
> > > > > > > top of what HBase does (the thread on user@hbase has a
> > > > > > > suggestion on cell ACLs which is good).
> > > > > > > 
> > > > > > > I think the question you're ultimately asking is: no, the
> > > > > > > TenantID is not an authorization layer. In a nut-shell, the
> > > > > > > TenantID is just an extra attribute (column) added to your
> > > > > > > primary key constraint auto-magically. If a user doesn't set a
> > > > > > > TenantID, then they see _all_ data.
> > > > > > > 
> > > > > > > Unless you have a layer in-between Phoenix and your end-users
> > > > > > > that add extra guarantees/restrictions, a user could set their
> > > > > > > own TenantID and see other folks' data. I don't think this is a
> > > > > > > good solution for what you're trying to accomplish.
> 
> -- 
> William Watson
> 

Re: Multi-Tenancy and shared records

Posted by Billy Watson <wi...@gmail.com>.
I ran into this too with other code. Make sure you’re on the same API.
HBase 2’s APIs changed heavily so you may have to do some googling for docs
to convert the above code into something usable in your version of the
HBase API.

Also for your original problem, I’m not sure if Apache Ranger supports
row-level yet in their HBase plugin but you can certainly add that
functionality to make what you’re talking about a LOT easier to maintain.

Good luck,

Billy Watson

-- 
William Watson

Re: Multi-Tenancy and shared records

Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Ankit

Getting stuck into this, but I am having trouble finding out how to
persist the ACL mutations

The updates to the mutations aren't being persisted as far as I can
tell.  I see in your code you are using htable.batch().

I'm struggling to find a way to get hold of that object.  I can get a
PTable from the connection using the byte[] table name, but PTable doesn't
have a batch() method.

I'm also unclear on how the htable.batch() method works with
connection.commit().  Is commit required?

I have a horrible feeling that the mutations require an HBase
connection rather than a Phoenix one, and that I have to go direct to the
HBase API.

Best Regards and thanks for the help

Simon



Re: Multi-Tenancy and shared records

Posted by Ankit Singhal <an...@gmail.com>.
>> would it be best to use the HBase API for creating the data.

Yes, you can use the HBase API, but you need to ensure that the Phoenix data
type APIs are used to convert your column values into bytes, and also when
creating a composite key (if applicable). Otherwise you would not be able to
read the data from Phoenix when using data types other than varchar or
unsigned_bigint.

>> The sparse nature of the data means that I will be constantly adding new
>> columns, not sure if Phoenix would have a problem with that.

Phoenix supports dynamic columns, so you should not have a problem with that.

Regards,
Ankit Singhal
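
To illustrate why the encoding matters, here is a small sketch in plain
Java (no Phoenix or HBase dependencies). It assumes, to the best of my
understanding, that Phoenix stores a signed INTEGER as four big-endian
bytes with the sign bit flipped so that negative values sort first, whereas
HBase's Bytes.toBytes(int) writes plain two's complement. Treat the exact
layout as an assumption; the class and method names below are made up, and
in real code the Phoenix type classes should do the conversion.

```java
import java.nio.ByteBuffer;

class IntEncodingSketch {
    // Plain big-endian two's complement, the layout a raw HBase client
    // would typically write for an int.
    static byte[] rawBigEndian(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Assumed model of a sign-flipped encoding: same bytes, but with the
    // top bit inverted so byte-wise ordering matches numeric ordering.
    static byte[] signFlipped(int value) {
        byte[] bytes = rawBigEndian(value);
        bytes[0] ^= (byte) 0x80;
        return bytes;
    }

    // Renders a byte array as lowercase hex for easy comparison.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(rawBigEndian(1)));  // 00000001
        System.out.println(hex(signFlipped(1)));   // 80000001
        System.out.println(hex(rawBigEndian(-1))); // ffffffff
        System.out.println(hex(signFlipped(-1)));  // 7fffffff
    }
}
```

If a writer stores the first form and a reader expects the second, the same
four bytes decode to a different number, which is the kind of mismatch the
advice above is meant to avoid.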


Re: Multi-Tenancy and shared records

Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Ankit

That's very useful, many thanks.

Before I dive into using Phoenix (which has given me a torrid time over
the last few days!), is Phoenix the best option given that I'm doing
some low-level access to Cell information, or would it be better to use
the HBase API for creating the data?

We would of course use Phoenix for querying the tables; I'm just
wondering if the import of data would be better handled via the native
HBase API.

I think I only need to set labels or use the ACL system; everything
else should be straightforward.  The sparse nature of the data means
that I will be constantly adding new columns, and I'm not sure if
Phoenix would have a problem with that.

Best Regards

Simon

On Tue, 2019-09-03 at 16:30 -0700, Ankit Singhal wrote:
> >> If not possible I guess we have to look at doing something at the
> HBase
> level.
> As Josh said, it's not yet supported in Phoenix, Though you may try
> using cell-level security of HBase with some Phoenix internal API and
> let us know if it works for you.
> Sharing a sample code if you wanna try.
> 
> /**
> * Do writes using cell based ACLs
> **/
> Properties props = new Properties();
> //conf = Hbase conf
> PhoenixConnection conn = (PhoenixConnection)
> QueryUtil.getConnection(props, conf);
> conn.setAutoCommit(false);
> conn.createStatement().executeUpdate("<your upsert>");
> final Iterator<Pair<byte[],List<Mutation>>> iterator =
> conn.getMutationState().toMutations(false);
> while (iterator.hasNext()) {
>         Pair<byte[], List<Mutation>> kvPair = iterator.next();
>         List<Mutation> mutationList = kvPair.getSecond();
>         byte[] tableName = kvPair.getFirst();
>         for (Mutation mutation : mutationList) {
>                 //perms is user->permissions map
>                 mutation.setACL(perms);            
>         }
>         htable.batch(mutationList);
> }
> conn.rollback();

Re: Multi-Tenancy and shared records

Posted by Ankit Singhal <an...@gmail.com>.
>> If not possible I guess we have to look at doing something at the HBase
>> level.
As Josh said, it's not yet supported in Phoenix, though you may try using
HBase's cell-level security with some Phoenix internal APIs. Let us know
if it works for you.
Here is some sample code if you want to try.

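import java.util.Iterator;
import java.util.List;
import java.util.Properties;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.jdbc.PhoenixConnection;
import org.apache.phoenix.util.QueryUtil;

/**
 * Do writes using cell-based ACLs.
 */
Properties props = new Properties();
// conf = HBase Configuration
PhoenixConnection conn = (PhoenixConnection) QueryUtil.getConnection(props,
        conf);
conn.setAutoCommit(false);
// Run the upsert without committing so the mutations stay buffered on the client.
conn.createStatement().executeUpdate("<your upsert>");
final Iterator<Pair<byte[], List<Mutation>>> iterator =
        conn.getMutationState().toMutations(false);
while (iterator.hasNext()) {
        Pair<byte[], List<Mutation>> kvPair = iterator.next();
        List<Mutation> mutationList = kvPair.getSecond();
        byte[] tableName = kvPair.getFirst();
        for (Mutation mutation : mutationList) {
                // perms is a user -> permissions map, built by the caller
                mutation.setACL(perms);
        }
        // htable is the HBase table handle for tableName, obtained by the caller
        htable.batch(mutationList);
}
// Discard the buffered mutations so Phoenix does not also write them on commit.
conn.rollback();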
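
For reference, the sharing rule from the original question (user1 sees
record1 and record2, user2 sees record2 and record3) can be modelled as a
per-record reader list, which is roughly what the user -> permissions map
in the sample above would have to encode for each mutation. The sketch
below is a plain in-memory Java model under that assumption. It does not
call HBase or Phoenix, and the names SharedRecordAcl and visibleTo are
made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SharedRecordAcl {
    // record id -> users allowed to read it (hypothetical stand-in for per-cell ACLs)
    static final Map<String, Set<String>> READERS = new LinkedHashMap<>();
    static {
        READERS.put("record1", new HashSet<>(Arrays.asList("user1")));
        READERS.put("record2", new HashSet<>(Arrays.asList("user1", "user2")));
        READERS.put("record3", new HashSet<>(Arrays.asList("user2")));
    }

    // Returns the ids of the records the given user may read, in insertion order.
    static List<String> visibleTo(String user) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : READERS.entrySet()) {
            if (entry.getValue().contains(user)) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        System.out.println(visibleTo("user1")); // [record1, record2]
        System.out.println(visibleTo("user2")); // [record2, record3]
    }
}
```

The point of the model is that access is attached to each record rather
than to a single tenant value, so record2 can be shared without being
stored twice.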


On Tue, Sep 3, 2019 at 3:19 PM Simon Mottram <Si...@cucumber.co.nz>
wrote:

> Hi Josh
>
> Thought as much, thanks very much for taking the time to respond.
>
> Appreciated
>
> Simon

Re: Multi-Tenancy and shared records

Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Josh

Thought as much, thanks very much for taking the time to respond.

Appreciated

Simon

On Tue, 2019-09-03 at 11:19 -0400, Josh Elser wrote:
> Hi Simon,
> 
> Phoenix does not provide any authorization/security layers on top of 
> what HBase does (the thread on user@hbase has a suggestion on cell
> ACLs 
> which is good).
> 
> I think the question you're ultimately asking is: no, the TenantID
> is 
> not an authorization layer. In a nut-shell, the TenantID is just an 
> extra attribute (column) added to your primary key constraint 
> auto-magically. If a user doesn't set a TenantID, then they see _all_
> data.
> 
> Unless you have a layer in-between Phoenix and your end-users that
> add 
> extra guarantees/restrictions, a user could set their own TenantID
> and 
> see other folks' data. I don't think this is a good solution for
> what 
> you're trying to accomplish.

Re: Multi-Tenancy and shared records

Posted by Josh Elser <el...@apache.org>.
Hi Simon,

Phoenix does not provide any authorization/security layers on top of
what HBase does (the thread on user@hbase has a suggestion on cell ACLs,
which is good).

I think the answer to the question you're ultimately asking is: no, the
TenantID is not an authorization layer. In a nutshell, the TenantID is
just an extra attribute (column) added to your primary key constraint
auto-magically. If a user doesn't set a TenantID, then they see _all_ data.

Unless you have a layer in between Phoenix and your end users that adds
extra guarantees/restrictions, a user could set their own TenantID and
see other folks' data. I don't think this is a good solution for what
you're trying to accomplish.
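
To make the point concrete, here is a toy in-memory model of a tenant
column leading the primary key. It is only a sketch: it is not Phoenix or
HBase code, and the names TenantKeyModel and scan, the "|" separator, and
the tenant ids are all assumptions made for illustration. The tenant
value acts as a filter chosen by the caller, and nothing checks who the
caller is.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class TenantKeyModel {
    // Rows keyed by "tenantId|recordId", mimicking a tenant column that leads the key.
    static final NavigableMap<String, String> ROWS = new TreeMap<>();
    static {
        ROWS.put("tenantA|record1", "some secret data");
        ROWS.put("tenantA|record2", "some other secret data");
        ROWS.put("tenantB|record3", "data? what data.");
    }

    // tenantId == null models a connection opened without any tenant id.
    static List<String> scan(String tenantId) {
        List<String> keys = new ArrayList<>();
        for (String key : ROWS.keySet()) {
            if (tenantId == null || key.startsWith(tenantId + "|")) {
                keys.add(key);
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(scan("tenantA")); // only tenantA's rows
        System.out.println(scan("tenantB")); // any caller can ask for tenantB instead
        System.out.println(scan(null));      // no tenant id: every row is returned
    }
}
```

In this model each row carries exactly one tenant value in its key, so a
record shared by two tenants would have to be stored once per tenant.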

On 9/2/19 8:34 PM, Simon Mottram wrote:
> Hi
> 
> I'm working on a project where we have a combination of very sparse
> data columns with added headaches of multi-tenancy.  Hbase looks great
> for the back end but I need to check that we can support the customer's
> multi-tenancy requirements.
> 
> There are 2 that I'm struggling to find a definitive answer for. Any
> info most gratefully received
> 
> Shared Data
> ===========
> Each record in the table must be secured but it could be multiple
> tenants for a record.  Think 'shared' data.
> 
> So for example if you had 3 records
> 
> record1, some secret data
> record2, some other secret data
> record3, data? what data.
> 
> We need
> user1 to be able to see record1 and record2
> user2 to be able to see record2 and record3
> 
>  From what I see in the mult-tenancy doco, the tenant_id field is a
> VARCHAR,  can this be multiple values?
> 
> The actual 'multiple tenant' value would be set at creation and very
> rarely (if ever) changed, but I couldn't guarantee immutability
> 
> 
> Enforced Security
> =================
> Can you prevent access without TenantId?  Otherwise if someone just
> edits the connection info they can sidestep all the multi-tenancy
> features.   Our users include scientific types who will want to connect
> directly using JDBC/Python/Other so we need to be sure to lock this
> data down.
> 
> Of course they want 'admin' types who CAN see all =) Whether there is a
> special connection that allows non-tenanted connections or have a
> multi-tenant key that always contains a master tenantid (yuck)
> 
> If not possible I guess we have to look at doing something at the HBase
> level.
> 
> Best Regards
> 
> Simon
>