Posted to user@phoenix.apache.org by Simon Mottram <Si...@cucumber.co.nz> on 2019/09/03 00:34:45 UTC
Multi-Tenancy and shared records
Hi
I'm working on a project where we have a combination of very sparse
data columns with the added headache of multi-tenancy. HBase looks great
for the back end, but I need to check that we can support the customer's
multi-tenancy requirements.
There are two that I'm struggling to find a definitive answer for. Any
info most gratefully received.
Shared Data
===========
Each record in the table must be secured, but a record could belong to
multiple tenants. Think 'shared' data.
So for example if you had 3 records
record1, some secret data
record2, some other secret data
record3, data? what data.
We need
user1 to be able to see record1 and record2
user2 to be able to see record2 and record3
From what I see in the multi-tenancy docs, the tenant_id field is a
VARCHAR. Can it hold multiple values?
The actual 'multiple tenant' value would be set at creation and very
rarely (if ever) changed, but I couldn't guarantee immutability.
Enforced Security
=================
Can you prevent access without a TenantId? Otherwise, if someone just
edits the connection info, they can sidestep all the multi-tenancy
features. Our users include scientific types who will want to connect
directly using JDBC/Python/other, so we need to be sure to lock this
data down.
Of course they also want 'admin' types who CAN see everything =) That
means either a special connection that allows non-tenanted access, or a
multi-tenant key that always contains a master tenantid (yuck).
If that's not possible, I guess we have to look at doing something at
the HBase level.
Best Regards
Simon
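
The shared-data requirement in the message above can be pictured as a per-record set of readers rather than a single tenant value. The following is only a minimal in-memory sketch of that model; the class `SharedRecordAcl` and its methods are hypothetical names made up for illustration and are not part of Phoenix or HBase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model of per-record read grants; not a Phoenix or HBase API.
class SharedRecordAcl {
    // record id -> users allowed to read it (TreeMap keeps output order stable)
    private final Map<String, Set<String>> readers = new TreeMap<>();
    // users who may read every record, i.e. the 'admin' types
    private final Set<String> admins;

    SharedRecordAcl(Set<String> admins) {
        this.admins = admins;
    }

    // Replace the reader set of one record.
    void grant(String recordId, String... users) {
        readers.put(recordId, new HashSet<>(Arrays.asList(users)));
    }

    // Ids of all records the given user may read.
    List<String> visibleTo(String user) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : readers.entrySet()) {
            if (admins.contains(user) || e.getValue().contains(user)) {
                visible.add(e.getKey());
            }
        }
        return visible;
    }

    // The three records and two users from the message, plus an assumed 'admin' user.
    static SharedRecordAcl example() {
        SharedRecordAcl acl = new SharedRecordAcl(Collections.singleton("admin"));
        acl.grant("record1", "user1");
        acl.grant("record2", "user1", "user2");
        acl.grant("record3", "user2");
        return acl;
    }

    public static void main(String[] args) {
        SharedRecordAcl acl = example();
        System.out.println("user1 -> " + acl.visibleTo("user1"));
        System.out.println("user2 -> " + acl.visibleTo("user2"));
        System.out.println("admin -> " + acl.visibleTo("admin"));
    }
}
```

Because record2 has two readers, a single scalar tenant_id per row cannot express this grant; some per-record (or per-cell) list of principals is needed.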
Re: Multi-Tenancy and shared records
Posted by Simon Mottram <Si...@cucumber.co.nz>.
Thanks for the response
I'm pretty much tied to Phoenix 4.14.2-HBase-1.4 as we are using Amazon
EMR.
Looks like I can get the table using the same deprecated method that
Phoenix does (tableName comes from the Mutation kvPair in the original
fragment):
    HTableInterface hTable = conn.getQueryServices().getTable(tableName);
    hTable.batch(mutationList);
This saves the data without having to do a JDBC commit(). So it
'looks' like it works, which brings me to the next problem: how to
confirm it!
Time to inspect the HBase API...
That's a fight for another day
Thanks again.
Re: Multi-Tenancy and shared records
Posted by Billy Watson <wi...@gmail.com>.
I ran into this too with other code. Make sure you’re on the same API
version. HBase 2’s APIs changed heavily, so you may have to do some
googling for docs to convert the above code into something usable in
your version of the HBase API.
Also, for your original problem: I’m not sure whether Apache Ranger
supports row-level policies in its HBase plugin yet, but you could
certainly add that functionality to make what you’re talking about a LOT
easier to maintain.
Good luck,
Billy Watson
--
William Watson
Re: Multi-Tenancy and shared records
Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Ankit
Getting stuck into this, but I am having trouble finding out how to
persist the ACL mutations.
The updates to the mutations aren't being persisted as far as I can
tell. I see in your code you are using htable.batch().
I'm struggling to find a way to get hold of that object. I can get a
PTable from the connection using the byte[] table name, but PTable
doesn't have the batch() method.
I'm also unclear on how the htable.batch() method works with
connection.commit(). Is commit required?
I have a horrible feeling that the mutations require an HBase
connection rather than a Phoenix one, and that I have to go direct to
the HBase API.
Best Regards and thanks for the help
Simon
Re: Multi-Tenancy and shared records
Posted by Ankit Singhal <an...@gmail.com>.
>> would it be best to use the HBase API for creating the data.
Yes, you can use the HBase API, but you need to make sure the Phoenix
data type APIs are used to convert your column values into bytes, and
also when building a composite key (if applicable). Otherwise you will
not be able to read the data from Phoenix for any data type other than
VARCHAR or UNSIGNED_LONG.
>> The sparse nature of the data means that I will be constantly adding
>> new columns, not sure if Phoenix would have a problem with that.
Phoenix supports dynamic columns, so you should not have a problem with
that.
Regards,
Ankit Singhal
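
To illustrate why the encoding matters, here is a small stand-alone sketch. It assumes, following the Phoenix data type documentation, that a signed BIGINT is stored as 8 big-endian bytes with the sign bit flipped, while a plain HBase long is 8 big-endian two's-complement bytes. The class `LongEncodingSketch` and its helpers are hypothetical and only mimic those byte layouts; real code would call the Phoenix data type classes instead.

```java
import java.nio.ByteBuffer;

// Stand-alone illustration; these helpers are hypothetical and only mimic the byte layouts.
class LongEncodingSketch {
    // Plain big-endian two's complement, the layout a raw HBase long write would produce.
    static byte[] plainBigEndian(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Same 8 bytes with the sign bit flipped (assumed layout of a signed Phoenix BIGINT).
    static byte[] signFlipped(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v ^ Long.MIN_VALUE).array();
    }

    // Unsigned lexicographic comparison, the order HBase uses for row keys.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < Math.min(a.length, b.length); i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // With the plain layout, -1 sorts after 1 because its first byte is 0xff.
        System.out.println(compareUnsigned(plainBigEndian(-1L), plainBigEndian(1L)) > 0);
        // With the sign bit flipped, -1 sorts before 1.
        System.out.println(compareUnsigned(signFlipped(-1L), signFlipped(1L)) < 0);
    }
}
```

The two layouts differ in the first byte for every value, so a signed column written with the plain layout would be decoded by Phoenix as a different number.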
On Wed, Sep 4, 2019 at 6:24 PM Simon Mottram <Si...@cucumber.co.nz>
wrote:
> Hi Ankit
>
> Thats very useful, many thanks.
>
> Before I dive into using Phoenix (which has given me a torrid time over
> the last few days!), is using Phoenix the best option given that I'm
> doing some low level access to Cell information, or would it be best to
> use the HBase API for creating the data.
>
> We would of course use Phoenix for querying the tables, I'm just
> wondering if the import of data would be better handled via the native
> HBase API.
>
> I think I only need to set labels or use the ACL system, everything
> else should be straight forward. The sparse nature of the data means
> that I will be constantly adding new columns, not sure if Phoenix would
> have a problem with that.
>
> Best Regards
>
> Simon
>
> On Tue, 2019-09-03 at 16:30 -0700, Ankit Singhal wrote:
> > >> If not possible I guess we have to look at doing something at the
> > >> HBase level.
> > As Josh said, it's not yet supported in Phoenix, though you may try
> > using HBase's cell-level security together with some Phoenix internal
> > APIs, and let us know if it works for you.
> > Sharing some sample code if you want to try.
> >
> > /**
> >  * Do writes using cell-based ACLs.
> >  */
> > Properties props = new Properties();
> > // conf = HBase Configuration
> > PhoenixConnection conn = (PhoenixConnection)
> >     QueryUtil.getConnection(props, conf);
> > conn.setAutoCommit(false);
> > conn.createStatement().executeUpdate("<your upsert>");
> > // Pull the pending (uncommitted) mutations out of the connection
> > final Iterator<Pair<byte[], List<Mutation>>> iterator =
> >     conn.getMutationState().toMutations(false);
> > while (iterator.hasNext()) {
> >     Pair<byte[], List<Mutation>> kvPair = iterator.next();
> >     List<Mutation> mutationList = kvPair.getSecond();
> >     byte[] tableName = kvPair.getFirst();
> >     for (Mutation mutation : mutationList) {
> >         // perms is a user -> permissions map
> >         mutation.setACL(perms);
> >     }
> >     // htable is the HBase table handle for tableName (not shown)
> >     htable.batch(mutationList);
> > }
> > // Drop Phoenix's pending copy; the rows were already written
> > // through htable.batch()
> > conn.rollback();
> >
> >
> > On Tue, Sep 3, 2019 at 3:19 PM Simon Mottram <
> > Simon.Mottram@cucumber.co.nz> wrote:
> > > Hi Josh
> > >
> > > Thought as much, thanks very much for taking the time to respond.
> > >
> > > Appreciated
> > >
> > > Simon
> > >
> > > On Tue, 2019-09-03 at 11:19 -0400, Josh Elser wrote:
> > > > Hi Simon,
> > > >
> > > > Phoenix does not provide any authorization/security layers on top
> > > > of what HBase does (the thread on user@hbase has a suggestion on
> > > > cell ACLs, which is good).
> > > >
> > > > I think the answer to the question you're ultimately asking is: no,
> > > > the TenantID is not an authorization layer. In a nutshell, the
> > > > TenantID is just an extra attribute (column) added to your primary
> > > > key constraint auto-magically. If a user doesn't set a TenantID,
> > > > then they see _all_ data.
> > > >
> > > > Unless you have a layer in between Phoenix and your end users that
> > > > adds extra guarantees/restrictions, a user could set their own
> > > > TenantID and see other folks' data. I don't think this is a good
> > > > solution for what you're trying to accomplish.
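
Josh's point, that the TenantID only acts as a leading key column chosen by the client, can be sketched with a toy in-memory table. Everything here is an illustrative assumption: `TenantPrefixSketch`, the `|` separator and the tenant names are hypothetical, and the real Phoenix behaviour is only approximated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a multi-tenant table; not the Phoenix API.
class TenantPrefixSketch {
    // row key = tenantId + '|' + recordId, mirroring a tenant column leading the primary key
    private final TreeMap<String, String> rows = new TreeMap<>();

    void put(String tenantId, String recordId, String value) {
        rows.put(tenantId + "|" + recordId, value);
    }

    // tenantId is whatever the client chooses to send; null means "no tenant set".
    List<String> scan(String tenantId) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : rows.entrySet()) {
            if (tenantId == null || e.getKey().startsWith(tenantId + "|")) {
                out.add(e.getValue());
            }
        }
        return out;
    }

    static TenantPrefixSketch example() {
        TenantPrefixSketch table = new TenantPrefixSketch();
        table.put("tenantA", "record1", "some secret data");
        table.put("tenantB", "record3", "data? what data.");
        return table;
    }

    public static void main(String[] args) {
        TenantPrefixSketch table = example();
        // A client that honestly sets its own tenant sees only its rows.
        System.out.println(table.scan("tenantA"));
        // Nothing stops the same client from sending another tenant id...
        System.out.println(table.scan("tenantB"));
        // ...or none at all, in which case every row comes back.
        System.out.println(table.scan(null));
    }
}
```

The filter is applied purely on a value the caller supplies, which is why it narrows queries but cannot serve as access control.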
Re: Multi-Tenancy and shared records
Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Ankit
That's very useful, many thanks.
Before I dive into using Phoenix (which has given me a torrid time over
the last few days!), is Phoenix the best option given that I'm
doing some low-level access to Cell information, or would it be better to
use the HBase API for creating the data?
We would of course use Phoenix for querying the tables; I'm just
wondering if the import of data would be better handled via the native
HBase API.
I think I only need to set labels or use the ACL system; everything
else should be straightforward. The sparse nature of the data means
that I will be constantly adding new columns, and I'm not sure whether
Phoenix would have a problem with that.
Best Regards
Simon
On Tue, 2019-09-03 at 16:30 -0700, Ankit Singhal wrote:
> >> If not possible I guess we have to look at doing something at the
> >> HBase level.
> As Josh said, it's not yet supported in Phoenix, though you may try
> using cell-level security of HBase with some Phoenix internal API and
> let us know if it works for you.
> Sharing some sample code if you want to try.
>
> /**
>  * Do writes using cell-based ACLs.
>  */
> Properties props = new Properties();
> // conf = HBase Configuration
> PhoenixConnection conn =
>     (PhoenixConnection) QueryUtil.getConnection(props, conf);
> conn.setAutoCommit(false);
> conn.createStatement().executeUpdate("<your upsert>");
> // Take the uncommitted mutations from Phoenix instead of committing them
> final Iterator<Pair<byte[], List<Mutation>>> iterator =
>     conn.getMutationState().toMutations(false);
> while (iterator.hasNext()) {
>     Pair<byte[], List<Mutation>> kvPair = iterator.next();
>     byte[] tableName = kvPair.getFirst();
>     List<Mutation> mutationList = kvPair.getSecond();
>     for (Mutation mutation : mutationList) {
>         // perms is a user -> Permission map
>         mutation.setACL(perms);
>     }
>     // htable = HBase table handle for tableName
>     htable.batch(mutationList);
> }
> // Discard Phoenix's copy; the mutations were already written above
> conn.rollback();
Re: Multi-Tenancy and shared records
Posted by Ankit Singhal <an...@gmail.com>.
>> If not possible I guess we have to look at doing something at the HBase
>> level.
As Josh said, it's not yet supported in Phoenix, though you may try using
cell-level security of HBase with some Phoenix internal API and let us know
if it works for you.
Sharing some sample code if you want to try.
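import java.util.Iterator;
import java.util.List;
import java.util.Properties;
import org.apache.hadoop.hbase.client.Mutation;
import org.apache.hadoop.hbase.util.Pair;
import org.apache.phoenix.jdbc.PhoenixConnection;
import org.apache.phoenix.util.QueryUtil;

/**
 * Do writes using cell-based ACLs.
 */
Properties props = new Properties();
// conf = HBase Configuration
PhoenixConnection conn =
    (PhoenixConnection) QueryUtil.getConnection(props, conf);
conn.setAutoCommit(false);
conn.createStatement().executeUpdate("<your upsert>");
// Take the uncommitted mutations from Phoenix instead of committing them
final Iterator<Pair<byte[], List<Mutation>>> iterator =
    conn.getMutationState().toMutations(false);
while (iterator.hasNext()) {
    Pair<byte[], List<Mutation>> kvPair = iterator.next();
    byte[] tableName = kvPair.getFirst();
    List<Mutation> mutationList = kvPair.getSecond();
    for (Mutation mutation : mutationList) {
        // perms is a user -> Permission map
        mutation.setACL(perms);
    }
    // htable = HBase table handle for tableName
    // (e.g. from conn.getQueryServices().getTable(tableName))
    htable.batch(mutationList);
}
// Discard Phoenix's copy; the mutations were already written above
conn.rollback();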
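The perms map in the snippet above is not defined anywhere in the thread. As a rough sketch only, the sharing rules from the original question (user1 sees record1 and record2, user2 sees record2 and record3) could be modelled as a map from each record to the users allowed to read it. This is a toy model in plain Java with made-up names (SharedRecordAcls, exampleAcls, visibleTo); it does not call HBase, where the values passed to setACL would be Permission objects rather than strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of per-record read ACLs; no HBase classes are involved. */
class SharedRecordAcls {

    /** Hypothetical data: record id -> users allowed to read that record. */
    public static Map<String, Set<String>> exampleAcls() {
        Map<String, Set<String>> acls = new LinkedHashMap<>();
        acls.put("record1", new HashSet<>(Arrays.asList("user1")));
        acls.put("record2", new HashSet<>(Arrays.asList("user1", "user2")));
        acls.put("record3", new HashSet<>(Arrays.asList("user2")));
        return acls;
    }

    /** Returns the ids of the records the user may read, in insertion order. */
    public static List<String> visibleTo(String user, Map<String, Set<String>> acls) {
        List<String> visible = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : acls.entrySet()) {
            if (entry.getValue().contains(user)) {
                visible.add(entry.getKey());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> acls = exampleAcls();
        System.out.println("user1 sees " + visibleTo("user1", acls));
        System.out.println("user2 sees " + visibleTo("user2", acls));
    }
}
```

The point of the model is that a shared record simply lists more than one reader, which is why a single-valued tenant id does not fit this requirement.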
On Tue, Sep 3, 2019 at 3:19 PM Simon Mottram <Si...@cucumber.co.nz>
wrote:
> Hi Josh
>
> Thought as much, thanks very much for taking the time to respond.
>
> Appreciated
>
> Simon
Re: Multi-Tenancy and shared records
Posted by Simon Mottram <Si...@cucumber.co.nz>.
Hi Josh
Thought as much, thanks very much for taking the time to respond.
Appreciated
Simon
On Tue, 2019-09-03 at 11:19 -0400, Josh Elser wrote:
> Hi Simon,
>
> Phoenix does not provide any authorization/security layers on top of
> what HBase does (the thread on user@hbase has a suggestion on cell
> ACLs, which is good).
>
> I think the answer to the question you're ultimately asking is: no, the
> TenantID is not an authorization layer. In a nutshell, the TenantID is
> just an extra attribute (column) added to your primary key constraint
> auto-magically. If a user doesn't set a TenantID, then they see _all_
> data.
>
> Unless you have a layer in between Phoenix and your end users that adds
> extra guarantees/restrictions, a user could set their own TenantID and
> see other folks' data. I don't think this is a good solution for what
> you're trying to accomplish.
Re: Multi-Tenancy and shared records
Posted by Josh Elser <el...@apache.org>.
Hi Simon,
Phoenix does not provide any authorization/security layers on top of
what HBase does (the thread on user@hbase has a suggestion on cell ACLs,
which is good).
I think the answer to the question you're ultimately asking is: no, the
TenantID is not an authorization layer. In a nutshell, the TenantID is just
an extra attribute (column) added to your primary key constraint
auto-magically. If a user doesn't set a TenantID, then they see _all_ data.
Unless you have a layer in between Phoenix and your end users that adds
extra guarantees/restrictions, a user could set their own TenantID and
see other folks' data. I don't think this is a good solution for what
you're trying to accomplish.
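To make the point about the TenantID concrete, here is a toy model in plain Java. It does not use Phoenix at all, and the names (TenantViewModel, Row, visibleIds, exampleTable) are made up for illustration. It assumes only what is described above: the tenant id is the leading part of the row key, a connection with a tenant id sees that tenant's rows, and a connection without one sees everything.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model: the tenant id is just the leading part of the row key. */
class TenantViewModel {

    /** A row keyed by (tenantId, id); hypothetical, for illustration only. */
    static final class Row {
        final String tenantId;
        final String id;

        Row(String tenantId, String id) {
            this.tenantId = tenantId;
            this.id = id;
        }
    }

    /** Hypothetical table contents. */
    public static List<Row> exampleTable() {
        return Arrays.asList(
            new Row("tenantA", "record1"),
            new Row("tenantB", "record2"),
            new Row("tenantB", "record3"));
    }

    /**
     * Ids seen by a connection. A null tenantId models a connection that
     * did not set a tenant id; it is filtered by nothing and sees all rows.
     */
    public static List<String> visibleIds(List<Row> table, String tenantId) {
        List<String> ids = new ArrayList<>();
        for (Row row : table) {
            if (tenantId == null || row.tenantId.equals(tenantId)) {
                ids.add(row.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Row> table = exampleTable();
        System.out.println("tenantA: " + visibleIds(table, "tenantA"));
        System.out.println("tenantB: " + visibleIds(table, "tenantB"));
        System.out.println("no tenant: " + visibleIds(table, null));
    }
}
```

Nothing in this model checks who the caller is; the tenant id is whatever the client chooses to pass, which is why it works as a filter but not as an authorization layer.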
On 9/2/19 8:34 PM, Simon Mottram wrote:
> Hi
>
> I'm working on a project where we have a combination of very sparse
> data columns with added headaches of multi-tenancy. Hbase looks great
> for the back end but I need to check that we can support the customer's
> multi-tenancy requirements.
>
> There are 2 that I'm struggling to find a definitive answer for. Any
> info most gratefully received
>
> Shared Data
> ===========
> Each record in the table must be secured but it could be multiple
> tenants for a record. Think 'shared' data.
>
> So for example if you had 3 records
>
> record1, some secret data
> record2, some other secret data
> record3, data? what data.
>
> We need
> user1 to be able to see record1 and record2
> user2 to be able to see record2 and record3
>
> From what I see in the mult-tenancy doco, the tenant_id field is a
> VARCHAR, can this be multiple values?
>
> The actual 'multiple tenant' value would be set at creation and very
> rarely (if ever) changed, but I couldn't guarantee immutability
>
>
> Enforced Security
> =================
> Can you prevent access without TenantId? Otherwise if someone just
> edits the connection info they can sidestep all the multi-tenancy
> features. Our users include scientific types who will want to connect
> directly using JDBC/Python/Other so we need to be sure to lock this
> data down.
>
> Of course they want 'admin' types who CAN see all =) Whether there is a
> special connection that allows non-tenanted connections or have a
> multi-tenant key that always contains a master tenantid (yuck)
>
> If not possible I guess we have to look at doing something at the HBase
> level.
>
> Best Regards
>
> Simon
>