Posted to dev@hbase.apache.org by Niels Basjes <Ni...@basjes.nl> on 2014/03/06 21:54:38 UTC

Dynamically deploying filters?

Hi,

In current HBase versions, a Filter has to be deployed by putting a
jar on all region servers (and, depending on the HBase version, restarting
the region servers).

I'm in a multi-tenant cluster environment where we may need to have both
the old and the new version of a Filter available at the same time. Trying
out a new implementation of a Filter (to see if it performs better) would
also be a lot easier if a custom Filter could be used without having to put
it onto all region servers.

So after some Googling I found this interesting experiment for dynamically
uploading the Filter code with the Scan:
http://tech.flurry.com/2012/12/06/exploring-dynamic-loading-of-custom-filters-i/


My question: Is such a feature planned for the mainline HBase?

-- 
Best regards

Niels Basjes

Re: Dynamically deploying filters?

Posted by Niels Basjes <Ni...@basjes.nl>.
In my mind: if a (Filter) class can execute arbitrary code under a specific
userid, then I expect that this code can simply read the HFiles from HDFS
directly. And because it is running under the 'hbase' user, the HDFS
permissions should allow this.

Because I expect this loophole to exist, I think that handling this may be
tricky.


On Sun, Mar 9, 2014 at 4:44 PM, Ted Yu <yu...@gmail.com> wrote:

> The original blog didn't mention security.
>
> If I understand correctly, the application of custom filters is after ACL
> check in a secure cluster.
> The cell visibility feature in 0.98 is implemented through
> VisibilityController which builds on top of BaseRegionObserver.
>
> So we should be fine.
>
>
> On Sun, Mar 9, 2014 at 12:39 AM, Niels Basjes <Ni...@basjes.nl> wrote:
>
> > From what I see, this is not putting those classes into the cluster at
> > all. It looks like it is serializing them during each scan.
> > So this issue does not arise.
> > What I'm thinking about is how to ensure that this is not a way to bypass
> > the security in the cluster.
> >
> > Niels
> > On Mar 7, 2014 1:41 AM, "Ted Yu" <yu...@gmail.com> wrote:
> >
> > > Interesting blog.
> > >
> > > I wonder how subsequent work addresses the following:
> > >
> > > bq. Updating the filter.jar in the Hadoop FS while a table scan is
> > > happening can have undesired results if the updated filters are not
> > > backward compatible.
>



-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Dynamically deploying filters?

Posted by Niels Basjes <Ni...@basjes.nl>.
Thanks for the pointer to HBASE-1936.
Very interesting feature.


On Sun, Mar 9, 2014 at 8:03 PM, Bharath Vissapragada <bh...@cloudera.com> wrote:

> Hey Niels,
>
> Did you go through HBASE-1936? You can just upload the jars to a path in
> HDFS instead of to all region servers.
>
> IIRC, "refreshing" a jar doesn't work. That means you need to add jars with
> new names. Also, since the code uses URLClassLoader, classes once loaded
> cannot be unloaded, because the current loader still holds references to them.
>
> So you need to maintain different versions of your filter in case you are
> using a modified version to test. I believe you can use this for your
> testing. All you need to do is make sure the new versions of the filters have
> new class names and modify your testing code accordingly.
>
>
>
>
>
>
> On Sun, Mar 9, 2014 at 9:14 PM, Ted Yu <yu...@gmail.com> wrote:
>
> > The original blog didn't mention security.
> >
> > If I understand correctly, the application of custom filters is after ACL
> > check in a secure cluster.
> > The cell visibility feature in 0.98 is implemented through
> > VisibilityController which builds on top of BaseRegionObserver.
> >
> > So we should be fine.
> >
> >
> > On Sun, Mar 9, 2014 at 12:39 AM, Niels Basjes <Ni...@basjes.nl> wrote:
> >
> > > From what I see, this is not putting those classes into the cluster at
> > > all. It looks like it is serializing them during each scan.
> > > So this issue does not arise.
> > > What I'm thinking about is how to ensure that this is not a way to bypass
> > > the security in the cluster.
> > >
> > > Niels
> > > On Mar 7, 2014 1:41 AM, "Ted Yu" <yu...@gmail.com> wrote:
> > >
> > > > Interesting blog.
> > > >
> > > > I wonder how subsequent work addresses the following:
> > > >
> > > > bq. Updating the filter.jar in the Hadoop FS while a table scan is
> > > > happening can have undesired results if the updated filters are not
> > > > backward compatible.
> > > >
> > > >
> > > > On Thu, Mar 6, 2014 at 12:54 PM, Niels Basjes <Ni...@basjes.nl>
> wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > In the current HBase versions a Filter needs to be deployed by
> > putting
> > > a
> > > > > jar into all region servers (and depending on the HBase version
> > restart
> > > > the
> > > > > regionservers).
> > > > >
> > > > > I'm in a multi tenant cluster environment where we may run into the
> > > need
> > > > to
> > > > > have both the old and the new version of a Filter available at the
> > same
> > > > > time. Also the option of having a method of easily trying out a new
> > > > > implementation for a Filter (to see if it performs better) would
> be a
> > > lot
> > > > > easier if it were possible to use a custom Filter without having to
> > put
> > > > it
> > > > > onto all region servers.
> > > > >
> > > > > So after some Googling I found this interesting experiment for
> > > > dynamically
> > > > > uploading the Filter code with the Scan:
> > > > >
> > > > >
> > > >
> > >
> >
> http://tech.flurry.com/2012/12/06/exploring-dynamic-loading-of-custom-filters-i/
> > > > >
> > > > >
> > > > > My question: Is such a feature planned for the mainline HBase?
> > > > >
> > > > > --
> > > > > Best regards
> > > > >
> > > > > Niels Basjes
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Bharath Vissapragada
> <http://www.cloudera.com>
>



-- 
Best regards / Met vriendelijke groeten,

Niels Basjes

Re: Dynamically deploying filters?

Posted by Bharath Vissapragada <bh...@cloudera.com>.
Hey Niels,

Did you go through HBASE-1936? You can just upload the jars to a path in
HDFS instead of to all region servers.

IIRC, "refreshing" a jar doesn't work. That means you need to add jars with
new names. Also, since the code uses URLClassLoader, classes once loaded
cannot be unloaded, because the current loader still holds references to them.

So you need to maintain different versions of your filter in case you are
using a modified version to test. I believe you can use this for your
testing. All you need to do is make sure the new versions of the filters have
new class names and modify your testing code accordingly.
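
The class-caching behaviour described above can be reproduced with a small,
self-contained sketch that uses only the JDK (it needs a JDK rather than a
bare JRE, because it invokes the system compiler). This is not HBase code and
makes no claim about how the jar loading from HBASE-1936 is implemented;
ReloadDemo and DemoFilter are hypothetical names, and DemoFilter is a plain
stand-in class, not a real HBase Filter. The sketch compiles a class into a
temporary directory, loads it through a URLClassLoader, overwrites the class
file with a new version, and then loads the class again through the same
loader and through a fresh one:

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class ReloadDemo {

    // Writes and compiles a tiny stand-in class (hypothetical, not an HBase
    // Filter) whose toString() reports the given version label.
    static void compile(Path dir, String version) throws IOException {
        Path src = dir.resolve("DemoFilter.java");
        String code = "public class DemoFilter {"
                + " public String toString() { return \"" + version + "\"; } }";
        Files.write(src, code.getBytes(StandardCharsets.UTF_8));
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null on a bare JRE
        if (javac == null
                || javac.run(null, null, null, "-d", dir.toString(), src.toString()) != 0) {
            throw new IllegalStateException("could not compile DemoFilter");
        }
    }

    // Loads DemoFilter through the given loader and returns its version label.
    static String describe(ClassLoader loader) throws Exception {
        return loader.loadClass("DemoFilter")
                .getDeclaredConstructor().newInstance().toString();
    }

    // Returns the labels seen: before the refresh, after the refresh through
    // the same loader, and after the refresh through a new loader.
    static String[] run() throws Exception {
        Path dir = Files.createTempDirectory("filters");
        URL[] path = { dir.toUri().toURL() };
        compile(dir, "v1");
        // Parent is null so that only the temporary directory can supply DemoFilter.
        try (URLClassLoader first = new URLClassLoader(path, null)) {
            String before = describe(first);
            compile(dir, "v2");             // "refresh" the class file on disk
            String stale = describe(first); // same loader returns its cached class
            try (URLClassLoader second = new URLClassLoader(path, null)) {
                String fresh = describe(second); // new loader reads the new file
                return new String[] { before, stale, fresh };
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(String.join(" ", run()));
    }
}
```

Run on a JDK, this should print "v1 v1 v2": the first loader keeps serving the
class it has already defined, and only a new loader (or, as suggested above, a
new class name) picks up the changed code.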






On Sun, Mar 9, 2014 at 9:14 PM, Ted Yu <yu...@gmail.com> wrote:

> The original blog didn't mention security.
>
> If I understand correctly, the application of custom filters is after ACL
> check in a secure cluster.
> The cell visibility feature in 0.98 is implemented through
> VisibilityController which builds on top of BaseRegionObserver.
>
> So we should be fine.
>
>
> On Sun, Mar 9, 2014 at 12:39 AM, Niels Basjes <Ni...@basjes.nl> wrote:
>
> > From what I see, this is not putting those classes into the cluster at
> > all. It looks like it is serializing them during each scan.
> > So this issue does not arise.
> > What I'm thinking about is how to ensure that this is not a way to bypass
> > the security in the cluster.
> >
> > Niels
> > On Mar 7, 2014 1:41 AM, "Ted Yu" <yu...@gmail.com> wrote:
> >
> > > Interesting blog.
> > >
> > > I wonder how subsequent work addresses the following:
> > >
> > > bq. Updating the filter.jar in the Hadoop FS while a table scan is
> > > happening can have undesired results if the updated filters are not
> > > backward compatible.
>



-- 
Bharath Vissapragada
<http://www.cloudera.com>

Re: Dynamically deploying filters?

Posted by Ted Yu <yu...@gmail.com>.
The original blog didn't mention security.

If I understand correctly, the application of custom filters is after ACL
check in a secure cluster.
The cell visibility feature in 0.98 is implemented through
VisibilityController which builds on top of BaseRegionObserver.

So we should be fine.


On Sun, Mar 9, 2014 at 12:39 AM, Niels Basjes <Ni...@basjes.nl> wrote:

> From what I see, this is not putting those classes into the cluster at
> all. It looks like it is serializing them during each scan.
> So this issue does not arise.
> What I'm thinking about is how to ensure that this is not a way to bypass
> the security in the cluster.
>
> Niels
> On Mar 7, 2014 1:41 AM, "Ted Yu" <yu...@gmail.com> wrote:
>
> > Interesting blog.
> >
> > I wonder how subsequent work addresses the following:
> >
> > bq. Updating the filter.jar in the Hadoop FS while a table scan is
> > happening can have undesired results if the updated filters are not
> > backward compatible.
>

Re: Dynamically deploying filters?

Posted by Niels Basjes <Ni...@basjes.nl>.
From what I see, this is not putting those classes into the cluster at
all. It looks like it is serializing them during each scan.
So this issue does not arise.
What I'm thinking about is how to ensure that this is not a way to bypass
the security in the cluster.

Niels
On Mar 7, 2014 1:41 AM, "Ted Yu" <yu...@gmail.com> wrote:

> Interesting blog.
>
> I wonder how subsequent work addresses the following:
>
> bq. Updating the filter.jar in the Hadoop FS while a table scan is
> happening can have undesired results if the updated filters are not
> backward compatible.
>

Re: Dynamically deploying filters?

Posted by Ted Yu <yu...@gmail.com>.
Interesting blog.

I wonder how subsequent work addresses the following:

bq. Updating the filter.jar in the Hadoop FS while a table scan is
happening can have undesired results if the updated filters are not
backward compatible.


On Thu, Mar 6, 2014 at 12:54 PM, Niels Basjes <Ni...@basjes.nl> wrote:

> Hi,
>
> In current HBase versions, a Filter has to be deployed by putting a
> jar on all region servers (and, depending on the HBase version, restarting
> the region servers).
>
> I'm in a multi-tenant cluster environment where we may need to have both
> the old and the new version of a Filter available at the same time. Trying
> out a new implementation of a Filter (to see if it performs better) would
> also be a lot easier if a custom Filter could be used without having to put
> it onto all region servers.
>
> So after some Googling I found this interesting experiment for dynamically
> uploading the Filter code with the Scan:
>
> http://tech.flurry.com/2012/12/06/exploring-dynamic-loading-of-custom-filters-i/
>
>
> My question: Is such a feature planned for the mainline HBase?
>
> --
> Best regards
>
> Niels Basjes
>