Posted to user@accumulo.apache.org by Jonathan Wonders <jw...@gmail.com> on 2016/06/09 18:49:43 UTC

visibility constraint performance

Hi All,

I've been tracking down some performance issues on a few 1.6.x environments
and noticed some interesting and potentially undesirable behavior in the
visibility constraint.  When the associated visibility evaluator checks a
token to see if the user has a matching authorization, it uses the security
operations from the tablet server's constraint environment, which ends up
authenticating the user's credentials on each call.  This floods the
logs with Audit messages corresponding to these authentications if
Audit logging is enabled.  It also consumes a non-negligible amount of
CPU, produces a lot of garbage (maybe 50-60% of that generated under a
heavy streaming ingest load), and can cause some contention between client
pool threads when accessing the ZooCache.

My initial measurements indicate a 25-30% decrease in ingest rate
(entries/s and MB/s) for my environment and workload when this constraint
is enabled.  This is with the Audit logging disabled.

Is this intended behavior?  It seems like the authentication is redundant
with the authentication that is performed at the beginning of the update
session.

Thanks,
--Jonathan
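
As an illustration of the behavior described above (not code from the original post), here is a minimal Java sketch that contrasts authenticating on every token check with authenticating once and checking tokens against a cached set of authorizations. Every class and method name here (SketchSecurity, VisibilityCheckSketch, userHasAuthorization, authorizationsFor) is hypothetical and is not Accumulo's actual API; the sketch only counts authentication calls under those assumptions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for server-side security operations; NOT Accumulo's real API.
class SketchSecurity {
    private final Map<String, String> passwords = new HashMap<>();
    private final Map<String, Set<String>> auths = new HashMap<>();
    int authenticationCount = 0; // how many times credentials were verified

    void addUser(String user, String password, Set<String> userAuths) {
        passwords.put(user, password);
        auths.put(user, userAuths);
    }

    boolean authenticate(String user, String password) {
        authenticationCount++;
        return password.equals(passwords.get(user));
    }

    // Assumed shape of the problem: each authorization check re-authenticates.
    boolean userHasAuthorization(String user, String password, String token) {
        return authenticate(user, password) && auths.get(user).contains(token);
    }

    // Alternative: authenticate once and hand back the user's authorizations.
    Set<String> authorizationsFor(String user, String password) {
        if (!authenticate(user, password)) {
            return Collections.emptySet();
        }
        return auths.get(user);
    }
}

class VisibilityCheckSketch {
    static SketchSecurity newSecurity() {
        SketchSecurity s = new SketchSecurity();
        s.addUser("ingest", "secret", new HashSet<>(Arrays.asList("A", "B")));
        return s;
    }

    // One authentication for every token evaluated.
    static int perTokenAuthentications(String[] tokens) {
        SketchSecurity s = newSecurity();
        for (String t : tokens) {
            s.userHasAuthorization("ingest", "secret", t);
        }
        return s.authenticationCount;
    }

    // One authentication up front; tokens are checked against the cached set.
    static int cachedAuthentications(String[] tokens) {
        SketchSecurity s = newSecurity();
        Set<String> cached = s.authorizationsFor("ingest", "secret");
        int matches = 0;
        for (String t : tokens) {
            if (cached.contains(t)) {
                matches++;
            }
        }
        return s.authenticationCount;
    }

    public static void main(String[] args) {
        String[] tokens = {"A", "B", "A", "C"};
        System.out.println("per-token: " + perTokenAuthentications(tokens));
        System.out.println("cached: " + cachedAuthentications(tokens));
    }
}
```

With the four tokens in main, the per-token path authenticates four times and the cached path once. Whether a real fix caches authorizations per update session or skips the credential check some other way is a design decision the thread leaves open.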

Re: visibility constraint performance

Posted by Josh Elser <jo...@gmail.com>.

Jonathan Wonders wrote:
>
> On Thu, Jun 9, 2016 at 3:58 PM, Sean Busbey <busbey@cloudera.com> wrote:
>
>     On Thu, Jun 9, 2016 at 2:47 PM, Josh Elser <josh.elser@gmail.com> wrote:
>     >
>     >  Agreed. Want to open up something on JIRA? It sounds like there might be a
>     >  few things we can investigate.
>
>
> I'm happy to open up a JIRA issue for this.


Boss. Thanks!

>     >
>     >  * Synchronization/concurrency on ZooCache
>     >  * Excessive object creation when using the VisibilityConstraint
>     >  * Noticeable time spent creating Audit messages which are not logged
>     >  (Auditing is disabled)
>     >
>     >  I miss any points?
>
>     sounds like duplicative authentication checks
>
>     --
>     busbey
>
>
> I believe eliminating the redundant authentication checks would fix all
> of these symptoms.
>
> --Jonathan

Excellent.

Re: visibility constraint performance

Posted by Jonathan Wonders <jw...@gmail.com>.
On Thu, Jun 9, 2016 at 3:58 PM, Sean Busbey <bu...@cloudera.com> wrote:

> On Thu, Jun 9, 2016 at 2:47 PM, Josh Elser <jo...@gmail.com> wrote:
> >
> > Agreed. Want to open up something on JIRA? It sounds like there might be a
> > few things we can investigate.
>

I'm happy to open up a JIRA issue for this.


> >
> > * Synchronization/concurrency on ZooCache
> > * Excessive object creation when using the VisibilityConstraint
> > * Noticeable time spent creating Audit messages which are not logged
> > (Auditing is disabled)
> >
> > I miss any points?
>
> sounds like duplicative authentication checks
>
> --
> busbey
>

I believe eliminating the redundant authentication checks would fix all of
these symptoms.

--Jonathan

Re: visibility constraint performance

Posted by Sean Busbey <bu...@cloudera.com>.
On Thu, Jun 9, 2016 at 2:47 PM, Josh Elser <jo...@gmail.com> wrote:
>
> Agreed. Want to open up something on JIRA? It sounds like there might be a
> few things we can investigate.
>
> * Synchronization/concurrency on ZooCache
> * Excessive object creation when using the VisibilityConstraint
> * Noticeable time spent creating Audit messages which are not logged
> (Auditing is disabled)
>
> I miss any points?

sounds like duplicative authentication checks

-- 
busbey

Re: visibility constraint performance

Posted by Josh Elser <jo...@gmail.com>.

Keith Turner wrote:
>
>
> On Thu, Jun 9, 2016 at 2:49 PM, Jonathan Wonders <jwonders88@gmail.com> wrote:
>
>     Hi All,
>
>     I've been tracking down some performance issues on a few 1.6.x
>     environments and noticed some interesting and potentially
>     undesirable behavior in the visibility constraint.  When the
>     associated visibility evaluator checks a token to see if the user
>     has a matching authorization, it uses the security operations from
>     the tablet server's constraint environment which ends up
>     authenticating the user's credentials on each call.  This will end
>     up flooding logs with Audit messages corresponding to these
>     authentications if the Audit logging is enabled.  It also consumes a
>     non-negligible amount of CPU, produces a lot of garbage (maybe
>     50-60% of that generated under a heavy streaming ingest load), and
>     can cause some contention between client pool threads when accessing
>     the ZooCache.
>
>     My initial measurements indicate a 25-30% decrease in ingest rate
>     (entries/s and MB/s) for my environment and workload when this
>     constraint is enabled.  This is with the Audit logging disabled.
>
>     Is this intended behavior?  It seems like the authentication is
>     redundant with the authentication that is performed at the beginning
>     of the update session.
>
>
> No. It would be best to avoid that behavior.

Agreed. Want to open up something on JIRA? It sounds like there might be 
a few things we can investigate.

* Synchronization/concurrency on ZooCache
* Excessive object creation when using the VisibilityConstraint
* Noticeable time spent creating Audit messages which are not logged 
(Auditing is disabled)

I miss any points?
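
The third point above, time spent building Audit messages that are never logged, can be illustrated with a small Java sketch. It uses java.util.logging as a stand-in (an assumption made for this example; it is not necessarily the logging framework or audit code path the tablet server uses), and the AuditSketch class and its methods are hypothetical. The idea is simply to check whether the logger is enabled before formatting the message.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch; the names and the logging framework are assumptions.
class AuditSketch {
    static int messagesBuilt = 0; // counts how often a message string is formatted

    static String buildAuditMessage(String user, String operation) {
        messagesBuilt++;
        return String.format("operation: %s; user: %s", operation, user);
    }

    // Eager: the message is formatted even when the logger discards it.
    static void auditEager(Logger log, String user, String operation) {
        String msg = buildAuditMessage(user, operation);
        log.log(Level.INFO, msg);
    }

    // Guarded: the message is formatted only if it would actually be logged.
    static void auditGuarded(Logger log, String user, String operation) {
        if (log.isLoggable(Level.INFO)) {
            log.log(Level.INFO, buildAuditMessage(user, operation));
        }
    }

    // A logger with auditing switched off, as in the measurements in this thread.
    static Logger disabledLogger() {
        Logger log = Logger.getLogger("audit.sketch");
        log.setLevel(Level.OFF);
        log.setUseParentHandlers(false);
        return log;
    }

    public static void main(String[] args) {
        Logger log = disabledLogger();
        auditEager(log, "ingest", "authenticate");
        System.out.println("built after eager call: " + messagesBuilt);
        auditGuarded(log, "ingest", "authenticate");
        System.out.println("built after guarded call: " + messagesBuilt);
    }
}
```

With the logger disabled, the eager call still formats one message while the guarded call formats none. A guard like this only treats the symptom; removing the redundant authentication would avoid the audit call entirely.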

Re: visibility constraint performance

Posted by Keith Turner <ke...@deenlo.com>.
On Thu, Jun 9, 2016 at 2:49 PM, Jonathan Wonders <jw...@gmail.com>
wrote:

> Hi All,
>
> I've been tracking down some performance issues on a few 1.6.x
> environments and noticed some interesting and potentially undesirable
> behavior in the visibility constraint.  When the associated visibility
> evaluator checks a token to see if the user has a matching authorization,
> it uses the security operations from the tablet server's constraint
> environment which ends up authenticating the user's credentials on each
> call.  This will end up flooding logs with Audit messages corresponding to
> these authentications if the Audit logging is enabled.  It also consumes a
> non-negligible amount of CPU, produces a lot of garbage (maybe 50-60% of
> that generated under a heavy streaming ingest load), and can cause some
> contention between client pool threads when accessing the ZooCache.
>
> My initial measurements indicate a 25-30% decrease in ingest rate
> (entries/s and MB/s) for my environment and workload when this constraint
> is enabled.  This is with the Audit logging disabled.
>
> Is this intended behavior?  It seems like the authentication is redundant
> with the authentication that is performed at the beginning of the update
> session.
>

No. It would be best to avoid that behavior.


>
> Thanks,
> --Jonathan
>