Posted to dev@accumulo.apache.org by Sean Busbey <bu...@cloudera.com> on 2014/03/26 08:10:19 UTC

[DISCUSS] MiniAccumuloCluster goals and approach

ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
intended use and the way we currently implement the difference between MAC
for external use and MAC for internal Accumulo testing[1].

In particular, Josh had a few major concerns:

-----

It doesn't make sense to me why MiniAccumuloCluster is a concrete class
which is ultimately still tied to a MiniAccumuloClusterImpl.
MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something that
extends it. This is what's really chafing me about the separation of
"accumulo user" and "accumulo developer" methods – you *always* have them
both. Not to mention, this hierarchy makes it really obnoxious to create a
new implementation of AccumuloMiniCluster(Impl) because I have to carry all
of the cruft of the "original" implementation with me.

Bringing this back around to this ticket, I still don't agree with the
reasoning that exposing the FileSystem or ZooKeeper object only on
MiniAccumuloClusterImpl is getting us anything other than the ability to
say "we didn't change this [arbitrary] API". For "users" who might not care
what the underlying FileSystem or ZooKeeper connection is, it's merely an
extra two items in their editor's code-completion. For "users" who would
care to use this information, we now make them jump through extra hoops to
get it. That just doesn't make any sense to me for something we haven't
even released.

To be honest, I really want to re-open
ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
implementation of said interface, and create some factory class to make
instances, à la Connector.tableOperations, Connector.securityOperations,
etc. Right now there's a class we call an "API" that cannot be generically
extended for the sake of saying "we have an API".

----

I wanted to avoid having a drawn-out discussion on a JIRA, where folks may
not notice it, especially with things being late in 1.6.0 development and
the potential this has to impact the API.

Personally, I don't have much of a dog in the fight. There's always some
arbitrary line for where the public API will be, presuming we want to have
any kind of balance between providing a stable base for others to build on
and being able to refactor things. I would like us to hold to our API
promises[2] and I would rather we not leak implementation details
unnecessarily.

I suspect the choice to make MiniAccumuloCluster a class rather than an
interface with a factory was driven by the restrictions we put on API
changes between major versions and the fact that 1.5 had a class you could
instantiate via constructors[3].
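
For anyone skimming, here is a rough sketch of the interface-plus-factory
shape being contrasted with a class you instantiate via constructors. Every
name in it (SketchCluster, LocalSketchCluster, SketchClusters) is made up
for illustration and is not the actual minicluster API; it only shows how
callers coded against the interface never need to name the implementation.

```java
import java.io.File;

// Hypothetical sketch: these types are NOT Accumulo's real classes.

/** The only type that user code would reference. */
interface SketchCluster {
  void start();

  void stop();

  String getInstanceName();
}

/** One possible implementation; package-private so it can change freely. */
final class LocalSketchCluster implements SketchCluster {
  private final File dir;
  private final String instanceName;
  private boolean running;

  LocalSketchCluster(File dir, String instanceName) {
    this.dir = dir;
    this.instanceName = instanceName;
  }

  @Override
  public void start() {
    running = true; // a real implementation would launch processes under dir
  }

  @Override
  public void stop() {
    running = false;
  }

  @Override
  public String getInstanceName() {
    return instanceName;
  }

  // Package access: visible to internal code, invisible to outside users.
  boolean isRunning() {
    return running;
  }

  File getDir() {
    return dir;
  }
}

/** Factory: the single place that knows which implementation is returned. */
final class SketchClusters {
  private SketchClusters() {}

  static SketchCluster newLocalCluster(File dir, String instanceName) {
    return new LocalSketchCluster(dir, instanceName);
  }
}
```

With this shape, swapping in a different implementation would only touch
the factory, not the callers.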

It's possible we can address some of the major reusability concerns by
moving most of the implementation back into MAC, liberally using package
access for members, and making the internal-use MAC extend the public one.


[1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
[2]: http://accumulo.apache.org/governance/releasing.html
[3]: https://issues.apache.org/jira/browse/ACCUMULO-2151

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 12:44 PM, Sean Busbey <bu...@cloudera.com> wrote:

> On Wed, Mar 26, 2014 at 11:17 AM, Keith Turner <ke...@deenlo.com> wrote:
>
> >
> >
> > There is a slow way to introduce an interface.
> >
> >  1) Deprecate MAC constructors and add factory in 1.7.0
> >  2) In 1.10.0 drop constructor and change to interface.
> >
> >
> >
> That helps, but still breaks binary compatibility. The class files compiled
> against the original version will still throw an
> IncompatibleClassChangeError when they attempt to get back instances from
> the factory.
>

NM then.  I thought we did something like this w/ Connector, but I see now
it's an abstract class.  So it was changed from a concrete class to an
abstract class.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 11:17 AM, Keith Turner <ke...@deenlo.com> wrote:

>
>
> There is a slow way to introduce an interface.
>
>  1) Deprecate MAC constructors and add factory in 1.7.0
>  2) In 1.10.0 drop constructor and change to interface.
>
>
>
That helps, but still breaks binary compatibility. The class files compiled
against the original version will still throw an
IncompatibleClassChangeError when they attempt to get back instances from
the factory.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 11:38 AM, Josh Elser <jo...@gmail.com> wrote:

>
>
> On 3/26/14, 12:10 AM, Sean Busbey wrote:
>
>> ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
>> intended use and the way we currently implement the difference between MAC
>> for external use and MAC for internal Accumulo testing[1].
>>
>> In particular, Josh had a few major concerns
>>
>> -----
>>
>> It doesn't make sense to me why MiniAccumuloCluster is a concrete class
>> which is ultimately still tied to a MiniAccumuloClusterImpl.
>> MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something that
>> extends it. This is what's really chafing me about the separation of
>> "accumulo user" and "accumulo developer" methods - you *always* have them
>> both. Not to mention, this hierarchy is really obnoxious to create a new
>> implementation of AccumuloMiniCluster(Impl) because I have to carry all of
>> the cruft of the "original" implementation with me.
>>
>> Bringing this back around to this ticket, while I still don't agree with
>> the reasoning that exposing the FileSystem or ZooKeeper object that
>> MiniAccumuloClusterImpl is getting us anything other than the ability to
>> say "we didn't change this [arbitrary] API". For "users" who might not
>> care
>> what the underlying FileSystem or ZooKeeper connection is, it's merely an
>> extra two items in their editor's code-completion. For "users" who would
>> care to use this information, we now make them jump through extra hoops to
>> get it. That just doesn't make any sense to me for something we haven't
>> even released.
>>
>> To be honest, I really want to re-open
>> ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
>> make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
>> implementation of said interface, and create some factory class to make
>> instances, ala Connector.tableOperations, Connector.securityOperations,
>> etc. Right now there's a class we call an "API" that cannot be generically
>> extended for the sake of saying "we have an API".
>>
>> ----
>>
>> I wanted to avoid having a drawn out discussion on a jira, where folks may
>> not notice it. Especially with things being late in 1.6.0 development and
>> the potential this has to impact the API.
>>
>> Personally, I don't have much of a dog in the fight. There's always some
>> arbitrary line for where the public API will be, presuming we want to have
>> any kind of balance between providing a stable base for others to build
>> on
>> and being able to refactor things. I would like us to hold to our API
>> promises[2] and I would rather we not leak implementation details
>> unnecessarily.
>>
>> I suspect the choice to make MiniAccumuloCluster a class rather than an
>> interface with a factory was driven by the restrictions we put on API
>> changes between major versions and the fact that 1.5 had a class you could
>> instantiate via constructors[3].
>>
>
> Ok, that makes the most sense to me - I hadn't previously considered the
> "deprecation" cycle since it previously wasn't held to that standard. I
> mocked up some changes to better match the rest of our "public API" objects
> (class, interfaces, factories). If we want to preserve this API
> compatibility from earlier versions, we *could* name the interfaces classes
> (what would normally be MiniAccumuloCluster and MiniAccumuloConfig) to
> something non-standard to support API compatibility (just requiring a
> re-compilation I think).
>
> I would be behind getting the interfaces how we want them long term before
> categorizing these classes as "public". I'm willing to make sure we're all
> happy with this for 1.6.0.


There is a slow way to introduce an interface.

 1) Deprecate MAC constructors and add factory in 1.7.0
 2) In 1.10.0 drop constructor and change to interface.
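
Step 1 might look something like the sketch below. SketchMac and its
create() factory are hypothetical names, not the real MiniAccumuloCluster;
the only point is that the old constructor keeps working (with a
deprecation warning for callers) while new code moves to the factory.

```java
// Hypothetical stand-in for the MAC class; not the real API.
class SketchMac {
  private final String instanceName;

  /**
   * Old entry point, kept so existing callers still compile and link.
   *
   * @deprecated use {@link #create(String)} instead
   */
  @Deprecated
  public SketchMac(String instanceName) {
    this.instanceName = instanceName;
  }

  /** New entry point; callers that use it never name a constructor. */
  public static SketchMac create(String instanceName) {
    return new SketchMac(instanceName);
  }

  public String getInstanceName() {
    return instanceName;
  }
}
```

Step 2 would then delete the constructor and change the type itself, which
is the part that already-compiled callers would notice.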


>
>
>  It's possible we can address some of the major reusability concerns by
>> moving most of the implementation back into MAC, liberally using package
>> access for members, and making the internal-use MAC extend the public one.
>>
>>
>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
>> [2]: http://accumulo.apache.org/governance/releasing.html
>> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>>
>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 10:38 AM, Josh Elser <jo...@gmail.com> wrote:

>
>
>>
>> I suspect the choice to make MiniAccumuloCluster a class rather than an
>> interface with a factory was driven by the restrictions we put on API
>> changes between major versions and the fact that 1.5 had a class you could
>> instantiate via constructors[3].
>>
>
> Ok, that makes the most sense to me - I hadn't previously considered the
> "deprecation" cycle since it previously wasn't held to that standard. I
> mocked up some changes to better match the rest of our "public API" objects
> (class, interfaces, factories). If we want to preserve this API
> compatibility from earlier versions, we *could* name the interfaces classes
> (what would normally be MiniAccumuloCluster and MiniAccumuloConfig) to
> something non-standard to support API compatibility (just requiring a
> re-compilation I think).
>
> I would be behind getting the interfaces how we want them long term before
> categorizing these classes as "public". I'm willing to make sure we're all
> happy with this for 1.6.0.
>
>>
>>
We already declared minicluster public in 1.5.0 and 1.5.1, so we're a bit
bound as is. We don't clarify in our docs if we maintain source
compatibility or binary compatibility.  I would caution that users tend to
presume both (and maybe binary a little more), based on the continued
dismay that happens around the Hadoop 1 -> Hadoop 2 Class/Interface stuff.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.

On 3/26/14, 12:10 AM, Sean Busbey wrote:
> ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
> intended use and the way we currently implement the difference between MAC
> for external use and MAC for internal Accumulo testing[1].
>
> In particular, Josh had a few major concerns
>
> -----
>
> It doesn't make sense to me why MiniAccumuloCluster is a concrete class
> which is ultimately still tied to a MiniAccumuloClusterImpl.
> MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something that
> extends it. This is what's really chafing me about the separation of
> "accumulo user" and "accumulo developer" methods – you *always* have them
> both. Not to mention, this hierarchy is really obnoxious to create a new
> implementation of AccumuloMiniCluster(Impl) because I have to carry all of
> the cruft of the "original" implementation with me.
>
> Bringing this back around to this ticket, while I still don't agree with
> the reasoning that exposing the FileSystem or ZooKeeper object that
> MiniAccumuloClusterImpl is getting us anything other than the ability to
> say "we didn't change this [arbitrary] API". For "users" who might not care
> what the underlying FileSystem or ZooKeeper connection is, it's merely an
> extra two items in their editor's code-completion. For "users" who would
> care to use this information, we now make them jump through extra hoops to
> get it. That just doesn't make any sense to me for something we haven't
> even released.
>
> To be honest, I really want to re-open
> ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
> make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
> implementation of said interface, and create some factory class to make
> instances, ala Connector.tableOperations, Connector.securityOperations,
> etc. Right now there's a class we call an "API" that cannot be generically
> extended for the sake of saying "we have an API".
>
> ----
>
> I wanted to avoid having a drawn out discussion on a jira, where folks may
> not notice it. Especially with things being late in 1.6.0 development and
> the potential this has to impact the API.
>
> Personally, I don't have much of a dog in the fight. There's always some
> arbitrary line for where the public API will be, presuming we want to have
> any kind of balance between providing a stable base for others to build on
> and being able to refactor things. I would like us to hold to our API
> promises[2] and I would rather we not leak implementation details
> unnecessarily.
>
> I suspect the choice to make MiniAccumuloCluster a class rather than an
> interface with a factory was driven by the restrictions we put on API
> changes between major versions and the fact that 1.5 had a class you could
> instantiate via constructors[3].

Ok, that makes the most sense to me - I hadn't previously considered the 
"deprecation" cycle since it previously wasn't held to that standard. I 
mocked up some changes to better match the rest of our "public API" 
objects (class, interfaces, factories). If we want to preserve this API 
compatibility from earlier versions, we *could* rename the interface
classes (what would normally be MiniAccumuloCluster and
MiniAccumuloConfig) to something non-standard to support API
compatibility (just requiring a re-compilation I think).

I would be behind getting the interfaces how we want them long term 
before categorizing these classes as "public". I'm willing to make sure 
we're all happy with this for 1.6.0.

> It's possible we can address some of the major reusability concerns by
> moving most of the implementation back into MAC, liberally using package
> access for members, and making the internal-use MAC extend the public one.
>
>
> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
> [2]: http://accumulo.apache.org/governance/releasing.html
> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 11:06 AM, Keith Turner <ke...@deenlo.com> wrote:

>
> As Sean said MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
> interface would break things for any users using it.  Any reorganizing of
> the implementation of MAC could easily be done after 1.6.0.  From a user's
> perspective the MAC API has changed very little, even though the
> implementation has dramatically changed.
>

One minor point of clarification: while the MAC was backported to 1.4.4+ it
was not made a part of the public API in those releases.

I have been presuming this was an intentional compromise position. I like
the current situation and would prefer we have major additions like the MAC
released for a version before adding them to the public API (even though
this specific case was a retrofit). Still, if our intention is for MAC in
1.4 to be a part of the supported API, I'll need to file a ticket and fix it.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 11:12 AM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:06 AM, Keith Turner wrote:
>
>> There were many changes made to MAC so Accumulo could test itself.  For
>> example a method was added to return the internal threads that flush logs.
>> Flushing the logs may be a useful feature to add.  However it could be
>> offered in a way that does not expose these internal threads.   When
>> working on  ACCUMULO-2151 I had no desire to reimplement things like this,
>> I just wanted to hide it.  It was hidden from users so we do not have to
>> support it and can change it at will when testing 1.7.0.
>>
>
> That's my irk with it. The changes we made "hide" things for no other
> purpose than saying "we hid them". The next variant of a MAC is going to
> have to re-architect the entire thing anyways (I'm doing this right now and
> I'm overhauling it).
>
>
>
We also haven't labeled anything @deprecated, so this same interface would
have to be supported in 1.7.x to meet our API promises.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 11:26 AM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:23 AM, Keith Turner wrote:
>
>> That's my irk with it. The changes we made "hide" things for no other
>>> >purpose than saying "we hid them". The next variant of a MAC is going to
>>> >have to re-architect the entire thing anyways (I'm doing this right now
>>> and
>>> >I'm overhauling it).
>>> >
>>>
>> There is a purpose.  Whats an alternative solution to the addition of
>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>
>
> Personally, I wouldn't have really cared if such a method was added to its
> API.
>
>
I would not want this added to the public API.

IMO, ACCUMULO-2151 has two key goals:

1) keep the public API simple and well-defined for the use of people
testing against the rest of our public API

2) allow us to leverage the codebase for our internal testing of
Accumulo.

Expanding the former to help the latter doesn't get us anywhere. It just
binds up what we can change in the thing we use for testing because we
exposed it to the class of users who rely on us to have a boundary for
slowing down changes.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Mike Drob <ma...@cloudera.com>.
On Wed, Mar 26, 2014 at 12:26 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:23 AM, Keith Turner wrote:
>
>> That's my irk with it. The changes we made "hide" things for no other
>>> >purpose than saying "we hid them". The next variant of a MAC is going to
>>> >have to re-architect the entire thing anyways (I'm doing this right now
>>> and
>>> >I'm overhauling it).
>>> >
>>>
>> There is a purpose.  Whats an alternative solution to the addition of
>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>
>
> Personally, I wouldn't have really cared if such a method was added to its
> API.
>
>
>  If you want to re-write MAC all you have to support is the interface in
>> minicluster, you are free to throw everything in minicluster.impl away.
>>
>>
>>
> No, not with the "interface" explicitly referencing MiniAccumuloC*Impl
> internally, I can't. I do not see any way I can throw away the existing
> impl given the API wrapper. Am I missing something?
>

This sounds like a bug that is orthogonal and should be addressed
regardless of the outcome of everything else.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
On Wed, Mar 26, 2014 at 11:46 AM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:33 AM, Keith Turner wrote:
>
>> On Wed, Mar 26, 2014 at 12:26 PM, Josh Elser <jo...@gmail.com>
>> wrote:
>>
>>  On 3/26/14, 9:23 AM, Keith Turner wrote:
>>>
>>>  That's my irk with it. The changes we made "hide" things for no other
>>>>
>>>>> purpose than saying "we hid them". The next variant of a MAC is going
>>>>>> to
>>>>>> have to re-architect the entire thing anyways (I'm doing this right
>>>>>> now
>>>>>>
>>>>> and
>>>>>
>>>>>> I'm overhauling it).
>>>>>>
>>>>>>
>>>>>  There is a purpose.  Whats an alternative solution to the addition of
>>>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>>>
>>>>
>>> Personally, I wouldn't have really cared if such a method was added to
>>> its
>>> API.
>>>
>>
>>
>> Why not?  It needlessly exposes a MAC implementation detail.  Java 7
>> offers
>> a much better way to handle this situation and makes the need for these
>> threads go away. As I said flushing the logs could be offered in the API
>> in
>> a much nicer way.  Thats one solution.
>>
>>
> If it was needless as you claim, why was it added in the first place as a
> public method?
>


AFAICT, it's used in internal tests (that are in a different package) to
make sure things have been flushed to disk before verifying internal state
(because checking that state as files in HDFS is simpler than walking
in-memory representations).
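
As a rough illustration of the alternative Keith describes in the quoted
text (offer flushing itself rather than handing out the writers), something
like the following would cover that test need. SketchLogWriter and
SketchFlushableCluster are invented for this sketch and are not the real
LogWriter or MAC classes; the "disk" here is just an in-memory buffer.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a log writer; not the real minicluster LogWriter.
final class SketchLogWriter {
  private final StringBuilder pending = new StringBuilder();
  private final StringBuilder onDisk = new StringBuilder(); // pretend file

  void write(String line) {
    pending.append(line).append('\n');
  }

  void flush() {
    onDisk.append(pending);
    pending.setLength(0);
  }

  String flushedContents() {
    return onDisk.toString();
  }
}

// Hypothetical cluster facade: exposes the operation, not the writers.
class SketchFlushableCluster {
  private final List<SketchLogWriter> logWriters = new ArrayList<>();

  // Package access, for internal wiring only.
  SketchLogWriter addLogWriter() {
    SketchLogWriter w = new SketchLogWriter();
    logWriters.add(w);
    return w;
  }

  /** What a test would call before inspecting on-disk state. */
  public void flushLogs() {
    for (SketchLogWriter w : logWriters) {
      w.flush();
    }
  }
}
```

The list of writers stays private, so how logs get written can change
without touching anything a caller depends on.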

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by John Vines <vi...@apache.org>.
I think it would be best if you could throw a different MACConfig at it to
have it vary in behavior, rather than different implementations. I'd like
to think that this would provide the most backward compatibility and ease
of use, but I could be mistaken.
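
A minimal sketch of what config-driven behavior could look like, purely as
an illustration: SketchMacConfig, its Deployment enum, and
SketchConfiguredCluster are assumed names, not the real MiniAccumuloConfig
or MiniAccumuloCluster, and nothing here actually launches processes.

```java
// Hypothetical sketch: neither type is the real MiniAccumuloConfig/Cluster.
final class SketchMacConfig {
  enum Deployment { LOCAL_PROCESSES, YARN }

  private Deployment deployment = Deployment.LOCAL_PROCESSES;
  private int numTservers = 2;

  SketchMacConfig setDeployment(Deployment deployment) {
    this.deployment = deployment;
    return this;
  }

  SketchMacConfig setNumTservers(int numTservers) {
    this.numTservers = numTservers;
    return this;
  }

  Deployment getDeployment() {
    return deployment;
  }

  int getNumTservers() {
    return numTservers;
  }
}

// One cluster class whose behavior varies only with the config it is given.
final class SketchConfiguredCluster {
  private final SketchMacConfig config;

  SketchConfiguredCluster(SketchMacConfig config) {
    this.config = config;
  }

  /** Describes what start() would launch; a real cluster would launch it. */
  String describeLaunchPlan() {
    switch (config.getDeployment()) {
      case YARN:
        return config.getNumTservers() + " tservers in YARN containers";
      case LOCAL_PROCESSES:
      default:
        return config.getNumTservers() + " tservers as local processes";
    }
  }
}
```

Callers keep constructing the same class; only the config object changes.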


On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 10:57 AM, Keith Turner wrote:
>
>> Can you give an example of what you are thinking of? I don't understand
>> your viewpoint either
>>
>
> Sure. One limitation of MAC, in general as a testing harness, is that it
> doesn't adequately exercise multi-node implementations. You can run
> multiple tservers, but they are all on the same host which limits the
> validity of a "robust" test. This is my immediate goal.
>
> Multi-node deployments are possible using something like Mesos or Yarn.
> Given that there is already functioning support to deploy Accumulo on Yarn,
> this was my goal.
>
> My goal is to be able to run all of our AbstractMacIT
> implementations against "real" hardware without changing a single line of
> test code (ok - maybe a line or two to do injection of the MAC
> implementation). The point is, I believe there could be a huge testing gain
> from being able to write tests which leverage yarn, have the same
> programmatic configuration API from MAC, and provide near "real" Accumulo
> semantics.
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by William Slacum <wi...@accumulo.net>.
I think this is better reserved for a version later than 1.6.0. It's an
11th hour change in addition to being a large overhaul of the interfaces to
support functionality we never intended for 1.6.0.


On Fri, Mar 28, 2014 at 4:04 PM, Josh Elser <jo...@gmail.com> wrote:

> Forgot to also add, that I would add the experimental annotation to
> alleviate confusion.
>
> The already mocked minimum set of methods on the interface which I posted
> to github Is a first pass. If we miss something that is in fact common, we
> can add it later, anything else is likely destined for the implementation.
>
> On Friday, March 28, 2014, Keith Turner <ke...@deenlo.com> wrote:
>
> > On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <josh.elser@gmail.com>
> > wrote:
> >
> > > Not even the addition of a new interface, Christopher? I'd very much
> like
> > > to have an interface that we can get in 1.6.0 at a minimum. I wouldn't
> > even
> > > push for any deprecation of what's currently in place.
> > >
> >
> > W/o deprecation it seems very confusing.   The intent is that users
> should
> > use the new one, but the old one is not deprecated.  If someone is
> > completely new to this, how will they know which option to use?
> >
> > Once you get down in the weeds of working on this, do you think you might
> > end wanting something very different?
> >
> >
> >
> > > On Mar 28, 2014 10:02 AM, "Christopher" <ct...@apache.org> wrote:
> > >
> > > > I don't think any of this should be done for 1.6.0, but I like the
> > > > idea of creating a separate cluster interface for testing. I think it
> > > > should be integrated into the accumulo-maven-plugin, also. I think
> the
> > > > idea should be hammered out, and tested as a separate thing, to
> > > > experiment with the options, and provided as a complete feature for
> > > > the next major release. If it would change packaging dependencies, it
> > > > shouldn't even be done for 1.6.x bugfix releases.
> > > >
> > > > --
> > > > Christopher L Tubbs II
> > > > http://gravatar.com/ctubbsii
> > > >
> > > >
> > > > On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser <jo...@gmail.com>
> > > wrote:
> > > > > Oh, I like that idea, Bill & Sean.
> > > > >
> > > > > Package: org.apache.accumulo.cluster
> > > > > Public API: org.apache.accumulo.cluster.AccumuloCluster
> > > > > MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster
> (implements
> > > > > AccumuloCluster, allows for backwards compat)
> > > > > Yarn: org.apache.accumulo.cluster.yarn
> > > > > Docker: ...
> > > > > Mesos: ...
> > > > >
> > > > > etc etc etc.
> > > > >
> > > > > One question in my mind, do we keep the maven module
> > > > 'accumulo-minicluster'?
> > > > > I would imagine that if we struck the 'mini' portion from 1.6 that
> > > would
> > > > > create some confusion. Would it be worth the indirection to rename
> > > > > accumulo-minicluster to accumulo-cluster and then create a new
> > > > > accumulo-minicluster module that depends on accumulo-cluster
> (but
> > > > > contains no code itself) to preserve the 1.4 and 1.5 poms to
> > generally
> > > > work
> > > > > with a version bump? I'm not sure if Maven would be happy with that
> > or
> > > do
> > > > > what I think it "should".
> > > > >
> > > > >
> > > > > On 3/28/14, 6:26 AM, Bill Havanki wrote:
> > > > >>
> > > > >> I've been watching the conversation on the side, but I wanted to
> > > mention
> > > > >> that it seems the focus isn't so much on "mini" clusters anymore.
> > > You're
> > > > >> thinking of programmatic cluster management, whether one node or
> > many.
> > > > The
> > > > >> idea of a basic cluster management interface, with MAC as an
> > > > >> implementation, is promising. A package name of just "cluster"
> could
> > > > work.
> > > > >>
> > > > >> Carry on :)
> > > > >>
> > > > >> Bill H
> > > > >>
> > > > >>
> > > > >> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey
> > > > >> <bu...@cloudera.com> wrote:
> > > > >>
> > > > >>> If you decide to go the mapred/mapreduce way, you could go with
> the
> > > > >>> package
> > > > >>> name "mini".
> > > > >>>
> > > > >>> alternatively, we can do a multi-stage change out
> > > > >>>
> > > > >>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
> > > > >>> MiniAccumuloCluster class and make it implement
> TestAccumuloCluster
> > > > >>>
> > > > >>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that
> > > extends
> > > > >>> TestAccumuloCluster, @deprecate TestAccumuloCluster
> > > > >>>
> > > > >>> 3) 1.6 + 2 major: remove TestAccumuloCluster
> > > > >>>
> > > > >>> Or just go with TestAccumuloCluster as the interface, have
> > > > >>> MiniAccumuloCluster as the local pseudo distributed
> implementation,
> > > and
> > > > >>> then call your new one something like YarnAccumuloCluster.
> > > > >>>
> > > > >>> In that case we could use the deprecation cycle to move the MAC
> > class
> > > > out
> > > > >>> of the public api.
> > > > >>>
> > > > >>>
> > > > >>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Forgot to also add that I would add the experimental annotation to
alleviate confusion.

The already mocked-up minimum set of methods on the interface, which I
posted to GitHub, is a first pass. If we miss something that is in fact
common, we can add it later; anything else is likely destined for the
implementation.

On Friday, March 28, 2014, Keith Turner <ke...@deenlo.com> wrote:

> On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <josh.elser@gmail.com>
> wrote:
>
> > Not even the addition of a new interface, Christopher? I'd very much like
> > to have an interface that we can get in 1.6.0 at a minimum. I wouldn't
> even
> > push for any deprecation of what's currently in place.
> >
>
> W/o deprecation it seems very confusing.   The intent is that users should
> use the new one, but the old one is not deprecated.  If someone is
> completely new to this, how will they know which option to use?
>
> Once you get down in the weeds of working on this, do you think you might
> end wanting something very different?
>
>
>
> > On Mar 28, 2014 10:02 AM, "Christopher" <ct...@apache.org> wrote:
> >
> > > I don't think any of this should be done for 1.6.0, but I like the
> > > idea of creating a separate cluster interface for testing. I think it
> > > should be integrated into the accumulo-maven-plugin, also. I think the
> > > idea should be hammered out, and tested as a separate thing, to
> > > experiment with the options, and provided as a complete feature for
> > > the next major release. If it would change packaging dependencies, it
> > > shouldn't even be done for 1.6.x bugfix releases.
> > >
> > > --
> > > Christopher L Tubbs II
> > > http://gravatar.com/ctubbsii
> > >
> > >
> > > On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser <jo...@gmail.com>
> > wrote:
> > > > Oh, I like that idea, Bill & Sean.
> > > >
> > > > Package: org.apache.accumulo.cluster
> > > > Public API: org.apache.accumulo.cluster.AccumuloCluster
> > > > MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements
> > > > AccumuloCluster, allows for backwards compat)
> > > > Yarn: org.apache.accumulo.cluster.yarn
> > > > Docker: ...
> > > > Mesos: ...
> > > >
> > > > etc etc etc.
> > > >
> > > > One question in my mind, do we keep the maven module
> > > 'accumulo-minicluster'?
> > > > I would imagine that if we struck the 'mini' portion from 1.6 that
> > would
> > > > create some confusion. Would it be worth the indirection to rename
> > > > accumulo-minicluster to accumulo-cluster and then create a new
> > > > accumulo-minicluster module that depends on accumulo-cluster (but
> > > > contains no code itself) to preserve the 1.4 and 1.5 poms to
> generally
> > > work
> > > > with a version bump? I'm not sure if Maven would be happy with that
> or
> > do
> > > > what I think it "should".
> > > >
> > > >
> > > > On 3/28/14, 6:26 AM, Bill Havanki wrote:
> > > >>
> > > >> I've been watching the conversation on the side, but I wanted to
> > mention
> > > >> that it seems the focus isn't so much on "mini" clusters anymore.
> > You're
> > > >> thinking of programmatic cluster management, whether one node or
> many.
> > > The
> > > >> idea of a basic cluster management interface, with MAC as an
> > > >> implementation, is promising. A package name of just "cluster" could
> > > work.
> > > >>
> > > >> Carry on :)
> > > >>
> > > >> Bill H
> > > >>
> > > >>
> > > >> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey
> > > >> <bu...@cloudera.com> wrote:
> > > >>
> > > >>> If you decide to go the mapred/mapreduce way, you could go with the
> > > >>> package
> > > >>> name "mini".
> > > >>>
> > > >>> alternatively, we can do a multi-stage change out
> > > >>>
> > > >>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
> > > >>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
> > > >>>
> > > >>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that
> > extends
> > > >>> TestAccumuloCluster, @deprecate TestAccumuloCluster
> > > >>>
> > > >>> 3) 1.6 + 2 major: remove TestAccumuloCluster
> > > >>>
> > > >>> Or just go with TestAccumuloCluster as the interface, have
> > > >>> MiniAccumuloCluster as the local pseudo distributed implementation,
> > and
> > > >>> then call your new one something like YarnAccumuloCluster.
> > > >>>
> > > >>> In that case we could use the deprecation cycle to move the MAC
> class
> > > out
> > > >>> of the public api.
> > > >>>
> > > >>>
> > > >>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <jo...@gmail.com> wrote:

> Not even the addition of a new interface, Christopher? I'd very much like
> to have an interface that we can get in 1.6.0 at a minimum. I wouldn't even
> push for any deprecation of what's currently in place.
>

W/o deprecation it seems very confusing.   The intent is that users should
use the new one, but the old one is not deprecated.  If someone is
completely new to this, how will they know which option to use?
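
To make the contrast concrete, here is a minimal sketch of what the
deprecation route could look like. It is not code from any branch: the
names TestAccumuloCluster and MiniAccumuloCluster come from this thread,
but the methods and class bodies are simplified, hypothetical stand-ins.

```java
// Hypothetical, simplified stand-ins; not the real Accumulo classes.

/** Proposed entry point. The name is from this thread; the methods are assumptions. */
interface TestAccumuloCluster {
  void start();

  void stop();

  String getInstanceName();
}

/**
 * Existing concrete class, kept so that current callers still compile.
 *
 * @deprecated new code should be written against {@link TestAccumuloCluster}.
 */
@Deprecated
class MiniAccumuloCluster implements TestAccumuloCluster {
  private final String instanceName;
  private boolean running;

  MiniAccumuloCluster(String instanceName) {
    this.instanceName = instanceName;
  }

  @Override
  public void start() {
    running = true; // a real implementation would launch processes here
  }

  @Override
  public void stop() {
    running = false;
  }

  @Override
  public String getInstanceName() {
    return instanceName;
  }

  boolean isRunning() {
    return running;
  }
}

class DeprecationSketch {
  /** Caller code depends only on the interface. */
  static String describe(TestAccumuloCluster cluster) {
    return "cluster:" + cluster.getInstanceName();
  }

  public static void main(String[] args) {
    // javac can flag this use of the deprecated class (e.g. with -Xlint:deprecation).
    TestAccumuloCluster cluster = new MiniAccumuloCluster("demo");
    cluster.start();
    System.out.println(describe(cluster));
    cluster.stop();
  }
}
```

With the annotation and the javadoc tag in place, both the compiler and
the generated docs point a newcomer at the interface. Without them, the
two types look equally valid.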

Once you get down in the weeds of working on this, do you think you might
end up wanting something very different?



> On Mar 28, 2014 10:02 AM, "Christopher" <ct...@apache.org> wrote:
>
> > I don't think any of this should be done for 1.6.0, but I like the
> > idea of creating a separate cluster interface for testing. I think it
> > should be integrated into the accumulo-maven-plugin, also. I think the
> > idea should be hammered out, and tested as a separate thing, to
> > experiment with the options, and provided as a complete feature for
> > the next major release. If it would change packaging dependencies, it
> > shouldn't even be done for 1.6.x bugfix releases.
> >
> > --
> > Christopher L Tubbs II
> > http://gravatar.com/ctubbsii
> >
> >
> > On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser <jo...@gmail.com> wrote:
> > > Oh, I like that idea, Bill & Sean.
> > >
> > > Package: org.apache.accumulo.cluster
> > > Public API: org.apache.accumulo.cluster.AccumuloCluster
> > > MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements
> > > AccumuloCluster, allows for backwards compat)
> > > Yarn: org.apache.accumulo.cluster.yarn
> > > Docker: ...
> > > Mesos: ...
> > >
> > > etc etc etc.
> > >
> > > One question in my mind, do we keep the maven module
> > > 'accumulo-minicluster'? I would imagine that if we struck the 'mini'
> > > portion from 1.6 that would create some confusion. Would it be worth
> > > the indirection to rename accumulo-minicluster to accumulo-cluster and
> > > then create a new accumulo-minicluster module that depends on
> > > accumulo-cluster (but contains no code itself) to preserve the 1.4 and
> > > 1.5 poms to generally work with a version bump? I'm not sure if Maven
> > > would be happy with that or do what I think it "should".
> > >
> > >
> > > On 3/28/14, 6:26 AM, Bill Havanki wrote:
> > >>
> > >> I've been watching the conversation on the side, but I wanted to
> > >> mention that it seems the focus isn't so much on "mini" clusters
> > >> anymore. You're thinking of programmatic cluster management, whether
> > >> one node or many. The idea of a basic cluster management interface,
> > >> with MAC as an implementation, is promising. A package name of just
> > >> "cluster" could work.
> > >>
> > >> Carry on :)
> > >>
> > >> Bill H
> > >>
> > >>
> > >> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <bu...@cloudera.com>
> > >> wrote:
> > >>
> > >>> If you decide to go the mapred/mapreduce way, you could go with the
> > >>> package name "mini".
> > >>>
> > >>> alternatively, we can do a multi-stage change out
> > >>>
> > >>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
> > >>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
> > >>>
> > >>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that
> > >>> extends TestAccumuloCluster, @deprecate TestAccumuloCluster
> > >>>
> > >>> 3) 1.6 + 2 major: remove TestAccumuloCluster
> > >>>
> > >>> Or just go with TestAccumuloCluster as the interface, have
> > >>> MiniAccumuloCluster as the local pseudo distributed implementation,
> > >>> and then call your new one something like YarnAccumuloCluster.
> > >>>
> > >>> In that case we could use the deprecation cycle to move the MAC class
> > >>> out of the public api.
> > >>>
> > >>>
> > >>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <jo...@gmail.com> wrote:
> > >>>
> > >>>> Thoughts on if this would be an acceptable change for 1.6.0 to
> > >>>> alleviate future cruft?
> > >>>>
> > >>>> Suggestions on the new package and/or class name would be greatly
> > >>>> appreciated over "NewMiniAccumuloC*".
> > >>>>
> > >>>>
> > >>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
> > >>>>
> > >>>>> Those who are interested: check out
> > >>>>> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
> > >>>>>
> > >>>>>
> > >>>>> tl;dr I could create some real interfaces for the cluster and
> > >>>>> config, which are "hidden" under the covers by the 1.4 and 1.5
> > >>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples
> > >>>>> the default implementation, gives us the ability to hide
> > >>>>> "implementation details" if wanted, and moves us towards some
> > >>>>> factory methods instead of calling a class directly.
> > >>>>>
> > >>>>> Thoughts?
> > >>>>>
> > >>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
> > >>>>>
> > >>>>>> Yes, very much experimental at this point.
> > >>>>>>
> > >>>>>> What I'm most concerned about is having reasonable hooks up
> > >>>>>> front, not trying to make an implementation for inclusion 1.6.0.
> > >>>>>>
> > >>>>>> Regarding additions, the implementations already contains most
> > >>>>>> things I would want to expose. I haven't come up with anything
> > >>>>>> that would be generally returned through the "API" rather than
> > >>>>>> through this proposed implementation (e.g. YARN connection
> > >>>>>> information)
> > >>>>>>
> > >>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
> > >>>>>>
> > >>>>>>> What you are trying to do sounds interesting.  It also sounds
> > >>>>>>> experimental and in the early stages.   Is there anything
> > >>>>>>> specific you think should be done for 1.6.0 w/ regards to MAC
> > >>>>>>> API?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser
> > >>>>>>> <josh.elser@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
> > >>>>>>>>
> > >>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser
> > >>>>>>>>> <josh.elser@gmail.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Can you give an example of what you are thinking of? I
> > >>>>>>>>>>> don't understand you viewpoint either
> > >>>>>>>>>>
> > >>>>>>>>>> Sure. One limitation of MAC, in general as a testing
> > >>>>>>>>>> harness, is that it doesn't adequately exercise multi-node
> > >>>>>>>>>> implementations. You can run multiple tservers, but they
> > >>>>>>>>>> are all on the same host which limits the validity of a
> > >>>>>>>>>> "robust" test. This is my immediate goal.
> > >>>>>>>>>>
> > >>>>>>>>>> Multi-node deployments are capable using something like
> > >>>>>>>>>> Mesos or Yarn. Given that there is already functioning
> > >>>>>>>>>> support to deploy Accumulo on Yarn, this was my goal.
> > >>>>>>>>>>
> > >>>>>>>>>> My goal is to be able to have the ability to run all of our
> > >>>>>>>>>> AbstractMacIT implementations against "real" hardware
> > >>>>>>>>>> without changing a single line of test code (ok - maybe a
> > >>>>>>>>>> line or two to do injection of the MAC implementation). The
> > >>>>>>>>>> point is, I believe there could be a huge testing gain from
> > >>>>>>>>>> being able to write tests which leverage yarn, have the
> > >>>>>>>>>> same programmatic configuration API from MAC, and provide
> > >>>>>>>>>> near "real" Accumulo semantics.
> > >>>>>>>>>
> > >>>>>>>>> Ok so you want to MAC to be an interface so that you can
> > >>>>>>>>> provide a completely different implementation?
> > >>>>>>>>
> > >>>>>>>> Correct. Some things would serve well in a common abstract base
> > >>>>>>>> (e.g. numTservers, siteXml configuration), but all the nonsense
> > >>>>>>>> about creating directory structures and managing Processes is
> > >>>>>>>> implementation specific.
> > >>>>>>>>
> > >>>>>>>> Perhaps I could create a new interface that the current
> > >>>>>>>> implementation implements which still provides the same
> > >>>>>>>> semantics from 1.4 and 1.5. Let me see if I can mock up what
> > >>>>>>>> I'm thinking -- that will probably be easier than me trying to
> > >>>>>>>> write it out.
> > >>>>>>>>
> > >>>>>>>
> > >>>
> > >>
> > >>
> > >>
> > >
> >
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
I'll have something integrated into 1.6.0 this weekend. Something concrete
may help to firm up everyone's opinion.
On Mar 28, 2014 5:54 PM, "Christopher" <ct...@apache.org> wrote:

> If it's marked Experimental in the javadocs, I think it may be fine.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Fri, Mar 28, 2014 at 4:50 PM, Sean Busbey <bu...@cloudera.com>
> wrote:
> > If the new interface is not in the public API, then I think adding it
> > (without deprecating MAC) is fine.
> >
> > That way it can evolve if needed and we can add it to the public API on a
> > later release.
> >
> >
> > On Fri, Mar 28, 2014 at 3:39 PM, Christopher <ct...@apache.org>
> wrote:
> >
> >> But... without more time to fully develop the requirements for the
> >> interface, with a few implementations, it's probably going to change
> >> anyway. I think even adding the interface could complicate the
> >> follow-on work. But... *shrug*.... maybe you can have guarantees that
> >> the interface will stay as is (same package, same methods, same name,
> >> etc.)?
> >>
> >> --
> >> Christopher L Tubbs II
> >> http://gravatar.com/ctubbsii
> >>
> >>
> >> On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <jo...@gmail.com>
> wrote:
> >> > Not even the addition of a new interface, Christopher? I'd very much
> like
> >> > to have an interface that we can get in 1.6.0 at a minimum. I wouldn't
> >> even
> >> > push for any deprecation of what's currently in place.
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Christopher <ct...@apache.org>.
If it's marked Experimental in the javadocs, I think it may be fine.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Mar 28, 2014 at 4:50 PM, Sean Busbey <bu...@cloudera.com> wrote:
> If the new interface is not in the public API, then I think adding it
> (without deprecating MAC) is fine.
>
> That way it can evolve if needed and we can add it to the public API on a
> later release.
>
>
> On Fri, Mar 28, 2014 at 3:39 PM, Christopher <ct...@apache.org> wrote:
>
>> But... without more time to fully develop the requirements for the
>> interface, with a few implementations, it's probably going to change
>> anyway. I think even adding the interface could complicate the
>> follow-on work. But... *shrug*.... maybe you can have guarantees that
>> the interface will stay as is (same package, same methods, same name,
>> etc.)?
>>
>> --
>> Christopher L Tubbs II
>> http://gravatar.com/ctubbsii
>>
>>
>> On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <jo...@gmail.com> wrote:
>> > Not even the addition of a new interface, Christopher? I'd very much like
>> > to have an interface that we can get in 1.6.0 at a minimum. I wouldn't
>> even
>> > push for any deprecation of what's currently in place.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
If the new interface is not in the public API, then I think adding it
(without deprecating MAC) is fine.

That way it can evolve if needed and we can add it to the public API on a
later release.
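
As a rough illustration of that kind of non-public interface, here is a
minimal sketch. It is only an assumption of what the shape might be:
AccumuloCluster is the name floated earlier in this thread, while the
methods, MiniClusterSketch, YarnClusterSketch, and ClusterInterfaceSketch
are hypothetical and do no real work.

```java
/**
 * Experimental: not part of the public API and may change in a later release.
 * The name is from this thread; the methods are assumptions for illustration.
 */
interface AccumuloCluster {
  void start();

  void stop();

  int getNumTservers();
}

/** Stand-in for the existing local, single-host implementation. */
class MiniClusterSketch implements AccumuloCluster {
  private final int numTservers;
  private boolean running;

  MiniClusterSketch(int numTservers) {
    this.numTservers = numTservers;
  }

  @Override
  public void start() {
    running = true; // the real class would create directories and spawn processes
  }

  @Override
  public void stop() {
    running = false;
  }

  @Override
  public int getNumTservers() {
    return numTservers;
  }

  boolean isRunning() {
    return running;
  }
}

/** Stand-in for a possible multi-node implementation (e.g. on YARN). */
class YarnClusterSketch implements AccumuloCluster {
  private final int numTservers;

  YarnClusterSketch(int numTservers) {
    this.numTservers = numTservers;
  }

  @Override
  public void start() {
    // placeholder: a real implementation would request containers here
  }

  @Override
  public void stop() {
    // placeholder: a real implementation would release containers here
  }

  @Override
  public int getNumTservers() {
    return numTservers;
  }
}

class ClusterInterfaceSketch {
  /** Test logic written once against the interface, whichever implementation is injected. */
  static int startAndCountTservers(AccumuloCluster cluster) {
    cluster.start();
    try {
      return cluster.getNumTservers();
    } finally {
      cluster.stop(); // always shut the cluster down, even if the check throws
    }
  }

  public static void main(String[] args) {
    System.out.println(startAndCountTservers(new MiniClusterSketch(2)));
    System.out.println(startAndCountTservers(new YarnClusterSketch(5)));
  }
}
```

Because callers only see the interface, its methods can still be
reshaped before it is promoted to the public API, and the existing MAC
class does not have to be deprecated in the meantime.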


On Fri, Mar 28, 2014 at 3:39 PM, Christopher <ct...@apache.org> wrote:

> But... without more time to fully develop the requirements for the
> interface, with a few implementations, it's probably going to change
> anyway. I think even adding the interface could complicate the
> follow-on work. But... *shrug*.... maybe you can have guarantees that
> the interface will stay as is (same package, same methods, same name,
> etc.)?
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Christopher <ct...@apache.org>.
But... without more time to fully develop the requirements for the
interface, with a few implementations, it's probably going to change
anyway. I think even adding the interface could complicate the
follow-on work. But... *shrug*.... maybe you can have guarantees that
the interface will stay as is (same package, same methods, same name,
etc.)?

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Mar 28, 2014 at 3:14 PM, Josh Elser <jo...@gmail.com> wrote:
> Not even the addition of a new interface, Christopher? I'd very much like
> to have an interface that we can get in 1.6.0 at a minimum. I wouldn't even
> push for any deprecation of what's currently in place.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Not even the addition of a new interface, Christopher? I'd very much like
to have an interface that we can get in 1.6.0 at a minimum. I wouldn't even
push for any deprecation of what's currently in place.
On Mar 28, 2014 10:02 AM, "Christopher" <ct...@apache.org> wrote:

> I don't think any of this should be done for 1.6.0, but I like the
> idea of creating a separate cluster interface for testing. I think it
> should be integrated into the accumulo-maven-plugin, also. I think the
> idea should be hammered out, and tested as a separate thing, to
> experiment with the options, and provided as a complete feature for
> the next major release. If it would change packaging dependencies, it
> shouldn't even be done for 1.6.x bugfix releases.
>
> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Christopher <ct...@apache.org>.
I don't think any of this should be done for 1.6.0, but I like the
idea of creating a separate cluster interface for testing. I think it
should be integrated into the accumulo-maven-plugin, also. I think the
idea should be hammered out, and tested as a separate thing, to
experiment with the options, and provided as a complete feature for
the next major release. If it would change packaging dependencies, it
shouldn't even be done for 1.6.x bugfix releases.

--
Christopher L Tubbs II
http://gravatar.com/ctubbsii


On Fri, Mar 28, 2014 at 12:24 PM, Josh Elser <jo...@gmail.com> wrote:
> Oh, I like that idea, Bill & Sean.
>
> Package: org.apache.accumulo.cluster
> Public API: org.apache.accumulo.cluster.AccumuloCluster
> MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements
> AccumuloCluster, allows for backwards compat)
> Yarn: org.apache.accumulo.cluster.yarn
> Docker: ...
> Mesos: ...
>
> etc etc etc.
>
> One question in my mind, do we keep the maven module 'accumulo-minicluster'?
> I would imagine that if we struck the 'mini' portion from 1.6 that would
> create some confusion. Would it be worth the indirection to rename
> accumulo-minicluster to accumulo-cluster and then create a new
> accumulo-minicluster module that depends on accumulo-cluster (but
> contains no code itself) to preserve the 1.4 and 1.5 poms to generally work
> with a version bump? I'm not sure if Maven would be happy with that or do
> what I think it "should".
>
>
> On 3/28/14, 6:26 AM, Bill Havanki wrote:
>>
>> I've been watching the conversation on the side, but I wanted to mention
>> that it seems the focus isn't so much on "mini" clusters anymore. You're
>> thinking of programmatic cluster management, whether one node or many. The
>> idea of a basic cluster management interface, with MAC as an
>> implementation, is promising. A package name of just "cluster" could work.
>>
>> Carry on :)
>>
>> Bill H
>>
>>
>> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <bu...@cloudera.com> wrote:
>>
>>> If you decide to go the mapred/mapreduce way, you could go with the
>>> package name "mini".
>>>
>>> alternatively, we can do a multi-stage change out
>>>
>>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
>>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>>>
>>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
>>> TestAccumuloCluster, @deprecate TestAccumuloCluster
>>>
>>> 3) 1.6 + 2 major: remove TestAccumuloCluster
>>>
>>> Or just go with TestAccumuloCluster as the interface, have
>>> MiniAccumuloCluster as the local pseudo distributed implementation, and
>>> then call your new one something like YarnAccumuloCluster.
>>>
>>> In that case we could use the deprecation cycle to move the MAC class out
>>> of the public api.
>>>
>>>
>>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <jo...@gmail.com> wrote:
>>>
>>>> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
>>>> future cruft?
>>>>
>>>> Suggestions on the new package and/or class name would be greatly
>>>> appreciated over "NewMiniAccumuloC*".
>>>>
>>>>
>>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
>>>>
>>>>> Those who are interested: check out
>>>>> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>>>>
>>>>>
>>>>> tl;dr I could create some real interfaces for the cluster and config,
>>>>> which are "hidden" under the covers by the 1.4 and 1.5
>>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>>>>> default implementation, gives us the ability to hide "implementation
>>>>> details" if wanted, and moves us towards some factory methods instead
>>>>> of calling a class directly.
>>>>>
>>>>> Thoughts?
>>>>>
>>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>>>>
>>>>>> Yes, very much experimental at this point.
>>>>>>
>>>>>> What I'm most concerned about is having reasonable hooks up front, not
>>>>>> trying to make an implementation for inclusion in 1.6.0.
>>>>>>
>>>>>> Regarding additions, the implementation already contains most things
>>>>>> I would want to expose. I haven't come up with anything that would be
>>>>>> generally returned through the "API" rather than through this proposed
>>>>>> implementation (e.g. YARN connection information)
>>>>>>
>>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>>>>
>>>>>>> What you are trying to do sounds interesting. It also sounds
>>>>>>> experimental and in the early stages. Is there anything specific
>>>>>>> you think should be done for 1.6.0 w/ regards to MAC API?
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>>>>
>>>>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>>>>> understand your viewpoint either
>>>>>>>>>>
>>>>>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>>>>>> that it doesn't adequately exercise multi-node implementations.
>>>>>>>>>> You can run multiple tservers, but they are all on the same host
>>>>>>>>>> which limits the validity of a "robust" test. This is my
>>>>>>>>>> immediate goal.
>>>>>>>>>>
>>>>>>>>>> Multi-node deployments are possible using something like Mesos or
>>>>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>>>>> Accumulo on Yarn, this was my goal.
>>>>>>>>>>
>>>>>>>>>> My goal is to be able to run all of our AbstractMacIT
>>>>>>>>>> implementations against "real" hardware without changing a single
>>>>>>>>>> line of test code (ok - maybe a line or two to do injection of the
>>>>>>>>>> MAC implementation). The point is, I believe there could be a huge
>>>>>>>>>> testing gain from being able to write tests which leverage yarn,
>>>>>>>>>> have the same programmatic configuration API from MAC, and provide
>>>>>>>>>> near "real" Accumulo semantics.
>>>>>>>>>
>>>>>>>>> Ok so you want MAC to be an interface so that you can provide a
>>>>>>>>> completely different implementation?
>>>>>>>>
>>>>>>>> Correct. Some things would serve well in a common abstract base (e.g.
>>>>>>>> numTservers, siteXml configuration), but all the nonsense about
>>>>>>>> creating directory structures and managing Processes is
>>>>>>>> implementation specific.
>>>>>>>>
>>>>>>>> Perhaps I could create a new interface that the current
>>>>>>>> implementation implements which still provides the same semantics
>>>>>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
>>>>>>>> that will probably be easier than me trying to write it out.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>
>>
>>
>>
>
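
For readers following the quoted proposals above, here is a minimal Java
sketch of stage 1 of Sean's staged change-out, combined with Josh's goal of
swapping the cluster implementation by changing only "a line or two" of test
code. Everything in it is an assumption made for illustration: the method set
on TestAccumuloCluster, the constructors, and the in-memory stubs are
hypothetical and are not the real Accumulo classes. In particular the
YarnAccumuloCluster here starts nothing; it only shows where a second
implementation would plug in.

```java
import java.util.function.Supplier;

// Hypothetical stage-1 interface; the method set is an assumption.
interface TestAccumuloCluster {
  void start();
  void stop();
  String getInstanceName();
}

/**
 * Stand-in for the existing concrete class (not the real implementation).
 *
 * @deprecated stage 1: keeps working, but callers should depend on
 *     TestAccumuloCluster instead.
 */
@Deprecated
class MiniAccumuloCluster implements TestAccumuloCluster {
  private final String instanceName;
  private boolean running;

  MiniAccumuloCluster(String instanceName) {
    this.instanceName = instanceName;
  }

  @Override public void start() { running = true; }
  @Override public void stop() { running = false; }
  @Override public String getInstanceName() { return instanceName; }
  boolean isRunning() { return running; }
}

// Placeholder for a second implementation; it does not talk to YARN.
class YarnAccumuloCluster implements TestAccumuloCluster {
  private final String instanceName;

  YarnAccumuloCluster(String instanceName) {
    this.instanceName = instanceName;
  }

  @Override public void start() { /* would submit a YARN application */ }
  @Override public void stop() { /* would tear the application down */ }
  @Override public String getInstanceName() { return instanceName; }
}

class ClusterSmokeTest {
  // The test body depends only on the interface. The supplier is the
  // single injection point that selects an implementation.
  static String run(Supplier<TestAccumuloCluster> factory) {
    TestAccumuloCluster cluster = factory.get();
    cluster.start();
    try {
      return cluster.getInstanceName();
    } finally {
      cluster.stop();
    }
  }

  @SuppressWarnings("deprecation")
  public static void main(String[] args) {
    System.out.println(run(() -> new MiniAccumuloCluster("mini")));
    System.out.println(run(() -> new YarnAccumuloCluster("yarn")));
  }
}
```

Existing callers that construct MiniAccumuloCluster directly keep compiling in
this sketch and only see a deprecation warning, which is the point of doing
the change in stages.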

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Oh, I like that idea, Bill & Sean.

Package: org.apache.accumulo.cluster
Public API: org.apache.accumulo.cluster.AccumuloCluster
MAC: org.apache.accumulo.cluster.mini.MiniAccumuloCluster (implements 
AccumuloCluster, allows for backwards compat)
Yarn: org.apache.accumulo.cluster.yarn
Docker: ...
Mesos: ...

etc etc etc.

One question in my mind: do we keep the Maven module
'accumulo-minicluster'? I would imagine that if we struck the 'mini'
portion from 1.6, that would create some confusion. Would it be worth the
indirection to rename accumulo-minicluster to accumulo-cluster and then
create a new accumulo-minicluster module that depends on
accumulo-cluster (but contains no code itself), so that 1.4 and 1.5 poms
generally keep working with just a version bump? I'm not sure if
Maven would be happy with that or do what I think it "should".
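
The layout above could look roughly like the following in Java. This is
only a sketch of the idea, not code from any commit in this thread: the
method names (start, stop, isRunning, getInstanceName) and the in-memory
stand-in bodies are assumptions, and all types share one file purely so
the example compiles on its own. In the proposal they would sit in
org.apache.accumulo.cluster, .cluster.mini and .cluster.yarn.

```java
// Hypothetical sketch of the proposed layout. Nothing here starts real
// processes; the method set is an assumption made for illustration.

/** Would live in org.apache.accumulo.cluster: the public API. */
interface AccumuloCluster {
  void start();
  void stop();
  boolean isRunning();
  String getInstanceName();
}

/** Would live in org.apache.accumulo.cluster.mini: the existing local MAC. */
class MiniAccumuloCluster implements AccumuloCluster {
  private final String instanceName;
  private boolean running;

  MiniAccumuloCluster(String instanceName) {
    this.instanceName = instanceName;
  }

  @Override public void start() { running = true; }  // a real MAC would fork local processes
  @Override public void stop() { running = false; }
  @Override public boolean isRunning() { return running; }
  @Override public String getInstanceName() { return instanceName; }
}

/** Would live in org.apache.accumulo.cluster.yarn: a future multi-node variant. */
class YarnAccumuloCluster implements AccumuloCluster {
  private final String instanceName;
  private boolean running;

  YarnAccumuloCluster(String instanceName) {
    this.instanceName = instanceName;
  }

  @Override public void start() { running = true; }  // would submit a YARN application instead
  @Override public void stop() { running = false; }
  @Override public boolean isRunning() { return running; }
  @Override public String getInstanceName() { return instanceName; }
}
```

Code written against AccumuloCluster would not need to know which of the
two classes it was handed, which is the point of splitting the API from
the "mini" implementation.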

On 3/28/14, 6:26 AM, Bill Havanki wrote:
> I've been watching the conversation on the side, but I wanted to mention
> that it seems the focus isn't so much on "mini" clusters anymore. You're
> thinking of programmatic cluster management, whether one node or many. The
> idea of a basic cluster management interface, with MAC as an
> implementation, is promising. A package name of just "cluster" could work.
>
> Carry on :)
>
> Bill H
>
>
> On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <bu...@cloudera.com>wrote:
>
>> If you decide to go the mapred/mapreduce way, you could go with the package
>> name "mini".
>>
>> alternatively, we can do a multi-stage change out
>>
>> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
>> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>>
>> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
>> TestAccumuloCluster, @deprecate TestAccumuloCluster
>>
>> 3) 1.6 + 2 major: remove TestAccumuloCluster
>>
>> Or just go with TestAccumuloCluster as the interface, have
>> MiniAccumuloCluster as the local pseudo distributed implementation, and
>> then call your new one something like YarnAccumuloCluster.
>>
>> In that case we could use the deprecation cycle to move the MAC class out
>> of the public api.
>>
>>
>> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <jo...@gmail.com> wrote:
>>
>>> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
>>> future cruft?
>>>
>>> Suggestions on the new package and/or class name would be greatly
>>> appreciated over "NewMiniAccumuloC*".
>>>
>>>
>>> On 3/26/14, 3:37 PM, Josh Elser wrote:
>>>
>>>> Those who are interested: check out
>>>> https://github.com/joshelser/accumulo/commit/
>>>> 9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>>>
>>>>
>>>> tl;dr I could create some real interfaces for the cluster and config,
>>>> which are "hidden" under the covers by the 1.4 and 1.5
>>>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>>>> default implementation, gives us the ability to hide "implementation
>>>> details" if wanted, and moves us towards some factory methods instead of
>>>> calling a class directly.
>>>>
>>>> Thoughts?
>>>>
>>>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>>>
>>>>> Yes, very much experimental at this point.
>>>>>
>>>>> What I'm most concerned about is having reasonable hooks up front, not
>>>>> trying to make an implementation for inclusion 1.6.0.
>>>>>
>>>>> Regarding additions, the implementations already contains most things I
>>>>> would want to expose. I haven't come up with anything that would be
>>>>> generally returned through the "API" rather than through this proposed
>>>>> implementation (e.g. YARN connection information)
>>>>>
>>>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>>>
>>>>>> What you are trying to do sounds interesting.  It also sounds
>>>>>> experimental
>>>>>> and in the early stages.   Is there anything specific you think
>>>>>> should be
>>>>>> done for 1.6.0 w/ regards to MAC API?
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>>>
>>>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>>>
>>>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>>>> understand you viewpoint either
>>>>>>>>>>
>>>>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>>>>> that it doesn't adequately exercise multi-node implementations.
>>>>>>>>> You can run multiple tservers, but they are all on the same host
>>>>>>>>> which limits the validity of a "robust" test. This is my immediate
>>>>>>>>> goal.
>>>>>>>>>
>>>>>>>>> Multi-node deployments are capable using something like Mesos or
>>>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>>>> Accumulo on Yarn, this was my goal.
>>>>>>>>>
>>>>>>>>> My goal is to be able to have the ability to run all of our
>>>>>>>>> AbstractMacIT implementations against "real" hardware without
>>>>>>>>> changing a single line of test code (ok - maybe a line or two to
>>>>>>>>> do injection of the MAC implementation). The point is, I believe
>>>>>>>>> there could be a huge testing gain from being able to write tests
>>>>>>>>> which leverage yarn, have the same programmatic configuration API
>>>>>>>>> from MAC, and provide near "real" Accumulo semantics.
>>>>>>>>>
>>>>>>>> Ok so you want to MAC to be an interface so that you can provide a
>>>>>>>> completely different implementation?
>>>>>>>>
>>>>>>> Correct. Some things would serve well in a common abstract base
>>>>>>> (e.g. numTservers, siteXml configuration), but all the nonsense
>>>>>>> about creating directory structures and managing Processes is
>>>>>>> implementation specific.
>>>>>>>
>>>>>>> Perhaps I could create a new interface that the current
>>>>>>> implementation implements which still provides the same semantics
>>>>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
>>>>>>> that will probably be easier than me trying to write it out.
>>>>>>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Bill Havanki <bh...@clouderagovt.com>.
I've been watching the conversation on the side, but I wanted to mention
that it seems the focus isn't so much on "mini" clusters anymore. You're
thinking of programmatic cluster management, whether one node or many. The
idea of a basic cluster management interface, with MAC as an
implementation, is promising. A package name of just "cluster" could work.

Carry on :)

Bill H


On Fri, Mar 28, 2014 at 12:39 AM, Sean Busbey <bu...@cloudera.com>wrote:

> If you decide to go the mapred/mapreduce way, you could go with the package
> name "mini".
>
> alternatively, we can do a multi-stage change out
>
> 1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
> MiniAccumuloCluster class and make it implement TestAccumuloCluster
>
> 2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
> TestAccumuloCluster, @deprecate TestAccumuloCluster
>
> 3) 1.6 + 2 major: remove TestAccumuloCluster
>
> Or just go with TestAccumuloCluster as the interface, have
> MiniAccumuloCluster as the local pseudo distributed implementation, and
> then call your new one something like YarnAccumuloCluster.
>
> In that case we could use the deprecation cycle to move the MAC class out
> of the public api.
>
>
> On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <jo...@gmail.com> wrote:
>
> > Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
> > future cruft?
> >
> > Suggestions on the new package and/or class name would be greatly
> > appreciated over "NewMiniAccumuloC*".
> >
> >
> > On 3/26/14, 3:37 PM, Josh Elser wrote:
> >
> >> Those who are interested: check out
> >> https://github.com/joshelser/accumulo/commit/
> >> 9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
> >>
> >>
> >> tl;dr I could create some real interfaces for the cluster and config,
> >> which are "hidden" under the covers by the 1.4 and 1.5
> >> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
> >> default implementation, gives us the ability to hide "implementation
> >> details" if wanted, and moves us towards some factory methods instead of
> >> calling a class directly.
> >>
> >> Thoughts?
> >>
> >> On 3/26/14, 1:21 PM, Josh Elser wrote:
> >>
> >>> Yes, very much experimental at this point.
> >>>
> >>> What I'm most concerned about is having reasonable hooks up front, not
> >>> trying to make an implementation for inclusion 1.6.0.
> >>>
> >>> Regarding additions, the implementations already contains most things I
> >>> would want to expose. I haven't come up with anything that would be
> >>> generally returned through the "API" rather than through this proposed
> >>> implementation (e.g. YARN connection information)
> >>>
> >>> On 3/26/14, 11:57 AM, Keith Turner wrote:
> >>>
> >>>> What you are trying to do sounds interesting.  It also sounds
> >>>> experimental
> >>>> and in the early stages.   Is there anything specific you think
> >>>> should be
> >>>> done for 1.6.0 w/ regards to MAC API?
> >>>>
> >>>>
> >>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
> >>>>>
> >>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
> >>>>>>>
> >>>>>>>> Can you give an example of what you are thinking of? I don't
> >>>>>>>> understand you viewpoint either
> >>>>>>>>
> >>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
> >>>>>>> that it doesn't adequately exercise multi-node implementations.
> >>>>>>> You can run multiple tservers, but they are all on the same host
> >>>>>>> which limits the validity of a "robust" test. This is my immediate
> >>>>>>> goal.
> >>>>>>>
> >>>>>>> Multi-node deployments are capable using something like Mesos or
> >>>>>>> Yarn. Given that there is already functioning support to deploy
> >>>>>>> Accumulo on Yarn, this was my goal.
> >>>>>>>
> >>>>>>> My goal is to be able to have the ability to run all of our
> >>>>>>> AbstractMacIT implementations against "real" hardware without
> >>>>>>> changing a single line of test code (ok - maybe a line or two to
> >>>>>>> do injection of the MAC implementation). The point is, I believe
> >>>>>>> there could be a huge testing gain from being able to write tests
> >>>>>>> which leverage yarn, have the same programmatic configuration API
> >>>>>>> from MAC, and provide near "real" Accumulo semantics.
> >>>>>>>
> >>>>>> Ok so you want to MAC to be an interface so that you can provide a
> >>>>>> completely different implementation?
> >>>>>>
> >>>>> Correct. Some things would serve well in a common abstract base
> >>>>> (e.g. numTservers, siteXml configuration), but all the nonsense
> >>>>> about creating directory structures and managing Processes is
> >>>>> implementation specific.
> >>>>>
> >>>>> Perhaps I could create a new interface that the current
> >>>>> implementation implements which still provides the same semantics
> >>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
> >>>>> that will probably be easier than me trying to write it out.
> >>>>>



-- 
// Bill Havanki
// Solutions Architect, Cloudera Govt Solutions
// 443.686.9283

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Sean Busbey <bu...@cloudera.com>.
If you decide to go the mapred/mapreduce way, you could go with the package
name "mini".

alternatively, we can do a multi-stage change out

1)  1.6.x:  introduce TestAccumuloCluster interface, @deprecate
MiniAccumuloCluster class and make it implement TestAccumuloCluster

2) 1.6 + major: change MiniAccumuloCluster to an interface that extends
TestAccumuloCluster, @deprecate TestAccumuloCluster

3) 1.6 + 2 major: remove TestAccumuloCluster

Or just go with TestAccumuloCluster as the interface, have
MiniAccumuloCluster as the local pseudo distributed implementation, and
then call your new one something like YarnAccumuloCluster.

In that case we could use the deprecation cycle to move the MAC class out
of the public api.
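
To make stage 1 of the multi-stage plan concrete, here is a minimal
sketch. It assumes a tiny, made-up method set (start, stop, isRunning)
and an in-memory body; it is not the real MiniAccumuloCluster, only the
shape of "introduce the interface, deprecate the class, make the class
implement the interface".

```java
// Hypothetical sketch of stage 1 only. Method names are illustrative.

/** New in 1.6.x under this plan: the interface tests would code against. */
interface TestAccumuloCluster {
  void start();
  void stop();
  boolean isRunning();
}

/**
 * The existing class, kept source-compatible but deprecated in stage 1.
 * Under the plan, stage 2 would turn this name into an interface that
 * extends TestAccumuloCluster, and stage 3 would remove TestAccumuloCluster.
 */
@Deprecated
class MiniAccumuloCluster implements TestAccumuloCluster {
  private boolean running;

  @Override public void start() { running = true; }   // stand-in for launching local processes
  @Override public void stop() { running = false; }
  @Override public boolean isRunning() { return running; }
}
```

Existing callers that construct MiniAccumuloCluster directly would keep
compiling (with a deprecation warning), while new code can already
declare its variables as TestAccumuloCluster.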


On Thu, Mar 27, 2014 at 6:48 PM, Josh Elser <jo...@gmail.com> wrote:

> Thoughts on if this would be an acceptable change for 1.6.0 to alleviate
> future cruft?
>
> Suggestions on the new package and/or class name would be greatly
> appreciated over "NewMiniAccumuloC*".
>
>
> On 3/26/14, 3:37 PM, Josh Elser wrote:
>
>> Those who are interested: check out
>> https://github.com/joshelser/accumulo/commit/
>> 9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>>
>>
>> tl;dr I could create some real interfaces for the cluster and config,
>> which are "hidden" under the covers by the 1.4 and 1.5
>> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
>> default implementation, gives us the ability to hide "implementation
>> details" if wanted, and moves us towards some factory methods instead of
>> calling a class directly.
>>
>> Thoughts?
>>
>> On 3/26/14, 1:21 PM, Josh Elser wrote:
>>
>>> Yes, very much experimental at this point.
>>>
>>> What I'm most concerned about is having reasonable hooks up front, not
>>> trying to make an implementation for inclusion 1.6.0.
>>>
>>> Regarding additions, the implementations already contains most things I
>>> would want to expose. I haven't come up with anything that would be
>>> generally returned through the "API" rather than through this proposed
>>> implementation (e.g. YARN connection information)
>>>
>>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>>
>>>> What you are trying to do sounds interesting.  It also sounds
>>>> experimental
>>>> and in the early stages.   Is there anything specific you think
>>>> should be
>>>> done for 1.6.0 w/ regards to MAC API?
>>>>
>>>>
>>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com>
>>>> wrote:
>>>>
>>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>>
>>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>>
>>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>>> understand you viewpoint either
>>>>>>>>
>>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>>> that it doesn't adequately exercise multi-node implementations.
>>>>>>> You can run multiple tservers, but they are all on the same host
>>>>>>> which limits the validity of a "robust" test. This is my immediate
>>>>>>> goal.
>>>>>>>
>>>>>>> Multi-node deployments are capable using something like Mesos or
>>>>>>> Yarn. Given that there is already functioning support to deploy
>>>>>>> Accumulo on Yarn, this was my goal.
>>>>>>>
>>>>>>> My goal is to be able to have the ability to run all of our
>>>>>>> AbstractMacIT implementations against "real" hardware without
>>>>>>> changing a single line of test code (ok - maybe a line or two to
>>>>>>> do injection of the MAC implementation). The point is, I believe
>>>>>>> there could be a huge testing gain from being able to write tests
>>>>>>> which leverage yarn, have the same programmatic configuration API
>>>>>>> from MAC, and provide near "real" Accumulo semantics.
>>>>>>>
>>>>>> Ok so you want to MAC to be an interface so that you can provide a
>>>>>> completely different implementation?
>>>>>>
>>>>> Correct. Some things would serve well in a common abstract base
>>>>> (e.g. numTservers, siteXml configuration), but all the nonsense
>>>>> about creating directory structures and managing Processes is
>>>>> implementation specific.
>>>>>
>>>>> Perhaps I could create a new interface that the current
>>>>> implementation implements which still provides the same semantics
>>>>> from 1.4 and 1.5. Let me see if I can mock up what I'm thinking --
>>>>> that will probably be easier than me trying to write it out.
>>>>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Thoughts on whether this would be an acceptable change for 1.6.0 to alleviate
future cruft?

Suggestions on the new package and/or class name would be greatly 
appreciated over "NewMiniAccumuloC*".

On 3/26/14, 3:37 PM, Josh Elser wrote:
> Those who are interested: check out
> https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8
>
>
> tl;dr I could create some real interfaces for the cluster and config,
> which are "hidden" under the covers by the 1.4 and 1.5
> MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the
> default implementation, gives us the ability to hide "implementation
> details" if wanted, and moves us towards some factory methods instead of
> calling a class directly.
>
> Thoughts?
>
> On 3/26/14, 1:21 PM, Josh Elser wrote:
>> Yes, very much experimental at this point.
>>
>> What I'm most concerned about is having reasonable hooks up front, not
>> trying to make an implementation for inclusion 1.6.0.
>>
>> Regarding additions, the implementations already contains most things I
>> would want to expose. I haven't come up with anything that would be
>> generally returned through the "API" rather than through this proposed
>> implementation (e.g. YARN connection information)
>>
>> On 3/26/14, 11:57 AM, Keith Turner wrote:
>>> What you are trying to do sounds interesting.  It also sounds
>>> experimental
>>> and in the early stages.   Is there anything specific you think
>>> should be
>>> done for 1.6.0 w/ regards to MAC API?
>>>
>>>
>>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com>
>>> wrote:
>>>
>>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>>
>>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>>
>>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>>> understand you viewpoint either
>>>>>>>
>>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>>> that it
>>>>>> doesn't adequately exercise multi-node implementations. You can run
>>>>>> multiple tservers, but they are all on the same host which limits the
>>>>>> validity of a "robust" test. This is my immediate goal.
>>>>>>
>>>>>> Multi-node deployments are capable using something like Mesos or
>>>>>> Yarn.
>>>>>> Given that there is already functioning support to deploy Accumulo on
>>>>>> Yarn,
>>>>>> this was my goal.
>>>>>>
>>>>>> My goal is to be able to have the ability to run all of our
>>>>>> AbstractMacIT
>>>>>> implementations against "real" hardware without changing a single
>>>>>> line of
>>>>>> test code (ok - maybe a line or two to do injection of the MAC
>>>>>> implementation). The point is, I believe there could be a huge
>>>>>> testing
>>>>>> gain
>>>>>> from being able to write tests which leverage yarn, have the same
>>>>>> programmatic configuration API from MAC, and provide near "real"
>>>>>> Accumulo
>>>>>> semantics.
>>>>>>
>>>>>>
>>>>> Ok so you want to MAC to be an interface so that you can provide a
>>>>> completely different implementation?
>>>>>
>>>>>
>>>> Correct. Some things would serve well in a common abstract base (e.g.
>>>> numTservers, siteXml configuration), but all the nonsense about
>>>> creating
>>>> directory structures and managing Processes is implementation specific.
>>>>
>>>> Perhaps I could create a new interface that the current implementation
>>>> implements which still provides the same semantics from 1.4 and 1.5.
>>>> Let me
>>>> see if I can mock up what I'm thinking -- that will probably be
>>>> easier than
>>>> me trying to write it out.
>>>>
>>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Those who are interested: check out 
https://github.com/joshelser/accumulo/commit/9f63cf32559ab514a69ff2c6b02acef9c9cbb4e8

tl;dr I could create some real interfaces for the cluster and config, 
which are "hidden" under the covers by the 1.4 and 1.5 
MiniAccumuloCluster and MiniAccumuloConfig classes. This de-couples the 
default implementation, gives us the ability to hide "implementation 
details" if wanted, and moves us towards some factory methods instead of 
calling a class directly.

Thoughts?
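
For readers who do not want to open the commit, the general shape of the
idea in the tl;dr can be sketched as below. This is not the code from the
linked commit: every name here (ClusterConfig, Cluster, Clusters,
LocalCluster) is hypothetical, and the bodies are in-memory stand-ins. It
only illustrates interfaces for the cluster and config, a factory in
place of a direct constructor call, and a legacy-style class that
delegates under the covers.

```java
// Hypothetical sketch: names and methods are illustrative, not taken from
// the linked commit.

/** Configuration as an interface rather than a concrete class. */
interface ClusterConfig {
  int getNumTservers();
}

/** Cluster as an interface; callers never name the implementation. */
interface Cluster {
  ClusterConfig getConfig();
  void start();
  boolean isRunning();
}

/** Factory methods are the only way callers obtain instances. */
final class Clusters {
  private Clusters() {}

  static ClusterConfig config(final int numTservers) {
    if (numTservers < 1) {
      throw new IllegalArgumentException("need at least one tserver");
    }
    return () -> numTservers;  // lambda works because ClusterConfig has one method
  }

  static Cluster local(ClusterConfig config) {
    return new LocalCluster(config);
  }

  /** Default implementation, private so its details stay hidden. */
  private static final class LocalCluster implements Cluster {
    private final ClusterConfig config;
    private boolean running;

    LocalCluster(ClusterConfig config) { this.config = config; }

    @Override public ClusterConfig getConfig() { return config; }
    @Override public void start() { running = true; }  // stand-in for process management
    @Override public boolean isRunning() { return running; }
  }
}

/** A 1.4/1.5-style class keeps its shape and delegates under the covers. */
class MiniAccumuloCluster {
  private final Cluster delegate;

  MiniAccumuloCluster(int numTservers) {
    this.delegate = Clusters.local(Clusters.config(numTservers));
  }

  void start() { delegate.start(); }
  boolean isRunning() { return delegate.isRunning(); }
}
```

Because LocalCluster is private to the factory, a different
implementation could be swapped in without any caller changing.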

On 3/26/14, 1:21 PM, Josh Elser wrote:
> Yes, very much experimental at this point.
>
> What I'm most concerned about is having reasonable hooks up front, not
> trying to make an implementation for inclusion 1.6.0.
>
> Regarding additions, the implementations already contains most things I
> would want to expose. I haven't come up with anything that would be
> generally returned through the "API" rather than through this proposed
> implementation (e.g. YARN connection information)
>
> On 3/26/14, 11:57 AM, Keith Turner wrote:
>> What you are trying to do sounds interesting.  It also sounds
>> experimental
>> and in the early stages.   Is there anything specific you think should be
>> done for 1.6.0 w/ regards to MAC API?
>>
>>
>> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com> wrote:
>>
>>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>>
>>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com>
>>>> wrote:
>>>>
>>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>>
>>>>>> Can you give an example of what you are thinking of? I don't
>>>>>> understand you viewpoint either
>>>>>>
>>>>> Sure. One limitation of MAC, in general as a testing harness, is
>>>>> that it
>>>>> doesn't adequately exercise multi-node implementations. You can run
>>>>> multiple tservers, but they are all on the same host which limits the
>>>>> validity of a "robust" test. This is my immediate goal.
>>>>>
>>>>> Multi-node deployments are capable using something like Mesos or Yarn.
>>>>> Given that there is already functioning support to deploy Accumulo on
>>>>> Yarn,
>>>>> this was my goal.
>>>>>
>>>>> My goal is to be able to have the ability to run all of our
>>>>> AbstractMacIT
>>>>> implementations against "real" hardware without changing a single
>>>>> line of
>>>>> test code (ok - maybe a line or two to do injection of the MAC
>>>>> implementation). The point is, I believe there could be a huge testing
>>>>> gain
>>>>> from being able to write tests which leverage yarn, have the same
>>>>> programmatic configuration API from MAC, and provide near "real"
>>>>> Accumulo
>>>>> semantics.
>>>>>
>>>>>
>>>> Ok so you want to MAC to be an interface so that you can provide a
>>>> completely different implementation?
>>>>
>>>>
>>> Correct. Some things would serve well in a common abstract base (e.g.
>>> numTservers, siteXml configuration), but all the nonsense about creating
>>> directory structures and managing Processes is implementation specific.
>>>
>>> Perhaps I could create a new interface that the current implementation
>>> implements which still provides the same semantics from 1.4 and 1.5.
>>> Let me
>>> see if I can mock up what I'm thinking -- that will probably be
>>> easier than
>>> me trying to write it out.
>>>
>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
Yes, very much experimental at this point.

What I'm most concerned about is having reasonable hooks up front, not
trying to make an implementation for inclusion in 1.6.0.

Regarding additions, the implementation already contains most things I
would want to expose. I haven't come up with anything that would be
generally returned through the "API" rather than through this proposed
implementation (e.g. YARN connection information).

On 3/26/14, 11:57 AM, Keith Turner wrote:
> What you are trying to do sounds interesting.  It also sounds experimental
> and in the early stages.   Is there anything specific you think should be
> done for 1.6.0 w/ regards to MAC API?
>
>
> On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> On 3/26/14, 11:13 AM, Keith Turner wrote:
>>
>>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com> wrote:
>>>
>>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>>
>>>>> Can you give an example of what you are thinking of? I don't understand
>>>>> you viewpoint either
>>>>>
>>>> Sure. One limitation of MAC, in general as a testing harness, is that it
>>>> doesn't adequately exercise multi-node implementations. You can run
>>>> multiple tservers, but they are all on the same host which limits the
>>>> validity of a "robust" test. This is my immediate goal.
>>>>
>>>> Multi-node deployments are capable using something like Mesos or Yarn.
>>>> Given that there is already functioning support to deploy Accumulo on
>>>> Yarn,
>>>> this was my goal.
>>>>
>>>> My goal is to be able to have the ability to run all of our AbstractMacIT
>>>> implementations against "real" hardware without changing a single line of
>>>> test code (ok - maybe a line or two to do injection of the MAC
>>>> implementation). The point is, I believe there could be a huge testing
>>>> gain
>>>> from being able to write tests which leverage yarn, have the same
>>>> programmatic configuration API from MAC, and provide near "real" Accumulo
>>>> semantics.
>>>>
>>>>
>>> Ok so you want to MAC to be an interface so that you can provide a
>>> completely different implementation?
>>>
>>>
>> Correct. Some things would serve well in a common abstract base (e.g.
>> numTservers, siteXml configuration), but all the nonsense about creating
>> directory structures and managing Processes is implementation specific.
>>
>> Perhaps I could create a new interface that the current implementation
>> implements which still provides the same semantics from 1.4 and 1.5. Let me
>> see if I can mock up what I'm thinking -- that will probably be easier than
>> me trying to write it out.
>>
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
What you are trying to do sounds interesting. It also sounds experimental
and in the early stages. Is there anything specific you think should be
done for 1.6.0 with regard to the MAC API?


On Wed, Mar 26, 2014 at 2:26 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 11:13 AM, Keith Turner wrote:
>
>> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com> wrote:
>>
>>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>>
>>>> Can you give an example of what you are thinking of? I don't understand
>>>> you viewpoint either
>>>>
>>> Sure. One limitation of MAC, in general as a testing harness, is that it
>>> doesn't adequately exercise multi-node implementations. You can run
>>> multiple tservers, but they are all on the same host which limits the
>>> validity of a "robust" test. This is my immediate goal.
>>>
>>> Multi-node deployments are capable using something like Mesos or Yarn.
>>> Given that there is already functioning support to deploy Accumulo on
>>> Yarn,
>>> this was my goal.
>>>
>>> My goal is to be able to have the ability to run all of our AbstractMacIT
>>> implementations against "real" hardware without changing a single line of
>>> test code (ok - maybe a line or two to do injection of the MAC
>>> implementation). The point is, I believe there could be a huge testing
>>> gain
>>> from being able to write tests which leverage yarn, have the same
>>> programmatic configuration API from MAC, and provide near "real" Accumulo
>>> semantics.
>>>
>>>
>> Ok so you want to MAC to be an interface so that you can provide a
>> completely different implementation?
>>
>>
> Correct. Some things would serve well in a common abstract base (e.g.
> numTservers, siteXml configuration), but all the nonsense about creating
> directory structures and managing Processes is implementation specific.
>
> Perhaps I could create a new interface that the current implementation
> implements which still provides the same semantics from 1.4 and 1.5. Let me
> see if I can mock up what I'm thinking -- that will probably be easier than
> me trying to write it out.
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
On 3/26/14, 11:13 AM, Keith Turner wrote:
> On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> On 3/26/14, 10:57 AM, Keith Turner wrote:
>>
>>> Can you give an example of what you are thinking of? I don't understand
>>> you
>>> viewpoint either
>>>
>>
>> Sure. One limitation of MAC, in general as a testing harness, is that it
>> doesn't adequately exercise multi-node implementations. You can run
>> multiple tservers, but they are all on the same host which limits the
>> validity of a "robust" test. This is my immediate goal.
>>
>> Multi-node deployments are capable using something like Mesos or Yarn.
>> Given that there is already functioning support to deploy Accumulo on Yarn,
>> this was my goal.
>>
>> My goal is to be able to have the ability to run all of our AbstractMacIT
>> implementations against "real" hardware without changing a single line of
>> test code (ok - maybe a line or two to do injection of the MAC
>> implementation). The point is, I believe there could be a huge testing gain
>> from being able to write tests which leverage yarn, have the same
>> programmatic configuration API from MAC, and provide near "real" Accumulo
>> semantics.
>>
>
> Ok so you want to MAC to be an interface so that you can provide a
> completely different implementation?
>

Correct. Some things would serve well in a common abstract base (e.g. 
numTservers, siteXml configuration), but all the nonsense about creating 
directory structures and managing Processes is implementation specific.

Perhaps I could create a new interface that the current implementation 
implements which still provides the same semantics from 1.4 and 1.5. Let 
me see if I can mock up what I'm thinking -- that will probably be 
easier than me trying to write it out.
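
The split described above, shared settings in a common abstract base and
deployment mechanics in each implementation, might look something like
this. It is a sketch under assumptions: the class names, the
describeDeployment method and the string output are invented for
illustration and are not part of any Accumulo API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: class and method names are illustrative only.

/** Shared state every implementation needs: tserver count and site config. */
abstract class AbstractClusterConfig {
  private final int numTservers;
  private final Map<String, String> siteConfig = new LinkedHashMap<>();

  protected AbstractClusterConfig(int numTservers) {
    this.numTservers = numTservers;
  }

  int getNumTservers() { return numTservers; }

  void setSiteProperty(String key, String value) { siteConfig.put(key, value); }

  Map<String, String> getSiteConfig() { return Collections.unmodifiableMap(siteConfig); }

  /** Implementation-specific: how the cluster is actually laid out and launched. */
  abstract String describeDeployment();
}

/** Local variant: owns a directory tree and child processes. */
class LocalClusterConfig extends AbstractClusterConfig {
  private final String baseDir;

  LocalClusterConfig(int numTservers, String baseDir) {
    super(numTservers);
    this.baseDir = baseDir;
  }

  @Override String describeDeployment() {
    return getNumTservers() + " local tserver processes under " + baseDir;
  }
}

/** Multi-node variant: no local directories or Process objects at all. */
class YarnClusterConfig extends AbstractClusterConfig {
  YarnClusterConfig(int numTservers) { super(numTservers); }

  @Override String describeDeployment() {
    return getNumTservers() + " tserver containers requested from YARN";
  }
}
```

Only LocalClusterConfig knows about a base directory, so a Yarn-backed
implementation would not inherit directory and Process handling it has
no use for.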

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 2:05 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 10:57 AM, Keith Turner wrote:
>
>> Can you give an example of what you are thinking of? I don't understand
>> you
>> viewpoint either
>>
>
> Sure. One limitation of MAC, in general as a testing harness, is that it
> doesn't adequately exercise multi-node implementations. You can run
> multiple tservers, but they are all on the same host which limits the
> validity of a "robust" test. This is my immediate goal.
>
> Multi-node deployments are capable using something like Mesos or Yarn.
> Given that there is already functioning support to deploy Accumulo on Yarn,
> this was my goal.
>
> My goal is to be able to have the ability to run all of our AbstractMacIT
> implementations against "real" hardware without changing a single line of
> test code (ok - maybe a line or two to do injection of the MAC
> implementation). The point is, I believe there could be a huge testing gain
> from being able to write tests which leverage yarn, have the same
> programmatic configuration API from MAC, and provide near "real" Accumulo
> semantics.
>

OK, so you want MAC to be an interface so that you can provide a
completely different implementation?
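
As a rough illustration of what "an interface with a completely different
implementation" buys a test suite, consider the sketch below. Everything
in it is hypothetical: the ClusterUnderTest interface, the
"test.cluster.impl" property name and the hard-coded tserver counts are
made up, and nothing talks to a real cluster. It only shows test logic
written once against an interface, with the implementation chosen in a
single injection point.

```java
// Hypothetical sketch: the property name "test.cluster.impl" and all types
// here are made up for illustration.

interface ClusterUnderTest {
  String kind();
  int liveTservers();
}

class MiniCluster implements ClusterUnderTest {
  @Override public String kind() { return "mini"; }
  @Override public int liveTservers() { return 2; }  // all on one host
}

class YarnCluster implements ClusterUnderTest {
  @Override public String kind() { return "yarn"; }
  @Override public int liveTservers() { return 2; }  // would be spread across nodes
}

final class ClusterInjector {
  private ClusterInjector() {}

  /** The "line or two" of injection: pick the implementation by name. */
  static ClusterUnderTest forName(String impl) {
    switch (impl) {
      case "mini": return new MiniCluster();
      case "yarn": return new YarnCluster();
      default: throw new IllegalArgumentException("unknown cluster impl: " + impl);
    }
  }

  /** Falls back to the local implementation when the property is unset. */
  static ClusterUnderTest fromSystemProperty() {
    return forName(System.getProperty("test.cluster.impl", "mini"));
  }
}

/** Test logic is written once, against the interface only. */
final class TserverCountCheck {
  private TserverCountCheck() {}

  static boolean hasEnoughTservers(ClusterUnderTest cluster, int expected) {
    return cluster.liveTservers() >= expected;
  }
}
```

A run with -Dtest.cluster.impl=yarn would exercise the same check against
the other implementation without touching TserverCountCheck.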

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
On 3/26/14, 10:57 AM, Keith Turner wrote:
> Can you give an example of what you are thinking of? I don't understand you
> viewpoint either

Sure. One limitation of MAC, in general as a testing harness, is that it 
doesn't adequately exercise multi-node implementations. You can run 
multiple tservers, but they are all on the same host which limits the 
validity of a "robust" test. This is my immediate goal.

Multi-node deployments are capable using something like Mesos or Yarn. 
Given that there is already functioning support to deploy Accumulo on 
Yarn, this was my goal.

My goal is to be able to have the ability to run all of our 
AbstractMacIT implementations against "real" hardware without changing a 
single line of test code (ok - maybe a line or two to do injection of 
the MAC implementation). The point is, I believe there could be a huge 
testing gain from being able to write tests which leverage yarn, have 
the same programmatic configuration API from MAC, and provide near 
"real" Accumulo semantics.
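
To make the injection idea concrete, here is a rough sketch of the shape
I have in mind. Every name in it (TestCluster, LocalProcessCluster,
YarnBackedCluster, TestClusters, the test.cluster.kind property) is made
up for illustration and does not exist in Accumulo, and both
implementations are stubs that only track state. The point is only that
the test body sees the interface and nothing else.

```java
// Hypothetical interface: the only type a test would depend on.
interface TestCluster {
  void start();
  void stop();
  boolean isRunning();
  String getInstanceName();
  String getZooKeepers();
}

// Stub standing in for a single-host, process-based cluster.
class LocalProcessCluster implements TestCluster {
  private boolean running;
  @Override public void start() { running = true; }
  @Override public void stop() { running = false; }
  @Override public boolean isRunning() { return running; }
  @Override public String getInstanceName() { return "local-instance"; }
  @Override public String getZooKeepers() { return "localhost:2181"; }
}

// Stub standing in for a multi-node cluster deployed through Yarn.
class YarnBackedCluster implements TestCluster {
  private boolean running;
  @Override public void start() { running = true; }
  @Override public void stop() { running = false; }
  @Override public boolean isRunning() { return running; }
  @Override public String getInstanceName() { return "yarn-instance"; }
  @Override public String getZooKeepers() { return "zk1:2181,zk2:2181"; }
}

// Hypothetical factory: the "line or two" of injection lives here.
final class TestClusters {
  private TestClusters() {}

  static TestCluster create(String kind) {
    switch (kind) {
      case "local": return new LocalProcessCluster();
      case "yarn": return new YarnBackedCluster();
      default: throw new IllegalArgumentException("unknown cluster kind: " + kind);
    }
  }
}

class ClusterInjectionSketch {
  // Test body written once against the interface, whatever backs it.
  static String exerciseCluster(TestCluster cluster) {
    cluster.start();
    try {
      if (!cluster.isRunning()) {
        throw new IllegalStateException("cluster did not start");
      }
      return cluster.getInstanceName() + "@" + cluster.getZooKeepers();
    } finally {
      cluster.stop();
    }
  }

  public static void main(String[] args) {
    // The implementation is chosen outside the test code.
    String kind = System.getProperty("test.cluster.kind", "local");
    System.out.println(exerciseCluster(TestClusters.create(kind)));
  }
}
```

Running with -Dtest.cluster.kind=yarn would switch the backing
implementation without touching exerciseCluster.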

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 12:46 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:33 AM, Keith Turner wrote:
>
>> On Wed, Mar 26, 2014 at 12:26 PM, Josh Elser <jo...@gmail.com>
>> wrote:
>>
>>  On 3/26/14, 9:23 AM, Keith Turner wrote:
>>>
>>>  That's my irk with it. The changes we made "hide" things for no other
>>>>
>>>>> purpose than saying "we hid them". The next variant of a MAC is going
>>>>>> to
>>>>>> have to re-architect the entire thing anyways (I'm doing this right
>>>>>> now
>>>>>>
>>>>> and
>>>>>
>>>>>> I'm overhauling it).
>>>>>>
>>>>>>
>>>>>  There is a purpose.  Whats an alternative solution to the addition of
>>>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>>>
>>>>
>>> Personally, I wouldn't have really cared if such a method was added to
>>> its
>>> API.
>>>
>>
>>
>> Why not?  It needlessly exposes a MAC implementation detail.  Java 7
>> offers
>> a much better way to handle this situation and makes the need for these
>> threads go away. As I said flushing the logs could be offered in the API
>> in
>> a much nicer way.  Thats one solution.
>>
>>
> If it was needless as you claim, why was it added in the first place as a
> public method?
>
>
>
>>
>>>
>>>   If you want to re-write MAC all you have to support is the interface in
>>>
>>>> minicluster, you are free to throw everything in minicluster.impl away.
>>>>
>>>>
>>>>
>>>>  No, not with the "interface" explicitly referencing MiniAccumuloC*Impl
>>> internally, I can't. I do not see any way I can throw away the existing
>>> impl given the API wrapper. Am I missing something?
>>>
>>>
>> Does MiniAccumuloC*Impl leak from the minicluster package in some way?
>>
>
> I don't understand your question, so I'll restate my concern: the only
> implementation I can make that leverages MiniAccumuloC* is by extending
> MiniAccumuloC*Impl.
>

Can you give an example of what you are thinking of? I don't understand your
viewpoint either

As long as we preserve the correct signatures on the public methods in
o.a.a.minicluster, we can make any changes we like to the implementation of
those methods. The implementation of those methods happens to use classes
in o.a.a.minicluster.impl. No classes from o.a.a.minicluster.impl should
leak through the public methods in o.a.a.minicluster. I was asking if
there was leakage there.
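
As a rough illustration of what I mean by leakage, here is a small
sketch. The class names (ClusterImpl, PublicCluster, LeakyCluster,
LeakCheckSketch) are hypothetical stand-ins, not the real minicluster
classes. The public class delegates to the impl, and a reflection check
looks for the impl type in public method signatures. It only inspects raw
return and parameter types, so something like List<ClusterImpl> would
slip past it.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical internal implementation class.
class ClusterImpl {
  private boolean running;
  void start() { running = true; }
  void stop() { running = false; }
  boolean isRunning() { return running; }
  String getZooKeepers() { return "localhost:2181"; }
}

// Hypothetical public wrapper: delegates to the impl, exposes none of it.
class PublicCluster {
  private final ClusterImpl impl = new ClusterImpl();
  public void start() { impl.start(); }
  public void stop() { impl.stop(); }
  public String getZooKeepers() { return impl.getZooKeepers(); }
}

// Counter-example: a public method hands the impl type to callers.
class LeakyCluster extends PublicCluster {
  public ClusterImpl getImpl() { return new ClusterImpl(); }
}

class LeakCheckSketch {
  // True if any public method of api mentions internal as a raw
  // return type or parameter type.
  static boolean leaks(Class<?> api, Class<?> internal) {
    for (Method m : api.getMethods()) {
      if (m.getReturnType() == internal) {
        return true;
      }
      if (Arrays.asList(m.getParameterTypes()).contains(internal)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    System.out.println("PublicCluster leaks: " + leaks(PublicCluster.class, ClusterImpl.class));
    System.out.println("LeakyCluster leaks: " + leaks(LeakyCluster.class, ClusterImpl.class));
  }
}
```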



>
> Talking to John, he did present the possibility that MACI could be made
> into a common base, the current impl lifted to a new impl that isn't tied
> to specific details, and then that *should* work.
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
On 3/26/14, 9:33 AM, Keith Turner wrote:
> On Wed, Mar 26, 2014 at 12:26 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> On 3/26/14, 9:23 AM, Keith Turner wrote:
>>
>>> That's my irk with it. The changes we made "hide" things for no other
>>>>> purpose than saying "we hid them". The next variant of a MAC is going to
>>>>> have to re-architect the entire thing anyways (I'm doing this right now
>>>> and
>>>>> I'm overhauling it).
>>>>>
>>>>
>>> There is a purpose.  Whats an alternative solution to the addition of
>>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>>
>>
>> Personally, I wouldn't have really cared if such a method was added to its
>> API.
>
>
> Why not?  It needlessly exposes a MAC implementation detail.  Java 7 offers
> a much better way to handle this situation and makes the need for these
> threads go away. As I said flushing the logs could be offered in the API in
> a much nicer way.  Thats one solution.
>

If it was needless as you claim, why was it added in the first place as 
a public method?

>
>>
>>
>>   If you want to re-write MAC all you have to support is the interface in
>>> minicluster, you are free to throw everything in minicluster.impl away.
>>>
>>>
>>>
>> No, not with the "interface" explicitly referencing MiniAccumuloC*Impl
>> internally, I can't. I do not see any way I can throw away the existing
>> impl given the API wrapper. Am I missing something?
>>
>
> Does MiniAccumuloC*Impl leak from the minicluster package in some way?

I don't understand your question, so I'll restate my concern: the only 
implementation I can make that leverages MiniAccumuloC* is by extending 
MiniAccumuloC*Impl.

Talking to John, he did present the possibility that MACI could be made 
into a common base, the current impl lifted to a new impl that isn't 
tied to specific details, and then that *should* work.

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 12:26 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:23 AM, Keith Turner wrote:
>
>> That's my irk with it. The changes we made "hide" things for no other
>>> >purpose than saying "we hid them". The next variant of a MAC is going to
>>> >have to re-architect the entire thing anyways (I'm doing this right now
>>> and
>>> >I'm overhauling it).
>>> >
>>>
>> There is a purpose.  Whats an alternative solution to the addition of
>> "public List<LogWriter> getLogWriters()" to the MAC API?
>>
>
> Personally, I wouldn't have really cared if such a method was added to its
> API.


Why not?  It needlessly exposes a MAC implementation detail.  Java 7 offers
a much better way to handle this situation and makes the need for these
threads go away. As I said, flushing the logs could be offered in the API in
a much nicer way.  That's one solution.
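
For example, one possible shape for "flush the logs without exposing the
writers" is sketched below. This is only an illustration with made-up
names (LogFlushingCluster, FlushLogsSketch) and plain java.io writers
standing in for the real log plumbing. It is separate from the Java 7
approach mentioned above. The writers stay private and callers get a
single flushLogs() operation.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical cluster class: the log writers are a private detail.
class LogFlushingCluster {
  private final List<BufferedWriter> logWriters = new ArrayList<>();

  LogFlushingCluster(List<? extends Writer> sinks) {
    for (Writer sink : sinks) {
      logWriters.add(new BufferedWriter(sink));
    }
  }

  // Internal logging path; output sits in the buffer until flushed.
  void log(String line) {
    try {
      for (BufferedWriter w : logWriters) {
        w.write(line);
        w.write('\n');
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // The only log-related operation callers see.
  public void flushLogs() {
    try {
      for (BufferedWriter w : logWriters) {
        w.flush();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}

class FlushLogsSketch {
  public static void main(String[] args) {
    StringWriter sink = new StringWriter();
    LogFlushingCluster cluster = new LogFlushingCluster(Collections.singletonList(sink));
    cluster.log("tserver started");
    System.out.println("before flush: [" + sink.toString().trim() + "]");
    cluster.flushLogs();
    System.out.println("after flush: [" + sink.toString().trim() + "]");
  }
}
```

With something like this the implementation could later replace the
writers with anything else and callers of flushLogs() would not notice.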


>
>
>  If you want to re-write MAC all you have to support is the interface in
>> minicluster, you are free to throw everything in minicluster.impl away.
>>
>>
>>
> No, not with the "interface" explicitly referencing MiniAccumuloC*Impl
> internally, I can't. I do not see any way I can throw away the existing
> impl given the API wrapper. Am I missing something?
>

Does MiniAccumuloC*Impl leak from the minicluster package in some way?

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
On 3/26/14, 9:23 AM, Keith Turner wrote:
>> That's my irk with it. The changes we made "hide" things for no other
>> >purpose than saying "we hid them". The next variant of a MAC is going to
>> >have to re-architect the entire thing anyways (I'm doing this right now and
>> >I'm overhauling it).
>> >
> There is a purpose.  Whats an alternative solution to the addition of
> "public List<LogWriter> getLogWriters()" to the MAC API?

Personally, I wouldn't have really cared if such a method was added to 
its API.

> If you want to re-write MAC all you have to support is the interface in
> minicluster, you are free to throw everything in minicluster.impl away.
>
>

No, not with the "interface" explicitly referencing MiniAccumuloC*Impl 
internally, I can't. I do not see any way I can throw away the existing 
impl given the API wrapper. Am I missing something?

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
On Wed, Mar 26, 2014 at 12:12 PM, Josh Elser <jo...@gmail.com> wrote:

> On 3/26/14, 9:06 AM, Keith Turner wrote:
>
>> There were many change made to MAC so Accumulo could test itself.  For
>> example a method was added to return the internal threads that flush logs.
>> Flushing the logs may be a useful feature to add.  However it could be
>> offered in a way that does not expose these internal threads.   When
>> working on  ACCUMULO-2151 I had no desire to reimplement things like this,
>> I just wanted to hide it.  It was hidden from users so we do not have to
>> support it and can change it at will when testing 1.7.0.
>>
>
> That's my irk with it. The changes we made "hide" things for no other
> purpose than saying "we hid them". The next variant of a MAC is going to
> have to re-architect the entire thing anyways (I'm doing this right now and
> I'm overhauling it).
>

There is a purpose.  What's an alternative solution to the addition of
"public List<LogWriter> getLogWriters()" to the MAC API?

If you want to re-write MAC, all you have to support is the interface in
minicluster; you are free to throw everything in minicluster.impl away.


>
> It doesn't make sense to me ship something that changes it without
> addressing the underlying problems. I don't want to suggest it because I
> don't want to introduce our own mapred and mapreduce dichotomy, but I can't
> come up with a better alternative yet.
>
>
>  As Sean said MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
>> interface would break things for any users using it.  Any reorganizing of
>> the implementation of MAC could easily be done after 1.6.0.  From a users
>> perspective the MAC API has changed very little, even though the
>> implementation has dramatically changed.
>>
>
> Reorganizing of *this single implementation of MAC* can easily be done.
> Any extension or reimplementation will be dirty, and inherit a large bit of
> code that is likely useless.
>
>
>
>> On Wed, Mar 26, 2014 at 3:10 AM, Sean Busbey <bu...@cloudera.com>
>> wrote:
>>
>>  ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
>>> intended use and the way we currently implement the difference between
>>> MAC
>>> for external use and MAC for internal Accumulo testing[1].
>>>
>>> In particular, Josh had a few major concerns
>>>
>>> -----
>>>
>>> It doesn't make sense to me why MiniAccumuloCluster is a concrete class
>>> which, ultimately still tied to a MiniAccumuloClusterImpl.
>>> MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something
>>> that
>>> extends it. This is what's really chafing me about the separation of
>>> "accumulo user" and "accumulo developer" methods - you *always* have them
>>> both. Not to mention, this hierarchy is really obnoxious to create a new
>>> implementation of AccumuloMiniCluster(Impl) because I have to carry all
>>> of
>>> the cruft of the "original" implementation with me.
>>>
>>> Bringing this back around to this ticket, while I still don't agree with
>>> the reasoning that exposing the FileSystem or ZooKeeper object that
>>> MiniAccumuloClusterImpl is getting us anything other than the ability to
>>> say "we didn't change this [arbitrary] API". For "users" who might not
>>> care
>>> what the underlying FileSystem or ZooKeeper connection, it's merely an
>>> extra two items in their editor's code-completion. For "users" who would
>>> care to use this information, we now make them jump through extra hoops
>>> to
>>> get it. That just doesn't make any sense to me for something we haven't
>>> even released.
>>>
>>> To be honest, I really want to re-open
>>> ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
>>> make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
>>> implementation of said interface, and create some factory class to make
>>> instances, ala Connector.tableOperations, Connector.securityOperations,
>>> etc. Right now there's a class we call an "API" that cannot be
>>> generically
>>> extended for the sake of saying "we have an API".
>>>
>>> ----
>>>
>>> I wanted to avoid having a drawn out discussion on a jira, where folks my
>>> not notice it. Especially with things being late in 1.6.0 development and
>>> the potential this has to impact the API.
>>>
>>> Personally, I don't have much of a dog in the fight. There's always some
>>> arbitrary line for where the public API will be, presuming we want to
>>> have
>>> any kind of balance between providing a stable based for others to build
>>> on
>>> and being able to refactor things. I would like us to hold to our API
>>> promises[2] and I would rather we not leak implementation details
>>> unnecessarily.
>>>
>>> I suspect the choice to make MiniAccumuloCluster a class rather than an
>>> interface with a factory was driven by the restrictions we put on API
>>> changes between major versions and the fact that 1.5 had a class you
>>> could
>>> instantiate via constructors[3].
>>>
>>> It's possible we can address some of the major reusability concerns by
>>> moving most of the implementation back into MAC, liberally using package
>>> access for members, and making the internal-use MAC extend the public
>>> one.
>>>
>>>
>>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
>>> [2]: http://accumulo.apache.org/governance/releasing.html
>>> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>>>
>>>
>>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Josh Elser <jo...@gmail.com>.
On 3/26/14, 9:06 AM, Keith Turner wrote:
> There were many changes made to MAC so Accumulo could test itself.  For
> example a method was added to return the internal threads that flush logs.
> Flushing the logs may be a useful feature to add.  However it could be
> offered in a way that does not expose these internal threads.   When
> working on  ACCUMULO-2151 I had no desire to reimplement things like this,
> I just wanted to hide it.  It was hidden from users so we do not have to
> support it and can change it at will when testing 1.7.0.

That's my irk with it. The changes we made "hide" things for no other 
purpose than saying "we hid them". The next variant of a MAC is going to 
have to re-architect the entire thing anyways (I'm doing this right now 
and I'm overhauling it).

It doesn't make sense to me to ship something that changes it without
addressing the underlying problems. I don't want to suggest it because I 
don't want to introduce our own mapred and mapreduce dichotomy, but I 
can't come up with a better alternative yet.

> As Sean said MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
> interface would break things for any users using it.  Any reorganizing of
> the implementation of MAC could easily be done after 1.6.0.  From a users
> perspective the MAC API has changed very little, even though the
> implementation has dramatically changed.

Reorganizing of *this single implementation of MAC* can easily be done. 
Any extension or reimplementation will be dirty, and inherit a large bit 
of code that is likely useless.

>
> On Wed, Mar 26, 2014 at 3:10 AM, Sean Busbey <bu...@cloudera.com>wrote:
>
>> ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
>> intended use and the way we currently implement the difference between MAC
>> for external use and MAC for internal Accumulo testing[1].
>>
>> In particular, Josh had a few major concerns
>>
>> -----
>>
>> It doesn't make sense to me why MiniAccumuloCluster is a concrete class
>> which, ultimately still tied to a MiniAccumuloClusterImpl.
>> MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something that
>> extends it. This is what's really chafing me about the separation of
>> "accumulo user" and "accumulo developer" methods - you *always* have them
>> both. Not to mention, this hierarchy is really obnoxious to create a new
>> implementation of AccumuloMiniCluster(Impl) because I have to carry all of
>> the cruft of the "original" implementation with me.
>>
>> Bringing this back around to this ticket, while I still don't agree with
>> the reasoning that exposing the FileSystem or ZooKeeper object that
>> MiniAccumuloClusterImpl is getting us anything other than the ability to
>> say "we didn't change this [arbitrary] API". For "users" who might not care
>> what the underlying FileSystem or ZooKeeper connection, it's merely an
>> extra two items in their editor's code-completion. For "users" who would
>> care to use this information, we now make them jump through extra hoops to
>> get it. That just doesn't make any sense to me for something we haven't
>> even released.
>>
>> To be honest, I really want to re-open
>> ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
>> make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
>> implementation of said interface, and create some factory class to make
>> instances, ala Connector.tableOperations, Connector.securityOperations,
>> etc. Right now there's a class we call an "API" that cannot be generically
>> extended for the sake of saying "we have an API".
>>
>> ----
>>
>> I wanted to avoid having a drawn out discussion on a jira, where folks my
>> not notice it. Especially with things being late in 1.6.0 development and
>> the potential this has to impact the API.
>>
>> Personally, I don't have much of a dog in the fight. There's always some
>> arbitrary line for where the public API will be, presuming we want to have
>> any kind of balance between providing a stable based for others to build on
>> and being able to refactor things. I would like us to hold to our API
>> promises[2] and I would rather we not leak implementation details
>> unnecessarily.
>>
>> I suspect the choice to make MiniAccumuloCluster a class rather than an
>> interface with a factory was driven by the restrictions we put on API
>> changes between major versions and the fact that 1.5 had a class you could
>> instantiate via constructors[3].
>>
>> It's possible we can address some of the major reusability concerns by
>> moving most of the implementation back into MAC, liberally using package
>> access for members, and making the internal-use MAC extend the public one.
>>
>>
>> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
>> [2]: http://accumulo.apache.org/governance/releasing.html
>> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>>
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by William Slacum <wi...@accumulo.net>.
Correction from my previous email:

At this point, the MiniAccumuloCluster's interface of the
MiniAccumuloClusterImpl's interface.

should read

At this point, the MiniAccumuloCluster's interface is a subset of the
MiniAccumuloClusterImpl's interface.


On Wed, Mar 26, 2014 at 1:10 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> [NOTE: I started this email when this thread was new, and it kind of blew
> up on me while writing it and being distracted. Apologies in advance if
> things were already covered or it's not relevant any more.]
>
> Is this a design quality discussion or a functionality discussion?
>
> The changes from 1.5->1.6 seem like a poor design decision, but they do
> aid in functionality.
>
> From 1.5:
>   public MiniAccumuloCluster(File dir, String rootPassword) throws IOException
>   public MiniAccumuloCluster(MiniAccumuloConfig config) throws IOException
>   public void start() throws IOException, InterruptedException
>   public String getInstanceName()
>   public String getZooKeepers()
>   public void stop() throws IOException, InterruptedException
>
> From 1.6:
>   public MiniAccumuloCluster(File dir, String rootPassword) throws IOException
>   public MiniAccumuloCluster(MiniAccumuloConfig config) throws IOException
>   public void start() throws IOException, InterruptedException
>   public Set<Pair<ServerType,Integer>> getDebugPorts()
>   public String getInstanceName()
>   public String getZooKeepers()
>   public void stop() throws IOException, InterruptedException
>   public MiniAccumuloConfig getConfig()
>   public Connector getConnector(String user, String passwd) throws AccumuloException, AccumuloSecurityException
>   public ClientConfiguration getClientConfig()
>
> From a client perspective, I see a difference of #getDebugPorts,
> #getConfig, #getConnector, #getClientConfig. The other methods are on the
> Impl. There's nothing wrong with using aggregation in this case, since the
> code would be the same regardless.
>
> I don't quite understand what it means to "extend generically." At this
> point, the MiniAccumuloCluster's interface of the MiniAccumuloClusterImpl's
> interface. The naming could, and should, be better, but I don't quite get
> where we're losing functionality.
>
>
>
> On Wed, Mar 26, 2014 at 12:06 PM, Keith Turner <ke...@deenlo.com> wrote:
>
>> There were many change made to MAC so Accumulo could test itself.  For
>> example a method was added to return the internal threads that flush logs.
>> Flushing the logs may be a useful feature to add.  However it could be
>> offered in a way that does not expose these internal threads.   When
>> working on  ACCUMULO-2151 I had no desire to reimplement things like this,
>> I just wanted to hide it.  It was hidden from users so we do not have to
>> support it and can change it at will when testing 1.7.0.
>>
>> As Sean said MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
>> interface would break things for any users using it.  Any reorganizing of
>> the implementation of MAC could easily be done after 1.6.0.  From a users
>> perspective the MAC API has changed very little, even though the
>> implementation has dramatically changed.
>>
>>
>>
>>
>> On Wed, Mar 26, 2014 at 3:10 AM, Sean Busbey <busbey+lists@cloudera.com
>> >wrote:
>>
>> > ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
>> > intended use and the way we currently implement the difference between
>> MAC
>> > for external use and MAC for internal Accumulo testing[1].
>> >
>> > In particular, Josh had a few major concerns
>> >
>> > -----
>> >
>> > It doesn't make sense to me why MiniAccumuloCluster is a concrete class
>> > which, ultimately still tied to a MiniAccumuloClusterImpl.
>> > MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something
>> that
>> > extends it. This is what's really chafing me about the separation of
>> > "accumulo user" and "accumulo developer" methods - you *always* have
>> them
>> > both. Not to mention, this hierarchy is really obnoxious to create a new
>> > implementation of AccumuloMiniCluster(Impl) because I have to carry all
>> of
>> > the cruft of the "original" implementation with me.
>> >
>> > Bringing this back around to this ticket, while I still don't agree with
>> > the reasoning that exposing the FileSystem or ZooKeeper object that
>> > MiniAccumuloClusterImpl is getting us anything other than the ability to
>> > say "we didn't change this [arbitrary] API". For "users" who might not
>> care
>> > what the underlying FileSystem or ZooKeeper connection, it's merely an
>> > extra two items in their editor's code-completion. For "users" who would
>> > care to use this information, we now make them jump through extra hoops
>> to
>> > get it. That just doesn't make any sense to me for something we haven't
>> > even released.
>> >
>> > To be honest, I really want to re-open
>> > ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
>> > make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
>> > implementation of said interface, and create some factory class to make
>> > instances, ala Connector.tableOperations, Connector.securityOperations,
>> > etc. Right now there's a class we call an "API" that cannot be
>> generically
>> > extended for the sake of saying "we have an API".
>> >
>> > ----
>> >
>> > I wanted to avoid having a drawn out discussion on a jira, where folks
>> my
>> > not notice it. Especially with things being late in 1.6.0 development
>> and
>> > the potential this has to impact the API.
>> >
>> > Personally, I don't have much of a dog in the fight. There's always some
>> > arbitrary line for where the public API will be, presuming we want to
>> have
>> > any kind of balance between providing a stable based for others to
>> build on
>> > and being able to refactor things. I would like us to hold to our API
>> > promises[2] and I would rather we not leak implementation details
>> > unnecessarily.
>> >
>> > I suspect the choice to make MiniAccumuloCluster a class rather than an
>> > interface with a factory was driven by the restrictions we put on API
>> > changes between major versions and the fact that 1.5 had a class you
>> could
>> > instantiate via constructors[3].
>> >
>> > It's possible we can address some of the major reusability concerns by
>> > moving most of the implementation back into MAC, liberally using package
>> > access for members, and making the internal-use MAC extend the public
>> one.
>> >
>> >
>> > [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
>> > [2]: http://accumulo.apache.org/governance/releasing.html
>> > [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>> >
>>
>
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by William Slacum <wi...@accumulo.net>.
[NOTE: I started this email when this thread was new, and it kind of blew
up on me while writing it and being distracted. Apologies in advance if
things were already covered or it's not relevant any more.]

Is this a design quality discussion or a functionality discussion?

The changes from 1.5->1.6 seem like a poor design decision, but they do aid
in functionality.

From 1.5:
  public MiniAccumuloCluster(File dir, String rootPassword) throws IOException
  public MiniAccumuloCluster(MiniAccumuloConfig config) throws IOException
  public void start() throws IOException, InterruptedException
  public String getInstanceName()
  public String getZooKeepers()
  public void stop() throws IOException, InterruptedException

From 1.6:
  public MiniAccumuloCluster(File dir, String rootPassword) throws IOException
  public MiniAccumuloCluster(MiniAccumuloConfig config) throws IOException
  public void start() throws IOException, InterruptedException
  public Set<Pair<ServerType,Integer>> getDebugPorts()
  public String getInstanceName()
  public String getZooKeepers()
  public void stop() throws IOException, InterruptedException
  public MiniAccumuloConfig getConfig()
  public Connector getConnector(String user, String passwd) throws AccumuloException, AccumuloSecurityException
  public ClientConfiguration getClientConfig()

From a client perspective, I see a difference of #getDebugPorts,
#getConfig, #getConnector, #getClientConfig. The other methods are on the
Impl. There's nothing wrong with using aggregation in this case, since the
code would be the same regardless.

I don't quite understand what it means to "extend generically." At this
point, the MiniAccumuloCluster's interface of the MiniAccumuloClusterImpl's
interface. The naming could, and should, be better, but I don't quite get
where we're losing functionality.



On Wed, Mar 26, 2014 at 12:06 PM, Keith Turner <ke...@deenlo.com> wrote:

> There were many changes made to MAC so Accumulo could test itself.  For
> example a method was added to return the internal threads that flush logs.
> Flushing the logs may be a useful feature to add.  However it could be
> offered in a way that does not expose these internal threads.   When
> working on  ACCUMULO-2151 I had no desire to reimplement things like this,
> I just wanted to hide it.  It was hidden from users so we do not have to
> support it and can change it at will when testing 1.7.0.
>
> As Sean said MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
> interface would break things for any users using it.  Any reorganizing of
> the implementation of MAC could easily be done after 1.6.0.  From a users
> perspective the MAC API has changed very little, even though the
> implementation has dramatically changed.
>
>
>
>
> On Wed, Mar 26, 2014 at 3:10 AM, Sean Busbey <busbey+lists@cloudera.com
> >wrote:
>
> > ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
> > intended use and the way we currently implement the difference between
> MAC
> > for external use and MAC for internal Accumulo testing[1].
> >
> > In particular, Josh had a few major concerns
> >
> > -----
> >
> > It doesn't make sense to me why MiniAccumuloCluster is a concrete class
> > which, ultimately still tied to a MiniAccumuloClusterImpl.
> > MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something
> that
> > extends it. This is what's really chafing me about the separation of
> > "accumulo user" and "accumulo developer" methods - you *always* have them
> > both. Not to mention, this hierarchy is really obnoxious to create a new
> > implementation of AccumuloMiniCluster(Impl) because I have to carry all
> of
> > the cruft of the "original" implementation with me.
> >
> > Bringing this back around to this ticket, while I still don't agree with
> > the reasoning that exposing the FileSystem or ZooKeeper object that
> > MiniAccumuloClusterImpl is getting us anything other than the ability to
> > say "we didn't change this [arbitrary] API". For "users" who might not
> care
> > what the underlying FileSystem or ZooKeeper connection, it's merely an
> > extra two items in their editor's code-completion. For "users" who would
> > care to use this information, we now make them jump through extra hoops
> to
> > get it. That just doesn't make any sense to me for something we haven't
> > even released.
> >
> > To be honest, I really want to re-open
> > ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
> > make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
> > implementation of said interface, and create some factory class to make
> > instances, ala Connector.tableOperations, Connector.securityOperations,
> > etc. Right now there's a class we call an "API" that cannot be
> generically
> > extended for the sake of saying "we have an API".
> >
> > ----
> >
> > I wanted to avoid having a drawn out discussion on a jira, where folks my
> > not notice it. Especially with things being late in 1.6.0 development and
> > the potential this has to impact the API.
> >
> > Personally, I don't have much of a dog in the fight. There's always some
> > arbitrary line for where the public API will be, presuming we want to
> have
> > any kind of balance between providing a stable based for others to build
> on
> > and being able to refactor things. I would like us to hold to our API
> > promises[2] and I would rather we not leak implementation details
> > unnecessarily.
> >
> > I suspect the choice to make MiniAccumuloCluster a class rather than an
> > interface with a factory was driven by the restrictions we put on API
> > changes between major versions and the fact that 1.5 had a class you
> could
> > instantiate via constructors[3].
> >
> > It's possible we can address some of the major reusability concerns by
> > moving most of the implementation back into MAC, liberally using package
> > access for members, and making the internal-use MAC extend the public
> one.
> >
> >
> > [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
> > [2]: http://accumulo.apache.org/governance/releasing.html
> > [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
> >
>

Re: [DISCUSS] MiniAccumuloCluster goals and approach

Posted by Keith Turner <ke...@deenlo.com>.
There were many changes made to MAC so Accumulo could test itself.  For
example, a method was added to return the internal threads that flush logs.
Flushing the logs may be a useful feature to add.  However, it could be
offered in a way that does not expose these internal threads.  When
working on ACCUMULO-2151 I had no desire to reimplement things like this,
I just wanted to hide it.  It was hidden from users so we do not have to
support it and can change it at will when testing 1.7.0.

As Sean said, MAC was a class in 1.4.4, 1.5.0, and 1.5.1.  So making it an
interface would break things for any users using it.  Any reorganizing of
the implementation of MAC could easily be done after 1.6.0.  From a user's
perspective the MAC API has changed very little, even though the
implementation has dramatically changed.




On Wed, Mar 26, 2014 at 3:10 AM, Sean Busbey <bu...@cloudera.com>wrote:

> ACCUMULO-2143 has developed a conversation about MiniAccumuloCluster's
> intended use and the way we currently implement the difference between MAC
> for external use and MAC for internal Accumulo testing[1].
>
> In particular, Josh had a few major concerns
>
> -----
>
> It doesn't make sense to me why MiniAccumuloCluster is a concrete class
> which is ultimately still tied to a MiniAccumuloClusterImpl.
> MiniAccumuloCluster *requires* a MiniAccumuloClusterImpl or something that
> extends it. This is what's really chafing me about the separation of
> "accumulo user" and "accumulo developer" methods - you *always* have them
> both. Not to mention, this hierarchy is really obnoxious to create a new
> implementation of AccumuloMiniCluster(Impl) because I have to carry all of
> the cruft of the "original" implementation with me.
>
> Bringing this back around to this ticket, while I still don't agree with
> the reasoning that exposing the FileSystem or ZooKeeper object that
> MiniAccumuloClusterImpl is getting us anything other than the ability to
> say "we didn't change this [arbitrary] API". For "users" who might not care
> what the underlying FileSystem or ZooKeeper connection, it's merely an
> extra two items in their editor's code-completion. For "users" who would
> care to use this information, we now make them jump through extra hoops to
> get it. That just doesn't make any sense to me for something we haven't
> even released.
>
> To be honest, I really want to re-open
> ACCUMULO-2151<https://issues.apache.org/jira/browse/ACCUMULO-2151>,
> make MiniAccumuloCluster an interface, MiniAccumuloClusterImpl an
> implementation of said interface, and create some factory class to make
> instances, ala Connector.tableOperations, Connector.securityOperations,
> etc. Right now there's a class we call an "API" that cannot be generically
> extended for the sake of saying "we have an API".
>
> ----
>
> I wanted to avoid having a drawn out discussion on a jira, where folks may
> not notice it. Especially with things being late in 1.6.0 development and
> the potential this has to impact the API.
>
> Personally, I don't have much of a dog in the fight. There's always some
> arbitrary line for where the public API will be, presuming we want to have
> any kind of balance between providing a stable base for others to build on
> and being able to refactor things. I would like us to hold to our API
> promises[2] and I would rather we not leak implementation details
> unnecessarily.
>
> I suspect the choice to make MiniAccumuloCluster a class rather than an
> interface with a factory was driven by the restrictions we put on API
> changes between major versions and the fact that 1.5 had a class you could
> instantiate via constructors[3].
>
> It's possible we can address some of the major reusability concerns by
> moving most of the implementation back into MAC, liberally using package
> access for members, and making the internal-use MAC extend the public one.
>
>
> [1]: https://issues.apache.org/jira/browse/ACCUMULO-2143
> [2]: http://accumulo.apache.org/governance/releasing.html
> [3]: https://issues.apache.org/jira/browse/ACCUMULO-2151
>