Posted to dev@fluo.apache.org by Arvind Shyamsundar <ar...@microsoft.com.INVALID> on 2022/01/05 23:45:01 UTC

RE: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

Hi Keith,
Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform. Supporting a choice of OS platforms is hard, and the effort outweighs the rewards. While selecting the OS, we could consider the following (this is just off the top of my head; there may be more aspects to consider):

- End-of-life / LTS status for that OS.
- Licensing and commercial implications if any.
- Ready and wide availability of image(s) for the cloud providers (Azure, EC2).
- We should not require the user of Muchos to have to build their own image.
- Those cloud provider image(s) should support cloud-init.
- Those cloud provider image(s) should be updated by the cloud provider (or partners thereof) on a reasonably regular (quarterly or more frequent?) basis, so that they include as many security patches as possible out of the box.
- The OS should be reasonably "stable" in that it should not increase burden on the user of Muchos by introducing any regressions (functional / perf) of its own origin.
- There should be ready availability of OpenJDK binary distribution for that OS.
... etc.

BTW, I did some work in December to evaluate Ubuntu 20.04 as a candidate OS for the Muchos cluster nodes on Azure. In the proof-of-concept that I did, the changes to the Muchos code base seemed to be contained to yum packages, OpenJDK package names, and similar relatively minor changes. The downstream playbooks for Zookeeper, Hadoop, Accumulo "just worked".

I hope this helps.

Arvind.

-----Original Message-----
From: Keith Turner <ke...@deenlo.com> 
Sent: Wednesday, January 5, 2022 2:00 PM
To: fluo-dev <de...@fluo.apache.org>
Subject: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

We need to move Muchos from CentOS 7 to another OS.  I am not completely sure how we should proceed.  Below is a possible process we could follow to make this change.

 * We collaborate to create a list of candidate OSes with pros and cons for each OS.  I am not sure where it is best to do this.  I don't think a mailing list is the best place for this collaboration; however, it's nice for the collaboration to be recorded on the mailing list.
 * We hold a vote (possibly iterative voting or ranked voting) to try to reach consensus on one of the candidate OSes we would like Muchos to use.
 * After a candidate is selected we can create a branch and start submitting PRs to move Muchos to that new OS in the branch.  When it's functional we can merge that branch into main.

Can anyone think of another process we could follow to move Muchos to another OS? Does anyone object to the process proposed above or have suggestions to improve it?

Keith
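
As an illustration of the "ranked voting" option in the process above, here is a minimal sketch of an instant-runoff tally in Python. It is only one possible way to run such a vote, not a procedure the project has adopted. The function name, the tie-breaking rule (by name), and the candidate labels "os-a", "os-b", and "os-c" are all hypothetical, and the ballots are made up for the example.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winner of a ranked-choice (instant-runoff) vote.

    Each ballot is a list of candidate names, most preferred first.
    Returns None if no ballot names any candidate.
    """
    remaining = {name for ballot in ballots for name in ballot}
    while remaining:
        # Count each ballot for its highest-ranked candidate still in the race.
        tally = Counter({name: 0 for name in remaining})
        for ballot in ballots:
            top = next((name for name in ballot if name in remaining), None)
            if top is not None:
                tally[top] += 1
        total = sum(tally.values())
        leader, leader_votes = max(tally.items(), key=lambda kv: (kv[1], kv[0]))
        # A strict majority (or being the last candidate left) wins.
        if leader_votes * 2 > total or len(remaining) == 1:
            return leader
        # Otherwise eliminate the candidate with the fewest votes
        # (ties broken by name, an arbitrary choice for this sketch).
        loser = min(tally.items(), key=lambda kv: (kv[1], kv[0]))[0]
        remaining.discard(loser)
    return None


# Made-up ballots with placeholder candidate names (not real votes).
example_ballots = [
    ["os-a", "os-c", "os-b"],
    ["os-b", "os-c", "os-a"],
    ["os-c", "os-b", "os-a"],
    ["os-b", "os-c", "os-a"],
    ["os-a", "os-c", "os-b"],
]
```

With these ballots no candidate has a majority in the first round, so the candidate with the fewest first choices is eliminated and that ballot moves to its next preference.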

Re: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

Posted by Keith Turner <ke...@deenlo.com>.
The PR https://github.com/apache/fluo-muchos/pull/429 is relevant to
this discussion.

On Thu, Jan 6, 2022 at 1:53 PM Christopher <ct...@apache.org> wrote:
>
> On Thu, Jan 6, 2022 at 11:31 AM Keith Turner <ke...@deenlo.com> wrote:
> >
> > The way I use Muchos I have never really cared too much what the OS is
> > from the perspective of testing and experimenting.  I use Muchos to
> > experiment with and test Fluo and Accumulo features and functionality
> > at scale.  When I use Muchos I hope to spend as little time as
> > possible working on Muchos itself and as much time as possible working
> > on the experiment/test I am trying to run.  From the perspective of
> > saving time, I care a lot about the OS for two reasons. First I would
> > prefer something that's stable (like RHEL derivatives or Ubuntu LTS
> > releases)  so that when I do use Muchos it's less likely that I have
> > to spend time dealing with a change in the OS that breaks something.
> > Second, if we all decide to make Muchos work with a single default OS,
> > then we can all benefit from each other's work saving time for all of
> > us.
> >
> > > I am interpreting that to mean "what do we
> > > want the default image to be?"
> >
> > Yes, that is what I am thinking.
> >
> > > Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform.
> >
> > I also agree with this as I think it will save time for using Muchos
> > to test Fluo and Accumulo at scale.
> >
> > >  So, "[supporting] ... one final OS
> > > target platform" doesn't make sense to me if we have developers
> > > needing to use it to test on several different OSes because they use
> > > different OSes in production.
> >
> > We should definitely steer clear of using terminology like "supporting
> > OS X".  There is never any guarantee that Muchos will work when
> > someone uses it with the default OS it's using, much less some other
> > OS.
> >
> > > I expect it
> > > to work today without any changes on a RHEL7 image, or a RHEL8 or even
> > > CentOS Stream 8 image, and would be surprised if it didn't.
> >
> > I would be surprised if the current version of Muchos worked w/ no
> > changes on RHEL8.  But I have no idea if it would or would not work.
> >
>
> RHEL7 and RHEL8 are downstream of Fedora, and there haven't been
> substantial changes in Fedora over that period that I know of that
> would have negatively impacted Muchos. I tried to set up Fedora today,
> and the main thing I had to change was to disable the attempts to
> install the 'epel-release' yum repository. For those not familiar,
> that repository basically contains packages from Fedora that Red Hat
> didn't package in RHEL, so as a workaround, you have to install them
> from the Fedora-maintained "EPEL" repo. So, that repository isn't
> needed when deploying to Fedora, since it already has the packages in
> its main repos. Trivially, I also had to add collectd-disk to the list
> of installed packages, because it wasn't installed by default. I also
> changed the default user to "fedora" from "centos" in the
> muchos.props, but that's AMI-specific user config anyway.
>
> While I was trying out Fedora, I saw that we already have some logic
> in there to support CentOS/RHEL 8. I did not test that.
>
> > > I largely concur with the list of aspects to consider that Arvind has provided
> >
> > I also agree with that list.
> >
> > > There should be ready availability of OpenJDK binary distribution for that OS.
> >
> > Ideally the new default OS would make OpenJDK 8, 11, and 17 easily
> > installable from its package manager.
>
> Fedora supports these Java LTS versions in parallel, with the same
> package names that CentOS/RHEL use. I use Fedora for development and
> routinely switch between these versions as needed, simply by changing
> the JAVA_HOME environment variable in my bashrc and updating my PATH.
>
> >
> > > - we should provide a reasonable default, but avoid doing things in
> > > Muchos that would tightly couple us to a particular image, so that
> > > users can easily configure a different image and have reasonable
> > > expectations that Muchos will work with little or very minor tweaks
> >
> > As each code change is code reviewed, if someone notices something
> > that would make using another OS really hard they can mention it in
> > the review.  Personally I would not be interested in spending time on
> > making the statement "users can easily configure a different image and
> > have reasonable expectations that Muchos will work with little or very
> > minor tweaks" a reality because I can't see how to achieve it w/o
> > testing against other OSes (besides the default) periodically.  A
>
> I agree that we shouldn't put time/effort into that. My argument
> here is that I think we get this reasonable expectation naturally by
> sticking to something in the same RPM-based EL family of operating
> systems that we currently use and which currently meets our needs,
> because these versions upstream and downstream of RHEL aren't
> substantially different from each other. This is how we've been able
> to bump through the CentOS7 releases 7.1, 7.2, 7.3, etc. so easily.
>
> > slightly different way to approach this may be to accept changes to
> > Muchos to make it work with an OS besides the default OS as long as
> > those do not introduce an excessive maintenance burden. I think it
> > would be nice to avoid introducing complexity and processes to support
> > multiple OSes that do not improve the ability to test Fluo and
> > Accumulo functionality at scale.
> >
> > > My preferred default would be Fedora
> >
> > I suspect making something like Fedora, a non-LTS Ubuntu version, or
> > CentOS Stream the default OS will lead to more time spent dealing with
> > OS changes that take time away from testing/experimenting with
> > Accumulo and Fluo at scale. Given this I would prefer a RHEL 8
> > derivative or Ubuntu LTS (20.04 or 22.04) as the new default OS.
>
> I've been using Fedora for 10+ years of Accumulo development and have
> rarely, if ever, encountered a substantial deviation from CentOS/RHEL
> behavior that required I take any time away from testing/experimenting
> with Accumulo to address it, *except* for those rare instances where
> we got advance warning of something that would soon impact CentOS
> because of it being patched in Fedora first. In fact, it's been easier
> to do development in Fedora for Java, because Fedora has made newer
> Java releases available much more regularly and predictably than on
> CentOS. In spite of its reputation as "bleeding edge" (which I think
> only applies to the rawhide branch of Fedora anyway, which I never
> use), Fedora is just as stable as any of the downstream CentOS/RHEL
> distros that build off it. The user experience with Fedora should be
> identical, or nearly so, to any RHEL8 derivative, since they're in the
> same family. However, Fedora receives security updates more regularly,
> and more rapidly, since it is upstream of those.
>
> I would strongly prefer *not* using a Debian-based derivative as the
> default, because the user experience will be substantially different
> than any of the RPM-based distros used in any of the actual production
> deployments of Accumulo I'm familiar with, and defaulting to Ubuntu
> would make packages and other conventions no longer work out of the
> box, without at least a little effort, for EL-based distros, if
> somebody wanted to test at scale on an EL OS they might use in
> production.
>
> I'd be happy with a RHEL8 downstream distro if there was one that
> stood out as ubiquitous. Currently, I think Alma and Rocky are the two
> main competitors, but I don't know if either is available on Azure. I
> also don't know if either has the infrastructure to support regular
> releases and security updates yet. This is why I suggested Fedora. It
> does have these things, and is as close as we can get to RHEL8,
> without paying RHN subscription fees.
>
> >
> > > - we should focus on OS/OS families actually used by devs in their
> > > Accumulo/Fluo testing environments
> >
> > This is hard to know w/ certainty.  That is why I was thinking voting
> > on a list of candidates for the new default OS might be good. When
> > people vote they could consider this among many other things.
>
> Right. By this, all I meant was that Muchos is our tool, and it should
> meet our dev needs, rather than us trying to cater to what some
> external non-dev user might hypothetically want to use if they had
> their ideal pick.
>
> >
> > > - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> > > tested with anything else (sorry BSD users)
> >
> > Yeah, it should definitely be a GNU/Linux OS
> >
> > On Wed, Jan 5, 2022 at 11:32 PM Christopher <ct...@apache.org> wrote:
> > >
> > > Thanks for bringing this up for discussion, Keith!
> > >
> > > One point I'd like us to be clear about for the purposes of this
> > > discussion: when we say "move from Centos 7 to another OS", I want us
> > > to be clear that Muchos is only "on" CentOS 7 as a default configured
> > > image. Muchos *should* be able to run on any similar OS, by merely
> > > changing the image it is configured to use. For example, I expect it
> > > to work today without any changes on a RHEL7 image, or a RHEL8 or even
> > > CentOS Stream 8 image, and would be surprised if it didn't. So, when
> > > we talk about "moving", I am interpreting that to mean "what do we
> > > want the default image to be?" If we mean something other than
> > > selecting the default, then I'm not sure what we're talking about and
> > > would appreciate clarification.
> > >
> > > > The intent of Muchos, and the reason it continues to fall under
> > > the purview of the Fluo PMC, was for the Fluo developers to have a way
> > > to quickly deploy a test cluster of Accumulo and Fluo. While it may be
> > > used by some outside that scope, we should not lose sight of that
> > > intent. Given that purpose, it becomes clear that Muchos needs to be
> > > able to deploy on an operating system that the Accumulo and Fluo
> > > developers are actually using to test on to anticipate problems in
> > > their own production clusters.  So, "[supporting] ... one final OS
> > > target platform" doesn't make sense to me if we have developers
> > > needing to use it to test on several different OSes because they use
> > > different OSes in production.
> > >
> > > > From Arvind's experience testing Ubuntu, and my own expectations using
> > > > Fedora and other Fedora-derived RPM-based distros like RHEL and
> > > > CentOS, I think it's likely that the biggest incompatibilities
> > > > preventing use of an arbitrary AMI are going to be the package names,
> > > and a few filesystem layout conventions. So, I'm not too worried about
> > > choosing one default and users still being able to configure a
> > > different one that they need to test on. As long as the differences
> > > between major OS families we use are minor, we can all still use the
> > > same Muchos to test with, by baking in some conditional logic, or
> > > making some aspects more configurable.
> > >
> > > I largely concur with the list of aspects to consider that Arvind has
> > > provided, but would add:
> > > - we should provide a reasonable default, but avoid doing things in
> > > Muchos that would tightly couple us to a particular image, so that
> > > users can easily configure a different image and have reasonable
> > > expectations that Muchos will work with little or very minor tweaks
> > > - we should focus on OS/OS families actually used by devs in their
> > > Accumulo/Fluo testing environments
> > > - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> > > tested with anything else (sorry BSD users)
> > >
> > > My preferred default would be Fedora, because it is upstream of
> > > RHEL/CentOS/Rocky/Alma/etc., unless one of the RHEL downstream CentOS
> > > replacements becomes a de facto standard replacement for CentOS,
> > > because I'm likely going to need some kind of EL or upstream-to-EL
> > > testing for the enterprise users I support at $dayjob. However, given
> > > the ubiquity of Ubuntu among Debian varieties, I think it's reasonable
> > > to expect things to work more-or-less out of the box if the user
> > > configures a modern Ubuntu image.
> > >
> > > Related topic: would it make more sense to migrate Muchos away from
> > > Ansible, and use something that is maybe a little less hard-coded
> > > scripts, and a little more flexible user config? I'm thinking
> > > something like https://github.com/hashicorp/terraform#readme ; they
> > > have examples with cloud-init
> > > https://learn.hashicorp.com/tutorials/terraform/cloud-init ; Terraform
> > > is extremely well documented, and it seems to me that a few easily
> > > user-editable config templates could replace Muchos and support both
> > > EC2 and Azure easily. It's probably a bigger change than when Muchos
> > > moved to ansible from the previous scripts, but long-term, it seems
> > > better than trying to maintain our own provisioning code that's as
> > > flexible. However, I don't have much experience with Terraform.
> > >
> > > On Wed, Jan 5, 2022 at 6:45 PM Arvind Shyamsundar
> > > <ar...@microsoft.com.invalid> wrote:
> > > >
> > > > Hi Keith,
> > > > Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform. Supporting a choice of OS platforms is hard, and the effort outweighs the rewards. While selecting the OS, we could consider the following (this is just off the top of my head; there may be more aspects to consider):
> > > >
> > > > - End-of-life / LTS status for that OS.
> > > > - Licensing and commercial implications if any.
> > > > - Ready and wide availability of image(s) for the cloud providers (Azure, EC2).
> > > > - We should not require the user of Muchos to have to build their own image.
> > > > - Those cloud provider image(s) should support cloud-init.
> > > > - Those cloud provider image(s) should be updated by the cloud provider (or partners thereof) on a reasonably regular (quarterly or more frequent?) basis, so that they include as many security patches as possible out of the box.
> > > > - The OS should be reasonably "stable" in that it should not increase burden on the user of Muchos by introducing any regressions (functional / perf) of its own origin.
> > > > - There should be ready availability of OpenJDK binary distribution for that OS.
> > > > ... etc.
> > > >
> > > > BTW, I did some work in December to evaluate Ubuntu 20.04 as a candidate OS for the Muchos cluster nodes on Azure. In the proof-of-concept that I did, the changes to the Muchos code base seemed to be contained to yum packages, OpenJDK package names, and similar relatively minor changes. The downstream playbooks for Zookeeper, Hadoop, Accumulo "just worked".
> > > >
> > > > I hope this helps.
> > > >
> > > > Arvind.
> > > >
> > > > -----Original Message-----
> > > > From: Keith Turner <ke...@deenlo.com>
> > > > Sent: Wednesday, January 5, 2022 2:00 PM
> > > > To: fluo-dev <de...@fluo.apache.org>
> > > > Subject: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS
> > > >
> > > > We need to move Muchos from CentOS 7 to another OS.  I am not completely sure how we should proceed.  Below is a possible process we could follow to make this change.
> > > >
> > > >  * We collaborate to create a list of candidate OSes with pros and cons for each OS.  I am not sure where it is best to do this.  I don't think a mailing list is the best place for this collaboration; however, it's nice for the collaboration to be recorded on the mailing list.
> > > >  * We hold a vote (possibly iterative voting or ranked voting) to try to reach consensus on one of the candidate OSes we would like Muchos to use.
> > > >  * After a candidate is selected we can create a branch and start submitting PRs to move Muchos to that new OS in the branch.  When it's functional we can merge that branch into main.
> > > >
> > > > Can anyone think of another process we could follow to move Muchos to another OS? Does anyone object to the process proposed above or have suggestions to improve it?
> > > >
> > > > Keith

Re: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

Posted by Christopher <ct...@apache.org>.
On Thu, Jan 6, 2022 at 11:31 AM Keith Turner <ke...@deenlo.com> wrote:
>
> The way I use Muchos I have never really cared too much what the OS is
> from the perspective of testing and experimenting.  I use Muchos to
> experiment with and test Fluo and Accumulo features and functionality
> at scale.  When I use Muchos I hope to spend as little time as
> possible working on Muchos itself and as much time as possible working
> on the experiment/test I am trying to run.  From the perspective of
> saving time, I care a lot about the OS for two reasons. First I would
> prefer something that's stable (like RHEL derivatives or Ubuntu LTS
> releases)  so that when I do use Muchos it's less likely that I have
> to spend time dealing with a change in the OS that breaks something.
> Second, if we all decide to make Muchos work with a single default OS,
> then we can all benefit from each other's work saving time for all of
> us.
>
> > I am interpreting that to mean "what do we
> > want the default image to be?"
>
> Yes, that is what I am thinking.
>
> > Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform.
>
> I also agree with this as I think it will save time for using Muchos
> to test Fluo and Accumulo at scale.
>
> >  So, "[supporting] ... one final OS
> > target platform" doesn't make sense to me if we have developers
> > needing to use it to test on several different OSes because they use
> > different OSes in production.
>
> We should definitely steer clear of using terminology like "supporting
> OS X".  There is never any guarantee that Muchos will work when
> someone uses it with the default OS it's using, much less some other
> OS.
>
> > I expect it
> > to work today without any changes on a RHEL7 image, or a RHEL8 or even
> > CentOS Stream 8 image, and would be surprised if it didn't.
>
> I would be surprised if the current version of Muchos worked w/ no
> changes on RHEL8.  But I have no idea if it would or would not work.
>

RHEL7 and RHEL8 are downstream of Fedora, and there haven't been
substantial changes in Fedora over that period that I know of that
would have negatively impacted Muchos. I tried to set up Fedora today,
and the main thing I had to change was to disable the attempts to
install the 'epel-release' yum repository. For those not familiar,
that repository basically contains packages from Fedora that Red Hat
didn't package in RHEL, so as a workaround, you have to install them
from the Fedora-maintained "EPEL" repo. So, that repository isn't
needed when deploying to Fedora, since it already has the packages in
its main repos. Trivially, I also had to add collectd-disk to the list
of installed packages, because it wasn't installed by default. I also
changed the default user to "fedora" from "centos" in the
muchos.props, but that's AMI-specific user config anyway.
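
The Fedora adjustments described above amount to a small piece of per-distro conditional logic. The following Python sketch shows that logic in isolation. It is not Muchos code (Muchos expresses this in Ansible), and the function name, the distro identifiers, and the base package list passed in are assumptions made for illustration; only 'epel-release' and 'collectd-disk' come from the description above.

```python
# Hypothetical set of distro IDs treated as "EL family" for this sketch.
EL_FAMILY = {"centos", "rhel", "rocky", "almalinux"}


def plan_packages(distro, base_packages):
    """Return (extra_repos, packages) to install for the given distro ID."""
    distro = distro.lower()
    repos = []
    packages = list(base_packages)
    if distro in EL_FAMILY:
        # EL distros need EPEL for packages that are not shipped in RHEL.
        repos.append("epel-release")
    elif distro == "fedora":
        # Fedora already has those packages in its main repos, but
        # collectd-disk is not installed by default, so add it explicitly.
        if "collectd-disk" not in packages:
            packages.append("collectd-disk")
    else:
        raise ValueError(f"no package plan for distro: {distro}")
    return repos, packages
```

For example, `plan_packages("fedora", ["collectd"])` skips the EPEL repository and adds collectd-disk, while `plan_packages("centos", ["collectd"])` keeps the package list unchanged and requests 'epel-release'.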

While I was trying out Fedora, I saw that we already have some logic
in there to support CentOS/RHEL 8. I did not test that.

> > I largely concur with the list of aspects to consider that Arvind has provided
>
> I also agree with that list.
>
> > There should be ready availability of OpenJDK binary distribution for that OS.
>
> Ideally the new default OS would make OpenJDK 8, 11, and 17 easily
> installable from its package manager.

Fedora supports these Java LTS versions in parallel, with the same
package names that CentOS/RHEL use. I use Fedora for development and
routinely switch between these versions as needed, simply by changing
the JAVA_HOME environment variable in my bashrc and updating my PATH.
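
To make the JAVA_HOME/PATH switch concrete, here is a minimal Python sketch that builds an environment selecting a given JDK. The message above describes doing this in a bashrc; this is just the same idea expressed as a function. The function name is hypothetical, and any JDK path passed to it (for example a directory under /usr/lib/jvm) is an assumption about where the packages are installed.

```python
import os


def env_for_jdk(java_home, base_env=None):
    """Return a copy of the environment that selects the JDK at java_home."""
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_HOME"] = java_home
    java_bin = os.path.join(java_home, "bin")
    # Put the chosen JDK's bin directory first and drop any stale copy of it.
    old_path = [
        p for p in env.get("PATH", "").split(os.pathsep) if p and p != java_bin
    ]
    env["PATH"] = os.pathsep.join([java_bin] + old_path)
    return env
```

The returned mapping can be passed as the `env` argument of `subprocess.run` to launch a build or test with that JDK without changing the current shell.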

>
> > - we should provide a reasonable default, but avoid doing things in
> > Muchos that would tightly couple us to a particular image, so that
> > users can easily configure a different image and have reasonable
> > expectations that Muchos will work with little or very minor tweaks
>
> As each code change is code reviewed, if someone notices something
> that would make using another OS really hard they can mention it in
> the review.  Personally I would not be interested in spending time on
> making the statement "users can easily configure a different image and
> have reasonable expectations that Muchos will work with little or very
> minor tweaks" a reality because I can't see how to achieve it w/o
> testing against other OSes (besides the default) periodically.  A

I agree that we shouldn't put time/effort into that. My argument
here is that I think we get this reasonable expectation naturally by
sticking to something in the same RPM-based EL family of operating
systems that we currently use and which currently meets our needs,
because these versions upstream and downstream of RHEL aren't
substantially different from each other. This is how we've been able
to bump through the CentOS7 releases 7.1, 7.2, 7.3, etc. so easily.

> slightly different way to approach this may be to accept changes to
> Muchos to make it work with an OS besides the default OS as long as
> those do not introduce an excessive maintenance burden. I think it
> would be nice to avoid introducing complexity and processes to support
> multiple OSes that do not improve the ability to test Fluo and
> Accumulo functionality at scale.
>
> > My preferred default would be Fedora
>
> I suspect making something like Fedora, a non-LTS Ubuntu version, or
> CentOS Stream the default OS will lead to more time spent dealing with
> OS changes that take time away from testing/experimenting with
> Accumulo and Fluo at scale. Given this I would prefer a RHEL 8
> derivative or Ubuntu LTS (20.04 or 22.04) as the new default OS.

I've been using Fedora for 10+ years of Accumulo development and have
rarely, if ever, encountered a substantial deviation from CentOS/RHEL
behavior that required I take any time away from testing/experimenting
with Accumulo to address it, *except* for those rare instances where
we got advance warning of something that would soon impact CentOS
because of it being patched in Fedora first. In fact, it's been easier
to do development in Fedora for Java, because Fedora has made newer
Java releases available much more regularly and predictably than on
CentOS. In spite of its reputation as "bleeding edge" (which I think
only applies to the rawhide branch of Fedora anyway, which I never
use), Fedora is just as stable as any of the downstream CentOS/RHEL
distros that build off it. The user experience with Fedora should be
identical, or nearly so, to any RHEL8 derivative, since they're in the
same family. However, Fedora receives security updates more regularly,
and more rapidly, since it is upstream of those.

I would strongly prefer *not* using a Debian-based derivative as the
default, because the user experience will be substantially different
than any of the RPM-based distros used in any of the actual production
deployments of Accumulo I'm familiar with, and defaulting to Ubuntu
would make packages and other conventions no longer work out of the
box, without at least a little effort, for EL-based distros, if
somebody wanted to test at scale on an EL OS they might use in
production.

I'd be happy with a RHEL8 downstream distro if there was one that
stood out as ubiquitous. Currently, I think Alma and Rocky are the two
main competitors, but I don't know if either is available on Azure. I
also don't know if either has the infrastructure to support regular
releases and security updates yet. This is why I suggested Fedora. It
does have these things, and is as close as we can get to RHEL8,
without paying RHN subscription fees.

>
> > - we should focus on OS/OS families actually used by devs in their
> > Accumulo/Fluo testing environments
>
> This is hard to know w/ certainty.  That is why I was thinking voting
> on a list of candidates for the new default OS might be good. When
> people vote they could consider this among many other things.

Right. By this, all I meant was that Muchos is our tool, and it should
meet our dev needs, rather than us trying to cater to what some
external non-dev user might hypothetically want to use if they had
their ideal pick.

>
> > - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> > tested with anything else (sorry BSD users)
>
> Yeah, it should definitely be a GNU/Linux OS
>
> On Wed, Jan 5, 2022 at 11:32 PM Christopher <ct...@apache.org> wrote:
> >
> > Thanks for bringing this up for discussion, Keith!
> >
> > One point I'd like us to be clear about for the purposes of this
> > discussion: when we say "move from Centos 7 to another OS", I want us
> > to be clear that Muchos is only "on" CentOS 7 as a default configured
> > image. Muchos *should* be able to run on any similar OS, by merely
> > changing the image it is configured to use. For example, I expect it
> > to work today without any changes on a RHEL7 image, or a RHEL8 or even
> > CentOS Stream 8 image, and would be surprised if it didn't. So, when
> > we talk about "moving", I am interpreting that to mean "what do we
> > want the default image to be?" If we mean something other than
> > selecting the default, then I'm not sure what we're talking about and
> > would appreciate clarification.
> >
> > The intent of Muchos, and the reason it continues to fall under
> > the purview of the Fluo PMC, was for the Fluo developers to have a way
> > to quickly deploy a test cluster of Accumulo and Fluo. While it may be
> > used by some outside that scope, we should not lose sight of that
> > intent. Given that purpose, it becomes clear that Muchos needs to be
> > able to deploy on an operating system that the Accumulo and Fluo
> > developers are actually using to test on to anticipate problems in
> > their own production clusters.  So, "[supporting] ... one final OS
> > target platform" doesn't make sense to me if we have developers
> > needing to use it to test on several different OSes because they use
> > different OSes in production.
> >
> > From Arvind's experience testing Ubuntu, and my own expectations using
> > Fedora and other Fedora-derived RPM-based distros like RHEL and
> > CentOS, I think it's likely that the biggest incompatibilities
> > preventing use of an arbitrary AMI are going to be the package names,
> > and a few filesystem layout conventions. So, I'm not too worried about
> > choosing one default and users still being able to configure a
> > different one that they need to test on. As long as the differences
> > between major OS families we use are minor, we can all still use the
> > same Muchos to test with, by baking in some conditional logic, or
> > making some aspects more configurable.
> >
> > I largely concur with the list of aspects to consider that Arvind has
> > provided, but would add:
> > - we should provide a reasonable default, but avoid doing things in
> > Muchos that would tightly couple us to a particular image, so that
> > users can easily configure a different image and have reasonable
> > expectations that Muchos will work with little or very minor tweaks
> > - we should focus on OS/OS families actually used by devs in their
> > Accumulo/Fluo testing environments
> > - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> > tested with anything else (sorry BSD users)
> >
> > My preferred default would be Fedora, because it is upstream of
> > RHEL/CentOS/Rocky/Alma/etc., unless one of the RHEL downstream CentOS
> > replacements becomes a de facto standard replacement for CentOS,
> > because I'm likely going to need some kind of EL or upstream-to-EL
> > testing for the enterprise users I support at $dayjob. However, given
> > the ubiquity of Ubuntu among Debian varieties, I think it's reasonable
> > to expect things to work more-or-less out of the box if the user
> > configures a modern Ubuntu image.
> >
> > Related topic: would it make more sense to migrate Muchos away from
> > Ansible, and use something that is maybe a little less hard-coded
> > scripts, and a little more flexible user config? I'm thinking
> > something like https://github.com/hashicorp/terraform#readme ; they
> > have examples with cloud-init
> > https://learn.hashicorp.com/tutorials/terraform/cloud-init ; Terraform
> > is extremely well documented, and it seems to me that a few easily
> > user-editable config templates could replace Muchos and support both
> > EC2 and Azure easily. It's probably a bigger change than when Muchos
> > moved to ansible from the previous scripts, but long-term, it seems
> > better than trying to maintain our own provisioning code that's as
> > flexible. However, I don't have much experience with Terraform.
> >
> > On Wed, Jan 5, 2022 at 6:45 PM Arvind Shyamsundar
> > <ar...@microsoft.com.invalid> wrote:
> > >
> > > Hi Keith,
> > > Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform. Supporting a choice of OS platforms is hard and more effort than rewards. While selecting the OS, we could consider the following (this is just my top-of-mind, there may be more aspects to consider):
> > >
> > > - End-of-life / LTS status for that OS.
> > > - Licensing and commercial implications if any.
> > > - Ready and wide availability of image(s) for the cloud providers (Azure, EC2).
> > > - We should not require the user of Muchos to have to build their own image.
> > > - Those cloud provider image(s) should support cloud-init.
> > > - Those cloud provider image(s) should be updated by the cloud provider (or partners thereof) on a reasonably regular (quarterly or more frequent?) basis to try to include as many security patches out-of-the-box.
> > > - The OS should be reasonably "stable" in that it should not increase burden on the user of Muchos by introducing any regressions (functional / perf) of its own origin.
> > > - There should be ready availability of OpenJDK binary distribution for that OS.
> > > ... etc.
> > >
> > > BTW, I did some work in December to evaluate Ubuntu 20.04 as a candidate OS for the Muchos cluster nodes on Azure. In the proof-of-concept that I did, the changes to the Muchos code base seemed to be contained to yum packages, OpenJDK package names, and similar relatively minor changes. The downstream playbooks for Zookeeper, Hadoop, Accumulo "just worked".
> > >
> > > I hope this helps.
> > >
> > > Arvind.
> > >
> > > -----Original Message-----
> > > From: Keith Turner <ke...@deenlo.com>
> > > Sent: Wednesday, January 5, 2022 2:00 PM
> > > To: fluo-dev <de...@fluo.apache.org>
> > > Subject: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS
> > >
> > > We need to move Muchos from Centos 7 to another OS.  Not completely sure how we should  proceed.  Below is a possible process we could follow to make this change.
> > >
> > >  * We collaborate to create a list of candidate OSes with pros and cons for each OS, not sure where is best to do this.  I don't think a mailing list is the best place for this collaboration, however it's nice for the collaboration to be recorded on the mailing list.
> > >  * We hold a vote (possibly iterative voting or ranked voting) to try to reach consensus on one of the candidate OSes we would like Muchos to use.
> > >  * After a candidate is selected we can create a branch and start submitting PRs to move Muchos to that new OS in the branch.  When it's functional we can merge that branch into main.
> > >
> > > Can anyone think of another process we could follow to move Muchos to another OS? Does anyone object to the process proposed above or have suggestions to improve it?
> > >
> > > Keith

Re: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

Posted by Keith Turner <ke...@deenlo.com>.
The way I use Muchos I have never really cared too much what the OS is
from the perspective of testing and experimenting.  I use Muchos to
experiment with and test Fluo and Accumulo features and functionality
at scale.  When I use Muchos I hope to spend as little time as
possible working on Muchos itself and as much time as possible working
on the experiment/test I am trying to run.  From the perspective of
saving time, I care a lot about the OS for two reasons. First, I would
prefer something that's stable (like RHEL derivatives or Ubuntu LTS
releases) so that when I do use Muchos it's less likely that I have
to spend time dealing with a change in the OS that breaks something.
Second, if we all decide to make Muchos work with a single default OS,
then we can all benefit from each other's work, saving time for all of
us.

> I am interpreting that to mean "what do we
> want the default image to be?"

Yes, that is what I am thinking.

> Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform.

I also agree with this, as I think it will save time when using Muchos
to test Fluo and Accumulo at scale.

>  So, "[supporting] ... one final OS
> target platform" doesn't make sense to me if we have developers
> needing to use it to test on several different OSes because they use
> different OSes in production.

We should definitely steer clear of using terminology like "supporting
OS X".  There is never any guarantee that Muchos will work even with
its default OS, much less with some other OS.

> I expect it
> to work today without any changes on a RHEL7 image, or a RHEL8 or even
> CentOS Stream 8 image, and would be surprised if it didn't.

I would be surprised if the current version of Muchos worked w/ no
changes on RHEL8.  But I have no idea if it would or would not work.

> I largely concur with the list of aspects to consider that Arvind has provided

I also agree with that list.

> There should be ready availability of OpenJDK binary distribution for that OS.

Ideally, the new default OS would make OpenJDK 8, 11, and 17 easily
installable from its package manager.
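
To make that concrete, here is a minimal sketch of a lookup from OS
and Java version to JDK package name. It is not Muchos code: the
function name `openjdk_package` and the keys "rhel8" and "ubuntu" are
hypothetical, and the package names are the ones the RHEL 8 family and
recent Ubuntu releases are believed to use. Whether all three versions
are actually available on a given release would still need checking.

```python
# Hypothetical lookup table: OS label -> Java major version -> JDK package name.
# The labels "rhel8" and "ubuntu" are illustrative and not taken from Muchos.
OPENJDK_PACKAGES = {
    "rhel8": {
        8: "java-1.8.0-openjdk-devel",
        11: "java-11-openjdk-devel",
        17: "java-17-openjdk-devel",
    },
    "ubuntu": {
        8: "openjdk-8-jdk",
        11: "openjdk-11-jdk",
        17: "openjdk-17-jdk",
    },
}


def openjdk_package(os_label: str, java_version: int) -> str:
    """Return the JDK package name for an OS label and Java major version."""
    try:
        return OPENJDK_PACKAGES[os_label][java_version]
    except KeyError:
        # Fail loudly rather than guessing a package name.
        raise ValueError(
            f"no known OpenJDK {java_version} package for {os_label!r}"
        ) from None
```

A table like this keeps the OS-specific names in one place, so a
candidate OS can be checked against the three Java versions before
anyone commits to it.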

> - we should provide a reasonable default, but avoid doing things in
> Muchos that would tightly couple us to a particular image, so that
> users can easily configure a different image and have reasonable
> expectations that Muchos will work with little or very minor tweaks

As each code change is code reviewed, if someone notices something
that would make using another OS really hard they can mention it in
the review.  Personally I would not be interested in spending time on
making the statement "users can easily configure a different image and
have reasonable expectations that Muchos will work with little or very
minor tweaks" a reality because I can't see how to achieve it w/o
testing against other OSes (besides the default) periodically.  A
slightly different way to approach this may be to accept changes to
Muchos to make it work with an OS besides the default OS as long as
those changes do not introduce an excessive maintenance burden. I think it
would be nice to avoid introducing complexity and processes to support
multiple OSes that do not improve the ability to test Fluo and
Accumulo functionality at scale.

> My preferred default would be Fedora

I suspect making something like Fedora, a non-LTS Ubuntu version, or
CentOS Stream the default OS will lead to more time spent dealing with
OS changes, which takes time away from testing/experimenting with
Accumulo and Fluo at scale. Given this, I would prefer a RHEL 8
derivative or Ubuntu LTS (20.04 or 22.04) as the new default OS.

> - we should focus on OS/OS families actually used by devs in their
> Accumulo/Fluo testing environments

This is hard to know w/ certainty.  That is why I was thinking voting
on a list of candidates for the new default OS might be good. When
people vote they could consider this among many other things.

> - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> tested with anything else (sorry BSD users)

Yeah, it should definitely be a GNU/Linux OS.

On Wed, Jan 5, 2022 at 11:32 PM Christopher <ct...@apache.org> wrote:
>
> Thanks for bringing this up for discussion, Keith!
>
> One point I'd like us to be clear about for the purposes of this
> discussion: when we say "move from Centos 7 to another OS", I want us
> to be clear that Muchos is only "on" CentOS 7 as a default configured
> image. Muchos *should* be able to run on any similar OS, by merely
> changing the image it is configured to use. For example, I expect it
> to work today without any changes on a RHEL7 image, or a RHEL8 or even
> CentOS Stream 8 image, and would be surprised if it didn't. So, when
> we talk about "moving", I am interpreting that to mean "what do we
> want the default image to be?" If we mean something other than
> selecting the default, then I'm not sure what we're talking about and
> would appreciate clarification.
>
> Since the intent of Muchos, and the reason it continues to fall under
> the purview of the Fluo PMC, was for the Fluo developers to have a way
> to quickly deploy a test cluster of Accumulo and Fluo. While it may be
> used by some outside that scope, we should not lose sight of that
> intent. Given that purpose, it becomes clear that Muchos needs to be
> able to deploy on an operating system that the Accumulo and Fluo
> developers are actually using to test on to anticipate problems in
> their own production clusters.  So, "[supporting] ... one final OS
> target platform" doesn't make sense to me if we have developers
> needing to use it to test on several different OSes because they use
> different OSes in production.
>
> From Arvind's experience testing Ubuntu, and my own expectations using
> other Fedora and other Fedora-derived RPM-based distros like RHEL and
> CentOS, I think it's likely that the biggest incompatibilities
> preventing use of an arbitrary AMI is going to be the package names,
> and a few filesystem layout conventions. So, I'm not too worried about
> choosing one default and users still being able to configure a
> different one that they need to test on. As long as the differences
> between major OS families we use are minor, we can all still use the
> same Muchos to test with, by baking in some conditional logic, or
> making some aspects more configurable.
>
> I largely concur with the list of aspects to consider that Arvind has
> provided, but would add:
> - we should provide a reasonable default, but avoid doing things in
> Muchos that would tightly couple us to a particular image, so that
> users can easily configure a different image and have reasonable
> expectations that Muchos will work with little or very minor tweaks
> - we should focus on OS/OS families actually used by devs in their
> Accumulo/Fluo testing environments
> - it should be a GNU/Linux OS, since Accumulo isn't really designed or
> tested with anything else (sorry BSD users)
>
> My preferred default would be Fedora, because it is upstream of
> RHEL/CentOS/Rocky/Alma/etc., unless one of the RHEL downstream CentOS
> replacements becomes a de facto standard replacement for CentOS,
> because I'm likely going to need some kind of EL or upstream-to-EL
> testing for the enterprise users I support at $dayjob. However, given
> the ubiquity of Ubuntu among Debian varieties, I think it's reasonable
> to expect things to work more-or-less out of the box if the user
> configures a modern Ubuntu image.
>
> Related topic: would it make more sense to migrate Muchos away from
> Ansible, and use something that is maybe a little less hard-coded
> scripts, and a little more flexible user config? I'm thinking
> something like https://github.com/hashicorp/terraform#readme ; they
> have examples with cloud-init
> https://learn.hashicorp.com/tutorials/terraform/cloud-init ; Terraform
> is extremely well documented, and it seems to me that a few easily
> user-editable config templates could replace Muchos and support both
> EC2 and Azure easily. It's probably a bigger change than when Muchos
> moved to ansible from the previous scripts, but long-term, it seems
> better than trying to maintain our own provisioning code that's as
> flexible. However, I don't have much experience with Terraform.
>
> On Wed, Jan 5, 2022 at 6:45 PM Arvind Shyamsundar
> <ar...@microsoft.com.invalid> wrote:
> >
> > Hi Keith,
> > Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform. Supporting a choice of OS platforms is hard and more effort than rewards. While selecting the OS, we could consider the following (this is just my top-of-mind, there may be more aspects to consider):
> >
> > - End-of-life / LTS status for that OS.
> > - Licensing and commercial implications if any.
> > - Ready and wide availability of image(s) for the cloud providers (Azure, EC2).
> > - We should not require the user of Muchos to have to build their own image.
> > - Those cloud provider image(s) should support cloud-init.
> > - Those cloud provider image(s) should be updated by the cloud provider (or partners thereof) on a reasonably regular (quarterly or more frequent?) basis to try to include as many security patches out-of-the-box.
> > - The OS should be reasonably "stable" in that it should not increase burden on the user of Muchos by introducing any regressions (functional / perf) of its own origin.
> > - There should be ready availability of OpenJDK binary distribution for that OS.
> > ... etc.
> >
> > BTW, I did some work in December to evaluate Ubuntu 20.04 as a candidate OS for the Muchos cluster nodes on Azure. In the proof-of-concept that I did, the changes to the Muchos code base seemed to be contained to yum packages, OpenJDK package names, and similar relatively minor changes. The downstream playbooks for Zookeeper, Hadoop, Accumulo "just worked".
> >
> > I hope this helps.
> >
> > Arvind.
> >
> > -----Original Message-----
> > From: Keith Turner <ke...@deenlo.com>
> > Sent: Wednesday, January 5, 2022 2:00 PM
> > To: fluo-dev <de...@fluo.apache.org>
> > Subject: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS
> >
> > We need to move Muchos from Centos 7 to another OS.  Not completely sure how we should  proceed.  Below is a possible process we could follow to make this change.
> >
> >  * We collaborate to create a list of candidate OSes with pros and cons for each OS, not sure where is best to do this.  I don't think a mailing list is the best place for this collaboration, however it's nice for the collaboration to be recorded on the mailing list.
> >  * We hold a vote (possibly iterative voting or ranked voting) to try to reach consensus on one of the candidate OSes we would like Muchos to use.
> >  * After a candidate is selected we can create a branch and start submitting PRs to move Muchos to that new OS in the branch.  When it's functional we can merge that branch into main.
> >
> > Can anyone think of another process we could follow to move Muchos to another OS? Does anyone object to the process proposed above or have suggestions to improve it?
> >
> > Keith

Re: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS

Posted by Christopher <ct...@apache.org>.
Thanks for bringing this up for discussion, Keith!

One point I'd like us to be clear about for the purposes of this
discussion: when we say "move from Centos 7 to another OS", I want us
to be clear that Muchos is only "on" CentOS 7 as a default configured
image. Muchos *should* be able to run on any similar OS, by merely
changing the image it is configured to use. For example, I expect it
to work today without any changes on a RHEL7 image, or a RHEL8 or even
CentOS Stream 8 image, and would be surprised if it didn't. So, when
we talk about "moving", I am interpreting that to mean "what do we
want the default image to be?" If we mean something other than
selecting the default, then I'm not sure what we're talking about and
would appreciate clarification.

The intent of Muchos, and the reason it continues to fall under
the purview of the Fluo PMC, was for the Fluo developers to have a way
to quickly deploy a test cluster of Accumulo and Fluo. While it may be
used by some outside that scope, we should not lose sight of that
intent. Given that purpose, it becomes clear that Muchos needs to be
able to deploy on an operating system that the Accumulo and Fluo
developers are actually using to test on to anticipate problems in
their own production clusters.  So, "[supporting] ... one final OS
target platform" doesn't make sense to me if we have developers
needing to use it to test on several different OSes because they use
different OSes in production.

From Arvind's experience testing Ubuntu, and my own expectations using
Fedora and other Fedora-derived RPM-based distros like RHEL and
CentOS, I think it's likely that the biggest incompatibilities
preventing use of an arbitrary AMI are going to be the package names,
and a few filesystem layout conventions. So, I'm not too worried about
choosing one default and users still being able to configure a
different one that they need to test on. As long as the differences
between major OS families we use are minor, we can all still use the
same Muchos to test with, by baking in some conditional logic, or
making some aspects more configurable.
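
As a rough illustration of that kind of conditional logic, the sketch
below classifies a node by OS family from the contents of
/etc/os-release, using its ID and ID_LIKE fields. It is only a sketch
under stated assumptions: the function names `parse_os_release` and
`os_family` are hypothetical and not part of Muchos, and the family
labels "RedHat" and "Debian" are chosen to echo the values Ansible
reports for its os_family fact.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the style of /etc/os-release into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in single or double quotes.
        fields[key.strip()] = value.strip().strip("\"'")
    return fields


def os_family(fields: dict) -> str:
    """Classify parsed os-release fields as "RedHat", "Debian", or "unknown"."""
    # ID names the distro itself; ID_LIKE lists the distros it derives from.
    ids = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    if any(name in ("rhel", "fedora", "centos") for name in ids):
        return "RedHat"
    if any(name in ("debian", "ubuntu") for name in ids):
        return "Debian"
    return "unknown"
```

With a family label in hand, package names and filesystem paths could
be selected per family, while the rest of the deployment logic stays
shared.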

I largely concur with the list of aspects to consider that Arvind has
provided, but would add:
- we should provide a reasonable default, but avoid doing things in
Muchos that would tightly couple us to a particular image, so that
users can easily configure a different image and have reasonable
expectations that Muchos will work with little or very minor tweaks
- we should focus on OS/OS families actually used by devs in their
Accumulo/Fluo testing environments
- it should be a GNU/Linux OS, since Accumulo isn't really designed or
tested with anything else (sorry BSD users)

My preferred default would be Fedora, because it is upstream of
RHEL/CentOS/Rocky/Alma/etc., unless one of the RHEL downstream CentOS
replacements becomes a de facto standard replacement for CentOS,
because I'm likely going to need some kind of EL or upstream-to-EL
testing for the enterprise users I support at $dayjob. However, given
the ubiquity of Ubuntu among Debian varieties, I think it's reasonable
to expect things to work more-or-less out of the box if the user
configures a modern Ubuntu image.

Related topic: would it make more sense to migrate Muchos away from
Ansible, and use something that relies a little less on hard-coded
scripts, and a little more on flexible user config? I'm thinking
something like https://github.com/hashicorp/terraform#readme ; they
have examples with cloud-init
https://learn.hashicorp.com/tutorials/terraform/cloud-init ; Terraform
is extremely well documented, and it seems to me that a few easily
user-editable config templates could replace Muchos and support both
EC2 and Azure easily. It's probably a bigger change than when Muchos
moved to Ansible from the previous scripts, but long-term, it seems
better than trying to maintain our own provisioning code that's just as
flexible. However, I don't have much experience with Terraform.

On Wed, Jan 5, 2022 at 6:45 PM Arvind Shyamsundar
<ar...@microsoft.com.invalid> wrote:
>
> Hi Keith,
> Process-wise, I believe it would be simpler and less taxing on effort / testing / supportability to standardize on one final OS target platform. Supporting a choice of OS platforms is hard and more effort than rewards. While selecting the OS, we could consider the following (this is just my top-of-mind, there may be more aspects to consider):
>
> - End-of-life / LTS status for that OS.
> - Licensing and commercial implications if any.
> - Ready and wide availability of image(s) for the cloud providers (Azure, EC2).
> - We should not require the user of Muchos to have to build their own image.
> - Those cloud provider image(s) should support cloud-init.
> - Those cloud provider image(s) should be updated by the cloud provider (or partners thereof) on a reasonably regular (quarterly or more frequent?) basis to try to include as many security patches out-of-the-box.
> - The OS should be reasonably "stable" in that it should not increase burden on the user of Muchos by introducing any regressions (functional / perf) of its own origin.
> - There should be ready availability of OpenJDK binary distribution for that OS.
> ... etc.
>
> BTW, I did some work in December to evaluate Ubuntu 20.04 as a candidate OS for the Muchos cluster nodes on Azure. In the proof-of-concept that I did, the changes to the Muchos code base seemed to be contained to yum packages, OpenJDK package names, and similar relatively minor changes. The downstream playbooks for Zookeeper, Hadoop, Accumulo "just worked".
>
> I hope this helps.
>
> Arvind.
>
> -----Original Message-----
> From: Keith Turner <ke...@deenlo.com>
> Sent: Wednesday, January 5, 2022 2:00 PM
> To: fluo-dev <de...@fluo.apache.org>
> Subject: [EXTERNAL] [DISCUSS] Move Muchos from Centos 7 to another OS
>
> We need to move Muchos from Centos 7 to another OS.  Not completely sure how we should  proceed.  Below is a possible process we could follow to make this change.
>
>  * We collaborate to create a list of candidate OSes with pros and cons for each OS, not sure where is best to do this.  I don't think a mailing list is the best place for this collaboration, however it's nice for the collaboration to be recorded on the mailing list.
>  * We hold a vote (possibly iterative voting or ranked voting) to try to reach consensus on one of the candidate OSes we would like Muchos to use.
>  * After a candidate is selected we can create a branch and start submitting PRs to move Muchos to that new OS in the branch.  When it's functional we can merge that branch into main.
>
> Can anyone think of another process we could follow to move Muchos to another OS? Does anyone object to the process proposed above or have suggestions to improve it?
>
> Keith