Posted to dev@bigtop.apache.org by Christopher <ct...@apache.org> on 2016/08/12 23:47:00 UTC

Fedora/EPEL packaging

Hi Bigtop Developers,

I see that Bigtop provides packaging for RHEL/CentOS 6 and 7 and Fedora 20.

I was just wondering whether there is any motivation within this community to
put some of that packaging effort into those upstream communities, either
to replace the packaging that Bigtop provides, or to complement it.

I'm currently a package maintainer for Fedora and EPEL (which is maintained
by the Fedora community), for Hadoop, ZooKeeper, and Accumulo, and I think
these Big Data packages could really use some additional support from more
packagers. I also think that perhaps we can deduplicate some effort.

Is anybody interested in helping with the packaging in the Fedora/EPEL
communities to support Fedora and EL distros?

Re: Fedora/EPEL packaging

Posted by Christopher <ct...@apache.org>.
On Tue, Aug 16, 2016 at 4:52 PM RJ Nowling <rn...@gmail.com> wrote:

> Hi Christopher,
>
> Jay Vyas and I work at Red Hat as well and are PMC members.  (Actually,
> tomorrow is my last day at Red Hat.)  And we worked on a team with several
> of the previous Hadoop, etc. Fedora maintainers.
>
> Bigtop uses a system whereby some common build scripts are called by the
> RPM and Debian package build tools.  Bigtop would have to completely
> rebuild the packaging system to meet Fedora guidelines.  The current system
> doesn't break down dependencies into separate packages or require that
> packages depend on a single version of each dependency -- JARs are allowed
> to bundle whatever version they want.
>
> The hurdles in the different approaches to packaging (optimized for
> different needs) are quite large.
>
> RJ
>
>
Hi RJ, thanks for the info. It might be the case that there are still some
common concerns, and that the two communities can help each other meet their
separate needs.

Re: Fedora/EPEL packaging

Posted by RJ Nowling <rn...@gmail.com>.
Hi Christopher,

Jay Vyas and I work at Red Hat as well and are PMC members.  (Actually,
tomorrow is my last day at Red Hat.)  We also worked on a team with several
of the previous Fedora maintainers of Hadoop and related packages.

Bigtop uses a system whereby some common build scripts are called by the
RPM and Debian package build tools.  Bigtop would have to completely
rebuild the packaging system to meet Fedora guidelines.  The current system
doesn't break down dependencies into separate packages or require that
packages depend on a single version of each dependency -- JARs are allowed
to bundle whatever version they want.

The hurdles in the different approaches to packaging (optimized for
different needs) are quite large.

RJ


On Sun, Aug 14, 2016 at 2:38 PM, Konstantin Boudnik <co...@apache.org> wrote:

> On Sat, Aug 13, 2016 at 02:55AM, Christopher wrote:
> >
> > Ideally, I think a long-term goal would be to try to eliminate the need
> > for Bigtop. It'd be nice if I could just go to my preferred distribution,
> > and install using the distro's provided package manager. However, because
> > the
>
> I believe there's a bit of a misinterpretation here. Bigtop isn't all
> about packaging. There's also deployment, aka 'operational knowledge', a
> great deal of expertise about what works and what doesn't, a community of
> people highly skilled in all aspects of the Apache Bigdata ecosystem and
> distributed computing, and so on and so forth. Besides, Bigtop cuts
> _across_ the distributions, so there's a bit of the value of the "central
> hub" as well.
>
> If the folks at Fedora want to go into the Hadoop distribution ordeal,
> you're more than welcome. I don't think any other Linux distribution does,
> but it could be just my own misguided view of the world.
>
> And of course what Olaf said: the particular complications of the stack
> creation stem from the need for multiple versions of the same binaries.
> Not always, but once in a while. Hope it helps.
>
> But we'd be very happy to have more folks from the Fedora community helping
> to make the particular packaging format better and more relevant and standard.
>
> Regards,
>   Cos
>

Re: Fedora/EPEL packaging

Posted by Konstantin Boudnik <co...@apache.org>.
On Sat, Aug 13, 2016 at 02:55AM, Christopher wrote:
> 
> Ideally, I think a long-term goal would be to try to eliminate the need for
> Bigtop. It'd be nice if I could just go to my preferred distribution, and
> install using the distro's provided package manager. However, because the

I believe there's a bit of a misinterpretation here. Bigtop isn't all about
packaging. There's also deployment, aka 'operational knowledge', a great deal
of expertise about what works and what doesn't, a community of people highly
skilled in all aspects of the Apache Bigdata ecosystem and distributed
computing, and so on and so forth. Besides, Bigtop cuts _across_ the
distributions, so there's a bit of the value of the "central hub" as well.

If the folks at Fedora want to go into the Hadoop distribution ordeal, you're
more than welcome. I don't think any other Linux distribution does, but it
could be just my own misguided view of the world.

And of course what Olaf said: the particular complications of the stack
creation stem from the need for multiple versions of the same binaries. Not
always, but once in a while. Hope it helps.

But we'd be very happy to have more folks from the Fedora community helping to
make the particular packaging format better and more relevant and standard.

Regards,
  Cos 

Re: Fedora/EPEL packaging

Posted by Christopher <ct...@apache.org>.
On Fri, Aug 12, 2016 at 9:26 PM Konstantin Boudnik <co...@apache.org> wrote:

> Thanks for reaching out, Christopher! I am not entirely sure about the
> proposition. Are you proposing to harmonize the RPM packaging between what
> we provide here and what is going to Fedora/CentOS? That'd be grand, I guess.
>
> I would like to learn how you propose we keep the code in sync, unless you
> would like to use the Bigtop specs as the upstream for your packaging
> effort. Then any changes could be up-streamed back through the usual Apache
> contribution process.
>
> What do you think? Thanks!
>   Cos
>
>
Sorry for the long response. The summary is that I'm hoping to:
1) solicit support for packaging in the other communities (especially
Fedora/EPEL) from Bigtop folks with packaging experience/interests, and
2) see where we can eliminate redundancy, and create a consistent message
about the differences between the available choices.


Longer response follows:

From Roman's response, it sounds like there might be incentive to continue
with the packaging standards established within Bigtop, because at least
some vendors' packaging is derived from those standards.

I actually really like the standards in the Fedora community, which also
seem to closely match what EL is doing (not counting RHSCL). I
don't really know anything about Debian/Ubuntu packaging, though.

Ideally, I think a long-term goal would be to try to eliminate the need for
Bigtop. It'd be nice if I could just go to my preferred distribution, and
install using the distro's provided package manager. However, because the
Bigtop packaging standards may intentionally differ from these distros'
standards, I'd like to at least make them complementary (reduce overlap),
and maybe make it clear why one might choose one set vs. another. For
example, one might choose Bigtop if one wants consistent packaging
regardless of the distribution one uses. Or, perhaps Bigtop adds value by
providing newer packages than those provided by stable distro releases
(similar to https://ius.io/ for EL).

In the short term, I'm trying to encourage some folks to help out with the
Fedora and EPEL packaging. I've only taken on Hadoop in Fedora because the
previous maintainer could no longer maintain it, and I needed it for
Accumulo. But I'm still lacking some of the expertise and time needed to
work through outstanding issues. In particular, the Hadoop version in Fedora
is only 2.4.1, so it needs to be updated. Fedora also requires builds to
support ARM architectures, but not all of the native code in Hadoop works
well on ARM, and I don't have sufficient expertise in C to bring the
relevant issues to upstream Hadoop.

EPEL as a whole is still lacking most of the good packages. Presumably,
that's because EL users are relying on commercial vendors, which is fine...
but I'd like to open these packages to a wider audience, within EPEL, just
like Bigtop is doing in Apache.

Thanks.

Re: Fedora/EPEL packaging

Posted by Konstantin Boudnik <co...@apache.org>.
Thanks for reaching out, Christopher! I am not entirely sure about the
proposition. Are you proposing to harmonize the RPM packaging between what we
provide here and what is going to Fedora/CentOS? That'd be grand, I guess.

I would like to learn how you propose we keep the code in sync, unless you
would like to use the Bigtop specs as the upstream for your packaging effort.
Then any changes could be up-streamed back through the usual Apache
contribution process.

What do you think? Thanks!
  Cos

On Fri, Aug 12, 2016 at 11:47PM, Christopher wrote:
> Hi Bigtop Developers,
> 
> I see that Bigtop provides packaging for RHEL/CentOS 6 and 7 and Fedora 20.
> 
> I was just wondering whether there is any motivation within this community to
> put some of that packaging effort into those upstream communities, either
> to replace the packaging that Bigtop provides, or to complement it.
> 
> I'm currently a package maintainer for Fedora and EPEL (which is maintained
> by the Fedora community), for Hadoop, ZooKeeper, and Accumulo, and I think
> these Big Data packages could really use some additional support from more
> packagers. I also think that perhaps we can deduplicate some effort.
> 
> Is anybody interested in helping with the packaging in the Fedora/EPEL
> communities to support Fedora and EL distros?

Re: Fedora/EPEL packaging

Posted by Christopher <ct...@apache.org>.
On Sat, Aug 13, 2016 at 9:53 AM Olaf Flebbe <of...@oflebbe.de> wrote:

> Hi Christopher,
>
> thanks for reaching out to us!
>
> To be honest, we are getting into packaging problems recently:
>
> We are downloading dependencies from public repositories. Some artifacts
> are not suitable for the POWER8 and AARCH64 platforms we like to support,
> since they do not contain the proper shared libraries for these
> architectures. Right now I do not have a clue how to provide a suitable
> solution to this, since we like to build packages as unmodified as
> possible.
>
>
This is one area where downstream packagers can provide feedback to
upstream communities to extend native code support to more platforms.


> Fedora's (CentOS/Debian/Ubuntu) approach of not downloading anything,
> compiling all the artifacts themselves, and reusing these artifacts when
> compiling upper software layers should solve this problem.
> I think we may agree on this point. I would very much like to pick up jars
> built by Fedora if these fix our problems.
>
> But, as far as I read the documents, Fedora still sticks to the "one
> version" mantra: There should be only one version of a library (jar) on the
> system. (The same on Debian, btw.)
>
> Upstream devs often use ancient library versions and we cannot easily
> upgrade to newer versions, since they are not API compatible.
> However, other Fedora projects require newer versions and these are already
> packaged. Do we have to upgrade all the sources?
>

One of the biggest advantages of a distro is dependency convergence. This
is important for stability, security, and interoperability. Individual
upstream projects can do their own thing with dependencies, and that's fine
when used in isolation, but that's where the power of having downstream
packaging comes in. Yes, it means sometimes the upstream project has to be
slightly patched to converge.
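
As a rough illustration of what checking for dependency convergence means, the sketch below lists every library that a set of packages requires at more than one version. It is only a toy model: the package names, library names, and version numbers are made-up example data, and the function name is hypothetical rather than part of any Fedora or Bigtop tool.

```python
from collections import defaultdict


def find_divergent_dependencies(requirements):
    """Return {library: {version: [packages]}} for every library that is
    required at more than one version across the given packages."""
    versions_by_library = defaultdict(lambda: defaultdict(list))
    for package, dependencies in requirements.items():
        for library, version in dependencies.items():
            versions_by_library[library][version].append(package)
    return {
        library: {version: sorted(pkgs) for version, pkgs in versions.items()}
        for library, versions in versions_by_library.items()
        if len(versions) > 1  # more than one version means no convergence
    }


# Made-up example data: package -> {library: required version}
requirements = {
    "hadoop": {"protobuf": "2.5", "log4j": "1.2"},
    "accumulo": {"protobuf": "2.5", "log4j": "1.2"},
    "other-app": {"protobuf": "2.6"},
}

for library, versions in find_divergent_dependencies(requirements).items():
    print(library, versions)
```

Every library reported this way needs a decision from the packager: patch the consumers onto one version, or package the older version separately.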

Fedora does permit multiple versions to be packaged as "compat" packages,
but they have an altered naming convention. This is suitable for some
things (log4j1 and log4j2, for example), but not necessarily others
(commons-math vs. commons-math3... easier to patch to use math3 rather than
package math2).

So, yes, sometimes it means upgrading sources... and sometimes it means
encouraging upstream communities to move to newer versions.


> For instance, nobody likes to upgrade protobuf 2.5.1 in Hadoop to a current
> version since it may break the networking protocol and the compatibility
> with other Hadoop products.
>
> Frankly, nobody is in a position to clean all the mess up.
>
>
I'm optimistic that such things can be addressed reasonably. Encouraging
upstream to stick to relatively modern versions can help, and downstream
packagers/vendors can give upstream the confidence that those changes won't
create a bad user experience. My personal philosophy is that upstream
should use relatively modern versions, and downstream can patch to provide
support for older environments, as necessary. This is better than upstream
being stuck on old software, and packagers having to patch to make things
work in modern environments. Everything is case by case,
though, and some dependencies can't be easily updated without breaking
things, but those older dependencies can be packaged, too. Packaging
multiple versions is an option.


> Without Fedora/CentOS/RHEL allowing different versions of a library to
> coexist on the system, even only to sidestep dependency problems, I see no
> chance of putting a substantial part of our effort into Fedora.
>
>
They do allow multiple versions
(https://fedoraproject.org/wiki/Packaging:Naming#Multiple_packages_with_the_same_base_name),
but these shouldn't be used to circumvent the goal of dependency
convergence. For example, Fedora still ships jline1 and jline2, as well as
log4j 1.2 and log4j 2.
I think one of the biggest advantages that the dependency convergence of
downstream vendors/packagers can provide is the ability for upstream to be
confident enough to modernize their dependencies. Without strong downstream
packaging support, upstream is too afraid to make any changes in their
dependencies. Knowing that downstream is doing proper dependency
convergence and integration means that they don't have to worry about
breaking users so much, when they modernize their upstream applications.
There will always be special cases like protobuf (any serializer,
actually), but even that serves as a good example:

If Hadoop were confident that downstream packaging were available
which converges on a common version of protobuf, to provide
interoperability, would they be so afraid to update to protobuf 2.6? I
think probably not. They're afraid because they don't expect people to
get their tools in a dependency-convergent environment. They're worried
that their decisions are going to diverge from other upstream decisions.

Re: Fedora/EPEL packaging

Posted by Olaf Flebbe <of...@oflebbe.de>.
Hi Christopher,

thanks for reaching out to us!

To be honest, we are getting into packaging problems recently:

We are downloading dependencies from public repositories. Some artifacts are not suitable for the POWER8 and AARCH64 platforms we like to support,
since they do not contain the proper shared libraries for these architectures. Right now I do not have a clue how to provide a suitable solution to this, since we like to
build packages as unmodified as possible.

Fedora's (CentOS/Debian/Ubuntu) approach of not downloading anything, compiling all the artifacts themselves, and reusing these artifacts when compiling upper software layers should solve this problem.
I think we may agree on this point. I would very much like to pick up jars built by Fedora if these fix our problems.

But, as far as I read the documents, Fedora still sticks to the "one version" mantra: There should be only one version of a library (jar) on the system. (The same on Debian, btw.)

Upstream devs often use ancient library versions and we cannot easily upgrade to newer versions, since they are not API compatible.
However, other Fedora projects require newer versions and these are already packaged. Do we have to upgrade all the sources?
For instance, nobody likes to upgrade protobuf 2.5.1 in Hadoop to a current version since it may break the networking protocol and the compatibility with other Hadoop products.

Frankly, nobody is in a position to clean all the mess up.

Without Fedora/CentOS/RHEL allowing different versions of a library to coexist on the system, even only to sidestep dependency problems, I see no chance of putting a substantial part of our effort into Fedora.

I was working on silencing build sanitizer tools and on producing src rpm and src deb packages which can be recompiled by themselves. But even that was a tough job:
See for instance BIGTOP-2151: I failed to add an empty dir into the source rpm.

If you would still like to contribute, please file JIRAs!

Olaf




> On 13.08.2016 at 05:13, Christopher <ct...@apache.org> wrote:
> 
> On Fri, Aug 12, 2016 at 9:41 PM Roman Shaposhnik <ro...@shaposhnik.org>
> wrote:
> 
>> Hi Christopher!
>> 
>> On Fri, Aug 12, 2016 at 4:47 PM, Christopher <ct...@apache.org> wrote:
>>> Hi Bigtop Developers,
>>> 
>>> I see that Bigtop provides packaging for RHEL/CentOS 6 and 7 and Fedora 20.
>>>
>>> I was just wondering whether there is any motivation within this community to
>>> put some of that packaging effort into those upstream communities, either
>>> to replace the packaging that Bigtop provides, or to complement it.
>>>
>>> I'm currently a package maintainer for Fedora and EPEL (which is maintained
>>> by the Fedora community), for Hadoop, ZooKeeper, and Accumulo, and I think
>>> these Big Data packages could really use some additional support from more
>>> packagers. I also think that perhaps we can deduplicate some effort.
>>> 
>>> Is anybody interested in helping with the packaging in the Fedora/EPEL
>>> communities to support Fedora and EL distros?
>> 
>> I'm definitely very much interested in this. Especially since you have the
>> kind of deep expertise that comes with being an official maintainer. In
>> fact, that's actually the biggest issue that we faced last time we were
>> thinking about upstreaming Bigtop packaging into Linux distros --
>> packaging guidelines.
>> 
>> 
> 
> My expertise is limited. I don't want to overstate it. Fedora is a
> community, just like Apache, and is open source. There's nothing official
> about it. :) I took on some of these as a novice, and for the most part, I
> still am.
> 
> (see my response to Konstantin).
> 
> 
>> For better or for worse, Bigtop ended up defining the layout and policies
>> for all major commercial Hadoop offerings and changing that may not be much
>> of an option. If we keep Bigtop packaging the way it is I'm not sure how
>> much goodwill we will get on the Linux distro side.
>> 
>> Examples off the top of my head here include:
>>   1. the way Bigtop packaging deals with jars
>>   2. the way Bigtop packaging deals with /etc
>> 
>> 
> 
> Fedora and Red Hat have gotten a *lot* better about packaging Java lately.
> I know that's one area that Linux distros have typically lagged behind in,
> but the state of things now is much, much better... consistent locations
> for config files, env scripts, launch scripts, jar locations, rpm macros
> for creating launch scripts, tools to construct classpaths, maven/ant-ivy
> support for resolving dependencies from distro-packaged RPMs, rpm metadata
> to resolve RPMs by maven coordinates, etc.
> 
>> So perhaps, a good first step for you would be to pick a simple package
>> (let's say ZooKeeper) and tell us how many of the Fedora packaging guidelines
>> we're still violating and how much of that is non-negotiable vs. could be
>> fixed.
>> 
>> Thanks,
>> Roman.
>> 
> 
> I'd have to look at how Bigtop is packaging things today. Understanding
> these differences is part of what I'm requesting here from this community.
> That, and I figure some of Bigtop's community already act as liaisons to
> the upstream projects for packaging-related issues/improvements/fixes.
> That's the kind of expertise I think Fedora/EPEL needs. In exchange, I can
> also do my best to try to help improve Bigtop packaging based on my
> experience in Fedora/EPEL, especially as I understand the differences in
> the packaging philosophies at Bigtop, and the separate goals of the project.
> 
> For now, I can offer a link to some of the packaging guidelines for
> Java/Maven in Fedora (https://fedoraproject.org/wiki/Packaging:Java) and
> some git repos for the SPEC/distro-specific patches:
> http://pkgs.fedoraproject.org/cgit/rpms/accumulo.git/
> http://pkgs.fedoraproject.org/cgit/rpms/hadoop.git/
> http://pkgs.fedoraproject.org/cgit/rpms/zookeeper.git/
> 
> If anybody wants to help out with any of these, I can offer my assistance
> getting started as a co-maintainer.


Re: Fedora/EPEL packaging

Posted by Christopher <ct...@apache.org>.
On Fri, Aug 12, 2016 at 9:41 PM Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> Hi Christopher!
>
> On Fri, Aug 12, 2016 at 4:47 PM, Christopher <ct...@apache.org> wrote:
> > Hi Bigtop Developers,
> >
> > I see that Bigtop provides packaging for RHEL/CentOS 6 and 7 and Fedora 20.
> >
> > I was just wondering whether there is any motivation within this community to
> > put some of that packaging effort into those upstream communities, either
> > to replace the packaging that Bigtop provides, or to complement it.
> >
> > I'm currently a package maintainer for Fedora and EPEL (which is maintained
> > by the Fedora community), for Hadoop, ZooKeeper, and Accumulo, and I think
> > these Big Data packages could really use some additional support from more
> > packagers. I also think that perhaps we can deduplicate some effort.
> >
> > Is anybody interested in helping with the packaging in the Fedora/EPEL
> > communities to support Fedora and EL distros?
>
> I'm definitely very much interested in this. Especially since you have the
> kind of deep expertise that comes with being an official maintainer. In
> fact, that's actually the biggest issue that we faced last time we were
> thinking about upstreaming Bigtop packaging into Linux distros --
> packaging guidelines.
>
>

My expertise is limited. I don't want to overstate it. Fedora is a
community, just like Apache, and is open source. There's nothing official
about it. :) I took on some of these as a novice, and for the most part, I
still am.

(see my response to Konstantin).


> For better or for worse, Bigtop ended up defining the layout and policies
> for all major commercial Hadoop offerings and changing that may not be much
> of an option. If we keep Bigtop packaging the way it is I'm not sure how
> much goodwill we will get on the Linux distro side.
>
> Examples off the top of my head here include:
>    1. the way Bigtop packaging deals with jars
>    2. the way Bigtop packaging deals with /etc
>
>

Fedora and Red Hat have gotten a *lot* better about packaging Java lately.
I know that's one area that Linux distros have typically lagged behind in,
but the state of things now is much, much better... consistent locations
for config files, env scripts, launch scripts, jar locations, rpm macros
for creating launch scripts, tools to construct classpaths, maven/ant-ivy
support for resolving dependencies from distro-packaged RPMs, rpm metadata
to resolve RPMs by maven coordinates, etc.
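
To make "resolve RPMs by maven coordinates" a little more concrete, here is a minimal sketch. It assumes the metadata takes the form of virtual provides written as `mvn(groupId:artifactId)`, which is how the Fedora Java tooling exposes it as far as I understand; the helper names are hypothetical, and the second function only builds a `dnf repoquery --whatprovides` command line without running it.

```python
def mvn_capability(coordinate):
    """Turn 'groupId:artifactId[:...]' into the assumed RPM capability
    string 'mvn(groupId:artifactId)'. Any further fields are ignored."""
    fields = coordinate.split(":")
    if len(fields) < 2 or not fields[0] or not fields[1]:
        raise ValueError(f"expected groupId:artifactId, got {coordinate!r}")
    return f"mvn({fields[0]}:{fields[1]})"


def whatprovides_command(coordinate):
    """Build (but do not run) a dnf query asking which package provides
    the given Maven artifact."""
    return ["dnf", "repoquery", "--whatprovides", mvn_capability(coordinate)]


print(" ".join(whatprovides_command("org.apache.zookeeper:zookeeper")))
```

The printed command can be pasted into a shell on a Fedora system (quoting the parenthesized argument) to see which package, if any, provides that artifact.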

> So perhaps, a good first step for you would be to pick a simple package
> (let's say ZooKeeper) and tell us how many of the Fedora packaging guidelines
> we're still violating and how much of that is non-negotiable vs. could be
> fixed.
>
> Thanks,
> Roman.
>

I'd have to look at how Bigtop is packaging things today. Understanding
these differences is part of what I'm requesting here from this community.
That, and I figure some of Bigtop's community already act as liaisons to
the upstream projects for packaging-related issues/improvements/fixes.
That's the kind of expertise I think Fedora/EPEL needs. In exchange, I can
also do my best to try to help improve Bigtop packaging based on my
experience in Fedora/EPEL, especially as I understand the differences in
the packaging philosophies at Bigtop, and the separate goals of the project.

For now, I can offer a link to some of the packaging guidelines for
Java/Maven in Fedora (https://fedoraproject.org/wiki/Packaging:Java) and
some git repos for the SPEC/distro-specific patches:
http://pkgs.fedoraproject.org/cgit/rpms/accumulo.git/
http://pkgs.fedoraproject.org/cgit/rpms/hadoop.git/
http://pkgs.fedoraproject.org/cgit/rpms/zookeeper.git/

If anybody wants to help out with any of these, I can offer my assistance
getting started as a co-maintainer.

Re: Fedora/EPEL packaging

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
Hi Christopher!

On Fri, Aug 12, 2016 at 4:47 PM, Christopher <ct...@apache.org> wrote:
> Hi Bigtop Developers,
>
> I see that Bigtop provides packaging for RHEL/CentOS 6 and 7 and Fedora 20.
>
> I was just wondering whether there is any motivation within this community to
> put some of that packaging effort into those upstream communities, either
> to replace the packaging that Bigtop provides, or to complement it.
>
> I'm currently a package maintainer for Fedora and EPEL (which is maintained
> by the Fedora community), for Hadoop, ZooKeeper, and Accumulo, and I think
> these Big Data packages could really use some additional support from more
> packagers. I also think that perhaps we can deduplicate some effort.
>
> Is anybody interested in helping with the packaging in the Fedora/EPEL
> communities to support Fedora and EL distros?

I'm definitely very much interested in this. Especially since you have the kind
of deep expertise that comes with being an official maintainer. In fact, that's
actually the biggest issue that we faced last time we were thinking about
upstreaming Bigtop packaging into Linux distros -- packaging guidelines.

For better or for worse, Bigtop ended up defining the layout and policies for
all major commercial Hadoop offerings and changing that may not be much
of an option. If we keep Bigtop packaging the way it is I'm not sure how much
goodwill we will get on the Linux distro side.

Examples off the top of my head here include:
   1. the way Bigtop packaging deals with jars
   2. the way Bigtop packaging deals with /etc

So perhaps, a good first step for you would be to pick a simple package
(let's say ZooKeeper) and tell us how many of the Fedora packaging guidelines
we're still violating and how much of that is non-negotiable vs. could be
fixed.

Thanks,
Roman.