Posted to dev@bigtop.apache.org by Evans Ye <ev...@apache.org> on 2017/06/18 19:13:54 UTC

[DISCUSS] The future of Apache Bigtop

Hi folks,

Many things happened during DataWorks Summit San Jose 2017. Some of the
folks (Cos, Roman, Amir, Nate, Mike, etc.) gathered to discuss 1.2.1
and the future 1.3 release of Bigtop. I'd like to bring those
discussions back to the mailing list so that those who couldn't make it
can still join us for further discussion:

* 1.2.1 release
a). Some folks expect Docker on YARN to be backported to 2.7.4
and included in the release
b). Get rotted code out of our code base: packaging, deployment, testing,
etc.
c). Get integration tests to work in CI

* 1.3.0 release
a). More machine learning integrations
b). K8S integration will be an interesting topic

Please help me complete the list if I missed something. :)


OTOH, for me specifically, I visited Cloudera to give a tech talk. I met
Sean Mackrory and their Hadoop and HBase lead. The pain point they've
had for a long time is not having an integration test framework for
their work on the bleeding edge. For example, does a specific patch from
Hadoop break HBase or Hive?

My thinking is that this is what Bigtop tried to solve from the very
beginning. We were supposed to have folks from multiple projects work with
us to upgrade packages, and use our frameworks to properly integrate and
test their code with other components.

So, the future of Bigtop. I think working closely with the other communities
is a better way for us to move forward. But that means something needs to
change. For example, our distribution is, from a developer's perspective,
somewhat old, so it cannot support integration and testing on the
bleeding edge. If we would still like to release something recommended only
for production, one solution is to have both dev and stable
releases in Bigtop, so developers can work on the dev branch and test
against the newest components. In that case, people from other communities
might be able to help us upgrade packages to newer versions,
which makes things easier.

What do you guys think? Please join me for the discussion.

Re: [DISCUSS] The future of Apache Bigtop

Posted by Jay Vyas <ja...@gmail.com>.

If you guys finally decide to move to k8s, then I can get involved again and help. I'm a k8s maintainer, and my current company needs a Hadoop deployment that supports containers.

> On Jun 21, 2017, at 1:02 PM, Olaf Flebbe <of...@oflebbe.de> wrote:
> 
> Hi Andrew,
> 
> you surely are making jokes when you are saying TAR is an improvement with respect to RPM/DEB. You surely know that you can unpack every RPM straight to the filesystem (DEB requires two steps), if you'd like to.
> 
> You surely know that one can easily host a complete docker based hadoop cluster on a developer machine in the current git of bigtop. And that docker toolbox, docker engine, docker for mac integrate really well with Windows, Linux and MacOSX, working right out of the box (at least on MacOSX and Linux) as it is right now within bigtop without manually tweaking config files.
> 
> I see no point in reproducing hive, hbase, ... hadoop tests -- most of them single machine, fake cluster environment -- when we can have the real thing, a cluster where we use docker for isolating nodes. When the tests do not really work portably, that's a problem of other projects, not ours. Let's fix it there.
> 
> IMHO, if we could orchestrate k8s (kubernetes) (or docker-swarm, my favorite) we could even choose to use a single host with some docker instances or scale out to a cloud environment and have a reproducible system without tweaking files. Of course there is much work to do to port tests to the cloud environment, but this would be a tremendous added value.
> 
> Olaf
> 
> 
> 
> 
>> On 20.06.2017 at 23:12, Andrew Purtell <an...@gmail.com> wrote:
>> 
>> Yeah, we can build from git repos. Instead of an archive URL, you can specify a repo for each component and reference it by git URL and branch, tag, or SHA.
>> 
>> Regarding tarball build targets, I was thinking of it as a packaging improvement, an additional packaging target. It could make integration testing more convenient too if you are not using containers or bare metal systems where you own the whole filesystem.
>> 
>>> On Jun 19, 2017, at 6:13 AM, Evans Ye <ev...@apache.org> wrote:
>>> 
>>> Hi Andy,
>>> 
>>> Is it easier to have multiple tarballs to set up a cluster for integration
>>> tests?
>>> I'm not on the Hadoop/HBase developer side so I have zero context. I was
>>> just assuming that deploying a cluster for integration tests would be a
>>> beneficial feature for them.
>>> 
>>> Bringing up my discussion with the Hadoop and HBase guys at Cloudera, they
>>> mentioned two things specifically for Bigtop:
>>> 
>>> a). build from git (which I think you've already contributed to Bigtop)
>>> b). an easy-to-run integration test framework
>>> 
>>> I'm happy to have b). because either way we need to have it in our CI.
>>> 
>>> 
>>> 2017-06-19 5:04 GMT+08:00 Andrew Purtell <ap...@apache.org>:
>>> 
>>>> IMHO, the easiest and fastest way to get the distribution aspect to be more
>>>> useful to more folks is to add a build target that generates plain tarballs
>>>> instead of distro-specific Linux packaging. People like us can take the
>>>> tarballs and unpack them to environments where for various reasons we don't
>>>> want to do RPM management. Vendors like Cloudera can convert tarballs to
>>>> parcels, or whatever proprietary format is desired.
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> 
>>>> - Andy
>>>> 
>>>> If you are given a choice, you believe you have acted freely. - Raymond
>>>> Teller (via Peter Watts)
>>>> 
>

Re: [DISCUSS] The future of Apache Bigtop

Posted by Evans Ye <ev...@apache.org>.

Putting the technical debate aside, it would be appreciated if we could keep
the discussion peaceful.
After all, all of us are just trying to make Bigtop better software. :)

Andy,
Can you elaborate on how tarballs can make integration tests easier? Here
I'm with Olaf: real cluster integration tests for compatibility and
operational issues like rolling upgrades are something we
should focus on more.

Jay,
I'd like to count on your expertise in k8s. I know a little about Swarm,
but zero about k8s.
How do you imagine the fsimages and edit logs being stored on k8s? I assume
there must be some distributed store for these files so that leveraging
docker for dynamic deployment is possible.


2017-06-22 4:53 GMT+08:00 Olaf Flebbe <of...@oflebbe.de>:

> Andrew,
>
> sorry for sounding dismissive.
>
> An rpm can be converted to a cpio payload. A deb is an ar archive containing
> data.tar.gz with the files as payload.
>
> Details can be looked up.
>
> Nevertheless, please do not call "tar" an "improvement" with respect to
> deployment compared to "rpm" or "deb". I see the use case of extracting just
> the content without all the dependency management, service management,
> verification and so on. Please just use rpm2cpio or ar.
>
> Olaf
>
>
> > On 21.06.2017 at 22:08, Andrew Purtell <ap...@apache.org> wrote:
> >
> >> you surely are making jokes when you are saying TAR is an improvement
> >> with respect to RPM/DEB. You surely know that you can unpack every RPM
> >> straight to the filesystem (DEB requires two steps), if you'd like to.
> >
> > Not a joke and the condescension isn't helpful either.
> >
> > --
> > Best regards,
> > Andrew
> >
> > Words like orphans lost among the crosstalk, meaning torn from truth's
> > decrepit hands
> >   - A23, Crosstalk
>
>
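
For reference, the layout Olaf describes above (a deb is an ar archive whose data.tar.gz member holds the files) can be inspected with a few lines of Python. This is only an illustrative sketch and not part of Bigtop: the names read_ar_members and list_deb_payload are made up for this example, it handles only short member names, and it assumes the payload uses a compression that Python's tarfile module can open.

```python
import io
import tarfile

AR_MAGIC = b"!<arch>\n"   # global header that starts every ar archive
HEADER_SIZE = 60          # each member is preceded by a fixed 60-byte header


def read_ar_members(data: bytes) -> dict:
    """Return {member name: raw bytes} for a simple ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = {}
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":          # end-of-header marker
            raise ValueError("corrupt ar member header")
        # Bytes 0-15 hold the name, bytes 48-57 the decimal size.
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + HEADER_SIZE
        members[name] = data[start:start + size]
        pos = start + size + (size % 2)      # data is padded to an even offset
    return members


def list_deb_payload(deb_bytes: bytes) -> list:
    """List the file names stored in the data.tar.* member of a deb."""
    members = read_ar_members(deb_bytes)
    payload = next(n for n in members if n.startswith("data.tar"))
    with tarfile.open(fileobj=io.BytesIO(members[payload]), mode="r:*") as tar:
        return tar.getnames()
```

Passing the bytes of a package file to list_deb_payload returns the paths inside the payload without installing anything, which is the "just the content" use case mentioned in the thread.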

Re: [DISCUSS] The future of Apache Bigtop

Posted by Evans Ye <ev...@apache.org>.

I just want to follow the guide and try out the feature:

https://github.com/apache/bigtop#for-developers-building-a-component-from-git-repository

[evans@aws-ec2 ~]$ git diff
diff --git a/bigtop.bom b/bigtop.bom
index 72b2b94..003295f 100644
--- a/bigtop.bom
+++ b/bigtop.bom
@@ -151,6 +151,9 @@ bigtop {
       name    = 'hbase'
       relNotes = 'Apache HBase'
       version { base = '1.1.9'; pkg = base; release = 1 }
+      git     { repo = "https://github.com/apache/hbase.git"
+                ref  = "${version.base}"
+                dir  = "${name}-${version.base}" }
       tarball { destination = "${name}-${version.base}.tar.gz"
                 source      = "${name}-${version.base}-src.tar.gz" }
       url     { download_path = "/$name/${version.base}/"



[evans@aws-ec2 ~]$ ./bigtop-ci/build.sh --os trunk-ubuntu-16.04 --target hbase-pkg
...
:hbase-download FAILED

FAILURE: Build failed with an exception.

* Where:
Script '/var/lib/jenkins/bigtop/packages.gradle' line: 226

* What went wrong:
Execution failed for task ':hbase-download'.
> Process 'command 'git'' finished with non-zero exit value 1

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or
--debug option to get more log output.

BUILD FAILED

Total time: 14.973 secs


I'd appreciate it if you could help me out :)
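
The log above does not say why the git step failed. One quick check, offered as an assumption and not as a diagnosis, is whether the value used for ref actually names a branch or tag in the remote repository. The sketch below wraps the real `git ls-remote` command; the helper names parse_ls_remote, find_matching_refs and remote_refs are hypothetical and are not part of Bigtop.

```python
import subprocess


def parse_ls_remote(output: str) -> dict:
    """Map ref name -> commit hash from `git ls-remote` output."""
    refs = {}
    for line in output.splitlines():
        if "\t" not in line:          # skip blank or malformed lines
            continue
        sha, ref = line.split("\t", 1)
        refs[ref.strip()] = sha.strip()
    return refs


def find_matching_refs(refs: dict, wanted: str) -> list:
    """Return refs that equal `wanted` or end with '/<wanted>'."""
    return sorted(r for r in refs if r == wanted or r.endswith("/" + wanted))


def remote_refs(repo_url: str) -> dict:
    """Ask the remote for its refs; requires git and network access."""
    result = subprocess.run(
        ["git", "ls-remote", repo_url],
        check=True, capture_output=True, text=True,
    )
    return parse_ls_remote(result.stdout)
```

For example, find_matching_refs(remote_refs("https://github.com/apache/hbase.git"), "1.1.9") lists every remote ref ending in that version string. If the result is empty, or the only match sits under a prefix, the ref in bigtop.bom may need to be adjusted to the full tag or branch name.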


2017-06-29 23:55 GMT+08:00 Konstantin Boudnik <co...@apache.org>:

> Sorry, which component are you trying to build from Git? I can take a
> look later today...
>
> Cos
> --
>   Take care,
> Konstantin (Cos) Boudnik
> 2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622
>
> Disclaimer: Opinions expressed in this email are those of the author,
> and do not necessarily represent the views of any company the author
> might be affiliated with at the moment of writing.
>
>
> On Thu, Jun 29, 2017 at 7:00 AM, Evans Ye <ev...@apache.org> wrote:
> > I followed the guide in readme.md for building from git. However, I had no
> > luck. Has anyone tried it recently?
> >
> > On Thu, Jun 29, 2017 at 10:53 AM, Konstantin Boudnik <co...@apache.org> wrote:
> >
> >> On Wed, Jun 21, 2017 at 10:53PM, Olaf Flebbe wrote:
> >> > Andrew,
> >> >
> >> > sorry for sounding dismissive.
> >>
> >> For what it's worth, I don't think you were dismissive. But whatever...
> >>
> >> I actually agree with your point: tarballs could easily be derived from
> >> the packages. And if anyone needs them we can have a target in the build
> >> system (please submit the patch), but I would be really against having
> >> them as part of the standard set of artifacts.
> >>
> >> > An rpm can be converted to a cpio payload. A deb is an ar archive
> >> > containing data.tar.gz with the files as payload.
> >> >
> >> > Details can be looked up.
> >> >
> >> > Nevertheless, please do not call "tar" an "improvement" with respect
> >> > to deployment compared to "rpm" or "deb". I see the use case of
> >> > extracting just the content without all the dependency management,
> >> > service management, verification and so on. Please just use rpm2cpio
> >> > or ar.
> >>
> >> +1
> >>
> >> Cos
> >>
>

Re: [DISCUSS] The future of Apache Bigtop

Posted by Konstantin Boudnik <co...@apache.org>.

Sorry, which component are you trying to build from Git? I can take a
look later today...

Cos
--
  Take care,
Konstantin (Cos) Boudnik
2CAC 8312 4870 D885 8616  6115 220F 6980 1F27 E622

Disclaimer: Opinions expressed in this email are those of the author,
and do not necessarily represent the views of any company the author
might be affiliated with at the moment of writing.


On Thu, Jun 29, 2017 at 7:00 AM, Evans Ye <ev...@apache.org> wrote:
> I follow the guide on readme.md for building from git. However I got no
> luck. Does someone tried it recently?
>
> Konstantin Boudnik <co...@apache.org>於 2017年6月29日 週四，上午10:53寫道：
>
>> On Wed, Jun 21, 2017 at 10:53PM, Olaf Flebbe wrote:
>> > Andrew,
>> >
>> > sorry to be sounding dismissive.
>>
>> For it worth - I don't think you were dismissive. But whatever...
>>
>> I actually agree with your point: tarballs could be easily derived from the
>> packages. And if anyone needs them we can have a target in the build system
>> (please submit the patch), but I would be really against having them as a
>> part
>> of the standard artifacts' set.
>>
>> > rpm can be converted to cpio payload. deb is an ar archive containing
>> > data.tar,gz with the files as payload.
>> >
>> > Details can be looked up.
>> >
>> > Nevertheless, please do not call "tar" an "improvement" with respect to
>> > deployment compared to "rpm" or "deb". I see the usecase to extract just
>> the
>> > content without all the dependency management, service management,
>> > verification and so on. Please just use rpm2cpio or ar.
>>
>> +1
>>
>> Cos
>>
>> > > Am 21.06.2017 um 22:08 schrieb Andrew Purtell <ap...@apache.org>:
>> > >
>> > >> you surely are making jokes when you are saying TAR is an improvement
>> > > with respect to RPM/DEB.You surely know that you can unpack every RPM
>> > > straight to the filesystem (DEB requires two steps), in case you'll
>> like to.
>> > >
>> > > Not a joke and the condescension isn't helpful either.
>> > >
>> > >
>> > > On Wed, Jun 21, 2017 at 10:02 AM, Olaf Flebbe <of...@oflebbe.de> wrote:
>> > >
>> > >> Hi Andrew,
>> > >>
>> > >> you surely are making jokes when you are saying TAR is an improvement
>> with
>> > >> respect to RPM/DEB.You surely know that you can unpack every RPM
>> straight
>> > >> to the filesystem (DEB requires two steps), in case you'll like to.
>> > >>
>> > >> You surely know that one can easily host a complete docker based
>> hadoop
>> > >> cluster on a developer machine in the current git of bigtop. And that
>> > >> docker toolbox, docker engine, docker for mac integrates really well
>> with
>> > >> Windows, Linux and MacOSX, working right out of the box (at least on
>> MacOSX
>> > >> and Linux) as it is right now within bigtop without manually tweaking
>> > >> config files.
>> > >>
>> > >> I see no point in reproducing hive, hbase, ... hadoop tests -- most of
>> > >> them single machine, fake cluster environment  -- when we can have
>> the real
>> > >> thing, a cluster where we use docker for isolating nodes. When the
>> tests do
>> > >> not really work portable, that's a problem of other projects, not
>> ours.
>> > >> Let's fix it there.
>> > >>
>> > >> IMHO, if we could orchestrate k8s (kubernetes) (or docker-swarm, my
>> > >> favorite) we could even chose to use a single host with some docker
>> > >> instances or scale out to a cloud environment and have a reproducable
>> > >> system without tweaking files. Of course there is much work to do to
>> port
>> > >> tests to the cloud environment, but these would be a tremendous value
>> added.
>> > >>
>> > >> Olaf

Re: [DISCUSS] The future of Apache Bigtop

Posted by Evans Ye <ev...@apache.org>.

I followed the guide in README.md for building from git, but had no luck.
Has anyone tried it recently?

On Thu, Jun 29, 2017 at 10:53 AM, Konstantin Boudnik <co...@apache.org> wrote:

> On Wed, Jun 21, 2017 at 10:53PM, Olaf Flebbe wrote:
> > Andrew,
> >
> > Sorry for sounding dismissive.
>
> For what it's worth, I don't think you were dismissive. But whatever...
>
> I actually agree with your point: tarballs could easily be derived from the
> packages. If anyone needs them, we can add a target to the build system
> (please submit a patch), but I would be really against making them part of
> the standard set of artifacts.
>
> > An rpm can be converted to a cpio payload. A deb is an ar archive
> > containing data.tar.gz with the files as payload.
> >
> > Details can be looked up.
> >
> > Nevertheless, please do not call "tar" an "improvement" over "rpm" or
> > "deb" with respect to deployment. I do see the use case of extracting just
> > the content without all the dependency management, service management,
> > verification and so on. For that, please just use rpm2cpio or ar.
>
> +1
>
> Cos

Re: [DISCUSS] The future of Apache Bigtop

Posted by Konstantin Boudnik <co...@apache.org>.

On Wed, Jun 21, 2017 at 10:53PM, Olaf Flebbe wrote:
> Andrew,
> 
> Sorry for sounding dismissive.

For what it's worth, I don't think you were dismissive. But whatever...

I actually agree with your point: tarballs could easily be derived from the
packages. If anyone needs them, we can add a target to the build system
(please submit a patch), but I would be really against making them part of
the standard set of artifacts.

> An rpm can be converted to a cpio payload. A deb is an ar archive
> containing data.tar.gz with the files as payload.
>
> Details can be looked up.
>
> Nevertheless, please do not call "tar" an "improvement" over "rpm" or
> "deb" with respect to deployment. I do see the use case of extracting just
> the content without all the dependency management, service management,
> verification and so on. For that, please just use rpm2cpio or ar.

+1

Cos


Re: [DISCUSS] The future of Apache Bigtop

Posted by Olaf Flebbe <of...@oflebbe.de>.

Andrew,

Sorry for sounding dismissive.

An rpm can be converted to a cpio payload. A deb is an ar archive containing data.tar.gz with the files as payload.

Details can be looked up.

Nevertheless, please do not call "tar" an "improvement" over "rpm" or "deb" with respect to deployment. I do see the use case of extracting just the content without all the dependency management, service management, verification and so on. For that, please just use rpm2cpio or ar.
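
To make the ar route concrete, here is a rough sketch (not part of Bigtop) that reads a .deb with nothing but the Python standard library. The helper names read_ar_members and list_deb_payload are made up for this example. It assumes short member names (no GNU long-name table) and a data.tar.* member whose compression Python's tarfile module can open.

```python
import io
import tarfile

AR_MAGIC = b"!<arch>\n"   # global header of every ar archive
HEADER_LEN = 60           # fixed size of each member header


def read_ar_members(blob):
    """Return {member_name: bytes} for a simple ar archive such as a .deb."""
    if not blob.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = {}
    pos = len(AR_MAGIC)
    while pos + HEADER_LEN <= len(blob):
        header = blob[pos:pos + HEADER_LEN]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Name is 16 bytes, space padded; some writers append a trailing "/".
        name = header[0:16].decode("ascii").strip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + HEADER_LEN
        members[name] = blob[start:start + size]
        # Member data is padded with one byte when its size is odd.
        pos = start + size + (size % 2)
    return members


def list_deb_payload(deb_bytes):
    """List the file names stored in the data.tar.* member of a .deb."""
    members = read_ar_members(deb_bytes)
    data_name = next((n for n in members if n.startswith("data.tar")), None)
    if data_name is None:
        raise ValueError("no data.tar.* member found")
    payload = io.BytesIO(members[data_name])
    with tarfile.open(fileobj=payload, mode="r:*") as tar:
        return tar.getnames()
```

For a real package one would read the file in binary mode and pass the bytes to list_deb_payload; calling extractall on the same tar object would unpack the payload instead of listing it. For RPMs, piping rpm2cpio into cpio does the equivalent job.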

Olaf


> On 21.06.2017 at 22:08, Andrew Purtell <ap...@apache.org> wrote:
>
>> You surely are joking when you say TAR is an improvement with respect to
>> RPM/DEB. You surely know that you can unpack every RPM straight to the
>> filesystem (DEB requires two steps), if you want to.
>
> Not a joke, and the condescension isn't helpful either.

Re: [DISCUSS] The future of Apache Bigtop

Posted by Andrew Purtell <ap...@apache.org>.

> You surely are joking when you say TAR is an improvement with respect to
> RPM/DEB. You surely know that you can unpack every RPM straight to the
> filesystem (DEB requires two steps), if you want to.

Not a joke, and the condescension isn't helpful either.


On Wed, Jun 21, 2017 at 10:02 AM, Olaf Flebbe <of...@oflebbe.de> wrote:

> Hi Andrew,
>
> You surely are joking when you say TAR is an improvement with respect to
> RPM/DEB. You surely know that you can unpack every RPM straight to the
> filesystem (DEB requires two steps), if you want to.
>
> You surely know that one can easily host a complete Docker-based Hadoop
> cluster on a developer machine with the current git of Bigtop, and that
> Docker Toolbox, Docker Engine and Docker for Mac integrate really well with
> Windows, Linux and MacOSX, working right out of the box (at least on MacOSX
> and Linux) as it is right now within Bigtop, without manually tweaking
> config files.
>
> I see no point in reproducing the Hive, HBase, ... Hadoop tests, most of
> them single-machine, fake-cluster environments, when we can have the real
> thing: a cluster where we use Docker to isolate nodes. When the tests are
> not really portable, that's a problem of the other projects, not ours.
> Let's fix it there.
>
> IMHO, if we could orchestrate with k8s (Kubernetes) (or docker-swarm, my
> favorite), we could even choose between a single host with some Docker
> instances and scaling out to a cloud environment, and have a reproducible
> system without tweaking files. Of course there is much work to do to port
> the tests to the cloud environment, but this would add tremendous value.
>
> Olaf
>
>
>
>
> > On 20.06.2017 at 23:12, Andrew Purtell <andrew.purtell@gmail.com> wrote:
> >
> > Yeah, we can build from git repos. Instead of an archive URL, you can
> > specify a repo for each component and reference it by git URL and branch,
> > tag, or SHA.
> >
> > Regarding tarball build targets, I was thinking of it as a packaging
> > improvement, an additional packaging target. It could make integration
> > testing more convenient too if you are not using containers or bare metal
> > systems where you own the whole filesystem.
> >
> >> On Jun 19, 2017, at 6:13 AM, Evans Ye <ev...@apache.org> wrote:
> >>
> >> Hi Andy,
> >>
> >> Is it easier to have multiple tarballs to set up a cluster for
> >> integration tests?
> >> I'm not on the Hadoop/HBase developer side, so I have zero context. I was
> >> just assuming that deploying a cluster for integration tests would be a
> >> beneficial feature for them.
> >>
> >> When I talked with the Hadoop and HBase guys at Cloudera, they
> >> mentioned two things specifically for Bigtop:
> >>
> >> a). build from git (which I think you've already contributed to Bigtop)
> >> b). an easy-to-run integration test framework
> >>
> >> I'm happy to have b). because either way we need to have it in our CI.
> >>
> >>
> >> 2017-06-19 5:04 GMT+08:00 Andrew Purtell <ap...@apache.org>:
> >>
> >>> IMHO, the easiest and fastest way to make the distribution aspect more
> >>> useful to more folks is to add a build target that generates plain
> >>> tarballs instead of distro-specific Linux packaging. People like us can
> >>> take the tarballs and unpack them in environments where, for various
> >>> reasons, we don't want to do RPM management. Vendors like Cloudera can
> >>> convert tarballs to parcels, or whatever proprietary format is desired.
> >>>
> >>>
> >>>
> >>>> On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye <ev...@apache.org> wrote:
> >>>>
> >>>> Hi folks,
> >>>>
> >>>> Many things happened during DataWorks Summit San Jose 2017. Some of the
> >>>> folks (Cos, Roman, Amir, Nate, Mike, etc.) gathered to discuss 1.2.1
> >>>> and the future 1.3 release of Bigtop. I'd like to bring those
> >>>> discussions back to the mailing list so that those who couldn't make it
> >>>> can still join us for further discussion:
> >>>>
> >>>> * 1.2.1 release
> >>>> a). Some folks expect Docker on YARN to be backported to 2.7.4
> >>>> and included in the release
> >>>> b). Get rotted code out of our code base: packaging, deployment,
> >>>> testing, etc.
> >>>> c). Get integration tests to work in CI
> >>>>
> >>>> * 1.3.0 release
> >>>> a). More machine learning integrations
> >>>> b). K8S integration will be an interesting topic
> >>>>
> >>>> Please help me complete the list if I missed something. :)
> >>>>
> >>>>
> >>>> OTOH, for me specifically, I visited Cloudera to give a tech talk. I met
> >>>> Sean Mackrory and their Hadoop and HBase lead. The pain point they've
> >>>> had for a long time is not having an integration test framework for
> >>>> their work on the bleeding edge. For example, does a specific patch
> >>>> from Hadoop break HBase or Hive?
> >>>>
> >>>> My thinking is that this is what Bigtop tried to solve from the very
> >>>> beginning. We are supposed to have folks from multiple projects work
> >>>> with us to upgrade packages, and use our frameworks to properly
> >>>> integrate and test their code with other components.
> >>>>
> >>>> So, the future of Bigtop. I think working closely with the other
> >>>> communities is a better way for us to move forward. But that means
> >>>> something needs to change. For example, our distribution is, from a
> >>>> developer's perspective, somewhat old, so it cannot support integration
> >>>> and testing on the bleeding edge. If we would still like to release
> >>>> something suggested for production only, one solution is to have both
> >>>> dev and stable releases in Bigtop, so developers can work on the dev
> >>>> branch and test against the newest components. In that case, people from
> >>>> other communities might be able to help us upgrade the package to the
> >>>> newer version,
> >>>> which makes things easier.
> >>>>
> >>>> What do you guys think? Please join me for the discussion.
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>>
> >>>  - Andy
> >>>
> >>> If you are given a choice, you believe you have acted freely. - Raymond
> >>> Teller (via Peter Watts)
> >>>
>
>


-- 
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk

Re: [DISCUSS] The future of Apache Bigtop

Posted by Olaf Flebbe <of...@oflebbe.de>.

Hi Andrew,

you surely are joking when you say TAR is an improvement over RPM/DEB. You surely know that you can unpack any RPM straight to the filesystem (DEB requires two steps), if you'd like to.

You surely know that one can easily host a complete Docker-based Hadoop cluster on a developer machine using the current git of Bigtop. And that Docker Toolbox, Docker Engine, and Docker for Mac integrate really well with Windows, Linux and MacOSX, working right out of the box (at least on MacOSX and Linux) as it is right now within Bigtop, without manually tweaking config files.

I see no point in reproducing the Hive, HBase, ... Hadoop tests -- most of them run in a single-machine, fake cluster environment -- when we can have the real thing, a cluster where we use Docker to isolate nodes. If the tests are not really portable, that's a problem of the other projects, not ours. Let's fix it there.

IMHO, if we could orchestrate k8s (Kubernetes) (or docker-swarm, my favorite) we could even choose to use a single host with some Docker instances or scale out to a cloud environment, and have a reproducible system without tweaking files. Of course there is much work to do to port tests to the cloud environment, but this would add tremendous value.

Olaf




> Am 20.06.2017 um 23:12 schrieb Andrew Purtell <an...@gmail.com>:
> 
> Yeah, we can build from git repos. Instead of archive URL you can specify for each component a repo and reference by git-URL and branch, tag, or SHA.
> 
> Regarding tarball build targets, I was thinking of it as a packaging improvement, an additional packaging target. It could make integration testing more convenient too if you are not using containers or bare metal systems where you own the whole filesystem.
> 
>> On Jun 19, 2017, at 6:13 AM, Evans Ye <ev...@apache.org> wrote:
>> 
>> Hi Andy,
>> 
>> Is it easier to have multiple tarballs to setup a cluster for integration
>> tests?
>> I'm not on the Hadoop/HBase developer side so I have zero context. I was
>> just assuming that deploying a cluster for integration tests would be a
>> beneficial feature for them.
>> 
>> Bringing up my discussion with Hadoop and HBase guys at Cloudera, them
>> mentioned two things specifically for Bigtop:
>> 
>> a). build from git (which I think you've contributed that in Bigtop already)
>> b). easy to run integration test framework
>> 
>> I'm happy to have b). because either way we need to have it in our CI.
>> 
>> 
>> 2017-06-19 5:04 GMT+08:00 Andrew Purtell <ap...@apache.org>:
>> 
>>> IMHO, the easiest and fastest way to get the distribution aspect to be more
>>> useful to more folks is to add a build target that generates plain tarballs
>>> instead of distro-specific Linux packaging. People like us can take the
>>> tarballs and unpack them to environments where for various reasons we don't
>>> want to do RPM management. Vendors like Cloudera can convert tarballs to
>>> parcels, or whatever proprietary format is desired.
>>> 
>>> 
>>> 
>>>> On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye <ev...@apache.org> wrote:
>>>> 
>>>> Hi folks,
>>>> 
>>>> Many things happened during DataWorks Summit San Jose 2017. Some of the
>>>> folks(Cos, Roman, Amir, Nate, Mike, etc) gathered together to discuss
>>> 1.2.1
>>>> and the future 1.3 release of Bigtop. I'd like to get back those
>>>> discussions to the mailing list so that who can't make it there can still
>>>> be with us for further discussions:
>>>> 
>>>> * 1.2.1 release
>>>> a). Some of the folks expecting Docker on YARN to be back ported to 2.7.4
>>>> and included in the release
>>>> b). Get rotted code out of our code base: packaging, deployment, testing,
>>>> etc
>>>> c). Get integration test to work in CI
>>>> 
>>>> * 1.3.0 release
>>>> a). More machine learning integrations
>>>> b). K8S integration will be an interesting topic
>>>> 
>>>> Please help me to complete the list if I miss something. :)
>>>> 
>>>> 
>>>> OTOH, for me specifically, I visited Cloudera for doing a tech talk. I
>>> meet
>>>> Sean Mackrory and there Hadoop and HBase lead. The pain point they're
>>>> having for a long time is not having an integration test framework for
>>>> there work on the bleeding edge. For example, whether a specific patch
>>> from
>>>> Hadoop breaks HBase or Hive?
>>>> 
>>>> My thinking towards this is this is what Bigtop tries to solve at the
>>> very
>>>> beginning. We supposed to have folks from multiple projects to work with
>>> us
>>>> to upgrade  packages, and use our frameworks to properly integrate, test
>>>> their code with other components.
>>>> 
>>>> So, the future of Bigtop. I think tightly work with the other communities
>>>> is a better way we move forward. But, that means something need to be
>>>> changed. For example, our distribution is somehow, from developers
>>>> perspective, old. Which can not support the integration and testing on
>>> the
>>>> bleeding edge. If we still like to  release something suggested for
>>>> Production only, one of the solution is to have both dev and stable
>>>> releases in Bigtop, so developers can work on the dev branch and test
>>>> against newest components. In that case, people from other communities
>>>> might be possible to help us upgrade the package to the newer version,
>>>> which makes things easier.
>>>> 
>>>> What do you guys think? Please join me for the discussion.
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> 
>>>  - Andy
>>> 
>>> If you are given a choice, you believe you have acted freely. - Raymond
>>> Teller (via Peter Watts)
>>>

Re: [DISCUSS] The future of Apache Bigtop

Posted by Andrew Purtell <an...@gmail.com>.

Yeah, we can build from git repos. Instead of an archive URL, you can specify a repo for each component and reference it by git URL and branch, tag, or SHA.

Regarding tarball build targets, I was thinking of it as a packaging improvement, an additional packaging target. It could make integration testing more convenient too if you are not using containers or bare metal systems where you own the whole filesystem. 
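
To make the git-based source idea concrete, here is a minimal sketch of how a per-component repo plus branch, tag, or SHA could be turned into fetch commands. The `GitSource` class, the helper names, and the 7-to-40 hex digit heuristic are all assumptions made for illustration; this is not Bigtop's actual BOM format or build code.

```python
import re
from dataclasses import dataclass
from typing import List


# Hypothetical description of one component's source (not Bigtop's real format).
@dataclass(frozen=True)
class GitSource:
    repo: str  # git URL of the upstream project
    ref: str   # branch name, tag name, or commit SHA


def looks_like_sha(ref: str) -> bool:
    """Heuristic (assumption): 7 to 40 lowercase hex digits means a commit SHA."""
    return re.fullmatch(r"[0-9a-f]{7,40}", ref) is not None


def fetch_commands(src: GitSource, dest: str) -> List[List[str]]:
    """Return the git commands that would fetch `src` into directory `dest`."""
    if looks_like_sha(src.ref):
        # `git clone --branch` does not accept a SHA, so clone first
        # and then check out the commit.
        return [
            ["git", "clone", src.repo, dest],
            ["git", "-C", dest, "checkout", src.ref],
        ]
    # `--branch` accepts both branch and tag names, so a shallow clone is enough.
    return [["git", "clone", "--depth", "1", "--branch", src.ref, src.repo, dest]]
```

The commands are only built here, not executed, so a caller can log or review them before running them. A branch whose name happens to consist only of hex digits would be misclassified by the heuristic.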

> On Jun 19, 2017, at 6:13 AM, Evans Ye <ev...@apache.org> wrote:
> 
> Hi Andy,
> 
> Is it easier to have multiple tarballs to setup a cluster for integration
> tests?
> I'm not on the Hadoop/HBase developer side so I have zero context. I was
> just assuming that deploying a cluster for integration tests would be a
> beneficial feature for them.
> 
> Bringing up my discussion with Hadoop and HBase guys at Cloudera, them
> mentioned two things specifically for Bigtop:
> 
> a). build from git (which I think you've contributed that in Bigtop already)
> b). easy to run integration test framework
> 
> I'm happy to have b). because either way we need to have it in our CI.
> 
> 
> 2017-06-19 5:04 GMT+08:00 Andrew Purtell <ap...@apache.org>:
> 
>> IMHO, the easiest and fastest way to get the distribution aspect to be more
>> useful to more folks is to add a build target that generates plain tarballs
>> instead of distro-specific Linux packaging. People like us can take the
>> tarballs and unpack them to environments where for various reasons we don't
>> want to do RPM management. Vendors like Cloudera can convert tarballs to
>> parcels, or whatever proprietary format is desired.
>> 
>> 
>> 
>>> On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye <ev...@apache.org> wrote:
>>> 
>>> Hi folks,
>>> 
>>> Many things happened during DataWorks Summit San Jose 2017. Some of the
>>> folks(Cos, Roman, Amir, Nate, Mike, etc) gathered together to discuss
>> 1.2.1
>>> and the future 1.3 release of Bigtop. I'd like to get back those
>>> discussions to the mailing list so that who can't make it there can still
>>> be with us for further discussions:
>>> 
>>> * 1.2.1 release
>>> a). Some of the folks expecting Docker on YARN to be back ported to 2.7.4
>>> and included in the release
>>> b). Get rotted code out of our code base: packaging, deployment, testing,
>>> etc
>>> c). Get integration test to work in CI
>>> 
>>> * 1.3.0 release
>>> a). More machine learning integrations
>>> b). K8S integration will be an interesting topic
>>> 
>>> Please help me to complete the list if I miss something. :)
>>> 
>>> 
>>> OTOH, for me specifically, I visited Cloudera for doing a tech talk. I
>> meet
>>> Sean Mackrory and there Hadoop and HBase lead. The pain point they're
>>> having for a long time is not having an integration test framework for
>>> there work on the bleeding edge. For example, whether a specific patch
>> from
>>> Hadoop breaks HBase or Hive?
>>> 
>>> My thinking towards this is this is what Bigtop tries to solve at the
>> very
>>> beginning. We supposed to have folks from multiple projects to work with
>> us
>>> to upgrade  packages, and use our frameworks to properly integrate, test
>>> their code with other components.
>>> 
>>> So, the future of Bigtop. I think tightly work with the other communities
>>> is a better way we move forward. But, that means something need to be
>>> changed. For example, our distribution is somehow, from developers
>>> perspective, old. Which can not support the integration and testing on
>> the
>>> bleeding edge. If we still like to  release something suggested for
>>> Production only, one of the solution is to have both dev and stable
>>> releases in Bigtop, so developers can work on the dev branch and test
>>> against newest components. In that case, people from other communities
>>> might be possible to help us upgrade the package to the newer version,
>>> which makes things easier.
>>> 
>>> What do you guys think? Please join me for the discussion.
>>> 
>> 
>> 
>> 
>> --
>> Best regards,
>> 
>>   - Andy
>> 
>> If you are given a choice, you believe you have acted freely. - Raymond
>> Teller (via Peter Watts)
>>

Re: [DISCUSS] The future of Apache Bigtop

Posted by Evans Ye <ev...@apache.org>.

Hi Andy,

Is it easier to have multiple tarballs to set up a cluster for integration
tests?
I'm not on the Hadoop/HBase developer side so I have zero context. I was
just assuming that deploying a cluster for integration tests would be a
beneficial feature for them.

Going back to my discussion with the Hadoop and HBase guys at Cloudera, they
mentioned two things specifically for Bigtop:

a). build from git (which I think you've already contributed to Bigtop)
b). an easy-to-run integration test framework

I'm happy to have b). because either way we need to have it in our CI.


2017-06-19 5:04 GMT+08:00 Andrew Purtell <ap...@apache.org>:

> IMHO, the easiest and fastest way to get the distribution aspect to be more
> useful to more folks is to add a build target that generates plain tarballs
> instead of distro-specific Linux packaging. People like us can take the
> tarballs and unpack them to environments where for various reasons we don't
> want to do RPM management. Vendors like Cloudera can convert tarballs to
> parcels, or whatever proprietary format is desired.
>
>
>
> On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye <ev...@apache.org> wrote:
>
> > Hi folks,
> >
> > Many things happened during DataWorks Summit San Jose 2017. Some of the
> > folks(Cos, Roman, Amir, Nate, Mike, etc) gathered together to discuss
> 1.2.1
> > and the future 1.3 release of Bigtop. I'd like to get back those
> > discussions to the mailing list so that who can't make it there can still
> > be with us for further discussions:
> >
> > * 1.2.1 release
> > a). Some of the folks expecting Docker on YARN to be back ported to 2.7.4
> > and included in the release
> > b). Get rotted code out of our code base: packaging, deployment, testing,
> > etc
> > c). Get integration test to work in CI
> >
> > * 1.3.0 release
> > a). More machine learning integrations
> > b). K8S integration will be an interesting topic
> >
> > Please help me to complete the list if I miss something. :)
> >
> >
> > OTOH, for me specifically, I visited Cloudera for doing a tech talk. I
> meet
> > Sean Mackrory and there Hadoop and HBase lead. The pain point they're
> > having for a long time is not having an integration test framework for
> > there work on the bleeding edge. For example, whether a specific patch
> from
> > Hadoop breaks HBase or Hive?
> >
> > My thinking towards this is this is what Bigtop tries to solve at the
> very
> > beginning. We supposed to have folks from multiple projects to work with
> us
> > to upgrade  packages, and use our frameworks to properly integrate, test
> > their code with other components.
> >
> > So, the future of Bigtop. I think tightly work with the other communities
> > is a better way we move forward. But, that means something need to be
> > changed. For example, our distribution is somehow, from developers
> > perspective, old. Which can not support the integration and testing on
> the
> > bleeding edge. If we still like to  release something suggested for
> > Production only, one of the solution is to have both dev and stable
> > releases in Bigtop, so developers can work on the dev branch and test
> > against newest components. In that case, people from other communities
> > might be possible to help us upgrade the package to the newer version,
> > which makes things easier.
> >
> > What do you guys think? Please join me for the discussion.
> >
>
>
>
> --
> Best regards,
>
>    - Andy
>
> If you are given a choice, you believe you have acted freely. - Raymond
> Teller (via Peter Watts)
>

Re: [DISCUSS] The future of Apache Bigtop

Posted by Andrew Purtell <ap...@apache.org>.

IMHO, the easiest and fastest way to get the distribution aspect to be more
useful to more folks is to add a build target that generates plain tarballs
instead of distro-specific Linux packaging. People like us can take the
tarballs and unpack them to environments where for various reasons we don't
want to do RPM management. Vendors like Cloudera can convert tarballs to
parcels, or whatever proprietary format is desired.
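
As a rough illustration of the tarball idea, the sketch below packs a build output directory into a plain tarball and unpacks it under an arbitrary prefix, with no package manager involved. The function names and the directory layout are assumptions for illustration only; this is not an existing Bigtop build target.

```python
import tarfile
from pathlib import Path


def make_tarball(build_dir: Path, out_file: Path) -> Path:
    """Pack build_dir into a gzip-compressed tarball, rooted at its own name."""
    with tarfile.open(str(out_file), "w:gz") as tar:
        # arcname keeps paths inside the archive relative (e.g. hbase-x.y/bin/...).
        tar.add(str(build_dir), arcname=build_dir.name)
    return out_file


def unpack_tarball(tarball: Path, prefix: Path) -> Path:
    """Unpack under prefix, which can be any directory the user may write to."""
    prefix.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(tarball), "r:gz") as tar:
        # Only do this with tarballs you built yourself or otherwise trust.
        tar.extractall(str(prefix))
    return prefix
```

Because the archive is rooted at the component directory name, the same tarball can be unpacked under /opt, a home directory, or a scratch area, which is the case where RPM management is not wanted.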



On Sun, Jun 18, 2017 at 12:13 PM, Evans Ye <ev...@apache.org> wrote:

> Hi folks,
>
> Many things happened during DataWorks Summit San Jose 2017. Some of the
> folks(Cos, Roman, Amir, Nate, Mike, etc) gathered together to discuss 1.2.1
> and the future 1.3 release of Bigtop. I'd like to get back those
> discussions to the mailing list so that who can't make it there can still
> be with us for further discussions:
>
> * 1.2.1 release
> a). Some of the folks expecting Docker on YARN to be back ported to 2.7.4
> and included in the release
> b). Get rotted code out of our code base: packaging, deployment, testing,
> etc
> c). Get integration test to work in CI
>
> * 1.3.0 release
> a). More machine learning integrations
> b). K8S integration will be an interesting topic
>
> Please help me to complete the list if I miss something. :)
>
>
> OTOH, for me specifically, I visited Cloudera for doing a tech talk. I meet
> Sean Mackrory and there Hadoop and HBase lead. The pain point they're
> having for a long time is not having an integration test framework for
> there work on the bleeding edge. For example, whether a specific patch from
> Hadoop breaks HBase or Hive?
>
> My thinking towards this is this is what Bigtop tries to solve at the very
> beginning. We supposed to have folks from multiple projects to work with us
> to upgrade  packages, and use our frameworks to properly integrate, test
> their code with other components.
>
> So, the future of Bigtop. I think tightly work with the other communities
> is a better way we move forward. But, that means something need to be
> changed. For example, our distribution is somehow, from developers
> perspective, old. Which can not support the integration and testing on the
> bleeding edge. If we still like to  release something suggested for
> Production only, one of the solution is to have both dev and stable
> releases in Bigtop, so developers can work on the dev branch and test
> against newest components. In that case, people from other communities
> might be possible to help us upgrade the package to the newer version,
> which makes things easier.
>
> What do you guys think? Please join me for the discussion.
>



-- 
Best regards,

   - Andy

If you are given a choice, you believe you have acted freely. - Raymond
Teller (via Peter Watts)