Posted to dev@bigtop.apache.org by Olaf Flebbe <of...@oflebbe.de> on 2017/12/16 17:57:36 UTC

Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Hi Jeff,

Regarding your comment on  https://issues.apache.org/jira/browse/BIGTOP-2754

Well, it is a pretty ugly story. The thread leading to the zookeeper revert is here:

https://lists.apache.org/thread.html/c2e9705ee7bdc8afb7b9e79683dda18b0079aec215079b608862c99b@%3Cdev.bigtop.apache.org%3E

In it you will find the unresolved Hadoop JIRA tickets:

https://issues.apache.org/jira/browse/HADOOP-12928
https://issues.apache.org/jira/browse/HADOOP-13413

However, Jay has floated the proposal several times that we should "go container". I am now ready to see this as necessary.

I think the hadoop ecosystem should interoperate on stable network protocols, not by sharing Java artifacts.

This would put the final nail in the coffin of the attempt to harmonise protocols by symlinking client and server packages in order to enforce the same versions of client and server. We would give up most of our inter-package dependencies: every package would get its own dependencies bundled, pulled from whatever version the upstream project decided on.

This may solve issues like the zookeeper/hadoop interdependency. Supplying a zookeeper-3.11 server package would no longer be an issue. Packages needing a zookeeper client would use their own bundled zookeeper client to access the server. And then think of "Hadoop-3" as a further example.
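
As a minimal sketch of what "interoperating on a network protocol" can look like, the snippet below talks to a ZooKeeper server using its four-letter-word admin command "ruok" (the server answers "imok") over a plain TCP socket, without sharing any Java artifact with the server. The function names are hypothetical, the default port 2181 is an assumption about the deployment, and newer ZooKeeper releases may require the command to be enabled in the server configuration.

```python
import socket


def four_letter_word(sock: socket.socket, word: bytes) -> bytes:
    """Send a ZooKeeper four-letter command on a connected socket.

    The server replies and then closes the connection, so we read until EOF.
    """
    if len(word) != 4:
        raise ValueError("four-letter words are exactly 4 bytes long")
    sock.sendall(word)
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:  # EOF: the server closed its side
            break
        chunks.append(chunk)
    return b"".join(chunks)


def zookeeper_is_ok(host: str, port: int = 2181, timeout: float = 5.0) -> bool:
    """Return True if the server answers "ruok" with "imok".

    Host and port are deployment-specific assumptions.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return four_letter_word(sock, b"ruok") == b"imok"
```

A caller in another container would only need the server's host name and port, for example zookeeper_is_ok("zookeeper", 2181); which client library version it bundles for real work is then its own business.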

For dependencies with native libraries this may lead to issues, which IMHO can be solved.

I am no longer in a position to run a reasonably large hadoop bigtop infra, so I am not sure whether this path will lead to a disaster area, since I cannot test long-term stability any more.

Other Bigtoppers, please comment, since this will have a pretty big impact.

I would like to see traditional Bigtop bare metal -- no containers -- supported as well. At least as a corner case.

Cheers
Olaf




> On 16.12.2017 at 02:54, Jeff Widman (JIRA) <ji...@apache.org> wrote:
> 
> 
>    [ https://issues.apache.org/jira/browse/BIGTOP-2754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16293553#comment-16293553 ]
> 
> Jeff Widman commented on BIGTOP-2754:
> -------------------------------------
> 
> Is there a ticket in Hadoop tracking this?
> 
> Zookeeper 3.4.6-->3.4.9 have a couple of bugs affecting snapshot garbage collection (ZOOKEEPER-1797 and ZOOKEEPER-2574), so we want to upgrade to 3.4.11.
> 
> However, when I tried pulling that from Zookeeper repo, I noticed that it pins Java 6 which breaks on Ubuntu 16.04: https://github.com/apache/zookeeper/blob/release-3.4.6/src/packages/update-zookeeper-env.sh#L152
> 
> I looked at the Zookeeper master branch, and looks like Zookeeper decided in ZOOKEEPER-1604 to delegate packaging to the Bigtop project. However, when I came here, I was disappointed to see that the newest Zookeeper version that Bigtop packages is 3.4.6....
> 
> For those of us who use Zookeeper apart from Hadoop but don't want to deal with the headaches of packaging it'd be nice to figure out a way to get a newer zookeeper package released. I'm happy to take a look at the Hadoop project to see if they can upgrade to 3.4.11, but not sure where to find their rationale for pinning to 3.4.6...
> 
>> Revert BIGTOP-2730: Upgrade Zookeeper to version 3.4.10
>> -------------------------------------------------------
>> 
>>                Key: BIGTOP-2754
>>                URL: https://issues.apache.org/jira/browse/BIGTOP-2754
>>            Project: Bigtop
>>         Issue Type: Bug
>>   Affects Versions: 1.2.0
>>           Reporter: Olaf Flebbe
>>           Assignee: Olaf Flebbe
>>            Fix For: 1.2.1, 1.3.0
>> 
>>        Attachments: BIGTOP-2754.patch
>> 
>> 
>> Since Hadoop enforces zookeeper-3.6 we will revert  BIGTOP-2730  in order to wait for upstream to deal with this issue.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.4.14#64029)


Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Olaf Flebbe <of...@oflebbe.de>.
Anton,

> 
> It is easy to run, but what about performance and security?
> 
> There are two ways:
>
> 1. True nested containers. In that case we have AUFS as storage in the nested
> containers. And we have to use --privileged.

Even plain yarn may need --privileged (because of yarn's own "normal" containerization).
I am not sure whether it would be possible to place the container workload on a docker volume.

Surely we will not want to place hadoop storage on overlay storage.

> 2. Fake nested containers, where we provide the docker socket from the host to
> the YARN container. In that case all containers run on the same level. But that
> way does not look safe.

Right, having access to a docker socket is never secure. But I am not sure we even need to expose the docker socket to the yarn payload.

Olaf





Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Anton Chevychalov <ca...@arenadata.io>.
Roman,

It is easy to run, but what about performance and security?

There are two ways:

1. True nested containers. In that case we have AUFS as storage in the nested
containers. And we have to use --privileged.
2. Fake nested containers, where we provide the docker socket from the host to
the YARN container. In that case all containers run on the same level. But that
way does not look safe.
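
The two options above can be sketched as the `docker run` command lines they imply. This is only an illustration: the function name and the image name are hypothetical, and the socket path is Docker's usual default on Linux, which a given host may override.

```python
from typing import List

# Usual default location of the Docker daemon socket on Linux (assumption).
DOCKER_SOCKET = "/var/run/docker.sock"


def yarn_container_args(image: str, mode: str) -> List[str]:
    """Build a `docker run` command line for a containerized YARN node.

    mode == "nested": a Docker daemon runs inside the container, which
                      requires the --privileged flag (option 1).
    mode == "socket": the host's Docker socket is bind-mounted into the
                      container, so payload containers are started by the
                      host daemon as siblings (option 2).
    """
    base = ["docker", "run", "-d"]
    if mode == "nested":
        return base + ["--privileged", image]
    if mode == "socket":
        return base + ["-v", f"{DOCKER_SOCKET}:{DOCKER_SOCKET}", image]
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    # "example/yarn-nodemanager" is a placeholder image name.
    for mode in ("nested", "socket"):
        print(" ".join(yarn_container_args("example/yarn-nodemanager", mode)))
```

Either way the trade-off stays as described: option 1 hands the container broad privileges on the host, and option 2 hands it control over the host's Docker daemon.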

On 27 December 2017 at 11:47, Roman Shaposhnik <ro...@shaposhnik.org> wrote:

> On Wed, Dec 27, 2017 at 12:13 AM, Anton Chevychalov <ca...@arenadata.io>
> wrote:
> > Note that new YARN  supports Docker as payload. So it is really difficult
> > to put that to containers.
>
> It actually isn't that difficult -- nested containers are pretty easy
> to do -- take
> a look at how Kubernetes folks are doing it for Minikube.
>
> Thanks,
> Roman.
>



-- 
Anton B Chevychalov
Team Lead at ArenaData.io

Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Wed, Dec 27, 2017 at 12:13 AM, Anton Chevychalov <ca...@arenadata.io> wrote:
> Note that new YARN  supports Docker as payload. So it is really difficult
> to put that to containers.

It actually isn't that difficult -- nested containers are pretty easy
to do -- take a look at how Kubernetes folks are doing it for Minikube.

Thanks,
Roman.

Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Anton Chevychalov <ca...@arenadata.io>.
Note that newer YARN supports Docker containers as payload, so it is really
difficult to put YARN itself into containers.

I am not sure about option 2a, mostly because installing the rpm alone is not
enough anyway.

What we need in order to start a service inside a container:
1. A JDK.
2. A place to store data (a volume), if the service has any.
3. A place to store logs (or a socket/device provided by syslog).
4. Configs:
4a. Configs generated from some external source based on some logic
(Ambari, etcd (k8s) or something else).
4b. Configs generated by an unknown external application and attached via a
volume.
4c. Configs fetched from git on startup.
In 4a and 4b we need some bootstrap logic inside the container.
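
The checklist above can be sketched as a small data structure that is turned into a `docker run` command line. Everything named here is an assumption made for illustration: the ServiceSpec fields, the in-container paths under /var/lib, /var/log and /etc, and the CONFIG_SOURCE environment variable that a hypothetical bootstrap script inside the image would read.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceSpec:
    name: str
    image: str                           # 1. image is assumed to ship a JDK
    data_volume: Optional[str] = None    # 2. volume for service data, if any
    log_volume: Optional[str] = None     # 3. volume for logs
    config_volume: Optional[str] = None  # 4b. configs attached via a volume
    config_source: Optional[str] = None  # 4a/4c. external source read by a
                                         #        (hypothetical) bootstrap script


def docker_run_args(spec: ServiceSpec) -> List[str]:
    """Translate the checklist into `docker run` arguments."""
    args = ["docker", "run", "-d", "--name", spec.name]
    if spec.data_volume:
        args += ["-v", f"{spec.data_volume}:/var/lib/{spec.name}"]
    if spec.log_volume:
        args += ["-v", f"{spec.log_volume}:/var/log/{spec.name}"]
    if spec.config_volume:
        # Mounted read-only: the container consumes configs, it does not own them.
        args += ["-v", f"{spec.config_volume}:/etc/{spec.name}/conf:ro"]
    if spec.config_source:
        args += ["-e", f"CONFIG_SOURCE={spec.config_source}"]
    args.append(spec.image)
    return args


if __name__ == "__main__":
    spec = ServiceSpec(
        name="zookeeper",
        image="example/zookeeper",  # placeholder image name
        data_volume="zk-data",
        log_volume="zk-logs",
        config_volume="zk-conf",
    )
    print(" ".join(docker_run_args(spec)))
```

The sketch makes the point of the list visible: the package install only covers the image, while data, logs and configs all have to be wired in from outside the container.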

-- 
Anton B Chevychalov
Team Lead at ArenaData.io



On 23 December 2017 at 17:34, Olaf Flebbe <of...@oflebbe.de> wrote:

> Hi Evans,
>
>
> > On 20.12.2017 at 13:01, Evans Ye <ev...@apache.org> wrote:
> >
> > I am 100% up for going container. We need to embrace the change, and right now
> > containers seem to be the dominant solution.
> >
> > Breaking this down into a concrete plan, we have some choices:
> >
> > 1. Leverage the docker provisioner to provision a cluster on swarm using compose
>
> AFAIK the docker provisioner is like having virtual machines in docker: it
> installs more than one service, or even all services, in one container.
>
> Why not split the different services into separate containers in order to
> remove interdependencies between packages?
>
>
> > 2. Develop a docker-based packaging solution
> > 2a. Reuse our RPM/DEB packages to create docker images
> 2a: Yes, please.
>
> > 2b. Develop a brand new packaging solution for docker and drop the puppet deploy
> >
> > I would favor 2a.
> >
> > Any other thoughts?

Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Olaf Flebbe <of...@oflebbe.de>.
Hi Evans,


> On 20.12.2017 at 13:01, Evans Ye <ev...@apache.org> wrote:
>
> I am 100% up for going container. We need to embrace the change, and right now
> containers seem to be the dominant solution.
>
> Breaking this down into a concrete plan, we have some choices:
>
> 1. Leverage the docker provisioner to provision a cluster on swarm using compose

AFAIK the docker provisioner is like having virtual machines in docker: it installs more than one service, or even all services, in one container.

Why not split the different services into separate containers in order to remove interdependencies between packages?


> 2. Develop a docker-based packaging solution
> 2a. Reuse our RPM/DEB packages to create docker images
2a: Yes, please.

> 2b. Develop a brand new packaging solution for docker and drop the puppet deploy
> 
> I would favor 2a.
> 
> Any other thoughts?







Re: Bigtop/2, Go Container !? Was: Upgrade Zookeeper

Posted by Evans Ye <ev...@apache.org>.
I am 100% up for going container. We need to embrace the change, and right now
containers seem to be the dominant solution.

Breaking this down into a concrete plan, we have some choices:

1. Leverage the docker provisioner to provision a cluster on swarm using compose
2. Develop a docker-based packaging solution
2a. Reuse our RPM/DEB packages to create docker images
2b. Develop a brand new packaging solution for docker and drop the puppet deploy

I would favor 2a.
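
A rough sketch of what option 2a could look like: generate one Dockerfile per service that simply installs the existing RPM from a package repository. The function name, the base image, and the repository URL are assumptions made for illustration, not an existing Bigtop tool or a real repo location; a DEB variant would swap the repo file and use apt-get instead of yum.

```python
def dockerfile_for(
    package: str,
    base_image: str = "centos:7",
    # Placeholder URL: point this at a real Bigtop yum repo file.
    repo_url: str = "https://example.invalid/bigtop.repo",
) -> str:
    """Return Dockerfile text that installs one existing RPM package."""
    lines = [
        f"FROM {base_image}",
        # ADD accepts a remote URL and downloads it into the image.
        f"ADD {repo_url} /etc/yum.repos.d/bigtop.repo",
        f"RUN yum install -y {package} && yum clean all",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # One image per service keeps each package's dependencies isolated.
    for pkg in ("zookeeper-server", "hadoop-hdfs-namenode"):
        print(f"# Dockerfile for {pkg}")
        print(dockerfile_for(pkg))
```

Building one image per package keeps the existing packaging work and tests, while each image still carries only the dependencies its own package pulls in.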

Any other thoughts?


