Posted to issues@bigtop.apache.org by "Evans Ye (JIRA)" <ji...@apache.org> on 2016/12/02 13:53:58 UTC

[jira] [Resolved] (BIGTOP-2397) Add image pre-build new feature

     [ https://issues.apache.org/jira/browse/BIGTOP-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evans Ye resolved BIGTOP-2397.
------------------------------
    Resolution: Won't Fix

As described in my last comment, let's move on to a new solution.

> Add image pre-build new feature
> -------------------------------
>
>                 Key: BIGTOP-2397
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-2397
>             Project: Bigtop
>          Issue Type: Sub-task
>          Components: docker, provisioner
>    Affects Versions: backlog
>            Reporter: Evans Ye
>            Assignee: Evans Ye
>
> Currently the provisioner takes roughly 3 to 5 minutes to get a cluster up and running. This is not efficient enough for CI or for developers who need a cluster quickly. This JIRA tries to add an image pre-build feature to the Docker provisioner so that users can build the image once and run it multiple times. With this we can expect a significant performance boost for provisioning.
> BIGTOP-2296 showcased a solution which pre-builds a hard-coded stack, i.e., hadoop + yarn. However, in this JIRA we'd like to make it more general: for example, a user can define a stack named "in-memory stack" or "foobar company big data stack", which preloads a set of components such as hadoop, yarn, and spark into an image for further provisioning.
> The following are the detailed steps to deploy a cluster with the pre-build feature involved:
> # user operation stage: update the configuration file and run the provisioner command (an illustrative config.yml sketch follows this step list)
> {code}
> vim config.yml 
> # specify the image name, for example *foobar_company:evans-hadoop-stack*, and the components to be provisioned
> ./docker-hadoop.sh --burn --create 3
> {code}
> # pre-build stage1: install general system packages such as java
> # pre-build stage2: install stack packages such as hadoop, hbase, spark, etc
> # provision stage1: upload configuration files
> # provision stage2: run puppet apply to simply deploy config files and start up services
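> For illustration, the config.yml edit in step 1 might look roughly like the sketch below. The key names here are only an assumption to show the idea (an image name to burn plus a component list), not the final schema:
> {code}
> # illustrative sketch only -- key names are hypothetical, not the actual config.yml schema
> docker:
>     image: "foobar_company:evans-hadoop-stack"   # image to burn once and reuse on later runs
>
> # components to preload into the image and provision on --create
> components: [hadoop, yarn, spark]
> {code}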
> When the user wants to re-provision a clean cluster, the pre-built image will be used:
> {code}
> ./docker-hadoop.sh --create 3 
> {code}
> There's actually no code change needed on the provisioner side, but with an image that already has the packages installed, provisioning should be very fast.
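> Conceptually, the burn step that produces such an image boils down to installing the stack packages inside a running container and committing the result as a reusable image. A minimal hand-rolled illustration of that idea (container and image names are examples only, not the actual --burn implementation):
> {code}
> # assume <container> is a provisioner container in which the stack packages
> # (hadoop, yarn, spark, ...) have already been installed by the pre-build stages
> docker commit <container> foobar_company:evans-hadoop-stack
>
> # later runs of ./docker-hadoop.sh --create 3 can start from this image,
> # skipping package installation entirely
> docker images | grep evans-hadoop-stack
> {code}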
> Note that this JIRA is not going to pre-build multiple docker images for different daemons/components. Instead, one stack maps to exactly one image. The reason is that our CM tool, bigtop puppet, currently only supports package-based deployment (RPM or DEB). Hence there's no way we can manage multiple hadoop daemons running in different containers as microservices. Besides that, because of package dependencies, a single-daemon image would still need to pull in lots of dependent packages, which is likely to burn out the system storage.
> The one-stack, one-image approach is probably not the best way, but it's the simplest way for bigtop to evolve and embrace docker more closely.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)