You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org> on 2019/01/28 18:21:52 UTC

[native-toolchain-CR] Initial support for building the toolchain in docker

hector.acosta@cloudera.com has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12285


Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.sh impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

./buildall-docker.sh

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A buildall-docker.sh
A docker/all/assert.sh
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/centos/yum-install
A docker/centos5.df
A docker/centos5/epel/RPM-GPG-KEY-EPEL-5
A docker/centos5/epel/epel-release-5-4.noarch.rpm
A docker/centos5/yum.repos.d/CentOS-Base.repo
A docker/centos5/yum.repos.d/CentOS-fasttrack.repo
A docker/centos5/yum.repos.d/libselinux.repo
A docker/centos6.df
A docker/centos7.df
A docker/debian7.df
A docker/debian8.df
A docker/ubuntu12.df
A docker/ubuntu14.df
A docker/ubuntu16.df
M functions.sh
A in-docker.sh
20 files changed, 625 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 1
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat/CentOS-6.6-Base.repo
File docker/redhat/CentOS-6.6-Base.repo:

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat/CentOS-6.6-Base.repo@34
PS6, Line 34: enabled=1
The Centos Plus repo is enabled for 6.6, but it is disabled for 7.2 (see the 7.2. repo descriptor file).

I admit I don't know which one is the correct approach, I just wonder if the discrepancy is intentional, or maybe just an oversight.


http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df@26
PS6, Line 26: java-1.8.0-openjdk-devel
I also wonder about the JDK version selection logic: RedHat variants get 1.8, Debian and Ubuntu receive 1.7, except Ubuntu 16.04, which get 1.8.

IIRC Kudu is the only toolchain component that requires Java.

If this patch is intended to support only the master branch of Impala, then this is probably OK. Impala master uses Java 8, and code built with Java 7 should be able to run. Additionally, Impala 3.x usually consumes Kudu from the Cloudera-built CDH tarballs (see the environment variable USE_CDH_KUDU in Impala/bin/impala-config.sh), so the Kudu bits built and published with the toolchain may be less interesting.
For this case I am assuming that binary toolchain bits for Impala 2.x would continue to be published by the existing build mechanisms.

However, if this patch, or patch set targets the Impala 2.x branch as well, then we would need Java 7 everywhere for two reasons:
1. Impala 2.x is built with Java 7
2. Impala 2.x consumes Kudu exclusively from the toolchain bits, so the JDK used for building Kudu will impact Impala 2.x directly.

Happy to discuss off-Gerritt if that's more comfortable for you.



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 6
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 20 Feb 2019 21:44:41 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#3).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/centos/yum-install
A docker/centos6.df
A docker/centos7.df
A docker/debian7.df
A docker/debian8.df
A docker/ubuntu12.df
A docker/ubuntu14.df
A docker/ubuntu16.df
M functions.sh
A in-docker.py
14 files changed, 617 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/3
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 3
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Philip Zeyliger (Code Review)" <ge...@cloudera.org>.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 1:

(16 comments)

Thanks!

This is a huge improvement on the Jenkins-only version of this. in-docker.sh is borderline worthy of re-writing in Python, though it's a bit hard to tell whether or not it'll get shorter. I'll leave that up to you.

I think we should try to separate the docker image building versus the building a little bit more than you have. Historically, we've pinned our AMI's, so the build is fairly reproducible. As you've got it here, a change to RedHat's libc will tease its way into our stuff, and maybe we want that to be explicit. I don't think that's a big deal: it's a matter of defining a default place to pull the images from...

http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh
File buildall-docker.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh@2
PS1, Line 2: # Copyright 2017 Cloudera Inc.
2019.


http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh@23
PS1, Line 23: for image in $(cd docker; ./buildall.sh); do
Perhaps add a TODO for parallelism?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh
File docker/all/assert.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@2
PS1, Line 2: # Copyright 2015 Cloudera Inc.
2019


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@16
PS1, Line 16: # Assert that a system has the correct libraries/binaries to build the
Perhaps rename assert to something more friendly? Like "assert-dependencies-present.sh"?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@19
PS1, Line 19: set -u
            : set -e
            : set -o pipefail
It's temping to write this in python from the get-go. The parsing of ldconfig -p in particular is a bit tedious in shell.


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@35
PS1, Line 35: rm -f /tmp/lib
You seem like you use a temp dir on line 23; what's this about?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh
File docker/all/postinstall.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh@2
PS1, Line 2: # Copyright 2015 Cloudera Inc.
2019


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh@18
PS1, Line 18: getent hosts github.com| xargs -n1 ssh-keyscan -t rsa,dsa >> /etc/ssh/ssh_known_hosts
Comment what this does?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install
File docker/centos/yum-install:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install@23
PS1, Line 23: yum install $@ 2>&1 | tee $t
Does this preserve yum failures, given that you don't have pipefail set?

Should this be "$@" rather than $@?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install@25
PS1, Line 25: 	>&2 echo "yum-install: One or more packages not available but specified as arguments"
avoid tabs?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5.df
File docker/centos5.df:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5.df@47
PS1, Line 47: # XXX: Thrift 0.9.3 requires python2.6, and uses system python.
??? Can we manipulate this via PATH?


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5/epel/epel-release-5-4.noarch.rpm
File docker/centos5/epel/epel-release-5-4.noarch.rpm:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5/epel/epel-release-5-4.noarch.rpm@a1
PS1, Line 1: 
I always worry about binary files in git repos... Can we do this in a more text-y way? (perhaps by just editing yum.repos.d files?


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh
File in-docker.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@16
PS1, Line 16: # Script that sets up environment used in order to perform full (or partial)
Can you document how the file system mounting is set up here? i.e., where the output is.


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@49
PS1, Line 49: TARGET_DIR=/mnt
Is this useful to be configured?


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@66
PS1, Line 66:   while [[ $1 ]]; do
Can you do this with "-z" and avoid the "set +u" stuff?


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@87
PS1, Line 87: # GCC_VERSION works here
Comment that this looks for subdirectories of source (like source/foo) and produces variables FOO_VERSION for them.

You don't really need the basename here. You can do something like:

find . -mindepth 1 -maxdepth 1 -type d -printf "%f\n"

The %f gives you the last component of the name.



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 1
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Mon, 28 Jan 2019 22:27:31 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 5:

> Patch Set 5:
> 
> (3 comments)

Nevermind; I saw the corersponding change in the "prepare dockerized builds for publishing" review; left comments there.


-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 5
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Feb 2019 16:59:54 +0000
Gerrit-HasComments: No

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#4).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/centos/yum-install
A docker/centos6.df
A docker/centos7.df
A docker/debian7.df
A docker/debian8.df
A docker/ubuntu12.df
A docker/ubuntu14.df
A docker/ubuntu16.df
M functions.sh
A in-docker.py
14 files changed, 634 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/4
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 4
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#6).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/debian7.df
A docker/debian8.df
A docker/redhat/CentOS-6.6-Base.repo
A docker/redhat/CentOS-7.2.1511-Base.repo
A docker/redhat/yum-install
A docker/redhat6.df
A docker/redhat7.df
A docker/ubuntu1204.df
A docker/ubuntu1404.df
A docker/ubuntu1604.df
M functions.sh
A in-docker.py
16 files changed, 728 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/6
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 6
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 7: Code-Review+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 7
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Feb 2019 18:33:02 +0000
Gerrit-HasComments: No

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
hector.acosta@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 2:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile
File Makefile:

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile@2
PS2, Line 2: STAMP_DIR=$(BUILD_DIR)/stamp
> copyright, etc.
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile@15
PS2, Line 15: 	./in-docker.py $(IN_DOCKER_ARGS) $(@F) -- ./buildall.sh |sed -s 's/^/$(@F): /'
> Does that sed lose error codes because of pipefail-equivalent issues? (My q
I set the shell to /bin/bash -o pipefail and I did get this to fail after I messed up a rebase.

% cat Makefile;make
SHELL=/bin/bash -o pipefail

all:
        exit 99 | true
exit 99 | true
make: *** [Makefile:4: all] Error 99

I found it hard to figure out what failed without that sed, but I'm fine removing it too.


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py
File docker/all/assert-dependencies-present.py:

http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@17
PS2, Line 17: # native toolchain. Must be python2.6 compatible.
> Worth a note about why it must be 2.6 compatible.
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@25
PS2, Line 25: def run(cmd):
> I think you can call this check_output() since it's essentailly a re-implem
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@34
PS2, Line 34:   for s in l:
            :     if re.match(regex, s):
            :       return True
            :   return False
> I don't know if it's better, but this is basically any and map.
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@41
PS2, Line 41:   patterns = ['libdb.*.so',
            :               'libkrb.*.so',
            :               'libncurses.so',
            :               'libsasl.*.so',
            :               'libcrypto.so',
            :               'libz.so']
> There are dots here that are wildcards and dots here that are real dots. Sh
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@47
PS2, Line 47:   out = run(['ldconfig', '-p'])[0]
            :   libraries = []
            :   for line in out.splitlines():
            :     libraries.append(line.split()[0])
> I'm ambivalent, but:
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@77
PS2, Line 77:     print 'Checking program: %s' % p
> use logging?
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@79
PS2, Line 79:       sys.exit('Unable to find %s in PATH' % p)
> Use exceptions (and line 54)?
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@85
PS2, Line 85:     import distutils.core  # noqa: F401
> Maybe just move this to top of the file and python will fail and it'll be c
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py
File in-docker.py:

http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@42
PS2, Line 42: in-docker.py --docker-args "-t" impala-toolchain-centos6 -- bash
> Don't you need the "-i"?
We include that by default.


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@70
PS2, Line 70:               '-v', '/etc/passwd:/etc/passwd:ro',
            :               '-v', '/etc/group:/etc/group:ro',
> We've maybe seen cases where this doesn't work, but I'm fine with it if it 
This seems to work reliably for now, I'll include additional fixes if needed.


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@77
PS2, Line 77:   parser.add_argument('--docker-args', default='')
> Add help?
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@125
PS2, Line 125:   xargs = subprocess.Popen(['xargs', '-i', 'cp', '--parents', '{}', distro_build_dir], stdin=git.stdout)
> I had to look up -i and it looks like maybe it's deprecated?
Done


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@147
PS2, Line 147:     sys.stderr.write('Running: %s\n' % ' '.join(cmd))
> use logging?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 2
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Jan 2019 22:38:05 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 4:

(3 comments)

I am still concerned about version pinning for the Docker images. The base images look all right, but the subsequent install steps will still use the "latest" repos for the CentOS images: the repo descriptors in /etc/yum.repos.d use only the OS major version in their default state, so all the packages will be installed from a much later minor version than the base images themselves. I wonder if we are setting ourselves up for later problems this way.

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile
File Makefile:

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile@11
PS2, Line 11:  WITHOUT
nit: you probably want to add this in the follow-up patch (the one adding Ubuntu18 support).


http://gerrit.cloudera.org:8080/#/c/12285/4/Makefile
File Makefile:

http://gerrit.cloudera.org:8080/#/c/12285/4/Makefile@25
PS4, Line 25: ubuntu18
Maybe move this to the Ubuntu18 support commit?


http://gerrit.cloudera.org:8080/#/c/12285/4/in-docker.py
File in-docker.py:

http://gerrit.cloudera.org:8080/#/c/12285/4/in-docker.py@64
PS4, Line 64: impal
SImilarly, maybe this should also be moved to the other commit that enables Ubuntu18.



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 4
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Feb 2019 15:53:20 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#7).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/debian7.df
A docker/debian8.df
A docker/redhat/CentOS-6.6-Base.repo
A docker/redhat/CentOS-7.2.1511-Base.repo
A docker/redhat/yum-install
A docker/redhat6.df
A docker/redhat7.df
A docker/ubuntu1204.df
A docker/ubuntu1404.df
A docker/ubuntu1604.df
M functions.sh
A in-docker.py
16 files changed, 721 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/7
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 7
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Reviewed-on: http://gerrit.cloudera.org:8080/12285
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Reviewed-by: Laszlo Gaal <la...@cloudera.com>
Tested-by: Tim Armstrong <ta...@cloudera.com>
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/debian7.df
A docker/debian8.df
A docker/redhat/CentOS-6.6-Base.repo
A docker/redhat/CentOS-7.2.1511-Base.repo
A docker/redhat/yum-install
A docker/redhat6.df
A docker/redhat7.df
A docker/ubuntu1204.df
A docker/ubuntu1404.df
A docker/ubuntu1604.df
M functions.sh
A in-docker.py
16 files changed, 721 insertions(+), 0 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, but someone else must approve; Verified
  Laszlo Gaal: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 8
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
hector.acosta@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df@4
PS5, Line 4: # We need to pin our yum updates to keep old library versions.
> Repo pinning seems to be missing from here. Is that intentional?
Done


http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df@6
PS5, Line 6: 's 
> nit: seems to be redundant with "-y" also present in the docker/redhat/yum-
Done


http://gerrit.cloudera.org:8080/#/c/12285/5/in-docker.py
File in-docker.py:

http://gerrit.cloudera.org:8080/#/c/12285/5/in-docker.py@112
PS5, Line 112: passthrough_env
> Maybe add the env vars needed for publishing to the S3 bucket:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 6
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Tue, 19 Feb 2019 18:27:09 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
hector.acosta@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 1:

(16 comments)

> Thanks for tackling this Hector!
 > 
 > Pinning the various dependencies would definitely be useful; we
 > have been bitten several times by various packages suddenly getting
 > updated under a seemingly unchanged setup script.
 > To ensure compatibility with the widest possible range of build
 > platforms it would seem useful to pin the OS dependencies at
 > suitable less recent release: probably not all Impala users run the
 > latest and greatest release of their respective platforms at all
 > times.
 > 
 > It is not just the OS platform (the FROM clause of the Dockerfiles)
 > that may warrant pinning, the same approach seems useful for the
 > installed components as well, e.g. by pinning repo descriptors to
 > the repos corresponding to the platform version. This could ensure
 > that an older platform would not suddenly be updated with packages
 > from a much more recent release (unless explicitly
 > allowed/requested by a change that can be identified later from the
 > git history).

Pinned OS versions.

http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh
File buildall-docker.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh@2
PS1, Line 2: # Copyright 2017 Cloudera Inc.
> 2019.
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/buildall-docker.sh@23
PS1, Line 23: for image in $(cd docker; ./buildall.sh); do
> Perhaps add a TODO for parallelism?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh
File docker/all/assert.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@2
PS1, Line 2: # Copyright 2015 Cloudera Inc.
> 2019
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@16
PS1, Line 16: # Assert that a system has the correct libraries/binaries to build the
> Perhaps rename assert to something more friendly? Like "assert-dependencies
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@19
PS1, Line 19: set -u
            : set -e
            : set -o pipefail
> It's temping to write this in python from the get-go. The parsing of ldconf
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/assert.sh@35
PS1, Line 35: rm -f /tmp/lib
> You seem like you use a temp dir on line 23; what's this about?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh
File docker/all/postinstall.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh@2
PS1, Line 2: # Copyright 2015 Cloudera Inc.
> 2019
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/all/postinstall.sh@18
PS1, Line 18: getent hosts github.com| xargs -n1 ssh-keyscan -t rsa,dsa >> /etc/ssh/ssh_known_hosts
> Comment what this does?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install
File docker/centos/yum-install:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install@23
PS1, Line 23: yum install $@ 2>&1 | tee $t
> Does this preserve yum failures, given that you don't have pipefail set?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos/yum-install@25
PS1, Line 25: 	>&2 echo "yum-install: One or more packages not available but specified as arguments"
> avoid tabs?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5.df
File docker/centos5.df:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5.df@47
PS1, Line 47: # XXX: Thrift 0.9.3 requires python2.6, and uses system python.
> ??? Can we manipulate this via PATH?
The answer is most likely yes with some changes to the build script. This is centos5, so I'll drop this based on your other review.


http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5/epel/epel-release-5-4.noarch.rpm
File docker/centos5/epel/epel-release-5-4.noarch.rpm:

http://gerrit.cloudera.org:8080/#/c/12285/1/docker/centos5/epel/epel-release-5-4.noarch.rpm@a1
PS1, Line 1: 
> I always worry about binary files in git repos... Can we do this in a more 
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh
File in-docker.sh:

http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@16
PS1, Line 16: # Script that sets up environment used in order to perform full (or partial)
> Can you document how the file system mounting is set up here? i.e., where t
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@49
PS1, Line 49: TARGET_DIR=/mnt
> Is this useful to be configured?
I don't see a use case for it. I think we mostly care about the SOURCE dir (which can be configured via BUILD_DIR) but I personally don't care where it ends up in the container.


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@66
PS1, Line 66:   while [[ $1 ]]; do
> Can you do this with "-z" and avoid the "set +u" stuff?
Done


http://gerrit.cloudera.org:8080/#/c/12285/1/in-docker.sh@87
PS1, Line 87: # GCC_VERSION works here
> Comment that this looks for subdirectories of source (like source/foo) and 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 1
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Jan 2019 05:49:47 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 7: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 7
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Fri, 22 Feb 2019 18:36:18 +0000
Gerrit-HasComments: No

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df@4
PS5, Line 4: 
Repo pinning seems to be missing from here. Is that intentional?


http://gerrit.cloudera.org:8080/#/c/12285/5/docker/redhat6.df@6
PS5, Line 6:  -y
nit: seems to be redundant with "-y" also present in the docker/redhat/yum-install script file.


http://gerrit.cloudera.org:8080/#/c/12285/5/in-docker.py
File in-docker.py:

http://gerrit.cloudera.org:8080/#/c/12285/5/in-docker.py@112
PS5, Line 112: passthrough_env
Maybe add the env vars needed for publishing to the S3 bucket:
- S3_BUCKET (the bucket name -- this is required, there is no default value in the scripts)
- AWS_SECRET_ACCESS_KEY (the credential-bearing variables may be optional, e.g. our existing automation uses IAM roles)
- AWS_ACCESS_KEY_ID
- AWS_SESSION_TOKEN



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 5
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 13 Feb 2019 16:42:42 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 7: Code-Review+2

Tim is of course right about the Java versions. I guess I stared at config diffs a bit too much...


-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 7
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Feb 2019 20:40:53 +0000
Gerrit-HasComments: No

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#2).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/centos/yum-install
A docker/centos6.df
A docker/centos7.df
A docker/debian7.df
A docker/debian8.df
A docker/ubuntu12.df
A docker/ubuntu14.df
A docker/ubuntu16.df
M functions.sh
A in-docker.py
14 files changed, 617 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 2
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
Hello Lars Volker, Laszlo Gaal, Philip Zeyliger, Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12285

to look at the new patch set (#5).

Change subject: Initial support for building the toolchain in docker
......................................................................

Initial support for building the toolchain in docker

Adds Dockerfiles and supporting scripts to build the toolchain in a
docker container. It should now be possible to run any build*.sh script
in a supported platform by running:

./in-docker.py impala-toolchain-debian8 "./build.sh gflags 2.2.0-p2"
 - To build gflags version 2.2.0-p2 in debian8

It is also possible to build the entire toolchain for all platforms by
running:

make -j $P

Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
---
A Makefile
A docker/all/assert-dependencies-present.py
A docker/all/postinstall.sh
A docker/buildall.sh
A docker/debian7.df
A docker/debian8.df
A docker/redhat/CentOS-7.2.1511-Base.repo
A docker/redhat/yum-install
A docker/redhat6.df
A docker/redhat7.df
A docker/ubuntu1204.df
A docker/ubuntu1404.df
A docker/ubuntu1604.df
M functions.sh
A in-docker.py
15 files changed, 684 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain refs/changes/85/12285/5
-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 5
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df@26
PS6, Line 26: java-1.8.0-openjdk-devel
> I also wonder about the JDK version selection logic: RedHat variants get 1.
We don't consume the Kudu jars from the toolchain, just the client .so. The jars all come via Maven, so JDK version isn't a concern so long as it doesn't break compilation.



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 6
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Feb 2019 06:17:27 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Laszlo Gaal (Code Review)" <ge...@cloudera.org>.
Laszlo Gaal has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 1:

Thanks for tackling this Hector!

Pinning the various dependencies would definitely be useful; we have been bitten several times by various packages suddenly getting updated under a seemingly unchanged setup script.
To ensure compatibility with the widest possible range of build platforms it would seem useful to pin the OS dependencies at suitable less recent release: probably not all Impala users run the latest and greatest release of their respective platforms at all times.

It is not just the OS platform (the FROM clause of the Dockerfiles) that may warrant pinning, the same approach seems useful for the installed components as well, e.g. by pinning repo descriptors to the repos corresponding to the platform version. This could ensure that an older platform would not suddenly be updated with packages from a much more recent release (unless explicitly allowed/requested by a change that can be identified later from the git history).


-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 1
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 29 Jan 2019 15:20:44 +0000
Gerrit-HasComments: No

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "hector.acosta@cloudera.com (Code Review)" <ge...@cloudera.org>.
hector.acosta@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat/CentOS-6.6-Base.repo
File docker/redhat/CentOS-6.6-Base.repo:

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat/CentOS-6.6-Base.repo@34
PS6, Line 34: 
> The Centos Plus repo is enabled for 6.6, but it is disabled for 7.2 (see th
Done


http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df
File docker/redhat6.df:

http://gerrit.cloudera.org:8080/#/c/12285/6/docker/redhat6.df@26
PS6, Line 26: java-1.8.0-openjdk-devel
> We don't consume the Kudu jars from the toolchain, just the client .so. The
Based on Tim's comment I'll leave this as-is. If there's a need to standardize our java version I'll do that as follow up.



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 7
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Thu, 21 Feb 2019 16:49:02 +0000
Gerrit-HasComments: Yes

[native-toolchain-CR] Initial support for building the toolchain in docker

Posted by "Philip Zeyliger (Code Review)" <ge...@cloudera.org>.
Philip Zeyliger has posted comments on this change. ( http://gerrit.cloudera.org:8080/12285 )

Change subject: Initial support for building the toolchain in docker
......................................................................


Patch Set 2:

(15 comments)

Thanks! When you say we're pinning the OS, does this actually pin the packages? Does it look like the libc (or whatever) we're getting from pinning to 6.6 actually move? Unfortunately, I bet it does, so we have to be careful about how to use it. That's the underlying reason we need to be able to build the toolchain against explicit images.

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile
File Makefile:

http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile@2
PS2, Line 2: STAMP_DIR=$(BUILD_DIR)/stamp
copyright, etc.


http://gerrit.cloudera.org:8080/#/c/12285/2/Makefile@15
PS2, Line 15: 	./in-docker.py $(IN_DOCKER_ARGS) $(@F) -- ./buildall.sh |sed -s 's/^/$(@F): /'
Does that sed lose error codes because of pipefail-equivalent issues? (My quick testing suggests that it does.)


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py
File docker/all/assert-dependencies-present.py:

http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@17
PS2, Line 17: # native toolchain. Must be python2.6 compatible.
Worth a note about why it must be 2.6 compatible.


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@25
PS2, Line 25: def run(cmd):
I think you can call this check_output() since it's essentailly a re-implementation of subprocess.check_output() if we're on 2.6, which doesn't have it, if I remember correctly...


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@34
PS2, Line 34:   for s in l:
            :     if re.match(regex, s):
            :       return True
            :   return False
I don't know if it's better, but this is basically any and map.

>>> import re
>>> r = re.compile('a')
>>> any(map(r.match, ["a", "b"]))
True
>>> any(map(r.match, ["c", "b"]))
False


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@41
PS2, Line 41:   patterns = ['libdb.*.so',
            :               'libkrb.*.so',
            :               'libncurses.so',
            :               'libsasl.*.so',
            :               'libcrypto.so',
            :               'libz.so']
There are dots here that are wildcards and dots here that are real dots. Should we be correct about it?


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@47
PS2, Line 47:   out = run(['ldconfig', '-p'])[0]
            :   libraries = []
            :   for line in out.splitlines():
            :     libraries.append(line.split()[0])
I'm ambivalent, but:

libraries = [ line.split()[0] for line in check_output(["ldconfig", "-p"]).splitlines() ]


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@77
PS2, Line 77:     print 'Checking program: %s' % p
use logging?


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@79
PS2, Line 79:       sys.exit('Unable to find %s in PATH' % p)
Use exceptions (and line 54)?


http://gerrit.cloudera.org:8080/#/c/12285/2/docker/all/assert-dependencies-present.py@85
PS2, Line 85:     import distutils.core  # noqa: F401
Maybe just move this to top of the file and python will fail and it'll be clear enough?


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py
File in-docker.py:

http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@42
PS2, Line 42: in-docker.py --docker-args "-t" impala-toolchain-centos6 -- bash
Don't you need the "-i"?


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@70
PS2, Line 70:               '-v', '/etc/passwd:/etc/passwd:ro',
            :               '-v', '/etc/group:/etc/group:ro',
We've maybe seen cases where this doesn't work, but I'm fine with it if it works.


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@77
PS2, Line 77:   parser.add_argument('--docker-args', default='')
Add help?

What's the quoting convention here?


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@125
PS2, Line 125:   xargs = subprocess.Popen(['xargs', '-i', 'cp', '--parents', '{}', distro_build_dir], stdin=git.stdout)
I had to look up -i and it looks like maybe it's deprecated?

              This option is a synonym for -Ireplace-str if replace-str is specified.  If the replace-str argument is missing, the effect is the same as -I{}.  This option is deprecated; use -I instead.


http://gerrit.cloudera.org:8080/#/c/12285/2/in-docker.py@147
PS2, Line 147:     sys.stderr.write('Running: %s\n' % ' '.join(cmd))
use logging?



-- 
To view, visit http://gerrit.cloudera.org:8080/12285
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If42c9bc06a3d303642eb37dea784b61e2a1f5cc6
Gerrit-Change-Number: 12285
Gerrit-PatchSet: 2
Gerrit-Owner: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv...@cloudera.com>
Gerrit-Reviewer: Laszlo Gaal <la...@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <ph...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: hector.acosta@cloudera.com <he...@cloudera.com>
Gerrit-Comment-Date: Wed, 30 Jan 2019 19:46:55 +0000
Gerrit-HasComments: Yes