
Posted to issues@arrow.apache.org by "Ben Kietzman (Jira)" <ji...@apache.org> on 2020/04/14 19:22:00 UTC

[jira] [Resolved] (ARROW-8432) [Python][CI] Failure to download Hadoop

     [ https://issues.apache.org/jira/browse/ARROW-8432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ben Kietzman resolved ARROW-8432.
---------------------------------
    Resolution: Fixed

Issue resolved by pull request 6922
[https://github.com/apache/arrow/pull/6922]

> [Python][CI] Failure to download Hadoop
> ---------------------------------------
>
>                 Key: ARROW-8432
>                 URL: https://issues.apache.org/jira/browse/ARROW-8432
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Continuous Integration, Python
>    Affects Versions: 0.16.0
>            Reporter: Ben Kietzman
>            Assignee: Ben Kietzman
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.17.0
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> https://circleci.com/gh/ursa-labs/crossbow/11128?utm_campaign=vcs-integration-link&utm_medium=referral&utm_source=github-build-link
> The failure is caused by a failed HTTP request at https://github.com/apache/arrow/blob/master/ci/docker/conda-python-hdfs.dockerfile#L36
> We should probably not rely on https://www.apache.org/dyn/mirrors/mirrors.cgi to get tarballs. Tarballs are currently downloaded in these places:
> {code}
> ci/docker/conda-python-hdfs.dockerfile
> 36:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" | tar -xzf - -C /opt
> ci/docker/linux-apt-docs.dockerfile
> 57:RUN wget -q -O - "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=maven/maven-3/${maven}/binaries/apache-maven-${maven}-bin.tar.gz" | tar -xzf - -C /opt
> python/manylinux1/scripts/build_thrift.sh
> 22:  "https://www.apache.org/dyn/mirrors/mirrors.cgi?action=download&filename=${THRIFT_DOWNLOAD_PATH}" \
> python/manylinux201x/scripts/build_thrift.sh
> 20:wget https://archive.apache.org/dist/thrift/${THRIFT_VERSION}/thrift-${THRIFT_VERSION}.tar.gz
> {code}
> Factor these out into a reusable script for downloading Apache tarballs. It should contain hard-coded Apache mirrors and retry when connections fail.
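
A minimal sketch of what such a helper could look like, assuming POSIX sh and wget. This is not the script from the pull request: the function names (fetch_url, download_apache_tarball), the mirror list, and the retry settings are all illustrative assumptions. The idea is to try each hard-coded mirror in order and retry a few times before moving on to the next one.

```shell
#!/bin/sh
# Sketch only: names, mirror list, and retry settings are assumptions
# made for illustration, not the contents of the actual fix.

# Hard-coded base URLs, tried in order (assumed list).
APACHE_MIRRORS="https://downloads.apache.org https://archive.apache.org/dist"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # attempts per mirror
RETRY_DELAY="${RETRY_DELAY:-5}"     # seconds to wait between attempts

# Fetch one URL into a file. Kept separate so the transport can be
# swapped (for example, for curl) without touching the retry logic.
fetch_url() {
  wget -q -O "$2" "$1"
}

# Usage: download_apache_tarball <path-under-mirror-root> <output-file>
# Returns 0 on the first successful fetch, 1 if every mirror fails.
download_apache_tarball() {
  tarball_path="$1"
  output="$2"
  for mirror in $APACHE_MIRRORS; do
    attempt=1
    while [ "$attempt" -le "$MAX_ATTEMPTS" ]; do
      if fetch_url "$mirror/$tarball_path" "$output"; then
        return 0
      fi
      echo "attempt $attempt failed: $mirror/$tarball_path" >&2
      # Wait only if another attempt on this mirror will follow.
      if [ "$attempt" -lt "$MAX_ATTEMPTS" ]; then
        sleep "$RETRY_DELAY"
      fi
      attempt=$((attempt + 1))
    done
  done
  echo "could not download $tarball_path from any mirror" >&2
  return 1
}
```

A dockerfile step could then source this file, call download_apache_tarball "hadoop/common/hadoop-${hdfs}/hadoop-${hdfs}.tar.gz" /tmp/hadoop.tar.gz, and extract the result with tar -xzf /tmp/hadoop.tar.gz -C /opt. Downloading to a file instead of piping straight into tar means a failed attempt cannot leave a half-extracted tree behind, at the cost of some temporary disk space.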



--
This message was sent by Atlassian Jira
(v8.3.4#803005)