Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/03/24 05:18:00 UTC

[jira] [Commented] (ARROW-2350) Shrink size of spark_integration Docker container

    [ https://issues.apache.org/jira/browse/ARROW-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412426#comment-16412426 ] 

ASF GitHub Bot commented on ARROW-2350:
---------------------------------------

jameslamb opened a new pull request #1787: ARROW-2350: Consolidated RUN step in spark_integration Dockerfile
URL: https://github.com/apache/arrow/pull/1787
 
 
   I am a big fan of this project! In this PR, I propose a small change to the `spark_integration` Dockerfile. See [ARROW-2350](https://issues.apache.org/jira/browse/ARROW-2350) for a full description of what I'm proposing here.
   
   This change cuts the size of the `spark_integration` image from **2.65 GB** to **2.26 GB**, a reduction of about 0.39 GB (roughly 15%).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Shrink size of spark_integration Docker container
> -------------------------------------------------
>
>                 Key: ARROW-2350
>                 URL: https://issues.apache.org/jira/browse/ARROW-2350
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: James Lamb
>            Priority: Minor
>              Labels: docker, pull-request-available, spark
>   Original Estimate: 10m
>  Remaining Estimate: 10m
>
> I would like to propose a few changes to the spark_integration Dockerfile:
> [https://github.com/apache/arrow/tree/master/dev/spark_integration]
> The size of the resulting image can be reduced by making the following changes:
>  * consolidating all RUN commands into a single RUN layer (reducing the number of layers)
>  * running `apt-get clean` to clear out the package cache
>  * running `conda clean --all` to clear out cached package tarballs, abandoned package versions, and other build artifacts from all the libraries that are conda installed
> I will submit a PR on GitHub shortly. I am creating this issue first so that the PR can reference it.
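>
> For illustration, the changes listed above could look roughly like the sketch below. It is not the actual spark_integration Dockerfile: the base image (continuumio/miniconda3) and the package names (git, numpy) are assumptions chosen only to keep the example self-contained. The cleanup commands sit in the same RUN instruction as the installs because files deleted in a later layer still take up space in the earlier layer that created them.
>
> ```dockerfile
> # Hypothetical base image; the real Dockerfile may start from a different one.
> FROM continuumio/miniconda3
>
> # A single RUN instruction produces a single layer, so the package caches
> # are removed before that layer is committed and never add to the image size.
> # The package names (git, numpy) are placeholders for illustration only.
> RUN apt-get update && \
>     apt-get install -y git && \
>     conda install -y numpy && \
>     apt-get clean && \
>     conda clean --all --yes
> ```
>
> Splitting the same commands across several RUN instructions would give the same filesystem contents, but the caches would remain stored in the earlier layers.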



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)