Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/01 12:57:20 UTC

[GitHub] [arrow] kszucs opened a new pull request #7081: [CI] Cache docker volumes [WIP]

kszucs opened a new pull request #7081:
URL: https://github.com/apache/arrow/pull/7081


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs edited a comment on pull request #7081: [CI] Cache docker volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs edited a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622417613


   With a warmed-up cache, the build time dropped from 17m to 6m, which is promising.
   
   I'll need to do some gymnastics with the cache keys because the cache plugin lacks some features, but it'll work.





[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624544911


   The GitHub Actions cache works a bit counterintuitively: once you upload something under a cache key, you cannot modify it until GitHub evicts it. In effect, the keys are immutable.
   
   I set a restore pattern to [match previously created cache entries](https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key) if the specific hash is not found.
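
The lookup behaviour described above can be sketched as a toy model. This is only an illustration under stated assumptions, not the cache action's implementation: `CacheStore`, its methods, and the example keys are hypothetical names. It models two documented properties: an existing key is never overwritten, and when the exact key is missing, the most recently created entry whose key starts with a restore prefix is used.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class CacheStore:
    """Toy model of a key-addressed cache with immutable entries (hypothetical)."""

    # Maps key -> (creation counter, payload).
    entries: Dict[str, Tuple[int, str]] = field(default_factory=dict)
    clock: int = 0

    def save(self, key: str, payload: str) -> bool:
        """Store payload under key; refuse to overwrite an existing key."""
        if key in self.entries:
            return False  # immutable: the old entry stays until it is evicted
        self.clock += 1
        self.entries[key] = (self.clock, payload)
        return True

    def restore(self, key: str, restore_keys: List[str]) -> Optional[str]:
        """Return the exact match, else the newest entry matching a prefix."""
        if key in self.entries:
            return self.entries[key][1]
        for prefix in restore_keys:
            matches = [v for k, v in self.entries.items() if k.startswith(prefix)]
            if matches:
                # Pick the entry with the highest creation counter.
                return max(matches, key=lambda entry: entry[0])[1]
        return None


store = CacheStore()
store.save("ubuntu-cpp-aaa", "ccache after commit 1")
store.save("ubuntu-cpp-bbb", "ccache after commit 2")
# A new source hash misses the exact key but falls back to the newest prefix match.
fallback = store.restore("ubuntu-cpp-ccc", ["ubuntu-cpp-"])
```

With this fallback, a changed source hash still starts from the latest saved ccache contents instead of from an empty cache, and the run then saves its result under the new key.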





[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420672485



##########
File path: .github/workflows/python_cron.yml
##########
@@ -110,6 +119,12 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}

Review comment:
       The build time is dominated by the C++ implementation, so I thought it better to get cache hits whenever the underlying C++ implementation has not changed.
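
A minimal sketch of why keying on a hash of `cpp/**` gives hits only while the C++ sources are unchanged. It is not GitHub's `hashFiles` algorithm, just a content digest with the same property; `hash_files` and `cache_key` are hypothetical helpers. Note that Python's `Path.glob` needs `cpp/**/*` to match files recursively.

```python
import hashlib
from pathlib import Path
from typing import List


def hash_files(root: str, patterns: List[str]) -> str:
    """Digest over the contents of every file matching the glob patterns."""
    root_path = Path(root)
    # De-duplicate and sort so the digest does not depend on traversal order.
    files = sorted(
        {p for pattern in patterns for p in root_path.glob(pattern) if p.is_file()}
    )
    outer = hashlib.sha256()
    for path in files:
        outer.update(hashlib.sha256(path.read_bytes()).digest())
    return outer.hexdigest()


def cache_key(prefix: str, root: str, patterns: List[str]) -> str:
    """Build a key shaped like '<prefix>-<content hash>' (hypothetical helper)."""
    return f"{prefix}-{hash_files(root, patterns)}"
```

Calling `cache_key("ubuntu", ".", ["cpp/**/*"])` yields the same key until any file under `cpp/` changes; adding `"python/**/*"` to the pattern list would make Python-side changes produce a new key as well.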







[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624554888


   According to multiple comments in the cache action repository, eviction is done once per day rather than immediately after the 5 GB repository-level limit is reached. This will probably change more than once.
   
   Because the caching efficiency is pretty dependent on the cache eviction policy, I think we need to keep our eyes on the build times and ccache hit rates and tune the cache keys and file hashing accordingly. 
   





[GitHub] [arrow] kou commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kou commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420501687



##########
File path: .github/workflows/python.yml
##########
@@ -84,6 +93,14 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+          restore-keys: ${{ matrix.cache }}-
+      - name: Debug
+        run: mkdir -p .docker && du -d1 -h .docker

Review comment:
       Should we remove this?

##########
File path: .github/workflows/python_cron.yml
##########
@@ -110,6 +119,12 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}

Review comment:
       `hashFiles('cpp/**', 'python/**')`?







[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624570358


   Many builds have finished in around 5 minutes which is a fine improvement for now.





[GitHub] [arrow] github-actions[bot] commented on pull request #7081: [CI] Cache docker volumes [WIP]

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622378960


   Thanks for opening a pull request!
   
   Could you open an issue for this pull request on JIRA?
   https://issues.apache.org/jira/browse/ARROW
   
   Then could you also rename pull request title in the following format?
   
       ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
   
   See also:
   
     * [Other pull requests](https://github.com/apache/arrow/pulls/)
     * [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
   





[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420695312



##########
File path: docker-compose.yml
##########
@@ -55,29 +55,12 @@
 
 version: '3.5'
 
-volumes:

Review comment:
       Created a follow-up to remove the `env` boilerplate from the GitHub workflow files: https://issues.apache.org/jira/browse/ARROW-8713








[GitHub] [arrow] github-actions[bot] removed a comment on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] removed a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622378960







[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420670036



##########
File path: .github/workflows/python.yml
##########
@@ -84,6 +93,14 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+          restore-keys: ${{ matrix.cache }}-
+      - name: Debug
+        run: mkdir -p .docker && du -d1 -h .docker

Review comment:
       Yes, removed.







[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624381195


   We need to trigger another CI run to see the benefits of the caching. The previous runs showed a decent speedup, but we can certainly increase the cache hit rate later with somewhat better cache key selection.





[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624363183


   I'm planning to merge it once it turns green, because it contains a critical fix for the Python workflow, which was incorrectly indented in the previous patch.





[GitHub] [arrow] pitrou commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624493995


   Why are you hashing the files? This means if one C++ source file changes, the ccache won't be reused?









[GitHub] [arrow] kszucs removed a comment on pull request #7081: [CI] Cache docker volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs removed a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381


   Checking that the cache properly works on my fork's master branch.





[GitHub] [arrow] pitrou commented on pull request #7081: [CI] Cache docker volumes [WIP]

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-623367064


   Did they increase the available cache size? Last I looked it was a fixed size for the entire repo.





[GitHub] [arrow] kszucs commented on pull request #7081: [CI] Cache docker volumes [WIP]

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381


   Checking that the cache properly works on my fork's master branch.





[GitHub] [arrow] github-actions[bot] commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624268714


   https://issues.apache.org/jira/browse/ARROW-8708





[GitHub] [arrow] pitrou commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624546678


   Yuck, I see.

