Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/05/01 12:57:20 UTC
[GitHub] [arrow] kszucs opened a new pull request #7081: [CI] Cache docker volumes [WIP]
kszucs opened a new pull request #7081:
URL: https://github.com/apache/arrow/pull/7081
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] kszucs edited a comment on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs edited a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622417613
With a warmed-up cache, the build time dropped from 17m to 6m, which is promising.
I'll need to do some gymnastics with the cache keys because the cache plugin lacks some features, but it'll work.
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624544911
The GitHub Actions cache works a bit counterintuitively: once you upload something to a cache key, you cannot modify it until it gets evicted by GitHub. In other words, the keys are immutable.
I set a restore pattern to [match previously created cache entries](https://help.github.com/en/actions/configuring-and-managing-workflows/caching-dependencies-to-speed-up-workflows#matching-a-cache-key) if the specific hash is not found.
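
The lookup behavior described above can be sketched in a few lines of Python. This is only a toy model, not the real cache service: `ToyCache`, its methods, and the example keys are hypothetical names made up for illustration. It assumes that an existing key is never overwritten and that a restore key acts as a prefix whose most recently created match is returned.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class ToyCache:
    """Toy model of an immutable, key-addressed cache (hypothetical, for illustration)."""

    # Maps key -> (creation order, payload).
    entries: Dict[str, Tuple[int, str]] = field(default_factory=dict)
    clock: int = 0

    def save(self, key: str, payload: str) -> bool:
        # An existing key is never overwritten: the first upload wins.
        if key in self.entries:
            return False
        self.clock += 1
        self.entries[key] = (self.clock, payload)
        return True

    def restore(self, key: str, restore_keys: List[str]) -> Optional[str]:
        # 1. Exact hit on the primary key.
        if key in self.entries:
            return self.entries[key][1]
        # 2. Otherwise try each restore key as a prefix, in order,
        #    and return the most recently created matching entry.
        for prefix in restore_keys:
            matches = [v for k, v in self.entries.items() if k.startswith(prefix)]
            if matches:
                return max(matches)[1]
        return None
```

Under these assumptions, a changed hash misses the exact key but still restores the newest entry that shares the prefix, so an older ccache directory can be reused as a starting point.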
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420672485
##########
File path: .github/workflows/python_cron.yml
##########
@@ -110,6 +119,12 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
Review comment:
The build time is dominated by the C++ implementation, so I thought it would be better to get cache hits whenever the underlying C++ implementation has not changed.
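
As a rough illustration of why such a key only changes when the C++ sources change, here is a minimal Python sketch. It is not the algorithm behind `hashFiles`; `tree_digest` and `cache_key` are hypothetical helpers that simply hash the relative paths and contents of every file under one directory.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """SHA-256 over the relative paths and contents of all files under root."""
    digest = hashlib.sha256()
    # Sort the paths so the digest does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
    return digest.hexdigest()


def cache_key(prefix: str, root: Path) -> str:
    # Same shape as the workflow key: "<matrix prefix>-<digest of the sources>".
    return f"{prefix}-{tree_digest(root)}"
```

With a key built this way, edits outside the hashed directory leave the key untouched, while any edit inside it produces a new key.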
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624554888
According to multiple comments in the cache action repository, eviction is done once per day rather than immediately after the 5GB repository-level limit is reached. This behavior will probably change.
Because the caching efficiency depends heavily on the cache eviction policy, I think we need to keep an eye on the build times and ccache hit rates and tune the cache keys and file hashing accordingly.
----------------------------------------------------------------
[GitHub] [arrow] kou commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kou commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420501687
##########
File path: .github/workflows/python.yml
##########
@@ -84,6 +93,14 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+          restore-keys: ${{ matrix.cache }}-
+      - name: Debug
+        run: mkdir -p .docker && du -d1 -h .docker
Review comment:
Should we remove this?
##########
File path: .github/workflows/python_cron.yml
##########
@@ -110,6 +119,12 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
Review comment:
`hashFiles('cpp/**', 'python/**')`?
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624570358
Many builds have finished in around 5 minutes, which is a fine improvement for now.
----------------------------------------------------------------
[GitHub] [arrow] github-actions[bot] commented on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622378960
Thanks for opening a pull request!
Could you open an issue for this pull request on JIRA?
https://issues.apache.org/jira/browse/ARROW
Then could you also rename pull request title in the following format?
ARROW-${JIRA_ID}: [${COMPONENT}] ${SUMMARY}
See also:
* [Other pull requests](https://github.com/apache/arrow/pulls/)
* [Contribution Guidelines - How to contribute patches](https://arrow.apache.org/docs/developers/contributing.html#how-to-contribute-patches)
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420695312
##########
File path: docker-compose.yml
##########
@@ -55,29 +55,12 @@
version: '3.5'
-volumes:
Review comment:
Created a follow-up to remove the `env` boilerplates from the github workflow files: https://issues.apache.org/jira/browse/ARROW-8713
----------------------------------------------------------------
[GitHub] [arrow] github-actions[bot] removed a comment on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
github-actions[bot] removed a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622378960
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on a change in pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs commented on a change in pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#discussion_r420670036
##########
File path: .github/workflows/python.yml
##########
@@ -84,6 +93,14 @@ jobs:
         run: ci/scripts/util_checkout.sh
       - name: Free Up Disk Space
         run: ci/scripts/util_cleanup.sh
+      - name: Cache Docker Volumes
+        uses: actions/cache@v1
+        with:
+          path: .docker
+          key: ${{ matrix.cache }}-${{ hashFiles('cpp/**') }}
+          restore-keys: ${{ matrix.cache }}-
+      - name: Debug
+        run: mkdir -p .docker && du -d1 -h .docker
Review comment:
Yes, removed.
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624381195
I need to trigger another CI run to see the benefits of the caching. The previous runs showed a decent speedup, but we can certainly increase the cache hit rate later with better cache key selection.
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624363183
I'm planning to merge it once it turns green, because it contains a critical fix for the Python workflow, which was incorrectly indented in the previous patch.
----------------------------------------------------------------
[GitHub] [arrow] pitrou commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624493995
Why are you hashing the files? This means if one C++ source file changes, the ccache won't be reused?
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622417613
With a warmed-up cache, the build time dropped from 17m to 6m, which is promising.
----------------------------------------------------------------
[GitHub] [arrow] kszucs removed a comment on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs removed a comment on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381
Checking that the cache properly works on my fork's master branch.
----------------------------------------------------------------
[GitHub] [arrow] pitrou commented on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-623367064
Did they increase the available cache size? Last I looked it was a fixed size for the entire repo.
----------------------------------------------------------------
[GitHub] [arrow] kszucs commented on pull request #7081: [CI] Cache docker volumes [WIP]
Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-622418381
Checking that the cache properly works on my fork's master branch.
----------------------------------------------------------------
[GitHub] [arrow] github-actions[bot] commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624268714
https://issues.apache.org/jira/browse/ARROW-8708
----------------------------------------------------------------
[GitHub] [arrow] pitrou commented on pull request #7081: ARROW-8708: [CI] Utilize github actions cache for docker-compose volumes [WIP]
Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #7081:
URL: https://github.com/apache/arrow/pull/7081#issuecomment-624546678
Yuck, I see.
----------------------------------------------------------------