You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@druid.apache.org by "paul-rogers (via GitHub)" <gi...@apache.org> on 2023/02/10 02:49:55 UTC

[GitHub] [druid] paul-rogers commented on a diff in pull request #13772: Fix GHA tests cache hit miss scenario

paul-rogers commented on code in PR #13772:
URL: https://github.com/apache/druid/pull/13772#discussion_r1102221995


##########
.github/workflows/reusable-revised-its.yml:
##########
@@ -61,16 +61,38 @@ jobs:
         with:
           path: ~/.m2/repository
           key: maven-${{ runner.os }}-${{ inputs.build_jdk }}-${{ github.sha }}
-          fail-on-cache-miss: true
+
+      - name: Restore targets
+        id: targets-restore
+        uses: actions/cache/restore@v3
+        with:
+          path: ./**/target
+          key: maven-${{ runner.os }}-${{ inputs.build_jdk }}-targets-${{ github.sha }}
 
       - name: Retrieve cached docker image
+        id: docker-restore
         uses: actions/cache/restore@v3
         with:
-          key: druid-container-jdk${{ inputs.build_jdk }}.tar.gz
+          key: druid-container-jdk${{ inputs.build_jdk }}.tar.gz-${{ github.sha }}
           path: |
             ./druid-container-jdk${{ inputs.build_jdk }}.tar.gz
             ./integration-tests-ex/image/target/env.sh
-          fail-on-cache-miss: true
+
+      - name: Maven build
+        if: steps.maven-restore.outputs.cache-hit != 'true' || ( steps.docker-restore.outputs.cache-hit != 'true' && steps.targets-restore.outputs.cache-hit != 'true' )
+        run: |
+          ./it.sh ci
+
+      - name: Create docker image
+        if: steps.docker-restore.outputs.cache-hit != 'true'
+        env:
+          docker-restore: ${{ toJson(steps.docker-restore.outputs) }}
+        run: |
+          ./it.sh image
+          source ./integration-tests-ex/image/target/env.sh

Review Comment:
   I still think that, in GHA, the image name should be passed into `.it.sh`, not retrieved from it. That is, set the image name so something like `drill-test-{{$ github.sha }}` or some such, where the suffix indicates the current commit or overall build. This way, we don't have to parse the env file.



##########
.github/workflows/reusable-standard-its.yml:
##########
@@ -74,12 +74,15 @@ jobs:
         with:
           path: ~/.m2/repository
           key: maven-${{ runner.os }}-${{ inputs.build_jdk }}-${{ github.sha }}
-          fail-on-cache-miss: true
+
+      - name: Maven build
+        if: steps.maven-restore.outputs.cache-hit != 'true'
+        run: |
+          ./it.sh ci

Review Comment:
   I wonder, why would the cache sometimes hit and sometimes not? Is the behavior non-deterministic? If so, is the cache atomic or might we sometimes get a partially-populated cache? Are we trying to populate the cache concurrently from reading from it?
   
   This makes me wonder? What problem are we trying to solve? Do we know the root cause, or are we just adding "here's what to do it we get an unexpected result" workaround?
   
   One really does hope that builds are deterministic!



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@druid.apache.org
For additional commands, e-mail: commits-help@druid.apache.org