Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/01 21:35:25 UTC

[GitHub] [spark] dongjoon-hyun opened a new pull request, #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

dongjoon-hyun opened a new pull request, #37745:
URL: https://github.com/apache/spark/pull/37745

   ### What changes were proposed in this pull request?
   
   This PR aims to add the `gcs-connector` shaded jar to the `hadoop-cloud` module.
   
   ### Why are the changes needed?
   
   To support Google Cloud Storage more easily.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Only one shaded jar file is added when the distribution is built with `-Phadoop-cloud`.
   ```
   $ ls -alh gcs*
   -rw-r--r--@ 1 dongjoon  staff    32M Aug 31 11:14 gcs-connector-hadoop3-2.2.7-shaded.jar
   ```
   
   ### How was this patch tested?
   
   **BUILD**
   ```
   $ dev/make-distribution.sh -Phadoop-cloud
   ```
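
A quick way to confirm that the build picked up the connector is to look for the jar in the resulting distribution. The following is a minimal sketch, not part of the PR: the helper names `list_gcs_jars` and `has_single_gcs_jar` are hypothetical, and the `dist/jars` layout is an assumption about where `dev/make-distribution.sh` places jars.

```shell
# Hypothetical helper: list gcs-connector jars under a distribution's jars/ directory.
# Assumes the layout <dist>/jars/*.jar; adjust the path if your build differs.
list_gcs_jars() {
  dist_dir="${1:-dist}"
  find "$dist_dir/jars" -maxdepth 1 -type f -name 'gcs-connector-*.jar' 2>/dev/null | sort
}

# Hypothetical helper: succeeds only when exactly one connector jar is present.
has_single_gcs_jar() {
  count=$(list_gcs_jars "$1" | wc -l | tr -d ' ')
  [ "$count" -eq 1 ]
}
```

For example, `has_single_gcs_jar dist && echo "connector jar present"` could be run from the Spark source root after the build.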
   
   **RUN**
   ```
   $ export KEYFILE=YOUR-credentials.json
   $ export EMAIL=$(jq -r '.client_email' < $KEYFILE)
   $ export PRIVATE_KEY_ID=$(jq -r '.private_key_id' < $KEYFILE)
   $ export PRIVATE_KEY="$(jq -r '.private_key' < $KEYFILE)"
   $ bin/spark-shell \
   -c spark.hadoop.fs.gs.auth.service.account.email=$EMAIL \
   -c spark.hadoop.fs.gs.auth.service.account.private.key.id=$PRIVATE_KEY_ID \
   -c spark.hadoop.fs.gs.auth.service.account.private.key="$PRIVATE_KEY"
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   22/08/31 11:56:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   Spark context Web UI available at http://localhost:4040
   Spark context available as 'sc' (master = local[*], app id = local-1661972165062).
   Spark session available as 'spark'.
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.4.0-SNAPSHOT
         /_/
   
   Using Scala version 2.12.16 (OpenJDK 64-Bit Server VM, Java 17.0.4)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala> spark.read.text("gs://apache-spark-bucket/README.md").count()
   res0: Long = 124
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234071272

   I'm already allowing all of them.
   <img width="405" alt="Screen Shot 2022-09-01 at 3 19 30 AM" src="https://user-images.githubusercontent.com/9700541/187891526-5938feb5-d380-4574-a81a-9b621779dead.png">
   




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234081059

   Could you check this link?
   
   https://github.com/users/dongjoon-hyun/packages/container/package/apache-spark-ci-image/settings
   
   ![image](https://user-images.githubusercontent.com/1736354/187893383-37752514-c9be-4f3b-bb53-a3f8cdc3e25c.png)
   




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234063012

   @dongjoon-hyun I just saw that you recreated the spark repo, so perhaps the default GitHub Actions permissions have changed?
   
   You could first set the permissions for your dongjoon-hyun/spark repo: https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/#setting-the-default-permissions-for-the-organization-or-repository
   
   We might also need a separate PR to set the permissions in the Spark workflow for newly created repos: https://github.blog/changelog/2021-04-20-github-actions-control-permissions-for-github_token/#setting-permissions-in-the-workflow




[GitHub] [spark] dongjoon-hyun closed pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module
URL: https://github.com/apache/spark/pull/37745




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234097485

   The potential issue might be that you removed the old repo but the image was not deleted, so when you created the new repo, write permission on this image was not granted to the new repo.




[GitHub] [spark] steveloughran commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1235374285

   Anyway, the version you are looking at is probably safe; the connector switched to Java 11 in Feb 2022 (PR 726).




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1233311486

   cc @sunchao , @steveloughran 




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234014605

   The failure is irrelevant to this PR. It seems that the base image publishing is broken again. cc @Yikun 
   ```
   #33 ERROR: failed commit on ref "manifest-sha256:d7fdbdf2cdb51876ca0c22c2e3b1865b11e2058fd9a07e59f281f7596cb00956": unexpected status: 403 Forbidden
    > exporting to image:
   ERROR: failed to solve: failed commit on ref "manifest-sha256:d7fdbdf2cdb51876ca0c22c2e3b1865b11e2058fd9a07e59f281f7596cb00956": unexpected status: 403 Forbidden
   Error: buildx failed with: ERROR: failed to solve: failed commit on ref "manifest-sha256:d7fdbdf2cdb51876ca0c22c2e3b1865b11e2058fd9a07e59f281f7596cb00956": unexpected status: 403 Forbidden
   ```




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234807199

   Ur, wait. @steveloughran . It's Java 8, isn't it?
   
   https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/8453ce7ce7510e983bae7470909fbd02704c0539/pom.xml#L76-L77
   
   ```
       <build.java.source.version>8</build.java.source.version>
       <build.java.target.version>8</build.java.target.version>
   ```
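
The `source`/`target` settings quoted above determine the class-file version that ends up in the jar, so the Java level can also be checked on a built artifact. The following is a minimal sketch, assuming a class file has already been extracted from the connector jar (for example with `unzip`); the helper names are hypothetical. The mapping `release = major - 44` holds for Java 5 and later (major 52 is Java 8, major 55 is Java 11).

```shell
# Hypothetical helper: map a class-file major version to a Java release.
# Valid for Java 5 and later (major 49+): 52 -> 8, 55 -> 11.
java_release_for_major() {
  echo $(( $1 - 44 ))
}

# Hypothetical helper: read the big-endian major version stored in
# bytes 7-8 of a .class file (after the 4-byte magic and 2-byte minor).
class_major_version() {
  od -An -j6 -N2 -tu1 "$1" | awk '{ print $1 * 256 + $2 }'
}
```

Chaining the two, `java_release_for_major "$(class_major_version Some.class)"` prints the Java release a class was compiled for, where `Some.class` stands for whichever class you extracted.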




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1233372943

   - Yes, this is only for better GCS support for users who build with `-Phadoop-cloud`.
   - The Apache Spark distribution doesn't use `-Phadoop-cloud` during our release process, so the published artifacts are not affected. The additional 32M in size is incurred only on the user side.




[GitHub] [spark] dongjoon-hyun commented on a diff in pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on code in PR #37745:
URL: https://github.com/apache/spark/pull/37745#discussion_r960841601


##########
hadoop-cloud/pom.xml:
##########
@@ -135,6 +135,18 @@
         </exclusion>
       </exclusions>
     </dependency>
+    <dependency>
+      <groupId>com.google.cloud.bigdataoss</groupId>
+      <artifactId>gcs-connector</artifactId>
+      <version>${gcs-connector.version}</version>
+      <classifier>shaded</classifier>
+      <exclusions>
+        <exclusion>
+          <groupId>*</groupId>

Review Comment:
   Thank you for the review, @sunchao . According to the shaded pattern,
   
   https://github.com/GoogleCloudDataproc/hadoop-connectors/blob/8453ce7ce7510e983bae7470909fbd02704c0539/gcs/pom.xml#L208-L363
   
   We have everything we need for Hadoop3 and Hadoop2.
   - For Hadoop3, https://mvnrepository.com/artifact/com.google.cloud.bigdataoss/gcs-connector/hadoop3-2.2.7
   - For Hadoop2, https://mvnrepository.com/artifact/com.google.cloud.bigdataoss/gcs-connector/hadoop2-2.2.7
   
   I intentionally exclude everything. If a transitive dependency turns out to be missing, we will add Spark's version of it.





[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234077478

   It's weird. IIRC, I didn't change anything in my previous repo either when your PR introduced this change.




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234067056

   https://github.com/dongjoon-hyun/spark/settings/actions
   
   ![image](https://user-images.githubusercontent.com/1736354/187890839-2f26ce10-2e20-4d7e-ab6e-311c898fc416.png)
   




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234388728

   1. I checked that my settings are the same as yours.
   <img width="1015" alt="Screen Shot 2022-09-01 at 7 47 34 AM" src="https://user-images.githubusercontent.com/9700541/187943970-bd5d40bf-8545-4d50-b7eb-16fc4a0440d8.png">
   
   2. Let me try to clean up
   ```
   curl -X DELETE -H "Accept: application/vnd.github+json" -H "Authorization: token $REPLACE_ME" https://api.github.com/user/packages/container/apache-spark-ci-image
   ```
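
Because the `DELETE` call above removes the whole package, it may be worth listing its versions first. The following is a minimal sketch, assuming the token is exported as `GITHUB_TOKEN` (a variable name chosen here for illustration) and has permission to read packages; the helper names are hypothetical.

```shell
# Hypothetical helper: build the REST URL for a container package
# owned by the authenticated user.
ghcr_package_url() {
  echo "https://api.github.com/user/packages/container/$1"
}

# Hypothetical helper: list the versions of a package (read-only call).
# GITHUB_TOKEN is an assumed variable name, not something from the thread.
list_package_versions() {
  curl -fsS \
    -H "Accept: application/vnd.github+json" \
    -H "Authorization: token $GITHUB_TOKEN" \
    "$(ghcr_package_url "$1")/versions"
}
```

For example, `list_package_versions apache-spark-ci-image` would show what is about to be removed before the `DELETE` request is sent.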




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234041557

   @dongjoon-hyun Thanks for pinging me. This is due to GHCR being unstable in GitHub Actions; retrying should make it work.




[GitHub] [spark] sunchao commented on a diff in pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
sunchao commented on code in PR #37745:
URL: https://github.com/apache/spark/pull/37745#discussion_r960782293


##########
hadoop-cloud/pom.xml:
##########
@@ -135,6 +135,18 @@
         </exclusion>
       </exclusions>
     </dependency>
+    <dependency>
+      <groupId>com.google.cloud.bigdataoss</groupId>
+      <artifactId>gcs-connector</artifactId>
+      <version>${gcs-connector.version}</version>
+      <classifier>shaded</classifier>
+      <exclusions>
+        <exclusion>
+          <groupId>*</groupId>

Review Comment:
   Curious: why do we exclude everything from the shaded jar?





[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1238778068

   Thank you for your review, comments, and help, @srowen , @Yikun , @sunchao , @steveloughran . It seems that there are no other concerns, so I'll merge this.




[GitHub] [spark] srowen commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
srowen commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1233318749

   Seems OK, but what does it buy us? GCS storage support? Only downside is increasing the sea of JARs in the project, I guess.




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234803915

   Thank you for the review, @steveloughran .
   
   > note that the gcs connector (at least the builds off their master) is java 11 only; not sure where that stands w.r.t. older releases
   
   I didn't realize this because I've been using Java 11+. If so, I had better close this PR and the JIRA officially.
   
   Thank you, @srowen , @sunchao and @steveloughran !




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234507890

   Thank you so much, @Yikun . Now, it seems to work on my three PRs.




[GitHub] [spark] steveloughran commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
steveloughran commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1235372490

   3.0.0 is java 11
   ```
       <build.java.source.version>11</build.java.source.version>
       <build.java.target.version>11</build.java.target.version>
   ```
   That is hadoop-3.3.1, which simplifies my life a lot too... the streams even support IOStatistics. I've been testing my manifest committer through it and have had to bump the test shell up to Java 11 for everything to work.




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234055469

   Thank you, but the `Base Image Build` phase has already failed three times.
   <img width="328" alt="Screen Shot 2022-09-01 at 3 05 48 AM" src="https://user-images.githubusercontent.com/9700541/187888992-48c0292b-2586-421b-8f9e-9b514ab35cb2.png">
   




[GitHub] [spark] Yikun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
Yikun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234426066

   https://github.com/users/dongjoon-hyun/packages/container/apache-spark-ci-image/settings
   
   You can also remove it from the page above.




[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1234809633

   This PR passed CI here.
   - https://github.com/dongjoon-hyun/spark/runs/8139684479?check_suite_focus=true




[GitHub] [spark] steveloughran commented on a diff in pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
steveloughran commented on code in PR #37745:
URL: https://github.com/apache/spark/pull/37745#discussion_r960971095


##########
hadoop-cloud/pom.xml:
##########
@@ -135,6 +135,18 @@
         </exclusion>
       </exclusions>
     </dependency>
+    <dependency>
+      <groupId>com.google.cloud.bigdataoss</groupId>
+      <artifactId>gcs-connector</artifactId>
+      <version>${gcs-connector.version}</version>
+      <classifier>shaded</classifier>
+      <exclusions>
+        <exclusion>
+          <groupId>*</groupId>

Review Comment:
   The issue is that there's a history of the shaded connector still declaring dependencies on things which are now shaded, which breaks convergence.





[GitHub] [spark] dongjoon-hyun commented on pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on PR #37745:
URL: https://github.com/apache/spark/pull/37745#issuecomment-1235772123

   There is no 3.0.0 yet. :)
   ![Screen Shot 2022-09-02 at 10 59 41 AM](https://user-images.githubusercontent.com/9700541/188211576-79fe7c0b-fe5d-4405-bece-69bf2bb1106e.png)
   




[GitHub] [spark] dongjoon-hyun closed pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #37745: [SPARK-33605][BUILD] Add `gcs-connector` to `hadoop-cloud` module
URL: https://github.com/apache/spark/pull/37745

