Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/08/31 17:52:00 UTC

[jira] [Assigned] (SPARK-33605) Add GCS FS/connector config (dependencies?) akin to S3

     [ https://issues.apache.org/jira/browse/SPARK-33605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-33605:
------------------------------------

    Assignee: Apache Spark

> Add GCS FS/connector config (dependencies?) akin to S3
> ------------------------------------------------------
>
>                 Key: SPARK-33605
>                 URL: https://issues.apache.org/jira/browse/SPARK-33605
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark, Spark Core
>    Affects Versions: 3.0.1
>            Reporter: Rafal Wojdyla
>            Assignee: Apache Spark
>            Priority: Major
>
> Spark comes with some S3 batteries included, which makes it easier to use with S3; for GCS to work, users are required to configure the jars manually. This is especially problematic for Python users, who may not be accustomed to Java dependency management. [pyspark_gcs|https://github.com/ravwojdyla/pyspark_gcs] is one example of a workaround for PySpark. If we included the [GCS connector|https://cloud.google.com/dataproc/docs/concepts/connectors/cloud-storage], it would make things easier for GCS users.
> Please let me know what you think.
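[Editor's note: the manual setup the issue describes can be sketched as follows. This is a hedged illustration, not the proposed fix: the connector version, the app name, and the `build_session` helper are assumptions for the example; the Hadoop configuration keys follow the GCS connector documentation.]

```python
# Sketch of the manual GCS wiring a PySpark user currently has to do by hand.
# The connector version below is an assumption; check the gcs-connector
# releases for the coordinate matching your Hadoop version.
GCS_PACKAGE = "com.google.cloud.bigdataoss:gcs-connector:hadoop3-2.2.5"

# Spark/Hadoop configuration that registers the gs:// filesystem.
GCS_CONFS = {
    # Fetch the connector jar at session startup.
    "spark.jars.packages": GCS_PACKAGE,
    # Map the gs:// scheme to the connector's FileSystem implementations.
    "spark.hadoop.fs.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem",
    "spark.hadoop.fs.AbstractFileSystem.gs.impl":
        "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFS",
}

def build_session(confs=GCS_CONFS):
    """Create a SparkSession able to read gs:// paths (hypothetical helper)."""
    from pyspark.sql import SparkSession  # deferred: pyspark may be absent
    builder = SparkSession.builder.appName("gcs-example")
    for key, value in confs.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()

# Usage (bucket name is a placeholder):
#   spark = build_session()
#   df = spark.read.parquet("gs://my-bucket/path")
```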



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org