
Posted to issues@spark.apache.org by "Weichen Xu (JIRA)" <ji...@apache.org> on 2019/07/12 11:49:00 UTC

[jira] [Created] (SPARK-28366) Logging in driver when loading single large gzipped file via sc.textFile

Weichen Xu created SPARK-28366:
----------------------------------

             Summary: Logging in driver when loading single large gzipped file via sc.textFile
                 Key: SPARK-28366
                 URL: https://issues.apache.org/jira/browse/SPARK-28366
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.4.3
            Reporter: Weichen Xu


Because gzipped files are not splittable, Spark has to read and decompress a large gzipped file with a single task (one partition). This can be very slow.

We should log a message on the driver side when this case occurs.
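
A minimal sketch of the kind of check this proposes, written in plain Python rather than against Spark internals. Everything named here is an assumption for illustration: the function names, the logger name, the 1 GiB threshold, and the choice to detect gzip by the ".gz" extension are all hypothetical and are not taken from Spark's code.

```python
import logging

# Hypothetical logger standing in for the driver-side logger.
logger = logging.getLogger("driver")

# Hypothetical threshold; the issue does not specify one.
LARGE_FILE_THRESHOLD_BYTES = 1024 * 1024 * 1024  # 1 GiB

# Extensions treated as non-splittable in this sketch (assumption: detect by name only).
UNSPLITTABLE_EXTENSIONS = (".gz",)


def is_large_unsplittable(path, size_bytes, threshold=LARGE_FILE_THRESHOLD_BYTES):
    """Return True if the path looks gzipped and its size exceeds the threshold."""
    return path.lower().endswith(UNSPLITTABLE_EXTENSIONS) and size_bytes > threshold


def warn_if_large_unsplittable(path, size_bytes, threshold=LARGE_FILE_THRESHOLD_BYTES):
    """Log a warning for a large unsplittable input; return whether a warning was logged."""
    if is_large_unsplittable(path, size_bytes, threshold):
        logger.warning(
            "Loading large unsplittable file %s (%d bytes) with a single task; "
            "this may be slow.",
            path,
            size_bytes,
        )
        return True
    return False
```

For example, under these assumptions, warn_if_large_unsplittable("data/big.txt.gz", 2 * 1024**3) would log the warning, while the same call for an uncompressed or small file would not.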



