Posted to issues@spark.apache.org by "lufei (JIRA)" <ji...@apache.org> on 2017/08/24 06:54:00 UTC

[jira] [Created] (SPARK-21822) When insert Hive Table is finished, it is better to clean out the tmpLocation dir

lufei created SPARK-21822:
-----------------------------

             Summary: When insert Hive Table is finished, it is better to clean out the tmpLocation dir
                 Key: SPARK-21822
                 URL: https://issues.apache.org/jira/browse/SPARK-21822
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.0
            Reporter: lufei


When an insert into a Hive table finishes, it would be better to clean out the tmpLocation dir (temporary directories such as ".hive-staging_hive_2017-08-19_10-56-01_540_5448395226195533570-9/-ext-10000", or "/tmp/hive/..." in older Spark versions).
Otherwise, when many Spark jobs are executed, millions of temporary directories are left in HDFS, and they can only be deleted by a maintainer using a shell script.
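
To illustrate the requested behavior, here is a minimal sketch in Python that uses the local filesystem as a stand-in for HDFS. It is not Spark's implementation: the names staging_dir and insert_rows, the "part-00000" file name, and the directory prefix are all hypothetical and chosen only for illustration. The point is that the staging directory is removed as soon as the insert finishes, whether it succeeded or failed.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterable, Iterator


@contextmanager
def staging_dir(table_dir: Path, prefix: str = ".hive-staging_hive_") -> Iterator[Path]:
    """Create a staging directory under table_dir and remove it on exit.

    Hypothetical helper: a local-filesystem analogue of the tmpLocation dir.
    """
    table_dir.mkdir(parents=True, exist_ok=True)
    tmp_location = Path(tempfile.mkdtemp(prefix=prefix, dir=table_dir))
    try:
        yield tmp_location
    finally:
        # Clean up whether the insert succeeded or raised an error.
        shutil.rmtree(tmp_location, ignore_errors=True)


def insert_rows(table_dir: Path, rows: Iterable[str]) -> Path:
    """Write rows to a staged file, then move it into the table directory."""
    with staging_dir(table_dir) as tmp_location:
        staged = tmp_location / "part-00000"
        staged.write_text("".join(row + "\n" for row in rows))
        final = table_dir / "part-00000"
        # Rename within the same filesystem; overwrites an existing file.
        staged.replace(final)
    # At this point the staging directory no longer exists.
    return final
```

With this pattern no staging directory outlives the insert that created it, so no separate cleanup script is needed afterwards.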



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
