Posted to issues@kylin.apache.org by "hongbin ma (JIRA)" <ji...@apache.org> on 2015/09/01 12:40:45 UTC

[jira] [Commented] (KYLIN-978) GarbageCollectionStep dropped Hive Intermediate Table but didn't drop external hdfs path

    [ https://issues.apache.org/jira/browse/KYLIN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725161#comment-14725161 ] 

hongbin ma commented on KYLIN-978:
----------------------------------

In case this helps:

Even with GarbageCollectionStep, some garbage files are still left behind as Kylin keeps running. We provide a tool to clean them up:
{code}
hbase org.apache.hadoop.util.RunJar $KYLIN_HOME/lib/kylin-job-0.8.1-incubating-SNAPSHOT.jar org.apache.kylin.job.hadoop.cube.StorageCleanupJob --delete true
{code}

This removes leftover garbage in both HDFS and HBase.
The leftover external Hive files are also cleaned up by this command.

However, removing the garbage in GarbageCollectionStep itself would be even better.
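
As a rough illustration of what cleaning up at that step involves, here is a minimal shell sketch of the two actions: dropping the intermediate table and then removing the external directory it pointed to. This is not the Kylin implementation. The helper names (run, cleanup_intermediate) and the DRY_RUN switch are made up for the example, and the table name and path are arguments you supply yourself. It only relies on the standard "hive -e" and "hadoop fs -rm -r" commands.

```shell
#!/bin/sh
# Sketch only: run, cleanup_intermediate and DRY_RUN are hypothetical names,
# not part of Kylin. DRY_RUN=1 (the default) prints the commands instead of
# executing them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'would run: %s\n' "$*"
  else
    "$@"
  fi
}

# Drop the intermediate table, then remove its external directory.
# DROP TABLE on an external Hive table removes only the metadata, so the
# data files have to be deleted separately.
cleanup_intermediate() {
  table="$1"      # name of the intermediate Hive table (caller-supplied)
  ext_path="$2"   # HDFS directory backing that table (caller-supplied)
  run hive -e "DROP TABLE IF EXISTS ${table};"
  run hadoop fs -rm -r "${ext_path}"
}
```

Call it as cleanup_intermediate <table> <hdfs path> and check the printed commands first; setting DRY_RUN=0 makes it actually execute them.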



> GarbageCollectionStep dropped Hive Intermediate Table but didn't drop external hdfs path
> ----------------------------------------------------------------------------------------
>
>                 Key: KYLIN-978
>                 URL: https://issues.apache.org/jira/browse/KYLIN-978
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v0.7.2
>            Reporter: Yerui Sun
>            Assignee: Shaofeng SHI
>
> In GarbageCollectionStep, the Hive intermediate table created in step 1 is dropped.
> Because the table is an external table, its data is stored in an external HDFS path, like '.../kylin-${jobId}/kylin_intermediate_...', which is not deleted when the Hive table is dropped.
> Considering the purpose of GarbageCollectionStep, the external data path should also be deleted.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)