Posted to issues@kylin.apache.org by "hongbin ma (JIRA)" <ji...@apache.org> on 2015/09/01 12:40:45 UTC
[jira] [Commented] (KYLIN-978) GarbageCollectionStep dropped Hive
Intermediate Table but didn't drop external hdfs path
[ https://issues.apache.org/jira/browse/KYLIN-978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725161#comment-14725161 ]
hongbin ma commented on KYLIN-978:
----------------------------------
In case this helps:
Even with GarbageCollectionStep, some garbage files are still left behind as Kylin keeps running. We provide a tool to clean them up:
{code}
hbase org.apache.hadoop.util.RunJar $KYLIN_HOME/lib/kylin-job-0.8.1-incubating-SNAPSHOT.jar org.apache.kylin.job.hadoop.cube.StorageCleanupJob --delete true
{code}
This removes the leftover garbage in both HDFS and HBase.
The leftover external Hive files are also cleaned up by this command.
However, it would be even better to remove this garbage in GarbageCollectionStep itself.
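Before deleting anything, it may be worth checking what the tool would remove. The sketch below wraps the command from this comment in a hypothetical helper, cleanup_cmd, that only prints the command line for a given flag value. It assumes that --delete also accepts false and that the job then only reports cleanup candidates; the comment itself shows only --delete true, so please verify this against your Kylin version.

```shell
#!/bin/sh
# Sketch only: cleanup_cmd is a hypothetical helper, not part of Kylin.
# The jar path and class name are copied from the command in this comment.

cleanup_cmd() {
    # $1: value for --delete. "true" deletes; "false" is ASSUMED to only
    # report what would be deleted (not stated in the comment).
    delete_flag="$1"
    # Placeholder default so the sketch runs without a Kylin install.
    kylin_home="${KYLIN_HOME:-/path/to/kylin}"
    echo "hbase org.apache.hadoop.util.RunJar ${kylin_home}/lib/kylin-job-0.8.1-incubating-SNAPSHOT.jar org.apache.kylin.job.hadoop.cube.StorageCleanupJob --delete ${delete_flag}"
}

# Print both command lines instead of running them.
cleanup_cmd false
cleanup_cmd true
```

Running the printed "false" variant first and reviewing its output before the "true" variant keeps the destructive step deliberate.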
> GarbageCollectionStep dropped Hive Intermediate Table but didn't drop external hdfs path
> ----------------------------------------------------------------------------------------
>
> Key: KYLIN-978
> URL: https://issues.apache.org/jira/browse/KYLIN-978
> Project: Kylin
> Issue Type: Bug
> Components: Job Engine
> Affects Versions: v0.7.2
> Reporter: Yerui Sun
> Assignee: Shaofeng SHI
>
> In GarbageCollectionStep, the Hive intermediate table created in step 1 is dropped.
> Because the table is an external table, its data is stored in an external HDFS path, like '.../kylin-${jobId}/kylin_intermediate_...', which is not deleted when the Hive table is dropped.
> Considering the purpose of GarbageCollectionStep, the external data path should also be deleted.
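The behavior described in the issue can be illustrated with a small sketch. Dropping an external Hive table removes only its metastore entry, so the data directory needs a separate delete. The helper drop_intermediate below is hypothetical and is not Kylin's implementation; the table name and path are supplied by the caller, and the function only prints the two commands (hive -e and hadoop fs -rm -r) rather than running them.

```shell
#!/bin/sh
# Sketch only: drop_intermediate is a hypothetical helper, not Kylin code.
# It prints the commands so they can be reviewed before being run.

drop_intermediate() {
    table="$1"      # Hive intermediate table name (caller-supplied)
    data_path="$2"  # external HDFS location of that table (caller-supplied)
    # Dropping an external table removes only the metastore entry.
    echo "hive -e \"DROP TABLE IF EXISTS ${table};\""
    # The data directory has to be removed as a separate step.
    echo "hadoop fs -rm -r ${data_path}"
}

# Placeholder arguments for illustration only.
drop_intermediate "example_table" "/example/path"
```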