Posted to issues@kylin.apache.org by "WangSheng (JIRA)" <ji...@apache.org> on 2019/06/28 03:41:00 UTC

[jira] [Created] (KYLIN-4060) "Garbage Collection on HDFS" step failed because of hdfs path not exists

WangSheng created KYLIN-4060:
--------------------------------

             Summary: "Garbage Collection on HDFS" step failed because of hdfs path not exists
                 Key: KYLIN-4060
                 URL: https://issues.apache.org/jira/browse/KYLIN-4060
             Project: Kylin
          Issue Type: Bug
          Components: Job Engine
    Affects Versions: v2.4.1
            Reporter: WangSheng


We recently found a bug when using a streaming cube: the last job step, "Garbage Collection on HDFS", failed. The problem is as below:

 
{code:java}
Drop HDFS path on FileSystem: "hdfs://kylin-cluster" 
HDFS path /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test/fact_distinct_columns not exists.

File /user/kylin/kylin_home/kylin_metadata/kylin-03c04b31-5d40-441a-a0df-289f5977b733/cube_test does not exist.
{code}
After checking the code and the log, I found that the main cause is:

 
 # A build job was submitted first, and at step "Update Cube Info" its segment became "READY";
 # Kylin then automatically submitted a merge job that included the segment from step 1. The merge job finished quickly and deleted the HDFS paths of its input segments;
 # After the merge job finished, the build job continued with "Hive Cleanup" and "Garbage Collection on HBase", then failed at the last step because its HDFS path had already been deleted in step 2.
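
One possible mitigation is for the cleanup step to treat an already-missing path as "nothing left to clean" rather than as a failure. The following is only a minimal sketch of that idea, not Kylin's actual code: the class and method names are hypothetical, and the local filesystem (java.nio.file) stands in for HDFS so the example runs without a cluster.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration only; this is not Kylin's garbage collection step.
// Assumption: the local filesystem stands in for HDFS.
class TolerantGarbageCollectionSketch {

    // Deletes each path recursively. A path that is already gone (for example,
    // removed earlier by a merge job) is recorded as skipped instead of
    // failing the whole step.
    static List<Path> dropPaths(List<Path> paths) throws IOException {
        List<Path> skipped = new ArrayList<>();
        for (Path path : paths) {
            if (!deleteRecursively(path)) {
                skipped.add(path);
            }
        }
        return skipped;
    }

    // Returns true if the path existed and was deleted, false if it was missing.
    static boolean deleteRecursively(Path root) throws IOException {
        List<Path> entries;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order lists children before parents, so every directory
            // is empty by the time it is deleted.
            entries = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        } catch (NoSuchFileException e) {
            return false;
        }
        for (Path entry : entries) {
            Files.deleteIfExists(entry);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("gc-sketch");
        Path jobDir = base.resolve("cube_test");
        Files.createDirectories(jobDir.resolve("fact_distinct_columns"));

        // Simulate the merge job deleting the segment's working directory first.
        deleteRecursively(jobDir);

        // The build job's cleanup now skips the missing path instead of failing.
        List<Path> skipped = dropPaths(Arrays.asList(jobDir));
        System.out.println("Skipped (already deleted): " + skipped);
        Files.deleteIfExists(base);
    }
}
```

Whether skipping is the right behaviour for the real step (as opposed to, say, preventing the merge job from picking up a segment whose build job is still running) is a design question for the Job Engine; the sketch only shows the tolerant-delete variant.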

Our version is 2.4.x. I'm not sure whether this bug has been fixed in the latest 2.6.x version. If not, please assign this Jira to me, thanks!


