Posted to issues@kylin.apache.org by "Gaurav Rawat (JIRA)" <ji...@apache.org> on 2018/11/02 02:39:00 UTC

[jira] [Commented] (KYLIN-3555) Garbage collection on HBase step fails with S3 selected as storage

    [ https://issues.apache.org/jira/browse/KYLIN-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16672507#comment-16672507 ] 

Gaurav Rawat commented on KYLIN-3555:
-------------------------------------

I faced the same issue with similar settings. For now I am trying an empty value for "kylin.storage.hbase.cluster-fs", and it appears to be working: I no longer see the error in the last step.
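
For reference, a minimal sketch of that workaround as it might appear in kylin.properties. The file name and the exact empty-value form are assumptions; only the property name comes from the comment above, and this has not been verified beyond the setup described here.

{code}
# Assumed location: conf/kylin.properties
# Leave the HBase cluster file system unset (empty value), as described above.
kylin.storage.hbase.cluster-fs=
{code}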

> Garbage collection on HBase step fails with S3 selected as storage
> ------------------------------------------------------------------
>
>                 Key: KYLIN-3555
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3555
>             Project: Kylin
>          Issue Type: Bug
>          Components: Job Engine
>    Affects Versions: v2.4.1
>            Reporter: Iñigo Martinez
>            Priority: Major
>              Labels: build
>         Attachments: Screenshot from 2018-09-11 12-31-25.png
>
>
> When building a cube with S3 selected as storage, the build process fails at the last step.
> Although S3 has been defined as storage, the cleanup task tries to delete from HDFS and, of course, the file does not exist on HDFS.
>  
> {code:java}
> 2018-09-11 12:27:56,311 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: s3://XXXXXXX-emr-kylin
> 2018-09-11 12:27:57,364 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns is dropped.
> 2018-09-11 12:27:58,104 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:87 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/hfile is dropped.
> 2018-09-11 12:27:58,140 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:78 : Drop HDFS path on FileSystem: hdfs://ip-10-0-1-63.eu-west-1.compute.internal:8020
> 2018-09-11 12:27:58,142 DEBUG [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:90 : HDFS path /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1/fact_distinct_columns not exists.
> 2018-09-11 12:27:58,147 ERROR [Scheduler 1407846257 Job f8416975-eea6-4500-9cb7-4374f28451dc-237] steps.HDFSPathGarbageCollectionStep:68 : job:f8416975-eea6-4500-9cb7-4374f28451dc-15 execute finished with exception
> java.io.FileNotFoundException: File /kylin/kylin_metadata/kylin-f8416975-eea6-4500-9cb7-4374f28451dc/plataforma_transacciones_cubo_v1 does not exist.
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:904)
> at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:114)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:964)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:961)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:971)
> at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.dropHdfsPathOnCluster(HDFSPathGarbageCollectionStep.java:95)
> at org.apache.kylin.storage.hbase.steps.HDFSPathGarbageCollectionStep.doWork(HDFSPathGarbageCollectionStep.java:65)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
> at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:69)
> at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:162)
> at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:113)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)