Posted to issues@carbondata.apache.org by "Manish Gupta (JIRA)" <ji...@apache.org> on 2018/01/05 10:42:00 UTC

[jira] [Resolved] (CARBONDATA-1896) Clean files operation improvement

     [ https://issues.apache.org/jira/browse/CARBONDATA-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manish Gupta resolved CARBONDATA-1896.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.3.0

> Clean files operation improvement
> ---------------------------------
>
>                 Key: CARBONDATA-1896
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-1896
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: dhatchayani
>            Assignee: dhatchayani
>             Fix For: 1.3.0
>
>          Time Spent: 14h 40m
>  Remaining Estimate: 0h
>
> +*Problem:*+
> When a session is brought up, the clean operation marks every INSERT_OVERWRITE_IN_PROGRESS or INSERT_IN_PROGRESS segment as MARKED_FOR_DELETE in the tablestatus file. This clean operation does not take other parallel sessions into account: if another session's data load is IN_PROGRESS when a session is brought up, that running load is also changed to MARKED_FOR_DELETE, irrespective of its actual load status. Cleaning stale segments during session bring-up also increases the time it takes to bring up a session.
> +*Solution:*+
> A SEGMENT_LOCK should be taken on the new segment while loading.
> While cleaning segments, both the tablestatus file and the SEGMENT_LOCK should be considered.
> Cleaning of stale files during session bring-up should be removed. It can instead be done manually on the required tables through the already existing CLEAN FILES DDL, or the next load will clean them automatically.
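
The proposed solution can be illustrated with a minimal sketch. This is not the CarbonData implementation: the status names come from the description above, while SegmentLocks, clean_stale_segments and the in-memory dictionary standing in for the tablestatus file are hypothetical names chosen for illustration. A real deployment would use a lock visible across sessions, not a process-local one.

```python
import threading

# Status names taken from the issue description; SUCCESS is assumed for illustration.
INSERT_IN_PROGRESS = "INSERT_IN_PROGRESS"
INSERT_OVERWRITE_IN_PROGRESS = "INSERT_OVERWRITE_IN_PROGRESS"
MARKED_FOR_DELETE = "MARKED_FOR_DELETE"
SUCCESS = "SUCCESS"

IN_PROGRESS_STATUSES = {INSERT_IN_PROGRESS, INSERT_OVERWRITE_IN_PROGRESS}


class SegmentLocks:
    """Hypothetical in-memory stand-in for per-segment SEGMENT_LOCKs."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()

    def _lock_for(self, segment_id):
        # Create the lock for a segment on first use.
        with self._guard:
            return self._locks.setdefault(segment_id, threading.Lock())

    def try_acquire(self, segment_id):
        # Non-blocking: returns False if a load already holds the lock.
        return self._lock_for(segment_id).acquire(blocking=False)

    def release(self, segment_id):
        self._lock_for(segment_id).release()


def clean_stale_segments(table_status, locks):
    """Mark in-progress segments as MARKED_FOR_DELETE only when their
    segment lock is free, i.e. no live load still owns the segment."""
    cleaned = []
    for segment_id, status in list(table_status.items()):
        if status not in IN_PROGRESS_STATUSES:
            continue
        if not locks.try_acquire(segment_id):
            continue  # lock is held, so the load is still running
        try:
            table_status[segment_id] = MARKED_FOR_DELETE
            cleaned.append(segment_id)
        finally:
            locks.release(segment_id)
    return cleaned


if __name__ == "__main__":
    locks = SegmentLocks()
    table_status = {"0": SUCCESS, "1": INSERT_IN_PROGRESS, "2": INSERT_IN_PROGRESS}
    locks.try_acquire("1")  # simulate a live load holding segment 1
    print(clean_stale_segments(table_status, locks), table_status)
```

In this sketch a segment whose lock is held is skipped, so only segments left behind by a dead load are marked for deletion. The manual path mentioned above would go through the existing DDL, which in CarbonData takes the form CLEAN FILES FOR TABLE followed by the table name.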



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)