Posted to dev@helix.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/12/10 01:49:10 UTC

[jira] [Commented] (HELIX-617) Job IdealState is generated even the job is not running and not removed when it is completed.

    [ https://issues.apache.org/jira/browse/HELIX-617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049750#comment-15049750 ] 

ASF GitHub Bot commented on HELIX-617:
--------------------------------------

GitHub user lei-xia opened a pull request:

    https://github.com/apache/helix/pull/40

    [HELIX-617] Job IdealState is generated even the job is not running and not removed when it is completed

    – Problem:
    In the current task framework implementation, one IdealState znode and one ExternalView znode are generated for each job as soon as the job is queued, not when it is started. These znodes stay in ZooKeeper until the specified data expiry time has passed (usually one day in the Espresso backup setting). The problem is worse for recurrent job queues: one IdealState/ExternalView pair is created for the original job template, and another pair is created for each scheduled run of the job.
    
    This usually generates a significant number of IdealState znodes. For example, scheduling a queue with 100 jobs creates 100 "template" IdealStates that stay there forever. In addition, a new set of 100 "scheduling" IdealStates is created every time these jobs are scheduled to run, and each set stays until the expiry time set in the job configuration has passed (usually a relatively long time).
    
    – Proposed Changes:
    The IdealState of a job will be generated only when the job is scheduled to run, and will be removed immediately once the job is completed. In practice, only one job runs at a time in each queue, so only one job IdealState will exist for each queue at any given time.
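
A minimal sketch of the proposed lifecycle, assuming an in-memory set in place of the IdealState znodes in ZooKeeper. The class and method names (IdealStateStoreSketch, JobQueueSketch, scheduleNext, onJobTerminated) are hypothetical and are not Helix APIs; the sketch only illustrates that queuing a job creates nothing, and that a job IdealState exists only while the job runs.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical stand-in for the job IdealState znodes in ZooKeeper; not a Helix class. */
class IdealStateStoreSketch {
    private final Set<String> idealStates = new LinkedHashSet<>();

    void create(String resource) { idealStates.add(resource); }
    void remove(String resource) { idealStates.remove(resource); }
    boolean contains(String resource) { return idealStates.contains(resource); }
    int size() { return idealStates.size(); }
}

/** Hypothetical queue that runs one job at a time, as the description assumes. */
class JobQueueSketch {
    private final IdealStateStoreSketch store;
    private final Deque<String> pending = new ArrayDeque<>();
    private String running; // null while no job is running

    JobQueueSketch(IdealStateStoreSketch store) {
        this.store = store;
    }

    /** Queuing a job records it but creates no IdealState. */
    void enqueue(String job) {
        pending.addLast(job);
    }

    /** Starts the next job if the queue is idle; its IdealState is created only now. */
    boolean scheduleNext() {
        if (running != null || pending.isEmpty()) {
            return false;
        }
        running = pending.removeFirst();
        store.create(running);
        return true;
    }

    /** Called when the running job completes or fails; its IdealState is removed at once. */
    void onJobTerminated() {
        if (running == null) {
            return;
        }
        store.remove(running);
        running = null;
    }
}
```

With 100 jobs queued in this sketch, the store stays empty until scheduleNext() is called, holds exactly one entry while a job runs, and is empty again after onJobTerminated().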
    
    – What changes in this diff:
     1) Split TaskRebalancer into WorkflowRebalancer and JobRebalancer, both of which extend the base class TaskRebalancer.
     2) WorkflowRebalancer is responsible for checking whether a workflow is ready to run, and for scheduling the ready-to-run jobs (by adding the IdealState of a job when it is ready to run).
     3) JobRebalancer does the actual task assignment for a job.
     4) In this way, only the IdealState for a workflow needs to be created when the workflow is created. A job's IdealState is not created until the job is ready to run, and it is deleted once the job terminates (completed or failed).
     5) Removed the FixedTargetTaskRebalancer and GenericTaskRebalancer classes, since they differ only in how they assign tasks to nodes. Created FixedTargetTaskAssignmentCalculator and GenericTaskAssignmentCalculator (both implement the TaskAssignmentCalculator interface) instead for the different types of jobs.
     6) The APIs in TaskDriver and all user-facing behaviors are unchanged.
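
The design in item 5 can be illustrated roughly as follows: one job rebalancer delegates the "which task goes to which node" decision to a pluggable calculator, so a separate rebalancer subclass per job type is no longer needed. This is only a sketch under stated assumptions. The interface shape, the method signature, and both assignment policies (round-robin for generic jobs, follow-the-target for fixed-target jobs) are hypothetical and are not taken from the Helix code in this pull request.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified calculator interface; the real Helix signature may differ. */
interface AssignmentCalculatorSketch {
    /** Returns instance name -> list of task ids assigned to that instance. */
    Map<String, List<Integer>> assign(int numTasks, List<String> instances);
}

/** Generic jobs: spread tasks over the instances round-robin (an assumed policy). */
class GenericCalculatorSketch implements AssignmentCalculatorSketch {
    @Override
    public Map<String, List<Integer>> assign(int numTasks, List<String> instances) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String instance : instances) {
            result.put(instance, new ArrayList<>());
        }
        if (instances.isEmpty()) {
            return result;
        }
        for (int task = 0; task < numTasks; task++) {
            result.get(instances.get(task % instances.size())).add(task);
        }
        return result;
    }
}

/** Fixed-target jobs: each task runs on the instance hosting its target (an assumed policy). */
class FixedTargetCalculatorSketch implements AssignmentCalculatorSketch {
    private final Map<Integer, String> targetLocation; // task id -> instance hosting its target

    FixedTargetCalculatorSketch(Map<Integer, String> targetLocation) {
        this.targetLocation = targetLocation;
    }

    @Override
    public Map<String, List<Integer>> assign(int numTasks, List<String> instances) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (String instance : instances) {
            result.put(instance, new ArrayList<>());
        }
        for (int task = 0; task < numTasks; task++) {
            String host = targetLocation.get(task);
            // Tasks whose target is unknown or not on a listed instance stay unassigned.
            if (host != null && result.containsKey(host)) {
                result.get(host).add(task);
            }
        }
        return result;
    }
}

/** A single rebalancer; the job type only decides which calculator is plugged in. */
class JobRebalancerSketch {
    private final AssignmentCalculatorSketch calculator;

    JobRebalancerSketch(AssignmentCalculatorSketch calculator) {
        this.calculator = calculator;
    }

    Map<String, List<Integer>> rebalance(int numTasks, List<String> liveInstances) {
        return calculator.assign(numTasks, liveInstances);
    }
}
```

In this sketch, supporting a new job type means adding another calculator implementation rather than another rebalancer class.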
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lei-xia/helix helix-0.6.x

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/40.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #40
    
----
commit 1798e793522157b1b479a66c8a9ec9453d698b8f
Author: Lei Xia <lx...@linkedin.com>
Date:   2015-12-09T22:02:45Z

    [HELIX-617] Job IdealState is generated even the job is not running and not removed when it is completed.

----


> Job IdealState is generated even the job is not running and not removed when it is completed.
> ---------------------------------------------------------------------------------------------
>
>                 Key: HELIX-617
>                 URL: https://issues.apache.org/jira/browse/HELIX-617
>             Project: Apache Helix
>          Issue Type: Task
>            Reporter: Lei Xia
>            Assignee: Lei Xia
>
> -- Problem:
> In the current task framework implementation, one IdealState znode and one ExternalView znode are generated for each job as soon as the job is queued, not when it is started. These znodes stay in ZooKeeper until the specified data expiry time has passed (usually one day in the Espresso backup setting). The problem is worse for recurrent job queues: one IdealState/ExternalView pair is created for the original job template, and another pair is created for each scheduled run of the job.
> This usually generates a significant number of IdealState znodes. For example, scheduling a queue with 100 jobs creates 100 "template" IdealStates that stay there forever. In addition, a new set of 100 "scheduling" IdealStates is created every time these jobs are scheduled to run, and each set stays until the expiry time set in the job configuration has passed (usually a relatively long time).
> -- Proposed Changes:
> The IdealState of a job will be generated only when the job is scheduled to run, and will be removed immediately once the job is completed. In practice, only one job runs at a time in each queue, so only one job IdealState will exist for each queue at any given time.


