Posted to dev@oozie.apache.org by Srikanth Sundarrajan <sr...@hotmail.com> on 2014/02/12 06:06:03 UTC

Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/
-----------------------------------------------------------

Review request for oozie.


Bugs: OOZIE-1532
    https://issues.apache.org/jira/browse/OOZIE-1532


Repository: oozie-git


Description
-------

Purging should remove completed children job for long running coordinator jobs


Diffs
-----

  core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 03a7ed8 
  core/src/main/java/org/apache/oozie/WorkflowJobBean.java 3194995 
  core/src/main/java/org/apache/oozie/command/PurgeXCommand.java 9973719 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordActionsDeleteJPAExecutor.java PRE-CREATION 
  core/src/main/java/org/apache/oozie/executor/jpa/CoordActionsGetForPurgeJPAExecutor.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/command/TestPurgeXCommand.java 666271e 
  core/src/test/java/org/apache/oozie/executor/jpa/TestCoordActionsDeleteJPAExecutor.java PRE-CREATION 
  core/src/test/java/org/apache/oozie/executor/jpa/TestCoordActionsGetForPurgeJPAExecutor.java PRE-CREATION 

Diff: https://reviews.apache.org/r/17991/diff/


Testing
-------


Thanks,

Srikanth Sundarrajan


Re: Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

Posted by Bowen Zhang <bo...@yahoo.com>.

> On Feb. 12, 2014, 5:35 a.m., Srikanth Sundarrajan wrote:
> > core/src/main/java/org/apache/oozie/WorkflowJobBean.java, line 87
> > <https://reviews.apache.org/r/17991/diff/1/?file=483265#file483265line87>
> >
> >     Would these queries slow down the DB or fetch too many results into the server? Are there alternative ways to determine whether the w/f was triggered by a coord action, rather than being a subflow or an independent w/f, instead of using like '%...'?

In WorkflowJobsGetForPurgeJPAExecutor.java, we set a limit on the max results, so we won't fetch too many results. The only difference between a workflow triggered by a coord action and a subworkflow of a parent is the ending string pattern of its parent_id. Using a join query to verify that a workflow was triggered by a coord action is a lot more expensive. I see only one place in our entire code base with a join query: it joins on a particular bundle id to fetch all of its coord actions (which is less expensive than getting all old workflows that have coord parents).
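
As a rough illustration of the point above, the following Java sketch mimics a LIKE '%C@%' filter combined with a max-results limit, using plain in-memory strings. It is not Oozie code: the class ParentIdFilter and both methods are hypothetical, and the ID shapes ("...-C@<n>" for coord actions, "...-W" for workflows) are assumed from the pattern quoted in this thread.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Oozie: mirrors the LIKE '%C@%' idea in plain Java.
final class ParentIdFilter {

    // Assumption: a parent id that contains "C@" refers to a coordinator action,
    // while a subworkflow's parent id ends in "-W" and an independent workflow has none.
    static boolean isCoordActionParent(String parentId) {
        return parentId != null && parentId.contains("C@");
    }

    // Keeps at most `limit` parent ids that point to a coordinator action,
    // analogous to a LIKE filter used together with a max-results limit.
    static List<String> coordTriggered(List<String> parentIds, int limit) {
        return parentIds.stream()
                .filter(ParentIdFilter::isCoordActionParent)
                .limit(limit)
                .collect(Collectors.toList());
    }
}
```

The limit bounds how many rows one call returns regardless of how many rows match, which is the reason given above for not expecting oversized result sets.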


> On Feb. 12, 2014, 5:35 a.m., Srikanth Sundarrajan wrote:
> > core/src/main/java/org/apache/oozie/command/PurgeXCommand.java, line 97
> > <https://reviews.apache.org/r/17991/diff/1/?file=483266#file483266line97>
> >
> >     Is there additional memory pressure here?
> >     
> >     Also based on what the limit is and how many rows qualify, GET_COMPLETED_COORD_ACTIONS_OLDER_THAN may be executed several times.

I don't see memory pressure here, since we are fetching beans chunk by chunk rather than in a single load, just as we fetch workflows, coord jobs, and bundle jobs in the loadState() method. And GET_COMPLETED_COORD_ACTIONS_OLDER_THAN is SUPPOSED to be executed several times, since we are fetching objects chunk by chunk.
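
The chunk-by-chunk idea can be sketched as a fetch-then-delete loop over an in-memory stand-in for the table. This is only a sketch of the general pattern under stated assumptions, not the actual PurgeXCommand logic: the class ChunkedPurge, its methods, and the id-to-age map are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a purge loop; not Oozie code.
final class ChunkedPurge {
    // Stand-in for the DB table: record id -> age in days.
    private final Map<String, Integer> ageById = new LinkedHashMap<>();
    // Counts how often the bounded "older than" query is executed.
    int queriesRun = 0;

    ChunkedPurge(Map<String, Integer> records) {
        ageById.putAll(records);
    }

    // Stand-in for a bounded query: returns at most `limit` ids older than `days`.
    List<String> fetchOlderThan(int days, int limit) {
        queriesRun++;
        List<String> chunk = new ArrayList<>();
        for (Map.Entry<String, Integer> e : ageById.entrySet()) {
            if (chunk.size() == limit) {
                break;
            }
            if (e.getValue() > days) {
                chunk.add(e.getKey());
            }
        }
        return chunk;
    }

    // Fetches and deletes one chunk at a time until no old records remain.
    int purgeOlderThan(int days, int limit) {
        int purged = 0;
        List<String> chunk = fetchOlderThan(days, limit);
        while (!chunk.isEmpty()) {
            ageById.keySet().removeAll(chunk);
            purged += chunk.size();
            chunk = fetchOlderThan(days, limit);
        }
        return purged;
    }

    int remaining() {
        return ageById.size();
    }
}
```

With this shape, at most `limit` ids are held at once, and the query necessarily runs once per chunk plus one final time that returns nothing.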


- Bowen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/#review34268
-----------------------------------------------------------


On Feb. 12, 2014, 5:05 a.m., Srikanth Sundarrajan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17991/
> -----------------------------------------------------------
> 
> (Updated Feb. 12, 2014, 5:05 a.m.)
> 
> 
> Review request for oozie.
> 
> 
> Bugs: OOZIE-1532
>     https://issues.apache.org/jira/browse/OOZIE-1532
> 
> 
> Repository: oozie-git
> 
> 
> Description
> -------
> 
> Purging should remove completed children job for long running coordinator jobs
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/oozie/CoordinatorActionBean.java 03a7ed8 
>   core/src/main/java/org/apache/oozie/WorkflowJobBean.java 3194995 
>   core/src/main/java/org/apache/oozie/command/PurgeXCommand.java 9973719 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordActionsDeleteJPAExecutor.java PRE-CREATION 
>   core/src/main/java/org/apache/oozie/executor/jpa/CoordActionsGetForPurgeJPAExecutor.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/command/TestPurgeXCommand.java 666271e 
>   core/src/test/java/org/apache/oozie/executor/jpa/TestCoordActionsDeleteJPAExecutor.java PRE-CREATION 
>   core/src/test/java/org/apache/oozie/executor/jpa/TestCoordActionsGetForPurgeJPAExecutor.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17991/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Srikanth Sundarrajan
> 
>


Re: Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

Posted by Srikanth Sundarrajan <sr...@hotmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/#review34268
-----------------------------------------------------------



core/src/main/java/org/apache/oozie/WorkflowJobBean.java
<https://reviews.apache.org/r/17991/#comment64297>

    Would these queries slow down the DB or fetch too many results into the server? Are there alternative ways to determine whether the w/f was triggered by a coord action, rather than being a subflow or an independent w/f, instead of using like '%...'?



core/src/main/java/org/apache/oozie/command/PurgeXCommand.java
<https://reviews.apache.org/r/17991/#comment64298>

    Is there additional memory pressure here?
    
    Also based on what the limit is and how many rows qualify, GET_COMPLETED_COORD_ACTIONS_OLDER_THAN may be executed several times. 


- Srikanth Sundarrajan


Re: Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

Posted by Rohini Palaniswamy <ro...@gmail.com>.

> On March 10, 2014, 3:09 a.m., Rohini Palaniswamy wrote:
> > The approach is inefficient. Getting the list of coord actions and deleting them one by one will create big redo logs and will also be time-consuming.
> > Can you please use a join query to do the delete directly? It will be simple, efficient in terms of DB load, and a lot less code as well. Something like:
> > 
> > delete from CoordinatorActionBean a where a.id in (select w.parentId from WorkflowJobBean w where w.endTimestamp < :endTime and w.parentId is not null)
> > 
> > 
> > Can you also make the new query part of CoordActionQueryExecutor? We are not writing any new JPAExecutor classes.

Since we are not relying on something like ON DELETE CASCADE, the workflows need to be purged before the coord actions can be purged. So w.parentId like '%C@%' is still needed in GET_COMPLETED_WORKFLOWS_WITH_NO_PARENT_OLDER_THAN (can we rename it, since it now includes workflows with a parent as well?) so that the workflows are purged before the coordinator is purged. In that case, after the workflow purge, the coord action purge query will become something like:

delete from CoordinatorActionBean a where a.externalId not in (select w.id from WorkflowJobBean w where w.parentId is not null)
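
The two-step ordering described above (purge workflows first, then delete coord actions whose workflow is gone) can be sketched with in-memory maps. This is a hypothetical illustration, not Oozie code: PurgeOrderSketch and its fields are invented for the example, and leaving actions with no external ID untouched is a simplifying assumption of the sketch rather than something stated in the thread.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the purge ordering; not Oozie code.
final class PurgeOrderSketch {
    // Workflow id -> parent id (null for an independent workflow).
    final Map<String, String> workflowParent = new LinkedHashMap<>();
    // Coord action id -> external id, i.e. the id of the workflow it launched.
    final Map<String, String> actionExternalId = new LinkedHashMap<>();

    // Step 1: purge the selected workflows.
    void purgeWorkflows(Set<String> workflowIds) {
        workflowParent.keySet().removeAll(workflowIds);
    }

    // Step 2: delete coord actions whose external id no longer matches any
    // remaining workflow that has a parent. Returns the number deleted.
    int purgeOrphanedActions() {
        Set<String> liveChildWorkflows = new HashSet<>();
        for (Map.Entry<String, String> e : workflowParent.entrySet()) {
            if (e.getValue() != null) {
                liveChildWorkflows.add(e.getKey());
            }
        }
        int before = actionExternalId.size();
        // Assumption of this sketch: actions without an external id are kept.
        actionExternalId.values()
                .removeIf(ext -> ext != null && !liveChildWorkflows.contains(ext));
        return before - actionExternalId.size();
    }
}
```

Running step 2 before step 1 would delete nothing in this model, because every action's workflow would still exist; that is the ordering constraint the reply points out.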


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/#review36620
-----------------------------------------------------------


Re: Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

Posted by Bowen Zhang <bo...@yahoo.com>.

> On March 10, 2014, 3:09 a.m., Rohini Palaniswamy wrote:
> > The approach is inefficient. Getting the list of coord actions and deleting them one by one will create big redo logs and will also be time-consuming.
> > Can you please use a join query to do the delete directly? It will be simple, efficient in terms of DB load, and a lot less code as well. Something like:
> > 
> > delete from CoordinatorActionBean a where a.id in (select w.parentId from WorkflowJobBean w where w.endTimestamp < :endTime and w.parentId is not null)
> > 
> > 
> > Can you also make the new query part of CoordActionQueryExecutor? We are not writing any new JPAExecutor classes.
> 
> Rohini Palaniswamy wrote:
>     Since we are not relying on something like ON DELETE CASCADE, the workflows need to be purged before the coord actions can be purged. So w.parentId like '%C@%' is still needed in GET_COMPLETED_WORKFLOWS_WITH_NO_PARENT_OLDER_THAN (can we rename it, since it now includes workflows with a parent as well?) so that the workflows are purged before the coordinator is purged. In that case, after the workflow purge, the coord action purge query will become something like:
>     
>     delete from CoordinatorActionBean a where a.externalId not in (select w.id from WorkflowJobBean w where w.parentId is not null)

Your approach of running everything in one query will change the existing purging mechanism. Take a look at the loadState and execute methods in PurgeXCommand.java.


- Bowen


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/#review36620
-----------------------------------------------------------


Re: Review Request 17991: OOZIE-1532 Purging should remove completed children job for long running coordinator jobs

Posted by Rohini Palaniswamy <ro...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17991/#review36620
-----------------------------------------------------------


The approach is inefficient. Getting the list of coord actions and deleting them one by one will create big redo logs and will also be time-consuming.
Can you please use a join query to do the delete directly? It will be simple, efficient in terms of DB load, and a lot less code as well. Something like:

delete from CoordinatorActionBean a where a.id in (select w.parentId from WorkflowJobBean w where w.endTimestamp < :endTime and w.parentId is not null)


Can you also make the new query part of CoordActionQueryExecutor? We are not writing any new JPAExecutor classes. 

- Rohini Palaniswamy

