Posted to user@oozie.apache.org by gw...@orange.com on 2014/06/25 12:12:37 UTC

Oozie Database Purge workflows

Hello !

We migrated the Oozie RDBMS from Derby to Postgres.
We have a problem with workflows filling the partition that holds the Oozie database (within 6 hours, because there are a lot of workflows).
What can I purge from the Oozie database to free some disk space?

In fact, I would love to know whether there is any documentation on how the Oozie database works.

Here is the list of tables:
bundle_actions
bundle_jobs
coord_actions
coord_jobs
oozie_sys
openjpa_sequence_table
sla_events
sla_registration
sla_summary
validate_conn
wf_actions
wf_jobs

If I want to remove old workflows from the database, may I do something like this:
delete from WF_ACTIONS where job_id in (select id from WF_JOBS where created_time < @date);
delete from WF_JOBS where created_time < @date;

Best regards.

Gwenael Le Barzic

_________________________________________________________________________________________________________________________

Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.


Re: Oozie Database Purge workflows

Posted by Shwetha GS <sh...@inmobi.com>.
Hi,

If you don't need to look at older workflows, you can simply delete the
entries from each table independently, for example:
delete from WF_ACTIONS where end_time < '<time>';
delete from WF_JOBS where end_time < '<time>';
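
A minimal sketch of how the delete above could be scripted for cron, assuming GNU date and assuming that only workflows finished more than a few hours ago should go. The names RETENTION_HOURS and build_purge_sql are hypothetical; the table and column names are the ones quoted in this thread and should be checked against the real schema (for example with \d wf_actions in psql). Unlike a cutoff taken from a plain date call (the current time), the cutoff here lies RETENTION_HOURS in the past. Rows with no end_time yet never match the comparison, but an action that ended before the cutoff is removed even if its parent workflow is still running, so the retention window should be longer than the longest workflow. The script only prints SQL, so the statements can be inspected before they are sent to the database.

```shell
#!/bin/sh
# Sketch only: prints the purge statements; nothing is executed here.
# RETENTION_HOURS and build_purge_sql are hypothetical names.

RETENTION_HOURS="${RETENTION_HOURS:-6}"

# Print DELETE statements for rows whose end_time is older than the cutoff ($1).
# Rows with a NULL end_time never satisfy the comparison, so they are kept.
build_purge_sql() {
    printf "BEGIN;\n"
    printf "DELETE FROM wf_actions WHERE end_time < '%s';\n" "$1"
    printf "DELETE FROM wf_jobs WHERE end_time < '%s';\n" "$1"
    printf "COMMIT;\n"
}

# Cutoff = now minus the retention window, in the local time zone
# (assumes GNU date, which accepts -d @<epoch seconds>).
cutoff_epoch=$(( $(date +%s) - RETENTION_HOURS * 3600 ))
cutoff=$(date -d "@$cutoff_epoch" +"%Y-%m-%d %H:%M:%S")

build_purge_sql "$cutoff"
```

Saved under a hypothetical name such as purge_sql.sh, it could be run hourly from cron as: sh purge_sql.sh | psql -p <myport> -U <myuser> -d <mydb>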

Note that frequent deletes also fragment the DB and degrade its
performance. A better way to purge is to partition the tables by
timestamp and drop the old partitions.
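
On Postgres specifically, a DELETE only marks rows as dead. The space becomes reusable after a VACUUM, and it is generally returned to the operating system only by VACUUM FULL, which rewrites the table under an exclusive lock. A minimal sketch of a follow-up step after a purge, assuming the two workflow tables named in this thread; build_vacuum_sql is a hypothetical helper name, and the script only prints the statements.

```shell
#!/bin/sh
# Sketch only: prints maintenance statements; nothing is executed here.
# build_vacuum_sql is a hypothetical helper name.

# Print one VACUUM ANALYZE statement per table name given as an argument.
build_vacuum_sql() {
    for table in "$@"; do
        printf "VACUUM ANALYZE %s;\n" "$table"
    done
}

build_vacuum_sql wf_actions wf_jobs
```

The output can be piped into psql in the same way as the delete statements. VACUUM cannot run inside a transaction block, so these statements are kept separate from the deletes.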

Having minute-level thresholds for purging would be useful. Can you file a JIRA for
it so that we don't lose track?

Thanks,
Shwetha


On Thu, Jun 26, 2014 at 11:44 PM, <gw...@orange.com> wrote:

> Hello !
>
> Thank you for your answer.
>
> Is it possible to set the parameter
> oozie.service.PurgeService.older.than to less than one day?
>
> In fact, here is my problem:
> 1. Some of my colleagues developed workflows that each launch two subworkflows.
> 2. These workflows are launched by coordinators every 2 minutes.
> 3. In 6 hours they filled the Oozie database with far too much data, and
> the volume holding the Oozie DB fills up very quickly.
>
> So I found the parameters you mentioned, but as they are expressed in
> days, they do not help with my current problem.
>
> As I migrated the Oozie database to Postgres, I was thinking about
> purging it like this: create a shell script and schedule it hourly in
> cron to do this:
>
> echo "DELETE FROM wf_actions WHERE job_id IN (SELECT id FROM wf_jobs WHERE created_time < '"`date +"%Y-%m-%d %H:%M:%S"`"');" | psql -p <myport> -U <myuser> -d <mydb>
> echo "DELETE FROM wf_jobs WHERE created_time < '"`date +"%Y-%m-%d %H:%M:%S"`"';" | psql -p <myport> -U <myuser> -d <mydb>
>
> Is it clean or dirty? Will it create other problems, or is it OK to purge
> the workflows this way?
>
> Best regards.
>
> Gwenael Le Barzic
>

-- 
_____________________________________________________________
The information contained in this communication is intended solely for the 
use of the individual or entity to whom it is addressed and others 
authorized to receive it. It may contain confidential or legally privileged 
information. If you are not the intended recipient you are hereby notified 
that any disclosure, copying, distribution or taking any action in reliance 
on the contents of this information is strictly prohibited and may be 
unlawful. If you have received this communication in error, please notify 
us immediately by responding to this email and then delete it from your 
system. The firm is neither liable for the proper and complete transmission 
of the information contained in this communication nor for any delay in its 
receipt.

RE: Oozie Database Purge workflows

Posted by gw...@orange.com.
Hello !

Thank you for your answer.

Is it possible to set the parameter oozie.service.PurgeService.older.than to less than one day?

In fact, here is my problem:
1. Some of my colleagues developed workflows that each launch two subworkflows.
2. These workflows are launched by coordinators every 2 minutes.
3. In 6 hours they filled the Oozie database with far too much data, and the volume holding the Oozie DB fills up very quickly.

So I found the parameters you mentioned, but as they are expressed in days, they do not help with my current problem.

As I migrated the Oozie database to Postgres, I was thinking about purging it like this:
create a shell script and schedule it hourly in cron to do this:

echo "DELETE FROM wf_actions WHERE job_id IN (SELECT id FROM wf_jobs WHERE created_time < '"`date +"%Y-%m-%d %H:%M:%S"`"');" | psql -p <myport> -U <myuser> -d <mydb>
echo "DELETE FROM wf_jobs WHERE created_time < '"`date +"%Y-%m-%d %H:%M:%S"`"';" | psql -p <myport> -U <myuser> -d <mydb>
 
Is it clean or dirty? Will it create other problems, or is it OK to purge the workflows this way?

Best regards.

Gwenael Le Barzic

-----Original Message-----
From: Amit Patil [mailto:arpatil@skyhighnetworks.com]
Sent: Wednesday, 25 June 2014 19:52
To: user@oozie.apache.org
Subject: Re: Oozie Database Purge workflows

Look for the following config properties

oozie.service.PurgeService.older.than

oozie.service.PurgeService.purge.interval




Re: Oozie Database Purge workflows

Posted by Amit Patil <ar...@skyhighnetworks.com>.
Look for the following config properties

oozie.service.PurgeService.older.than

oozie.service.PurgeService.purge.interval






