Posted to user@oozie.apache.org by Samuel Dehouck <sa...@box.net> on 2011/11/03 18:00:00 UTC

Oozie automated testing

Hi,

I was wondering if there is a way in Oozie to test workflows. Basically,
what I need is a mode where I could run a workflow (after validating the
XML) without actually running the actions, only checking that the
actions are defect-free. For instance, Hive actions would check that all
conf files are present and that the query has no syntax errors, but
wouldn't actually run the query. It would also be really helpful to be
able to output the "dependency graph" of the workflow, to quickly check
that actions will run after their dependencies.

Thanks,

Samuel
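
The dependency-graph part of this request can be roughed out without a server. The sketch below reads the transitions straight out of a workflow definition. It assumes the usual workflow node elements (start, action with ok/error, decision with switch/case/default, fork with path, join); the function name workflow_edges and the sample workflow are made up for illustration and are not part of Oozie.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{uri:oozie:workflow:0.2}action' -> 'action'."""
    return tag.rsplit("}", 1)[-1]


def workflow_edges(xml_text):
    """Return a sorted list of (source, target) transitions in a workflow definition."""
    root = ET.fromstring(xml_text)
    edges = set()
    for node in root:
        kind = local(node.tag)
        if kind == "start":
            edges.add(("start", node.get("to")))
        elif kind == "action":
            for child in node:
                if local(child.tag) in ("ok", "error"):
                    edges.add((node.get("name"), child.get("to")))
        elif kind == "decision":
            for switch in node:
                for branch in switch:  # <case to="..."> and <default to="...">
                    edges.add((node.get("name"), branch.get("to")))
        elif kind == "fork":
            for path in node:
                edges.add((node.get("name"), path.get("start")))
        elif kind == "join":
            edges.add((node.get("name"), node.get("to")))
    return sorted(edges)


# Hypothetical workflow; the action bodies (hive, pig, ...) are left out for
# brevity, so this sample would not pass real schema validation.
SAMPLE = """
<workflow-app xmlns="uri:oozie:workflow:0.2" name="demo">
  <start to="load"/>
  <action name="load">
    <ok to="report"/>
    <error to="fail"/>
  </action>
  <action name="report">
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail"><message>failed</message></kill>
  <end name="end"/>
</workflow-app>
"""

if __name__ == "__main__":
    for src, dst in workflow_edges(SAMPLE):
        print(f"{src} -> {dst}")
```

This only lists the edges that are written in the XML. It says nothing about which branch a decision node would take at run time.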

Re: Oozie automated testing

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Matt,

Dryrun submissions would not create entries in the Oozie DB. And we could
tweak the submission command not to produce log entries for dryrun
submissions.

Thanks.

Alejandro

On Mon, Nov 7, 2011 at 7:20 PM, GOEKE, MATTHEW (AG/1000)
<ma...@monsanto.com> wrote:
> Alejandro,
>
> I completely agree with you that duplicating the logic is not the preferable option. I like the dryrun option, as it would be fairly simple to implement, but would you recommend forwarding these to the dev Oozie server or standing up a separate build Oozie server? I would assume that the footprint of these submissions would be small, so my concern is more about mucking up the dev Oozie server logs / web console with meaningless submissions.
>
> Thanks,
> Matt

RE: Oozie automated testing

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
Alejandro,

I completely agree with you that duplicating the logic is not the preferable option. I like the dryrun option, as it would be fairly simple to implement, but would you recommend forwarding these to the dev Oozie server or standing up a separate build Oozie server? I would assume that the footprint of these submissions would be small, so my concern is more about mucking up the dev Oozie server logs / web console with meaningless submissions.

Thanks,
Matt

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
Sent: Monday, November 07, 2011 11:37 AM
To: oozie-users@incubator.apache.org
Subject: Re: Oozie automated testing

Matt,

Oozie does XML Schema validation and a static integrity check (loop
detection, invalid transitions) on job submission. One possibility
would be to add a 'dryrun' parameter. This would require a call to an
Oozie server, but I'd prefer that to duplicating such logic in the
client.

Thanks.

Alejandro


Re: Oozie automated testing

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Matt,

Oozie does XML Schema validation and a static integrity check (loop
detection, invalid transitions) on job submission. One possibility
would be to add a 'dryrun' parameter. This would require a call to an
Oozie server, but I'd prefer that to duplicating such logic in the
client.

Thanks.

Alejandro

On Fri, Nov 4, 2011 at 8:55 AM, GOEKE, MATTHEW (AG/1000)
<ma...@monsanto.com> wrote:
> Alejandro,
>
> I was curious about the same thing, as I will need to create something that does an initial QC pass over the workflows as part of our build process. My initial thought was to create a Maven plugin that would read the workflow.xml and use part of the Oozie server source to do an initial check for major mistakes (e.g. cyclic calls, no start or end, etc.). I need to do a little more reading (potentially source diving) into how Oozie builds the DAG from the workflow.xml, but has there been much discussion around this type of tooling before? From your response below I would assume that the foundation for this doesn't really exist to date.
>
> Matt
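
The two static checks named above (loop detection and invalid transitions) can be pictured with a small graph routine. This is only a sketch of the general technique, not Oozie's actual implementation, which is not shown in this thread. It assumes the workflow has already been reduced to a dict mapping each node name to the names it transitions to; find_cycle and invalid_transitions are hypothetical names.

```python
def invalid_transitions(graph):
    """Return (source, target) pairs whose target is not a defined node."""
    return [
        (src, dst)
        for src, targets in graph.items()
        for dst in targets
        if dst not in graph
    ]


def find_cycle(graph):
    """Return one cycle as a list of node names, or None if there is no loop."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {name: WHITE for name in graph}
    path = []

    def visit(name):
        colour[name] = GREY
        path.append(name)
        for nxt in graph[name]:
            if nxt not in graph:
                continue  # undefined targets are reported by invalid_transitions
            if colour[nxt] == GREY:
                # nxt is already on the current path, so we have closed a loop.
                return path[path.index(nxt):] + [nxt]
            if colour[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[name] = BLACK
        return None

    for name in graph:
        if colour[name] == WHITE:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical workflows, written as adjacency lists.
GOOD = {
    "start": ["load"],
    "load": ["report", "fail"],
    "report": ["end", "fail"],
    "fail": [],
    "end": [],
}
BAD = {
    "start": ["a"],
    "a": ["b"],
    "b": ["a", "missing"],  # loops back to "a" and points at an undefined node
    "end": [],
}

if __name__ == "__main__":
    print(find_cycle(GOOD), invalid_transitions(GOOD))
    print(find_cycle(BAD), invalid_transitions(BAD))
```

A check like this is purely structural, which is why it can run at submission time without executing any action.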

RE: Oozie automated testing

Posted by "GOEKE, MATTHEW (AG/1000)" <ma...@monsanto.com>.
Alejandro,

I was curious about the same thing, as I will need to create something that does an initial QC pass over the workflows as part of our build process. My initial thought was to create a Maven plugin that would read the workflow.xml and use part of the Oozie server source to do an initial check for major mistakes (e.g. cyclic calls, no start or end, etc.). I need to do a little more reading (potentially source diving) into how Oozie builds the DAG from the workflow.xml, but has there been much discussion around this type of tooling before? From your response below I would assume that the foundation for this doesn't really exist to date.

Matt

-----Original Message-----
From: Alejandro Abdelnur [mailto:tucu@cloudera.com] 
Sent: Thursday, November 03, 2011 1:29 PM
To: oozie-users@incubator.apache.org
Subject: Re: Oozie automated testing

Samuel,

Currently it is not possible to do so.

It seems like Pig & Hive have something like a dryrun option. We could
make use of those. Still, it would be quite complex, if not impossible,
to do a full WF dryrun, as the flow (decision nodes) and the
configuration of actions may depend on the output of previous actions;
if those outputs don't have meaningful values, the later actions would fail.

Thanks.

Alejandro


This e-mail message may contain privileged and/or confidential information, and is intended to be received only by persons entitled
to receive such information. If you have received this e-mail in error, please notify the sender immediately. Please delete it and
all attachments from any servers, hard drives or any other media. Other use of this e-mail by you is strictly prohibited.

All e-mails and attachments sent and received are subject to monitoring, reading and archival by Monsanto, including its
subsidiaries. The recipient of this e-mail is solely responsible for checking for the presence of "Viruses" or other "Malware".
Monsanto, along with its subsidiaries, accepts no liability for any damage caused by any such code transmitted by or accompanying
this e-mail or any attachment.


The information contained in this email may be subject to the export control laws and regulations of the United States, potentially
including but not limited to the Export Administration Regulations (EAR) and sanctions regulations issued by the U.S. Department of
Treasury, Office of Foreign Asset Controls (OFAC).  As a recipient of this information you are obligated to comply with all
applicable U.S. export laws and regulations.
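
A build-time QC pass of the kind described in this message could start with something as small as the sketch below, run before anything is submitted to a server. It assumes a workflow needs exactly one start node and one end node and that node names must be unique. The function name structural_problems and the sample workflow are hypothetical, and this does not reuse any Oozie server source.

```python
import xml.etree.ElementTree as ET


def structural_problems(xml_text):
    """Return human-readable problems in a workflow definition (empty list if none)."""
    root = ET.fromstring(xml_text)
    # Drop the namespace prefix that ElementTree puts in front of each tag.
    kinds = [child.tag.rsplit("}", 1)[-1] for child in root]
    problems = []
    for required in ("start", "end"):
        count = kinds.count(required)
        if count != 1:
            problems.append(f"expected exactly one <{required}> node, found {count}")
    names = [child.get("name") for child in root if child.get("name")]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate node name: {name}")
    return problems


# Hypothetical workflow with no <end> node; action bodies are omitted.
BROKEN = """
<workflow-app xmlns="uri:oozie:workflow:0.2" name="demo">
  <start to="load"/>
  <action name="load"><ok to="load"/><error to="fail"/></action>
  <kill name="fail"><message>failed</message></kill>
</workflow-app>
"""

if __name__ == "__main__":
    for problem in structural_problems(BROKEN):
        print(problem)
```

A build step could fail whenever the returned list is non-empty. Cycle detection and full schema validation would still need to be added on top, or left to the server.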


Re: Oozie automated testing

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Samuel,

Currently it is not possible to do so.

It seems like Pig & Hive have something like a dryrun option. We could
make use of those. Still, it would be quite complex, if not impossible,
to do a full WF dryrun, as the flow (decision nodes) and the
configuration of actions may depend on the output of previous actions;
if those outputs don't have meaningful values, the later actions would fail.

Thanks.

Alejandro

