Posted to commits@oozie.apache.org by an...@apache.org on 2018/09/14 14:07:14 UTC

[08/11] oozie git commit: OOZIE-2734 [docs] Switch from TWiki to Markdown (asalamon74 via andras.piros, pbacsko, gezapeti)

http://git-wip-us.apache.org/repos/asf/oozie/blob/4e5b3cb5/docs/src/site/twiki/CoordinatorFunctionalSpec.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/CoordinatorFunctionalSpec.twiki b/docs/src/site/twiki/CoordinatorFunctionalSpec.twiki
index cd416d4..d31d1aa 100644
--- a/docs/src/site/twiki/CoordinatorFunctionalSpec.twiki
+++ b/docs/src/site/twiki/CoordinatorFunctionalSpec.twiki
@@ -1,74 +1,89 @@
-<noautolink>
 
-[[index][::Go back to Oozie Documentation Index::]]
+
+[::Go back to Oozie Documentation Index::](index.html)
 
 -----
 
----+!! Oozie Coordinator Specification
+# Oozie Coordinator Specification
 
 The goal of this document is to define a coordinator engine system specialized in submitting workflows based on time and data triggers.
 
-%TOC%
+<!-- MACRO{toc|fromDepth=1|toDepth=4} -->
+
+## Changelog
+
+**03/JUL/2013**
+
+   * Appendix A, Added new coordinator schema 0.4, sla schema 0.2 and changed schemas ordering to newest first
 
----++ Changelog
----+++!! 03/JUL/2013
+**07/JAN/2013**
 
-   * #Appendix A, Added new coordinator schema 0.4, sla schema 0.2 and changed schemas ordering to newest first
----+++!! 07/JAN/2013
+   * 6.8 Added section on new EL functions for datasets defined with HCatalog
 
-   * #6.8 Added section on new EL functions for datasets defined with HCatalog
----+++!! 26/JUL/2012
+**26/JUL/2012**
 
-   * #Appendix A, updated XML schema 0.4 to include =parameters= element
-   * #6.5 Updated to mention about =parameters= element as of schema 0.4
----+++!! 23/NOV/2011:
+   * Appendix A, updated XML schema 0.4 to include `parameters` element
+   * 6.5 Updated to mention the `parameters` element as of schema 0.4
+
+**23/NOV/2011:**
 
    * Update execution order typo
----+++!! 05/MAY/2011:
+
+**05/MAY/2011:**
 
    * Update coordinator schema 0.2
----+++!! 09/MAR/2011:
+
+**09/MAR/2011:**
 
    * Update coordinator status
----+++!! 02/DEC/2010:
+
+**02/DEC/2010:**
 
    * Update coordinator done-flag
----+++!! 26/AUG/2010:
+
+**26/AUG/2010:**
 
    * Update coordinator rerun
----+++!! 09/JUN/2010:
+
+**09/JUN/2010:**
 
    * Clean up unsupported functions
----+++!! 02/JUN/2010:
+
+**02/JUN/2010:**
 
    * Update all EL functions in CoordFunctionalSpec with "coord:" prefix
----+++!! 02/OCT/2009:
+
+**02/OCT/2009:**
 
    * Added Appendix A, Oozie Coordinator XML-Schema
    * Change #5.3., Datasets definition supports 'include' element
----+++!! 29/SEP/2009:
 
-   * Change #4.4.1, added =${coord:endOfDays(int n)}= EL function
-   * Change #4.4.2, added =${coord:endOfMonths(int n)}= EL function
----+++!! 11/SEP/2009:
+**29/SEP/2009:**
+
+   * Change #4.4.1, added `${coord:endOfDays(int n)}` EL function
+   * Change #4.4.2, added `${coord:endOfMonths(int n)}` EL function
+
+**11/SEP/2009:**
 
-   * Change #6.6.4. =${coord:tzOffset()}= EL function now returns offset in minutes. Added more explanation on behavior
+   * Change #6.6.4. `${coord:tzOffset()}` EL function now returns offset in minutes. Added more explanation on behavior
    * Removed 'oozie' URL from action workflow invocation, per arch review feedback (coord & wf run on the same instance)
----+++!! 07/SEP/2009:
+
+**07/SEP/2009:**
 
    * Full rewrite of sections #4 and #7
    * Added sections #6.1.7, #6.6.2, #6.6.3 & #6.6.4
    * Rewording through the spec definitions
    * Updated all examples and syntax to latest changes
----+++!! 03/SEP/2009:
+
+**03/SEP/2009:**
 
    * Change #2. Definitions. Some rewording in the definitions
-   * Change #6.6.4. Replaced =${coord:next(int n)}= with =${coord:version(int n)}= EL Function
+   * Change #6.6.4. Replaced `${coord:next(int n)}` with `${coord:version(int n)}` EL Function
    * Added #6.6.5. Dataset Instance Resolution for Instances Before the Initial Instance
 
----++ 1. Coordinator Overview
+## 1. Coordinator Overview
 
-Users typically run map-reduce, hadoop-streaming, hdfs and/or Pig jobs on the grid. Multiple of these jobs can be combined to form a workflow job. [[https://issues.apache.org/jira/browse/HADOOP-5303][Oozie, Hadoop Workflow System]] defines a workflow system that runs such jobs.
+Users typically run map-reduce, hadoop-streaming, hdfs and/or Pig jobs on the grid. Multiple of these jobs can be combined to form a workflow job. [Oozie, Hadoop Workflow System](https://issues.apache.org/jira/browse/HADOOP-5303) defines a workflow system that runs such jobs.
 
 Commonly, workflow jobs are run based on regular time intervals and/or data availability. In some cases, they can also be triggered by an external event.
 
@@ -76,54 +91,54 @@ Expressing the condition(s) that trigger a workflow job can be modeled as a pred
 
 It is also necessary to connect workflow jobs that run regularly, but at different time intervals. The outputs of multiple subsequent runs of a workflow become the input to the next workflow. For example, the outputs of the last 4 runs of a workflow that runs every 15 minutes become the input of another workflow that runs every 60 minutes. Chaining these workflows together is referred to as a data application pipeline.
 
-The Oozie *Coordinator* system allows the user to define and execute recurrent and interdependent workflow jobs (data application pipelines).
+The Oozie **Coordinator** system allows the user to define and execute recurrent and interdependent workflow jobs (data application pipelines).
 
 Real world data application pipelines have to account for reprocessing, late processing, catchup, partial processing, monitoring, notification and SLAs.
 
 This document defines the functional specification for the Oozie Coordinator system.
 
----++ 2. Definitions
+## 2. Definitions
 
-*Actual time:* The actual time indicates the time when something actually happens.
+**Actual time:** The actual time indicates the time when something actually happens.
 
-*Nominal time:* The nominal time specifies the time when something should happen. In theory the nominal time and the actual time should match, however, in practice due to delays the actual time may occur later than the nominal time.
+**Nominal time:** The nominal time specifies the time when something should happen. In theory the nominal time and the actual time should match, however, in practice due to delays the actual time may occur later than the nominal time.
 
-*Dataset:* Collection of data referred to by a logical name. A dataset normally has several instances of data and each
+**Dataset:** Collection of data referred to by a logical name. A dataset normally has several instances of data and each
 one of them can be referred to individually. Each dataset instance is represented by a unique set of URIs.
 
-*Synchronous Dataset:* Synchronous datasets instances are generated at fixed time intervals and there is a dataset
+**Synchronous Dataset:** Synchronous dataset instances are generated at fixed time intervals and there is a dataset
 instance associated with each time interval. Synchronous dataset instances are identified by their nominal time.
 For example, in the case of a HDFS based dataset, the nominal time would be somewhere in the file path of the
-dataset instance: hdfs://foo:8020/usr/logs/2009/04/15/23/30. In the case of HCatalog table partitions, the nominal time
-would be part of some partition values: hcat://bar:8020/mydb/mytable/year=2009;month=04;dt=15;region=us.
+dataset instance: `hdfs://foo:8020/usr/logs/2009/04/15/23/30`. In the case of HCatalog table partitions, the nominal time
+would be part of some partition values: `hcat://bar:8020/mydb/mytable/year=2009;month=04;dt=15;region=us`.
 
-*Coordinator Action:* A coordinator action is a workflow job that is started when a set of conditions are met (input dataset instances are available).
+**Coordinator Action:** A coordinator action is a workflow job that is started when a set of conditions are met (input dataset instances are available).
 
-*Coordinator Application:* A coordinator application defines the conditions under which coordinator actions should be created (the frequency) and when the actions can be started. The coordinator application also defines a start and an end time. Normally, coordinator applications are parameterized. A Coordinator application is written in XML.
+**Coordinator Application:** A coordinator application defines the conditions under which coordinator actions should be created (the frequency) and when the actions can be started. The coordinator application also defines a start and an end time. Normally, coordinator applications are parameterized. A Coordinator application is written in XML.
 
-*Coordinator Job:* A coordinator job is an executable instance of a coordination definition. A job submission is done by submitting a job configuration that resolves all parameters in the application definition.
+**Coordinator Job:** A coordinator job is an executable instance of a coordination definition. A job submission is done by submitting a job configuration that resolves all parameters in the application definition.
 
-*Data pipeline:* A data pipeline is a connected set of coordinator applications that consume and produce interdependent datasets.
+**Data pipeline:** A data pipeline is a connected set of coordinator applications that consume and produce interdependent datasets.
 
-*Coordinator Definition Language:* The language used to describe datasets and coordinator applications.
+**Coordinator Definition Language:** The language used to describe datasets and coordinator applications.
 
-*Coordinator Engine:* A system that executes coordinator jobs.
+**Coordinator Engine:** A system that executes coordinator jobs.
 
----++ 3. Expression Language for Parameterization
+## 3. Expression Language for Parameterization
 
 Coordinator application definitions can be parameterized with variables, built-in constants and built-in functions.
 
 At execution time all the parameters are resolved into concrete values.
 
-The parameterization of workflow definitions it done using JSP Expression Language syntax from the [[http://jcp.org/aboutJava/communityprocess/final/jsr152/][JSP 2.0 Specification (JSP.2.3)]], allowing not only to support variables as parameters but also functions and complex expressions.
+The parameterization of workflow definitions is done using JSP Expression Language syntax from the [JSP 2.0 Specification (JSP.2.3)](http://jcp.org/aboutJava/communityprocess/final/jsr152/index.html), allowing not only variables as parameters but also functions and complex expressions.
 
 EL expressions can be used in XML attribute values and XML text element values. They cannot be used in XML element and XML attribute names.
 
 Refer to section #6.5 'Parameterization of Coordinator Applications' for more details.
 
----++ 4. Datetime, Frequency and Time-Period Representation
+## 4. Datetime, Frequency and Time-Period Representation
 
-Oozie processes coordinator jobs in a fixed timezone with no DST (typically =UTC=), this timezone is referred as 'Oozie
+Oozie processes coordinator jobs in a fixed timezone with no DST (typically `UTC`); this timezone is referred to as the 'Oozie
 processing timezone'.
 
 The Oozie processing timezone is used to resolve coordinator jobs start/end times, job pause times and the initial-instance
@@ -131,42 +146,42 @@ of datasets. Also, all coordinator dataset instance URI templates are resolved t
 time-zone.
 
 All the datetimes used in coordinator applications and job parameters to coordinator applications must be specified
-in the Oozie processing timezone. If Oozie processing timezone is =UTC=, the qualifier is  *Z*. If Oozie processing
-time zone is other than =UTC=, the qualifier must be the GMT offset, =(+/-)####=.
+in the Oozie processing timezone. If the Oozie processing timezone is `UTC`, the qualifier is **Z**. If the Oozie processing
+timezone is other than `UTC`, the qualifier must be the GMT offset, `(+/-)####`.
 
-For example, a datetime in =UTC=  is =2012-08-12T00:00Z=, the same datetime in =GMT+5:30= is =2012-08-12T05:30+0530=.
+For example, a datetime in `UTC` is `2012-08-12T00:00Z`; the same datetime in `GMT+5:30` is `2012-08-12T05:30+0530`.
 
-For simplicity, the rest of this specification uses =UTC= datetimes.
+For simplicity, the rest of this specification uses `UTC` datetimes.
 
-#datetime
----+++ 4.1. Datetime
+<a name="datetime"></a>
+### 4.1. Datetime
 
-If the Oozie processing timezone is =UTC=, all datetime values are always in
-[[http://en.wikipedia.org/wiki/Coordinated_Universal_Time][UTC]] down to a minute precision, 'YYYY-MM-DDTHH:mmZ'.
+If the Oozie processing timezone is `UTC`, all datetime values are always in
+[UTC](http://en.wikipedia.org/wiki/Coordinated_Universal_Time) down to a minute precision, 'YYYY-MM-DDTHH:mmZ'.
 
-For example =2009-08-10T13:10Z= is August 10th 2009 at 13:10 UTC.
+For example `2009-08-10T13:10Z` is August 10th 2009 at 13:10 UTC.
 
-If the Oozie processing timezone is a GMT offset =GMT(+/-)####=, all datetime values are always in
-[[http://en.wikipedia.org/wiki/ISO_8601][ISO 8601]] in the corresponding GMT offset down to a minute precision,
+If the Oozie processing timezone is a GMT offset `GMT(+/-)####`, all datetime values are always in
+[ISO 8601](http://en.wikipedia.org/wiki/ISO_8601) in the corresponding GMT offset down to a minute precision,
 'YYYY-MM-DDTHH:mmGMT(+/-)####'.
 
-For example =2009-08-10T13:10+0530= is August 10th 2009 at 13:10 GMT+0530, India timezone.
+For example `2009-08-10T13:10+0530` is August 10th 2009 at 13:10 GMT+0530, India timezone.
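Both minute-precision layouts can be parsed with a short Python sketch (an illustrative helper, not part of Oozie; the function name is hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Hypothetical helper (illustrative only, not Oozie code): parse the
# layouts 'YYYY-MM-DDTHH:mmZ' and 'YYYY-MM-DDTHH:mm(+/-)####'.
def parse_oozie_datetime(s: str) -> datetime:
    # Python 3.7+'s %z directive accepts both the 'Z' qualifier
    # and numeric GMT offsets such as '+0530'.
    return datetime.strptime(s, "%Y-%m-%dT%H:%M%z")

print(parse_oozie_datetime("2009-08-10T13:10Z"))      # aware UTC datetime
print(parse_oozie_datetime("2009-08-10T13:10+0530"))  # aware GMT+0530 datetime
```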
 
----++++ 4.1.1 End of the day in Datetime Values
+#### 4.1.1 End of the day in Datetime Values
 
-It is valid to express the end of day as a '24:00' hour (i.e. =2009-08-10T24:00Z=).
+It is valid to express the end of day as a '24:00' hour (i.e. `2009-08-10T24:00Z`).
 
 However, for all calculations and display, Oozie resolves such dates as the zero hour of the following day
-(i.e. =2009-08-11T00:00Z=).
+(i.e. `2009-08-11T00:00Z`).
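The '24:00' resolution rule above can be sketched in Python (illustrative only; `normalize_end_of_day` is a hypothetical name, not an Oozie API):

```python
from datetime import datetime, timedelta

# Illustrative sketch (not Oozie code): resolve a '24:00' end-of-day value
# to the zero hour of the following day, leaving other values untouched.
def normalize_end_of_day(s: str) -> str:
    if "T24:00" not in s:
        return s
    date_part, time_part = s.split("T", 1)
    next_day = datetime.strptime(date_part, "%Y-%m-%d") + timedelta(days=1)
    # keep the original timezone qualifier (e.g. 'Z' or '+0530') as-is
    return next_day.strftime("%Y-%m-%d") + "T00:00" + time_part[len("24:00"):]

print(normalize_end_of_day("2009-08-10T24:00Z"))  # 2009-08-11T00:00Z
```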
 
----+++ 4.2. Timezone Representation
+### 4.2. Timezone Representation
 
 There is no widely accepted standard to identify timezones.
 
 Oozie Coordinator will understand the following timezone identifiers:
 
-   * Generic NON-DST timezone identifier: =GMT[+/-]##:##= (i.e.: GMT+05:30)
-   * UTC timezone identifier: =UTC= (i.e.: 2009-06-06T00:00Z)
+   * Generic NON-DST timezone identifier: `GMT[+/-]##:##` (i.e.: GMT+05:30)
+   * UTC timezone identifier: `UTC` (i.e.: 2009-06-06T00:00Z)
    * ZoneInfo identifiers, with DST support, understood by Java JDK (about 600 IDs) (i.e.: America/Los_Angeles)
 
 Due to DST shift from PST to PDT, it is preferred that GMT, UTC or Region/City timezone notation is used in
@@ -175,9 +190,9 @@ at a DST shift. If used directly, PST will not handle DST shift when time is swi
 
 Oozie Coordinator must provide a tool for developers to list all supported timezone identifiers.
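As a rough analogue of such a tool, Python's IANA zone database can be listed with `zoneinfo`; its region/city identifiers correspond closely to the JDK ZoneInfo IDs the Coordinator understands (this is a sketch for exploration, not the Oozie tool itself):

```python
from zoneinfo import available_timezones

# The IANA tz database behind Python's zoneinfo mirrors the JDK's
# ZoneInfo IDs (several hundred region/city identifiers).
ids = sorted(available_timezones())
print(len(ids), ids[:3])
```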
 
----+++ 4.3. Timezones and Daylight-Saving
+### 4.3. Timezones and Daylight-Saving
 
-While Oozie coordinator engine works in a fixed timezone with no DST (typically =UTC=), it provides DST support for coordinator applications.
+While Oozie coordinator engine works in a fixed timezone with no DST (typically `UTC`), it provides DST support for coordinator applications.
 
 The baseline datetime for datasets and coordinator applications is expressed in UTC. The baseline datetime is the time of the first occurrence.
 
@@ -189,85 +204,89 @@ The timezone indicator enables Oozie coordinator engine to properly compute freq
 
 Section #7 'Handling Timezones and Daylight Saving Time' explains how coordinator applications can be written to handle timezones and daylight-saving-time properly.
 
----+++ 4.4. Frequency and Time-Period Representation
+### 4.4. Frequency and Time-Period Representation
 
 Frequency is used to capture the periodic intervals at which datasets are produced and at which coordinator applications are scheduled to run.
 
 This time-period representation is also used to specify non-recurrent time periods, for example a timeout interval.
 
-For datasets and coordinator applications the frequency time-period is applied =N= times to the baseline datetime to compute recurrent times.
+For datasets and coordinator applications the frequency time-period is applied `N` times to the baseline datetime to compute recurrent times.
 
 Frequency is always expressed in minutes.
 
-Because the number of minutes in day may vary for timezones that observe daylight saving time, constants cannot be use to express frequencies greater than a day for datasets and coordinator applications for such timezones. For such uses cases, Oozie coordinator provides 2 EL functions, =${coord:days(int n)}= and =${coord:months(int n)}=.
+Because the number of minutes in a day may vary for timezones that observe daylight saving time, constants cannot be used to express frequencies greater than a day for datasets and coordinator applications in such timezones. For such use cases, Oozie coordinator provides 2 EL functions, `${coord:days(int n)}` and `${coord:months(int n)}`.
 
 Frequencies can be expressed using EL constants and EL functions that evaluate to a positive integer number.
 
 Coordinator Frequencies can also be expressed using cron syntax.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *EL Constant* | *Value* | *Example* |
-| =${coord:minutes(int n)}= | _n_ | =${coord:minutes(45)}= --> =45= |
-| =${coord:hours(int n)}= | _n * 60_ | =${coord:hours(3)}= --> =180= |
-| =${coord:days(int n)}= | _variable_ | =${coord:days(2)}= --> minutes in 2 full days from the current date |
-| =${coord:months(int n)}= | _variable_ | =${coord:months(1)}= --> minutes in a 1 full month from the current date |
-| =${cron syntax}= | _variable_ | =${0,10 15 * * 2-6}= --> a job that runs every weekday at 3:00pm and 3:10pm UTC time|
+| **EL Constant** | **Value** | **Example** |
+| --- | --- | --- |
+| `${coord:minutes(int n)}` | _n_ | `${coord:minutes(45)}` --> `45` |
+| `${coord:hours(int n)}` | _n * 60_ | `${coord:hours(3)}` --> `180` |
+| `${coord:days(int n)}` | _variable_ | `${coord:days(2)}` --> minutes in 2 full days from the current date |
+| `${coord:months(int n)}` | _variable_ | `${coord:months(1)}` --> minutes in a 1 full month from the current date |
+| `${cron syntax}` | _variable_ | `${0,10 15 * * 2-6}` --> a job that runs every weekday at 3:00pm and 3:10pm UTC time|
 
-Note that, though =${coord:days(int n)}= and =${coord:months(int n)}= EL functions are used to calculate minutes precisely including
+Note that, though `${coord:days(int n)}` and `${coord:months(int n)}` EL functions are used to calculate minutes precisely including
 variations due to daylight saving time for frequency representation, when specified for the coordinator timeout interval, one day is
 calculated as 24 hours and one month is calculated as 30 days for simplicity.
 
----++++ 4.4.1. The coord:days(int n) and coord:endOfDays(int n) EL functions
+#### 4.4.1. The coord:days(int n) and coord:endOfDays(int n) EL functions
 
-The =${coord:days(int n)}= and =${coord:endOfDays(int n)}= EL functions should be used to handle day based frequencies.
+The `${coord:days(int n)}` and `${coord:endOfDays(int n)}` EL functions should be used to handle day based frequencies.
 
 Constant values should not be used to indicate a day-based frequency (every 1 day, every 1 week, etc.) because the number of hours in
 every day is not always the same for timezones that observe daylight-saving time.
 
-It is a good practice to use always these EL functions instead of using a constant expression (i.e. =24 * 60=) even if the timezone
+It is a good practice to always use these EL functions instead of a constant expression (i.e. `24 * 60`) even if the timezone
 for which the application is being written does not observe daylight saving time. This makes the application robust to country
 legislation changes and also makes applications portable across timezones.
 
----+++++ 4.4.1.1. The coord:days(int n) EL function
+##### 4.4.1.1. The coord:days(int n) EL function
 
-The =${coord:days(int n)}= EL function returns the number of minutes for 'n' complete days starting with the day of the specified nominal time for which the computation is being done.
+The `${coord:days(int n)}` EL function returns the number of minutes for 'n' complete days starting with the day of the specified nominal time for which the computation is being done.
 
-The =${coord:days(int n)}= EL function includes *all* the minutes of the current day, regardless of the time of the day of the current nominal time.
+The `${coord:days(int n)}` EL function includes **all** the minutes of the current day, regardless of the time of the day of the current nominal time.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *Starting Nominal UTC time* | *Timezone* | *Usage*  | *Value* | *First Occurrence* | *Comments* |
-| =2009-01-01T08:00Z= | =UTC= | =${coord:days(1)}= | 1440 | =2009-01-01T08:00Z= | total minutes on 2009JAN01 UTC time |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:days(1)}= | 1440 | =2009-01-01T08:00Z= | total minutes in 2009JAN01 PST8PDT time |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:days(2)}= | 2880 | =2009-01-01T08:00Z= | total minutes in 2009JAN01 and 2009JAN02 PST8PDT time |
+| **Starting Nominal UTC time** | **Timezone** | **Usage**  | **Value** | **First Occurrence** | **Comments** |
+| --- | --- | --- | --- | --- | --- |
+| `2009-01-01T08:00Z` | `UTC` | `${coord:days(1)}` | 1440 | `2009-01-01T08:00Z` | total minutes on 2009JAN01 UTC time |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:days(1)}` | 1440 | `2009-01-01T08:00Z` | total minutes in 2009JAN01 PST8PDT time |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:days(2)}` | 2880 | `2009-01-01T08:00Z` | total minutes in 2009JAN01 and 2009JAN02 PST8PDT time |
 | |||||
-| =2009-03-08T08:00Z= | =UTC= | =${coord:days(1)}= | 1440 | =2009-03-08T08:00Z= | total minutes on 2009MAR08 UTC time |
-| =2009-03-08T08:00Z= | =Europe/London= | =${coord:days(1)}= | 1440 | =2009-03-08T08:00Z= | total minutes in 2009MAR08 BST1BDT time |
-| =2009-03-08T08:00Z= | =America/Los_Angeles= | =${coord:days(1)}= | 1380 | =2009-03-08T08:00Z= | total minutes in 2009MAR08 PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
-| =2009-03-08T08:00Z= | =UTC= | =${coord:days(2)}= | 2880 | =2009-03-08T08:00Z= | total minutes in 2009MAR08 and 2009MAR09 UTC time |
-| =2009-03-08T08:00Z= | =America/Los_Angeles= | =${coord:days(2)}= | 2820 | =2009-03-08T08:00Z= | total minutes in 2009MAR08 and 2009MAR09 PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
-| =2009-03-09T08:00Z= | =America/Los_Angeles= | =${coord:days(1)}= | 1440 | =2009-03-09T07:00Z= | total minutes in 2009MAR09 PST8PDT time <br/> (2009MAR08 is DST ON, frequency tick is earlier in UTC) |
+| `2009-03-08T08:00Z` | `UTC` | `${coord:days(1)}` | 1440 | `2009-03-08T08:00Z` | total minutes on 2009MAR08 UTC time |
+| `2009-03-08T08:00Z` | `Europe/London` | `${coord:days(1)}` | 1440 | `2009-03-08T08:00Z` | total minutes in 2009MAR08 BST1BDT time |
+| `2009-03-08T08:00Z` | `America/Los_Angeles` | `${coord:days(1)}` | 1380 | `2009-03-08T08:00Z` | total minutes in 2009MAR08 PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
+| `2009-03-08T08:00Z` | `UTC` | `${coord:days(2)}` | 2880 | `2009-03-08T08:00Z` | total minutes in 2009MAR08 and 2009MAR09 UTC time |
+| `2009-03-08T08:00Z` | `America/Los_Angeles` | `${coord:days(2)}` | 2820 | `2009-03-08T08:00Z` | total minutes in 2009MAR08 and 2009MAR09 PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
+| `2009-03-09T08:00Z` | `America/Los_Angeles` | `${coord:days(1)}` | 1440 | `2009-03-09T07:00Z` | total minutes in 2009MAR09 PST8PDT time <br/> (2009MAR08 is DST ON, frequency tick is earlier in UTC) |
 
-For all these examples, the first occurrence of the frequency will be at =08:00Z= (UTC time).
+For all these examples, the first occurrence of the frequency will be at `08:00Z` (UTC time).
 
----+++++ 4.4.1.2. The coord:endOfDays(int n) EL function
+##### 4.4.1.2. The coord:endOfDays(int n) EL function
 
-The =${coord:endOfDays(int n)}= EL function is identical to the =${coord:days(int n)}= except that it shifts the first occurrence to the end of the day for the specified timezone before computing the interval in minutes.
+The `${coord:endOfDays(int n)}` EL function is identical to the `${coord:days(int n)}` except that it shifts the first occurrence to the end of the day for the specified timezone before computing the interval in minutes.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *Starting Nominal UTC time* | *Timezone* | *Usage*  | *Value* | *First Occurrence* | *Comments* |
-| =2009-01-01T08:00Z= | =UTC= | =${coord:endOfDays(1)}= | 1440 | =2009-01-02T00:00Z= | first occurrence in 2009JAN02 00:00 UTC time, <br/> first occurrence shifted to the end of the UTC day |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1440 | =2009-01-02T08:00Z= | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
-| =2009-01-01T08:01Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1440 | =2009-01-02T08:00Z= | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
-| =2009-01-01T18:00Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1440 | =2009-01-02T08:00Z= | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
+| **Starting Nominal UTC time** | **Timezone** | **Usage**  | **Value** | **First Occurrence** | **Comments** |
+| --- | --- | --- | --- | --- | --- |
+| `2009-01-01T08:00Z` | `UTC` | `${coord:endOfDays(1)}` | 1440 | `2009-01-02T00:00Z` | first occurrence in 2009JAN02 00:00 UTC time, <br/> first occurrence shifted to the end of the UTC day |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1440 | `2009-01-02T08:00Z` | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
+| `2009-01-01T08:01Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1440 | `2009-01-02T08:00Z` | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
+| `2009-01-01T18:00Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1440 | `2009-01-02T08:00Z` | first occurrence in 2009JAN02 08:00 UTC time, <br/> first occurrence shifted to the end of the PST8PDT day |
 | |||||
-| =2009-03-07T09:00Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1380 | =2009-03-08T08:00Z= | first occurrence in 2009MAR08 08:00 UTC time <br/> first occurrence shifted to the end of the PST8PDT day |
-| =2009-03-08T07:00Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1440 | =2009-03-08T08:00Z= | first occurrence in 2009MAR08 08:00 UTC time <br/> first occurrence shifted to the end of the PST8PDT day |
-| =2009-03-09T07:00Z= | =America/Los_Angeles= | =${coord:endOfDays(1)}= | 1440 | =2009-03-10T07:00Z= | first occurrence in 2009MAR10 07:00 UTC time <br/> (2009MAR08 is DST switch in the US), <br/> first occurrence shifted to the end of the PST8PDT day |
+| `2009-03-07T09:00Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1380 | `2009-03-08T08:00Z` | first occurrence in 2009MAR08 08:00 UTC time <br/> first occurrence shifted to the end of the PST8PDT day |
+| `2009-03-08T07:00Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1440 | `2009-03-08T08:00Z` | first occurrence in 2009MAR08 08:00 UTC time <br/> first occurrence shifted to the end of the PST8PDT day |
+| `2009-03-09T07:00Z` | `America/Los_Angeles` | `${coord:endOfDays(1)}` | 1440 | `2009-03-10T07:00Z` | first occurrence in 2009MAR10 07:00 UTC time <br/> (2009MAR08 is DST switch in the US), <br/> first occurrence shifted to the end of the PST8PDT day |
 
-<verbatim>
+
+```
 <coordinator-app name="hello-coord" frequency="${coord:days(1)}"
                   start="2009-01-02T08:00Z" end="2009-01-04T08:00Z" timezone="America/Los_Angeles"
                  xmlns="uri:oozie:coordinator:0.5">
@@ -317,48 +336,51 @@ The =${coord:endOfDays(int n)}= EL function is identical to the =${coord:days(in
        </workflow>
       </action>
  </coordinator-app>
-</verbatim>
+```
 
----++++ 4.4.2. The coord:months(int n) and coord:endOfMonths(int n) EL functions
+#### 4.4.2. The coord:months(int n) and coord:endOfMonths(int n) EL functions
 
-The =${coord:months(int n)}= and =${coord:endOfMonths(int n)}= EL functions should be used to handle month based frequencies.
+The `${coord:months(int n)}` and `${coord:endOfMonths(int n)}` EL functions should be used to handle month based frequencies.
 
 Constant values cannot be used to indicate a month-based frequency because the number of days in a month changes from month to month and on leap years; moreover, the number of hours in each day of the month is not always the same for timezones that observe daylight-saving time.
 
----+++++ 4.4.2.1. The coord:months(int n) EL function
+##### 4.4.2.1. The coord:months(int n) EL function
 
-The =${coord:months(int n)}= EL function returns the number of minutes for 'n' complete months starting with the month of the current nominal time for which the computation is being done.
+The `${coord:months(int n)}` EL function returns the number of minutes for 'n' complete months starting with the month of the current nominal time for which the computation is being done.
 
-The =${coord:months(int n)}= EL function includes *all* the minutes of the current month, regardless of the day of the month of the current nominal time.
+The `${coord:months(int n)}` EL function includes **all** the minutes of the current month, regardless of the day of the month of the current nominal time.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *Starting Nominal UTC time* | *Timezone* | *Usage*  | *Value* | *First Occurrence* | *Comments* |
-| =2009-01-01T08:00Z= | =UTC= | =${coord:months(1)}= | 44640 | =2009-01-01T08:00Z= |total minutes for 2009JAN UTC time |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:months(1)}= | 44640 | =2009-01-01T08:00Z= | total minutes in 2009JAN PST8PDT time |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:months(2)}= | 84960 | =2009-01-01T08:00Z= | total minutes in 2009JAN and 2009FEB PST8PDT time |
+| **Starting Nominal UTC time** | **Timezone** | **Usage**  | **Value** | **First Occurrence** | **Comments** |
+| --- | --- | --- | --- | --- | --- |
+| `2009-01-01T08:00Z` | `UTC` | `${coord:months(1)}` | 44640 | `2009-01-01T08:00Z` |total minutes for 2009JAN UTC time |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:months(1)}` | 44640 | `2009-01-01T08:00Z` | total minutes in 2009JAN PST8PDT time |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:months(2)}` | 84960 | `2009-01-01T08:00Z` | total minutes in 2009JAN and 2009FEB PST8PDT time |
 | |||||
-| =2009-03-08T08:00Z= | =UTC= | =${coord:months(1)}= | 44640 | =2009-03-08T08:00Z= | total minutes on 2009MAR UTC time |
-| =2009-03-08T08:00Z= | =Europe/London= | =${coord:months(1)}= | 44580 | =2009-03-08T08:00Z= | total minutes in 2009MAR BST1BDT time <br/> (2009MAR29 is DST switch in Europe) |
-| =2009-03-08T08:00Z= | =America/Los_Angeles= | =${coord:months(1)}= | 44580 | =2009-03-08T08:00Z= | total minutes in 2009MAR PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
-| =2009-03-08T08:00Z= | =UTC= | =${coord:months(2)}= | 87840 | =2009-03-08T08:00Z= | total minutes in 2009MAR and 2009APR UTC time |
-| =2009-03-08T08:00Z= | =America/Los_Angeles= | =${coord:months(2)}= | 87780 | =2009-03-08T08:00Z= | total minutes in 2009MAR and 2009APR PST8PDT time <br/> (2009MAR08 is DST switch in US) |
+| `2009-03-08T08:00Z` | `UTC` | `${coord:months(1)}` | 44640 | `2009-03-08T08:00Z` | total minutes on 2009MAR UTC time |
+| `2009-03-08T08:00Z` | `Europe/London` | `${coord:months(1)}` | 44580 | `2009-03-08T08:00Z` | total minutes in 2009MAR BST1BDT time <br/> (2009MAR29 is DST switch in Europe) |
+| `2009-03-08T08:00Z` | `America/Los_Angeles` | `${coord:months(1)}` | 44580 | `2009-03-08T08:00Z` | total minutes in 2009MAR PST8PDT time <br/> (2009MAR08 is DST switch in the US) |
+| `2009-03-08T08:00Z` | `UTC` | `${coord:months(2)}` | 87840 | `2009-03-08T08:00Z` | total minutes in 2009MAR and 2009APR UTC time |
+| `2009-03-08T08:00Z` | `America/Los_Angeles` | `${coord:months(2)}` | 87780 | `2009-03-08T08:00Z` | total minutes in 2009MAR and 2009APR PST8PDT time <br/> (2009MAR08 is DST switch in US) |
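
The values in the table above can be reproduced with a small sketch (not Oozie's implementation) that advances the nominal time by `n` calendar months in local time and measures the interval in minutes; it assumes Python 3.9+ `zoneinfo` and that the nominal day-of-month exists in the target month:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

def months_minutes(nominal_utc, tz, n):
    """Sketch of ${coord:months(n)}: minutes from the nominal time to the
    instant n calendar months later on the local, timezone-aware calendar."""
    local = nominal_utc.astimezone(ZoneInfo(tz))
    month0 = local.month - 1 + n
    # keep the local wall-clock time, move n months ahead
    shifted = local.replace(year=local.year + month0 // 12, month=month0 % 12 + 1)
    delta = shifted.astimezone(timezone.utc) - nominal_utc
    return int(delta.total_seconds() // 60)

jan = datetime(2009, 1, 1, 8, tzinfo=timezone.utc)
mar = datetime(2009, 3, 8, 8, tzinfo=timezone.utc)
print(months_minutes(jan, "UTC", 1))                  # 44640
print(months_minutes(mar, "America/Los_Angeles", 1))  # 44580 (one hour lost to DST)
print(months_minutes(mar, "America/Los_Angeles", 2))  # 87780
```

Note how the DST switch on 2009MAR08 makes the PST8PDT month 60 minutes shorter than the UTC one.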
+
+##### 4.4.2.2. The coord:endOfMonths(int n) EL function
 
----+++++ 4.4.2.2. The coord:endOfMonths(int n) EL function
+The `${coord:endOfMonths(int n)}` EL function is identical to the `${coord:months(int n)}` except that it shifts the first occurrence to the end of the month for the specified timezone before computing the interval in minutes.
 
-The =${coord:endOfMonths(int n)}= EL function is identical to the =${coord:months(int n)}= except that it shifts the first occurrence to the end of the month for the specified timezone before computing the interval in minutes.
+**<font color="#008000"> Examples: </font>**
 
-*%GREEN% Examples: %ENDCOLOR%*
+| **Starting Nominal UTC time** | **Timezone** | **Usage**  | **Value** | **First Occurrence** | **Comments** |
+| --- | --- | --- | --- | --- | --- |
+| `2009-01-01T00:00Z` | `UTC` | `${coord:endOfMonths(1)}` | 40320 | `2009-02-01T00:00Z` | first occurrence in 2009FEB 00:00 UTC time |
+| `2009-01-01T08:00Z` | `UTC` | `${coord:endOfMonths(1)}` | 40320 | `2009-02-01T00:00Z` | first occurrence in 2009FEB 00:00 UTC time |
+| `2009-01-31T08:00Z` | `UTC` | `${coord:endOfMonths(1)}` | 40320 | `2009-02-01T00:00Z` | first occurrence in 2009FEB 00:00 UTC time |
+| `2009-01-01T08:00Z` | `America/Los_Angeles` | `${coord:endOfMonths(1)}` | 40320 | `2009-02-01T08:00Z` | first occurrence in 2009FEB 08:00 UTC time |
+| `2009-02-02T08:00Z` | `America/Los_Angeles` | `${coord:endOfMonths(1)}` | 44580  | `2009-03-01T08:00Z` | first occurrence in 2009MAR 08:00 UTC time |
+| `2009-02-01T08:00Z` | `America/Los_Angeles` | `${coord:endOfMonths(1)}` | 44580  | `2009-03-01T08:00Z` | first occurrence in 2009MAR 08:00 UTC time |
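
The end-of-month shift in the table above can be sketched as moving the first occurrence to 00:00 local time on the first day of the following month (again a sketch, not Oozie's code, assuming Python 3.9+ `zoneinfo`):

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

def end_of_months_first(nominal_utc, tz):
    """Sketch of the endOfMonths shift: first occurrence is 00:00 local time
    on the first day of the month after the nominal time."""
    local = nominal_utc.astimezone(ZoneInfo(tz))
    first = local.replace(year=local.year + local.month // 12,
                          month=local.month % 12 + 1,
                          day=1, hour=0, minute=0, second=0, microsecond=0)
    return first.astimezone(timezone.utc)

print(end_of_months_first(datetime(2009, 1, 1, 8, tzinfo=timezone.utc),
                          "America/Los_Angeles"))  # 2009-02-01 08:00:00+00:00
```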
 
-| *Starting Nominal UTC time* | *Timezone* | *Usage*  | *Value* | *First Occurrence* | *Comments* |
-| =2009-01-01T00:00Z= | =UTC= | =${coord:endOfMonths(1)}= | 40320 | =2009-02-01T00:00Z= | first occurrence in 2009FEB 00:00 UTC time |
-| =2009-01-01T08:00Z= | =UTC= | =${coord:endOfMonths(1)}= | 40320 | =2009-02-01T00:00Z= | first occurrence in 2009FEB 00:00 UTC time |
-| =2009-01-31T08:00Z= | =UTC= | =${coord:endOfMonths(1)}= | 40320 | =2009-02-01T00:00Z= | first occurrence in 2009FEB 00:00 UTC time |
-| =2009-01-01T08:00Z= | =America/Los_Angeles= | =${coord:endOfMonths(1)}= | 40320 | =2009-02-01T08:00Z= | first occurrence in 2009FEB 08:00 UTC time |
-| =2009-02-02T08:00Z= | =America/Los_Angeles= | =${coord:endOfMonths(1)}= | 44580  | =2009-03-01T08:00Z= | first occurrence in 2009MAR 08:00 UTC time |
-| =2009-02-01T08:00Z= | =America/Los_Angeles= | =${coord:endOfMonths(1)}= | 44580  | =2009-03-01T08:00Z= | first occurrence in 2009MAR 08:00 UTC time |
 
-<verbatim>
+```
 <coordinator-app name="hello-coord" frequency="${coord:months(1)}"
                   start="2009-01-02T08:00Z" end="2009-04-02T08:00Z" timezone="America/Los_Angeles"
                  xmlns="uri:oozie:coordinator:0.5">
@@ -408,25 +430,27 @@ The =${coord:endOfMonths(int n)}= EL function is identical to the =${coord:month
        </workflow>
       </action>
  </coordinator-app>
-</verbatim>
+```
 
----++++ 4.4.3. The coord:endOfWeeks(int n) EL function
+#### 4.4.3. The coord:endOfWeeks(int n) EL function
 
-The =${coord:endOfWeeks(int n)}=  EL function shifts the first occurrence to the start of the week for the specified
+The `${coord:endOfWeeks(int n)}` EL function shifts the first occurrence to the start of the week for the specified
 timezone before computing the interval in minutes. The start of the week depends on Java's implementation of
-[[https://docs.oracle.com/javase/8/docs/api/java/util/Calendar.html#getFirstDayOfWeek--][Calendar.getFirstDayOfWeek()]]
+[Calendar.getFirstDayOfWeek()](https://docs.oracle.com/javase/8/docs/api/java/util/Calendar.html#getFirstDayOfWeek--),
 i.e. the first day of the week is SUNDAY in the U.S. and MONDAY in France.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *Starting Nominal UTC time* | *Timezone* | *Usage*  | *Value* | *First Occurrence* | *Comments* |
-| =2017-01-04T00:00Z= | =UTC= | =${coord:endOfWeeks(1)}= | 10080 | =2017-01-08T00:00Z= | first occurrence on 2017JAN08 08:00 UTC time |
-| =2017-01-04T08:00Z= | =UTC= | =${coord:endOfWeeks(1)}= | 10080 | =2017-01-08T08:00Z= | first occurrence on 2017JAN08 08:00 UTC time |
-| =2017-01-06T08:00Z= | =UTC= | =${coord:endOfWeeks(1)}= | 10080 | =2017-01-08T08:00Z= | first occurrence on 2017JAN08 08:00 UTC time |
-| =2017-01-04T08:00Z= | =America/Los_Angeles= | =${coord:endOfWeeks(1)}= | 10080 | =2017-01-08T08:00Z= | first occurrence in 2017JAN08 08:00 UTC time |
-| =2017-01-06T08:00Z= | =America/Los_Angeles= | =${coord:endOfWeeks(1)}= | 10080 | =2017-01-08T08:00Z= | first occurrence in 2017JAN08 08:00 UTC time |
+| **Starting Nominal UTC time** | **Timezone** | **Usage**  | **Value** | **First Occurrence** | **Comments** |
+| --- | --- | --- | --- | --- | --- |
+| `2017-01-04T00:00Z` | `UTC` | `${coord:endOfWeeks(1)}` | 10080 | `2017-01-08T00:00Z` | first occurrence on 2017JAN08 00:00 UTC time |
+| `2017-01-04T08:00Z` | `UTC` | `${coord:endOfWeeks(1)}` | 10080 | `2017-01-08T08:00Z` | first occurrence on 2017JAN08 08:00 UTC time |
+| `2017-01-06T08:00Z` | `UTC` | `${coord:endOfWeeks(1)}` | 10080 | `2017-01-08T08:00Z` | first occurrence on 2017JAN08 08:00 UTC time |
+| `2017-01-04T08:00Z` | `America/Los_Angeles` | `${coord:endOfWeeks(1)}` | 10080 | `2017-01-08T08:00Z` | first occurrence in 2017JAN08 08:00 UTC time |
+| `2017-01-06T08:00Z` | `America/Los_Angeles` | `${coord:endOfWeeks(1)}` | 10080 | `2017-01-08T08:00Z` | first occurrence in 2017JAN08 08:00 UTC time |
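
The first occurrences in the table above, which keep the nominal time-of-day, can be reproduced with a sketch that moves forward to the next start-of-week day in local time (an illustration only; the default `first_dow=6` models the U.S. convention of a Sunday week start, per `Calendar.getFirstDayOfWeek()`):

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

def end_of_weeks_first(nominal_utc, tz, first_dow=6):
    """Sketch of the endOfWeeks shift. first_dow uses Python's weekday()
    numbering (Monday=0 ... Sunday=6); 6 means the week starts on Sunday."""
    local = nominal_utc.astimezone(ZoneInfo(tz))
    # days until the next start-of-week day (a full week if already on it)
    days_ahead = (first_dow - local.weekday()) % 7 or 7
    return (local + timedelta(days=days_ahead)).astimezone(timezone.utc)

print(end_of_weeks_first(datetime(2017, 1, 4, 8, tzinfo=timezone.utc), "UTC"))
# 2017-01-08 08:00:00+00:00
```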
 
-<verbatim>
+
+```
 <coordinator-app name="hello-coord" frequency="${coord:endOfWeeks(1)}"
                   start="2017-01-04T08:00Z" end="2017-12-31T08:00Z" timezone="America/Los_Angeles"
                  xmlns="uri:oozie:coordinator:0.5">
@@ -476,9 +500,9 @@ timezone before computing the interval in minutes. The start of the week depends
        </workflow>
       </action>
  </coordinator-app>
-</verbatim>
+```
 
----++++ 4.4.4. Cron syntax in coordinator frequency
+#### 4.4.4. Cron syntax in coordinator frequency
 
 Oozie has historically allowed only very basic forms of scheduling: You could choose
 to run jobs separated by a certain number of minutes, hours, days or weeks. That's
@@ -497,8 +521,9 @@ Cron is a standard time-based job scheduling mechanism in unix-like operating sy
 administrators to setup jobs and maintain software environment. Cron syntax generally consists of five fields, minutes,
 hours, date of month, month, and day of week respectively although multiple variations do exist.
 
-<verbatim>
-<coordinator-app name="cron-coord" frequency="0/10 1/2 * * *" start="${start}" end="${end}" timezone="UTC"
+
+```
+<coordinator-app name="cron-coord" frequency="0/10 1/2 * * *" start="${start}" end="${end}" timezone="UTC"
                  xmlns="uri:oozie:coordinator:0.2">
         <action>
         <workflow>
@@ -520,18 +545,19 @@ hours, date of month, month, and day of week respectively although multiple vari
         </workflow>
     </action>
 </coordinator-app>
-</verbatim>
+```
 
 Cron expressions are comprised of 5 required fields. The fields respectively are described as follows:
 
-| *Field name* | *Allowed Values* | *Allowed Special Characters*  |
-| =Minutes= | =0-59= | , - * / |
-| =Hours= | =0-23= | , - * / |
-| =Day-of-month= | =1-31= | , - * ? / L W |
-| =Month= | =1-12 or JAN-DEC= | , - * / |
-| =Day-of-Week= | =1-7 or SUN-SAT= | , - * ? / L #|
+| **Field name** | **Allowed Values** | **Allowed Special Characters**  |
+| --- | --- | --- |
+| `Minutes` | `0-59` | , - * / |
+| `Hours` | `0-23` | , - * / |
+| `Day-of-month` | `1-31` | , - * ? / L W |
+| `Month` | `1-12 or JAN-DEC` | , - * / |
+| `Day-of-Week` | `1-7 or SUN-SAT` | , - * ? / L #|
 
-The '*' character is used to specify all values. For example, "*" in the minute field means "every minute".
+The `*` character is used to specify all values. For example, `*` in the minute field means "every minute".
 
 The '?' character is allowed for the day-of-month and day-of-week fields. It is used to specify 'no specific value'.
 This is useful when you need to specify something in one of the two fields, but not the other.
@@ -585,21 +611,22 @@ The legal characters and the names of months and days of the week are not case s
 If a user specifies an invalid cron syntax to run something on Feb, 30th for example: "0 10 30 2 *", the coordinator job
 will not be created and an invalid coordinator frequency parse exception will be thrown.
 
-If a user has a coordinator job that materializes no action during run time, for example: frequency of "0 10 * * *" with
+If a user has a coordinator job that materializes no action during run time, for example: frequency of "0 10 * * *" with
 start time of 2013-10-18T21:00Z and end time of 2013-10-18T22:00Z, the coordinator job submission will be rejected and
 an invalid coordinator attribute exception will be thrown.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-| *Cron Expression* | *Meaning* |
-| 10 9 * * * | Runs everyday at 9:10am |
-| 10,30,45 9 * * * | Runs everyday at 9:10am, 9:30am, and 9:45am |
-| =0 * 30 JAN 2-6= | Runs at 0 minute of every hour on weekdays and 30th of January |
-| =0/20 9-17 * * 2-5= | Runs every Mon, Tue, Wed, and Thurs at minutes 0, 20, 40 from 9am to 5pm |
-| 1 2 L-3 * * | Runs every third-to-last day of month at 2:01am |
-| =1 2 6W 3 ?= | Runs on the nearest weekday to March, 6th every year at 2:01am |
-| =1 2 * 3 3#2= | Runs every second Tuesday of March at 2:01am every year |
-| =0 10,13 * * MON-FRI= | Runs every weekday at 10am and 1pm |
+| **Cron Expression** | **Meaning** |
+| --- | --- |
+| `10 9 * * *` | Runs every day at 9:10am |
+| `10,30,45 9 * * *` | Runs every day at 9:10am, 9:30am, and 9:45am |
+| `0 * 30 JAN 2-6` | Runs at minute 0 of every hour on weekdays and on the 30th of January |
+| `0/20 9-17 * * 2-5` | Runs every Mon, Tue, Wed, and Thu at minutes 0, 20, and 40 from 9am to 5pm |
+| `1 2 L-3 * *` | Runs on the third-to-last day of every month at 2:01am |
+| `1 2 6W 3 ?` | Runs on the weekday nearest to March 6th every year at 2:01am |
+| `1 2 * 3 3#2` | Runs on the second Tuesday of March at 2:01am every year |
+| `0 10,13 * * MON-FRI` | Runs every weekday at 10am and 1pm |
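
A rough structural check of the five-field syntax, following the field table above, can be sketched as below. This is an illustration only, not Oozie's parser: it validates the field count and numeric bounds, but not named values such as `JAN` or `MON`, nor calendar validity (the February 30th case is rejected by Oozie itself, not by this sketch):

```python
import re

# Per-field numeric bounds from the field table above:
# minutes, hours, day-of-month, month, day-of-week
BOUNDS = [(0, 59), (0, 23), (1, 31), (1, 12), (1, 7)]

def basic_cron_check(expr):
    fields = expr.split()
    if len(fields) != 5:
        return False
    for (lo, hi), field in zip(BOUNDS, fields):
        for part in field.split(","):
            part = re.sub(r"/\d+$", "", part)  # strip a step suffix like /20
            if part in ("*", "?"):
                continue
            # check every number in values, ranges (2-5), L-3, 6W, 3#2, ...
            for num in re.findall(r"\d+", part):
                if not lo <= int(num) <= hi:
                    return False
    return True

print(basic_cron_check("0/10 1/2 * * *"))  # True
print(basic_cron_check("99 9 * * *"))      # False: minute out of range
```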
 
 
 NOTES:
@@ -619,7 +646,7 @@ NOTES:
     no effort has been made to determine which interpretation CronExpression chooses.
     An example would be "0 14-6 ? * FRI-MON".
 
----++ 5. Dataset
+## 5. Dataset
 
 A dataset is a collection of data referred to by a logical name.
 
@@ -631,7 +658,7 @@ A dataset is a synchronous (produced at regular time intervals, it has an expect
 
 A dataset instance is considered to be immutable while it is being consumed by coordinator jobs.
 
----+++ 5.1. Synchronous Datasets
+### 5.1. Synchronous Datasets
 
 Instances of synchronous datasets are produced at regular time intervals, at an expected frequency. They are also referred to as "clocked datasets".
 
@@ -639,175 +666,190 @@ Synchronous dataset instances are identified by their nominal creation time. The
 
 A synchronous dataset definition contains the following information:
 
-   * *%BLUE% name: %ENDCOLOR%* The dataset name. It must be a valid Java identifier.
-   * *%BLUE% frequency: %ENDCOLOR%* It represents the rate, in minutes at which data is _periodically_ created. The granularity is in minutes and can be expressed using EL expressions, for example: ${5 * HOUR}.
-   * *%BLUE% initial-instance: %ENDCOLOR%* The UTC datetime of the initial instance of the dataset. The initial-instance also provides the baseline datetime to compute instances of the dataset using multiples of the frequency.
-   * *%BLUE% timezone:%ENDCOLOR%* The timezone of the dataset.
-   * *%BLUE% uri-template:%ENDCOLOR%* The URI template that identifies the dataset and can be resolved into concrete URIs to identify a particular dataset instance. The URI template is constructed using:
-      * *%BLUE% constants %ENDCOLOR%* See the allowable EL Time Constants below. Ex: ${YEAR}/${MONTH}.
-      * *%BLUE% variables %ENDCOLOR%* Variables must be resolved at the time a coordinator job is submitted to the coordinator engine. They are normally provided a job parameters (configuration properties). Ex: ${market}/${language}
-   * *%BLUE% done-flag:%ENDCOLOR%* This flag denotes when a dataset instance is ready to be consumed.
+   * **<font color="#0000ff"> name: </font>** The dataset name. It must be a valid Java identifier.
+   * **<font color="#0000ff"> frequency: </font>** It represents the rate, in minutes, at which data is _periodically_ created. The granularity is in minutes and can be expressed using EL expressions, for example: `${5 * HOUR}`.
+   * **<font color="#0000ff"> initial-instance: </font>** The UTC datetime of the initial instance of the dataset. The initial-instance also provides the baseline datetime to compute instances of the dataset using multiples of the frequency.
+   * **<font color="#0000ff"> timezone:</font>** The timezone of the dataset.
+   * **<font color="#0000ff"> uri-template:</font>** The URI template that identifies the dataset and can be resolved into concrete URIs to identify a particular dataset instance. The URI template is constructed using:
+      * **<font color="#0000ff"> constants </font>** See the allowable EL Time Constants below. Ex: ${YEAR}/${MONTH}.
+      * **<font color="#0000ff"> variables </font>** Variables must be resolved at the time a coordinator job is submitted to the coordinator engine. They are normally provided as job parameters (configuration properties). Ex: ${market}/${language}
+   * **<font color="#0000ff"> done-flag:</font>** This flag denotes when a dataset instance is ready to be consumed.
      * If the done-flag is omitted the coordinator will wait for the presence of a `_SUCCESS` file in the directory (Note: MapReduce jobs create this on successful completion automatically).
      * If the done-flag is present but empty, then the existence of the directory itself indicates that the dataset is ready.
      * If the done-flag is present but non-empty, Oozie will check for the presence of the named file within the directory, and the dataset instance will be considered ready (done) when the file exists.
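
The three done-flag rules can be sketched as a readiness check. The `exists` callable below stands in for a real filesystem check, such as an HDFS client's existence test (an assumption for illustration):

```python
import posixpath

def instance_ready(exists, instance_uri, done_flag):
    """Sketch of the done-flag rules above.

    done_flag is None  -> wait for the default _SUCCESS file,
    done_flag == ""    -> the directory itself is enough,
    otherwise          -> wait for the named file inside the directory.
    """
    if done_flag is None:
        return exists(posixpath.join(instance_uri, "_SUCCESS"))
    if done_flag == "":
        return exists(instance_uri)
    return exists(posixpath.join(instance_uri, done_flag))

# Fake filesystem for illustration
fs = {"/app/logs/20090215", "/app/stats/200901/trigger.dat"}
print(instance_ready(fs.__contains__, "/app/logs/20090215", ""))            # True
print(instance_ready(fs.__contains__, "/app/stats/200901", "trigger.dat"))  # True
print(instance_ready(fs.__contains__, "/app/stats/200902", None))           # False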
 
 The following EL constants can be used within synchronous dataset URI templates:
 
-| *EL Constant* | *Resulting Format* | *Comments*  |
-| =YEAR= | _YYYY_ | 4 digits representing the year |
-| =MONTH= | _MM_ | 2 digits representing the month of the year, January = 1 |
-| =DAY= | _DD_ | 2 digits representing the day of the month |
-| =HOUR= | _HH_ | 2 digits representing the hour of the day, in 24 hour format, 0 - 23 |
-| =MINUTE= | _mm_ | 2 digits representing the minute of the hour, 0 - 59 |
+| **EL Constant** | **Resulting Format** | **Comments**  |
+| --- | --- | --- |
+| `YEAR` | _YYYY_ | 4 digits representing the year |
+| `MONTH` | _MM_ | 2 digits representing the month of the year, January = 1 |
+| `DAY` | _DD_ | 2 digits representing the day of the month |
+| `HOUR` | _HH_ | 2 digits representing the hour of the day, in 24 hour format, 0 - 23 |
+| `MINUTE` | _mm_ | 2 digits representing the minute of the hour, 0 - 59 |
+
+**<font color="#800080">Syntax: </font>**
 
-*%PURPLE% Syntax: %ENDCOLOR%*
 
-<verbatim>
+```
   <dataset name="[NAME]" frequency="[FREQUENCY]"
            initial-instance="[DATETIME]" timezone="[TIMEZONE]">
     <uri-template>[URI TEMPLATE]</uri-template>
     <done-flag>[FILE NAME]</done-flag>
   </dataset>
-</verbatim>
+```
 
 IMPORTANT: The values of the EL constants in the dataset URIs (in HDFS) are expected in UTC. Oozie Coordinator takes care of the timezone conversion when performing calculations.
 
-*%GREEN% Examples: %ENDCOLOR%*
+**<font color="#008000"> Examples: </font>**
 
-1. *A dataset produced once every day at 00:15 PST8PDT and done-flag is set to empty:*
+1. **A dataset produced once every day at 00:15 PST8PDT and done-flag is set to empty:**
 
-<verbatim>
-  <dataset name="logs" frequency="${coord:days(1)}"
-           initial-instance="2009-02-15T08:15Z" timezone="America/Los_Angeles">
-    <uri-template>
-      hdfs://foo:8020/app/logs/${market}/${YEAR}${MONTH}/${DAY}/data
-    </uri-template>
-    <done-flag></done-flag>
-  </dataset>
-</verbatim>
 
+    ```
+      <dataset name="logs" frequency="${coord:days(1)}"
+               initial-instance="2009-02-15T08:15Z" timezone="America/Los_Angeles">
+        <uri-template>
+          hdfs://foo:8020/app/logs/${market}/${YEAR}${MONTH}/${DAY}/data
+        </uri-template>
+        <done-flag></done-flag>
+      </dataset>
+    ```
 
-The dataset would resolve to the following URIs and Coordinator looks for the existence of the directory itself:
 
-<verbatim>
-  [market] will be replaced with user given property.
+    The dataset would resolve to the following URIs and the Coordinator looks for the existence of the directory itself:
 
-  hdfs://foo:8020/usr/app/[market]/2009/02/15/data
-  hdfs://foo:8020/usr/app/[market]/2009/02/16/data
-  hdfs://foo:8020/usr/app/[market]/2009/02/17/data
-  ...
-</verbatim>
 
+    ```
+      [market] will be replaced with user given property.
 
-2. *A dataset available on the 10th of each month and done-flag is default '_SUCCESS':*
+      hdfs://foo:8020/usr/app/[market]/2009/02/15/data
+      hdfs://foo:8020/usr/app/[market]/2009/02/16/data
+      hdfs://foo:8020/usr/app/[market]/2009/02/17/data
+      ...
+    ```
 
-<verbatim>
-  <dataset name="stats" frequency="${coord:months(1)}"
-           initial-instance="2009-01-10T10:00Z" timezone="America/Los_Angeles">
-    <uri-template>hdfs://foo:8020/usr/app/stats/${YEAR}/${MONTH}/data</uri-template>
-  </dataset>
-</verbatim>
 
-The dataset would resolve to the following URIs:
+2. **A dataset available on the 10th of each month and done-flag is default '_SUCCESS':**
 
-<verbatim>
-  hdfs://foo:8020/usr/app/stats/2009/01/data
-  hdfs://foo:8020/usr/app/stats/2009/02/data
-  hdfs://foo:8020/usr/app/stats/2009/03/data
-  ...
-</verbatim>
 
-The dataset instances are not ready until '_SUCCESS' exists in each path:
+    ```
+      <dataset name="stats" frequency="${coord:months(1)}"
+               initial-instance="2009-01-10T10:00Z" timezone="America/Los_Angeles">
+        <uri-template>hdfs://foo:8020/usr/app/stats/${YEAR}/${MONTH}/data</uri-template>
+      </dataset>
+    ```
 
-<verbatim>
-  hdfs://foo:8020/usr/app/stats/2009/01/data/_SUCCESS
-  hdfs://foo:8020/usr/app/stats/2009/02/data/_SUCCESS
-  hdfs://foo:8020/usr/app/stats/2009/03/data/_SUCCESS
-  ...
-</verbatim>
+    The dataset would resolve to the following URIs:
 
 
-3. *A dataset available at the end of every quarter and done-flag is 'trigger.dat':*
+    ```
+      hdfs://foo:8020/usr/app/stats/2009/01/data
+      hdfs://foo:8020/usr/app/stats/2009/02/data
+      hdfs://foo:8020/usr/app/stats/2009/03/data
+      ...
+    ```
 
-<verbatim>
-  <dataset name="stats" frequency="${coord:months(3)}"
-           initial-instance="2009-01-31T20:00Z" timezone="America/Los_Angeles">
-    <uri-template>
-      hdfs://foo:8020/usr/app/stats/${YEAR}/${MONTH}/data
-    </uri-template>
-    <done-flag>trigger.dat</done-flag>
-  </dataset>
-</verbatim>
+    The dataset instances are not ready until '_SUCCESS' exists in each path:
 
-The dataset would resolve to the following URIs:
 
-<verbatim>
-  hdfs://foo:8020/usr/app/stats/2009/01/data
-  hdfs://foo:8020/usr/app/stats/2009/04/data
-  hdfs://foo:8020/usr/app/stats/2009/07/data
-  ...
-</verbatim>
+    ```
+      hdfs://foo:8020/usr/app/stats/2009/01/data/_SUCCESS
+      hdfs://foo:8020/usr/app/stats/2009/02/data/_SUCCESS
+      hdfs://foo:8020/usr/app/stats/2009/03/data/_SUCCESS
+      ...
+    ```
 
-The dataset instances are not ready until 'trigger.dat' exists in each path:
 
-<verbatim>
-  hdfs://foo:8020/usr/app/stats/2009/01/data/trigger.dat
-  hdfs://foo:8020/usr/app/stats/2009/04/data/trigger.dat
-  hdfs://foo:8020/usr/app/stats/2009/07/data/trigger.dat
-  ...
-</verbatim>
+3. **A dataset available at the end of every quarter and done-flag is 'trigger.dat':**
 
 
-4. *Normally the URI template of a dataset has a precision similar to the frequency:*
+    ```
+      <dataset name="stats" frequency="${coord:months(3)}"
+               initial-instance="2009-01-31T20:00Z" timezone="America/Los_Angeles">
+        <uri-template>
+          hdfs://foo:8020/usr/app/stats/${YEAR}/${MONTH}/data
+        </uri-template>
+        <done-flag>trigger.dat</done-flag>
+      </dataset>
+    ```
 
-<verbatim>
-  <dataset name="logs" frequency="${coord:days(1)}"
-           initial-instance="2009-01-01T10:30Z" timezone="America/Los_Angeles">
-    <uri-template>
-      hdfs://foo:8020/usr/app/logs/${YEAR}/${MONTH}/${DAY}/data
-    </uri-template>
-  </dataset>
-</verbatim>
+    The dataset would resolve to the following URIs:
 
-The dataset would resolve to the following URIs:
 
-<verbatim>
-  hdfs://foo:8020/usr/app/logs/2009/01/01/data
-  hdfs://foo:8020/usr/app/logs/2009/01/02/data
-  hdfs://foo:8020/usr/app/logs/2009/01/03/data
-  ...
-</verbatim>
+    ```
+      hdfs://foo:8020/usr/app/stats/2009/01/data
+      hdfs://foo:8020/usr/app/stats/2009/04/data
+      hdfs://foo:8020/usr/app/stats/2009/07/data
+      ...
+    ```
 
-5. *However, if the URI template has a finer precision than the dataset frequency:*
+    The dataset instances are not ready until 'trigger.dat' exists in each path:
 
-<verbatim>
-  <dataset name="logs" frequency="${coord:days(1)}"
-           initial-instance="2009-01-01T10:30Z" timezone="America/Los_Angeles">
-    <uri-template>
-      hdfs://foo:8020/usr/app/logs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}/data
-    </uri-template>
-  </dataset>
-</verbatim>
 
-The dataset resolves to the following URIs with fixed values for the finer precision template variables:
+    ```
+      hdfs://foo:8020/usr/app/stats/2009/01/data/trigger.dat
+      hdfs://foo:8020/usr/app/stats/2009/04/data/trigger.dat
+      hdfs://foo:8020/usr/app/stats/2009/07/data/trigger.dat
+      ...
+    ```
 
-<verbatim>
-  hdfs://foo:8020/usr/app/logs/2009/01/01/10/30/data
-  hdfs://foo:8020/usr/app/logs/2009/01/02/10/30/data
-  hdfs://foo:8020/usr/app/logs/2009/01/03/10/30/data
-  ...
-</verbatim>
 
----+++ 5.2. Dataset URI-Template types
+4. **Normally the URI template of a dataset has a precision similar to the frequency:**
+
 
-Each dataset URI could be a HDFS path URI denoting a HDFS directory: hdfs://foo:8020/usr/logs/20090415 or a
-HCatalog partition URI identifying a set of table partitions: hcat://bar:8020/logsDB/logsTable/dt=20090415;region=US.
+    ```
+      <dataset name="logs" frequency="${coord:days(1)}"
+               initial-instance="2009-01-01T10:30Z" timezone="America/Los_Angeles">
+        <uri-template>
+          hdfs://foo:8020/usr/app/logs/${YEAR}/${MONTH}/${DAY}/data
+        </uri-template>
+      </dataset>
+    ```
+
+    The dataset would resolve to the following URIs:
+
+
+    ```
+      hdfs://foo:8020/usr/app/logs/2009/01/01/data
+      hdfs://foo:8020/usr/app/logs/2009/01/02/data
+      hdfs://foo:8020/usr/app/logs/2009/01/03/data
+      ...
+    ```
+
+5. **However, if the URI template has a finer precision than the dataset frequency:**
+
+
+    ```
+      <dataset name="logs" frequency="${coord:days(1)}"
+               initial-instance="2009-01-01T10:30Z" timezone="America/Los_Angeles">
+        <uri-template>
+          hdfs://foo:8020/usr/app/logs/${YEAR}/${MONTH}/${DAY}/${HOUR}/${MINUTE}/data
+        </uri-template>
+      </dataset>
+    ```
+
+    The dataset resolves to the following URIs with fixed values for the finer precision template variables:
+
+
+    ```
+      hdfs://foo:8020/usr/app/logs/2009/01/01/10/30/data
+      hdfs://foo:8020/usr/app/logs/2009/01/02/10/30/data
+      hdfs://foo:8020/usr/app/logs/2009/01/03/10/30/data
+      ...
+    ```
+
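The template resolution shown in the examples above can be sketched with a simple substitution of the EL time constants (whose values, as noted, are in UTC) plus user-supplied variables. This is an illustration, not Oozie's resolver:

```python
from datetime import datetime, timedelta

def resolve_uri(template, nominal, params):
    """Substitute ${YEAR}/${MONTH}/... time constants and user variables."""
    subs = dict(params,
                YEAR="%04d" % nominal.year, MONTH="%02d" % nominal.month,
                DAY="%02d" % nominal.day, HOUR="%02d" % nominal.hour,
                MINUTE="%02d" % nominal.minute)
    for key, value in subs.items():
        template = template.replace("${%s}" % key, value)
    return template

template = "hdfs://foo:8020/usr/app/logs/${YEAR}/${MONTH}/${DAY}/data"
start = datetime(2009, 1, 1)
for i in range(3):  # one instance per day, matching a ${coord:days(1)} frequency
    print(resolve_uri(template, start + timedelta(days=i), {}))
# hdfs://foo:8020/usr/app/logs/2009/01/01/data
# hdfs://foo:8020/usr/app/logs/2009/01/02/data
# hdfs://foo:8020/usr/app/logs/2009/01/03/data
```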
+### 5.2. Dataset URI-Template types
+
+Each dataset URI could be an HDFS path URI denoting an HDFS directory: `hdfs://foo:8020/usr/logs/20090415` or an
+HCatalog partition URI identifying a set of table partitions: `hcat://bar:8020/logsDB/logsTable/dt=20090415;region=US`.
 
 HCatalog enables table and storage management for Pig, Hive and MapReduce. The format to specify a HCatalog table partition URI is
-hcat://[metastore server]:[port]/[database name]/[table name]/[partkey1]=[value];[partkey2]=[value];...
+`hcat://[metastore server]:[port]/[database name]/[table name]/[partkey1]=[value];[partkey2]=[value];...`
 
 For example,
-<verbatim>
+
+```
   <dataset name="logs" frequency="${coord:days(1)}"
            initial-instance="2009-02-15T08:15Z" timezone="America/Los_Angeles">
     <uri-template>
@@ -815,19 +857,20 @@ For example,
     </uri-template>
     <done-flag></done-flag>
   </dataset>
-</verbatim>
+```
 
----+++ 5.3. Asynchronous Datasets
+### 5.3. Asynchronous Datasets
    * TBD
 
----+++ 5.4. Dataset Definitions
+### 5.4. Dataset Definitions
 
 Dataset definitions are grouped in XML files.
-*IMPORTANT:* Please note that if an XML namespace version is specified for the coordinator-app element in the coordinator.xml file, no namespace needs to be defined separately for the datasets element (even if the dataset is defined in a separate file). Specifying it at multiple places might result in xml errors while submitting the coordinator job.
+**IMPORTANT:** Please note that if an XML namespace version is specified for the coordinator-app element in the coordinator.xml file, no namespace needs to be defined separately for the datasets element (even if the dataset is defined in a separate file). Specifying it in multiple places might result in XML errors while submitting the coordinator job.
 
-*%PURPLE% Syntax: %ENDCOLOR%*
+**<font color="#800080">Syntax: </font>**
 
-<verbatim>
+
+```
  <!-- Synchronous datasets -->
 <datasets>
   <include>[SHARED_DATASETS]</include>
@@ -838,11 +881,12 @@ Dataset definitions are grouped in XML files.
   </dataset>
   ...
 </datasets>
-</verbatim>
+```
+
+**<font color="#008000"> Example: </font>**
 
-*%GREEN% Example: %ENDCOLOR%*
 
-<verbatim>
+```
 <datasets>
 .
   <include>hdfs://foo:8020/app/dataset-definitions/globallogs.xml</include>
@@ -860,74 +904,74 @@ Dataset definitions are grouped in XML files.
   </dataset>
 .
 </datasets>
-</verbatim>
+```
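
Since dataset definitions are plain XML, a client-side tool can inspect a datasets file with the standard library. The snippet below parses a trimmed, illustrative file (the names and frequency are made up for the example):

```python
import xml.etree.ElementTree as ET

# A trimmed, illustrative datasets file following the syntax above
datasets_xml = """
<datasets>
  <include>hdfs://foo:8020/app/dataset-definitions/globallogs.xml</include>
  <dataset name="logs" frequency="${coord:days(1)}"
           initial-instance="2009-02-15T08:15Z" timezone="America/Los_Angeles">
    <uri-template>hdfs://foo:8020/app/logs/${YEAR}${MONTH}/${DAY}/data</uri-template>
  </dataset>
</datasets>
"""
root = ET.fromstring(datasets_xml)
includes = [i.text for i in root.findall("include")]
names = [d.get("name") for d in root.findall("dataset")]
print(includes)  # ['hdfs://foo:8020/app/dataset-definitions/globallogs.xml']
print(names)     # ['logs']
```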
 
----++ 6. Coordinator Application
+## 6. Coordinator Application
 
----+++ 6.1. Concepts
+### 6.1. Concepts
 
----++++ 6.1.1. Coordinator Application
+#### 6.1.1. Coordinator Application
 
 A coordinator application is a program that triggers actions (commonly workflow jobs) when a set of conditions are met. Conditions can be a time frequency, the availability of new dataset instances or other external events.
 
 Types of coordinator applications:
 
-   * *Synchronous:* Its coordinator actions are created at specified time intervals.
+   * **Synchronous:** Its coordinator actions are created at specified time intervals.
 
 Coordinator applications are normally parameterized.
 
----++++ 6.1.2. Coordinator Job
+#### 6.1.2. Coordinator Job
 
 To create a coordinator job, a job configuration that resolves all coordinator application parameters must be provided to the coordinator engine.
 
 A coordinator job is a running instance of a coordinator application running from a start time to an end time. The start
 time must be earlier than the end time.
 
-At any time, a coordinator job is in one of the following status: *PREP, RUNNING, RUNNINGWITHERROR, PREPSUSPENDED, SUSPENDED, SUSPENDEDWITHERROR, PREPPAUSED, PAUSED, PAUSEDWITHERROR, SUCCEEDED, DONEWITHERROR, KILLED, FAILED*.
+At any time, a coordinator job is in one of the following statuses: **PREP, RUNNING, RUNNINGWITHERROR, PREPSUSPENDED, SUSPENDED, SUSPENDEDWITHERROR, PREPPAUSED, PAUSED, PAUSEDWITHERROR, SUCCEEDED, DONEWITHERROR, KILLED, FAILED**.
 
 Valid coordinator job status transitions are:
 
-   * *PREP --> PREPSUSPENDED | PREPPAUSED | RUNNING | KILLED*
-   * *RUNNING --> RUNNINGWITHERROR | SUSPENDED | PAUSED | SUCCEEDED | KILLED*
-   * *RUNNINGWITHERROR --> RUNNING | SUSPENDEDWITHERROR | PAUSEDWITHERROR | DONEWITHERROR | KILLED | FAILED*
-   * *PREPSUSPENDED --> PREP | KILLED*
-   * *SUSPENDED --> RUNNING | KILLED*
-   * *SUSPENDEDWITHERROR --> RUNNINGWITHERROR | KILLED*
-   * *PREPPAUSED --> PREP | KILLED*
-   * *PAUSED --> SUSPENDED | RUNNING | KILLED*
-   * *PAUSEDWITHERROR --> SUSPENDEDWITHERROR | RUNNINGWITHERROR | KILLED*
-   * *FAILED | KILLED --> IGNORED*
-   * *IGNORED --> RUNNING*
+   * **PREP --> PREPSUSPENDED | PREPPAUSED | RUNNING | KILLED**
+   * **RUNNING --> RUNNINGWITHERROR | SUSPENDED | PAUSED | SUCCEEDED | KILLED**
+   * **RUNNINGWITHERROR --> RUNNING | SUSPENDEDWITHERROR | PAUSEDWITHERROR | DONEWITHERROR | KILLED | FAILED**
+   * **PREPSUSPENDED --> PREP | KILLED**
+   * **SUSPENDED --> RUNNING | KILLED**
+   * **SUSPENDEDWITHERROR --> RUNNINGWITHERROR | KILLED**
+   * **PREPPAUSED --> PREP | KILLED**
+   * **PAUSED --> SUSPENDED | RUNNING | KILLED**
+   * **PAUSEDWITHERROR --> SUSPENDEDWITHERROR | RUNNINGWITHERROR | KILLED**
+   * **FAILED | KILLED --> IGNORED**
+   * **IGNORED --> RUNNING**
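
The transition list above can be transcribed directly into a lookup table, which is handy for validating state changes in client tooling (a sketch, not Oozie's internal state machine):

```python
# Valid coordinator job status transitions, transcribed from the list above
TRANSITIONS = {
    "PREP": {"PREPSUSPENDED", "PREPPAUSED", "RUNNING", "KILLED"},
    "RUNNING": {"RUNNINGWITHERROR", "SUSPENDED", "PAUSED", "SUCCEEDED", "KILLED"},
    "RUNNINGWITHERROR": {"RUNNING", "SUSPENDEDWITHERROR", "PAUSEDWITHERROR",
                         "DONEWITHERROR", "KILLED", "FAILED"},
    "PREPSUSPENDED": {"PREP", "KILLED"},
    "SUSPENDED": {"RUNNING", "KILLED"},
    "SUSPENDEDWITHERROR": {"RUNNINGWITHERROR", "KILLED"},
    "PREPPAUSED": {"PREP", "KILLED"},
    "PAUSED": {"SUSPENDED", "RUNNING", "KILLED"},
    "PAUSEDWITHERROR": {"SUSPENDEDWITHERROR", "RUNNINGWITHERROR", "KILLED"},
    "FAILED": {"IGNORED"},
    "KILLED": {"IGNORED"},
    "IGNORED": {"RUNNING"},
}

def can_transition(src, dst):
    """Return True when src --> dst is a valid coordinator job transition."""
    return dst in TRANSITIONS.get(src, set())

print(can_transition("PREP", "RUNNING"))      # True
print(can_transition("SUCCEEDED", "RUNNING")) # False: SUCCEEDED is terminal
```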
 
-When a coordinator job is submitted, oozie parses the coordinator job XML. Oozie then creates a record for the coordinator with status *PREP* and returns a unique ID. The coordinator is also started immediately if pause time is not set.
+When a coordinator job is submitted, oozie parses the coordinator job XML. Oozie then creates a record for the coordinator with status **PREP** and returns a unique ID. The coordinator is also started immediately if pause time is not set.
 
-When a user requests to suspend a coordinator job that is in *PREP* state, oozie puts the job in status *PREPSUSPENDED*. Similarly, when pause time reaches for a coordinator job with *PREP* status, oozie puts the job in status *PREPPAUSED*.
+When a user requests to suspend a coordinator job that is in **PREP** state, oozie puts the job in status **PREPSUSPENDED**. Similarly, when the pause time is reached for a coordinator job with **PREP** status, oozie puts the job in status **PREPPAUSED**.
 
-Conversely, when a user requests to resume a *PREPSUSPENDED* coordinator job, oozie puts the job in status *PREP*. And when pause time is reset for a coordinator job and job status is *PREPPAUSED*, oozie puts the job in status *PREP*.
+Conversely, when a user requests to resume a **PREPSUSPENDED** coordinator job, oozie puts the job in status **PREP**. And when pause time is reset for a coordinator job and job status is **PREPPAUSED**, oozie puts the job in status **PREP**.
 
-When a coordinator job starts, oozie puts the job in status *RUNNING* and start materializing workflow jobs based on job frequency. If any workflow job goes to *FAILED/KILLED/TIMEDOUT* state, the coordinator job is put in *RUNNINGWITHERROR*
+When a coordinator job starts, oozie puts the job in status **RUNNING** and starts materializing workflow jobs based on job frequency. If any workflow job goes to **FAILED/KILLED/TIMEDOUT** state, the coordinator job is put in **RUNNINGWITHERROR**.
 
-When a user requests to kill a coordinator job, oozie puts the job in status *KILLED* and it sends kill to all submitted workflow jobs.
+When a user requests to kill a coordinator job, oozie puts the job in status **KILLED** and it sends kill to all submitted workflow jobs.
 
-When a user requests to suspend a coordinator job that is in *RUNNING* status, oozie puts the job in status *SUSPENDED* and it suspends all submitted workflow jobs. Similarly, when a user requests to suspend a coordinator job that is in *RUNNINGWITHERROR* status, oozie puts the job in status *SUSPENDEDWITHERROR* and it suspends all submitted workflow jobs.
+When a user requests to suspend a coordinator job that is in **RUNNING** status, oozie puts the job in status **SUSPENDED** and it suspends all submitted workflow jobs. Similarly, when a user requests to suspend a coordinator job that is in **RUNNINGWITHERROR** status, oozie puts the job in status **SUSPENDEDWITHERROR** and it suspends all submitted workflow jobs.
 
-When pause time reaches for a coordinator job that is in *RUNNING* status, oozie puts the job in status *PAUSED*. Similarly, when pause time reaches for a coordinator job that is in *RUNNINGWITHERROR* status, oozie puts the job in status *PAUSEDWITHERROR*.
+When the pause time is reached for a coordinator job that is in **RUNNING** status, oozie puts the job in status **PAUSED**. Similarly, when the pause time is reached for a coordinator job that is in **RUNNINGWITHERROR** status, oozie puts the job in status **PAUSEDWITHERROR**.
 
-Conversely, when a user requests to resume a *SUSPENDED* coordinator job, oozie puts the job in status *RUNNING*. Also,  when a user requests to resume a *SUSPENDEDWITHERROR* coordinator job, oozie puts the job in status *RUNNINGWITHERROR*. And when pause time is reset for a coordinator job and job status is *PAUSED*, oozie puts the job in status *RUNNING*. Also, when the pause time is reset for a coordinator job and job status is *PAUSEDWITHERROR*, oozie puts the job in status *RUNNINGWITHERROR*
+Conversely, when a user requests to resume a **SUSPENDED** coordinator job, oozie puts the job in status **RUNNING**. Also, when a user requests to resume a **SUSPENDEDWITHERROR** coordinator job, oozie puts the job in status **RUNNINGWITHERROR**. And when pause time is reset for a coordinator job and job status is **PAUSED**, oozie puts the job in status **RUNNING**. Also, when the pause time is reset for a coordinator job and job status is **PAUSEDWITHERROR**, oozie puts the job in status **RUNNINGWITHERROR**.
 
-A coordinator job creates workflow jobs (commonly coordinator actions) only for the duration of the coordinator job and only if the coordinator job is in *RUNNING* status. If the coordinator job has been suspended, when resumed it will create all the coordinator actions that should have been created during the time it was suspended, actions will not be lost, they will delayed.
+A coordinator job creates workflow jobs (commonly coordinator actions) only for the duration of the coordinator job and only if the coordinator job is in **RUNNING** status. If the coordinator job has been suspended, when resumed it will create all the coordinator actions that should have been created during the time it was suspended; actions will not be lost, they will be delayed.
 
 When the coordinator job materialization finishes and all workflow jobs finish, oozie updates the coordinator status accordingly.
-For example, if all workflows are *SUCCEEDED*, oozie puts the coordinator job into *SUCCEEDED* status.
-If all workflows are *FAILED*, oozie puts the coordinator job into *FAILED* status. If all workflows are *KILLED*, the coordinator
-job status changes to KILLED. However, if any workflow job finishes with not *SUCCEEDED* and combination of *KILLED*, *FAILED* or
-*TIMEOUT*, oozie puts the coordinator job into *DONEWITHERROR*. If all coordinator actions are *TIMEDOUT*, oozie puts the
-coordinator job into *DONEWITHERROR*.
+For example, if all workflows are **SUCCEEDED**, oozie puts the coordinator job into **SUCCEEDED** status.
+If all workflows are **FAILED**, oozie puts the coordinator job into **FAILED** status. If all workflows are **KILLED**, the coordinator
+job status changes to **KILLED**. However, if the workflow jobs finish with a mix of terminal statuses that is not all **SUCCEEDED**
+(some combination of **KILLED**, **FAILED** or **TIMEDOUT**), oozie puts the coordinator job into **DONEWITHERROR**. If all coordinator
+actions are **TIMEDOUT**, oozie puts the coordinator job into **DONEWITHERROR**.
 
-A coordinator job in *FAILED* or *KILLED* status can be changed to *IGNORED* status. A coordinator job in *IGNORED* status can be changed to
- *RUNNING* status.
+A coordinator job in **FAILED** or **KILLED** status can be changed to **IGNORED** status. A coordinator job in **IGNORED** status can be changed to
+ **RUNNING** status.
 
----++++ 6.1.3. Coordinator Action
+#### 6.1.3. Coordinator Action
 
 A coordinator job creates and executes coordinator actions.
 
@@ -935,169 +979,171 @@ A coordinator action is normally a workflow job that consumes and produces datas
 
Once a coordinator action is created (this is also referred to as the action being materialized), the coordinator action will wait until all required inputs for execution are satisfied or until the wait times out.
 
----+++++ 6.1.3.1. Coordinator Action Creation (Materialization)
+##### 6.1.3.1. Coordinator Action Creation (Materialization)
 
 A coordinator job has one driver event that determines the creation (materialization) of its coordinator actions (typically a workflow job).
 
    * For synchronous coordinator jobs the driver event is the frequency of the coordinator job.
 
----+++++ 6.1.3.2. Coordinator Action Status
+##### 6.1.3.2. Coordinator Action Status
 
-Once a coordinator action has been created (materialized) the coordinator action qualifies for execution. At this point, the action status is *WAITING*.
+Once a coordinator action has been created (materialized), the coordinator action qualifies for execution. At this point, the action status is **WAITING**.
 
-A coordinator action in *WAITING* status must wait until all its input events are available before is ready for execution. When a coordinator action is ready for execution its status is *READY*.
+A coordinator action in **WAITING** status must wait until all its input events are available before it is ready for execution. When a coordinator action is ready for execution, its status is **READY**.
 
-A coordinator action in *WAITING* status may timeout before it becomes ready for execution. Then the action status is *TIMEDOUT*.
+A coordinator action in **WAITING** status may time out before it becomes ready for execution. Then the action status is **TIMEDOUT**.
 
-A coordinator action may remain in *READY* status for a while, without starting execution, due to the concurrency execution policies of the coordinator job.
+A coordinator action may remain in **READY** status for a while, without starting execution, due to the concurrency execution policies of the coordinator job.
 
-A coordinator action in *READY* or *WAITING* status changes to *SKIPPED* status if the execution strategy is LAST_ONLY and the
+A coordinator action in **READY** or **WAITING** status changes to **SKIPPED** status if the execution strategy is LAST_ONLY and the
 current time is past the next action's nominal time.  See section 6.3 for more details.
 
-A coordinator action in *READY* or *WAITING* status changes to *SKIPPED* status if the execution strategy is NONE and the
+A coordinator action in **READY** or **WAITING** status changes to **SKIPPED** status if the execution strategy is NONE and the
 current time is past the action's nominal time + 1 minute.  See section 6.3 for more details.
 
-A coordinator action in *READY* status changes to *SUBMITTED* status if total current *RUNNING* and *SUBMITTED* actions are less than concurrency execution limit.
+A coordinator action in **READY** status changes to **SUBMITTED** status if the total number of current **RUNNING** and **SUBMITTED** actions is less than the concurrency execution limit.
 
-A coordinator action in *SUBMITTED* status changes to *RUNNING* status when the workflow engine start execution of the coordinator action.
+A coordinator action in **SUBMITTED** status changes to **RUNNING** status when the workflow engine starts execution of the coordinator action.
 
-A coordinator action is in *RUNNING* status until the associated workflow job completes its execution. Depending on the workflow job completion status, the coordinator action will be in *SUCCEEDED*, *KILLED* or *FAILED* status.
+A coordinator action is in **RUNNING** status until the associated workflow job completes its execution. Depending on the workflow job completion status, the coordinator action will be in **SUCCEEDED**, **KILLED** or **FAILED** status.
 
-A coordinator action in *WAITING*, *READY*, *SUBMITTED* or *RUNNING* status can be killed, changing to *KILLED* status.
+A coordinator action in **WAITING**, **READY**, **SUBMITTED** or **RUNNING** status can be killed, changing to **KILLED** status.
 
-A coordinator action in *SUBMITTED* or *RUNNING* status can also fail, changing to *FAILED* status.
+A coordinator action in **SUBMITTED** or **RUNNING** status can also fail, changing to **FAILED** status.
 
-A coordinator action in *FAILED*, *KILLED*, or *TIMEDOUT* status can be changed to *IGNORED* status. A coordinator action in *IGNORED* status can be
- rerun, changing to *WAITING* status.
+A coordinator action in **FAILED**, **KILLED**, or **TIMEDOUT** status can be changed to **IGNORED** status. A coordinator action in **IGNORED** status can be
+ rerun, changing to **WAITING** status.
 
 Valid coordinator action status transitions are:
 
-   * *WAITING --> READY | TIMEDOUT | SKIPPED | KILLED*
-   * *READY --> SUBMITTED | SKIPPED | KILLED*
-   * *SUBMITTED --> RUNNING | KILLED | FAILED*
-   * *RUNNING --> SUCCEEDED | KILLED | FAILED*
-   * *FAILED | KILLED | TIMEDOUT --> IGNORED*
-   * *IGNORED --> WAITING*
+   * **WAITING --> READY | TIMEDOUT | SKIPPED | KILLED**
+   * **READY --> SUBMITTED | SKIPPED | KILLED**
+   * **SUBMITTED --> RUNNING | KILLED | FAILED**
+   * **RUNNING --> SUCCEEDED | KILLED | FAILED**
+   * **FAILED | KILLED | TIMEDOUT --> IGNORED**
+   * **IGNORED --> WAITING**
 
----++++ 6.1.4. Input Events
+#### 6.1.4. Input Events
 
The input events of a coordinator application specify the input conditions that are required in order to execute a coordinator action.
 
In the current specification, input events are restricted to the availability of dataset instances.
 
-All the datasets instances defined as input events must be available for the coordinator action to be ready for execution ( *READY* status).
+All the dataset instances defined as input events must be available for the coordinator action to be ready for execution (**READY** status).
 
 Input events are normally parameterized. For example, the last 24 hourly instances of the 'searchlogs' dataset.
 
Input events can refer to multiple instances of multiple datasets. For example, the last 24 hourly instances of the 'searchlogs' dataset and the last weekly instance of the 'celebrityRumours' dataset.
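+
+For example, an input event requesting the last 24 hourly instances of a hypothetical `logs` dataset could be written as follows (the `data-in` element is specified in section 6.3):
+
+```xml
+<input-events>
+    <!-- the action stays in WAITING status until all 24 instances exist -->
+    <data-in name="logsInput" dataset="logs">
+        <start-instance>${coord:current(-23)}</start-instance>
+        <end-instance>${coord:current(0)}</end-instance>
+    </data-in>
+</input-events>
+```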
 
----++++ 6.1.5. Output Events
+#### 6.1.5. Output Events
 
 A coordinator action can produce one or more dataset(s) instances as output.
 
Dataset instances produced as output by one coordinator action may be consumed as input by coordinator action(s) of other coordinator job(s).
 
-The chaining of coordinator jobs via the datasets they produce and consume is referred as a *data pipeline.*
+The chaining of coordinator jobs via the datasets they produce and consume is referred to as a **data pipeline**.
 
 In the current specification coordinator job output events are restricted to dataset instances.
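+
+For example, a coordinator action producing the instance of a hypothetical `dailyStats` dataset corresponding to its own nominal time could declare (the `data-out` element is specified in section 6.3):
+
+```xml
+<output-events>
+    <!-- this action produces the dataset instance for its own nominal time -->
+    <data-out name="dailyOutput" dataset="dailyStats">
+        <instance>${coord:current(0)}</instance>
+    </data-out>
+</output-events>
+```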
 
----++++ 6.1.6. Coordinator Action Execution Policies
+#### 6.1.6. Coordinator Action Execution Policies
 
 The execution policies for the actions of a coordinator job can be defined in the coordinator application.
 
   * Timeout: A coordinator job can specify the timeout for its coordinator actions, that is, how long the coordinator action will be in *WAITING* or *READY* status before giving up on its execution.
-   * Concurrency: A coordinator job can specify the concurrency for its coordinator actions, this is, how many coordinator actions are allowed to run concurrently ( *RUNNING* status) before the coordinator engine starts throttling them.
+   * Concurrency: A coordinator job can specify the concurrency for its coordinator actions, that is, how many coordinator actions are allowed to run concurrently (**RUNNING** status) before the coordinator engine starts throttling them.
   * Execution strategy: A coordinator job can specify the execution strategy of its coordinator actions when there is a backlog of coordinator actions in the coordinator engine. The different execution strategies are 'oldest first', 'newest first', 'none' and 'last one only'. A backlog normally happens because of delayed input data, concurrency control or manual re-runs of coordinator jobs.
   * Throttle: A coordinator job can specify the materialization or creation throttle value for its coordinator actions, that is, the maximum number of coordinator actions allowed to be in WAITING state concurrently.
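+
+These policies map onto the optional `controls` element of the coordinator XML, described in section 6.3. A minimal sketch, with illustrative values:
+
+```xml
+<coordinator-app name="my-coord" frequency="${coord:hours(1)}"
+                 start="2009-01-02T00:00Z" end="2009-01-04T00:00Z" timezone="UTC"
+                 xmlns="uri:oozie:coordinator:0.4">
+    <controls>
+        <timeout>10</timeout>           <!-- discard a WAITING action after 10 minutes -->
+        <concurrency>2</concurrency>    <!-- at most 2 actions in RUNNING status at once -->
+        <execution>FIFO</execution>     <!-- run backlogged READY actions oldest first -->
+        <throttle>5</throttle>          <!-- at most 5 actions in WAITING status at once -->
+    </controls>
+    <!-- datasets, input-events, output-events and action elements omitted -->
+</coordinator-app>
+```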
 
----++++ 6.1.7. Data Pipeline Application
+#### 6.1.7. Data Pipeline Application
 
 Commonly, multiple workflow applications are chained together to form a more complex application.
 
-Workflow applications are run on regular basis, each of one of them at their own frequency. The data consumed and produced by these workflow applications is relative to the nominal time of workflow job that is processing the data. This is a *coordinator application*.
+Workflow applications are run on a regular basis, each of them at its own frequency. The data consumed and produced by these workflow applications is relative to the nominal time of the workflow job that is processing the data. This is a **coordinator application**.
 
-The output of multiple workflow jobs of a single workflow application is then consumed by a single workflow job of another workflow application, this is done on regular basis as well. These workflow jobs are triggered by recurrent actions of coordinator jobs. This is a set of *coordinator jobs* that inter-depend on each other via the data they produce and consume.
+The output of multiple workflow jobs of a single workflow application is then consumed by a single workflow job of another workflow application; this is done on a regular basis as well. These workflow jobs are triggered by recurrent actions of coordinator jobs. This is a set of **coordinator jobs** that inter-depend on each other via the data they produce and consume.
 
-This set of interdependent *coordinator applications* is referred as a *data pipeline application*.
+This set of interdependent **coordinator applications** is referred to as a **data pipeline application**.
 
----+++ 6.2. Synchronous Coordinator Application Example
+### 6.2. Synchronous Coordinator Application Example
 
-   * The =checkouts= synchronous dataset is created every 15 minutes by an online checkout store.
-   * The =hourlyRevenue= synchronous dataset is created every hour and contains the hourly revenue.
-   * The =dailyRevenue= synchronous dataset is created every day and contains the daily revenue.
-   * The =monthlyRevenue= synchronous dataset is created every month and contains the monthly revenue.
+   * The `checkouts` synchronous dataset is created every 15 minutes by an online checkout store.
+   * The `hourlyRevenue` synchronous dataset is created every hour and contains the hourly revenue.
+   * The `dailyRevenue` synchronous dataset is created every day and contains the daily revenue.
+   * The `monthlyRevenue` synchronous dataset is created every month and contains the monthly revenue.
 
-   * The =revenueCalculator-wf= workflow consumes checkout data and produces as output the corresponding revenue.
-   * The =rollUpRevenue-wf= workflow consumes revenue data and produces a consolidated output.
+   * The `revenueCalculator-wf` workflow consumes checkout data and produces as output the corresponding revenue.
+   * The `rollUpRevenue-wf` workflow consumes revenue data and produces a consolidated output.
 
-   * The =hourlyRevenue-coord= coordinator job triggers, every hour, a =revenueCalculator-wf= workflow. It specifies as input the last 4 =checkouts= dataset instances and it specifies as output a new instance of the =hourlyRevenue= dataset.
-   * The =dailyRollUpRevenue-coord= coordinator job triggers, every day, a =rollUpRevenue-wf= workflow. It specifies as input the last 24 =hourlyRevenue= dataset instances and it specifies as output a new instance of the =dailyRevenue= dataset.
-   * The =monthlyRollUpRevenue-coord= coordinator job triggers, once a month, a =rollUpRevenue-wf= workflow. It specifies as input all the =dailyRevenue= dataset instance of the month and it specifies as output a new instance of the =monthlyRevenue= dataset.
+   * The `hourlyRevenue-coord` coordinator job triggers, every hour, a `revenueCalculator-wf` workflow. It specifies as input the last 4 `checkouts` dataset instances and it specifies as output a new instance of the `hourlyRevenue` dataset.
+   * The `dailyRollUpRevenue-coord` coordinator job triggers, every day, a `rollUpRevenue-wf` workflow. It specifies as input the last 24 `hourlyRevenue` dataset instances and it specifies as output a new instance of the `dailyRevenue` dataset.
+   * The `monthlyRollUpRevenue-coord` coordinator job triggers, once a month, a `rollUpRevenue-wf` workflow. It specifies as input all the `dailyRevenue` dataset instances of the month and it specifies as output a new instance of the `monthlyRevenue` dataset.
 
This example describes all the components that make up a data pipeline: datasets, coordinator jobs and coordinator actions (workflows).
 
The coordinator actions (the workflows) are completely agnostic of datasets and their frequencies; they just use them as input and output data (i.e. HDFS files or directories). Furthermore, as the example shows, the same workflow can be used to process similar datasets of different frequencies.
 
-The frequency of the =hourlyRevenue-coord= coordinator job is 1 hour, this means that every hour a coordinator action is created. A coordinator action will be executed only when the 4 =checkouts= dataset instances for the corresponding last hour are available, until then the coordinator action will remain as created (materialized), in *WAITING* status. Once the 4 dataset instances for the corresponding last hour are available, the coordinator action will be executed and it will start a =revenueCalculator-wf= workflow job.
+The frequency of the `hourlyRevenue-coord` coordinator job is 1 hour; this means that every hour a coordinator action is created. A coordinator action will be executed only when the 4 `checkouts` dataset instances for the corresponding last hour are available; until then the coordinator action will remain as created (materialized), in **WAITING** status. Once the 4 dataset instances for the corresponding last hour are available, the coordinator action will be executed and it will start a `revenueCalculator-wf` workflow job.
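+
+A sketch of what the `hourlyRevenue-coord` definition could look like (dates, HDFS paths and the workflow property names are illustrative only; the elements are specified in section 6.3):
+
+```xml
+<coordinator-app name="hourlyRevenue-coord" frequency="${coord:hours(1)}"
+                 start="2009-01-01T01:00Z" end="2009-12-31T23:00Z" timezone="UTC"
+                 xmlns="uri:oozie:coordinator:0.4">
+    <datasets>
+        <dataset name="checkouts" frequency="${coord:minutes(15)}"
+                 initial-instance="2009-01-01T00:00Z" timezone="UTC">
+            <uri-template>hdfs://bar:8020/app/checkouts/${YEAR}${MONTH}${DAY}${HOUR}${MINUTE}</uri-template>
+        </dataset>
+        <dataset name="hourlyRevenue" frequency="${coord:hours(1)}"
+                 initial-instance="2009-01-01T01:00Z" timezone="UTC">
+            <uri-template>hdfs://bar:8020/app/revenue/${YEAR}${MONTH}${DAY}${HOUR}</uri-template>
+        </dataset>
+    </datasets>
+    <input-events>
+        <!-- the last 4 checkouts instances: the action waits until all 4 exist -->
+        <data-in name="checkoutInput" dataset="checkouts">
+            <start-instance>${coord:current(-3)}</start-instance>
+            <end-instance>${coord:current(0)}</end-instance>
+        </data-in>
+    </input-events>
+    <output-events>
+        <!-- the hourlyRevenue instance for this action's nominal time -->
+        <data-out name="revenueOutput" dataset="hourlyRevenue">
+            <instance>${coord:current(0)}</instance>
+        </data-out>
+    </output-events>
+    <action>
+        <workflow>
+            <app-path>hdfs://bar:8020/app/revenueCalculator-wf</app-path>
+            <configuration>
+                <property>
+                    <name>wfInput</name>
+                    <value>${coord:dataIn('checkoutInput')}</value>
+                </property>
+                <property>
+                    <name>wfOutput</name>
+                    <value>${coord:dataOut('revenueOutput')}</value>
+                </property>
+            </configuration>
+        </workflow>
+    </action>
+</coordinator-app>
+```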
 
----+++ 6.3. Synchronous Coordinator Application Definition
+### 6.3. Synchronous Coordinator Application Definition
 
A synchronous coordinator application is defined by a name, start time and end time, the frequency of creation of its coordinator actions, the input events, the output events and action control information:
 
-   * *%BLUE% start: %ENDCOLOR%* The start datetime for the job. Starting at this time actions will be materialized. Refer to section #3 'Datetime Representation' for syntax details.
-   * *%BLUE% end: %ENDCOLOR%* The end datetime for the job. When actions will stop being materialized. Refer to section #3 'Datetime Representation' for syntax details.
-   * *%BLUE% timezone:%ENDCOLOR%* The timezone of the coordinator application.
-   * *%BLUE% frequency: %ENDCOLOR%* The frequency, in minutes, to materialize actions. Refer to section #4 'Time Interval Representation' for syntax details.
+   * **<font color="#0000ff"> start: </font>** The start datetime for the job. Starting at this time actions will be materialized. Refer to section #3 'Datetime Representation' for syntax details.
+   * **<font color="#0000ff"> end: </font>** The end datetime for the job. When actions will stop being materialized. Refer to section #3 'Datetime Representation' for syntax details.
+   * **<font color="#0000ff"> timezone:</font>** The timezone of the coordinator application.
+   * **<font color="#0000ff"> frequency: </font>** The frequency, in minutes, to materialize actions. Refer to section #4 'Time Interval Representation' for syntax details.
    * Control information:
-      * *%BLUE% timeout: %ENDCOLOR%* The maximum time, in minutes, that a materialized action will be waiting for the additional conditions to be satisfied before being discarded. A timeout of =0= indicates that at the time of materialization all the other conditions must be satisfied, else the action will be discarded. A timeout of =0= indicates that if all the input events are not satisfied at the time of action materialization, the action should timeout immediately. A timeout of =-1= indicates no timeout, the materialized action will wait forever for the other conditions to be satisfied. The default value is =-1=. The timeout can only cause a =WAITING= action to transition to =TIMEDOUT=; once the data dependency is satisified, a =WAITING= action transitions to =READY=, and the timeout no longer has any affect, even if the action hasn't transitioned to =SUBMITTED= or =RUNNING= when it expires.
-      * *%BLUE% concurrency: %ENDCOLOR%* The maximum number of actions for this job that can be running at the same time. This value allows to materialize and submit multiple instances of the coordinator app, and allows operations to catchup on delayed processing. The default value is =1=.
-      * *%BLUE% execution: %ENDCOLOR%* Specifies the execution order if multiple instances of the coordinator job have satisfied their execution criteria. Valid values are:
-         * =FIFO= (oldest first) *default*.
-         * =LIFO= (newest first).
-         * =LAST_ONLY= (see explanation below).
-         * =NONE= (see explanation below).
-      * *%BLUE% throttle: %ENDCOLOR%* The maximum coordinator actions are allowed to be in WAITING state concurrently. The default value is =12=.
-   * *%BLUE% datasets: %ENDCOLOR%* The datasets coordinator application uses.
-   * *%BLUE% input-events: %ENDCOLOR%* The coordinator job input events.
-      * *%BLUE% data-in: %ENDCOLOR%* It defines one job input condition that resolves to one or more instances of a dataset.
-         * *%BLUE% name: %ENDCOLOR%* input condition name.
-         * *%BLUE% dataset: %ENDCOLOR%* dataset name.
-         * *%BLUE% instance: %ENDCOLOR%* refers to a single dataset instance (the time for a synchronous dataset).
-         * *%BLUE% start-instance: %ENDCOLOR%* refers to the beginning of an instance range (the time for a synchronous dataset).
-         * *%BLUE% end-instance: %ENDCOLOR%* refers to the end of an instance range (the time for a synchronous dataset).
-   * *%BLUE% output-events: %ENDCOLOR%* The coordinator job output events.
-      * *%BLUE% data-out: %ENDCOLOR%* It defines one job output that resolves to a dataset instance.
-         * *%BLUE% name: %ENDCOLOR%* output name.
-         * *%BLUE% dataset: %ENDCOLOR%* dataset name.
-         * *%BLUE% instance: %ENDCOLOR%* dataset instance that will be generated by coordinator action.
-         * *%BLUE% nocleanup: %ENDCOLOR%* disable cleanup of the output dataset in rerun if true, even when nocleanup option is not used in CLI command.
-   * *%BLUE% action: %ENDCOLOR%* The coordinator action to execute.
-      * *%BLUE% workflow: %ENDCOLOR%* The workflow job invocation. Workflow job properties can refer to the defined data-in and data-out elements.
-
-*LAST_ONLY:* While =FIFO= and =LIFO= simply specify the order in which READY actions should be executed, =LAST_ONLY= can actually
-cause some actions to be SKIPPED and is a little harder to understand.  When =LAST_ONLY= is set, an action that is =WAITING=
-or =READY= will be =SKIPPED= when the current time is past the next action's nominal time.  For example, suppose action 1 and 2
-are both =READY=, the current time is 5:00pm, and action 2's nominal time is 5:10pm.  In 10 minutes from now, at 5:10pm, action 1
-will become SKIPPED, assuming it doesn't transition to =SUBMITTED= (or a terminal state) before then.  This sounds similar to the
+      * **<font color="#0000ff"> timeout: </font>** The maximum time, in minutes, that a materialized action will be waiting for the additional conditions to be satisfied before being discarded. A timeout of `0` indicates that if all the input events are not satisfied at the time of action materialization, the action should time out immediately. A timeout of `-1` indicates no timeout; the materialized action will wait forever for the other conditions to be satisfied. The default value is `-1`. The t

<TRUNCATED>