Posted to commits@nifi.apache.org by al...@apache.org on 2017/01/26 03:05:35 UTC

nifi git commit: NIFI-3392 Enhanced documentation for provenance event type definitions.

Repository: nifi
Updated Branches:
  refs/heads/master 506709922 -> a1ecea360


NIFI-3392 Enhanced documentation for provenance event type definitions.

This closes #1445.

Signed-off-by: Andy LoPresto <al...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/nifi/repo
Commit: http://git-wip-us.apache.org/repos/asf/nifi/commit/a1ecea36
Tree: http://git-wip-us.apache.org/repos/asf/nifi/tree/a1ecea36
Diff: http://git-wip-us.apache.org/repos/asf/nifi/diff/a1ecea36

Branch: refs/heads/master
Commit: a1ecea3600b1dda8a9505aa1cca3caf31548c6c5
Parents: 5067099
Author: Andrew Lim <an...@gmail.com>
Authored: Wed Jan 25 11:44:40 2017 -0500
Committer: Andy LoPresto <al...@apache.org>
Committed: Wed Jan 25 19:04:37 2017 -0800

----------------------------------------------------------------------
 .../src/main/asciidoc/developer-guide.adoc      | 29 ++++++++++++++--
 nifi-docs/src/main/asciidoc/user-guide.adoc     | 36 ++++++++++++++++----
 2 files changed, 56 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/nifi/blob/a1ecea36/nifi-docs/src/main/asciidoc/developer-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/developer-guide.adoc b/nifi-docs/src/main/asciidoc/developer-guide.adoc
index bb182e4..4dbe649 100644
--- a/nifi-docs/src/main/asciidoc/developer-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/developer-guide.adoc
@@ -152,7 +152,7 @@ The ProcessSession, often referred to as simply a "session," provides
 a mechanism by which FlowFiles can be created, destroyed, examined, cloned, and transferred to other
 Processors. Additionally, a ProcessSession provides mechanism for creating modified versions of
 FlowFiles, by adding or removing attributes, or by modifying the FlowFile's content. The ProcessSession
-also exposes a mechanism for emitting provenance events that provide for the ability to track the
+also exposes a mechanism for emitting <<provenance_events>> that provide for the ability to track the
 lineage and history of a FlowFile. After operations are performed on one or more FlowFiles, a
 ProcessSession can be either committed or rolled back.
 
@@ -680,7 +680,7 @@ ATTRIBUTES_MODIFIED event, the framework will emit a CONTENT_MODIFIED
 event. The framework will not emit an ATTRIBUTES_MODIFIED event if any
 other event is emitted for that FlowFile (either by the
 Processor or the framework). This is due to the fact that all
-Provenance Events know about the attributes of the FlowFile before the
+<<provenance_events>> know about the attributes of the FlowFile before the
 event occurred as well as those attributes that occurred as a result
 of the processing of that FlowFile, and as a result the
 ATTRIBUTES_MODIFIED is generally considered redundant and would result
@@ -848,6 +848,31 @@ Because this documentation is in an HTML format, you may include images and tabl
 to best describe this component.  The same methods can be used to provide advanced
 documentation for Processors, ControllerServices and ReportingTasks.
 
+[[provenance_events]]
+== Provenance Events
+
+The different event types for provenance reporting are:
+
+[options="header"]
+|======================
+|Provenance Event        |Description
+|ADDINFO                 |Indicates a provenance event for adding additional information such as new linkage to a new URI or UUID
+|ATTRIBUTES_MODIFIED     |Indicates that a FlowFile's attributes were modified in some way. This event is not needed when another event is reported at the same time, as the other event will already contain all FlowFile attributes
+|CLONE                   |Indicates that a FlowFile is an exact duplicate of its parent FlowFile
+|CONTENT_MODIFIED        |Indicates that a FlowFile's content was modified in some way. When using this Event Type, it is advisable to provide details about how the content is modified
+|CREATE                  |Indicates that a FlowFile was generated from data that was not received from a remote system or external process
+|DOWNLOAD                |Indicates that the contents of a FlowFile were downloaded by a user or external entity
+|DROP                    |Indicates a provenance event for the conclusion of an object's life for some reason other than object expiration
+|EXPIRE                  |Indicates a provenance event for the conclusion of an object's life due to the object not being processed in a timely manner
+|FETCH                   |Indicates that the contents of a FlowFile were overwritten using the contents of some external resource. This is similar to the RECEIVE event but varies in that RECEIVE events are intended to be used as the event that introduces the FlowFile into the system, whereas FETCH is used to indicate that the contents of an existing FlowFile were overwritten
+|FORK                    |Indicates that one or more FlowFiles were derived from a parent FlowFile
+|JOIN                    |Indicates that a single FlowFile is derived from joining together multiple parent FlowFiles
+|RECEIVE                 |Indicates a provenance event for receiving data from an external process. This Event Type is expected to be the first event for a FlowFile. As such, a Processor that receives data from an external source and uses that data to replace the content of an existing FlowFile should use the FETCH event type, rather than the RECEIVE event type
+|REPLAY                  |Indicates a provenance event for replaying a FlowFile. The UUID of the event indicates the UUID of the original FlowFile that is being replayed. The event contains one Parent UUID that is also the UUID of the FlowFile that is being replayed and one Child UUID that is the UUID of a newly created FlowFile that will be re-queued for processing
+|ROUTE                   |Indicates that a FlowFile was routed to a specified relationship and provides information about why the FlowFile was routed to this relationship
+|SEND                    |Indicates a provenance event for sending data to an external process
+|UNKNOWN                 |Indicates that the type of provenance event is unknown because the user who is attempting to access the event is not authorized to know the type
+|======================
 
 
 == Common Processor Patterns
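
Reviewer note on the developer-guide table: the RECEIVE/FETCH/CREATE and DROP/EXPIRE distinctions it describes can be read as a small decision rule. The sketch below is only an illustration of that rule under stated assumptions. The class, enum, and method names are hypothetical and are not part of the NiFi API, and the enum mirrors only a subset of the event types listed in the table.

```java
// Hypothetical, self-contained sketch. None of these names come from NiFi;
// they only restate the table's descriptions as a decision rule.
final class ProvenanceEventChoice {

    // Subset of the event types described in the table.
    enum EventType { CREATE, CONTENT_MODIFIED, RECEIVE, FETCH, DROP, EXPIRE }

    // Chooses an event type for content that ends up in a FlowFile.
    // fromExternalSource: the data came from a remote system or external process.
    // flowFileAlreadyExists: the content replaces that of an existing FlowFile.
    static EventType forIncomingContent(boolean fromExternalSource, boolean flowFileAlreadyExists) {
        if (fromExternalSource) {
            // RECEIVE introduces a FlowFile into the system; FETCH overwrites an existing one.
            return flowFileAlreadyExists ? EventType.FETCH : EventType.RECEIVE;
        }
        // Data that was not received from outside: new FlowFile or modified content.
        return flowFileAlreadyExists ? EventType.CONTENT_MODIFIED : EventType.CREATE;
    }

    // Chooses an event type for the end of an object's life.
    // timedOut: the object was not processed in a timely manner.
    static EventType forEndOfLife(boolean timedOut) {
        return timedOut ? EventType.EXPIRE : EventType.DROP;
    }

    public static void main(String[] args) {
        System.out.println(forIncomingContent(true, false));
        System.out.println(forIncomingContent(true, true));
        System.out.println(forEndOfLife(true));
    }
}
```

In an actual Processor the event would be reported through the session's provenance reporting mechanism rather than returned as a value; the sketch covers only the choice of type.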

http://git-wip-us.apache.org/repos/asf/nifi/blob/a1ecea36/nifi-docs/src/main/asciidoc/user-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/user-guide.adoc b/nifi-docs/src/main/asciidoc/user-guide.adoc
index d8b835b..2fe4374 100644
--- a/nifi-docs/src/main/asciidoc/user-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/user-guide.adoc
@@ -294,12 +294,12 @@ image::nifi-processor-menu.png["Processor Menu"]
 
 While the options available from the context menu vary, the following options are typically available when you have full privileges to work with a Processor:
 
-- *Configure*: This option allows the user to establish or change the configuration of the Processor. (See <<Configuring_a_Processor>>.)
+- *Configure*: This option allows the user to establish or change the configuration of the Processor (see <<Configuring_a_Processor>>).
 - *Start* or *Stop*: This option allows the user to start or stop a Processor; the option will be either Start or Stop, depending on the current state of the Processor.
 - *Status History*: This option opens a graphical representation of the Processor's statistical information over time.
 - *Upstream connections*: This option allows the user to see and "jump to" upstream connections that are coming into the Processor. This is particularly useful when processors connect into and out of other Process Groups.
 - *Downstream connections*: This option allows the user to see and "jump to" downstream connections that are going out of the Processor. This is particularly useful when processors connect into and out of other Process Groups.
-- *Data provenance*: This option displays the NiFi Data Provenance table, with information about  data provenance events for the FlowFiles routed through that Processor
+- *Data provenance*: This option displays the NiFi Data Provenance table, with information about data provenance events for the FlowFiles routed through that Processor (see <<data_provenance>>).
 - *Usage*: This option takes the user to the Processor's usage documentation.
 - *Change color*: This option allows the user to change the color of the Processor, which can make the visual management of large flows easier.
 - *Center in view*: This option centers the view of the canvas on the given Processor.
@@ -1670,10 +1670,7 @@ image:iconDelete.png["Delete"]
 ). This will prompt for confirmation. After confirming the deletion, the Template will be removed from this table
 and will no longer be available to add to the canvas.
 
-
-
-
-
+[[data_provenance]]
 == Data Provenance
 While monitoring a dataflow, users often need a way to determine what happened to a particular data object (FlowFile).
 NiFi's Data Provenance page provides that information. Because NiFi records and indexes data provenance details
@@ -1690,11 +1687,36 @@ replay data at any point within the dataflow, and see a graphical representation
 
 image:provenance-annotated.png["Provenance Table"]
 
-Each point in a dataflow where a FlowFile is processed in some way is considered a "processing event". Various types of processing
+[[provenance_events]]
+=== Provenance Events
+
+Each point in a dataflow where a FlowFile is processed in some way is considered a 'provenance event'. Various types of provenance
 events occur, depending on the dataflow design. For example, when data is brought into the flow, a RECEIVE event occurs, and when
 data is sent out of the flow, a SEND event occurs. Other types of processing events may occur, such as if the data is cloned (CLONE event), routed (ROUTE event), modified (CONTENT_MODIFIED or ATTRIBUTES_MODIFIED event),
 split (FORK event), combined with other data objects (JOIN event), and ultimately removed from the flow (DROP event).
 
+The provenance event types are:
+
+[options="header"]
+|======================
+|Provenance Event        |Description
+|ADDINFO                 |Indicates a provenance event when additional information such as a new linkage to a new URI or UUID is added
+|ATTRIBUTES_MODIFIED     |Indicates that a FlowFile's attributes were modified in some way
+|CLONE                   |Indicates that a FlowFile is an exact duplicate of its parent FlowFile
+|CONTENT_MODIFIED        |Indicates that a FlowFile's content was modified in some way
+|CREATE                  |Indicates that a FlowFile was generated from data that was not received from a remote system or external process
+|DOWNLOAD                |Indicates that the contents of a FlowFile were downloaded by a user or external entity
+|DROP                    |Indicates a provenance event for the conclusion of an object's life for some reason other than object expiration
+|EXPIRE                  |Indicates a provenance event for the conclusion of an object's life due to the object not being processed in a timely manner
+|FETCH                   |Indicates that the contents of a FlowFile were overwritten using the contents of some external resource
+|FORK                    |Indicates that one or more FlowFiles were derived from a parent FlowFile
+|JOIN                    |Indicates that a single FlowFile is derived from joining together multiple parent FlowFiles
+|RECEIVE                 |Indicates a provenance event for receiving data from an external process
+|REPLAY                  |Indicates a provenance event for replaying a FlowFile
+|ROUTE                   |Indicates that a FlowFile was routed to a specified relationship and provides information about why the FlowFile was routed to this relationship
+|SEND                    |Indicates a provenance event for sending data to an external process
+|UNKNOWN                 |Indicates that the type of provenance event is unknown because the user who is attempting to access the event is not authorized to know the type
+|======================
 
 === Searching for Events
 One of the most common tasks performed in the Data Provenance page is a search for a given FlowFile to determine what happened to it. To do this,