Posted to commits@nifi.apache.org by sc...@apache.org on 2017/08/13 14:08:16 UTC

nifi git commit: NIFI-4276 Add Write Ahead Provenance section to User Guide

Repository: nifi
Updated Branches:
  refs/heads/master 451f9cf12 -> bac175a00


NIFI-4276 Add Write Ahead Provenance section to User Guide

Signed-off-by: Scott Aslan <sc...@gmail.com>

This closes #2074


Project: http://git-wip-us.apache.org/repos/asf/nifi/repo
Commit: http://git-wip-us.apache.org/repos/asf/nifi/commit/bac175a0
Tree: http://git-wip-us.apache.org/repos/asf/nifi/tree/bac175a0
Diff: http://git-wip-us.apache.org/repos/asf/nifi/diff/bac175a0

Branch: refs/heads/master
Commit: bac175a00f47fd3e9cdf45e69d80ffa3c2e4f97c
Parents: 451f9cf
Author: Andrew Lim <an...@gmail.com>
Authored: Fri Aug 11 14:33:18 2017 -0400
Committer: Scott Aslan <sc...@gmail.com>
Committed: Sun Aug 13 10:01:36 2017 -0400

----------------------------------------------------------------------
 .../src/main/asciidoc/administration-guide.adoc |  4 +-
 nifi-docs/src/main/asciidoc/user-guide.adoc     | 39 ++++++++++++++++++++
 2 files changed, 41 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/nifi/blob/bac175a0/nifi-docs/src/main/asciidoc/administration-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/administration-guide.adoc b/nifi-docs/src/main/asciidoc/administration-guide.adoc
index dd232e5..9ccde29 100644
--- a/nifi-docs/src/main/asciidoc/administration-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/administration-guide.adoc
@@ -2799,7 +2799,7 @@ Providing three total locations, including `nifi.provenance.repository.directory
 |nifi.provenance.repository.rollover.time|The amount of time to wait before rolling over the latest data provenance information so that it is available in the User Interface. The default value is `30 secs`.
 |nifi.provenance.repository.rollover.size|The amount of information to roll over at a time. The default value is `100 MB`.
 |nifi.provenance.repository.query.threads|The number of threads to use for Provenance Repository queries. The default value is `2`.
-|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `1`.
+|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`.
 	For flows that operate on a very high number of FlowFiles, the indexing of Provenance events could become a bottleneck. If this is the case, a bulletin will appear, indicating that
 	"The rate of the dataflow is exceeding the provenance recording rate. Slowing down flow to accommodate." If this happens, increasing the value of this property
 	may increase the rate at which the Provenance Repository is able to process these records, resulting in better overall throughput.
@@ -2844,7 +2844,7 @@ Providing three total locations, including `nifi.provenance.repository.directory
 |nifi.provenance.repository.rollover.size|The amount of data to write to a single "event file." The default value is `100 MB`. For production
 	environments where a very large amount of Data Provenance is generated, a value of 1 GB is also very reasonable.
 |nifi.provenance.repository.query.threads|The number of threads to use for Provenance Repository queries. The default value is `2`.
-|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `1`.
+|nifi.provenance.repository.index.threads|The number of threads to use for indexing Provenance events so that they are searchable. The default value is `2`.
 	For flows that operate on a very high number of FlowFiles, the indexing of Provenance events could become a bottleneck. If this happens, increasing the
 	value of this property may increase the rate at which the Provenance Repository is able to process these records, resulting in better overall throughput.
 	It is advisable to use at least 1 thread per storage location (i.e., if there are 3 storage locations, at least 3 threads should be used). For high

http://git-wip-us.apache.org/repos/asf/nifi/blob/bac175a0/nifi-docs/src/main/asciidoc/user-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/user-guide.adoc b/nifi-docs/src/main/asciidoc/user-guide.adoc
index 0cf8bb0..a9c9898 100644
--- a/nifi-docs/src/main/asciidoc/user-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/user-guide.adoc
@@ -1895,6 +1895,45 @@ Once "Expand" is selected, the graph is re-drawn to show the children and their
 
 image:expanded-events.png["Expanded Events"]
 
+[[writeahead-provenance]]
+=== Write Ahead Provenance Repository
+By default, the Provenance Repository is implemented in a Persistent Provenance configuration. In Apache NiFi 1.2.0, the Write Ahead configuration was introduced to provide the same capabilities as Persistent Provenance, but with far better performance. Migrating to the Write Ahead configuration is easy to accomplish. Simply change the setting for the `nifi.provenance.repository.implementation` system property in the `nifi.properties` file from the default value of `org.apache.nifi.provenance.PersistentProvenanceRepository` to `org.apache.nifi.provenance.WriteAheadProvenanceRepository` and restart NiFi.
+
+However, to increase the chances of a successful migration, consider the following factors and recommended actions.
+
+==== Backwards Compatibility
+
+The `WriteAheadProvenanceRepository` can use the Provenance data stored by the `PersistentProvenanceRepository`. However, the `PersistentProvenanceRepository` may not be able to read the data written by the `WriteAheadProvenanceRepository`. Therefore, once the Provenance Repository is changed to use the `WriteAheadProvenanceRepository`, it cannot be changed back to the `PersistentProvenanceRepository` without first deleting the data in the Provenance Repository. It is therefore recommended that, before changing the implementation to Write Ahead, you confirm that your version of NiFi is stable, in case an issue arises that requires rolling back to a previous version of NiFi that did not support the `WriteAheadProvenanceRepository`.
+
+==== Older Existing NiFi Version
+If you are upgrading from an older version of NiFi to 1.2.0 or later, it is recommended that you do not change the provenance configuration to Write Ahead until you have confirmed that your flows and environment are stable on the new version. This reduces the number of variables in your upgrade and can simplify the debugging process if any issues arise.
+
+==== Bootstrap.conf
+While better performance is achieved with the G1 garbage collector, Java 8 bugs may surface more frequently in the Write Ahead configuration. It is recommended that the following line be commented out in the `bootstrap.conf` file in the `conf` directory:
+
+....
+java.arg.13=-XX:+UseG1GC
+....
+
+==== System Properties
+Many of the same system properties are supported by both the Persistent and Write Ahead configurations; however, the default values have been chosen for a Persistent Provenance configuration. The following exceptions and recommendations should be noted when changing to a Write Ahead configuration:
+
+* `nifi.provenance.repository.journal.count` is not relevant to a Write Ahead configuration.
+* `nifi.provenance.repository.concurrent.merge.threads` and `nifi.provenance.repository.warm.cache.frequency` are new properties. The default values of `2` for threads and blank for frequency (i.e. disabled) should remain for most installations.
+* Change the settings for `nifi.provenance.repository.max.storage.time` (default value of `24 hours`) and `nifi.provenance.repository.max.storage.size` (default value of `1 GB`) to values more suitable for your production environment.
+* Change `nifi.provenance.repository.index.shard.size` from the default value of `500 MB` to `4 GB`.
+* Change `nifi.provenance.repository.index.threads` from the default value of `2` to either `4` or `8`, as the Write Ahead repository enables this to scale better.
+* If processing a high volume of events, change `nifi.provenance.repository.rollover.time` from the default of `30 secs` to `1 min` and `nifi.provenance.repository.rollover.size` from the default of `100 MB` to `1 GB`.
+
+Once these property changes have been made, restart NiFi.
+
+**Note:** Detailed descriptions for each of these properties can be found in <<administration-guide.adoc#system_properties,System Properties>>.
+
+==== Encrypted Provenance Considerations
+The above migration recommendations for `WriteAheadProvenanceRepository` also apply to the encrypted version of the configuration, `EncryptedWriteAheadProvenanceRepository`.
+
+The next section has more information about implementing an Encrypted Provenance Repository.
+
 [[encrypted-provenance]]
 === Encrypted Provenance Repository
 While OS-level access control can offer some security over the provenance data written to the disk in a repository, there are scenarios where the data may be sensitive, compliance and regulatory requirements exist, or NiFi is running on hardware not under the direct control of the organization (cloud, etc.). In this case, the provenance repository allows for all data to be encrypted before being persisted to the disk.