Posted to commits@nifi.apache.org by jo...@apache.org on 2020/01/23 03:48:24 UTC

svn commit: r1873052 [38/49] - in /nifi/site/trunk/docs/nifi-docs: ./ components/org.apache.nifi/nifi-ambari-nar/1.11.0/ components/org.apache.nifi/nifi-ambari-nar/1.11.0/org.apache.nifi.reporting.ambari.AmbariReportingTask/ components/org.apache.nifi/...

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.InvokeHTTP/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.InvokeHTTP/index.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.InvokeHTTP/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.InvokeHTTP/index.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>InvokeHTTP</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">InvokeHTTP</h1><h2>Description: </h2><p>An HTTP client processor which can interact with a configurable HTTP Endpoint. The destination URL and HTTP Method are configurable. FlowFile attributes are converted to HTTP headers and the FlowFile contents are included as the body of the request (if the HTTP Method is PUT, POST or PATCH).</p><h3>Tags: </h3><p>http, https, rest, client</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, whether a 
 property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the <strong>nifi.properties</strong> file has an entry for the property <strong>nifi.sensitive.props.key</strong>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>HTTP Method</strong></td><td id="default-value">GET</td><td id="allowable-values"></td><td id="description">HTTP request method (GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS). Arbitrary methods are also supported. Methods other than POST, PUT and PATCH will be sent without a message body.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Remote URL</strong></td><
 td id="default-value"></td><td id="allowable-values"></td><td id="description">Remote URL which will be connected to, including scheme, host, port, path.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name">SSL Context Service</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>SSLContextService<br/><strong>Implementations: </strong><a href="../../../nifi-ssl-context-service-nar/1.11.0/org.apache.nifi.ssl.StandardSSLContextService/index.html">StandardSSLContextService</a><br/><a href="../../../nifi-ssl-context-service-nar/1.11.0/org.apache.nifi.ssl.StandardRestrictedSSLContextService/index.html">StandardRestrictedSSLContextService</a></td><td id="description">The SSL Context Service used to provide client certificate information for TLS/SSL (https) connections. It is also used to connect to HTTPS Proxy.</td></tr><tr><td id="name"><str
 ong>Connection Timeout</strong></td><td id="default-value">5 secs</td><td id="allowable-values"></td><td id="description">Max wait time for connection to remote service.</td></tr><tr><td id="name"><strong>Read Timeout</strong></td><td id="default-value">15 secs</td><td id="allowable-values"></td><td id="description">Max wait time for response from remote service.</td></tr><tr><td id="name"><strong>Include Date Header</strong></td><td id="default-value">True</td><td id="allowable-values"><ul><li>True</li><li>False</li></ul></td><td id="description">Include an RFC-2616 Date header in the request.</td></tr><tr><td id="name"><strong>Follow Redirects</strong></td><td id="default-value">True</td><td id="allowable-values"><ul><li>True</li><li>False</li></ul></td><td id="description">Follow HTTP redirects issued by remote server.</td></tr><tr><td id="name">Attributes to Send</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Regular expression that defines w
 hich attributes to send as HTTP headers in the request. If not defined, no attributes are sent as headers. Any dynamic properties that are set will also be sent as headers. The dynamic property key will be the header key, and the dynamic property value, evaluated as Expression Language, will be the header value.</td></tr><tr><td id="name">Basic Authentication Username</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The username to be used by the client to authenticate against the Remote URL.  Cannot include control characters (0-31), ':', or DEL (127).</td></tr><tr><td id="name">Basic Authentication Password</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The password to be used by the client to authenticate against the Remote URL.<br/><strong>Sensitive Property: true</strong></td></tr><tr><td id="name">Proxy Configuration Service</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service
  API: </strong><br/>ProxyConfigurationService<br/><strong>Implementation: </strong><a href="../../../nifi-proxy-configuration-nar/1.11.0/org.apache.nifi.proxy.StandardProxyConfigurationService/index.html">StandardProxyConfigurationService</a></td><td id="description">Specifies the Proxy Configuration Controller Service to proxy network requests. If set, it supersedes proxy settings configured per component. Supported proxies: HTTP + AuthN, SOCKS</td></tr><tr><td id="name">Proxy Host</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The fully qualified hostname or IP address of the proxy server<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy Port</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The port of the proxy server<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></
 td></tr><tr><td id="name">Proxy Type</td><td id="default-value">http</td><td id="allowable-values"></td><td id="description">The type of the proxy we are connecting to. Must be either http or https<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy Username</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Username to set when authenticating against proxy<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy Password</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Password to set when authenticating against proxy<br/><strong>Sensitive Property: true</strong><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Put Response Body In Attribute</td><td id="default-value"><
 /td><td id="allowable-values"></td><td id="description">If set, the response body received back will be put into an attribute of the original FlowFile instead of a separate FlowFile. The attribute key to put to is determined by evaluating value of this property. <br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name">Max Length To Put In Attribute</td><td id="default-value">256</td><td id="allowable-values"></td><td id="description">If routing the response body to an attribute of the original (by setting the "Put response body in attribute" property or by receiving an error status code), the number of characters put to the attribute value will be at most this amount. This is important because attributes are held in memory and large attributes will quickly cause out of memory issues. If the output goes longer than this value, it will be truncated to fit. Consider making this smaller if ab
 le.</td></tr><tr><td id="name">Use Digest Authentication</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Whether to communicate with the website using Digest Authentication. 'Basic Authentication Username' and 'Basic Authentication Password' are used for authentication.</td></tr><tr><td id="name">Always Output Response</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Will force a response FlowFile to be generated and routed to the 'Response' relationship regardless of what the server status code received is or if the processor is configured to put the server response body in the request attribute. In the latter configuration a request FlowFile with the response body in the attribute and a typical response FlowFile will be emitted to their respective relationships.</td></tr><tr><td id="name">Add Response Headers to Request</td><td id=
 "default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Enabling this property saves all the response headers to the original request. This may be when the response headers are needed but a response is not generated due to the status code received.</td></tr><tr><td id="name"><strong>Content-Type</strong></td><td id="default-value">${mime.type}</td><td id="allowable-values"></td><td id="description">The Content-Type to specify for when content is being transmitted through a PUT, POST or PATCH. In the case of an empty value after evaluating an expression language expression, Content-Type defaults to application/octet-stream<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name">Send Message Body</td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, sends 
 the HTTP message body on POST/PUT/PATCH requests (default).  If false, suppresses the message body and content-type header for these requests.</td></tr><tr><td id="name"><strong>Use Chunked Encoding</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">When POST'ing, PUT'ing or PATCH'ing content set this property to true in order to not pass the 'Content-length' header and instead send 'Transfer-Encoding' with a value of 'chunked'. This will enable the data transfer mechanism which was introduced in HTTP 1.1 to pass data of unknown lengths in chunks.</td></tr><tr><td id="name">Penalize on "No Retry"</td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Enabling this property will penalize FlowFiles that are routed to the "No Retry" relationship.</td></tr><tr><td id="name"><strong>Use HTTP ETag</strong></td><td id="default-value">false
 </td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Enable HTTP entity tag (ETag) support for HTTP requests.</td></tr><tr><td id="name"><strong>Maximum ETag Cache Size</strong></td><td id="default-value">10MB</td><td id="allowable-values"></td><td id="description">The maximum size that the ETag cache should be allowed to grow to. The default size is 10MB.</td></tr></table><h3>Dynamic Properties: </h3><p>Dynamic Properties allow the user to specify both the name and value of a property.<table id="dynamic-properties"><tr><th>Name</th><th>Value</th><th>Description</th></tr><tr><td id="name">Header Name</td><td id="value">Attribute Expression Language</td><td>Send request header with a key matching the Dynamic Property Key and a value created by evaluating the Attribute Expression Language set in the value of the Dynamic Property.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</
 strong></td></tr></table></p><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>Original</td><td>The original FlowFile will be routed upon success (2xx status codes). It will have new attributes detailing the success of the request.</td></tr><tr><td>Failure</td><td>The original FlowFile will be routed on any type of connection failure, timeout or general exception. It will have new attributes detailing the request.</td></tr><tr><td>Retry</td><td>The original FlowFile will be routed on any status code that can be retried (5xx status codes). It will have new attributes detailing the request.</td></tr><tr><td>No Retry</td><td>The original FlowFile will be routed on any status code that should NOT be retried (1xx, 3xx, 4xx status codes).  It will have new attributes detailing the request.</td></tr><tr><td>Response</td><td>A Response FlowFile will be routed upon success (2xx status codes). If the 'Always Output Response' property is tr
 ue then the response will be sent to this relationship regardless of the status code received.</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>invokehttp.status.code</td><td>The status code that is returned</td></tr><tr><td>invokehttp.status.message</td><td>The status message that is returned</td></tr><tr><td>invokehttp.response.body</td><td>In the instance where the status code received is not a success (2xx) then the response body will be put to the 'invokehttp.response.body' attribute of the request FlowFile.</td></tr><tr><td>invokehttp.request.url</td><td>The request URL</td></tr><tr><td>invokehttp.tx.id</td><td>The transaction ID that is returned after reading the response</td></tr><tr><td>invokehttp.remote.dn</td><td>The DN of the remote server</td></tr><tr><td>invokehttp.java.exception.class</td><td>The Java exception class raised when the processor fails</td
 ></tr><tr><td>invokehttp.java.exception.message</td><td>The Java exception message raised when the processor fails</td></tr><tr><td>user-defined</td><td>If the 'Put Response Body In Attribute' property is set then whatever it is set to will become the attribute key and the value would be the body of the HTTP response.</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component allows an incoming relationship.<h3>System Resource Considerations:</h3>None specified.</body></html>
\ No newline at end of file
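
The Relationships table in the InvokeHTTP page above maps status codes to relationships. As a rough illustration, that mapping can be sketched in Python as below. This is a minimal sketch built only from the table's wording, not NiFi's implementation; the function name `route_for_status` and its `always_output_response` parameter are hypothetical, and the sketch ignores the "Put Response Body In Attribute" property and the Failure relationship (which applies when no status code is received at all).

```python
def route_for_status(status_code, always_output_response=False):
    """Return the relationship names suggested by the InvokeHTTP table.

    Hypothetical helper for illustration only; it does not model
    connection failures, timeouts, or response-body-in-attribute mode.
    """
    if not 100 <= status_code <= 599:
        raise ValueError("not an HTTP status code: %r" % (status_code,))

    if 200 <= status_code <= 299:
        # Success: the original FlowFile and a Response FlowFile are routed.
        return {"Original", "Response"}

    if 500 <= status_code <= 599:
        routes = {"Retry"}        # 5xx can be retried
    else:
        routes = {"No Retry"}     # 1xx, 3xx, 4xx should not be retried

    if always_output_response:
        # 'Always Output Response' forces a Response FlowFile regardless of status.
        routes.add("Response")
    return routes
```

For example, `route_for_status(503)` yields only the Retry relationship, while `route_for_status(404, always_output_response=True)` yields both No Retry and Response.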

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/additionalDetails.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/additionalDetails.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/additionalDetails.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/additionalDetails.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1,40 @@
+<!DOCTYPE html>
+<html lang="en">
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+      http://www.apache.org/licenses/LICENSE-2.0
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+<head>
+    <meta charset="utf-8"/>
+    <title>JoltTransformJSON</title>
+    <link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"/>
+</head>
+
+<body>
+<!-- Processor Documentation ================================================== -->
+<h2>Usage Information</h2>
+
+<p>
+    The Jolt utilities that process JSON are not stream-based, so transforming large JSON documents
+    may consume large amounts of memory. Currently, UTF-8 FlowFile content and Jolt specifications are supported.
+    A specification can be defined using Expression Language, where attributes can be referenced on either the left- or right-hand side within the specification syntax.
+
+    Custom Jolt Transformations (that implement the Transform interface) are supported.  Modules containing custom libraries which do not
+    exist on the current class path can be included via the custom module directory property.
+
+    <Strong>Note:</Strong> When configuring a processor, if the user selects the Default transformation yet provides a
+    Chain specification, the system does not alert that the specification is invalid and will produce failed flow files.
+    This is a known issue identified within the Jolt library.
+</p>
+</p>
+</body>
+</html>
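
The note in the JoltTransformJSON details above says that a Chain specification supplied with the Default transformation is not flagged as invalid. One way to catch this before configuring the processor is to check the shape of the specification text. The sketch below assumes the usual Jolt convention that a Chain specification is a JSON array of objects carrying an "operation" key, whereas a Default specification is a single JSON object; the helper name `looks_like_chain_spec` is hypothetical, and the check does not evaluate Expression Language or validate the specification itself.

```python
import json


def looks_like_chain_spec(spec_text):
    """Return True if spec_text has the shape of a Jolt Chain specification.

    Assumption: a Chain spec is a non-empty JSON array whose entries are
    objects with an "operation" key. This is a shape check only.
    """
    spec = json.loads(spec_text)
    return (
        isinstance(spec, list)
        and len(spec) > 0
        and all(isinstance(step, dict) and "operation" in step for step in spec)
    )


# Illustrative specifications (made up for this sketch).
chain_spec = '[{"operation": "shift", "spec": {"id": "user.id"}}]'
default_spec = '{"status": "unknown"}'

if looks_like_chain_spec(chain_spec):
    print("Chain-shaped spec: select the Chain transformation, not Default")
```

A specification that passes this check but is paired with the Default transformation is the combination the note warns about.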

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/index.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.JoltTransformJSON/index.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>JoltTransformJSON</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">JoltTransformJSON</h1><h2>Description: </h2><p>Applies a list of Jolt specifications to the flowfile JSON payload. A new FlowFile is created with transformed content and is routed to the 'success' relationship. If the JSON transform fails, the original FlowFile is routed to the 'failure' relationship.</p><p><a href="additionalDetails.html">Additional Details...</a></p><h3>Tags: </h3><p>json, jolt, transform, shiftr, chainr, defaultr, removr, cardinality, sort</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (no
 t in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Jolt Transformation DSL</strong></td><td id="default-value">jolt-transform-chain</td><td id="allowable-values"><ul><li>Cardinality <img src="../../../../../html/images/iconInfo.png" alt="Change the cardinality of input elements to create the output JSON." title="Change the cardinality of input elements to create the output JSON."></img></li><li>Chain <img src="../../../../../html/images/iconInfo.png" alt="Execute list of Jolt transformations." title="Execute list of Jolt transformations."></img></li><li>Default <img src="../../../../../html/images/iconInfo.png" alt=" Apply default values to the output JSON." title=" Apply default values to 
 the output JSON."></img></li><li>Modify - Default <img src="../../../../../html/images/iconInfo.png" alt="Writes when key is missing or value is null" title="Writes when key is missing or value is null"></img></li><li>Modify - Define <img src="../../../../../html/images/iconInfo.png" alt="Writes when key is missing" title="Writes when key is missing"></img></li><li>Modify - Overwrite <img src="../../../../../html/images/iconInfo.png" alt=" Always overwrite value" title=" Always overwrite value"></img></li><li>Remove <img src="../../../../../html/images/iconInfo.png" alt=" Remove values from input data to create the output JSON." title=" Remove values from input data to create the output JSON."></img></li><li>Shift <img src="../../../../../html/images/iconInfo.png" alt="Shift input JSON/data to create the output JSON." title="Shift input JSON/data to create the output JSON."></img></li><li>Sort <img src="../../../../../html/images/iconInfo.png" alt="Sort input json key values alphabe
 tically. Any specification set is ignored." title="Sort input json key values alphabetically. Any specification set is ignored."></img></li><li>Custom <img src="../../../../../html/images/iconInfo.png" alt="Custom Transformation. Requires Custom Transformation Class Name" title="Custom Transformation. Requires Custom Transformation Class Name"></img></li></ul></td><td id="description">Specifies the Jolt Transformation that should be used with the provided specification.</td></tr><tr><td id="name">Custom Transformation Class Name</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Fully Qualified Class Name for Custom Transformation</td></tr><tr><td id="name">Custom Module Directory</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Comma-separated list of paths to files and/or directories which contain modules containing custom transformations (that are not included on NiFi's classpath).</td></tr><tr><td id="name">Jolt
  Specification</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Jolt Specification for transform of JSON data. This value is ignored if the Jolt Sort Transformation is selected.<br/><strong>Supports Expression Language: true (will be evaluated using flow file attributes and variable registry)</strong></td></tr><tr><td id="name"><strong>Transform Cache Size</strong></td><td id="default-value">1</td><td id="allowable-values"></td><td id="description">Compiling a Jolt Transform can be fairly expensive. Ideally, this will be done only once. However, if the Expression Language is used in the transform, we may need a new Transform for each FlowFile. This value controls how many of those Transforms we cache in memory in order to avoid having to compile the Transform each time.</td></tr><tr><td id="name"><strong>Pretty Print</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description"
 >Apply pretty print formatting to the output of the Jolt transform</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>The FlowFile with transformed content will be routed to this relationship</td></tr><tr><td>failure</td><td>If a FlowFile fails processing for any reason (for example, the FlowFile is not valid JSON), it will be routed to this relationship</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>mime.type</td><td>Always set to application/json</td></tr></table><h3>State management: </h3>This component does not store state.<h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component requires an incoming relationship.<h3>System Resource Considerations:</h3>None specified.</body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListDatabaseTables/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListDatabaseTables/index.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListDatabaseTables/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListDatabaseTables/index.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ListDatabaseTables</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ListDatabaseTables</h1><h2>Description: </h2><p>Generates a set of flow files, each containing attributes corresponding to metadata about a table from a database connection. Once metadata about a table has been fetched, it will not be fetched again until the Refresh Interval (if set) has elapsed, or until state has been manually cleared.</p><h3>Tags: </h3><p>sql, list, jdbc, table, database</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any defa
 ult values.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Database Connection Pooling Service</strong></td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>DBCPService<br/><strong>Implementations: </strong><a href="../../../nifi-hive-nar/1.11.0/org.apache.nifi.dbcp.hive.HiveConnectionPool/index.html">HiveConnectionPool</a><br/><a href="../../../nifi-dbcp-service-nar/1.11.0/org.apache.nifi.dbcp.DBCPConnectionPool/index.html">DBCPConnectionPool</a><br/><a href="../../../nifi-dbcp-service-nar/1.11.0/org.apache.nifi.dbcp.DBCPConnectionPoolLookup/index.html">DBCPConnectionPoolLookup</a></td><td id="description">The Controller Service that is used to obtain connection to database</td></tr><tr><td id="name">Catalog</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The name of a catalog from which to list database
  tables. The name must match the catalog name as it is stored in the database. If the property is not set, the catalog name will not be used to narrow the search for tables. If the property is set to an empty string, tables without a catalog will be listed.</td></tr><tr><td id="name">Schema Pattern</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">A pattern for matching schemas in the database. Within a pattern, "%" means match any substring of 0 or more characters, and "_" means match any one character. The pattern must match the schema name as it is stored in the database. If the property is not set, the schema name will not be used to narrow the search for tables. If the property is set to an empty string, tables without a schema will be listed.</td></tr><tr><td id="name">Table Name Pattern</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">A pattern for matching tables in the database. Within a pattern, "%" means
  match any substring of 0 or more characters, and "_" means match any one character. The pattern must match the table name as it is stored in the database. If the property is not set, all tables will be retrieved.</td></tr><tr><td id="name">Table Types</td><td id="default-value">TABLE</td><td id="allowable-values"></td><td id="description">A comma-separated list of table types to include. For example, some databases support TABLE and VIEW types. If the property is not set, tables of all types will be returned.</td></tr><tr><td id="name"><strong>Include Count</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Whether to include the table's row count as a flow file attribute. This affects performance as a database query will be generated for each table in the retrieved list.</td></tr><tr><td id="name"><strong>Refresh Interval</strong></td><td id="default-value">0 sec</td><td id="allowable-values"></td><
 td id="description">The amount of time to elapse before resetting the processor state, thereby causing all current tables to be listed. During this interval, the processor may continue to run, but tables that have already been listed will not be re-listed. However new/added tables will be listed as the processor runs. A value of zero means the state will never be automatically reset, the user must Clear State manually.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All FlowFiles that are received are routed to success</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>db.table.name</td><td>Contains the name of a database table from the connection</td></tr><tr><td>db.table.catalog</td><td>Contains the name of the catalog to which the table belongs (may be null)</td></tr><tr><td>db.tabl
 e.schema</td><td>Contains the name of the schema to which the table belongs (may be null)</td></tr><tr><td>db.table.fullname</td><td>Contains the fully-qualified table name (possibly including catalog, schema, etc.)</td></tr><tr><td>db.table.type</td><td>Contains the type of the database table from the connection. Typical types are "TABLE", "VIEW", "SYSTEM TABLE", "GLOBAL TEMPORARY", "LOCAL TEMPORARY", "ALIAS", "SYNONYM"</td></tr><tr><td>db.table.remarks</td><td>Contains the remarks (explanatory comment) reported for the database table, if any</td></tr><tr><td>db.table.count</td><td>Contains the number of rows in the table</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>CLUSTER</td><td>After performing a listing of tables, the timestamp of the query is stored. This allows the Processor to not re-list tables the next time that the Processor is run. Specifying the refresh interval in the processor properties will indicate that when the process
 or detects the interval has elapsed, the state will be reset and tables will be re-listed as a result. This processor is meant to be run on the primary node only.</td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component does not allow an incoming relationship.<h3>System Resource Considerations:</h3>None specified.</body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFTP/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFTP/index.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFTP/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFTP/index.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ListFTP</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ListFTP</h1><h2>Description: </h2><p>Performs a listing of the files residing on an FTP server. For each file that is found on the remote server, a new FlowFile will be created with the filename attribute set to the name of the file on the remote server. This can then be used in conjunction with FetchFTP in order to fetch those files.</p><h3>Tags: </h3><p>list, ftp, remote, ingest, source, input, files</p><h3>Properties: </h3><p>In the list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any def
 ault values, whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>, and whether a property is considered "sensitive", meaning that its value will be encrypted. Before entering a value in a sensitive property, ensure that the <strong>nifi.properties</strong> file has an entry for the property <strong>nifi.sensitive.props.key</strong>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Listing Strategy</strong></td><td id="default-value">timestamps</td><td id="allowable-values"><ul><li>Tracking Timestamps <img src="../../../../../html/images/iconInfo.png" alt="This strategy tracks the latest timestamp of listed entity to determine new/updated entities. Since it only tracks few timestamps, it can manage listing state efficiently. However, any newly added, or updated entity having timestamp older than the tracked latest timestamp c
 an not be picked by this strategy. For example, such situation can happen in a file system if a file with old timestamp is copied or moved into the target directory without its last modified timestamp being updated." title="This strategy tracks the latest timestamp of listed entity to determine new/updated entities. Since it only tracks few timestamps, it can manage listing state efficiently. However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy. For example, such situation can happen in a file system if a file with old timestamp is copied or moved into the target directory without its last modified timestamp being updated."></img></li><li>Tracking Entities <img src="../../../../../html/images/iconInfo.png" alt="This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities. This strategy can pick entities having old timesta
 mp that can be missed with 'Tracking Timestamps'. However, an additional DistributedMapCache controller service is required and more JVM heap memory is used. See the description of 'Entity Tracking Time Window' property for further details on how it works." title="This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities. This strategy can pick entities having old timestamp that can be missed with 'Tracking Timestamps'. However, an additional DistributedMapCache controller service is required and more JVM heap memory is used. See the description of 'Entity Tracking Time Window' property for further details on how it works."></img></li></ul></td><td id="description">Specify how to determine new/updated entities. See each strategy's description for details.</td></tr><tr><td id="name"><strong>Hostname</strong></td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The fully qualified ho
 stname or IP address of the remote system<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Port</strong></td><td id="default-value">21</td><td id="allowable-values"></td><td id="description">The port to connect to on the remote host to fetch the data from<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Username</strong></td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Username<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Password</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Password for the user account<br/><strong>Sensitive Property: true</strong><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</s
 trong></td></tr><tr><td id="name">Remote Path</td><td id="default-value">.</td><td id="allowable-values"></td><td id="description">The path on the remote system from which to pull or push files<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Distributed Cache Service</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>DistributedMapCacheClient<br/><strong>Implementations: </strong><a href="../../../nifi-hbase_1_1_2-client-service-nar/1.11.0/org.apache.nifi.hbase.HBase_1_1_2_ClientMapCacheService/index.html">HBase_1_1_2_ClientMapCacheService</a><br/><a href="../../../nifi-couchbase-nar/1.11.0/org.apache.nifi.couchbase.CouchbaseMapCacheClient/index.html">CouchbaseMapCacheClient</a><br/><a href="../../../nifi-redis-nar/1.11.0/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html">RedisDistributedMapCacheClientService</a><br/><a
  href="../../../nifi-distributed-cache-services-nar/1.11.0/org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService/index.html">DistributedMapCacheClientService</a><br/><a href="../../../nifi-hbase_2-client-service-nar/1.11.0/org.apache.nifi.hbase.HBase_2_ClientMapCacheService/index.html">HBase_2_ClientMapCacheService</a></td><td id="description">NOTE: This property is used only for migration from old NiFi versions, before state management was introduced in version 0.5.0. The stored value in the cache service will be migrated into the state when this processor is started for the first time. The specified Controller Service was used to maintain state about what had been pulled from the remote server so that if a new node begins pulling data, it won't duplicate all of the work that has been done. If not specified, the information was not shared across the cluster. This property did not need to be set for standalone instances of NiFi but was supposed to be configured if
  NiFi had been running within a cluster.</td></tr><tr><td id="name"><strong>Search Recursively</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, will pull files from arbitrarily nested subdirectories; otherwise, will not traverse subdirectories</td></tr><tr><td id="name"><strong>Follow symlink</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, will pull even symbolic files and also nested symbolic subdirectories; otherwise, will not read symbolic files and will not traverse symbolic link subdirectories</td></tr><tr><td id="name">File Filter Regex</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Provides a Java Regular Expression for filtering Filenames; if a filter is supplied, only files whose names match that Regular Expression will be fetched</td></tr><tr><td id="
  name">Path Filter Regex</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">When Search Recursively is true, then only subdirectories whose path matches the given Regular Expression will be scanned</td></tr><tr><td id="name"><strong>Ignore Dotted Files</strong></td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">If true, files whose names begin with a dot (".") will be ignored</td></tr><tr><td id="name"><strong>Remote Poll Batch Size</strong></td><td id="default-value">5000</td><td id="allowable-values"></td><td id="description">The value specifies how many file paths to find in a given directory on the remote system when doing a file listing. This value in general should not need to be modified but when polling against a remote system with a tremendous number of files this value can be critical.  Setting this value too high can result in very poor performance and setting it too low 
 can cause the flow to be slower than normal.</td></tr><tr><td id="name"><strong>Connection Timeout</strong></td><td id="default-value">30 sec</td><td id="allowable-values"></td><td id="description">Amount of time to wait before timing out while creating a connection</td></tr><tr><td id="name"><strong>Data Timeout</strong></td><td id="default-value">30 sec</td><td id="allowable-values"></td><td id="description">When transferring a file between the local and remote system, this value specifies how long is allowed to elapse without any data being transferred between systems</td></tr><tr><td id="name">Connection Mode</td><td id="default-value">Passive</td><td id="allowable-values"><ul><li>Active</li><li>Passive</li></ul></td><td id="description">The FTP Connection Mode</td></tr><tr><td id="name">Transfer Mode</td><td id="default-value">Binary</td><td id="allowable-values"><ul><li>Binary</li><li>ASCII</li></ul></td><td id="description">The FTP Transfer Mode</td></tr><tr><td id="name">Pro
 xy Configuration Service</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>ProxyConfigurationService<br/><strong>Implementation: </strong><a href="../../../nifi-proxy-configuration-nar/1.11.0/org.apache.nifi.proxy.StandardProxyConfigurationService/index.html">StandardProxyConfigurationService</a></td><td id="description">Specifies the Proxy Configuration Controller Service to proxy network requests. If set, it supersedes proxy settings configured per component. Supported proxies: HTTP + AuthN, SOCKS</td></tr><tr><td id="name">Proxy Type</td><td id="default-value">DIRECT</td><td id="allowable-values"><ul><li>DIRECT</li><li>HTTP</li><li>SOCKS</li></ul></td><td id="description">Proxy type used for file transfers</td></tr><tr><td id="name">Proxy Host</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The fully qualified hostname or IP address of the proxy server<br/><strong>Supports Expression Languag
 e: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Proxy Port</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The port of the proxy server<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Http Proxy Username</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Http Proxy Username<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Http Proxy Password</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">Http Proxy Password<br/><strong>Sensitive Property: true</strong><br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Internal Buffer Size</td><td id="default-value">16KB</td><td id="allowable-values"></td><t
 d id="description">Set the internal buffer size for buffered data streams</td></tr><tr><td id="name"><strong>Target System Timestamp Precision</strong></td><td id="default-value">auto-detect</td><td id="allowable-values"><ul><li>Auto Detect <img src="../../../../../html/images/iconInfo.png" alt="Automatically detect time unit deterministically based on candidate entries timestamp. Please note that this option may take longer to list entities unnecessarily, if none of entries has a precise precision timestamp. E.g. even if a target system supports millis, if all entries only have timestamps without millis, such as '2017-06-16 09:06:34.000', then its precision is determined as 'seconds'." title="Automatically detect time unit deterministically based on candidate entries timestamp. Please note that this option may take longer to list entities unnecessarily, if none of entries has a precise precision timestamp. E.g. even if a target system supports millis, if all entries only have times
 tamps without millis, such as '2017-06-16 09:06:34.000', then its precision is determined as 'seconds'."></img></li><li>Milliseconds <img src="../../../../../html/images/iconInfo.png" alt="This option provides the minimum latency for an entry from being available to being listed if target system supports millis, if not, use other options." title="This option provides the minimum latency for an entry from being available to being listed if target system supports millis, if not, use other options."></img></li><li>Seconds <img src="../../../../../html/images/iconInfo.png" alt="For a target system that does not have millis precision, but has in seconds." title="For a target system that does not have millis precision, but has in seconds."></img></li><li>Minutes <img src="../../../../../html/images/iconInfo.png" alt="For a target system that only supports precision in minutes." title="For a target system that only supports precision in minutes."></img></li></ul></td><td id="description">S
 pecify timestamp precision at the target system. Since this processor uses timestamp of entities to decide which should be listed, it is crucial to use the right timestamp precision.</td></tr><tr><td id="name">Entity Tracking State Cache</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>DistributedMapCacheClient<br/><strong>Implementations: </strong><a href="../../../nifi-hbase_1_1_2-client-service-nar/1.11.0/org.apache.nifi.hbase.HBase_1_1_2_ClientMapCacheService/index.html">HBase_1_1_2_ClientMapCacheService</a><br/><a href="../../../nifi-couchbase-nar/1.11.0/org.apache.nifi.couchbase.CouchbaseMapCacheClient/index.html">CouchbaseMapCacheClient</a><br/><a href="../../../nifi-redis-nar/1.11.0/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html">RedisDistributedMapCacheClientService</a><br/><a href="../../../nifi-distributed-cache-services-nar/1.11.0/org.apache.nifi.distributed.cache.client.DistributedMap
 CacheClientService/index.html">DistributedMapCacheClientService</a><br/><a href="../../../nifi-hbase_2-client-service-nar/1.11.0/org.apache.nifi.hbase.HBase_2_ClientMapCacheService/index.html">HBase_2_ClientMapCacheService</a></td><td id="description">Listed entities are stored in the specified cache storage so that this processor can resume listing across NiFi restart or in case of primary node change. The 'Tracking Entities' strategy requires tracking information of all listed entities within the last 'Tracking Time Window'. To support a large number of entities, the strategy uses DistributedMapCache instead of managed state. Cache key format is 'ListedEntities::{processorId}(::{nodeId})'. If it tracks per node listed entities, then the optional '::{nodeId}' part is added to manage state separately. E.g. cluster wide cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b', per node cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b::nifi-node3' The stored cache 
 content is a Gzipped JSON string. The cache key will be deleted when target listing configuration is changed. Used by 'Tracking Entities' strategy.</td></tr><tr><td id="name">Entity Tracking Time Window</td><td id="default-value">3 hours</td><td id="allowable-values"></td><td id="description">Specify how long this processor should track already-listed entities. 'Tracking Entities' strategy can pick any entity whose timestamp is inside the specified time window. For example, if set to '30 minutes', any entity having a timestamp in the most recent 30 minutes will be the listing target when this processor runs. A listed entity is considered 'new/updated' and a FlowFile is emitted if one of the following conditions is met: 1. it does not exist in the already-listed entities, 2. it has a newer timestamp than the cached entity, 3. it has a different size than the cached entity. If a cached entity's timestamp becomes older than the specified time window, that entity will be removed from the cached already-listed entities. Us
 ed by 'Tracking Entities' strategy.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Entity Tracking Initial Listing Target</td><td id="default-value">all</td><td id="allowable-values"><ul><li>Tracking Time Window <img src="../../../../../html/images/iconInfo.png" alt="Ignore entities having timestamp older than the specified 'Tracking Time Window' at the initial listing activity." title="Ignore entities having timestamp older than the specified 'Tracking Time Window' at the initial listing activity."></img></li><li>All Available <img src="../../../../../html/images/iconInfo.png" alt="Regardless of entities timestamp, all existing entities will be listed at the initial listing activity." title="Regardless of entities timestamp, all existing entities will be listed at the initial listing activity."></img></li></ul></td><td id="description">Specify how initial listing should be handled. Used by 'Trackin
 g Entities' strategy.</td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All FlowFiles that are received are routed to success</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>ftp.remote.host</td><td>The hostname of the FTP Server</td></tr><tr><td>ftp.remote.port</td><td>The port that was connected to on the FTP Server</td></tr><tr><td>ftp.listing.user</td><td>The username of the user that performed the FTP Listing</td></tr><tr><td>file.owner</td><td>The numeric owner id of the source file</td></tr><tr><td>file.group</td><td>The numeric group id of the source file</td></tr><tr><td>file.permissions</td><td>The read/write/execute permissions of the source file</td></tr><tr><td>file.size</td><td>The number of bytes in the source file</td></tr><tr><td>file.lastModifiedTime</td><td>The times
 tamp of when the file in the filesystem was last modified as 'yyyy-MM-dd'T'HH:mm:ssZ'</td></tr><tr><td>filename</td><td>The name of the file on the FTP Server</td></tr><tr><td>path</td><td>The fully qualified name of the directory on the FTP Server from which the file was pulled</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>CLUSTER</td><td>After performing a listing of files, the timestamp of the newest file is stored. This allows the Processor to list only files that have been added or modified after this date the next time that the Processor is run. State is stored across the cluster so that this Processor can be run on Primary Node only and if a new Primary Node is selected, the new node will not duplicate the data that was listed by the previous Primary Node.</td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component does not allow an incoming relationshi
 p.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.FetchFTP/index.html">FetchFTP</a>, <a href="../org.apache.nifi.processors.standard.GetFTP/index.html">GetFTP</a>, <a href="../org.apache.nifi.processors.standard.PutFTP/index.html">PutFTP</a></p></body></html>
\ No newline at end of file

Added: nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFile/index.html
URL: http://svn.apache.org/viewvc/nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFile/index.html?rev=1873052&view=auto
==============================================================================
--- nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFile/index.html (added)
+++ nifi/site/trunk/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.11.0/org.apache.nifi.processors.standard.ListFile/index.html Thu Jan 23 03:48:17 2020
@@ -0,0 +1 @@
+<!DOCTYPE html><html lang="en"><head><meta charset="utf-8"></meta><title>ListFile</title><link rel="stylesheet" href="../../../../../css/component-usage.css" type="text/css"></link></head><script type="text/javascript">window.onload = function(){if(self==top) { document.getElementById('nameHeader').style.display = "inherit"; } }</script><body><h1 id="nameHeader" style="display: none;">ListFile</h1><h2>Description: </h2><p>Retrieves a listing of files from the local filesystem. For each file that is listed, creates a FlowFile that represents the file so that it can be fetched in conjunction with FetchFile. This Processor is designed to run on Primary Node only in a cluster. If the primary node changes, the new Primary Node will pick up where the previous node left off without duplicating all of the data. Unlike GetFile, this Processor does not delete any data from the local filesystem.</p><h3>Tags: </h3><p>file, get, list, ingest, source, filesystem</p><h3>Properties: </h3><p>In the 
 list below, the names of required properties appear in <strong>bold</strong>. Any other properties (not in bold) are considered optional. The table also indicates any default values, and whether a property supports the <a href="../../../../../html/expression-language-guide.html">NiFi Expression Language</a>.</p><table id="properties"><tr><th>Name</th><th>Default Value</th><th>Allowable Values</th><th>Description</th></tr><tr><td id="name"><strong>Input Directory</strong></td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The input directory from which to pull files<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Listing Strategy</strong></td><td id="default-value">timestamps</td><td id="allowable-values"><ul><li>Tracking Timestamps <img src="../../../../../html/images/iconInfo.png" alt="This strategy tracks the latest timestamp of listed entity to determ
 ine new/updated entities. Since it only tracks few timestamps, it can manage listing state efficiently. However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy. For example, such situation can happen in a file system if a file with old timestamp is copied or moved into the target directory without its last modified timestamp being updated." title="This strategy tracks the latest timestamp of listed entity to determine new/updated entities. Since it only tracks few timestamps, it can manage listing state efficiently. However, any newly added, or updated entity having timestamp older than the tracked latest timestamp can not be picked by this strategy. For example, such situation can happen in a file system if a file with old timestamp is copied or moved into the target directory without its last modified timestamp being updated."></img></li><li>Tracking Entities <img src="../../../../../html/images/iconIn
 fo.png" alt="This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities. This strategy can pick entities having old timestamp that can be missed with 'Tracking Timestamps'. However, an additional DistributedMapCache controller service is required and more JVM heap memory is used. See the description of 'Entity Tracking Time Window' property for further details on how it works." title="This strategy tracks information of all the listed entities within the latest 'Entity Tracking Time Window' to determine new/updated entities. This strategy can pick entities having old timestamp that can be missed with 'Tracking Timestamps'. However, an additional DistributedMapCache controller service is required and more JVM heap memory is used. See the description of 'Entity Tracking Time Window' property for further details on how it works."></img></li></ul></td><td id="description">Specify how to determine new/updated entiti
 es. See each strategy's description for details.</td></tr><tr><td id="name"><strong>Recurse Subdirectories</strong></td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Indicates whether to list files from subdirectories of the directory</td></tr><tr><td id="name"><strong>Input Directory Location</strong></td><td id="default-value">Local</td><td id="allowable-values"><ul><li>Local <img src="../../../../../html/images/iconInfo.png" alt="Input Directory is located on a local disk. State will be stored locally on each node in the cluster." title="Input Directory is located on a local disk. State will be stored locally on each node in the cluster."></img></li><li>Remote <img src="../../../../../html/images/iconInfo.png" alt="Input Directory is located on a remote system. State will be stored across the cluster so that the listing can be performed on Primary Node Only and another node can pick up where the last node lef
 t off, if the Primary Node changes" title="Input Directory is located on a remote system. State will be stored across the cluster so that the listing can be performed on Primary Node Only and another node can pick up where the last node left off, if the Primary Node changes"></img></li></ul></td><td id="description">Specifies where the Input Directory is located. This is used to determine whether state should be stored locally or across the cluster.</td></tr><tr><td id="name"><strong>File Filter</strong></td><td id="default-value">[^\.].*</td><td id="allowable-values"></td><td id="description">Only files whose names match the given regular expression will be picked up</td></tr><tr><td id="name">Path Filter</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">When Recurse Subdirectories is true, then only subdirectories whose path matches the given regular expression will be scanned</td></tr><tr><td id="name"><strong>Include File Attributes</strong></td
 ><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Whether or not to include information such as the file's Last Modified Time and Owner as FlowFile Attributes. Depending on the File System being used, gathering this information can be expensive and as a result should be disabled. This is especially true of remote file shares.</td></tr><tr><td id="name"><strong>Minimum File Age</strong></td><td id="default-value">0 sec</td><td id="allowable-values"></td><td id="description">The minimum age that a file must be in order to be pulled; any file younger than this amount of time (according to last modification date) will be ignored</td></tr><tr><td id="name">Maximum File Age</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The maximum age that a file must be in order to be pulled; any file older than this amount of time (according to last modification date) will be ignored</td></tr><tr
 ><td id="name"><strong>Minimum File Size</strong></td><td id="default-value">0 B</td><td id="allowable-values"></td><td id="description">The minimum size that a file must be in order to be pulled</td></tr><tr><td id="name">Maximum File Size</td><td id="default-value"></td><td id="allowable-values"></td><td id="description">The maximum size that a file can be in order to be pulled</td></tr><tr><td id="name"><strong>Ignore Hidden Files</strong></td><td id="default-value">true</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Indicates whether or not hidden files should be ignored</td></tr><tr><td id="name"><strong>Target System Timestamp Precision</strong></td><td id="default-value">auto-detect</td><td id="allowable-values"><ul><li>Auto Detect <img src="../../../../../html/images/iconInfo.png" alt="Automatically detect time unit deterministically based on candidate entries timestamp. Please note that this option may take longer to list entitie
 s unnecessarily, if none of entries has a precise precision timestamp. E.g. even if a target system supports millis, if all entries only have timestamps without millis, such as '2017-06-16 09:06:34.000', then its precision is determined as 'seconds'." title="Automatically detect time unit deterministically based on candidate entries timestamp. Please note that this option may take longer to list entities unnecessarily, if none of entries has a precise precision timestamp. E.g. even if a target system supports millis, if all entries only have timestamps without millis, such as '2017-06-16 09:06:34.000', then its precision is determined as 'seconds'."></img></li><li>Milliseconds <img src="../../../../../html/images/iconInfo.png" alt="This option provides the minimum latency for an entry from being available to being listed if target system supports millis, if not, use other options." title="This option provides the minimum latency for an entry from being available to being listed if t
 arget system supports millis, if not, use other options."></img></li><li>Seconds <img src="../../../../../html/images/iconInfo.png" alt="For a target system that does not have millis precision, but has in seconds." title="For a target system that does not have millis precision, but has in seconds."></img></li><li>Minutes <img src="../../../../../html/images/iconInfo.png" alt="For a target system that only supports precision in minutes." title="For a target system that only supports precision in minutes."></img></li></ul></td><td id="description">Specify timestamp precision at the target system. Since this processor uses timestamp of entities to decide which should be listed, it is crucial to use the right timestamp precision.</td></tr><tr><td id="name">Entity Tracking State Cache</td><td id="default-value"></td><td id="allowable-values"><strong>Controller Service API: </strong><br/>DistributedMapCacheClient<br/><strong>Implementations: </strong><a href="../../../nifi-hbase_1_1_2-cli
 ent-service-nar/1.11.0/org.apache.nifi.hbase.HBase_1_1_2_ClientMapCacheService/index.html">HBase_1_1_2_ClientMapCacheService</a><br/><a href="../../../nifi-couchbase-nar/1.11.0/org.apache.nifi.couchbase.CouchbaseMapCacheClient/index.html">CouchbaseMapCacheClient</a><br/><a href="../../../nifi-redis-nar/1.11.0/org.apache.nifi.redis.service.RedisDistributedMapCacheClientService/index.html">RedisDistributedMapCacheClientService</a><br/><a href="../../../nifi-distributed-cache-services-nar/1.11.0/org.apache.nifi.distributed.cache.client.DistributedMapCacheClientService/index.html">DistributedMapCacheClientService</a><br/><a href="../../../nifi-hbase_2-client-service-nar/1.11.0/org.apache.nifi.hbase.HBase_2_ClientMapCacheService/index.html">HBase_2_ClientMapCacheService</a></td><td id="description">Listed entities are stored in the specified cache storage so that this processor can resume listing across NiFi restarts or in case of a primary node change. The 'Tracking Entities' strategy requires 
 tracking information of all listed entities within the last 'Tracking Time Window'. To support a large number of entities, the strategy uses DistributedMapCache instead of managed state. The cache key format is 'ListedEntities::{processorId}(::{nodeId})'. If listed entities are tracked per node, then the optional '::{nodeId}' part is added to manage state separately. E.g. cluster wide cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b', per node cache key = 'ListedEntities::8dda2321-0164-1000-50fa-3042fe7d6a7b::nifi-node3'. The stored cache content is a Gzipped JSON string. The cache key will be deleted when the target listing configuration is changed. Used by 'Tracking Entities' strategy.</td></tr><tr><td id="name">Entity Tracking Time Window</td><td id="default-value">3 hours</td><td id="allowable-values"></td><td id="description">Specify how long this processor should track already-listed entities. 'Tracking Entities' strategy can pick any entity whose timestamp is inside the spe
 cified time window. For example, if set to '30 minutes', any entity with a timestamp within the last 30 minutes will be a listing target when this processor runs. A listed entity is considered 'new/updated' and a FlowFile is emitted if any one of the following conditions is met: 1. it does not exist in the already-listed entities, 2. it has a newer timestamp than the cached entity, 3. it has a different size than the cached entity. If a cached entity's timestamp becomes older than the specified time window, that entity will be removed from the cached already-listed entities. Used by 'Tracking Entities' strategy.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Entity Tracking Initial Listing Target</td><td id="default-value">all</td><td id="allowable-values"><ul><li>Tracking Time Window <img src="../../../../../html/images/iconInfo.png" alt="Ignore entities having timestamp older than the specified 'Tracking Time Window' at the ini
 tial listing activity." title="Ignore entities having timestamp older than the specified 'Tracking Time Window' at the initial listing activity."></img></li><li>All Available <img src="../../../../../html/images/iconInfo.png" alt="Regardless of entities timestamp, all existing entities will be listed at the initial listing activity." title="Regardless of entities timestamp, all existing entities will be listed at the initial listing activity."></img></li></ul></td><td id="description">Specify how initial listing should be handled. Used by 'Tracking Entities' strategy.</td></tr><tr><td id="name">Entity Tracking Node Identifier</td><td id="default-value">${hostname()}</td><td id="allowable-values"></td><td id="description">The configured value will be appended to the cache key so that listing state can be tracked per NiFi node rather than cluster wide when tracking state is scoped to LOCAL. Used by 'Tracking Entities' strategy.<br/><strong>Supports Expression Language: true (will be e
 valuated using variable registry only)</strong></td></tr><tr><td id="name"><strong>Track Performance</strong></td><td id="default-value">false</td><td id="allowable-values"><ul><li>true</li><li>false</li></ul></td><td id="description">Whether or not the Processor should track the performance of disk access operations. If true, all accesses to disk will be recorded, including the file being accessed, the information being obtained, and how long it takes. This is then logged periodically at a DEBUG level. While the amount of data will be capped, this option may still consume a significant amount of heap (controlled by the 'Maximum Number of Files to Track' property), but it can be very useful for troubleshooting purposes if performance is poor or degraded.</td></tr><tr><td id="name"><strong>Maximum Number of Files to Track</strong></td><td id="default-value">100000</td><td id="allowable-values"></td><td id="description">If the 'Track Performance' property is set to 'true', this proper
 ty indicates the maximum number of files whose performance metrics should be held onto. A smaller value for this property will result in less heap utilization, while a larger value may provide more accurate insights into how the disk access operations are performing<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Max Disk Operation Time</td><td id="default-value">10 secs</td><td id="allowable-values"></td><td id="description">The maximum amount of time that any single disk operation is expected to take. If any disk operation takes longer than this amount of time, a warning bulletin will be generated for each operation that exceeds this amount of time.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr><tr><td id="name">Max Directory Listing Time</td><td id="default-value">3 mins</td><td id="allowable-values"></td><td id="description">The 
 maximum amount of time that listing any single directory is expected to take. If the listing for the directory specified by the 'Input Directory' property, or the listing of any subdirectory (if 'Recurse' is set to true) takes longer than this amount of time, a warning bulletin will be generated for each directory listing that exceeds this amount of time.<br/><strong>Supports Expression Language: true (will be evaluated using variable registry only)</strong></td></tr></table><h3>Relationships: </h3><table id="relationships"><tr><th>Name</th><th>Description</th></tr><tr><td>success</td><td>All FlowFiles that are received are routed to success</td></tr></table><h3>Reads Attributes: </h3>None specified.<h3>Writes Attributes: </h3><table id="writes-attributes"><tr><th>Name</th><th>Description</th></tr><tr><td>filename</td><td>The name of the file that was read from filesystem.</td></tr><tr><td>path</td><td>The path is set to the relative path of the file's directory on filesystem compar
 ed to the Input Directory property. For example, if Input Directory is set to /tmp, then files picked up from /tmp will have the path attribute set to "/". If the Recurse Subdirectories property is set to true and a file is picked up from /tmp/abc/1/2/3, then the path attribute will be set to "abc/1/2/3/".</td></tr><tr><td>absolute.path</td><td>The absolute.path is set to the absolute path of the file's directory on filesystem. For example, if the Input Directory property is set to /tmp, then files picked up from /tmp will have the absolute.path attribute set to "/tmp/". If the Recurse Subdirectories property is set to true and a file is picked up from /tmp/abc/1/2/3, then the absolute.path attribute will be set to "/tmp/abc/1/2/3/".</td></tr><tr><td>file.owner</td><td>The user that owns the file in filesystem</td></tr><tr><td>file.group</td><td>The group that owns the file in filesystem</td></tr><tr><td>file.size</td><td>The number of bytes in the file in filesystem</td></tr><tr><td>file.permission
 s</td><td>The permissions for the file in filesystem. This is formatted as 3 characters for the owner, 3 for the group, and 3 for other users. For example rw-rw-r--</td></tr><tr><td>file.lastModifiedTime</td><td>The timestamp of when the file in filesystem was last modified as 'yyyy-MM-dd'T'HH:mm:ssZ'</td></tr><tr><td>file.lastAccessTime</td><td>The timestamp of when the file in filesystem was last accessed as 'yyyy-MM-dd'T'HH:mm:ssZ'</td></tr><tr><td>file.creationTime</td><td>The timestamp of when the file in filesystem was created as 'yyyy-MM-dd'T'HH:mm:ssZ'</td></tr></table><h3>State management: </h3><table id="stateful"><tr><th>Scope</th><th>Description</th></tr><tr><td>LOCAL, CLUSTER</td><td>After performing a listing of files, the timestamp of the newest file is stored. This allows the Processor to list only files that have been added or modified after this date the next time that the Processor is run. Whether the state is stored with a Local or Cluster scope depends on the va
 lue of the &lt;Input Directory Location&gt; property.</td></tr></table><h3>Restricted: </h3>This component is not restricted.<h3>Input requirement: </h3>This component does not allow an incoming relationship.<h3>System Resource Considerations:</h3>None specified.<h3>See Also:</h3><p><a href="../org.apache.nifi.processors.standard.GetFile/index.html">GetFile</a>, <a href="../org.apache.nifi.processors.standard.PutFile/index.html">PutFile</a>, <a href="../org.apache.nifi.processors.standard.FetchFile/index.html">FetchFile</a></p></body></html>
\ No newline at end of file