Posted to commits@nifi.apache.org by jo...@apache.org on 2014/12/28 21:23:54 UTC

incubator-nifi git commit: NIFI-162 filled in a basic overview. Definitely still needs work though. Too wordy. Too specific.

Repository: incubator-nifi
Updated Branches:
  refs/heads/develop b6f2dd280 -> 87b07384a


NIFI-162 filled in a basic overview.  Definitely still needs work though.  Too wordy.  Too specific.


Project: http://git-wip-us.apache.org/repos/asf/incubator-nifi/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-nifi/commit/87b07384
Tree: http://git-wip-us.apache.org/repos/asf/incubator-nifi/tree/87b07384
Diff: http://git-wip-us.apache.org/repos/asf/incubator-nifi/diff/87b07384

Branch: refs/heads/develop
Commit: 87b07384a5b80886f8b088000fab8676ce482f0d
Parents: b6f2dd2
Author: joewitt <jo...@apache.org>
Authored: Sun Dec 28 15:23:45 2014 -0500
Committer: joewitt <jo...@apache.org>
Committed: Sun Dec 28 15:23:45 2014 -0500

----------------------------------------------------------------------
 .../src/main/asciidoc/administration-guide.adoc |   2 +
 .../src/main/asciidoc/developer-guide.adoc      |   2 +
 nifi-docs/src/main/asciidoc/overview.adoc       | 122 +++++++++++++++++--
 nifi-docs/src/main/asciidoc/user-guide.adoc     |   2 +
 pom.xml                                         |   5 +-
 5 files changed, 120 insertions(+), 13 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-nifi/blob/87b07384/nifi-docs/src/main/asciidoc/administration-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/administration-guide.adoc b/nifi-docs/src/main/asciidoc/administration-guide.adoc
index d3e1def..529bddf 100644
--- a/nifi-docs/src/main/asciidoc/administration-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/administration-guide.adoc
@@ -16,6 +16,8 @@
 //
 NiFi System Administrator's Guide
 =================================
+Apache NiFi Team <de...@nifi.incubator.apache.org>
+:homepage: http://nifi.incubator.apache.org
 
 How to install
 --------------

http://git-wip-us.apache.org/repos/asf/incubator-nifi/blob/87b07384/nifi-docs/src/main/asciidoc/developer-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/developer-guide.adoc b/nifi-docs/src/main/asciidoc/developer-guide.adoc
index 90e2465..bfaa669 100644
--- a/nifi-docs/src/main/asciidoc/developer-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/developer-guide.adoc
@@ -16,6 +16,8 @@
 //
 NiFi Developer's Guide
 ======================
+Apache NiFi Team <de...@nifi.incubator.apache.org>
+:homepage: http://nifi.incubator.apache.org
 
 The designed points of extension
 --------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-nifi/blob/87b07384/nifi-docs/src/main/asciidoc/overview.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/overview.adoc b/nifi-docs/src/main/asciidoc/overview.adoc
index 4fbc99b..7398394 100644
--- a/nifi-docs/src/main/asciidoc/overview.adoc
+++ b/nifi-docs/src/main/asciidoc/overview.adoc
@@ -14,18 +14,118 @@
 // See the License for the specific language governing permissions and
 // limitations under the License.
 //
-NiFi Overview
-=============
+Apache NiFi Overview
+====================
+Apache NiFi Team <de...@nifi.incubator.apache.org>
+:homepage: http://nifi.incubator.apache.org
 
-The problem NiFi solves
------------------------
-Dataflow at scale...
+What is Apache NiFi?
+--------------------
+Put simply, NiFi was built to automate the flow of data between systems.  While
+the term 'dataflow' is used in a variety of contexts, we use it here
+to mean the automated and managed flow of information between systems.  This
+problem space has existed ever since enterprises had more than one system,
+with some systems creating data and others consuming it.
+The problems and solution patterns that emerged have been discussed and
+articulated extensively.  A comprehensive and readily consumed form is found in
+the _Enterprise Integration Patterns_ <<eip>>.
 
-The design philosophy of NiFi
------------------------------
-FBP, ...
+Over the years, dataflow has been one of those necessary evils in an
+architecture.  Now, though, a number of active and rapidly evolving
+movements are making dataflow far more interesting and far more vital to the
+success of a given enterprise.  These include Service Oriented
+Architecture <<soa>>, the rise of the API <<api>><<api2>>, the Internet of Things <<iot>>,
+and Big Data <<bigdata>>.  In addition, the level of rigor necessary for
+compliance, privacy, and security is constantly on the rise.  Even with
+all of these new concepts, the patterns and needs of dataflow are
+still largely the same.  The primary differences are the scope of
+complexity, the rate of change necessary to adapt, and the fact that at scale
+the edge case becomes a common occurrence.  NiFi is built to help tackle these
+modern dataflow challenges.
 
-Key Features
-------------
-UI, compponent-based, high performance, provenance
+The core concepts of NiFi
+-------------------------
 
+NiFi's fundamental design concepts closely relate to the main ideas of Flow Based
+Programming <<fbp>>.  Here are some of the main NiFi concepts and how they map to FBP:
+
+[grid="rows"]
+[options="header",cols="3,3,10"]
+|===========================
+| NiFi Term | FBP Term | Description
+
+| FlowFile | Information Packet | 
+A FlowFile represents each object moving through the system.  For each one, NiFi
+keeps track of a map of key/value pair attribute strings and its associated
+content of zero or more bytes.
+
+| FlowFile Processor | Black Box | 
+Processors actually perform the work.  In <<eip>> terms, a processor
+does some combination of data routing, transformation, or mediation between
+systems.  Processors have access to the attributes of a given FlowFile and its
+content stream.  Processors can operate on zero or more FlowFiles in a given unit of work
+and either commit that work or roll it back.
+
+| Connection | Bounded Buffer | 
+Connections provide the actual linkage between processors.  They act as queues
+and allow various processes to interact at differing rates.  These queues
+can be prioritized dynamically and can have upper bounds on load, which enable
+back pressure.
+
+| Flow Controller | Scheduler | 
+The Flow Controller maintains the knowledge of how processes connect
+and manages the threads, and their allocation, that all processes use.  The
+Flow Controller acts as the broker facilitating the exchange of FlowFiles
+between processors.
+
+| Process Group | Subnet |
+A Process Group is a specific set of processes and their connections, which can
+receive data via input ports and send data out via output ports.  In
+this manner, process groups allow the creation of entirely new components simply by
+composing other components.
+
+|===========================
+
+This design model, which is also similar to <<seda>>, has many beneficial consequences that help NiFi
+be a very effective platform for building powerful and scalable dataflows.
+A few of these benefits include:
+
+* Lends itself well to visual creation and management of directed graphs of processors
+* Is inherently asynchronous which allows for very high throughput and natural buffering even as processing and flow rates fluctuate
+* Provides a highly concurrent model without a developer having to worry about the typical complexities of concurrency
+* Promotes the development of cohesive and loosely coupled components, which can then be reused in other contexts, and promotes testable units
+* Resource-constrained connections make critical functions such as back-pressure and pressure release very natural and intuitive
+* Error handling becomes as natural as the happy path rather than a coarse-grained catch-all
+* The points at which data enters and exits the system as well as how it flows through are well understood and easily tracked
+
+Dataflow Challenges : NiFi Features
+-----------------------------------
+* Systems fail
+** Explanation: Networks fail, disks fail, software crashes, people make mistakes.
+** Features: Fault-tolerance, buffering, durability, flow-specific QoS, data provenance, recovery/go back in time, visual command and control
+* Data access exceeds capacity to consume
+** Explanation: Sometimes a given data source can outpace some part of the processing or delivery chain.  It only takes one weak link to cause a problem.
+** Features: Prioritization, Back-pressure, congestion-avoidance, QoS (some things are critical and some are not)
+* Boundary conditions are mere suggestions
+** Explanation: You will get data that is too big, too small, too fast, too slow, corrupt, wrong, or in the wrong format.
+** Features: Flow-specific latency vs throughput tradeoffs, flow-specific loss tolerance vs guaranteed delivery, extensible transformations
+* What is noise one day becomes signal the next
+** Explanation: Priorities of an organization change - rapidly.  Enabling new flows and changing existing ones must be fast.
+** Features:  Dynamic prioritization of data.  Go back in time (rolling buffer of recorded history).  Real-time visual command and control.  Changes are immediate and fine-grained.
+* Compliance and security
+** Explanation: Laws and regulations change.  Business to business agreements change.  System to system and system to user interactions must be secure and trusted.
+** Features: 2-Way SSL.  Pluggable authentication and authorization.  Data provenance.
+* Continuous improvement occurs in production
+** Explanation: It is often not possible to come even close to replicating production environments in the lab.
+** Features: Flow-specific QoS.  Cheap copy-on-write.  Data provenance.  It is safe to tee a flow to an unreliable or non-production system.
+
+== References
+[bibliography]
+- [[[eip]]] Gregor Hohpe. Enterprise Integration Patterns [online].  Retrieved: 27 Dec 2014, from: http://www.enterpriseintegrationpatterns.com/
+- [[[soa]]] Wikipedia. Service Oriented Architecture [online]. Retrieved: 27 Dec 2014, from: http://en.wikipedia.org/wiki/Service-oriented_architecture
+- [[[api]]] Eric Savitz.  Welcome to the API Economy [online].  Forbes.com. Retrieved: 27 Dec 2014, from: http://www.forbes.com/sites/ciocentral/2012/08/29/welcome-to-the-api-economy/
+- [[[api2]]] Adam Duvander.  The rise of the API economy and consumer-led ecosystems [online]. thenextweb.com.  Retrieved: 27 Dec 2014, from: http://thenextweb.com/dd/2014/03/28/api-economy/
+- [[[iot]]] Wikipedia. Internet of Things [online]. Retrieved: 27 Dec 2014, from: http://en.wikipedia.org/wiki/Internet_of_Things
+- [[[bigdata]]] Wikipedia.  Big Data [online].  Retrieved: 27 Dec 2014, from: http://en.wikipedia.org/wiki/Big_data
+- [[[fbp]]] Wikipedia.  Flow Based Programming [online].  Retrieved: 28 Dec 2014, from: http://en.wikipedia.org/wiki/Flow-based_programming#Concepts
+- [[[seda]]] Matt Welsh.  Harvard.  SEDA: An Architecture for Highly Concurrent Server Applications [online].  Retrieved: 28 Dec 2014, from: http://www.eecs.harvard.edu/~mdw/proj/seda/
\ No newline at end of file
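
As an illustration of the Connection row in the overview table above (a queue between processors whose upper bound on load enables back pressure), here is a minimal Java sketch. It is not NiFi's API: `SketchFlowFile`, `SketchConnection`, and `BackPressureSketch` are hypothetical names introduced only for this example, and the bounded queue is simply the standard `java.util.concurrent.ArrayBlockingQueue`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical stand-in for a FlowFile: string attributes plus content bytes. */
final class SketchFlowFile {
    final Map<String, String> attributes;
    final byte[] content; // zero or more bytes

    SketchFlowFile(Map<String, String> attributes, byte[] content) {
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
        this.content = content.clone();
    }
}

/** Hypothetical stand-in for a Connection: a bounded queue between two processors. */
final class SketchConnection {
    private final BlockingQueue<SketchFlowFile> queue;

    SketchConnection(int maxQueued) {
        this.queue = new ArrayBlockingQueue<>(maxQueued);
    }

    /** Returns false when the queue is full, signalling the upstream side to hold off. */
    boolean offer(SketchFlowFile flowFile) {
        return queue.offer(flowFile);
    }

    /** Returns the next queued FlowFile, or null when the queue is empty. */
    SketchFlowFile poll() {
        return queue.poll();
    }
}

final class BackPressureSketch {
    public static void main(String[] args) {
        SketchConnection connection = new SketchConnection(2);
        SketchFlowFile flowFile = new SketchFlowFile(
                Collections.singletonMap("filename", "a.txt"), new byte[0]);

        System.out.println(connection.offer(flowFile)); // true
        System.out.println(connection.offer(flowFile)); // true
        System.out.println(connection.offer(flowFile)); // false: bound reached
        connection.poll();                               // downstream consumes one
        System.out.println(connection.offer(flowFile)); // true
    }
}
```

The refused `offer` is the back-pressure signal in this sketch: the upstream side learns the bound has been reached and can slow down or buffer elsewhere until the downstream side drains the queue. How NiFi itself applies such bounds is not described in this commit.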

http://git-wip-us.apache.org/repos/asf/incubator-nifi/blob/87b07384/nifi-docs/src/main/asciidoc/user-guide.adoc
----------------------------------------------------------------------
diff --git a/nifi-docs/src/main/asciidoc/user-guide.adoc b/nifi-docs/src/main/asciidoc/user-guide.adoc
index 8d145c2..ff26f0f 100644
--- a/nifi-docs/src/main/asciidoc/user-guide.adoc
+++ b/nifi-docs/src/main/asciidoc/user-guide.adoc
@@ -16,6 +16,8 @@
 //
 NiFi User Guide
 ===============
+Apache NiFi Team <de...@nifi.incubator.apache.org>
+:homepage: http://nifi.incubator.apache.org
 
 [template="glossary", id="terminology"]
 Terminology

http://git-wip-us.apache.org/repos/asf/incubator-nifi/blob/87b07384/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index 58831a9..1de1a0e 100644
--- a/pom.xml
+++ b/pom.xml
@@ -12,7 +12,8 @@
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
---><project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
     <modelVersion>4.0.0</modelVersion>
     <parent>
         <groupId>org.apache</groupId>
@@ -58,7 +59,7 @@
       ! http://jira.codehaus.org/browse/MNG-5297
     -->
     <prerequisites>
-      <maven>${maven.version}</maven>
+        <maven>${maven.version}</maven>
     </prerequisites>
     <modules>
         <!--