Posted to commits@falcon.apache.org by ba...@apache.org on 2016/08/08 23:15:38 UTC

[04/49] falcon git commit: FALCON-2006 Update documentation on site for 0.10 release

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/HiveIntegration.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/HiveIntegration.twiki b/trunk/releases/0.10/src/site/twiki/HiveIntegration.twiki
new file mode 100644
index 0000000..688305d
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/HiveIntegration.twiki
@@ -0,0 +1,372 @@
+---+ Hive Integration
+
+---++ Overview
+Falcon provides data management functions for feeds declaratively. It allows users to represent feed locations as
+time-based partition directories on HDFS containing files.
+
+Hive provides a simple and familiar database-like tabular model of data management to its users,
+which is backed by HDFS. It supports two classes of tables, managed tables and external tables.
+
+Falcon allows users to represent feed locations as Hive tables. Falcon supports both managed and external tables
+and provides data management services for tables such as replication, eviction, archival, etc. Falcon will notify
+HCatalog as a side effect of either acquiring, replicating or evicting a data set instance and adds the
+missing capability of HCatalog table replication.
+
+In the near future, Falcon will allow users to express pipeline processing in Hive scripts
+apart from Pig and Oozie workflows.
+
+
+---++ Assumptions
+   * Date is a mandatory first-level partition for Hive tables
+      * Data availability triggers are based on date pattern in Oozie
+   * Tables must be created in Hive prior to adding them as a Feed in Falcon.
+      * Duplicating this in Falcon would create confusion about the real source of truth. Also, propagating schema changes
+    between systems is a hard problem.
+   * Falcon does not know about the encoding of the data; the data should be in an HCatalog-supported format.
+
+---++ Configuration
+Falcon provides a system level option to enable Hive integration. Falcon must be configured with an implementation
+for the catalog registry. The default implementation for Hive is shipped with Falcon.
+
+<verbatim>
+catalog.service.impl=org.apache.falcon.catalog.HiveCatalogService
+</verbatim>
+
+
+---++ Incompatible changes
+Falcon depends heavily on data-availability triggers for scheduling Falcon workflows. Oozie must support
+data-availability triggers based on HCatalog partition availability. This is only available in Oozie 4.x.
+
+Hence, Hive support in Falcon requires Oozie 4.x.
+
+
+---++ Oozie Shared Library setup
+Falcon post Hive integration depends heavily on the [[http://oozie.apache.org/docs/4.0.1/WorkflowFunctionalSpec.html#a17_HDFS_Share_Libraries_for_Workflow_Applications_since_Oozie_2.3][shared library feature of Oozie]].
+Since the number of jars for HCatalog, Pig and Hive runs into the many tens, it is quite daunting to
+redistribute the dependent jars from Falcon.
+
+[[http://oozie.apache.org/docs/4.0.1/DG_QuickStart.html#Oozie_Share_Lib_Installation][This is a one time effort in Oozie setup and is quite straightforward.]]
+
+
+---++ Approach
+
+---+++ Entity Changes
+
+   * Cluster DSL will have an additional registry-interface section, specifying the endpoint for the
+HCatalog server. If this is absent, no HCatalog publication will be done from Falcon for this cluster.
+      <verbatim>thrift://hcatalog-server:port</verbatim>
+   * Feed DSL will allow users to specify the URI (location) for HCatalog tables as:
+      <verbatim>catalog:database_name:table_name#partitions(key=value?)*</verbatim>
+   * Failure to publish to HCatalog will be retried (configurable number of retries) with backoff. Permanent failures
+   after all the retries are exhausted will fail the Falcon workflow.
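+
+For example, a feed backed by a hypothetical clicks table in a logs_db database, partitioned by a dated ds
+column, could be referenced as (names are illustrative only):
+<verbatim>
+catalog:logs_db:clicks#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}
+</verbatim>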
+
+---+++ Eviction
+
+   * Falcon will construct DDL statements to filter candidate partitions eligible for eviction
+   * Falcon will construct DDL statements to drop the eligible partitions
+   * Additionally, Falcon will nuke the data on HDFS for external tables
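+
+As a rough illustration, evicting an expired dated partition for the table used in the replication example later in
+this document would amount to statements of the following shape (partition value and HDFS path are hypothetical):
+<verbatim>
+# drop a partition that has aged beyond the retention limit
+hive -e "USE src_demo_db; ALTER TABLE customer_raw DROP IF EXISTS PARTITION (ds='2013-09-01-00');"
+
+# for external tables, Falcon additionally removes the underlying HDFS data, e.g.:
+hadoop fs -rm -r /data/customer_raw/ds=2013-09-01-00
+</verbatim>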
+
+
+---+++ Replication
+
+   * Falcon will use HCatalog (Hive) API to export the data for a given table and the partition,
+which will result in a data collection that includes metadata on the data's storage format, the schema,
+how the data is sorted, what table the data came from, and values of any partition keys from that table.
+   * Falcon will use the distcp tool to copy the exported data collection into a staging directory used by Falcon
+on the secondary cluster.
+   * Falcon will then import the data into HCatalog (Hive) using the HCatalog (Hive) API. If the specified table does
+not yet exist, Falcon will create it, using the information in the imported metadata to set defaults for the
+table such as schema, storage format, etc.
+   * The partition is not complete, and hence not visible to users, until all the data is committed on the secondary
+cluster (no dirty reads).
+   * The data collection is staged by Falcon, and retries of the copy continue from where they left off.
+   * Failure to register with Hive will be retried. After all the attempts are exhausted,
+the data will be cleaned up by Falcon.
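+
+A rough manual equivalent of the sequence Falcon orchestrates, using the table names from the feed example later in
+this document (namenode addresses and staging paths are hypothetical):
+<verbatim>
+# 1. export one dated partition on the primary cluster
+hive -e "USE src_demo_db; EXPORT TABLE customer_raw PARTITION (ds='2013-09-24-00') TO '/apps/falcon/staging/export/customer_raw/2013-09-24-00';"
+
+# 2. copy the exported data and metadata into the staging area on the secondary cluster
+hadoop distcp hdfs://primary-nn:8020/apps/falcon/staging/export/customer_raw/2013-09-24-00 \
+              hdfs://bcp-nn:8020/apps/falcon/staging/import/customer_raw/2013-09-24-00
+
+# 3. import on the secondary cluster; the target table is created if it does not exist
+hive -e "USE tgt_demo_db; IMPORT TABLE customer_bcp PARTITION (ds='2013-09-24-00') FROM '/apps/falcon/staging/import/customer_raw/2013-09-24-00';"
+</verbatim>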
+
+
+---+++ Security
+The user owns all data managed by Falcon. Falcon runs as the user who submitted the feed. Falcon will authenticate
+with HCatalog as the end user who owns the entity and the data.
+
+For Hive managed tables, the table may be owned by the end user or "hive". For "hive" owned tables,
+the user will have to configure the feed as "hive".
+
+
+---++ Load on HCatalog from Falcon
+It generally depends on the frequency of the feeds configured in Falcon and how often data is ingested, replicated,
+or processed.
+
+
+---++ User Impact
+   * There should not be any impact to user due to this integration
+   * Falcon will be fully backwards compatible 
+   * Users have a choice to either choose storage based on files on HDFS as they do today or use HCatalog for
+accessing the data in tables
+
+
+---++ Known Limitations
+
+---+++ Oozie
+
+   * Falcon with Hadoop 1.x requires copying guava jars manually to sharelib in oozie. Hadoop 2.x ships this.
+   * hcatalog-pig-adapter needs to be copied manually to oozie sharelib.
+<verbatim>
+bin/hadoop dfs -copyFromLocal $LFS/share/lib/hcatalog/hcatalog-pig-adapter-0.5.0-incubating.jar share/lib/hcatalog
+</verbatim>
+   * Oozie 4.x with Hadoop-2.x
+Replication jobs are submitted to Oozie on the destination cluster. Oozie runs a table export job
+on the ResourceManager of the source cluster. The Oozie server on the target cluster must be configured with the
+source cluster's hadoop configs, else jobs fail on both secure and non-secure clusters with errors such as:
+<verbatim>
+org.apache.hadoop.security.token.SecretManager$InvalidToken: Password not found for ApplicationAttempt appattempt_1395965672651_0010_000002
+</verbatim>
+
+Make sure all Oozie servers that Falcon talks to have the hadoop configs configured in oozie-site.xml:
+<verbatim>
+<property>
+      <name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
+      <value>*=/etc/hadoop/conf,arpit-new-falcon-1.cs1cloud.internal:8020=/etc/hadoop-1,arpit-new-falcon-1.cs1cloud.internal:8032=/etc/hadoop-1,arpit-new-falcon-2.cs1cloud.internal:8020=/etc/hadoop-2,arpit-new-falcon-2.cs1cloud.internal:8032=/etc/hadoop-2,arpit-new-falcon-5.cs1cloud.internal:8020=/etc/hadoop-3,arpit-new-falcon-5.cs1cloud.internal:8032=/etc/hadoop-3</value>
+      <description>
+          Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
+          the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
+          used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
+          the relevant Hadoop *-site.xml files. If the path is relative is looked within
+          the Oozie configuration directory; though the path can be absolute (i.e. to point
+          to Hadoop client conf/ directories in the local filesystem.
+      </description>
+    </property>
+</verbatim>
+
+---+++ Hive
+
+   * Dated Partitions
+Falcon does not work well when a table partition contains multiple dated columns. Falcon only works
+with a single dated partition. This is being tracked in FALCON-357, which is a limitation in Oozie.
+<verbatim>
+catalog:default:table4#year=${YEAR};month=${MONTH};day=${DAY};hour=${HOUR};minute=${MINUTE}
+</verbatim>
+
+   * [[https://issues.apache.org/jira/browse/HIVE-5550][Hive table import fails for tables created with default text and sequence file formats using HCatalog API]]
+For some arcane reason, Hive substitutes the output format for text and sequence files with a Hive-prefixed class.
+Hive table import fails since it compares the input and output formats of the source table against those of the
+target and they differ. Say a table was created without specifying the file format; it defaults to:
+<verbatim>
+fileFormat=TextFile, inputformat=org.apache.hadoop.mapred.TextInputFormat, outputformat=org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat
+</verbatim>
+
+But when Hive fetches the table from the metastore, it replaces the output format with org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
+and the comparison between the source and target tables fails.
+<verbatim>
+org.apache.hadoop.hive.ql.parse.ImportSemanticAnalyzer#checkTable
+      // check IF/OF/Serde
+      String existingifc = table.getInputFormatClass().getName();
+      String importedifc = tableDesc.getInputFormat();
+      String existingofc = table.getOutputFormatClass().getName();
+      String importedofc = tableDesc.getOutputFormat();
+      if ((!existingifc.equals(importedifc))
+          || (!existingofc.equals(importedofc))) {
+        throw new SemanticException(
+            ErrorMsg.INCOMPATIBLE_SCHEMA
+                .getMsg(" Table inputformat/outputformats do not match"));
+      }
+</verbatim>
+The above is not an issue with Hive 0.13.
+
+---++ Hive Examples
+Following are example entity configurations for lifecycle management functions for tables in Hive.
+
+---+++ Hive Table Lifecycle Management - Replication and Retention
+
+---++++ Primary Cluster
+
+<verbatim>
+<?xml version="1.0"?>
+<!--
+    Primary cluster configuration for demo vm
+  -->
+<cluster colo="west-coast" description="Primary Cluster"
+         name="primary-cluster"
+         xmlns="uri:falcon:cluster:0.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+    <interfaces>
+        <interface type="readonly" endpoint="hftp://localhost:10070"
+                   version="1.1.1" />
+        <interface type="write" endpoint="hdfs://localhost:10020"
+                   version="1.1.1" />
+        <interface type="execute" endpoint="localhost:10300"
+                   version="1.1.1" />
+        <interface type="workflow" endpoint="http://localhost:11010/oozie/"
+                   version="4.0.1" />
+        <interface type="registry" endpoint="thrift://localhost:19083"
+                   version="0.11.0" />
+        <interface type="messaging" endpoint="tcp://localhost:61616?daemon=true"
+                   version="5.4.3" />
+    </interfaces>
+    <locations>
+        <location name="staging" path="/apps/falcon/staging" />
+        <location name="temp" path="/tmp" />
+        <location name="working" path="/apps/falcon/working" />
+    </locations>
+</cluster>
+</verbatim>
+
+---++++ BCP Cluster
+
+<verbatim>
+<?xml version="1.0"?>
+<!--
+    BCP cluster configuration for demo vm
+  -->
+<cluster colo="east-coast" description="BCP Cluster"
+         name="bcp-cluster"
+         xmlns="uri:falcon:cluster:0.1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+    <interfaces>
+        <interface type="readonly" endpoint="hftp://localhost:20070"
+                   version="1.1.1" />
+        <interface type="write" endpoint="hdfs://localhost:20020"
+                   version="1.1.1" />
+        <interface type="execute" endpoint="localhost:20300"
+                   version="1.1.1" />
+        <interface type="workflow" endpoint="http://localhost:11020/oozie/"
+                   version="4.0.1" />
+        <interface type="registry" endpoint="thrift://localhost:29083"
+                   version="0.11.0" />
+        <interface type="messaging" endpoint="tcp://localhost:61616?daemon=true"
+                   version="5.4.3" />
+    </interfaces>
+    <locations>
+        <location name="staging" path="/apps/falcon/staging" />
+        <location name="temp" path="/tmp" />
+        <location name="working" path="/apps/falcon/working" />
+    </locations>
+</cluster>
+</verbatim>
+
+---++++ Feed with replication and eviction policy
+
+<verbatim>
+<?xml version="1.0"?>
+<!--
+    Replicating Hourly customer table from primary to secondary cluster.
+  -->
+<feed description="Replicating customer table feed" name="customer-table-replicating-feed"
+      xmlns="uri:falcon:feed:0.1">
+    <frequency>hours(1)</frequency>
+    <timezone>UTC</timezone>
+
+    <clusters>
+        <cluster name="primary-cluster" type="source">
+            <validity start="2013-09-24T00:00Z" end="2013-10-26T00:00Z"/>
+            <retention limit="hours(2)" action="delete"/>
+        </cluster>
+        <cluster name="bcp-cluster" type="target">
+            <validity start="2013-09-24T00:00Z" end="2013-10-26T00:00Z"/>
+            <retention limit="days(30)" action="delete"/>
+
+            <table uri="catalog:tgt_demo_db:customer_bcp#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}" />
+        </cluster>
+    </clusters>
+
+    <table uri="catalog:src_demo_db:customer_raw#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}" />
+
+    <ACL owner="seetharam" group="users" permission="0755"/>
+    <schema location="" provider="hcatalog"/>
+</feed>
+</verbatim>
+
+
+---+++ Hive Table used in Processing Pipelines
+
+---++++ Primary Cluster
+The cluster definition from the lifecycle example can be used.
+
+---++++ Input Feed
+
+<verbatim>
+<?xml version="1.0"?>
+<feed description="clicks log table " name="input-table" xmlns="uri:falcon:feed:0.1">
+    <groups>online,bi</groups>
+    <frequency>hours(1)</frequency>
+    <timezone>UTC</timezone>
+
+    <clusters>
+        <cluster name="##cluster##" type="source">
+            <validity start="2010-01-01T00:00Z" end="2012-04-21T00:00Z"/>
+            <retention limit="hours(24)" action="delete"/>
+        </cluster>
+    </clusters>
+
+    <table uri="catalog:falcon_db:input_table#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}" />
+
+    <ACL owner="testuser" group="group" permission="0755"/>
+    <schema location="/schema/clicks" provider="protobuf"/>
+</feed>
+</verbatim>
+
+
+---++++ Output Feed
+
+<verbatim>
+<?xml version="1.0"?>
+<feed description="clicks log identity table" name="output-table" xmlns="uri:falcon:feed:0.1">
+    <groups>online,bi</groups>
+    <frequency>hours(1)</frequency>
+    <timezone>UTC</timezone>
+
+    <clusters>
+        <cluster name="##cluster##" type="source">
+            <validity start="2010-01-01T00:00Z" end="2012-04-21T00:00Z"/>
+            <retention limit="hours(24)" action="delete"/>
+        </cluster>
+    </clusters>
+
+    <table uri="catalog:falcon_db:output_table#ds=${YEAR}-${MONTH}-${DAY}-${HOUR}" />
+
+    <ACL owner="testuser" group="group" permission="0755"/>
+    <schema location="/schema/clicks" provider="protobuf"/>
+</feed>
+</verbatim>
+
+
+---++++ Process
+
+<verbatim>
+<?xml version="1.0"?>
+<process name="##processName##" xmlns="uri:falcon:process:0.1">
+    <clusters>
+        <cluster name="##cluster##">
+            <validity end="2012-04-22T00:00Z" start="2012-04-21T00:00Z"/>
+        </cluster>
+    </clusters>
+
+    <parallel>1</parallel>
+    <order>FIFO</order>
+    <frequency>days(1)</frequency>
+    <timezone>UTC</timezone>
+
+    <inputs>
+        <input end="today(0,0)" start="today(0,0)" feed="input-table" name="input"/>
+    </inputs>
+
+    <outputs>
+        <output instance="now(0,0)" feed="output-table" name="output"/>
+    </outputs>
+
+    <properties>
+        <property name="blah" value="blah"/>
+    </properties>
+
+    <workflow engine="pig" path="/falcon/test/apps/pig/table-id.pig"/>
+
+    <retry policy="periodic" delay="minutes(10)" attempts="3"/>
+</process>
+</verbatim>
+
+
+---++++ Pig Script
+
+<verbatim>
+A = load '$input_database.$input_table' using org.apache.hcatalog.pig.HCatLoader();
+B = FILTER A BY $input_filter;
+C = foreach B generate id, value;
+store C into '$output_database.$output_table' USING org.apache.hcatalog.pig.HCatStorer('$output_dataout_partitions');
+</verbatim>

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/HiveMirroring.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/HiveMirroring.twiki b/trunk/releases/0.10/src/site/twiki/HiveMirroring.twiki
new file mode 100644
index 0000000..e28502a
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/HiveMirroring.twiki
@@ -0,0 +1,63 @@
+---+Hive Mirroring
+
+---++Overview
+Falcon provides a feature to replicate Hive metadata and data events from a source cluster to a destination cluster. This is supported for both secure and unsecure clusters through Falcon extensions.
+
+---++Prerequisites
+Following are the prerequisites to use Hive Mirroring:
+
+   * *Hive 1.2.0+*
+   * *Oozie 4.2.0+*
+
+*Note:* Set the following properties in hive-site.xml on both the source and destination Hive clusters to enable replication of Hive events:
+<verbatim>
+    <property>
+        <name>hive.metastore.event.listeners</name>
+        <value>org.apache.hive.hcatalog.listener.DbNotificationListener</value>
+        <description>event listeners that are notified of any metastore changes</description>
+    </property>
+
+    <property>
+        <name>hive.metastore.dml.events</name>
+        <value>true</value>
+    </property>
+</verbatim>
+
+---++ Use Case
+   * Replicate data/metadata of a Hive DB & table from a source to a target cluster
+
+---++ Limitations
+   * Currently Hive doesn't support replication events for create database, roles, views, offline tables, direct HDFS writes without registering with the metastore, and database/table name mapping. Hence the Hive mirroring extension cannot be used to replicate the above mentioned events between warehouses.
+
+---++ Usage
+---+++ Bootstrap
+   Perform an initial bootstrap of the Database and Table from the source cluster to the destination cluster.
+   * *Database Bootstrap*
+     For bootstrapping DB replication, the destination DB should be created first. This step is expected,
+     since DB replication definitions can be set up by users only on pre-existing DBs. Second, export all tables in
+     the source DB and import them into the destination DB, as described in Table bootstrap.
+
+   * *Table Bootstrap*
+     For bootstrapping table replication, essentially after having turned on the !DbNotificationListener
+     on the source DB, perform an Export of the table, distcp the export over to the destination
+     warehouse and do an Import over there (see the command sketch after this list). Check the
+     [[https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ImportExport][Hive Export-Import]] documentation for syntax details
+     and examples.
+     This will set up the destination table so that the events on the source cluster that modify the table
+     will then be replicated.
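+
+   A minimal sketch of the table bootstrap, assuming a hypothetical sales.orders table and staging paths:
+   <verbatim>
+    # on the source cluster: export the table to a staging location
+    hive -e "USE sales; EXPORT TABLE orders TO '/apps/falcon/bootstrap/orders';"
+
+    # copy the export over to the destination warehouse
+    hadoop distcp hdfs://source-nn:8020/apps/falcon/bootstrap/orders hdfs://dest-nn:8020/apps/falcon/bootstrap/orders
+
+    # on the destination cluster: import the table
+    hive -e "USE sales; IMPORT TABLE orders FROM '/apps/falcon/bootstrap/orders';"
+   </verbatim>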
+
+---+++  Setup source and destination clusters
+   <verbatim>
+    $FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml
+   </verbatim>
+
+---+++ Hive mirroring extension properties
+   Extension artifacts are expected to be installed on HDFS at the path specified by "extension.store.uri" in the startup properties. The hive-mirroring-properties.json file located at "<extension.store.uri>/hive-mirroring/META/hive-mirroring-properties.json" lists all the required and optional parameters/arguments for scheduling a Hive mirroring job.
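+
+   For example, assuming a hypothetical extension.store.uri of /apps/falcon/extensions, the parameter list can be
+   inspected with:
+   <verbatim>
+    hadoop fs -cat /apps/falcon/extensions/hive-mirroring/META/hive-mirroring-properties.json
+   </verbatim>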
+
+---+++ Submit and schedule Hive mirroring extension
+
+   <verbatim>
+    $FALCON_HOME/bin/falcon extension -submitAndSchedule -extensionName hive-mirroring -file /process/definition.xml
+   </verbatim>
+
+   Please refer to [[falconcli/FalconCLI][Falcon CLI]] and [[restapi/ResourceList][REST API]] for more details on the usage of the CLI and REST APIs.
+

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/ImportExport.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/ImportExport.twiki b/trunk/releases/0.10/src/site/twiki/ImportExport.twiki
new file mode 100644
index 0000000..b0ce7ff
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/ImportExport.twiki
@@ -0,0 +1,242 @@
+---+Falcon Data Import and Export
+
+
+---++Overview
+
+Falcon provides constructs to periodically bring raw data from external data sources (like databases, drop boxes etc)
+onto Hadoop and push derived data computed on Hadoop onto external data sources.
+
+As of this release, Falcon only supports Relational Databases (e.g. Oracle, MySQL etc) via JDBC as external data sources.
+Future releases will add support for other external data sources.
+
+
+---++Prerequisites
+
+Following are the prerequisites to import external data from and export to databases.
+
+   * *Sqoop 1.4.6+*
+   * *Oozie 4.2.0+*
+   * *Appropriate database connector*
+
+
+*Note:* Falcon uses Sqoop for import/export operations. Sqoop will require an appropriate database driver to connect to
+the relational database. Please refer to the Sqoop documentation for any Sqoop-related questions. Please make sure
+the database driver jar is copied into the Oozie share lib for Sqoop.
+
+<verbatim>
+For example, in order to import and export with MySQL, please make sure the latest MySQL connector
+mysql-connector-java-5.1.31.jar is copied into oozie's Sqoop share lib
+
+/user/oozie/share/lib/{lib-dir}/sqoop/mysql-connector-java-5.1.31.jar
+
+where {lib-dir} value varies in oozie deployments.
+
+</verbatim>
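+
+A possible sequence for that setup, with a hypothetical sharelib directory name and Oozie URL:
+<verbatim>
+hadoop fs -put mysql-connector-java-5.1.31.jar /user/oozie/share/lib/lib_20150721010816/sqoop/
+oozie admin -oozie http://oozie-host:11000/oozie -sharelibupdate
+</verbatim>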
+
+---++ Usage
+---+++ Entity Definition and Setup
+   * *Datasource Entity*
+      Datasource entity abstracts connection and credential details to external data sources. The Datasource entity
+      supports read and write interfaces with specific credentials. The default credential will be used if the read
+      or write interface does not have its own credentials. In general, the Datasource entity will be defined by a
+      system administrator. Please refer to the datasource XSD for more details.
+
+      The following example defines a Datasource entity for a MySQL database. The import operation will use
+      the read interface with url "jdbc:mysql://dbhost/test", user name "import_usr" and password text "sqoop".
+      Whereas, the export operation will use the write interface with url "jdbc:mysql://dbhost/test" with user
+      name "export_usr" and the password specified in an HDFS file at the location "/user/ambari-qa/password-store/password_write_user".
+
+      The default credential specified will be used if either the read or write interface does not provide its own
+      credentials. The default credential specifies the password using the password alias feature available via hadoop credential
+      functionality. Users can create a password alias using the "hadoop credential create <alias> -provider
+      <provider-path>" command, where <alias> is a string and <provider-path> is an HDFS jceks file. During runtime,
+      the specified alias will be used to look up the password stored encrypted in the jceks hdfs file specified under
+      the providerPath element.
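+
+      For instance, the alias referenced by the default credential in the example below could be created with a
+      command roughly like the following (the jceks path mirrors the providerPath element and is illustrative):
+      <verbatim>
+      hadoop credential create sqoop.password.alias -provider jceks://hdfs/user/ambari-qa/sqoop_password.jceks
+      </verbatim>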
+
+      The available read and write interfaces enable database administrators to segregate read and write workloads.
+
+      <verbatim>
+
+      File: mysql-database.xml
+
+      <?xml version="1.0" encoding="UTF-8"?>
+      <datasource colo="west-coast" description="MySQL database on west coast" type="mysql" name="mysql-db" xmlns="uri:falcon:datasource:0.1">
+          <tags>owner=foobar@ambari.apache.org, consumer=phoe@ambari.apache.org</tags>
+          <interfaces>
+              <!-- ***** read interface ***** -->
+              <interface type="readonly" endpoint="jdbc:mysql://dbhost/test">
+                  <credential type="password-text">
+                      <userName>import_usr</userName>
+                      <passwordText>sqoop</passwordText>
+                  </credential>
+              </interface>
+
+              <!-- ***** write interface ***** -->
+              <interface type="write"  endpoint="jdbc:mysql://dbhost/test">
+                  <credential type="password-file">
+                      <userName>export_usr</userName>
+                      <passwordFile>/user/ambari-qa/password-store/password_write_user</passwordFile>
+                  </credential>
+              </interface>
+
+              <!-- *** default credential *** -->
+              <credential type="password-alias">
+                <userName>sqoop2_user</userName>
+                <passwordAlias>
+                    <alias>sqoop.password.alias</alias>
+                    <providerPath>hdfs://namenode:8020/user/ambari-qa/sqoop_password.jceks</providerPath>
+                </passwordAlias>
+              </credential>
+
+          </interfaces>
+
+          <driver>
+              <clazz>com.mysql.jdbc.Driver</clazz>
+              <jar>/user/oozie/share/lib/lib_20150721010816/sqoop/mysql-connector-java-5.1.31</jar>
+          </driver>
+      </datasource>
+      </verbatim>
+
+   * *Feed  Entity*
+      Feed entity now enables users to define IMPORT and EXPORT policies in addition to RETENTION and REPLICATION.
+      The IMPORT and EXPORT policies will refer to an already defined Datasource entity for connection and credential
+      details and take a table name from the policy to operate on. Please refer to the feed entity XSD for details.
+
+      The following example defines a Feed entity with IMPORT and EXPORT policies. Both the IMPORT and EXPORT operations
+      refer to the datasource entity "mysql-db". The IMPORT operation will use the read interface and credentials while
+      the EXPORT operation will use the write interface and credentials. A feed instance is created every hour
+      since the frequency of the Feed is hours(1), and the Feed instances are deleted after 90 days because of the
+      retention policy.
+
+
+      <verbatim>
+
+      File: customer_email_feed.xml
+
+      <?xml version="1.0" encoding="UTF-8"?>
+      <!--
+       A feed representing Hourly customer email data retained for 90 days
+       -->
+      <feed description="Raw customer email feed" name="customer_feed" xmlns="uri:falcon:feed:0.1">
+          <tags>externalSystem=USWestEmailServers,classification=secure</tags>
+          <groups>DataImportPipeline</groups>
+          <frequency>hours(1)</frequency>
+          <late-arrival cut-off="hours(4)"/>
+          <clusters>
+              <cluster name="primaryCluster" type="source">
+                  <validity start="2015-12-15T00:00Z" end="2016-03-31T00:00Z"/>
+                  <retention limit="days(90)" action="delete"/>
+                  <import>
+                      <source name="mysql-db" tableName="simple">
+                          <extract type="full">
+                              <mergepolicy>snapshot</mergepolicy>
+                          </extract>
+                          <fields>
+                              <includes>
+                                  <field>id</field>
+                                  <field>name</field>
+                              </includes>
+                          </fields>
+                      </source>
+                      <arguments>
+                          <argument name="--split-by" value="id"/>
+                          <argument name="--num-mappers" value="2"/>
+                      </arguments>
+                  </import>
+                  <export>
+                        <target name="mysql-db" tableName="simple_export">
+                            <load type="insert"/>
+                            <fields>
+                              <includes>
+                                <field>id</field>
+                                <field>name</field>
+                              </includes>
+                            </fields>
+                        </target>
+                        <arguments>
+                             <argument name="--update-key" value="id"/>
+                        </arguments>
+                    </export>
+              </cluster>
+          </clusters>
+
+          <locations>
+              <location type="data" path="/user/ambari-qa/falcon/demo/primary/importfeed/${YEAR}-${MONTH}-${DAY}-${HOUR}-${MINUTE}"/>
+              <location type="stats" path="/none"/>
+              <location type="meta" path="/none"/>
+          </locations>
+
+          <ACL owner="ambari-qa" group="users" permission="0755"/>
+          <schema location="/none" provider="none"/>
+
+      </feed>
+      </verbatim>
+
+   * *Import policy*
+     The import policy uses the datasource entity specified in the "source" to connect to the database. The tableName
+     specified should exist in the source datasource.
+
+     Extraction type specifies whether to pull data from the external datasource in "full" every time or "incrementally".
+     The mergepolicy specifies how to organize (snapshot or append, i.e. time series partitions) the data on hadoop.
+     The valid combinations are:
+      * [full,snapshot] - data is extracted in full and dumped into the feed instance location.
+      * [incremental, append] - data is extracted incrementally using the key specified in the *deltacolumn*
+        and added as a partition to the feed instance location.
+      * [incremental, snapshot] - data is extracted incrementally and merged with already existing data on hadoop to
+        produce one latest feed instance. *This feature is not supported currently*. The use case for this feature is
+        to efficiently import very large dimension tables that have updates and inserts onto hadoop and make them available
+        as a snapshot with the latest updates to consumers.
+
+      The following example defines an incremental extraction with append organization:
+
+      <verbatim>
+           <import>
+                <source name="mysql-db" tableName="simple">
+                    <extract type="incremental">
+                        <deltacolumn>modified_time</deltacolumn>
+                        <mergepolicy>append</mergepolicy>
+                    </extract>
+                    <fields>
+                        <includes>
+                            <field>id</field>
+                            <field>name</field>
+                        </includes>
+                    </fields>
+                </source>
+                <arguments>
+                    <argument name="--split-by" value="id"/>
+                    <argument name="--num-mappers" value="2"/>
+                </arguments>
+            </import>
+        </verbatim>
+
+
+     The fields option enables users to control which fields get imported. By default, all fields are imported. The "includes" option
+     brings in only those fields specified. The "excludes" option brings in all the fields other than those specified.
+
+     The arguments section enables users to pass in any extra arguments needed for fine control over the underlying implementation --
+     in this case, Sqoop.
+
+   * *Export policy*
+     The export policy, like import, uses the datasource for connecting to the database. The load type specifies whether to insert
+     or only update data in the external table. The fields option behaves the same way as in the import policy.
+     The tableName specified should exist in the external datasource.
+
+---+++ Operation
+   Once the Datasource and Feed entities with import and export policies are defined, users can submit and schedule
+   the Import and Export operations via the CLI and REST API as below:
+
+   <verbatim>
+
+    ## submit the mysql-db datasource defined in the file mysql_datasource.xml
+    falcon entity -submit -type datasource -file mysql_datasource.xml
+
+    ## submit the customer_feed specified in the customer_email_feed.xml
+    falcon entity -submit -type feed -file customer_email_feed.xml
+
+    ## schedule the customer_feed
+    falcon entity -schedule -type feed -name customer_feed
+
+   </verbatim>
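+
+   The REST equivalent of the submit call could look roughly like the following (host, port and user are
+   illustrative; adjust them to your deployment):
+   <verbatim>
+    ## submit the customer_feed via the REST API
+    curl -X POST -H "Content-Type: text/xml" --data-binary @customer_email_feed.xml \
+        "http://falcon-host:15000/api/entities/submit/feed?user.name=ambari-qa"
+   </verbatim>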
+
+   Falcon will create corresponding oozie bundles with a coordinator and workflow for the import and export operations.

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/InstallationSteps.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/InstallationSteps.twiki b/trunk/releases/0.10/src/site/twiki/InstallationSteps.twiki
new file mode 100644
index 0000000..297d88e
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/InstallationSteps.twiki
@@ -0,0 +1,90 @@
+---+Building & Installing Falcon
+
+
+---++Building Falcon
+
+---+++Prerequisites
+
+   * JDK 1.7/1.8
+   * Maven 3.2.x
+
+
+
+---+++Step 1 - Clone the Falcon repository
+
+<verbatim>
+$git clone https://git-wip-us.apache.org/repos/asf/falcon.git falcon
+</verbatim>
+
+
+---+++Step 2 - Build Falcon
+
+<verbatim>
+$ cd falcon
+$ export MAVEN_OPTS="-Xmx1024m -XX:MaxPermSize=256m -noverify"
+$ mvn clean install 
+
+</verbatim>
+It builds and installs the package into the local repository, for use as a dependency in other projects locally.
+
+[optionally -Dhadoop.version=<<hadoop.version>> can be appended to build for a specific version of hadoop]
+
+*Note 1:* Falcon drops support for Hadoop-1 and only supports Hadoop-2 from Falcon 0.6 onwards.
+          Falcon builds with JDK 1.7 using the -noverify option.
+
+*Note 2:* To compile Falcon with addon extensions, append additional profiles to the build command using the syntax -P<<profile1,profile2>>.
+          For the Hive Mirroring extension, use profile "hivedr". Hive >= 1.2.0 and Oozie >= 4.2.0 are required.
+          For the HDFS Snapshot mirroring extension, use profile "hdfs-snapshot-mirroring". Hadoop >= 2.7.0 is required.
+          For ADF integration, use profile "adf".
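+
+For example, a build against a specific Hadoop version with the Hive Mirroring extension enabled might look like
+this (the version number is illustrative):
+<verbatim>
+$ mvn clean install -Dhadoop.version=2.7.1 -Phivedr
+</verbatim>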
+
+---+++Step 3 - Package and Deploy Falcon
+
+Once the build successfully completes, artifacts can be packaged for deployment using the assembly plugin. The Assembly
+Plugin for Maven is primarily intended to allow users to aggregate the project output along with its dependencies,
+modules, site documentation, and other files into a single distributable archive. There are two basic ways in which you
+can deploy Falcon - Embedded mode (also known as Stand Alone Mode) and Distributed mode. Your next steps will vary based
+on the mode in which you want to deploy Falcon.
+
+*NOTE* : Falcon extends Oozie (particularly the el-extensions), hence the need for Falcon to build &
+re-package Oozie so that users of Falcon can work with the right Oozie setup. Though Oozie is packaged by Falcon, it
+needs to be deployed separately by the administrator and is not auto deployed along with Falcon.
+
+
+---++++Embedded/Stand Alone Mode
+Embedded mode is useful when the Hadoop jobs and relevant data processing involve only one Hadoop cluster. In this mode
+ there is a single Falcon server that contacts the scheduler to schedule jobs on Hadoop. All the process/feed requests
+ like submit, schedule, suspend, kill etc. are sent to this server. For running Falcon in this mode one should use the
+ Falcon which has been built using standalone option. You can find the instructions for Embedded mode setup
+ [[Embedded-mode][here]].
+
+
+---++++Distributed Mode
+Distributed mode is for multiple (colos) instances of Hadoop clusters, and multiple workflow schedulers to handle them.
+In this mode Falcon has 2 components: Prism and Server(s). Both Prism and Server(s) have their own config
+locations (startup and runtime properties). In this mode Prism acts as a contact point for Falcon servers. While
+ all commands are available through Prism, only read and instance api's are available through the Server. You can find the
+ instructions for Distributed Mode setup [[Distributed-mode][here]].
+
+
+
+---+++Preparing Oozie and Falcon packages for deployment
+<verbatim>
+$cd <<project home>>
+$src/bin/package.sh <<hadoop-version>> <<oozie-version>>
+
+>> ex. src/bin/package.sh 1.1.2 4.0.1 or src/bin/package.sh 0.20.2-cdh3u5 4.0.1
+>> ex. src/bin/package.sh 2.5.0 4.0.0
+>> Falcon package is available in <<falcon home>>/target/apache-falcon-<<version>>-bin.tar.gz
+>> Oozie package is available in <<falcon home>>/target/oozie-4.0.1-distro.tar.gz
+>> __IMPORTANT:  You need to download the je-5.0.73 version from http://download.oracle.com/otn/berkeley-db/je-5.0.73.zip and extract je-5.0.73 under the Falcon webapp directory or provision an HBase cluster for use as Falcon graphdb backend DB.
+    Depending on the Graphdb backend choice, update the startup.properties appropriately.__
+</verbatim>
+
+*NOTE:* If you have a separate Apache Oozie installation, you will need to follow some additional steps:
+   1. Once you have setup the Falcon Server, copy libraries under {falcon-server-dir}/oozie/libext/ to {oozie-install-dir}/libext.
+   1. Modify Oozie's configuration file. Copy all Falcon related properties from {falcon-server-dir}/oozie/conf/oozie-site.xml to {oozie-install-dir}/conf/oozie-site.xml
+   1. Restart oozie:
+      1. cd {oozie-install-dir}
+      1. sudo -u oozie ./bin/oozie-stop.sh
+      1. sudo -u oozie ./bin/oozie-setup.sh prepare-war
+      1. sudo -u oozie ./bin/oozie-start.sh

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/LICENSE.txt
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/LICENSE.txt b/trunk/releases/0.10/src/site/twiki/LICENSE.txt
new file mode 100644
index 0000000..d3b580f
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/LICENSE.txt
@@ -0,0 +1,3 @@
+All files in this directory and subdirectories are under Apache License Version 2.0.
+The reason being that the Maven Doxia plugin that converts twiki to html does not have
+a commenting-out feature.

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/MigrationInstructions.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/MigrationInstructions.twiki b/trunk/releases/0.10/src/site/twiki/MigrationInstructions.twiki
new file mode 100644
index 0000000..a11dbc4
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/MigrationInstructions.twiki
@@ -0,0 +1,32 @@
+---+ Migration Instructions
+
+---++ Migrate from 0.9 to 0.10
+
+FALCON-1333 (Instance Search feature) requires Falcon to use titan-berkeleyje version 0.5.4 to support indexing.
+Up until version 0.9, Falcon used titan-berkeleyje-jre6 version 0.4.2. A GraphDB created by version 0.4.2 cannot be
+read by version 0.5.4. The solution is to migrate the GraphDB to be compatible with the Falcon 0.10 release. Please make
+sure that no Falcon server is running while performing the migration.
+
+---+++ 1. Install Falcon 0.10
+Install Falcon 0.10 by following the [[InstallationSteps][Installation Steps]]. Do not start the Falcon server yet.
+The tool to migrate the graphDB is packaged with the 0.10 Falcon server in falcon-common-0.10.jar.
+
+---+++ 2. Export GraphDB to JSON file using Falcon 0.9
+Please run the following command to generate the JSON file.
+
+<verbatim>
+ $FALCON_HOME/bin/graphdbutil.sh export <<java_home>> <<hadoop_home>> <<falcon_0.9_home>> <<path_to_falcon-common-0.10.jar>> /jsonFile/dir/
+</verbatim>
+
+This command will create /jsonFile/dir/instanceMetadata.json
+
+---+++ 3. Import GraphDB from JSON file using Falcon 0.10
+Please run the following command to import the graphDB from the JSON file. The location of the graphDB will be based on the property
+"*.falcon.graph.storage.directory" set in the startup.properties file.
+
+<verbatim>
+  $FALCON_HOME/bin/graphdbutil.sh import <<java_home>> <<hadoop_home>> <<falcon_0.10_home>> <<path_to_falcon-common-0.10.jar>> /jsonFile/dir/
+</verbatim>
+
+This command will import from /jsonFile/dir/instanceMetadata.json. Now start the Falcon 0.10 server.
+

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/OnBoarding.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/OnBoarding.twiki b/trunk/releases/0.10/src/site/twiki/OnBoarding.twiki
new file mode 100644
index 0000000..8b02150
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/OnBoarding.twiki
@@ -0,0 +1,269 @@
+---++ Contents
+   * <a href="#Onboarding Steps">Onboarding Steps</a>
+   * <a href="#Sample Pipeline">Sample Pipeline</a>
+   * [[HiveIntegration][Hive Examples]]
+
+---+++ Onboarding Steps
+   * Create cluster definition for the cluster, specifying name node, job tracker, workflow engine endpoint, messaging endpoint. Refer to [[EntitySpecification][cluster definition]] for details.
+   * Create Feed definitions for each of the input and output specifying frequency, data path, ownership. Refer to [[EntitySpecification][feed definition]] for details.
+   * Create Process definition for your job. Process defines configuration for the workflow job. Important attributes are frequency, inputs/outputs and workflow path. Refer to [[EntitySpecification][process definition]] for process details.
+   * Define the workflow for your job using the workflow engine (only oozie is supported as of now). Refer to [[http://oozie.apache.org/docs/3.1.3-incubating/WorkflowFunctionalSpec.html][Oozie Workflow Specification]]. The libraries required for the workflow should be available in the lib folder in the workflow path.
+   * Set up the workflow definition, libraries and referenced scripts on hadoop.
+   * Submit the cluster definition
+   * Submit and schedule the feed and process definitions (a command sketch follows this list)
+   
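+The steps above map roughly onto the following CLI calls; a minimal sketch with hypothetical entity file names:
+<verbatim>
+falcon entity -submit -type cluster -file corp-cluster.xml
+falcon entity -submitAndSchedule -type feed -file sample-input-feed.xml
+falcon entity -submitAndSchedule -type feed -file sample-output-feed.xml
+falcon entity -submitAndSchedule -type process -file sample-process.xml
+</verbatim>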
+
+---+++ Sample Pipeline
+---++++ Cluster   
+Cluster definition that contains end points for the name node, job tracker, oozie and jms server.
+The cluster locations MUST be created prior to submitting a cluster entity to Falcon:
+   * *staging* must have 777 permissions and the parent dirs must have execute permissions
+   * *working* must have 755 permissions and the parent dirs must have execute permissions
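+
+A possible one-time setup of those locations, assuming the paths used in the cluster definition below:
+<verbatim>
+hadoop fs -mkdir -p /projects/falcon/staging /projects/falcon/working
+hadoop fs -chmod 777 /projects/falcon/staging
+hadoop fs -chmod 755 /projects/falcon/working
+</verbatim>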
+
+<verbatim>
+<?xml version="1.0"?>
+<!--
+    Cluster configuration
+  -->
+<cluster colo="ua2" description="" name="corp" xmlns="uri:falcon:cluster:0.1"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">    
+    <interfaces>
+        <interface type="readonly" endpoint="hftp://name-node.com:50070" version="2.5.0" />
+
+        <interface type="write" endpoint="hdfs://name-node.com:54310" version="2.5.0" />
+
+        <interface type="execute" endpoint="job-tracker:54311" version="2.5.0" />
+
+        <interface type="workflow" endpoint="http://oozie.com:11000/oozie/" version="4.0.1" />
+
+        <interface type="messaging" endpoint="tcp://jms-server.com:61616?daemon=true" version="5.1.6" />
+    </interfaces>
+
+    <locations>
+        <location name="staging" path="/projects/falcon/staging" />
+        <location name="temp" path="/tmp" />
+        <location name="working" path="/projects/falcon/working" />
+    </locations>
+</cluster>
+</verbatim>
+   
+---++++ Input Feed
+Hourly feed that defines feed path, frequency, ownership and validity:
+<verbatim>
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+    Hourly sample input data
+  -->
+
+<feed description="sample input data" name="SampleInput" xmlns="uri:falcon:feed:0.1"
+    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+    <groups>group</groups>
+
+    <frequency>hours(1)</frequency>
+
+    <late-arrival cut-off="hours(6)" />
+
+    <clusters>
+        <cluster name="corp" type="source">
+            <validity start="2009-01-01T00:00Z" end="2099-12-31T00:00Z" timezone="UTC" />
+            <retention limit="months(24)" action="delete" />
+        </cluster>
+    </clusters>
+
+    <locations>
+        <location type="data" path="/projects/bootcamp/data/${YEAR}-${MONTH}-${DAY}-${HOUR}/SampleInput" />
+        <location type="stats" path="/projects/bootcamp/stats/SampleInput" />
+        <location type="meta" path="/projects/bootcamp/meta/SampleInput" />
+    </locations>
+
+    <ACL owner="suser" group="users" permission="0755" />
+
+    <schema location="/none" provider="none" />
+</feed>
+</verbatim>
+
+---++++ Output Feed
+Daily feed that defines feed path, frequency, ownership and validity:
+<verbatim>
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+    Daily sample output data
+  -->
+
+<feed description="sample output data" name="SampleOutput" xmlns="uri:falcon:feed:0.1"
+xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
+    <groups>group</groups>
+
+    <frequency>days(1)</frequency>
+
+    <late-arrival cut-off="hours(6)" />
+
+    <clusters>
+        <cluster name="corp" type="source">
+            <validity start="2009-01-01T00:00Z" end="2099-12-31T00:00Z" timezone="UTC" />
+            <retention limit="months(24)" action="delete" />
+        </cluster>
+    </clusters>
+
+    <locations>
+        <location type="data" path="/projects/bootcamp/output/${YEAR}-${MONTH}-${DAY}/SampleOutput" />
+        <location type="stats" path="/projects/bootcamp/stats/SampleOutput" />
+        <location type="meta" path="/projects/bootcamp/meta/SampleOutput" />
+    </locations>
+
+    <ACL owner="suser" group="users" permission="0755" />
+
+    <schema location="/none" provider="none" />
+</feed>
+</verbatim>
+
+---++++ Process
+Sample process which runs daily at the 6th hour on the corp cluster. It takes one input - !SampleInput for the previous day (24 instances). It generates one output - !SampleOutput for the previous day. The workflow is defined at /projects/bootcamp/workflow/workflow.xml. Any libraries available for the workflow should be at /projects/bootcamp/workflow/lib. The process also defines properties queueName, ssh.host, and fileTimestamp which are passed to the workflow. In addition, Falcon exposes the following properties to the workflow: nameNode, jobTracker (hadoop properties), input and output (Input/Output properties).
+
+<verbatim>
+<?xml version="1.0" encoding="UTF-8"?>
+<!--
+    Daily sample process. Runs at 6th hour every day. Input - last day's hourly data. Generates output for yesterday
+ -->
+<process name="SampleProcess">
+    <cluster name="corp" />
+
+    <frequency>days(1)</frequency>
+
+    <validity start="2012-04-03T06:00Z" end="2022-12-30T00:00Z" timezone="UTC" />
+
+    <inputs>
+        <input name="input" feed="SampleInput" start="yesterday(0,0)" end="today(-1,0)" />
+    </inputs>
+
+    <outputs>
+            <output name="output" feed="SampleOutput" instance="yesterday(0,0)" />
+    </outputs>
+
+    <properties>
+        <property name="queueName" value="reports" />
+        <property name="ssh.host" value="host.com" />
+        <property name="fileTimestamp" value="${coord:formatTime(coord:nominalTime(), 'yyyy-MM-dd')}" />
+    </properties>
+
+    <workflow engine="oozie" path="/projects/bootcamp/workflow" />
+
+    <retry policy="periodic" delay="minutes(5)" attempts="3" />
+    
+    <late-process policy="exp-backoff" delay="hours(1)">
+        <late-input input="input" workflow-path="/projects/bootcamp/workflow/lateinput" />
+    </late-process>
+</process>
+</verbatim>
+
+---++++ Oozie Workflow
+The sample user workflow contains 3 actions:
+   * Pig action - Executes pig script /projects/bootcamp/workflow/script.pig
+   * concatenator - Java action that concatenates part files and generates a single file
+   * file upload - ssh action that gets the concatenated file from hadoop and sends the file to a remote host
+   
+<verbatim>
+<workflow-app xmlns="uri:oozie:workflow:0.2" name="sample-wf">
+        <start to="pig" />
+
+        <action name="pig">
+                <pig>
+                        <job-tracker>${jobTracker}</job-tracker>
+                        <name-node>${nameNode}</name-node>
+                        <prepare>
+                                <delete path="${output}"/>
+                        </prepare>
+                        <configuration>
+                                <property>
+                                        <name>mapred.job.queue.name</name>
+                                        <value>${queueName}</value>
+                                </property>
+                                <property>
+                                        <name>mapreduce.fileoutputcommitter.marksuccessfuljobs</name>
+                                        <value>true</value>
+                                </property>
+                        </configuration>
+                        <script>${nameNode}/projects/bootcamp/workflow/script.pig</script>
+                        <param>input=${input}</param>
+                        <param>output=${output}</param>
+                        <file>lib/dependent.jar</file>
+                </pig>
+                <ok to="concatenator" />
+                <error to="fail" />
+        </action>
+
+        <action name="concatenator">
+                <java>
+                        <job-tracker>${jobTracker}</job-tracker>
+                        <name-node>${nameNode}</name-node>
+                        <prepare>
+                                <delete path="${nameNode}/projects/bootcamp/concat/data-${fileTimestamp}.csv"/>
+                        </prepare>
+                        <configuration>
+                                <property>
+                                        <name>mapred.job.queue.name</name>
+                                        <value>${queueName}</value>
+                                </property>
+                        </configuration>
+                        <main-class>com.wf.Concatenator</main-class>
+                        <arg>${output}</arg>
+                        <arg>${nameNode}/projects/bootcamp/concat/data-${fileTimestamp}.csv</arg>
+                </java>
+                <ok to="fileupload" />
+                <error to="fail"/>
+        </action>
+                        
+        <action name="fileupload">
+                <ssh>
+                        <host>localhost</host>
+                        <command>/tmp/fileupload.sh</command>
+                        <args>${nameNode}/projects/bootcamp/concat/data-${fileTimestamp}.csv</args>
+                        <args>${wf:conf("ssh.host")}</args>
+                        <capture-output/>
+                </ssh>
+                <ok to="fileUploadDecision" />
+                <error to="fail"/>
+        </action>
+
+        <decision name="fileUploadDecision">
+                <switch>
+                        <case to="end">
+                                ${wf:actionData('fileupload')['output'] == '0'}
+                        </case>
+                        <default to="fail"/>
+                </switch>
+        </decision>
+
+        <kill name="fail">
+                <message>Workflow failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
+        </kill>
+
+        <end name="end" />
+</workflow-app>
+</verbatim>
+
+---++++ File Upload Script
+The script gets the file from hadoop, rsyncs the file to /tmp on the remote host and deletes the file from hadoop.
+<verbatim>
+#!/bin/bash
+
+trap 'echo "output=$?"; exit $?' ERR INT TERM
+
+echo "Arguments: $@"
+SRCFILE=$1
+# the ssh action passes two arguments: the source file and the destination host
+DESTHOST=$2
+
+FILENAME=`basename $SRCFILE`
+rm -f /tmp/$FILENAME
+hadoop fs -copyToLocal $SRCFILE /tmp/
+echo "Copied $SRCFILE to /tmp"
+
+rsync -ztv --rsh=ssh --stats /tmp/$FILENAME $DESTHOST:/tmp
+echo "rsynced $FILENAME to $DESTUSER@$DESTHOST:$DESTFILE"
+
+hadoop fs -rmr $SRCFILE
+echo "Deleted $SRCFILE"
+
+rm -f /tmp/$FILENAME
+echo "output=0"
+</verbatim>

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/Operability.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/Operability.twiki b/trunk/releases/0.10/src/site/twiki/Operability.twiki
new file mode 100644
index 0000000..2bccb51
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/Operability.twiki
@@ -0,0 +1,230 @@
+---+ Operationalizing Falcon
+
+---++ Overview
+
+Apache Falcon provides various tools to operationalize Falcon consisting of Alerts for
+unrecoverable errors, Audits of user actions, Metrics, and Notifications. They are detailed below.
+
+---++ Lineage
+
+Currently Lineage has no way to access or restore information about entity instances created during the time lineage
+was disabled. Information about entities, however, is preserved and bootstrapped when lineage is enabled. If you have to
+reset the graph db then you can delete the graph db files as specified in the startup.properties and restart Falcon.
+Please note: you will lose all the information about the instances if you delete the graph db.
+
+---++ Monitoring
+
+Falcon provides monitoring of various events by capturing metrics of those events.
+The metric numbers can then be used to monitor performance and health of the Falcon system and
+the entire processing pipelines.
+
+Falcon also exposes [[https://github.com/thinkaurelius/titan/wiki/Titan-Performance-and-Monitoring][metrics for titandb]]
+
+Users can view the logs of these events in the metric.log file; by default this file is created
+under the ${user.dir}/logs/ directory. Users may also extend the Falcon monitoring framework to send
+events to systems like Mondemand/lwes by implementing the org.apache.falcon.plugin.MonitoringPlugin
+interface.
+
+The following events are captured by Falcon for logging the metrics:
+   1. New cluster definitions posted to Falcon (success & failures)
+   1. New feed definition posted to Falcon (success & failures)
+   1. New process definition posted to Falcon (success & failures)
+   1. Process update events (success & failures)
+   1. Feed update events (success & failures)
+   1. Cluster update events (success & failures)
+   1. Process suspend events (success & failures)
+   1. Feed suspend events (success & failures)
+   1. Process resume events (success & failures)
+   1. Feed resume events (success & failures)
+   1. Process remove events (success & failures)
+   1. Feed remove events (success & failures)
+   1. Cluster remove events (success & failures)
+   1. Process instance kill events (success & failures)
+   1. Process instance re-run events (success & failures)
+   1. Process instance generation events
+   1. Process instance failure events
+   1. Process instance auto-retry events
+   1. Process instance retry exhaust events
+   1. Feed instance deletion event
+   1. Feed instance deletion failure event (no retries)
+   1. Feed instance replication event
+   1. Feed instance replication failure event
+   1. Feed instance replication auto-retry event
+   1. Feed instance replication retry exhaust event
+   1. Feed instance late arrival event
+   1. Feed instance post cut-off arrival event
+   1. Process re-run due to late feed event
+   1. Transaction rollback failed event
+
+The metric logged for an event has the following properties:
+   1. Action - Name of the event.
+   2. Dimensions - A list of name/value pairs of various attributes for a given action.
+   3. Status - Status of an action: FAILED/SUCCEEDED.
+   4. Time-taken - Time taken in nanoseconds for a given action.
+
+An example for an event logged for a submit of a new process definition:
+
+   2012-05-04 12:23:34,026 {Action:submit, Dimensions:{entityType=process}, Status: SUCCEEDED, Time-taken:97087000 ns}
+
+Users may parse the metric.log or capture these events from custom monitoring frameworks and can plot various graphs
+or send alerts according to their requirements.
+
+
+---++ Notifications
+
+Falcon has two types of notifications - System and User notifications.
+
+---+++ System notifications
+The System notifications are internally generated and used by Falcon to monitor the Falcon orchestrated workflow jobs.
+By default, Falcon starts an ActiveMQ embedded JMS server on the Falcon machine on port 61616 as a daemon. Alternatively,
+users can make Falcon use an existing JMS server instead of starting an embedded instance by doing the
+following 2 steps:
+
+   * Set the property broker.url in the startup.properties as below
+<verbatim>
+   *.broker.url=tcp://jms-server-host:61616
+</verbatim>
+   * Set the system property falcon.embeddedmq to false as below
+<verbatim>
+   <FALCON-INSTALL-DIR>/bin/falcon-start -Dfalcon.embeddedmq=false
+</verbatim>
+
+Falcon uses FALCON.ENTITY.TOPIC to publish system notifications. This topic and the Map Message fields are internal
+and could change between releases.
+
+---+++ User notifications
+
+Falcon, in addition to the FALCON.ENTITY.TOPIC, also creates a JMS topic for every process/feed that is scheduled in
+Falcon as part of User notifications. To enable User notifications, the broker URL and implementation class of the JMS
+engine need to be specified in the cluster definition associated with the feed/process. Users may register consumers
+on the required topic to check the availability or status of feed instances. The User notification JMS broker instance
+can be the same as the System notification broker or a different one.
+
+The name of the JMS topic is based on the process/feed name (FALCON.<feed-or-process-name>, as in the
+example below). Falcon sends a map message for every feed instance that is
+created/deleted/replicated/imported/exported to the JMS topic. The JMS Map Message sent to a topic has the following
+fields:
+
+   1. cluster - name of the current cluster the feed/process is dependent on.
+   1. entityType - type of the entity (feed or process).
+   1. entityName - name of the entity.
+   1. nominalTime - instance time (or data date).
+   1. operation - operation like generate, delete, replicate, import, export.
+   1. feedNames - name of the feeds which are generated/replicated/deleted/imported/exported.
+   1. feedInstancePaths - comma separated feed instance paths.
+   1. workflowId - current workflow-id of the instance.
+   1. workflowUser - user who owns the feed instance (i.e. partition).
+   1. runId - current run-id of the instance.
+   1. status - status of the user workflow instance.
+   1. timeStamp - current timestamp.
+   1. logDir - log dir where lineage can be recorded.
+
+The JMS messages are automatically purged after a certain period (default 3 days) by the Falcon JMS house-keeping
+service. The TTL (time-to-live) for JMS messages can be configured in Falcon's startup.properties file.
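+
+For example, assuming the property name used by the Falcon JMS house-keeping service is broker.ttlInMins, a
+3-day TTL would be configured as:
+
+<verbatim>
+# JMS message TTL in minutes (4320 minutes = 3 days)
+*.broker.ttlInMins=4320
+</verbatim>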
+
+The following example shows how to enable and read user notifications by connecting to the JMS broker.
+
+First, specify the JMS broker url in the cluster definition XML as shown below.
+
+<verbatim>
+
+<?xml version="1.0"?>
+<!-- filename : primaryCluster.xml -->
+<cluster colo="USWestOregon" description="oregonHadoopCluster" name="primaryCluster" xmlns="uri:falcon:cluster:0.1">
+    <interfaces>
+        ...
+        ...
+        <interface type="messaging" endpoint="tcp://user-jms-broker-host:61616?daemon=true" version="5.1.6" />
+        ...
+    </interfaces>
+</cluster>
+
+</verbatim>
+
+Next, use a JMS consumer (example below in Java) to read messages from the topic with the name
+FALCON.<feed-or-process-name>.
+
+<verbatim>
+import org.apache.activemq.ActiveMQConnectionFactory;
+import org.apache.activemq.command.ActiveMQMapMessage;
+import javax.jms.ConnectionFactory;
+import javax.jms.Connection;
+import javax.jms.MessageConsumer;
+import javax.jms.Topic;
+import javax.jms.Session;
+import javax.jms.TopicSession;
+
+public class FalconUserJMSClient {
+    public static void main(String[] args)throws Exception {
+        // Note: specify the JMS broker URL
+        String brokerUrl = "tcp://localhost:61616";
+
+        ConnectionFactory connectionFactory = new ActiveMQConnectionFactory(brokerUrl);
+        Connection connection = connectionFactory.createConnection();
+        connection.setClientID("Falcon User JMS Consumer");
+        TopicSession session = (TopicSession) connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
+        try {
+
+            // Note: the topic name for the feed will be FALCON.<feed-name>
+            Topic falconTopic = session.createTopic("FALCON.feed-sample");
+            MessageConsumer consumer = session.createConsumer(falconTopic);
+            connection.start();
+            while (true) {
+                ActiveMQMapMessage msg = (ActiveMQMapMessage) consumer.receive();
+                System.out.println("cluster             : " + msg.getString("cluster"));
+                System.out.println("entityType          : " + msg.getString("entityType"));
+                System.out.println("entityName          : " + msg.getString("entityName"));
+                System.out.println("nominalTime         : " + msg.getString("nominalTime"));
+                System.out.println("operation           : " + msg.getString("operation"));
+
+                System.out.println("feedNames           : " + msg.getString("feedNames"));
+                System.out.println("feedInstancePaths   : " + msg.getString("feedInstancePaths"));
+
+                System.out.println("workflowId          : " + msg.getString("workflowId"));
+                System.out.println("workflowUser        : " + msg.getString("workflowUser"));
+                System.out.println("runId               : " + msg.getString("runId"));
+                System.out.println("status              : " + msg.getString("status"));
+                System.out.println("timeStamp           : " + msg.getString("timeStamp"));
+                System.out.println("logDir              : " + msg.getString("logDir"));
+
+                System.out.println("brokerUrl           : " + msg.getString("brokerUrl"));
+                System.out.println("brokerImplClass     : " + msg.getString("brokerImplClass"));
+                System.out.println("logFile             : " + msg.getString("logFile"));
+                System.out.println("topicName           : " + msg.getString("topicName"));
+                System.out.println("brokerTTL           : " + msg.getString("brokerTTL"));
+            }
+        } finally {
+            if (session != null) {
+                session.close();
+            }
+            if (connection != null) {
+                connection.close();
+            }
+        }
+    }
+}
+</verbatim>
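+
+To try the consumer above, compile and run it with an ActiveMQ client jar on the classpath; the jar name below
+is an assumption, use the version shipped with your installation:
+
+<verbatim>
+$ javac -cp activemq-all-5.x.x.jar FalconUserJMSClient.java
+$ java -cp .:activemq-all-5.x.x.jar FalconUserJMSClient
+</verbatim>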
+
+
+---++ Alerts
+
+Falcon writes alerts for unrecoverable errors to a log file by default.
+Users can view these alerts in the alerts.log file, which by default is created
+under the ${user.dir}/logs/ directory.
+
+Users may also extend the Falcon alerting framework to send alerts to systems like Nagios, etc. by
+implementing the org.apache.falcon.plugin.AlertingPlugin interface.
+
+
+---++ Audits
+
+Falcon audits all user activity and captures it in a log file by default.
+Users can view these audits in the audit.log file, which by default is created
+under the ${user.dir}/logs/ directory.
+
+Users may also extend the Falcon auditing framework to send audits to systems like Apache Argus, etc. by
+implementing the org.apache.falcon.plugin.AuditingPlugin interface.
+
+
+---++ Metrics Collection In Graphite
+
+Falcon has support for sending metrics to Graphite. More details can be found at [[GraphiteMetricCollection][Graphite Metric Collection]].
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/Security.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/Security.twiki b/trunk/releases/0.10/src/site/twiki/Security.twiki
new file mode 100644
index 0000000..b17650c
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/Security.twiki
@@ -0,0 +1,409 @@
+---+ Securing Falcon
+
+---++ Overview
+
+Apache Falcon provides the following security features:
+   * Support for credential provider aliases for passwords used in the Falcon server.
+   * Support for authentication to identify proper users.
+   * Support for authorization to specify resource access permissions for users or groups.
+   * Support for SSL to provide transport-level security for data confidentiality and integrity.
+
+
+---++ Credential Provider Alias for Passwords
+Server-side configuration properties (i.e. startup.properties) contain passwords and other sensitive information.
+In addition to specifying properties in plain text, we provide the user an option to use a credential provider alias in the property file.
+
+Take SMTP password for example. The user can store the password in a
+[[http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CommandsManual.html#credential][Hadoop credential provider]]
+with the alias name _SMTPPasswordAlias_. In startup.properties where SMTP password is needed, the user can refer to its
+alias name _SMTPPasswordAlias_ instead of providing the real password.
+
+The alias property to be resolved through Hadoop credential provider should have the format:
+_credential.provider.alias.for.[property-key]_. For example,
+_credential.provider.alias.for.falcon.email.smtp.password=SMTPPasswordAlias_ for SMTP password.
+The Falcon server, during startup, will automatically retrieve the real password using the alias name.
+
+The user can specify the provider path with the property key _credential.provider.path_,
+e.g. _credential.provider.path=jceks://file/tmp/test.jceks_.
+If not specified, Falcon will use the default Hadoop credential provider path in core-site.xml.
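+
+For example, the alias could be created and verified with the Hadoop credential CLI; the alias name and provider
+path below match the examples above:
+
+<verbatim>
+$ hadoop credential create SMTPPasswordAlias -provider jceks://file/tmp/test.jceks
+$ hadoop credential list -provider jceks://file/tmp/test.jceks
+</verbatim>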
+
+
+---++ Authentication (User Identity)
+
+Apache Falcon enforces authentication on protected resources. Once authentication has been established it sets a
+signed HTTP Cookie that contains an authentication token with the user name, user principal,
+authentication type and expiration time.
+
+It does so by using [[http://hadoop.apache.org/docs/current/hadoop-auth/index.html][Hadoop Auth]].
+Hadoop Auth is a Java library consisting of client and server components to enable Kerberos SPNEGO authentication
+for HTTP. Hadoop Auth also supports additional authentication mechanisms on the client and the server side via two
+simple interfaces.
+
+
+---+++ Authentication Methods
+
+Falcon supports two authentication methods out of the box: simple and Kerberos.
+
+---++++ Pseudo/Simple Authentication
+
+Falcon authenticates the user by simply trusting the value of the query string parameter 'user.name'. This is the
+default mode Falcon is configured with.
+
+---++++ Kerberos Authentication
+
+Falcon uses HTTP Kerberos SPNEGO to authenticate the user.
+
+
+---++ Authorization
+
+Falcon also enforces authorization on Entities using ACLs (Access Control Lists). ACLs are useful
+for implementing permission requirements and provide a way to set different permissions for
+specific users or named groups.
+
+By default, support for authorization is disabled and can be enabled in startup.properties.
+
+---+++ ACLs in Entity
+
+All entities now have an ACL, which must be present if authorization is enabled. Only the owner who
+created the entity will be allowed to update or delete it.
+
+An entity has ACLs (Access Control Lists) that are useful for implementing permission requirements
+and provide a way to set different permissions for specific users or named groups.
+<verbatim>
+    <ACL owner="test-user" group="test-group" permission="*"/>
+</verbatim>
+ACL indicates the Access Control List for the entity.
+owner is the owner of the entity.
+group is the group that has read access.
+permission indicates the rwx permissions, which are not enforced at this time.
+
+---+++ Super-User
+
+The super-user is the user with the same identity as the Falcon process itself. Loosely, if you
+started Falcon, then you are the super-user. The super-user can do anything, in that
+permission checks never fail for the super-user. There is no persistent notion of who the
+super-user was; when Falcon is started, the process identity determines who is the super-user
+for now. The Falcon super-user does not have to be the super-user of the Falcon host, nor is it
+necessary that all clusters have the same super-user. Also, an experimenter running Falcon on a
+personal workstation conveniently becomes that installation's super-user without any configuration.
+
+Falcon also allows users to configure a super user group and allows users belonging to this
+group to be a super user.
+
+ACL owner and group must be valid even if the authenticated user is a super-user.
+
+---+++ Group Memberships
+
+Once a user has been authenticated and a username has been determined, the list of groups is
+determined by a group mapping service, configured by the hadoop.security.group.mapping property
+in Hadoop. The default implementation, org.apache.hadoop.security.ShellBasedUnixGroupsMapping,
+will shell out to the Unix bash -c groups command to resolve a list of groups for a user.
+
+Note that Falcon stores the user and group of an Entity as strings; there is no
+conversion from user and group identity numbers as is conventional in Unix.
+
+The only limitation is that a user cannot add a group to an ACL that they do not belong to.
+
+---+++ Authorization Provider
+
+Falcon provides a pluggable provider interface for authorization. It also ships with a default
+implementation that enforces the following authorization policy.
+
+---++++ Entity and Instance Management Operations Policy
+
+   * All entity and instance operations are authorized for the users who created them, their owners, and users with group membership
+   * References to entities within a feed or process are allowed without enforcing permissions
+
+Any Feed or Process can refer to a Cluster entity not owned by the Feed or Process owner. Any Process can refer to a Feed entity not owned by the Process owner.
+
+The authorization is enforced in the following way:
+
+   * if admin resource,
+      * If authenticated user name matches the admin users configuration
+      * Else if groups of the authenticated user matches the admin groups configuration
+      * Else authorization exception is thrown
+   * Else if entities or instance resource
+      * If the authenticated user matches the owner in ACL for the entity
+      * Else if the groups of the authenticated user matches the group in ACL for the entity
+      * Else authorization exception is thrown
+   * Else if lineage resource
+      * All users have read-only permissions, since anyone should be able to examine the dependencies and enable reuse
+
+To authenticate a user for REST API calls, append "user.name=<username>" to the query string, as in the example below.
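+
+For example, a call to the entity list API from the table below, with the authenticated user appended, might
+look like this (host, port and user name are illustrative):
+
+<verbatim>
+$ curl "http://localhost:15000/api/entities/list/process?user.name=venkatesh"
+</verbatim>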
+
+*Operations on Entity Resource*
+
+| *Resource*                                                                          | *Description*                      | *Authorization* |
+| [[restapi/EntityValidate][api/entities/validate/:entity-type]]                      | Validate the entity                | Owner/Group     |
+| [[restapi/EntitySubmit][api/entities/submit/:entity-type]]                          | Submit the entity                  | Owner/Group     |
+| [[restapi/EntityUpdate][api/entities/update/:entity-type/:entity-name]]             | Update the entity                  | Owner/Group     |
+| [[restapi/EntitySubmitAndSchedule][api/entities/submitAndSchedule/:entity-type]]    | Submit & Schedule the entity       | Owner/Group     |
+| [[restapi/EntitySchedule][api/entities/schedule/:entity-type/:entity-name]]         | Schedule the entity                | Owner/Group     |
+| [[restapi/EntitySuspend][api/entities/suspend/:entity-type/:entity-name]]           | Suspend the entity                 | Owner/Group     |
+| [[restapi/EntityResume][api/entities/resume/:entity-type/:entity-name]]             | Resume the entity                  | Owner/Group     |
+| [[restapi/EntityDelete][api/entities/delete/:entity-type/:entity-name]]             | Delete the entity                  | Owner/Group     |
+| [[restapi/EntityStatus][api/entities/status/:entity-type/:entity-name]]             | Get the status of the entity       | Owner/Group     |
+| [[restapi/EntityDefinition][api/entities/definition/:entity-type/:entity-name]]     | Get the definition of the entity   | Owner/Group     |
+| [[restapi/EntityList][api/entities/list/:entity-type?fields=:fields]]               | Get the list of entities           | Owner/Group     |
+| [[restapi/EntityDependencies][api/entities/dependencies/:entity-type/:entity-name]] | Get the dependencies of the entity | Owner/Group     |
+
+*REST Call on Feed and Process Instances*
+
+| *Resource*                                                                  | *Description*                | *Authorization* |
+| [[restapi/InstanceRunning][api/instance/running/:entity-type/:entity-name]] | List of running instances.   | Owner/Group     |
+| [[restapi/InstanceStatus][api/instance/status/:entity-type/:entity-name]]   | Status of a given instance   | Owner/Group     |
+| [[restapi/InstanceKill][api/instance/kill/:entity-type/:entity-name]]       | Kill a given instance        | Owner/Group     |
+| [[restapi/InstanceSuspend][api/instance/suspend/:entity-type/:entity-name]] | Suspend a running instance   | Owner/Group     |
+| [[restapi/InstanceResume][api/instance/resume/:entity-type/:entity-name]]   | Resume a given instance      | Owner/Group     |
+| [[restapi/InstanceRerun][api/instance/rerun/:entity-type/:entity-name]]     | Rerun a given instance       | Owner/Group     |
+| [[InstanceLogs][api/instance/logs/:entity-type/:entity-name]]               | Get logs of a given instance | Owner/Group     |
+
+---++++ Admin Resources Policy
+
+Only users belonging to admin users or groups have access to this resource. Admin membership is
+determined by a static configuration parameter.
+
+| *Resource*                                             | *Description*                               | *Authorization*  |
+| [[restapi/AdminVersion][api/admin/version]]            | Get version of the server                   | No restriction   |
+| [[restapi/AdminStack][api/admin/stack]]                | Get stack of the server                     | Admin User/Group |
+| [[restapi/AdminConfig][api/admin/config/:config-type]] | Get configuration information of the server | Admin User/Group |
+
+
+---++++ Lineage Resource Policy
+
+Lineage is read-only and hence all users can look at lineage for their respective entities.
+*Note:* This gap will be fixed in a later release.
+
+
+---++ Authentication Configuration
+
+The following is the server-side configuration setup for authentication.
+
+---+++ Common Configuration Parameters
+
+<verbatim>
+# Authentication type must be specified: simple|kerberos
+*.falcon.authentication.type=kerberos
+</verbatim>
+
+---+++ Kerberos Configuration
+
+<verbatim>
+##### Service Configuration
+
+# Indicates the Kerberos principal to be used in Falcon Service.
+*.falcon.service.authentication.kerberos.principal=falcon/_HOST@EXAMPLE.COM
+
+# Location of the keytab file with the credentials for the Service principal.
+*.falcon.service.authentication.kerberos.keytab=/etc/security/keytabs/falcon.service.keytab
+
+# name node principal to talk to config store
+*.dfs.namenode.kerberos.principal=nn/_HOST@EXAMPLE.COM
+
+# Indicates how long (in seconds) falcon authentication token is valid before it has to be renewed.
+*.falcon.service.authentication.token.validity=86400
+
+##### SPNEGO Configuration
+
+# Authentication type must be specified: simple|kerberos|<class>
+# org.apache.falcon.security.RemoteUserInHeaderBasedAuthenticationHandler can be used for backwards compatibility
+*.falcon.http.authentication.type=kerberos
+
+# Indicates how long (in seconds) an authentication token is valid before it has to be renewed.
+*.falcon.http.authentication.token.validity=36000
+
+# The signature secret for signing the authentication tokens.
+*.falcon.http.authentication.signature.secret=falcon
+
+# The domain to use for the HTTP cookie that stores the authentication token.
+*.falcon.http.authentication.cookie.domain=
+
+# Indicates if anonymous requests are allowed when using 'simple' authentication.
+*.falcon.http.authentication.simple.anonymous.allowed=true
+
+# Indicates the Kerberos principal to be used for HTTP endpoint.
+# The principal MUST start with 'HTTP/' as per Kerberos HTTP SPNEGO specification.
+*.falcon.http.authentication.kerberos.principal=HTTP/_HOST@EXAMPLE.COM
+
+# Location of the keytab file with the credentials for the HTTP principal.
+*.falcon.http.authentication.kerberos.keytab=/etc/security/keytabs/spnego.service.keytab
+
+# The Kerberos name rules are used to resolve Kerberos principal names; refer to Hadoop's KerberosName for more details.
+*.falcon.http.authentication.kerberos.name.rules=DEFAULT
+
+# Comma separated list of black listed users
+*.falcon.http.authentication.blacklisted.users=
+
+# Increase Jetty request buffer size to accommodate the generated Kerberos token
+*.falcon.jetty.request.buffer.size=16192
+</verbatim>
+
+---+++ Pseudo/Simple Configuration
+
+<verbatim>
+##### SPNEGO Configuration
+
+# Authentication type must be specified: simple|kerberos|<class>
+# org.apache.falcon.security.RemoteUserInHeaderBasedAuthenticationHandler can be used for backwards compatibility
+*.falcon.http.authentication.type=simple
+
+# Indicates how long (in seconds) an authentication token is valid before it has to be renewed.
+*.falcon.http.authentication.token.validity=36000
+
+# The signature secret for signing the authentication tokens.
+*.falcon.http.authentication.signature.secret=falcon
+
+# The domain to use for the HTTP cookie that stores the authentication token.
+*.falcon.http.authentication.cookie.domain=
+
+# Indicates if anonymous requests are allowed when using 'simple' authentication.
+*.falcon.http.authentication.simple.anonymous.allowed=true
+
+# Comma separated list of black listed users
+*.falcon.http.authentication.blacklisted.users=
+</verbatim>
+
+---++ Authorization Configuration
+
+---+++ Enabling Authorization
+By default, support for authorization is disabled and specifying ACLs in entities is optional.
+To enable support for authorization, set falcon.security.authorization.enabled to true in the
+startup configuration.
+
+<verbatim>
+# Authorization Enabled flag: false|true
+*.falcon.security.authorization.enabled=true
+</verbatim>
+
+---+++ Authorization Provider
+
+Falcon ships with a basic authorization implementation, org.apache.falcon.security.DefaultAuthorizationProvider.
+This can be overridden with a custom implementation in the startup configuration.
+
+<verbatim>
+# Authorization Provider Fully Qualified Class Name
+*.falcon.security.authorization.provider=org.apache.falcon.security.DefaultAuthorizationProvider
+</verbatim>
+
+---+++ Super User Group
+
+Super user group is determined by the configuration:
+
+<verbatim>
+# The name of the group of super-users
+*.falcon.security.authorization.superusergroup=falcon
+</verbatim>
+
+---+++ Admin Membership
+
+Administrative users are determined by the configuration:
+
+<verbatim>
+# Admin Users, comma separated users
+*.falcon.security.authorization.admin.users=falcon,ambari-qa,seetharam
+</verbatim>
+
+Administrative groups are determined by the configuration:
+
+<verbatim>
+# Admin Group Membership, comma separated users
+*.falcon.security.authorization.admin.groups=falcon,testgroup,staff
+</verbatim>
+
+
+---++ SSL
+
+Falcon provides transport level security ensuring data confidentiality and integrity. This is
+enabled by default for communicating over HTTP between the client and the server.
+
+---+++ SSL Configuration
+
+<verbatim>
+*.falcon.enableTLS=true
+*.keystore.file=/path/to/keystore/file
+*.keystore.password=password
+</verbatim>
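+
+A keystore for the configuration above can be created with the standard JDK keytool; the alias, distinguished
+name and validity below are placeholders:
+
+<verbatim>
+$ keytool -genkeypair -alias falcon -keyalg RSA -validity 365 \
+    -keystore /path/to/keystore/file -storepass password \
+    -dname "CN=falcon-host.example.com, OU=IT, O=Example, C=US"
+</verbatim>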
+
+---+++ Distributed Falcon Setup
+
+Falcon should be configured to communicate with Prism over TLS in secure mode. It is not enabled by default.
+
+
+---++ Changes to ownership and permissions of directories managed by Falcon
+
+| *Directory*              | *Location*                                                        | *Owner* | *Permissions* |
+| Configuration Store      | ${config.store.uri}                                               | falcon  | 700           |
+| Cluster Staging Location | ${cluster.staging-location}                                       | falcon  | 777           |
+| Cluster Working Location | ${cluster.working-location}                                       | falcon  | 755           |
+| Shared libs              | {cluster.working}/{lib,libext}                                    | falcon  | 755           |
+| Oozie coord/bundle XMLs  | ${cluster.staging-location}/workflows/{entity}/{entity-name}      | $user   | cluster umask |
+| App logs                 | ${cluster.staging-location}/workflows/{entity}/{entity-name}/logs | $user   | cluster umask |
+
+*Note:* The cluster staging and working locations MUST be created prior to
+submitting a cluster entity to Falcon. Also note that the parent directories must have execute
+permissions. A sketch of the required commands is shown below.
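+
+The following is a sketch of the commands to create these directories; the paths are placeholders for the
+staging and working locations declared in the cluster entity:
+
+<verbatim>
+$ hadoop fs -mkdir -p /apps/falcon/primaryCluster/staging /apps/falcon/primaryCluster/working
+$ hadoop fs -chown falcon /apps/falcon/primaryCluster/staging /apps/falcon/primaryCluster/working
+$ hadoop fs -chmod 777 /apps/falcon/primaryCluster/staging
+$ hadoop fs -chmod 755 /apps/falcon/primaryCluster/working
+</verbatim>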
+
+
+---++ Backwards compatibility
+
+---+++ Scheduled Entities
+
+Entities already scheduled with an earlier version of Falcon are not compatible with this version.
+
+---+++ Falcon Clients
+
+Older Falcon clients are backwards compatible with respect to authentication; user information sent as part of the HTTP
+header, Remote-User, is still honoured when the authentication type is configured as below:
+
+<verbatim>
+*.falcon.http.authentication.type=org.apache.falcon.security.RemoteUserInHeaderBasedAuthenticationHandler
+</verbatim>
+
+---+++ Blacklisted super users for authentication
+
+The blacklisted users list used to contain the following super users: hdfs, mapreduce, oozie, and falcon.
+The list has been externalized from code into the startup.properties file; it is now empty by default and
+needs to be configured explicitly in that file.
+
+
+---+++ Falcon Dashboard
+
+To initialize the current user for the dashboard, append the query param "user.name=<username>" to the REST API call.
+
+If the dashboard user wishes to change the current user, they should do the following.
+   * delete the hadoop.auth cookie from browser cache.
+   * append query param "user.name=<new_user>" to the next REST API call.
+
+With the Kerberos method, the browser must support HTTP Kerberos SPNEGO.
+
+
+---++ Known Limitations
+
+   * ActiveMQ topics are not secure, but will be in the near future.
+   * Entities already scheduled with an earlier version of Falcon are not compatible with this version, since new
+   workflow parameters (such as the user) that are now required are passed back into Falcon.
+   * Use of hftp as the scheme for the read-only interface in the cluster entity [[https://issues.apache.org/jira/browse/HADOOP-10215][will not work in Oozie]].
+   The alternative is to use the webhdfs scheme instead, which has been tested with DistCp (see the example below).
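+
+For example, the read-only interface in the cluster entity could be declared with the webhdfs scheme as follows
+(host, port and version are placeholders):
+
+<verbatim>
+<interface type="readonly" endpoint="webhdfs://namenode-host:50070" version="2.7.1" />
+</verbatim>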
+
+
+---++ Examples
+
+---+++ Accessing the server using Falcon CLI (Java client)
+
+There is no change in the way the CLI is used. The CLI has been changed to work with the configured authentication
+method.
+
+---+++ Accessing the server using curl
+
+Try accessing protected resources using curl, as in the following examples:
+
+<verbatim>
+$ kinit
+Please enter the password for venkatesh@LOCALHOST:
+
+$ curl http://localhost:15000/api/admin/version
+
+$ curl http://localhost:15000/api/admin/version?user.name=venkatesh
+
+$ curl --negotiate -u foo -b ~/cookiejar.txt -c ~/cookiejar.txt http://localhost:15000/api/admin/version
+</verbatim>

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/CommonCLI.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/CommonCLI.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/CommonCLI.twiki
new file mode 100644
index 0000000..fab2ed1
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/CommonCLI.twiki
@@ -0,0 +1,21 @@
+---++ Common CLI Options
+
+---+++Falcon URL
+
+An optional -url option indicating the URL of the Falcon system to run the command against can be provided. If it is not mentioned, the URL is picked from the system environment variable FALCON_URL. If FALCON_URL is not set, it is picked from the client.properties file. If the option is not
+provided and is also not set in client.properties, the Falcon CLI will fail.
+
+---+++Proxy user support
+
+The -doAs option allows the current user to impersonate other users when interacting with the Falcon system. The current user must be configured as a proxyuser in the Falcon system. The proxyuser configuration may restrict the
+hosts from which a user may impersonate others, as well as the groups to which the impersonated users may belong.
+
+<a href="../FalconDocumentation.html#Proxyuser_support">Proxyuser support described here.</a>
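+
+For example, an entity status call issued on behalf of another user might look like this; the entity and user
+names are illustrative:
+
+$FALCON_HOME/bin/falcon entity -type process -name sample-process -status -doAs joe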
+
+---+++Debug Mode
+
+If you export FALCON_DEBUG=true then the Falcon CLI will output the Web Services API details used by any commands you execute. This is useful for debugging purposes or to see how the Falcon CLI works with the WS API.
+Alternatively, you can specify '-debug' in the CLI arguments to get the debug statements.
+
+Example:
+$FALCON_HOME/bin/falcon entity -submit -type cluster -file /cluster/definition.xml -debug
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/ContinueInstance.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/ContinueInstance.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/ContinueInstance.twiki
new file mode 100644
index 0000000..304e281
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/ContinueInstance.twiki
@@ -0,0 +1,8 @@
+---+++Continue
+
+[[CommonCLI][Common CLI Options]]
+
+The continue option is used to continue a failed workflow instance. This option is valid only for process instances in a terminal state, i.e. KILLED or FAILED.
+
+Usage:
+$FALCON_HOME/bin/falcon instance -type <<feed/process>> -name <<name>> -continue -start "yyyy-MM-dd'T'HH:mm'Z'" -end "yyyy-MM-dd'T'HH:mm'Z'"
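+
+Example (the process name and instance window below are illustrative):
+$FALCON_HOME/bin/falcon instance -type process -name sample-process -continue -start "2012-01-01T01:00Z" -end "2012-01-01T02:00Z"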

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/DefineExtension.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/DefineExtension.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/DefineExtension.twiki
new file mode 100644
index 0000000..c260911
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/DefineExtension.twiki
@@ -0,0 +1,8 @@
+---+++Definition
+
+[[CommonCLI][Common CLI Options]]
+
+Definition of an extension. Outputs a JSON document describing the extension invocation parameters.
+
+Usage:
+$FALCON_HOME/bin/falcon extension -definition -name <extension-name>

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/Definition.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/Definition.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/Definition.twiki
new file mode 100644
index 0000000..08d46c7
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/Definition.twiki
@@ -0,0 +1,8 @@
+---+++Definition
+
+[[CommonCLI][Common CLI Options]]
+
+The definition option returns the entity definition submitted earlier during the submit step.
+
+Usage:
+$FALCON_HOME/bin/falcon entity -type [cluster|datasource|feed|process] -name <<name>> -definition

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/DeleteEntity.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/DeleteEntity.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/DeleteEntity.twiki
new file mode 100644
index 0000000..cc07406
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/DeleteEntity.twiki
@@ -0,0 +1,8 @@
+---+++Delete
+
+[[CommonCLI][Common CLI Options]]
+
+Delete removes the submitted entity definition for the specified entity and puts it into the archive. The archive path is defined in startup.properties in the variable "config.store.uri".
+
+Usage:
+$FALCON_HOME/bin/falcon entity  -type [cluster|datasource|feed|process] -name <<name>> -delete

http://git-wip-us.apache.org/repos/asf/falcon/blob/4612c3f7/trunk/releases/0.10/src/site/twiki/falconcli/DependencyEntity.twiki
----------------------------------------------------------------------
diff --git a/trunk/releases/0.10/src/site/twiki/falconcli/DependencyEntity.twiki b/trunk/releases/0.10/src/site/twiki/falconcli/DependencyEntity.twiki
new file mode 100644
index 0000000..bdef1d7
--- /dev/null
+++ b/trunk/releases/0.10/src/site/twiki/falconcli/DependencyEntity.twiki
@@ -0,0 +1,10 @@
+---+++Dependency
+
+[[CommonCLI][Common CLI Options]]
+
+With the dependency option, we can list all the entities on which the specified entity depends.
+For example, for a feed, dependency returns the cluster name, and for a process it returns all the input feeds,
+output feeds, and cluster names.
+
+Usage:
+$FALCON_HOME/bin/falcon entity -type [cluster|datasource|feed|process] -name <<name>> -dependency
\ No newline at end of file