Posted to commits@atlas.apache.org by kb...@apache.org on 2018/01/19 10:20:46 UTC

[2/2] atlas git commit: ATLAS-2365: updated README for 1.0.0-alpha release

ATLAS-2365: updated README for 1.0.0-alpha release

Signed-off-by: kevalbhatt <kb...@apache.org>


Project: http://git-wip-us.apache.org/repos/asf/atlas/repo
Commit: http://git-wip-us.apache.org/repos/asf/atlas/commit/c65586f1
Tree: http://git-wip-us.apache.org/repos/asf/atlas/tree/c65586f1
Diff: http://git-wip-us.apache.org/repos/asf/atlas/diff/c65586f1

Branch: refs/heads/master
Commit: c65586f13896a44eb400c45c084499ab121c2e59
Parents: 39be2cc
Author: Madhan Neethiraj <ma...@apache.org>
Authored: Fri Jan 19 15:49:44 2018 +0530
Committer: kevalbhatt <kb...@apache.org>
Committed: Fri Jan 19 15:49:44 2018 +0530

----------------------------------------------------------------------
 docs/pom.xml                                |   3 +
 docs/src/site/twiki/Architecture.twiki      |  30 +-
 docs/src/site/twiki/Bridge-Falcon.twiki     |  56 ++--
 docs/src/site/twiki/Bridge-Hive.twiki       | 117 ++++----
 docs/src/site/twiki/Bridge-Sqoop.twiki      |  45 +--
 docs/src/site/twiki/Configuration.twiki     | 226 ++++-----------
 docs/src/site/twiki/HighAvailability.twiki  |  12 +-
 docs/src/site/twiki/InstallationSteps.twiki | 341 +++++++++--------------
 docs/src/site/twiki/QuickStart.twiki        |   7 +-
 docs/src/site/twiki/Repository.twiki        |   4 -
 docs/src/site/twiki/TypeSystem.twiki        | 206 +++++++-------
 docs/src/site/twiki/index.twiki             |  46 +--
 docs/src/site/twiki/security.twiki          |   2 +-
 13 files changed, 451 insertions(+), 644 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/pom.xml
----------------------------------------------------------------------
diff --git a/docs/pom.xml b/docs/pom.xml
index 15c1c38..1e38757 100755
--- a/docs/pom.xml
+++ b/docs/pom.xml
@@ -77,6 +77,9 @@
                         <version>1.6</version>
                     </dependency>
                 </dependencies>
+                <configuration>
+                    <port>8080</port>
+                </configuration>
                 <executions>
                     <execution>
                         <goals>

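If the <configuration> above applies to the maven-site-plugin (whose run goal supports a port parameter), the generated documentation could be previewed locally with something like:
<verbatim>
cd docs
mvn site:run    # serves the rendered site on http://localhost:8080
</verbatim>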
http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Architecture.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Architecture.twiki b/docs/src/site/twiki/Architecture.twiki
index c832500..d0f1a05 100755
--- a/docs/src/site/twiki/Architecture.twiki
+++ b/docs/src/site/twiki/Architecture.twiki
@@ -8,8 +8,7 @@
 The components of Atlas can be grouped under the following major categories:
 
 ---+++ Core
-
-This category contains the components that implement the core of Atlas functionality, including:
+Atlas core includes the following components:
 
 *Type System*: Atlas allows users to define a model for the metadata objects they want to manage. The model is composed
 of definitions called ‘types’. Instances of ‘types’ called ‘entities’ represent the actual metadata objects that are
@@ -21,25 +20,18 @@ One key point to note is that the generic nature of the modelling in Atlas allow
 define both technical metadata and business metadata. It is also possible to define rich relationships between the
 two using features of Atlas.
 
+*Graph Engine*: Internally, Atlas persists metadata objects it manages using a Graph model. This approach provides great
+flexibility and enables efficient handling of rich relationships between the metadata objects. The graph engine component is
+responsible for translating between types and entities of the Atlas type system, and the underlying graph persistence model.
+In addition to managing the graph objects, the graph engine also creates the appropriate indices for the metadata
+objects so that they can be searched efficiently. Atlas uses JanusGraph to store the metadata objects.
+
 *Ingest / Export*: The Ingest component allows metadata to be added to Atlas. Similarly, the Export component exposes
 metadata changes detected by Atlas to be raised as events. Consumers can consume these change events to react to
 metadata changes in real time.
 
-*Graph Engine*: Internally, Atlas represents metadata objects it manages using a Graph model. It does this to
-achieve great flexibility and rich relations between the metadata objects. The Graph Engine is a component that is
-responsible for translating between types and entities of the Type System, and the underlying Graph model.
-In addition to managing the Graph objects, The Graph Engine also creates the appropriate indices for the metadata
-objects so that they can be searched for efficiently.
-
-*Titan*: Currently, Atlas uses the Titan Graph Database to store the metadata objects. Titan is used as a library
-within Atlas. Titan uses two stores: The Metadata store is configured to !HBase by default and the Index store
-is configured to Solr. It is also possible to use the Metadata store as BerkeleyDB and Index store as !ElasticSearch
-by building with corresponding profiles. The Metadata store is used for storing the metadata objects proper, and the
-Index store is used for storing indices of the Metadata properties, that allows efficient search.
-
 
 ---+++ Integration
-
 Users can manage metadata in Atlas using two methods:
 
 *API*: All functionality of Atlas is exposed to end users via a REST API that allows types and entities to be created,
@@ -53,7 +45,6 @@ uses Apache Kafka as a notification server for communication between hooks and d
 notification events. Events are written by the hooks and Atlas to different Kafka topics.
 
 ---+++ Metadata sources
-
 Atlas supports integration with many sources of metadata out of the box. More integrations will be added in future
 as well. Currently, Atlas supports ingesting and managing metadata from the following sources:
 
@@ -61,6 +52,7 @@ as well. Currently, Atlas supports ingesting and managing metadata from the foll
    * [[Bridge-Sqoop][Sqoop]]
    * [[Bridge-Falcon][Falcon]]
    * [[StormAtlasHook][Storm]]
+   * HBase - _documentation work-in-progress_
 
 The integration implies two things:
 There are metadata models that Atlas defines natively to represent objects of these components.
@@ -80,12 +72,6 @@ for the Hadoop ecosystem having wide integration with a variety of Hadoop compon
 Ranger allows security administrators to define metadata driven security policies for effective governance.
 Ranger is a consumer to the metadata change events notified by Atlas.
 
-*Business Taxonomy*: The metadata objects ingested into Atlas from the Metadata sources are primarily a form
-of technical metadata. To enhance the discoverability and governance capabilities, Atlas comes with a Business
-Taxonomy interface that allows users to first, define a hierarchical set of business terms that represent their
-business domain and associate them to the metadata entities Atlas manages. Business Taxonomy is a web application that
-is part of the Atlas Admin UI currently and integrates with Atlas using the REST API.
-
 
 
 

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Bridge-Falcon.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Bridge-Falcon.twiki b/docs/src/site/twiki/Bridge-Falcon.twiki
index de80035..0cf1645 100644
--- a/docs/src/site/twiki/Bridge-Falcon.twiki
+++ b/docs/src/site/twiki/Bridge-Falcon.twiki
@@ -1,44 +1,52 @@
 ---+ Falcon Atlas Bridge
 
 ---++ Falcon Model
-The default falcon modelling is available in org.apache.atlas.falcon.model.FalconDataModelGenerator. It defines the following types:
-<verbatim>
-falcon_cluster(ClassType) - super types [Infrastructure] - attributes [timestamp, colo, owner, tags]
-falcon_feed(ClassType) - super types [DataSet] - attributes [timestamp, stored-in, owner, groups, tags]
-falcon_feed_creation(ClassType) - super types [Process] - attributes [timestamp, stored-in, owner]
-falcon_feed_replication(ClassType) - super types [Process] - attributes [timestamp, owner]
-falcon_process(ClassType) - super types [Process] - attributes [timestamp, runs-on, owner, tags, pipelines, workflow-properties]
-</verbatim>
+The default falcon model includes the following types:
+   * Entity types:
+      * falcon_cluster
+         * super-types: Infrastructure
+         * attributes: timestamp, colo, owner, tags
+      * falcon_feed
+         * super-types: !DataSet
+         * attributes: timestamp, stored-in, owner, groups, tags
+      * falcon_feed_creation
+         * super-types: Process
+         * attributes: timestamp, stored-in, owner
+      * falcon_feed_replication
+         * super-types: Process
+         * attributes: timestamp, owner
+      * falcon_process
+         * super-types: Process
+         * attributes: timestamp, runs-on, owner, tags, pipelines, workflow-properties
 
 One falcon_process entity is created for every cluster that the falcon process is defined for.
 
 The entities are created and de-duped using unique qualifiedName attribute. They provide namespace and can be used for querying/lineage as well. The unique attributes are:
-   * falcon_process - <process name>@<cluster name>
-   * falcon_cluster - <cluster name>
-   * falcon_feed - <feed name>@<cluster name>
-   * falcon_feed_creation - <feed name>
-   * falcon_feed_replication - <feed name>
+   * falcon_process.qualifiedName          - <process name>@<cluster name>
+   * falcon_cluster.qualifiedName          - <cluster name>
+   * falcon_feed.qualifiedName             - <feed name>@<cluster name>
+   * falcon_feed_creation.qualifiedName    - <feed name>
+   * falcon_feed_replication.qualifiedName - <feed name>
 
 ---++ Falcon Hook
-Falcon supports listeners on falcon entity submission. This is used to add entities in Atlas using the model defined in org.apache.atlas.falcon.model.FalconDataModelGenerator.
-The hook submits the request to a thread pool executor to avoid blocking the command execution. The thread submits the entities as message to the notification server and atlas server reads these messages and registers the entities.
+Falcon supports listeners on falcon entity submission. This is used to add entities in Atlas using the model detailed above.
+Follow the instructions below to set up the Atlas hook in Falcon:
    * Add 'org.apache.atlas.falcon.service.AtlasService' to application.services in <falcon-conf>/startup.properties
-   * Link falcon hook jars in falcon classpath - 'ln -s <atlas-home>/hook/falcon/* <falcon-home>/server/webapp/falcon/WEB-INF/lib/'
+   * Link Atlas hook jars in Falcon classpath - 'ln -s <atlas-home>/hook/falcon/* <falcon-home>/server/webapp/falcon/WEB-INF/lib/'
    * In <falcon_conf>/falcon-env.sh, set an environment variable as follows:
      <verbatim>
-     export FALCON_SERVER_OPTS="<atlas_home>/hook/falcon/*:$FALCON_SERVER_OPTS"
-     </verbatim>
+     export FALCON_SERVER_OPTS="<atlas_home>/hook/falcon/*:$FALCON_SERVER_OPTS"</verbatim>
 
 The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
-   * atlas.hook.falcon.synchronous - boolean, true to run the hook synchronously. default false
-   * atlas.hook.falcon.numRetries - number of retries for notification failure. default 3
-   * atlas.hook.falcon.minThreads - core number of threads. default 5
-   * atlas.hook.falcon.maxThreads - maximum number of threads. default 5
+   * atlas.hook.falcon.synchronous   - boolean, true to run the hook synchronously. default false
+   * atlas.hook.falcon.numRetries    - number of retries for notification failure. default 3
+   * atlas.hook.falcon.minThreads    - core number of threads. default 5
+   * atlas.hook.falcon.maxThreads    - maximum number of threads. default 5
    * atlas.hook.falcon.keepAliveTime - keep alive time in msecs. default 10
-   * atlas.hook.falcon.queueSize - queue size for the threadpool. default 10000
+   * atlas.hook.falcon.queueSize     - queue size for the threadpool. default 10000
 
 Refer [[Configuration][Configuration]] for notification related configurations
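For example, the defaults listed above correspond to the following entries in <atlas-conf>/atlas-application.properties:
<verbatim>
atlas.hook.falcon.synchronous=false
atlas.hook.falcon.numRetries=3
atlas.hook.falcon.minThreads=5
atlas.hook.falcon.maxThreads=5
atlas.hook.falcon.keepAliveTime=10
atlas.hook.falcon.queueSize=10000
</verbatim>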
 
 
----++ Limitations
+---++ NOTES
    * In falcon cluster entity, cluster name used should be uniform across components like hive, falcon, sqoop etc. If used with ambari, ambari cluster name should be used for cluster entity

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Bridge-Hive.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Bridge-Hive.twiki b/docs/src/site/twiki/Bridge-Hive.twiki
index dd22b5c..7c93ecd 100644
--- a/docs/src/site/twiki/Bridge-Hive.twiki
+++ b/docs/src/site/twiki/Bridge-Hive.twiki
@@ -1,73 +1,71 @@
 ---+ Hive Atlas Bridge
 
 ---++ Hive Model
-The default hive modelling is available in org.apache.atlas.hive.model.HiveDataModelGenerator. It defines the following types:
-<verbatim>
-hive_db(ClassType) - super types [Referenceable] - attributes [name, clusterName, description, locationUri, parameters, ownerName, ownerType]
-hive_storagedesc(ClassType) - super types [Referenceable] - attributes [cols, location, inputFormat, outputFormat, compressed, numBuckets, serdeInfo, bucketCols, sortCols, parameters, storedAsSubDirectories]
-hive_column(ClassType) - super types [Referenceable] - attributes [name, type, comment, table]
-hive_table(ClassType) - super types [DataSet] - attributes [name, db, owner, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, aliases, parameters, viewOriginalText, viewExpandedText, tableType, temporary]
-hive_process(ClassType) - super types [Process] - attributes [name, startTime, endTime, userName, operationType, queryText, queryPlan, queryId]
-hive_principal_type(EnumType) - values [USER, ROLE, GROUP]
-hive_order(StructType) - attributes [col, order]
-hive_serde(StructType) - attributes [name, serializationLib, parameters]
-</verbatim>
-
-The entities are created and de-duped using unique qualified name. They provide namespace and can be used for querying/lineage as well. Note that  dbName, tableName and columnName should be in lower case. clusterName is explained below.
-   * hive_db - attribute qualifiedName - <dbName>@<clusterName>
-   * hive_table - attribute qualifiedName - <dbName>.<tableName>@<clusterName>
-   * hive_column - attribute qualifiedName - <dbName>.<tableName>.<columnName>@<clusterName>
-   * hive_process - attribute name - <queryString> - trimmed query string in lower case
+The default hive model includes the following types:
+   * Entity types:
+      * hive_db
+         * super-types: Referenceable
+         * attributes: name, clusterName, description, locationUri, parameters, ownerName, ownerType
+      * hive_storagedesc
+         * super-types: Referenceable
+         * attributes: cols, location, inputFormat, outputFormat, compressed, numBuckets, serdeInfo, bucketCols, sortCols, parameters, storedAsSubDirectories
+      * hive_column
+         * super-types: Referenceable
+         * attributes: name, type, comment, table
+      * hive_table
+         * super-types: !DataSet
+         * attributes: name, db, owner, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, aliases, parameters, viewOriginalText, viewExpandedText, tableType, temporary
+      * hive_process
+         * super-types: Process
+         * attributes: name, startTime, endTime, userName, operationType, queryText, queryPlan, queryId
+      * hive_column_lineage
+         * super-types: Process
+         * attributes: query, depenendencyType, expression
+
+   * Enum types:
+      * hive_principal_type
+         * values: USER, ROLE, GROUP
+
+   * Struct types:
+      * hive_order
+         * attributes: col, order
+      * hive_serde
+         * attributes: name, serializationLib, parameters
+
+The entities are created and de-duped using a unique qualified name. They provide namespace and can be used for querying/lineage as well (see the example query below). Note that dbName, tableName and columnName should be in lower case. clusterName is explained below.
+   * hive_db.qualifiedName     - <dbName>@<clusterName>
+   * hive_table.qualifiedName  - <dbName>.<tableName>@<clusterName>
+   * hive_column.qualifiedName - <dbName>.<tableName>.<columnName>@<clusterName>
+   * hive_process.queryString  - trimmed query string in lower case
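For illustration, the qualifiedName attribute can be used in DSL searches; the database, table and cluster names below are hypothetical:
<verbatim>
hive_table  where qualifiedName = "sales.customers@primary"
hive_column where qualifiedName = "sales.customers.customer_id@primary"
</verbatim>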
 
 
 ---++ Importing Hive Metadata
-org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the Hive metadata into Atlas using the model defined in org.apache.atlas.hive.model.HiveDataModelGenerator. import-hive.sh command can be used to facilitate this. The script needs Hadoop and Hive classpath jars.
-  * For Hadoop jars, please make sure that the environment variable HADOOP_CLASSPATH is set. Another way is to set HADOOP_HOME to point to root directory of your Hadoop installation
-  * Similarly, for Hive jars, set HIVE_HOME to the root of Hive installation
-  * Set environment variable HIVE_CONF_DIR to Hive configuration directory
-  * Copy <atlas-conf>/atlas-application.properties to the hive conf directory
-
+org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the Hive metadata into Atlas using the model defined above. import-hive.sh command can be used to facilitate this.
     <verbatim>
-    Usage: <atlas package>/hook-bin/import-hive.sh
-    </verbatim>
+    Usage: <atlas package>/hook-bin/import-hive.sh</verbatim>
 
 The logs are in <atlas package>/logs/import-hive.log
 
-If you you are importing metadata in a kerberized cluster you need to run the command like this:
-<verbatim>
-<atlas package>/hook-bin/import-hive.sh -Dsun.security.jgss.debug=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=[krb5.conf location] -Djava.security.auth.login.config=[jaas.conf location]
-</verbatim>
-   * krb5.conf is typically found at /etc/krb5.conf
-   * for details about jaas.conf and a suggested location see the [[security][atlas security documentation]]
-
 
 ---++ Hive Hook
-Hive supports listeners on hive command execution using hive hooks. This is used to add/update/remove entities in Atlas using the model defined in org.apache.atlas.hive.model.HiveDataModelGenerator.
-The hook submits the request to a thread pool executor to avoid blocking the command execution. The thread submits the entities as message to the notification server and atlas server reads these messages and registers the entities.
-Follow these instructions in your hive set-up to add hive hook for Atlas:
-   * Set-up atlas hook in hive-site.xml of your hive configuration:
+Atlas Hive hook registers with Hive to listen for create/update/delete operations and updates the metadata in Atlas, via Kafka notifications, for the changes in Hive.
+Follow the instructions below to set up the Atlas hook in Hive:
+   * Set up the Atlas hook in hive-site.xml by adding the following (see the note after this list if other post-execution hooks are already configured):
   <verbatim>
     <property>
       <name>hive.exec.post.hooks</name>
       <value>org.apache.atlas.hive.hook.HiveHook</value>
-    </property>
-  </verbatim>
-  <verbatim>
-    <property>
-      <name>atlas.cluster.name</name>
-      <value>primary</value>
-    </property>
-  </verbatim>
+    </property></verbatim>
    * Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' in hive-env.sh of your hive configuration
    * Copy <atlas-conf>/atlas-application.properties to the hive conf directory.
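If hive.exec.post.hooks already lists other hooks, the Atlas hook is appended as one more comma-separated value; the first class below is a hypothetical pre-existing hook:
<verbatim>
    <property>
      <name>hive.exec.post.hooks</name>
      <value>com.example.ExistingPostHook,org.apache.atlas.hive.hook.HiveHook</value>
    </property>
</verbatim>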
 
 The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
-   * atlas.hook.hive.synchronous - boolean, true to run the hook synchronously. default false. Recommended to be set to false to avoid delays in hive query completion.
-   * atlas.hook.hive.numRetries - number of retries for notification failure. default 3
-   * atlas.hook.hive.minThreads - core number of threads. default 5
-   * atlas.hook.hive.maxThreads - maximum number of threads. default 5
+   * atlas.hook.hive.synchronous   - boolean, true to run the hook synchronously. default false. Recommended to be set to false to avoid delays in hive query completion.
+   * atlas.hook.hive.numRetries    - number of retries for notification failure. default 3
+   * atlas.hook.hive.minThreads    - core number of threads. default 1
+   * atlas.hook.hive.maxThreads    - maximum number of threads. default 5
    * atlas.hook.hive.keepAliveTime - keep alive time in msecs. default 10
-   * atlas.hook.hive.queueSize - queue size for the threadpool. default 10000
+   * atlas.hook.hive.queueSize     - queue size for the threadpool. default 10000
 
 Refer [[Configuration][Configuration]] for notification related configurations
 
@@ -76,24 +74,23 @@ Refer [[Configuration][Configuration]] for notification related configurations
 Starting from 0.8-incubating version of Atlas, Column level lineage is captured in Atlas. Below are the details
 
 ---+++ Model
-   * !ColumnLineageProcess type is a subclass of Process
+   * !ColumnLineageProcess type is a subtype of Process
 
    * This relates an output Column to a set of input Columns or the Input Table
 
-   * The Lineage also captures the kind of Dependency: currently the values are SIMPLE, EXPRESSION, SCRIPT
-      * A SIMPLE dependency means the output column has the same value as the input
-      * An EXPRESSION dependency means the output column is transformed by some expression in the runtime(for e.g. a Hive SQL expression) on the Input Columns.
-      * SCRIPT means that the output column is transformed by a user provided script.
+   * The lineage also captures the kind of dependency, as listed below:
+      * SIMPLE:     output column has the same value as the input
+      * EXPRESSION: output column is transformed by some expression at runtime (e.g. a Hive SQL expression) on the Input Columns.
+      * SCRIPT:     output column is transformed by a user provided script.
 
    * In case of EXPRESSION dependency the expression attribute contains the expression in string form
 
-   * Since Process links input and output !DataSets, we make Column a subclass of !DataSet
+   * Since Process links input and output !DataSets, Column is a subtype of !DataSet
 
 ---+++ Examples
 For a simple CTAS below:
 <verbatim>
-create table t2 as select id, name from T1
-</verbatim>
+create table t2 as select id, name from T1</verbatim>
 
 The lineage is captured as
 
@@ -106,10 +103,8 @@ The lineage is captured as
 
   * The !LineageInfo in Hive provides column-level lineage for the final !FileSinkOperator, linking them to the input columns in the Hive Query
 
----+++ NOTE
-Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
-
----++ Limitations
+---++ NOTES
+   * Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
    * Since database name, table name and column names are case insensitive in hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
    * The following hive operations are captured by hive hook currently
       * create database

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Bridge-Sqoop.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Bridge-Sqoop.twiki b/docs/src/site/twiki/Bridge-Sqoop.twiki
index bf942f2..480578b 100644
--- a/docs/src/site/twiki/Bridge-Sqoop.twiki
+++ b/docs/src/site/twiki/Bridge-Sqoop.twiki
@@ -1,37 +1,42 @@
 ---+ Sqoop Atlas Bridge
 
 ---++ Sqoop Model
-The default Sqoop modelling is available in org.apache.atlas.sqoop.model.SqoopDataModelGenerator. It defines the following types:
-<verbatim>
-sqoop_operation_type(EnumType) - values [IMPORT, EXPORT, EVAL]
-sqoop_dbstore_usage(EnumType) - values [TABLE, QUERY, PROCEDURE, OTHER]
-sqoop_process(ClassType) - super types [Process] - attributes [name, operation, dbStore, hiveTable, commandlineOpts, startTime, endTime, userName]
-sqoop_dbdatastore(ClassType) - super types [DataSet] - attributes [name, dbStoreType, storeUse, storeUri, source, description, ownerName]
-</verbatim>
+The default sqoop model includes the following types:
+   * Entity types:
+      * sqoop_process
+         * super-types: Process
+         * attributes: name, operation, dbStore, hiveTable, commandlineOpts, startTime, endTime, userName
+      * sqoop_dbdatastore
+         * super-types: !DataSet
+         * attributes: name, dbStoreType, storeUse, storeUri, source, description, ownerName
+
+   * Enum types:
+      * sqoop_operation_type
+         * values: IMPORT, EXPORT, EVAL
+      * sqoop_dbstore_usage
+         * values: TABLE, QUERY, PROCEDURE, OTHER
 
 The entities are created and de-duped using unique qualified name. They provide namespace and can be used for querying as well:
-sqoop_process - attribute name - sqoop-dbStoreType-storeUri-endTime
-sqoop_dbdatastore - attribute name - dbStoreType-connectorUrl-source
+   * sqoop_process.qualifiedName     - dbStoreType-storeUri-endTime
+   * sqoop_dbdatastore.qualifiedName - dbStoreType-storeUri-source
 
 ---++ Sqoop Hook
-Sqoop added a !SqoopJobDataPublisher that publishes data to Atlas after completion of import Job. Today, only hiveImport is supported in sqoopHook.
-This is used to add entities in Atlas using the model defined in org.apache.atlas.sqoop.model.SqoopDataModelGenerator.
-Follow these instructions in your sqoop set-up to add sqoop hook for Atlas in <sqoop-conf>/sqoop-site.xml:
+Sqoop added a !SqoopJobDataPublisher that publishes data to Atlas after completion of an import job. Today, only hiveImport is supported in !SqoopHook.
+This is used to add entities in Atlas using the model detailed above.
+
+Follow the instructions below to set up the Atlas hook in Sqoop:
 
-   * Sqoop Job publisher class.  Currently only one publishing class is supported
+Add the following properties to enable the Atlas hook in Sqoop:
+   * Set up the Atlas hook in <sqoop-conf>/sqoop-site.xml by adding the following:
+  <verbatim>
    <property>
      <name>sqoop.job.data.publish.class</name>
      <value>org.apache.atlas.sqoop.hook.SqoopHook</value>
-   </property>
-   * Atlas cluster name
-   <property>
-     <name>atlas.cluster.name</name>
-     <value><clustername></value>
-   </property>
+   </property></verbatim>
   * Copy <atlas-conf>/atlas-application.properties to the sqoop conf directory <sqoop-conf>/
    * Link <atlas-home>/hook/sqoop/*.jar in sqoop lib
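One way to perform the last two steps, assuming <sqoop-home>/conf and <sqoop-home>/lib are the Sqoop configuration and library directories:
<verbatim>
cp <atlas-conf>/atlas-application.properties <sqoop-home>/conf/
ln -s <atlas-home>/hook/sqoop/*.jar <sqoop-home>/lib/
</verbatim>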
 
 Refer [[Configuration][Configuration]] for notification related configurations
 
----++ Limitations
+---++ NOTES
    * Only the following sqoop operations are captured by sqoop hook currently - hiveImport

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Configuration.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Configuration.twiki b/docs/src/site/twiki/Configuration.twiki
index 19c39b0..63c3fce 100644
--- a/docs/src/site/twiki/Configuration.twiki
+++ b/docs/src/site/twiki/Configuration.twiki
@@ -5,139 +5,42 @@ All configuration in Atlas uses java properties style configuration. The main co
 
 ---++ Graph Configs
 
----+++ Graph persistence engine
-
-This section sets up the graph db - titan - to use a persistence engine. Please refer to
-<a href="http://s3.thinkaurelius.com/docs/titan/0.5.4/titan-config-ref.html">link</a> for more
-details. The example below uses BerkeleyDBJE.
-
-<verbatim>
-atlas.graph.storage.backend=berkeleyje
-atlas.graph.storage.directory=data/berkeley
-</verbatim>
-
----++++ Graph persistence engine - Hbase
-
-Basic configuration
+---+++ Graph persistence engine - HBase
+Set the following properties to configure JanusGraph to use HBase as the persistence engine. Please refer to
+<a href="http://docs.janusgraph.org/0.2.0/configuration.html#_hbase_caching">link</a> for more details.
 
 <verbatim>
 atlas.graph.storage.backend=hbase
-#For standalone mode , specify localhost
-#for distributed mode, specify zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
 atlas.graph.storage.hostname=<ZooKeeper Quorum>
+atlas.graph.storage.hbase.table=atlas
 </verbatim>
 
-HBASE_CONF_DIR environment variable needs to be set to point to the Hbase client configuration directory which is added to classpath when Atlas starts up.
-hbase-site.xml needs to have the following properties set according to the cluster setup
-<verbatim>
-#Set below to /hbase-secure if the Hbase server is setup in secure mode
-zookeeper.znode.parent=/hbase-unsecure
-</verbatim>
+If any further JanusGraph configuration needs to be set up, please prefix the property name with "atlas.graph.".
 
-Advanced configuration
+In addition to setting up configurations, please ensure that the environment variable HBASE_CONF_DIR is set up to point to
+the directory containing the HBase configuration file hbase-site.xml.
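For example, in atlas-env.sh (the path below is hypothetical and depends on the cluster layout):
<verbatim>
export HBASE_CONF_DIR=/etc/hbase/conf
</verbatim>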
 
-# If you are planning to use any of the configs mentioned below, they need to be prefixed with "atlas.graph." to take effect in ATLAS
-Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/titan-config-ref.html#_storage_hbase
-
-Permissions
-
-When Atlas is configured with HBase as the storage backend the graph db (titan) needs sufficient user permissions to be able to create and access an HBase table.  In a secure cluster it may be necessary to grant permissions to the 'atlas' user for the 'titan' table.
-
-With Ranger, a policy can be configured for 'titan'.
-
-Without Ranger, HBase shell can be used to set the permissions.
+---+++ Graph Search Index - Solr
+Solr installation in Cloud mode is a prerequisite for Apache Atlas use. Set the following properties to configure JanusGraph to use Solr as the index search engine.
 
 <verbatim>
-   su hbase
-   kinit -k -t <hbase keytab> <hbase principal>
-   echo "grant 'atlas', 'RWXCA', 'titan'" | hbase shell
-</verbatim>
+atlas.graph.index.search.backend=solr5
+atlas.graph.index.search.solr.mode=cloud
+atlas.graph.index.search.solr.wait-searcher=true
 
-Note that if the embedded-hbase-solr profile is used then HBase is included in the distribution so that a standalone
-instance of HBase can be started as the default storage backend for the graph repository.  Using the embedded-hbase-solr
-profile will configure Atlas so that HBase instance will be started and stopped along with the Atlas server by default.
-To use the embedded-hbase-solr profile please see "Building Atlas" in the [[InstallationSteps][Installation Steps]]
-section.
+# ZK quorum setup for solr as comma separated value. Example: 10.1.6.4:2181,10.1.6.5:2181
+atlas.graph.index.search.solr.zookeeper-url=
 
----+++ Graph Search Index
-This section sets up the graph db - titan - to use an search indexing system. The example
-configuration below sets up to use an embedded Elastic search indexing system.
+# SolrCloud Zookeeper Connection Timeout. Default value is 60000 ms
+atlas.graph.index.search.solr.zookeeper-connect-timeout=60000
 
-<verbatim>
-atlas.graph.index.search.backend=elasticsearch
-atlas.graph.index.search.directory=data/es
-atlas.graph.index.search.elasticsearch.client-only=false
-atlas.graph.index.search.elasticsearch.local-mode=true
-atlas.graph.index.search.elasticsearch.create.sleep=2000
-</verbatim>
-
----++++ Graph Search Index - Solr
-Please note that Solr installation in Cloud mode is a prerequisite before configuring Solr as the search indexing backend. Refer InstallationSteps section for Solr installation/configuration.
-
-<verbatim>
- atlas.graph.index.search.backend=solr5
- atlas.graph.index.search.solr.mode=cloud
- atlas.graph.index.search.solr.zookeeper-url=<the ZK quorum setup for solr as comma separated value> eg: 10.1.6.4:2181,10.1.6.5:2181
- atlas.graph.index.search.solr.zookeeper-connect-timeout=<SolrCloud Zookeeper Connection Timeout>. Default value is 60000 ms
- atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms
-</verbatim>
-
-Also note that if the embedded-hbase-solr profile is used then Solr is included in the distribution so that a standalone
-instance of Solr can be started as the default search indexing backend. Using the embedded-hbase-solr profile will
-configure Atlas so that the standalone Solr instance will be started and stopped along with the Atlas server by default.
-To use the embedded-hbase-solr profile please see "Building Atlas" in the [[InstallationSteps][Installation Steps]]
-section.
-
----+++ Choosing between Persistence and Indexing Backends
-
-Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/bdb.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/hbase.html for choosing between the persistence backends.
-BerkeleyDB is suitable for smaller data sets in the range of upto 10 million vertices with ACID gurantees.
-HBase on the other hand doesnt provide ACID guarantees but is able to scale for larger graphs. HBase also provides HA inherently.
-
----+++ Choosing between Persistence Backends
-
-Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/bdb.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/hbase.html for choosing between the persistence backends.
-BerkeleyDB is suitable for smaller data sets in the range of upto 10 million vertices with ACID gurantees.
-HBase on the other hand doesnt provide ACID guarantees but is able to scale for larger graphs. HBase also provides HA inherently.
-
----+++ Choosing between Indexing Backends
-
-Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html for choosing between !ElasticSearch and Solr.
-Solr in cloud mode is the recommended setup.
-
----+++ Switching Persistence Backend
-
-For switching the storage backend from BerkeleyDB to HBase and vice versa, refer the documentation for "Graph Persistence Engine" described above and restart ATLAS.
-The data in the indexing backend needs to be cleared else there will be discrepancies between the storage and indexing backend which could result in errors during the search.
-!ElasticSearch runs by default in embedded mode and the data could easily be cleared by deleting the ATLAS_HOME/data/es directory.
-For Solr, the collections which were created during ATLAS Installation - vertex_index, edge_index, fulltext_index could be deleted which will cleanup the indexes
-
----+++ Switching Index Backend
-
-Switching the Index backend requires clearing the persistence backend data. Otherwise there will be discrepancies between the persistence and index backends since switching the indexing backend means index data will be lost.
-This leads to "Fulltext" queries not working on the existing data
-For clearing the data for BerkeleyDB, delete the ATLAS_HOME/data/berkeley directory
-For clearing the data for HBase, in Hbase shell, run 'disable titan' and 'drop titan'
-
-
----++ Lineage Configs
-
-The higher layer services like lineage, schema, etc. are driven by the type system and this section encodes the specific types for the hive data model.
-
-# This models reflects the base super types for Data and Process
-<verbatim>
-atlas.lineage.hive.table.type.name=DataSet
-atlas.lineage.hive.process.type.name=Process
-atlas.lineage.hive.process.inputs.name=inputs
-atlas.lineage.hive.process.outputs.name=outputs
-
-## Schema
-atlas.lineage.hive.table.schema.query=hive_table where name=?, columns
+# SolrCloud Zookeeper Session Timeout. Default value is 60000 ms
+atlas.graph.index.search.solr.zookeeper-session-timeout=60000
 </verbatim>
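The index collections used by Atlas (vertex_index, edge_index, fulltext_index) can be created with the Solr CLI; a sketch, assuming the Solr configuration files shipped with Atlas under <atlas-conf>/solr and a two-node setup:
<verbatim>
$SOLR_HOME/bin/solr create -c vertex_index   -d <atlas-conf>/solr -shards 2 -replicationFactor 2
$SOLR_HOME/bin/solr create -c edge_index     -d <atlas-conf>/solr -shards 2 -replicationFactor 2
$SOLR_HOME/bin/solr create -c fulltext_index -d <atlas-conf>/solr -shards 2 -replicationFactor 2
</verbatim>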
 
 
 ---++ Search Configs
-Search APIs (DSL and full text search) support pagination and have optional limit and offset arguments. Following configs are related to search pagination
+Search APIs (DSL, basic search, full-text search) support pagination and have optional limit and offset arguments. The following configs are related to search pagination:
 
 <verbatim>
 # Default limit used when limit is not specified in API
@@ -152,53 +55,36 @@ atlas.search.maxlimit=10000
 Refer http://kafka.apache.org/documentation.html#configuration for Kafka configuration. All Kafka configs should be prefixed with 'atlas.kafka.'
 
 <verbatim>
-atlas.notification.embedded=true
-atlas.kafka.data=${sys:atlas.home}/data/kafka
-atlas.kafka.zookeeper.connect=localhost:9026
-atlas.kafka.bootstrap.servers=localhost:9027
-atlas.kafka.zookeeper.session.timeout.ms=400
-atlas.kafka.zookeeper.sync.time.ms=20
-atlas.kafka.auto.commit.interval.ms=1000
-atlas.kafka.hook.group.id=atlas
-</verbatim>
+atlas.kafka.auto.commit.enable=false
 
-Note that Kafka group ids are specified for a specific topic.  The Kafka group id configuration for entity notifications is 'atlas.kafka.entities.group.id'
+# Kafka servers. Example: localhost:6667
+atlas.kafka.bootstrap.servers=
 
-<verbatim>
-atlas.kafka.entities.group.id=<consumer id>
-</verbatim>
+atlas.kafka.hook.group.id=atlas
 
-These configuration parameters are useful for setting up Kafka topics via Atlas provided scripts, described in the
-[[InstallationSteps][Installation Steps]] page.
+# Zookeeper connect URL for Kafka. Example: localhost:2181
+atlas.kafka.zookeeper.connect=
 
-<verbatim>
-# Whether to create the topics automatically, default is true.
-# Comma separated list of topics to be created, default is "ATLAS_HOOK,ATLAS_ENTITES"
-atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
-# Number of replicas for the Atlas topics, default is 1. Increase for higher resilience to Kafka failures.
-atlas.notification.replicas=1
-# Enable the below two properties if Kafka is running in Kerberized mode.
-# Set this to the service principal representing the Kafka service
-atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
-# Set this to the location of the keytab file for Kafka
-#atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
-</verbatim>
+atlas.kafka.zookeeper.connection.timeout.ms=30000
+atlas.kafka.zookeeper.session.timeout.ms=60000
+atlas.kafka.zookeeper.sync.time.ms=20
 
-These configuration parameters are useful for saving messages in case there are issues in reaching Kafka for
-sending messages.
+# Set up the following configurations only in test deployments where Kafka is started within Atlas in embedded mode
+# atlas.notification.embedded=true
+# atlas.kafka.data=${sys:atlas.home}/data/kafka
 
-<verbatim>
-# Whether to save messages that failed to be sent to Kafka, default is true
-atlas.notification.log.failed.messages=true
-# If saving messages is enabled, the file name to save them to. This file will be created under the log directory of the hook's host component - like HiveServer2
-atlas.notification.failed.messages.filename=atlas_hook_failed_messages.log
+# Set up the following two properties if Kafka is running in Kerberized mode.
+# atlas.notification.kafka.service.principal=kafka/_HOST@EXAMPLE.COM
+# atlas.notification.kafka.keytab.location=/etc/security/keytabs/kafka.service.keytab
 </verbatim>
 
 ---++ Client Configs
 <verbatim>
 atlas.client.readTimeoutMSecs=60000
 atlas.client.connectTimeoutMSecs=60000
-atlas.rest.address=<http/https>://<atlas-fqdn>:<atlas port> - default http://localhost:21000
+
+# URL to access Atlas server. For example: http://localhost:21000
+atlas.rest.address=
 </verbatim>
 
 
@@ -212,26 +98,28 @@ atlas.enableTLS=false
 </verbatim>
 
 ---++ High Availability Properties
-
 The following properties describe High Availability related configuration options:
 
 <verbatim>
 # Set the following property to true, to enable High Availability. Default = false.
 atlas.server.ha.enabled=true
 
-# Define a unique set of strings to identify each instance that should run an Atlas Web Service instance as a comma separated list.
+# Specify the list of Atlas instances
 atlas.server.ids=id1,id2
-# For each string defined above, define the host and port on which Atlas server binds to.
+# For each instance defined above, define the host and port on which Atlas server listens.
 atlas.server.address.id1=host1.company.com:21000
 atlas.server.address.id2=host2.company.com:31000
 
 # Specify Zookeeper properties needed for HA.
 # Specify the list of services running Zookeeper servers as a comma separated list.
 atlas.server.ha.zookeeper.connect=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181
+
 # Specify how many times should connection try to be established with a Zookeeper cluster, in case of any connection issues.
 atlas.server.ha.zookeeper.num.retries=3
+
 # Specify how much time should the server wait before attempting connections to Zookeeper, in case of any connection issues.
 atlas.server.ha.zookeeper.retry.sleeptime.ms=1000
+
 # Specify how long a session to Zookeeper should last without activity before being deemed unreachable.
 atlas.server.ha.zookeeper.session.timeout.ms=20000
 
@@ -239,6 +127,7 @@ atlas.server.ha.zookeeper.session.timeout.ms=20000
 # The format of these options is <scheme>:<identity>. For more information refer to http://zookeeper.apache.org/doc/r3.2.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl.
 # The 'acl' option allows to specify a scheme, identity pair to setup an ACL for.
 atlas.server.ha.zookeeper.acl=sasl:client@company.com
+
 # The 'auth' option specifies the authentication that should be used for connecting to Zookeeper.
 atlas.server.ha.zookeeper.auth=sasl:client@company.com
 
@@ -254,14 +143,12 @@ atlas.client.ha.sleep.interval.ms=5000
 </verbatim>
 
 ---++ Server Properties
-
 <verbatim>
 # Set the following property to true, to enable the setup steps to run on each server start. Default = false.
 atlas.server.run.setup.on.start=false
 </verbatim>
 
 ---++ Performance configuration items
-
 The following properties can be used to tune performance of Atlas under specific circumstances:
 
 <verbatim>
@@ -288,14 +175,19 @@ atlas.webserver.queuesize=100
 </verbatim>
 
 ---+++ Recording performance metrics
-
-Atlas package should be built with '-P perf' to instrument atlas code to collect metrics. The metrics will be recorded in
-<atlas.log.dir>/metric.log, with one log line per API call. The metrics contain the number of times the instrumented methods
-are called and the total time spent in the instrumented method. Logging to metric.log is controlled through log4j configuration
-in atlas-log4j.xml. When the atlas code is instrumented, to disable logging to metric.log at runtime, set log level of METRICS logger to info level:
-<verbatim>
-<logger name="METRICS" additivity="false">
-    <level value="info"/>
-    <appender-ref ref="METRICS"/>
-</logger>
-</verbatim>
+To enable performance logs for various Atlas operations (like REST API calls, notification processing), set up the following in atlas-log4j.xml:
+<verbatim>
+  <appender name="perf_appender" class="org.apache.log4j.DailyRollingFileAppender">
+    <param name="File" value="/var/log/atlas/atlas_perf.log"/>
+    <param name="datePattern" value="'.'yyyy-MM-dd"/>
+    <param name="append" value="true"/>
+    <layout class="org.apache.log4j.PatternLayout">
+      <param name="ConversionPattern" value="%d|%t|%m%n"/>
+    </layout>
+  </appender>
+
+  <logger name="org.apache.atlas.perf" additivity="false">
+    <level value="debug"/>
+    <appender-ref ref="perf_appender"/>
+  </logger>
+</verbatim>

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/HighAvailability.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/HighAvailability.twiki b/docs/src/site/twiki/HighAvailability.twiki
index 1e52c85..4270d09 100644
--- a/docs/src/site/twiki/HighAvailability.twiki
+++ b/docs/src/site/twiki/HighAvailability.twiki
@@ -157,9 +157,9 @@ At a high level the following points can be called out:
 
 ---++ Metadata Store
 
-As described above, Atlas uses Titan to store the metadata it manages. By default, Atlas uses a standalone HBase
-instance as the backing store for Titan. In order to provide HA for the metadata store, we recommend that Atlas be
-configured to use distributed HBase as the backing store for Titan.  Doing this implies that you could benefit from the
+As described above, Atlas uses JanusGraph to store the metadata it manages. By default, Atlas uses a standalone HBase
+instance as the backing store for JanusGraph. In order to provide HA for the metadata store, we recommend that Atlas be
+configured to use distributed HBase as the backing store for JanusGraph.  Doing this implies that you could benefit from the
 HA guarantees HBase provides. In order to configure Atlas to use HBase in HA mode, do the following:
 
    * Choose an existing HBase cluster that is set up in HA mode to configure in Atlas (OR) Set up a new HBase cluster in [[http://hbase.apache.org/book.html#quickstart_fully_distributed][HA mode]].
@@ -169,8 +169,8 @@ HA guarantees HBase provides. In order to configure Atlas to use HBase in HA mod
 
 ---++ Index Store
 
-As described above, Atlas indexes metadata through Titan to support full text search queries. In order to provide HA
-for the index store, we recommend that Atlas be configured to use Solr as the backing index store for Titan. In order
+As described above, Atlas indexes metadata through JanusGraph to support full text search queries. In order to provide HA
+for the index store, we recommend that Atlas be configured to use Solr as the backing index store for JanusGraph. In order
 to configure Atlas to use Solr in HA mode, do the following:
 
    * Choose an existing !SolrCloud cluster setup in HA mode to configure in Atlas (OR) Set up a new [[https://cwiki.apache.org/confluence/display/solr/SolrCloud][SolrCloud cluster]].
@@ -208,4 +208,4 @@ to configure Atlas to use Kafka in HA mode, do the following:
 
 ---++ Known Issues
 
-   * If the HBase region servers hosting the Atlas ‘titan’ HTable are down, Atlas would not be able to store or retrieve metadata from HBase until they are brought back online.
\ No newline at end of file
+   * If the HBase region servers hosting the Atlas table are down, Atlas would not be able to store or retrieve metadata from HBase until they are brought back online.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/InstallationSteps.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/InstallationSteps.twiki b/docs/src/site/twiki/InstallationSteps.twiki
index c59f495..6b9f031 100644
--- a/docs/src/site/twiki/InstallationSteps.twiki
+++ b/docs/src/site/twiki/InstallationSteps.twiki
@@ -1,135 +1,51 @@
 ---++ Building & Installing Apache Atlas
 
 ---+++ Building Atlas
-
 <verbatim>
 git clone https://git-wip-us.apache.org/repos/asf/atlas.git atlas
-
 cd atlas
+export MAVEN_OPTS="-Xms2g -Xmx2g"
+mvn clean -DskipTests install</verbatim>
 
-export MAVEN_OPTS="-Xmx1536m" && mvn clean install
-</verbatim>
-
-Once the build successfully completes, artifacts can be packaged for deployment.
-
-<verbatim>
-
-mvn clean package -Pdist
-
-</verbatim>
-
-NOTE:
-1. Use option '-DskipTests' to skip running unit and integration tests
-2. Use option '-P perf' to instrument atlas to collect performance metrics
 
-To build a distribution that configures Atlas for external HBase and Solr, build with the external-hbase-solr profile.
+---+++ Packaging Atlas
+To create the Apache Atlas package for deployment in an environment having functional HBase and Solr instances, build with the following command:
 
 <verbatim>
+mvn clean -DskipTests package -Pdist</verbatim>
 
-mvn clean package -Pdist,external-hbase-solr
+   * NOTES:
+      * Remove option '-DskipTests' to run unit and integration tests
+      * To build a distribution without minified js/css files, build with the skipMinify profile, as shown below. By default, js and css files are minified.
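For example:
<verbatim>
mvn clean -DskipTests package -Pdist,skipMinify
</verbatim>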
 
-</verbatim>
 
-Note that when the external-hbase-solr profile is used the following steps need to be completed to make Atlas functional.
+The above will build Atlas for an environment having functional HBase and Solr instances. Atlas needs to be set up with the following to run in this environment:
    * Configure atlas.graph.storage.hostname (see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section).
    * Configure atlas.graph.index.search.solr.zookeeper-url (see "Graph Search Index - Solr" in the [[Configuration][Configuration]] section).
    * Set HBASE_CONF_DIR to point to a valid HBase config directory (see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section).
    * Create the SOLR indices (see "Graph Search Index - Solr" in the [[Configuration][Configuration]] section).
 
-To build a distribution that packages HBase and Solr, build with the embedded-hbase-solr profile.
-
-<verbatim>
-
-mvn clean package -Pdist,embedded-hbase-solr
-
-</verbatim>
-
-Using the embedded-hbase-solr profile will configure Atlas so that an HBase instance and a Solr instance will be started
-and stopped along with the Atlas server by default.
 
-Atlas also supports building a distribution that can use BerkeleyDB and Elastic search as the graph and index backends.
-To build a distribution that is configured for these backends, build with the berkeley-elasticsearch profile.
+---+++ Packaging Atlas with Embedded HBase & Solr
+To create an Apache Atlas package that includes HBase and Solr, build with the embedded-hbase-solr profile as shown below:
 
 <verbatim>
+mvn clean -DskipTests package -Pdist,embedded-hbase-solr</verbatim>
 
-mvn clean package -Pdist,berkeley-elasticsearch
-
-</verbatim>
-
-An additional step is required for the binary built using this profile to be used along with the Atlas distribution.
-Due to licensing requirements, Atlas does not bundle the BerkeleyDB Java Edition in the tarball.
+Using the embedded-hbase-solr profile will configure Atlas so that an HBase instance and a Solr instance will be started and stopped along with the Atlas server by default.
 
-You can download the Berkeley DB jar file from the URL: <verbatim>http://download.oracle.com/otn/berkeley-db/je-5.0.73.zip</verbatim>
-and copy the je-5.0.73.jar to the ${atlas_home}/libext directory.
 
-Tar can be found in atlas/distro/target/apache-atlas-${project.version}-bin.tar.gz
-
-Tar is structured as follows
+---+++ Apache Atlas Package
+The build will create the following files, which are used to install Apache Atlas.
 
 <verbatim>
-
-|- bin
-   |- atlas_start.py
-   |- atlas_stop.py
-   |- atlas_config.py
-   |- quick_start.py
-   |- cputil.py
-|- conf
-   |- atlas-application.properties
-   |- atlas-env.sh
-   |- hbase
-      |- hbase-site.xml.template
-   |- log4j.xml
-   |- solr
-      |- currency.xml
-      |- lang
-         |- stopwords_en.txt
-      |- protowords.txt
-      |- schema.xml
-      |- solrconfig.xml
-      |- stopwords.txt
-      |- synonyms.txt
-|- docs
-|- hbase
-   |- bin
-   |- conf
-   ...
-|- server
-   |- webapp
-      |- atlas.war
-|- solr
-   |- bin
-   ...
-|- README
-|- NOTICE
-|- LICENSE
-|- DISCLAIMER.txt
-|- CHANGES.txt
-
-</verbatim>
-
-Note that if the embedded-hbase-solr profile is specified for the build then HBase and Solr are included in the
-distribution.
-
-In this case, a standalone instance of HBase can be started as the default storage backend for the graph repository.
-During Atlas installation the conf/hbase/hbase-site.xml.template gets expanded and moved to hbase/conf/hbase-site.xml
-for the initial standalone HBase configuration.  To configure ATLAS
-graph persistence for a different HBase instance, please see "Graph persistence engine - HBase" in the
-[[Configuration][Configuration]] section.
-
-Also, a standalone instance of Solr can be started as the default search indexing backend.  To configure ATLAS search
-indexing for a different Solr instance please see "Graph Search Index - Solr" in the
-[[Configuration][Configuration]] section.
-
-To build a distribution without minified js,css file, build with the skipMinify profile.
-
-<verbatim>
-
-mvn clean package -Pdist,skipMinify
-
-</verbatim>
-
-Note that by default js and css files are minified.
+distro/target/apache-atlas-${project.version}-bin.tar.gz
+distro/target/apache-atlas-${project.version}-hive-hook.gz
+distro/target/apache-atlas-${project.version}-hbase-hook.tar.gz
+distro/target/apache-atlas-${project.version}-sqoop-hook.tar.gz
+distro/target/apache-atlas-${project.version}-storm-hook.tar.gz
+distro/target/apache-atlas-${project.version}-falcon-hook.tar.gz
+distro/target/apache-atlas-${project.version}-sources.tar.gz</verbatim>
 
 ---+++ Installing & Running Atlas
 
@@ -137,18 +53,12 @@ Note that by default js and css files are minified.
 <verbatim>
 tar -xzvf apache-atlas-${project.version}-bin.tar.gz
 
-cd atlas-${project.version}
-</verbatim>
+cd atlas-${project.version}</verbatim>
 
 ---++++ Configuring Atlas
+By default, the config directory used by Atlas is {package dir}/conf. To override this, set the environment variable ATLAS_CONF to the path of the conf dir.
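For example (the path below is hypothetical):
<verbatim>
export ATLAS_CONF=/etc/atlas/conf
</verbatim>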
 
-By default config directory used by Atlas is {package dir}/conf. To override this set environment
-variable ATLAS_CONF to the path of the conf dir.
-
-atlas-env.sh has been added to the Atlas conf. This file can be used to set various environment
-variables that you need for you services. In addition you can set any other environment
-variables you might need. This file will be sourced by atlas scripts before any commands are
-executed. The following environment variables are available to set.
+Environment variables needed to run Atlas can be set in the atlas-env.sh file in the conf directory. This file will be sourced by Atlas scripts before any commands are executed. The following environment variables are available to set:
 
 <verbatim>
 # The java implementation to use. If JAVA_HOME is not found we expect java and jar to be in path
@@ -169,7 +79,7 @@ executed. The following environment variables are available to set.
 # java heap size we want to set for the atlas server. Default is 1024MB
 #export ATLAS_SERVER_HEAP=
 
-# What is is considered as atlas home dir. Default is the base locaion of the installed software
+# What is considered as atlas home dir. Default is the base location of the installed software
 #export ATLAS_HOME_DIR=
 
 # Where log files are stored. Default is logs directory under the base install location
@@ -178,66 +88,48 @@ executed. The following environment variables are available to set.
 # Where pid files are stored. Default is logs directory under the base install location
 #export ATLAS_PID_DIR=
 
-# where the atlas titan db data is stored. Defatult is logs/data directory under the base install location
-#export ATLAS_DATA_DIR=
-
 # Where do you want to expand the war file. By Default it is in /server/webapp dir under the base install dir.
-#export ATLAS_EXPANDED_WEBAPP_DIR=
-</verbatim>
+#export ATLAS_EXPANDED_WEBAPP_DIR=</verbatim>
 
 *Settings to support large number of metadata objects*
 
-If you plan to store several tens of thousands of metadata objects, it is recommended that you use values
-tuned for better GC performance of the JVM.
+If you plan to store a large number of metadata objects, it is recommended that you use values tuned for better GC performance of the JVM.
 
 The following values are common server side options:
 <verbatim>
-export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps"
-</verbatim>
+export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=dumps/atlas_server.hprof -Xloggc:logs/gc-worker.log -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=1m -XX:+PrintGCDetails -XX:+PrintHeapAtGC -XX:+PrintGCTimeStamps"</verbatim>
 
-The =-XX:SoftRefLRUPolicyMSPerMB= option was found to be particularly helpful to regulate GC performance for
-query heavy workloads with many concurrent users.
+The =-XX:SoftRefLRUPolicyMSPerMB= option was found to be particularly helpful to regulate GC performance for query heavy workloads with many concurrent users.
 
 The following values are recommended for JDK 8:
 <verbatim>
-export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m"
-</verbatim>
+export ATLAS_SERVER_HEAP="-Xms15360m -Xmx15360m -XX:MaxNewSize=5120m -XX:MetaspaceSize=100M -XX:MaxMetaspaceSize=512m"</verbatim>
 
 *NOTE for Mac OS users*
 If you are using a Mac OS, you will need to configure the ATLAS_SERVER_OPTS (explained above).
 
 In  {package dir}/conf/atlas-env.sh uncomment the following line
 <verbatim>
-#export ATLAS_SERVER_OPTS=
-</verbatim>
+#export ATLAS_SERVER_OPTS=</verbatim>
 
 and change it to look as below
 <verbatim>
-export ATLAS_SERVER_OPTS="-Djava.awt.headless=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
-</verbatim>
+export ATLAS_SERVER_OPTS="-Djava.awt.headless=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="</verbatim>
 
-*Hbase as the Storage Backend for the Graph Repository*
+*HBase as the Storage Backend for the Graph Repository*
 
-By default, Atlas uses Titan as the graph repository and is the only graph repository implementation available currently.
-The HBase versions currently supported are 1.1.x. For configuring ATLAS graph persistence on HBase, please see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section
-for more details.
+By default, Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. The HBase versions currently supported are 1.1.x. For configuring Atlas graph persistence on HBase, please see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section for more details.
 
-Pre-requisites for running HBase as a distributed cluster
-   * 3 or 5 !ZooKeeper nodes
-   * Atleast 3 !RegionServer nodes. It would be ideal to run the !DataNodes on the same hosts as the Region servers for data locality.
-
-HBase tablename in Titan can be set using the following configuration in ATLAS_HOME/conf/atlas-application.properties:
+The HBase table names used by Atlas can be set using the following configurations:
 <verbatim>
-atlas.graph.storage.hbase.table=apache_atlas_titan
-atlas.audit.hbase.tablename=apache_atlas_entity_audit
-</verbatim>
+atlas.graph.storage.hbase.table=atlas
+atlas.audit.hbase.tablename=apache_atlas_entity_audit</verbatim>
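+
+As a minimal sketch, the graph store itself is pointed at the HBase cluster using the storage backend properties in ATLAS_HOME/conf/atlas-application.properties (see the [[Configuration][Configuration]] page for the full set of options; the !ZooKeeper quorum shown below is illustrative):
+<verbatim>
+atlas.graph.storage.backend=hbase
+# illustrative ZooKeeper quorum used by the HBase cluster
+atlas.graph.storage.hostname=zk-host1:2181,zk-host2:2181,zk-host3:2181</verbatim>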
 
 *Configuring SOLR as the Indexing Backend for the Graph Repository*
 
-By default, Atlas uses Titan as the graph repository and is the only graph repository implementation available currently.
-For configuring Titan to work with Solr, please follow the instructions below
+By default, Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. To configure JanusGraph to work with Solr, please follow the instructions below
 
-   * Install solr if not already running. The version of SOLR supported is 5.2.1. Could be installed from http://archive.apache.org/dist/lucene/solr/5.2.1/solr-5.2.1.tgz
+   * Install Solr if it is not already running. The version of Solr supported is 5.5.1. It can be installed from http://archive.apache.org/dist/lucene/solr/5.5.1/solr-5.5.1.tgz
 
    * Start solr in cloud mode.
   !SolrCloud mode uses a !ZooKeeper Service as a highly available, central location for cluster management.
@@ -249,15 +141,12 @@ For configuring Titan to work with Solr, please follow the instructions below
       $SOLR_HOME/bin/solr start -c -z <zookeeper_host:port> -p 8983
       </verbatim>
 
-   * Run the following commands from SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Solr corresponding to the indexes that Atlas uses. In the case that the ATLAS and SOLR instance are on 2 different hosts,
-  first copy the required configuration files from ATLAS_HOME/conf/solr on the ATLAS instance host to the Solr instance host. SOLR_CONF in the below mentioned commands refer to the directory where the solr configuration files
-  have been copied to on Solr host:
+   * Run the following commands from the SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Solr corresponding to the indexes that Atlas uses. If the Atlas and Solr instances are on two different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Atlas instance host to the Solr instance host. SOLR_CONF in the commands below refers to the directory where the Solr configuration files have been copied to on the Solr host:
 
 <verbatim>
   $SOLR_BIN/solr create -c vertex_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor
   $SOLR_BIN/solr create -c edge_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor
-  $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor
-</verbatim>
+  $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replicationFactor #replicationFactor</verbatim>
 
   Note: If numShards and replicationFactor are not specified, they default to 1 which suffices if you are trying out solr with ATLAS on a single node instance.
   Otherwise specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration.
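+
+  For example, on a 3-node !SolrCloud cluster the collections could be created as below (the shard and replica counts shown are illustrative):
+<verbatim>
+  # illustrative shard/replica counts for a 3-node SolrCloud cluster
+  $SOLR_BIN/solr create -c vertex_index -d SOLR_CONF -shards 3 -replicationFactor 2
+  $SOLR_BIN/solr create -c edge_index -d SOLR_CONF -shards 3 -replicationFactor 2
+  $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards 3 -replicationFactor 2</verbatim>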
@@ -274,12 +163,11 @@ For configuring Titan to work with Solr, please follow the instructions below
  atlas.graph.index.search.solr.mode=cloud
  atlas.graph.index.search.solr.zookeeper-url=<the ZK quorum setup for solr as comma separated value> eg: 10.1.6.4:2181,10.1.6.5:2181
  atlas.graph.index.search.solr.zookeeper-connect-timeout=<SolrCloud Zookeeper Connection Timeout>. Default value is 60000 ms
- atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms
-</verbatim>
+ atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms</verbatim>
 
    * Restart Atlas
 
-For more information on Titan solr configuration , please refer http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.htm
+For more information on JanusGraph Solr configuration, please refer to http://docs.janusgraph.org/0.2.0/solr.html
 
 Pre-requisites for running Solr in cloud mode
   * Memory - Solr is both memory and CPU intensive. Make sure the server running Solr has adequate memory, CPU and disk.
@@ -299,85 +187,124 @@ use configuration in =atlas-application.properties= for setting up the topics. P
 for these details.
 
 ---++++ Setting up Atlas
+There are a few steps that set up dependencies of Atlas. One such example is setting up the JanusGraph schema in the storage backend of choice. In a simple single-server setup, these are automatically set up with default configuration when the server first accesses these dependencies.
 
-There are a few steps that setup dependencies of Atlas. One such example is setting up the Titan schema
-in the storage backend of choice. In a simple single server setup, these are automatically setup with default
-configuration when the server first accesses these dependencies.
-
-However, there are scenarios when we may want to run setup steps explicitly as one time operations. For example, in a
-multiple server scenario using [[HighAvailability][High Availability]], it is preferable to run setup steps from one
-of the server instances the first time, and then start the services.
+However, there are scenarios when we may want to run setup steps explicitly as one-time operations. For example, in a multiple-server scenario using [[HighAvailability][High Availability]], it is preferable to run setup steps from one of the server instances the first time, and then start the services.
 
 To run these steps one time, execute the command =bin/atlas_start.py -setup= from a single Atlas server instance.
 
-However, the Atlas server does take care of parallel executions of the setup steps. Also, running the setup steps multiple
-times is idempotent. Therefore, if one chooses to run the setup steps as part of server startup, for convenience,
-then they should enable the configuration option =atlas.server.run.setup.on.start= by defining it with the value =true=
-in the =atlas-application.properties= file.
+However, the Atlas server does take care of parallel executions of the setup steps, and running the setup steps multiple times is idempotent. Therefore, if you choose to run the setup steps as part of server startup for convenience, enable the configuration option =atlas.server.run.setup.on.start= by defining it with the value =true= in the =atlas-application.properties= file.
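+
+For example, to run setup as part of server startup, the option described above would appear as below in =atlas-application.properties=:
+<verbatim>
+atlas.server.run.setup.on.start=true</verbatim>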
 
 ---++++ Starting Atlas Server
-
 <verbatim>
-bin/atlas_start.py [-port <port>]
-</verbatim>
-
-By default,
-   * To change the port, use -port option.
-   * atlas server starts with conf from {package dir}/conf. To override this (to use the same conf with multiple atlas upgrades), set environment variable ATLAS_CONF to the path of conf dir
+bin/atlas_start.py [-port <port>]</verbatim>
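+
+For example, to start the server on a non-default port (the port number below is illustrative):
+<verbatim>
+# start Atlas on port 21001 instead of the default 21000 (illustrative)
+bin/atlas_start.py -port 21001</verbatim>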
 
 ---+++ Using Atlas
-
-   * Quick start model - sample model and data
+   * Verify that the server is up and running
 <verbatim>
-  bin/quick_start.py [<atlas endpoint>]
-</verbatim>
+  curl -v -u username:password http://localhost:21000/api/atlas/admin/version
+  {"Version":"v0.1"}</verbatim>
 
-   * Verify if the server is up and running
+   * Access the Atlas UI using a browser: http://localhost:21000
+
+   * Run quick start to load sample model and data
 <verbatim>
-  curl -v http://localhost:21000/api/atlas/admin/version
-  {"Version":"v0.1"}
-</verbatim>
+  bin/quick_start.py [<atlas endpoint>]</verbatim>
 
    * List the types in the repository
 <verbatim>
-  curl -v http://localhost:21000/api/atlas/types
-  {"results":["Process","Infrastructure","DataSet"],"count":3,"requestId":"1867493731@qtp-262860041-0 - 82d43a27-7c34-4573-85d1-a01525705091"}
-</verbatim>
+  curl -v -u username:password http://localhost:21000/api/atlas/v2/types/typedefs/headers
+  [ {"guid":"fa421be8-c21b-4cf8-a226-fdde559ad598","name":"Referenceable","category":"ENTITY"},
+    {"guid":"7f3f5712-521d-450d-9bb2-ba996b6f2a4e","name":"Asset","category":"ENTITY"},
+    {"guid":"84b02fa0-e2f4-4cc4-8b24-d2371cd00375","name":"DataSet","category":"ENTITY"},
+    {"guid":"f93975d5-5a5c-41da-ad9d-eb7c4f91a093","name":"Process","category":"ENTITY"},
+    {"guid":"79dcd1f9-f350-4f7b-b706-5bab416f8206","name":"Infrastructure","category":"ENTITY"}
+  ]</verbatim>
 
    * List the instances for a given type
 <verbatim>
-  curl -v http://localhost:21000/api/atlas/entities?type=hive_table
-  {"requestId":"788558007@qtp-44808654-5","list":["cb9b5513-c672-42cb-8477-b8f3e537a162","ec985719-a794-4c98-b98f-0509bd23aac0","48998f81-f1d3-45a2-989a-223af5c1ed6e","a54b386e-c759-4651-8779-a099294244c4"]}
-
-  curl -v http://localhost:21000/api/atlas/entities/list/hive_db
-</verbatim>
-
-   * Search for entities (instances) in the repository
+  curl -v -u username:password http://localhost:21000/api/atlas/v2/search/basic?typeName=hive_db
+  {
+    "queryType":"BASIC",
+    "searchParameters":{
+      "typeName":"hive_db",
+      "excludeDeletedEntities":false,
+      "includeClassificationAttributes":false,
+      "includeSubTypes":true,
+      "includeSubClassifications":true,
+      "limit":100,
+      "offset":0
+    },
+    "entities":[
+      {
+        "typeName":"hive_db",
+        "guid":"5d900c19-094d-4681-8a86-4eb1d6ffbe89",
+        "status":"ACTIVE",
+        "displayText":"default",
+        "classificationNames":[],
+        "attributes":{
+          "owner":"public",
+          "createTime":null,
+          "qualifiedName":"default@cl1",
+          "name":"default",
+          "description":"Default Hive database"
+        }
+      },
+      {
+        "typeName":"hive_db",
+        "guid":"3a0b14b0-ab85-4b65-89f2-e418f3f7f77c",
+        "status":"ACTIVE",
+        "displayText":"finance",
+        "classificationNames":[],
+        "attributes":{
+          "owner":"hive",
+          "createTime":null,
+          "qualifiedName":"finance@cl1",
+          "name":"finance",
+          "description":null
+        }
+      }
+    ]
+  }</verbatim>
+
+   * Search for entities
 <verbatim>
-  curl -v http://localhost:21000/api/atlas/discovery/search/dsl?query="from hive_table"
-</verbatim>
-
-
-*Dashboard*
-
-Once atlas is started, you can view the status of atlas entities using the Web-based dashboard. You can open your browser at the corresponding port to use the web UI.
+  curl -v -u username:password http://localhost:21000/api/atlas/v2/search/dsl?query=hive_db%20where%20name='default'
+    {
+      "queryType":"DSL",
+      "queryText":"hive_db where name='default'",
+      "entities":[
+        {
+          "typeName":"hive_db",
+          "guid":"5d900c19-094d-4681-8a86-4eb1d6ffbe89",
+          "status":"ACTIVE",
+          "displayText":"default",
+          "classificationNames":[],
+          "attributes":{
+            "owner":"public",
+            "createTime":null,
+            "qualifiedName":"default@cl1",
+            "name":"default",
+            "description":
+            "Default Hive database"
+          }
+        }
+      ]
+    }</verbatim>
 
 
 ---+++ Stopping Atlas Server
-
 <verbatim>
-bin/atlas_stop.py
-</verbatim>
-
----+++ Known Issues
+bin/atlas_stop.py</verbatim>
 
----++++ Setup
+---+++ Troubleshooting
 
+---++++ Setup issues
 If the setup of Atlas service fails due to any reason, the next run of setup (either by an explicit invocation of
 =atlas_start.py -setup= or by enabling the configuration option =atlas.server.run.setup.on.start=) will fail with
 a message such as =A previous setup run may not have completed cleanly.=. In such cases, you would need to manually
 ensure the setup can run and delete the Zookeeper node at =/apache_atlas/setup_in_progress= before attempting to
 run setup again.
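+
+As a rough sketch, the node can be removed with the !ZooKeeper command line client (the client location and quorum address below are illustrative):
+<verbatim>
+# zkCli.sh location and ZooKeeper address are illustrative
+$ZOOKEEPER_HOME/bin/zkCli.sh -server zk-host1:2181 delete /apache_atlas/setup_in_progress</verbatim>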
 
-If the setup failed due to HBase Titan schema setup errors, it may be necessary to repair the HBase schema. If no
-data has been stored, one can also disable and drop the 'titan' schema in HBase to let setup run again.
+If the setup failed due to errors in setting up the JanusGraph schema in HBase, it may be necessary to repair the HBase schema. If no
+data has been stored, one can also disable and drop the HBase tables used by Atlas and run setup again.
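+
+As a sketch, the tables named in the configuration earlier can be disabled and dropped from the HBase shell (table names must match your configuration):
+<verbatim>
+# run inside 'hbase shell'; table names below follow the configuration shown above
+disable 'atlas'
+drop 'atlas'
+disable 'apache_atlas_entity_audit'
+drop 'apache_atlas_entity_audit'</verbatim>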

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/QuickStart.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/QuickStart.twiki b/docs/src/site/twiki/QuickStart.twiki
index a3c1b1e..dd648d0 100644
--- a/docs/src/site/twiki/QuickStart.twiki
+++ b/docs/src/site/twiki/QuickStart.twiki
@@ -1,9 +1,8 @@
----+ Quick Start Guide
+---+ Quick Start
 
 ---++ Introduction
-This quick start user guide is a simple client that adds a few sample type definitions modeled
-after the example as shown below. It also adds example entities along with traits as shown in the
-instance graph below.
+Quick start is a simple client that adds a few sample type definitions modeled after the example shown below.
+It also adds sample entities along with traits as shown in the instance graph below.
 
 
 ---+++ Example Type Definitions

http://git-wip-us.apache.org/repos/asf/atlas/blob/c65586f1/docs/src/site/twiki/Repository.twiki
----------------------------------------------------------------------
diff --git a/docs/src/site/twiki/Repository.twiki b/docs/src/site/twiki/Repository.twiki
deleted file mode 100755
index b84b3b3..0000000
--- a/docs/src/site/twiki/Repository.twiki
+++ /dev/null
@@ -1,4 +0,0 @@
----+ Repository
-
----++ Introduction
-