One falcon_process entity is created for every cluster that the Falcon process is defined for.
The entities are created and de-duplicated using the unique qualifiedName attribute. They provide a namespace and can be used for querying/lineage as well. The unique attributes are:
Falcon supports listeners on Falcon entity submission. This is used to add entities in Atlas using the model detailed above.
The hook submits the request to a thread pool executor to avoid blocking the command execution. The thread submits the entities as messages to the notification server, and the Atlas server reads these messages and registers the entities.
Follow the instructions below to set up the Atlas hook in Falcon:
* Add 'org.apache.atlas.falcon.service.AtlasService' to application.services in <falcon-conf>/startup.properties
* Link Atlas hook jars in the Falcon classpath - 'ln -s <atlas-home>/hook/falcon/* <falcon-home>/server/webapp/falcon/WEB-INF/lib/'
* In <falcon_conf>/falcon-env.sh, set an environment variable as follows:
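A minimal sketch of the environment variable, assuming the Atlas configuration lives in <atlas-conf> (the exact option value may differ for your Falcon version):
<verbatim>
export FALCON_SERVER_OPTS="-Datlas.conf=<atlas-conf>"
</verbatim>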
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
* atlas.hook.falcon.synchronous - boolean, true to run the hook synchronously. default false
* atlas.hook.falcon.numRetries - number of retries for notification failure. default 3
* atlas.hook.falcon.minThreads - core number of threads. default 5
* atlas.hook.falcon.maxThreads - maximum number of threads. default 5
* atlas.hook.falcon.keepAliveTime - keep alive time in msecs. default 10
* atlas.hook.falcon.queueSize - queue size for the threadpool. default 10000
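Putting the defaults listed above into <atlas-conf>/atlas-application.properties form, the hook configuration looks like:
<verbatim>
atlas.hook.falcon.synchronous=false
atlas.hook.falcon.numRetries=3
atlas.hook.falcon.minThreads=5
atlas.hook.falcon.maxThreads=5
atlas.hook.falcon.keepAliveTime=10
atlas.hook.falcon.queueSize=10000
</verbatim>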
Refer [[Configuration][Configuration]] for notification related configurations
---++ NOTES
* In the Falcon cluster entity, the cluster name used should be uniform across components like Hive, Falcon, Sqoop etc. If used with Ambari, the Ambari cluster name should be used for the cluster entity
The entities are created and de-duplicated using the unique qualified name. They provide a namespace and can be used for querying/lineage as well. Note that dbName, tableName and columnName should be in lower case. clusterName is explained below.
* hive_process.queryString - trimmed query string in lower case
---++ Importing Hive Metadata
org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the Hive metadata into Atlas using the model defined above. The import-hive.sh command can be used to facilitate this. The script needs Hadoop and Hive classpath jars.
* For Hadoop jars, please make sure that the environment variable HADOOP_CLASSPATH is set. Another way is to set HADOOP_HOME to point to the root directory of your Hadoop installation
* Similarly, for Hive jars, set HIVE_HOME to the root of the Hive installation
* Set environment variable HIVE_CONF_DIR to the Hive configuration directory
* Copy <atlas-conf>/atlas-application.properties to the hive conf directory
* For details about jaas.conf and a suggested location see the [[security][atlas security documentation]]
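The steps above can be sketched as follows; the paths are illustrative, and the location of import-hive.sh within the Atlas package may vary:
<verbatim>
export HADOOP_CLASSPATH=`hadoop classpath`
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=/usr/local/hive/conf
cp <atlas-conf>/atlas-application.properties $HIVE_CONF_DIR
<atlas-home>/bin/import-hive.sh
</verbatim>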
---++ Hive Hook
Atlas Hive hook registers with Hive to listen for create/update/delete operations and updates the metadata in Atlas, via Kafka notifications, for the changes in Hive.
The hook submits the request to a thread pool executor to avoid blocking the command execution. The thread submits the entities as messages to the notification server, and the Atlas server reads these messages and registers the entities.
Follow the instructions below to set up the Atlas hook in Hive:
* Set up the Atlas hook in hive-site.xml of your Hive configuration by adding the following:
* Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' in hive-env.sh of your hive configuration
* Copy <atlas-conf>/atlas-application.properties to the hive conf directory.
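The hive-site.xml change in the first step above registers the Atlas hook through Hive's standard post-execution hook mechanism; org.apache.atlas.hive.hook.HiveHook is the hook class shipped with Atlas:
<verbatim>
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.atlas.hive.hook.HiveHook</value>
</property>
</verbatim>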
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
* atlas.hook.hive.synchronous - boolean, true to run the hook synchronously. default false. Recommended to be set to false to avoid delays in hive query completion.
* atlas.hook.hive.numRetries - number of retries for notification failure. default 3
* atlas.hook.hive.minThreads - core number of threads. default 1
* atlas.hook.hive.maxThreads - maximum number of threads. default 5
* atlas.hook.hive.keepAliveTime - keep alive time in msecs. default 10
* atlas.hook.hive.queueSize - queue size for the threadpool. default 10000
Refer [[Configuration][Configuration]] for notification related configurations
Starting from 0.8-incubating version of Atlas, Column level lineage is captured in Atlas. Below are the details
---+++ Model
* !ColumnLineageProcess type is a subtype of Process
* This relates an output Column to a set of input Columns or the Input Table
* The lineage also captures the kind of dependency, as listed below:
   * SIMPLE: output column has the same value as the input
   * EXPRESSION: output column is transformed by some expression at runtime (for e.g. a Hive SQL expression) on the input columns
   * SCRIPT: output column is transformed by a user provided script
* In case of EXPRESSION dependency the expression attribute contains the expression in string form
* Since Process links input and output !DataSets, Column is a subtype of !DataSet
---+++ Examples
For a simple CTAS below:
<verbatim>
create table t2 as select id, name from T1
</verbatim>
The lineage is captured as
* The !LineageInfo in Hive provides column-level lineage for the final !FileSinkOperator, linking them to the input columns in the Hive Query
---++ NOTES
* Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
---++ Limitations
* Since database name, table name and column names are case insensitive in Hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
* The following Hive operations are currently captured by the hive hook
---++ Graph Configs
---+++ Graph persistence engine
This section sets up the graph db - JanusGraph - to use a persistence engine. Please refer to
<a href="http://docs.janusgraph.org/0.2.0/configuration.html#_hbase_caching">link</a> for more details.

---++++ Graph persistence engine - BerkeleyDB
The example below uses BerkeleyDBJE.
<verbatim>
atlas.graph.storage.backend=berkeleyje
atlas.graph.storage.directory=data/berkeley
</verbatim>

---++++ Graph persistence engine - HBase
Set the following properties to configure JanusGraph to use HBase as the persistence engine. Basic configuration:
<verbatim>
atlas.graph.storage.backend=hbase
#For standalone mode, specify localhost
#For distributed mode, specify the zookeeper quorum here - For more information refer http://s3.thinkaurelius.com/docs/titan/current/hbase.html#_remote_server_mode_2
atlas.graph.storage.hostname=<ZooKeeper Quorum>
atlas.graph.storage.hbase.table=atlas
</verbatim>
HBASE_CONF_DIR environment variable needs to be set to point to the HBase client configuration directory, which is added to the classpath when Atlas starts up.
hbase-site.xml needs to have the following properties set according to the cluster setup:
<verbatim>
#Set below to /hbase-secure if the HBase server is setup in secure mode
zookeeper.znode.parent=/hbase-unsecure
</verbatim>
If any further JanusGraph configuration needs to be set up, please prefix the property name with "atlas.graph.".
Permissions

When Atlas is configured with HBase as the storage backend, the graph db needs sufficient user permissions to be able to create and access an HBase table. In a secure cluster it may be necessary to grant permissions to the 'atlas' user for the HBase table used by Atlas.
With Ranger, a policy can be configured for the table.
Without Ranger, the HBase shell can be used to set the permissions.
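As an illustration (assuming the table is named 'atlas', per the configuration above, and that the command is run as the HBase superuser), the grant could be issued from the HBase shell:
<verbatim>
echo "grant 'atlas', 'RWXCA', 'atlas'" | hbase shell
</verbatim>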
---+++ Graph Search Index - Solr
Please note that Solr installation in Cloud mode is a prerequisite before configuring Solr as the search indexing backend. Refer InstallationSteps section for Solr installation/configuration. Set the following properties to configure JanusGraph to use Solr as the index search engine.
<verbatim>
atlas.graph.index.search.backend=solr5
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=<the ZK quorum setup for solr as comma separated value> eg: 10.1.6.4:2181,10.1.6.5:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=<SolrCloud Zookeeper Connection Timeout>. Default value is 60000 ms
atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms
</verbatim>
Also note that if the embedded-hbase-solr profile is used then Solr is included in the distribution so that a standalone
instance of Solr can be started as the default search indexing backend. Using the embedded-hbase-solr profile will
configure Atlas so that the standalone Solr instance will be started and stopped along with the Atlas server by default.
To use the embedded-hbase-solr profile please see "Building Atlas" in the [[InstallationSteps][Installation Steps]]
section.
---+++ Choosing between Persistence Backends
Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/bdb.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/hbase.html for choosing between the persistence backends.
BerkeleyDB is suitable for smaller data sets, in the range of up to 10 million vertices, with ACID guarantees.
HBase on the other hand doesn't provide ACID guarantees but is able to scale for larger graphs. HBase also provides HA inherently.
---+++ Choosing between Indexing Backends
Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html for choosing between !ElasticSearch and Solr.
Solr in cloud mode is the recommended setup.
---+++ Switching Persistence Backend
For switching the storage backend from BerkeleyDB to HBase and vice versa, refer to the documentation for "Graph Persistence Engine" described above and restart ATLAS.
The data in the indexing backend needs to be cleared, else there will be discrepancies between the storage and indexing backends, which could result in errors during search.
!ElasticSearch runs by default in embedded mode, and the data can easily be cleared by deleting the ATLAS_HOME/data/es directory.
For Solr, the collections which were created during ATLAS installation - vertex_index, edge_index, fulltext_index - can be deleted to clean up the indexes.
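For example, the Solr collections can be deleted through the Solr Collections API; the host and port below are placeholders:
<verbatim>
curl "http://<solr-host>:8983/solr/admin/collections?action=DELETE&name=vertex_index"
curl "http://<solr-host>:8983/solr/admin/collections?action=DELETE&name=edge_index"
curl "http://<solr-host>:8983/solr/admin/collections?action=DELETE&name=fulltext_index"
</verbatim>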
---+++ Switching Index Backend
Switching the index backend requires clearing the persistence backend data. Otherwise there will be discrepancies between the persistence and index backends, since switching the indexing backend means index data will be lost.
This leads to "Fulltext" queries not working on the existing data.
For clearing the data for BerkeleyDB, delete the ATLAS_HOME/data/berkeley directory.
For clearing the data for HBase, in the HBase shell, run 'disable titan' and 'drop titan'.
---++ Lineage Configs
The higher layer services like lineage, schema, etc. are driven by the type system, and this section encodes the specific types for the Hive data model.
<verbatim>
# This model reflects the base super types for Data and Process
atlas.lineage.hive.table.type.name=DataSet
atlas.lineage.hive.process.type.name=Process
atlas.lineage.hive.process.inputs.name=inputs
atlas.lineage.hive.process.outputs.name=outputs

## Schema
atlas.lineage.hive.table.schema.query=hive_table where name=?, columns
</verbatim>
---++ Search Configs
Search APIs (DSL, basic search, full-text search) support pagination and have optional limit and offset arguments. The following configs are related to search pagination:
<verbatim>
# Default limit used when limit is not specified in API
atlas.search.defaultlimit=100
# Maximum limit allowed in API
atlas.search.maxlimit=10000
</verbatim>
Refer http://kafka.apache.org/documentation.html#configuration for Kafka configuration. All Kafka configs should be prefixed with 'atlas.kafka.'
<verbatim>
atlas.notification.embedded=true
atlas.kafka.data=${sys:atlas.home}/data/kafka
# Zookeeper connect URL for Kafka. Example: localhost:2181
atlas.kafka.zookeeper.connect=localhost:9026
# Kafka servers. Example: localhost:6667
atlas.kafka.bootstrap.servers=localhost:9027
atlas.kafka.zookeeper.session.timeout.ms=400
atlas.kafka.zookeeper.sync.time.ms=20
atlas.kafka.auto.commit.interval.ms=1000
atlas.kafka.auto.commit.enable=false
atlas.kafka.hook.group.id=atlas
</verbatim>
Note that Kafka group ids are specified for a specific topic. The Kafka group id configuration for entity notifications is 'atlas.kafka.entities.group.id'.
<verbatim>
atlas.kafka.entities.group.id=<consumer id>
</verbatim>
These configuration parameters are useful for setting up Kafka topics via Atlas provided scripts, described in the [[InstallationSteps][Installation Steps]] page.
<verbatim>
atlas.kafka.zookeeper.connection.timeout.ms=30000
# Whether to create the topics automatically, default is true.
atlas.notification.create.topics=true
# Comma separated list of topics to be created, default is "ATLAS_HOOK,ATLAS_ENTITIES"
atlas.notification.topics=ATLAS_HOOK,ATLAS_ENTITIES
# If saving messages is enabled, the file name to save them to. This file will be created under the log directory of the hook's host component - like HiveServer2
atlas.notification.failed.messages.filename=atlas_hook_failed_messages.log
</verbatim>
# The format of these options is <scheme>:<identity>. For more information refer to http://zookeeper.apache.org/doc/r3.2.2/zookeeperProgrammers.html#sc_ZooKeeperAccessControl.
# The 'acl' option allows to specify a scheme, identity pair to setup an ACL for.
---++ Metadata Store
As described above, Atlas uses JanusGraph to store the metadata it manages. By default, Atlas uses a standalone HBase
instance as the backing store for JanusGraph. In order to provide HA for the metadata store, we recommend that Atlas be
configured to use distributed HBase as the backing store for JanusGraph. Doing this implies that you could benefit from the
HA guarantees HBase provides. In order to configure Atlas to use HBase in HA mode, do the following:
* Choose an existing HBase cluster that is set up in HA mode to configure in Atlas (OR) Set up a new HBase cluster in [[http://hbase.apache.org/book.html#quickstart_fully_distributed][HA mode]].
---++ Index Store
As described above, Atlas indexes metadata through JanusGraph to support full text search queries. In order to provide HA
for the index store, we recommend that Atlas be configured to use Solr as the backing index store for JanusGraph. In order
to configure Atlas to use Solr in HA mode, do the following:
* Choose an existing !SolrCloud cluster setup in HA mode to configure in Atlas (OR) Set up a new [[https://cwiki.apache.org/confluence/display/solr/SolrCloud][SolrCloud cluster]].
---++ Known Issues
* If the HBase region servers hosting the Atlas table are down, Atlas would not be able to store or retrieve metadata from HBase until they are brought back online.
Once the build successfully completes, artifacts can be packaged for deployment.
<verbatim>
mvn clean package -Pdist
</verbatim>
NOTES:
* Use option '-DskipTests' to skip running unit and integration tests
* Use option '-P perf' to instrument Atlas to collect performance metrics
To create an Apache Atlas package for deployment in an environment having functional HBase and Solr instances, build with the external-hbase-solr profile:
<verbatim>
mvn clean package -Pdist,external-hbase-solr
</verbatim>
When the external-hbase-solr profile is used, Atlas is built for an environment having functional HBase and Solr instances, and the following steps need to be completed to make Atlas functional:
* Configure atlas.graph.storage.hostname (see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section).
* Configure atlas.graph.index.search.solr.zookeeper-url (see "Graph Search Index - Solr" in the [[Configuration][Configuration]] section).
* Set HBASE_CONF_DIR to point to a valid HBase config directory (see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section).
* Create the SOLR indices (see "Graph Search Index - Solr" in the [[Configuration][Configuration]] section).
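As an illustration, the SOLR indices can be created with the Solr control script; the paths, shard count and replication factor below are placeholders to adjust for your !SolrCloud setup:
<verbatim>
<solr-home>/bin/solr create -c vertex_index -shards 2 -replicationFactor 2
<solr-home>/bin/solr create -c edge_index -shards 2 -replicationFactor 2
<solr-home>/bin/solr create -c fulltext_index -shards 2 -replicationFactor 2
</verbatim>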
---+++ Packaging Atlas with Embedded HBase & Solr
To create an Apache Atlas package that includes HBase and Solr, build with the embedded-hbase-solr profile as shown below:
<verbatim>
mvn clean package -Pdist,embedded-hbase-solr
</verbatim>
Using the embedded-hbase-solr profile will configure Atlas so that an HBase instance and a Solr instance will be started and stopped along with the Atlas server by default.

---+++ Packaging Atlas with BerkeleyDB & Elasticsearch
Atlas also supports building a distribution that can use BerkeleyDB and Elasticsearch as the graph and index backends. To build a distribution that is configured for these backends, build with the berkeley-elasticsearch profile.
An additional step is required for the binary built using this profile to be used along with the Atlas distribution.
Due to licensing requirements, Atlas does not bundle the BerkeleyDB Java Edition in the tarball.
You can download the Berkeley DB jar file from the URL: <verbatim>http://download.oracle.com/otn/berkeley-db/je-5.0.73.zip</verbatim>
and copy the je-5.0.73.jar to the ${atlas_home}/libext directory.

The tar can be found in atlas/distro/target/apache-atlas-${project.version}-bin.tar.gz
---+++ Apache Atlas Package
Build will create the following files, which are used to install Apache Atlas.

Note that if the embedded-hbase-solr profile is specified for the build, then HBase and Solr are included in the distribution.
In this case, a standalone instance of HBase can be started as the default storage backend for the graph repository.
During Atlas installation, conf/hbase/hbase-site.xml.template gets expanded and moved to hbase/conf/hbase-site.xml
for the initial standalone HBase configuration. To configure ATLAS graph persistence for a different HBase instance, please see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section.
Also, a standalone instance of Solr can be started as the default search indexing backend. To configure ATLAS search indexing for a different Solr instance, please see "Graph Search Index - Solr" in the [[Configuration][Configuration]] section.
To build a distribution without minified js/css files, build with the skipMinify profile.
<verbatim>
mvn clean package -Pdist,skipMinify
</verbatim>
Note that by default js and css files are minified.
---+++ Installing & Running Atlas
<verbatim>
tar -xzvf apache-atlas-${project.version}-bin.tar.gz
cd atlas-${project.version}
</verbatim>
---++++ Configuring Atlas
By default the config directory used by Atlas is {package dir}/conf. To override this, set the environment variable ATLAS_CONF to the path of the conf dir.

Environment variables needed to run Atlas can be set in the atlas-env.sh file in the conf directory. This file will be sourced by Atlas scripts before any commands are executed. The following environment variables are available to set.
<verbatim>
# The java implementation to use. If JAVA_HOME is not found we expect java and jar to be in path
# java heap size we want to set for the atlas server. Default is 1024MB
#export ATLAS_SERVER_HEAP=
# What is considered as atlas home dir. Default is the base location of the installed software
#export ATLAS_HOME_DIR=
# Where log files are stored. Default is logs directory under the base install location
# Where pid files are stored. Default is logs directory under the base install location
#export ATLAS_PID_DIR=
# Where the atlas graph db data is stored. Default is logs/data directory under the base install location
#export ATLAS_DATA_DIR=
# Where do you want to expand the war file. By default it is in /server/webapp dir under the base install dir.
#export ATLAS_EXPANDED_WEBAPP_DIR=
</verbatim>
*Settings to support large number of metadata objects*

If you plan to store several tens of thousands of metadata objects, it is recommended that you use values tuned for better GC performance of the JVM.
The following values are common server side options:
The =-XX:SoftRefLRUPolicyMSPerMB= option was found to be particularly helpful to regulate GC performance for query heavy workloads with many concurrent users.
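As an illustration, such GC-related options might be set via ATLAS_SERVER_OPTS in =atlas-env.sh= along these lines (the exact flags and values are a starting point to be tuned per deployment, not prescriptive settings):

<verbatim>
export ATLAS_SERVER_OPTS="-server -XX:SoftRefLRUPolicyMSPerMB=0 -XX:+CMSClassUnloadingEnabled -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled"
</verbatim>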
*HBase as the Storage Backend for the Graph Repository*
By default, Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. The HBase versions currently supported are 1.1.x. For configuring Atlas graph persistence on HBase, please see "Graph persistence engine - HBase" in the [[Configuration][Configuration]] section for more details.
Pre-requisites for running HBase as a distributed cluster
* 3 or 5 !ZooKeeper nodes
* At least 3 !RegionServer nodes. It would be ideal to run the !DataNodes on the same hosts as the Region servers for data locality.

HBase tables used by Atlas can be set using the following configuration in ATLAS_HOME/conf/atlas-application.properties:
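For example (the table names shown here are illustrative defaults and may differ in your deployment):

<verbatim>
atlas.graph.storage.hbase.table=apache_atlas_janus
atlas.audit.hbase.tablename=apache_atlas_entity_audit
</verbatim>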
*Configuring SOLR as the Indexing Backend for the Graph Repository*
By default, Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. For configuring JanusGraph to work with Solr, please follow the instructions below.
* Install Solr if not already running. The Solr version supported is 5.5.1. It can be installed from http://archive.apache.org/dist/lucene/solr/5.5.1/solr-5.5.1.tgz
* Start Solr in cloud mode.
!SolrCloud mode uses a !ZooKeeper Service as a highly available, central location for cluster management.
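For example, Solr could be started in cloud mode as follows (the !ZooKeeper host/port and Solr port are placeholders for your environment):

<verbatim>
$SOLR_HOME/bin/solr start -c -z <zookeeper_host:port> -p 8983
</verbatim>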
* Run the following commands from the SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Solr corresponding to the indexes that Atlas uses. If the Atlas and Solr instances are on two different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Atlas host to the Solr host. SOLR_CONF in the commands below refers to the directory where the Solr configuration files have been copied to on the Solr host:
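The collection-creation commands typically take the following shape (the shard and replication values are placeholders; vertex_index, edge_index and fulltext_index are the index names Atlas uses):

<verbatim>
$SOLR_BIN/solr create -c vertex_index -d SOLR_CONF -shards <numShards> -replicationFactor <replicationFactor>
$SOLR_BIN/solr create -c edge_index -d SOLR_CONF -shards <numShards> -replicationFactor <replicationFactor>
$SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards <numShards> -replicationFactor <replicationFactor>
</verbatim>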
Note: If numShards and replicationFactor are not specified, they default to 1 which suffices if you are trying out Solr with Atlas on a single node instance.
Otherwise specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration.
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=<the ZK quorum setup for solr as comma separated value> eg: 10.1.6.4:2181,10.1.6.5:2181
atlas.graph.index.search.solr.zookeeper-connect-timeout=<SolrCloud Zookeeper Connection Timeout>. Default value is 60000 ms
atlas.graph.index.search.solr.zookeeper-session-timeout=<SolrCloud Zookeeper Session Timeout>. Default value is 60000 ms
</verbatim>
* Restart Atlas
For more information on JanusGraph Solr configuration, please refer to http://docs.janusgraph.org/0.2.0/solr.html
Pre-requisites for running Solr in cloud mode
* Memory - Solr is both memory and CPU intensive. Make sure the server running Solr has adequate memory, CPU and disk.
for these details.
---++++ Setting up Atlas
There are a few steps that setup dependencies of Atlas. One such example is setting up the JanusGraph schema in the storage backend of choice. In a simple single server setup, these are automatically setup with default configuration when the server first accesses these dependencies.

However, there are scenarios when we may want to run setup steps explicitly as one-time operations. For example, in a multiple server scenario using [[HighAvailability][High Availability]], it is preferable to run setup steps from one of the server instances the first time, and then start the services.
To run these steps one time, execute the command =bin/atlas_start.py -setup= from a single Atlas server instance.
However, the Atlas server does take care of parallel executions of the setup steps. Also, running the setup steps multiple times is idempotent. Therefore, if one chooses to run the setup steps as part of server startup, for convenience, then they should enable the configuration option =atlas.server.run.setup.on.start= by defining it with the value =true= in the =atlas-application.properties= file.
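For example, to run setup as part of server startup, the following line would be added to =atlas-application.properties=:

<verbatim>
atlas.server.run.setup.on.start=true
</verbatim>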
---++++ Starting Atlas Server
<verbatim>
bin/atlas_start.py [-port <port>]
</verbatim>
By default, the Atlas server starts with conf from {package dir}/conf. To override this (to use the same conf with multiple Atlas upgrades), set the environment variable ATLAS_CONF to the path of the conf dir. To change the port, use the -port option.

Once Atlas is started, you can view the status of Atlas entities using the Web-based dashboard. You can open your browser at the corresponding port to use the web UI.
            "typeName": "hive_db",
            "guid": "5d900c19-094d-4681-8a86-4eb1d6ffbe89",
            "status": "ACTIVE",
            "displayText": "default",
            "classificationNames": [],
            "attributes": {
                "owner": "public",
                "createTime": null,
                "qualifiedName": "default@cl1",
                "name": "default",
                "description": "Default Hive database"
            }
        }
    ]
}
</verbatim>
---+++ Stopping Atlas Server
<verbatim>
bin/atlas_stop.py
</verbatim>
---+++ Troubleshooting

---++++ Setup issues
If the setup of Atlas service fails due to any reason, the next run of setup (either by an explicit invocation of =atlas_start.py -setup= or by enabling the configuration option =atlas.server.run.setup.on.start=) will fail with a message such as =A previous setup run may not have completed cleanly.=. In such cases, you would need to manually ensure the setup can run and delete the Zookeeper node at =/apache_atlas/setup_in_progress= before attempting to run setup again.
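For example, the node could be removed with the !ZooKeeper CLI (the connect string below is an assumption for your environment):

<verbatim>
$ZOOKEEPER_HOME/bin/zkCli.sh -server localhost:2181 delete /apache_atlas/setup_in_progress
</verbatim>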
If the setup failed due to HBase JanusGraph schema setup errors, it may be necessary to repair the HBase schema. If no data has been stored, one can also disable and drop the HBase tables used by Atlas and run setup again.
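For example, from the HBase shell (the table name shown is an assumption; use the table name your deployment is configured with, and only do this if no data needs to be preserved):

<verbatim>
disable 'apache_atlas_janus'
drop 'apache_atlas_janus'
</verbatim>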
* Entity and Classification types can ‘extend’ from other types, called ‘supertypes’ - by virtue of this, they include the attributes that are defined in the supertype as well. This allows modellers to define common attributes across a set of related types. This is similar to how Object Oriented languages define super classes for a class. It is also possible for a type in Atlas to extend from multiple super types.
* In this example, every hive table extends from a pre-defined supertype called a ‘DataSet’. More details about these pre-defined types will be provided later.
* Types which have a metatype of ‘Entity’, ‘Struct’, ‘Classification’ or ‘Relationship’ can have a collection of attributes. Each attribute has a name (e.g. ‘name’) and some other associated properties. A property can be referred to using an expression type_name.attribute_name. It is also good to note that attributes themselves are defined using Atlas metatypes.
* In this example, hive_table.name is a String, hive_table.aliases is an array of Strings, hive_table.db refers to an instance of a type called hive_db and so on.
* Type references in attributes (like hive_table.db) are particularly interesting. Using such an attribute, we can define arbitrary relationships between two types defined in Atlas and thus build rich models. One can also collect a list of references as an attribute type (e.g. hive_table.columns, which represents a list of references from hive_table to the hive_column type).
---++ Entities
An ‘entity’ in Atlas is a specific value or instance of an Entity ‘type’ and thus represents a specific metadata object in the real world. Referring back to our analogy of Object Oriented Programming languages, an ‘instance’ is an ‘Object’ of a certain ‘Class’.
An example of an entity will be a specific Hive Table. Say Hive has a table called ‘customers’ in the ‘default’ database. This table will be an ‘entity’ in Atlas of type hive_table. By virtue of being an instance of an entity type, it will have values for every attribute that are a part of the Hive table ‘type’, such as:
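A minimal sketch of such an entity follows (the attribute values and the db GUID are illustrative and abbreviated; a real hive_table entity carries many more attributes):

<verbatim>
id:         "9ba387dd-fa76-429c-b791-ffc338d3c91f"
typeName:   "hive_table"
values:
    name:      "customers"
    db:        "b42c6cfc-c1e7-42fd-a9e6-890e0adf33bc"
    owner:     "admin"
    tableType: "MANAGED_TABLE"
</verbatim>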
The following points can be noted from the example above:
* Every instance of an entity type is identified by a unique identifier, a GUID. This GUID is generated by the Atlas server when the object is defined, and remains constant for the entire lifetime of the entity. At any point in time, this particular entity can be accessed using its GUID.
* In this example, the ‘customers’ table in the default database is uniquely identified by the GUID "9ba387dd-fa76-429c-b791-ffc338d3c91f"
* An entity is of a given type, and the name of the type is provided with the entity definition.
* In this example, the ‘customers’ table is a ‘hive_table’.
* The values of this entity are a map of all the attribute names and their values for attributes that are defined in the hive_table type definition.
* Attribute values will be according to the datatype of the attribute. Entity-type attributes will have a value of type AtlasObjectId.
* Collection datatypes hold an array or map of values of the contained datatype. E.g. parameters = { “transient_lastDdlTime”: “1466403208”}

With this idea on entities, we can now see the difference between Entity and Struct metatypes. Entities and Structs both compose attributes of other types. However, instances of Entity types have an identity (with a GUID value) and can be referenced from other entities (like a hive_db entity is referenced from a hive_table entity). Instances of Struct types do not have an identity of their own. The value of a Struct type is a collection of attributes that are ‘embedded’ inside the entity itself.
---++ Attributes
We already saw that attributes are defined inside metatypes like Entity, Struct, Classification and Relationship. But we simplistically referred to attributes as having a name and a metatype value. However, attributes in Atlas have some more properties that define more concepts related to the type system.
An attribute has the following properties:
<verbatim>
name: string,
typeName: string,
isOptional: boolean,
isIndexable: boolean,
isUnique: boolean,
cardinality: enum
</verbatim>
The properties above have the following meanings:
* isIndexable -
* This flag indicates whether this property should be indexed on, so that look ups can be performed using the attribute value as a predicate and can be performed efficiently.
* isUnique -
* This flag is again related to indexing. If specified to be unique, it means that a special index is created for this attribute in JanusGraph that allows for equality based look ups.
* Any attribute with a true value for this flag is treated like a primary key to distinguish this entity from other entities. Hence care should be taken to ensure that this attribute does model a unique property in the real world.
* For example, consider the name attribute of a hive_table. In isolation, a name is not a unique attribute for a hive_table, because tables with the same name can exist in multiple databases. Even a pair of (database name, table name) is not unique if Atlas is storing metadata of hive tables amongst multiple clusters. Only a cluster location, database name and table name can be deemed unique in the physical world.
* multiplicity - indicates whether this attribute is required, optional, or could be multi-valued. If an entity’s definition of the attribute value does not match the multiplicity declaration in the type definition, this would be a constraint violation and the entity addition will fail. This field can therefore be used to define some constraints on the metadata information.
Let us look at the attribute called ‘db’ which represents the database to which the hive table belongs:
<verbatim>
db:
"name": "db",
"typeName": "hive_db",
"isOptional": false,
"isIndexable": true,
"isUnique": false,
"cardinality": "SINGLE"
</verbatim>
Note the “isOptional=false” constraint - a table entity cannot be created without a db reference.
The properties for configuring service authentication are:
* <code>atlas.authentication.keytab</code> - the path to the keytab file.
* <code>atlas.authentication.principal</code> - the principal to use for authenticating to the KDC. The principal is generally of the form "user/host@realm". You may use the '_HOST' token for the hostname and the local hostname will be substituted in by the runtime (e.g. "Atlas/_HOST@EXAMPLE.COM").
Note that when Atlas is configured with HBase as the storage backend in a secure cluster, the graph db (JanusGraph) needs sufficient user permissions to be able to create and access an HBase table. To grant the appropriate permissions see [[Configuration][Graph persistence engine - Hbase]].
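For instance, permissions might be granted from the HBase shell along these lines (the user and table names are assumptions for your environment):

<verbatim>
grant 'atlas', 'RWXCA', 'apache_atlas_janus'
</verbatim>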