Commit 5c2f7a0c by Madhan Neethiraj, committed by kevalbhatt

ATLAS-2365: updated README for 1.0.0-alpha release

Signed-off-by: kevalbhatt <kbhatt@apache.org>
parent 39be2ccf
...@@ -15,6 +15,7 @@
# limitations under the License.
Apache Atlas Overview
=====================
Apache Atlas framework is an extensible set of core
foundational governance services – enabling enterprises to effectively and
...@@ -31,6 +32,16 @@ The metadata veracity is maintained by leveraging Apache Ranger to prevent
non-authorized access paths to data at runtime.
Security is both role based (RBAC) and attribute based (ABAC).
Apache Atlas 1.0.0-alpha release
================================
Please note that this is an alpha/technical-preview release and is not
recommended for production use. There is no support for migration of data
from earlier versions of Apache Atlas. Also, data generated using this
alpha release may not migrate to the Apache Atlas 1.0 GA release.
Build Process
=============
...@@ -51,14 +62,6 @@ Build Process
$ export MAVEN_OPTS="-Xms2g -Xmx2g"
$ mvn clean install
# currently a few tests might fail in some environments
# (timing issue?); the community is reviewing and updating
# such tests.
#
# if you see test failures, please run the following command:
$ mvn clean -DskipTests install
$ mvn clean package -Pdist
3. After the above build commands complete successfully, you should see the following files:
...@@ -68,3 +71,5 @@ Build Process
addons/hive-bridge/target/hive-bridge-<version>.jar
addons/sqoop-bridge/target/sqoop-bridge-<version>.jar
addons/storm-bridge/target/storm-bridge-<version>.jar
4. For more details on building and running Apache Atlas, please refer to http://atlas.apache.org/InstallationSteps.html
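5. After a -Pdist build, the server distribution can be unpacked and started locally for a quick smoke test. The commands below are a sketch, not part of the official steps; the exact tarball name and unpacked directory (shown here as apache-atlas-<version>-server.tar.gz) can differ between releases, so adjust them to what your build produces.

       $ tar xzf distro/target/apache-atlas-<version>-server.tar.gz
       $ cd apache-atlas-<version>
       $ bin/atlas_start.py     # Atlas web/REST endpoint is typically http://localhost:21000
       $ bin/atlas_stop.py      # stop the server when done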
...@@ -77,6 +77,9 @@
<version>1.6</version>
</dependency>
</dependencies>
<configuration>
<port>8080</port>
</configuration>
<executions>
<execution>
<goals>
......
...@@ -8,8 +8,7 @@
The components of Atlas can be grouped under the following major categories:
---+++ Core
Atlas core includes the following components:
*Type System*: Atlas allows users to define a model for the metadata objects they want to manage. The model is composed
of definitions called ‘types’. Instances of ‘types’ called ‘entities’ represent the actual metadata objects that are
...@@ -21,25 +20,18 @@ One key point to note is that the generic nature of the modelling in Atlas allow
define both technical metadata and business metadata. It is also possible to define rich relationships between the
two using features of Atlas.
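As an illustration of working with the type system, a new entity type can be registered through the REST API. The sketch below assumes an Atlas server on localhost:21000 with the default admin/admin credentials; the type name sample_dataset and its attribute are invented for the example:
<verbatim>
curl -u admin:admin -H 'Content-Type: application/json' \
     -X POST http://localhost:21000/api/atlas/v2/types/typedefs \
     -d '{
           "entityDefs": [ {
             "name":          "sample_dataset",
             "superTypes":    [ "DataSet" ],
             "attributeDefs": [ { "name": "owner", "typeName": "string" } ]
           } ]
         }'
</verbatim>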
*Graph Engine*: Internally, Atlas persists metadata objects it manages using a Graph model. This approach provides great
flexibility and enables efficient handling of rich relationships between the metadata objects. The graph engine component is
responsible for translating between types and entities of the Atlas type system, and the underlying graph persistence model.
In addition to managing the graph objects, the graph engine also creates the appropriate indices for the metadata
objects so that they can be searched efficiently. Atlas uses JanusGraph to store the metadata objects.
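For reference, the storage and index backends used by the graph engine are selected in <atlas-conf>/atlas-application.properties. The snippet below is a sketch with illustrative host names; the property names follow the standard atlas.graph.* configuration:
<verbatim>
atlas.graph.storage.backend=hbase
atlas.graph.storage.hostname=zk-host1:2181,zk-host2:2181
atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=zk-host1:2181,zk-host2:2181
</verbatim>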
*Ingest / Export*: The Ingest component allows metadata to be added to Atlas. Similarly, the Export component exposes
metadata changes detected by Atlas to be raised as events. Consumers can consume these change events to react to
metadata changes in real time.
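For example, a single entity can be ingested through the REST API. The request below is a sketch; the server URL, credentials and attribute values are illustrative:
<verbatim>
curl -u admin:admin -H 'Content-Type: application/json' \
     -X POST http://localhost:21000/api/atlas/v2/entity \
     -d '{
           "entity": {
             "typeName": "hdfs_path",
             "attributes": {
               "qualifiedName": "hdfs://namenode:8020/data/raw@primary",
               "name": "/data/raw",
               "path": "hdfs://namenode:8020/data/raw"
             }
           }
         }'
</verbatim>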
---+++ Integration
Users can manage metadata in Atlas using two methods:
*API*: All functionality of Atlas is exposed to end users via a REST API that allows types and entities to be created,
...@@ -53,7 +45,6 @@ uses Apache Kafka as a notification server for communication between hooks and d
notification events. Events are written by the hooks and Atlas to different Kafka topics.
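The default topic names are ATLAS_HOOK (messages from hooks to Atlas) and ATLAS_ENTITIES (entity change notifications from Atlas to consumers). A consumer can be attached with the standard Kafka console tools; the broker address below is illustrative:
<verbatim>
kafka-console-consumer.sh --bootstrap-server broker-host:9092 --topic ATLAS_ENTITIES
</verbatim>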
---+++ Metadata sources
Atlas supports integration with many sources of metadata out of the box. More integrations will be added in the future
as well. Currently, Atlas supports ingesting and managing metadata from the following sources:
...@@ -61,6 +52,7 @@ as well. Currently, Atlas supports ingesting and managing metadata from the foll
* [[Bridge-Sqoop][Sqoop]]
* [[Bridge-Falcon][Falcon]]
* [[StormAtlasHook][Storm]]
* HBase - _documentation work-in-progress_
The integration implies two things:
There are metadata models that Atlas defines natively to represent objects of these components.
...@@ -80,12 +72,6 @@ for the Hadoop ecosystem having wide integration with a variety of Hadoop compon
Ranger allows security administrators to define metadata driven security policies for effective governance.
Ranger is a consumer of the metadata change events notified by Atlas.
*Business Taxonomy*: The metadata objects ingested into Atlas from the Metadata sources are primarily a form
of technical metadata. To enhance discoverability and governance capabilities, Atlas comes with a Business
Taxonomy interface that allows users to define a hierarchical set of business terms that represent their
business domain and to associate them with the metadata entities Atlas manages. Business Taxonomy is a web application that
is currently part of the Atlas Admin UI and integrates with Atlas using the REST API.
......
---+ Falcon Atlas Bridge
---++ Falcon Model
The default Falcon model includes the following types:
   * Entity types:
      * falcon_cluster
         * super-types: Infrastructure
         * attributes: timestamp, colo, owner, tags
      * falcon_feed
         * super-types: !DataSet
         * attributes: timestamp, stored-in, owner, groups, tags
      * falcon_feed_creation
         * super-types: Process
         * attributes: timestamp, stored-in, owner
      * falcon_feed_replication
         * super-types: Process
         * attributes: timestamp, owner
      * falcon_process
         * super-types: Process
         * attributes: timestamp, runs-on, owner, tags, pipelines, workflow-properties
One falcon_process entity is created for every cluster that the falcon process is defined for.
The entities are created and de-duped using the unique qualifiedName attribute. They provide namespace and can be used for querying/lineage as well. The unique attributes are:
* falcon_process.qualifiedName - <process name>@<cluster name>
* falcon_cluster.qualifiedName - <cluster name>
* falcon_feed.qualifiedName - <feed name>@<cluster name>
* falcon_feed_creation.qualifiedName - <feed name>
* falcon_feed_replication.qualifiedName - <feed name>
---++ Falcon Hook
Falcon supports listeners on falcon entity submission. This is used to add entities in Atlas using the model detailed above.
Follow the instructions below to set up the Atlas hook in Falcon:
* Add 'org.apache.atlas.falcon.service.AtlasService' to application.services in <falcon-conf>/startup.properties
* Link Atlas hook jars in Falcon classpath - 'ln -s <atlas-home>/hook/falcon/* <falcon-home>/server/webapp/falcon/WEB-INF/lib/'
* In <falcon_conf>/falcon-env.sh, set an environment variable as follows:
<verbatim>
export FALCON_SERVER_OPTS="<atlas_home>/hook/falcon/*:$FALCON_SERVER_OPTS"
</verbatim>
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
* atlas.hook.falcon.synchronous - boolean, true to run the hook synchronously. default false
...@@ -40,5 +48,5 @@ The following properties in <atlas-conf>/atlas-application.properties control th
Refer to [[Configuration][Configuration]] for notification related configurations
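A sample atlas-application.properties snippet for the Falcon hook is shown below. Only atlas.hook.falcon.synchronous appears in the list above; the remaining property names are assumed to parallel the Hive hook properties documented later in this page, so treat this as a sketch:
<verbatim>
atlas.hook.falcon.synchronous=false
atlas.hook.falcon.numRetries=3
atlas.hook.falcon.minThreads=1
atlas.hook.falcon.maxThreads=5
atlas.hook.falcon.keepAliveTime=10
atlas.hook.falcon.queueSize=10000
</verbatim>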
---++ NOTES
* In the falcon cluster entity, the cluster name used should be uniform across components like hive, falcon, sqoop etc. If used with Ambari, the Ambari cluster name should be used for the cluster entity.
---+ Hive Atlas Bridge
---++ Hive Model
The default hive model includes the following types:
   * Entity types:
      * hive_db
         * super-types: Referenceable
         * attributes: name, clusterName, description, locationUri, parameters, ownerName, ownerType
      * hive_storagedesc
         * super-types: Referenceable
         * attributes: cols, location, inputFormat, outputFormat, compressed, numBuckets, serdeInfo, bucketCols, sortCols, parameters, storedAsSubDirectories
      * hive_column
         * super-types: Referenceable
         * attributes: name, type, comment, table
      * hive_table
         * super-types: !DataSet
         * attributes: name, db, owner, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, aliases, parameters, viewOriginalText, viewExpandedText, tableType, temporary
      * hive_process
         * super-types: Process
         * attributes: name, startTime, endTime, userName, operationType, queryText, queryPlan, queryId
      * hive_column_lineage
         * super-types: Process
         * attributes: query, depenendencyType, expression
   * Enum types:
      * hive_principal_type
         * values: USER, ROLE, GROUP
   * Struct types:
      * hive_order
         * attributes: col, order
      * hive_serde
         * attributes: name, serializationLib, parameters
The entities are created and de-duped using a unique qualified name. They provide namespace and can be used for querying/lineage as well. Note that dbName, tableName and columnName should be in lower case. clusterName is explained below.
* hive_db.qualifiedName - <dbName>@<clusterName>
* hive_table.qualifiedName - <dbName>.<tableName>@<clusterName>
* hive_column.qualifiedName - <dbName>.<tableName>.<columnName>@<clusterName>
* hive_process.queryString - trimmed query string in lower case
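For instance, an entity can be looked up by its qualified name using the DSL search API. The request below is a sketch; the table name, cluster name and server URL are illustrative:
<verbatim>
curl -u admin:admin -G http://localhost:21000/api/atlas/v2/search/dsl \
     --data-urlencode 'query=hive_table where qualifiedName = "default.customers@primary"'
</verbatim>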
---++ Importing Hive Metadata
org.apache.atlas.hive.bridge.HiveMetaStoreBridge imports the Hive metadata into Atlas using the model defined above. The import-hive.sh command can be used to facilitate this; a sample invocation is shown below.
* For Hadoop jars, please make sure that the environment variable HADOOP_CLASSPATH is set. Another way is to set HADOOP_HOME to point to root directory of your Hadoop installation
* Similarly, for Hive jars, set HIVE_HOME to the root of Hive installation
* Set environment variable HIVE_CONF_DIR to Hive configuration directory
* Copy <atlas-conf>/atlas-application.properties to the hive conf directory
<verbatim>
Usage: <atlas package>/hook-bin/import-hive.sh
</verbatim>
The logs are in <atlas package>/logs/import-hive.log
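A typical non-kerberized invocation, with the environment prepared as described in the bullets above, might look like the following sketch; all paths are illustrative:
<verbatim>
export HADOOP_HOME=/usr/local/hadoop
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=/usr/local/hive/conf
cp <atlas-conf>/atlas-application.properties $HIVE_CONF_DIR/
<atlas package>/hook-bin/import-hive.sh
</verbatim>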
If you are importing metadata in a kerberized cluster you need to run the command like this:
<verbatim>
<atlas package>/hook-bin/import-hive.sh -Dsun.security.jgss.debug=true -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.krb5.conf=[krb5.conf location] -Djava.security.auth.login.config=[jaas.conf location]
</verbatim>
* krb5.conf is typically found at /etc/krb5.conf
* for details about jaas.conf and a suggested location see the [[security][atlas security documentation]]
---++ Hive Hook
Atlas Hive hook registers with Hive to listen for create/update/delete operations and updates the metadata in Atlas, via Kafka notifications, for the changes in Hive.
Follow the instructions below to set up the Atlas hook in Hive:
* Set-up Atlas hook in hive-site.xml by adding the following:
<verbatim>
  <property>
    <name>hive.exec.post.hooks</name>
    <value>org.apache.atlas.hive.hook.HiveHook</value>
  </property>
</verbatim>
* Add 'export HIVE_AUX_JARS_PATH=<atlas package>/hook/hive' in hive-env.sh of your hive configuration
* Copy <atlas-conf>/atlas-application.properties to the hive conf directory.
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details (an example snippet follows the list):
* atlas.hook.hive.synchronous - boolean, true to run the hook synchronously. default false. Recommended to be set to false to avoid delays in hive query completion.
* atlas.hook.hive.numRetries - number of retries for notification failure. default 3
* atlas.hook.hive.minThreads - core number of threads. default 1
* atlas.hook.hive.maxThreads - maximum number of threads. default 5
* atlas.hook.hive.keepAliveTime - keep alive time in msecs. default 10
* atlas.hook.hive.queueSize - queue size for the threadpool. default 10000
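Put together, a snippet with the defaults listed above would look like this; the values are shown only for illustration, and any property can be omitted to keep its default:
<verbatim>
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.minThreads=1
atlas.hook.hive.maxThreads=5
atlas.hook.hive.keepAliveTime=10
atlas.hook.hive.queueSize=10000
</verbatim>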
...@@ -76,24 +74,23 @@ Refer [[Configuration][Configuration]] for notification related configurations
Starting from the 0.8-incubating version of Atlas, column level lineage is captured in Atlas. Below are the details.
---+++ Model
* !ColumnLineageProcess type is a subtype of Process
* This relates an output Column to a set of input Columns or the Input Table
* The lineage also captures the kind of dependency, as listed below:
   * SIMPLE: output column has the same value as the input
   * EXPRESSION: output column is transformed by some expression at runtime (e.g. a Hive SQL expression) on the Input Columns.
   * SCRIPT: output column is transformed by a user provided script.
* In case of EXPRESSION dependency, the expression attribute contains the expression in string form
* Since Process links input and output !DataSets, Column is a subtype of !DataSet
---+++ Examples
For a simple CTAS below:
<verbatim>
create table t2 as select id, name from T1
</verbatim>
The lineage is captured as
...@@ -106,10 +103,8 @@ The lineage is captured as
* The !LineageInfo in Hive provides column-level lineage for the final !FileSinkOperator, linking them to the input columns in the Hive Query
---++ NOTES
* Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
* Since database name, table name and column names are case insensitive in hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
* The following hive operations are captured by hive hook currently
   * create database
......
---+ Sqoop Atlas Bridge
---++ Sqoop Model
The default Sqoop model includes the following types:
   * Entity types:
      * sqoop_process
         * super-types: Process
         * attributes: name, operation, dbStore, hiveTable, commandlineOpts, startTime, endTime, userName
      * sqoop_dbdatastore
         * super-types: !DataSet
         * attributes: name, dbStoreType, storeUse, storeUri, source, description, ownerName
   * Enum types:
      * sqoop_operation_type
         * values: IMPORT, EXPORT, EVAL
      * sqoop_dbstore_usage
         * values: TABLE, QUERY, PROCEDURE, OTHER
The entities are created and de-duped using a unique qualified name. They provide namespace and can be used for querying as well:
* sqoop_process.qualifiedName - dbStoreType-storeUri-endTime
* sqoop_dbdatastore.qualifiedName - dbStoreType-storeUri-source
---++ Sqoop Hook
Sqoop added a !SqoopJobDataPublisher that publishes data to Atlas after completion of import Job. Today, only hiveImport is supported in !SqoopHook.
This is used to add entities in Atlas using the model detailed above.
Follow the instructions below to set up the Atlas hook in Sqoop:
* Set-up Atlas hook in <sqoop-conf>/sqoop-site.xml by adding the following:
<verbatim>
  <property>
    <name>sqoop.job.data.publish.class</name>
    <value>org.apache.atlas.sqoop.hook.SqoopHook</value>
  </property>
</verbatim>
* Copy <atlas-conf>/atlas-application.properties to the sqoop conf directory <sqoop-conf>/
* Link <atlas-home>/hook/sqoop/*.jar in sqoop lib
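The last two steps might be performed as follows; this is a sketch, and <sqoop-home>/lib is assumed to be the Sqoop library directory of your installation:
<verbatim>
cp <atlas-conf>/atlas-application.properties <sqoop-conf>/
ln -s <atlas-home>/hook/sqoop/*.jar <sqoop-home>/lib/
</verbatim>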
Refer to [[Configuration][Configuration]] for notification related configurations
---++ NOTES
* Only the following sqoop operations are captured by sqoop hook currently - hiveImport
...@@ -157,9 +157,9 @@ At a high level the following points can be called out:
---++ Metadata Store
As described above, Atlas uses JanusGraph to store the metadata it manages. By default, Atlas uses a standalone HBase
instance as the backing store for JanusGraph. In order to provide HA for the metadata store, we recommend that Atlas be
configured to use distributed HBase as the backing store for JanusGraph. Doing this implies that you could benefit from the
HA guarantees HBase provides. In order to configure Atlas to use HBase in HA mode, do the following:
* Choose an existing HBase cluster that is set up in HA mode to configure in Atlas (OR) Set up a new HBase cluster in [[http://hbase.apache.org/book.html#quickstart_fully_distributed][HA mode]].
...@@ -169,8 +169,8 @@ HA guarantees HBase provides. In order to configure Atlas to use HBase in HA mod
---++ Index Store
As described above, Atlas indexes metadata through JanusGraph to support full text search queries. In order to provide HA
for the index store, we recommend that Atlas be configured to use Solr as the backing index store for JanusGraph. In order
to configure Atlas to use Solr in HA mode, do the following:
* Choose an existing !SolrCloud cluster setup in HA mode to configure in Atlas (OR) Set up a new [[https://cwiki.apache.org/confluence/display/solr/SolrCloud][SolrCloud cluster]].
...@@ -208,4 +208,4 @@ to configure Atlas to use Kafka in HA mode, do the following:
---++ Known Issues
* If the HBase region servers hosting the Atlas table are down, Atlas would not be able to store or retrieve metadata from HBase until they are brought back online.
\ No newline at end of file
---+ Quick Start
---++ Introduction
Quick start is a simple client that adds a few sample type definitions modeled after the example shown below.
It also adds sample entities along with traits as shown in the instance graph below.
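The client is typically run from the unpacked server distribution once the Atlas server is up. The commands below are a sketch; the script name matches the one shipped in the distribution's bin directory, and the URL shown is the usual default endpoint:
<verbatim>
cd <atlas package>
bin/quick_start.py http://localhost:21000
</verbatim>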
---+++ Example Type Definitions
......
---+ Repository
---++ Introduction
...@@ -7,39 +7,49 @@ Atlas is a scalable and extensible set of core foundational governance services
enterprises to effectively and efficiently meet their compliance requirements within Hadoop and
allows integration with the whole enterprise data ecosystem.
Apache Atlas provides open metadata management and governance capabilities for organizations
to build a catalog of their data assets, classify and govern these assets and provide collaboration
capabilities around these data assets for data scientists, analysts and the data governance team.
---++ Features
---+++ Metadata types & instances
* Pre-defined types for various Hadoop and non-Hadoop metadata
* Ability to define new types for the metadata to be managed
* Types can have primitive attributes, complex attributes, object references; can inherit from other types
* Instances of types, called entities, capture metadata object details and their relationships
* REST APIs to work with types and instances allow easier integration
---+++ Classification
* Ability to dynamically create classifications - like PII, EXPIRES_ON, DATA_QUALITY, SENSITIVE
* Classifications can include attributes - like expiry_date attribute in EXPIRES_ON classification
* Entities can be associated with multiple classifications, enabling easier discovery and security enforcement
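For example, a classification such as PII can be attached to an existing entity through the REST API. The request below is a sketch; the GUID and server details are placeholders:
<verbatim>
curl -u admin:admin -H 'Content-Type: application/json' \
     -X POST http://localhost:21000/api/atlas/v2/entity/guid/<entity-guid>/classifications \
     -d '[ { "typeName": "PII" } ]'
</verbatim>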
---+++ Lineage
* Intuitive UI to view lineage of data as it moves through various processes
* REST APIs to access and update lineage
---+++ Search/Discovery
* Intuitive UI to search entities by type, classification, attribute value or free-text
* Rich REST APIs to search by complex criteria
* SQL like query language to search entities - Domain Specific Language (DSL)
---+++ Security & Data Masking
* Integration with Apache Ranger enables authorization/data-masking based on classifications associated with entities in Apache Atlas. For example:
   * who can access data classified as PII, SENSITIVE
   * customer-service users can only see last 4 digits of columns classified as NATIONAL_ID
---++ Getting Started
* [[InstallationSteps][Build & Install]]
* [[QuickStart][Quick Start]]
---++ Documentation
* [[Architecture][High Level Architecture]]
* [[TypeSystem][Type System]]
* [[Repository][Metadata Repository]]
* [[Search][Search]]
* [[security][Security]]
* [[Authentication-Authorization][Authentication and Authorization]]
......
...@@ -43,7 +43,7 @@ The properties for configuring service authentication are:
* <code>atlas.authentication.keytab</code> - the path to the keytab file.
* <code>atlas.authentication.principal</code> - the principal to use for authenticating to the KDC. The principal is generally of the form "user/host@realm". You may use the '_HOST' token for the hostname and the local hostname will be substituted in by the runtime (e.g. "Atlas/_HOST@EXAMPLE.COM").
Note that when Atlas is configured with HBase as the storage backend in a secure cluster, the graph db (JanusGraph) needs sufficient user permissions to be able to create and access an HBase table. To grant the appropriate permissions see [[Configuration][Graph persistence engine - Hbase]].
---+++ JAAS configuration
......