Commit 8f807351 by sameer79 Committed by kevalbhatt

ATLAS-3487 : UI : Atlas website missing images

Signed-off-by: kevalbhatt <kbhatt@apache.org>
parent f687fb7d
...@@ -92,6 +92,7 @@
<resource>
<directory>${basedir}</directory>
<excludes>
<exclude>package-lock.json</exclude>
<exclude>.docz/**</exclude>
<exclude>node_modules/**</exclude>
<exclude>target/**</exclude>
...@@ -122,6 +123,7 @@
<goal>npm</goal>
</goals>
<configuration>
<arguments>install --no-package-lock</arguments>
<arguments>install</arguments>
</configuration>
</execution>
...@@ -138,4 +140,4 @@
</plugin>
</plugins>
</build>
</project>
\ No newline at end of file
...@@ -49,16 +49,16 @@ For example. when _employees_ table is deleted, classifications associated with
**Case 2:**
When an entity is deleted in the middle of a lineage path, the propagation link is broken and previously propagated classifications will be removed from all derived entities of the deleted entity.
For example, when 'us_employees' table is deleted, classifications propagating through this table (**PII**) are removed from 'ca_employees' table, since the only path of propagation is broken by entity deletion.
<Img src={`/images/twiki/classification-propagation-entity-delete-1.png`}/>
<Img src={`/images/twiki/classification-propagation-entity-delete-2.png`}/>
**Case 3:**
When an entity is deleted in the middle of a lineage path and an alternate propagation path exists, previously propagated classifications will be retained.
For example, when 'us_employees' table is deleted, classifications propagating (**PII**) through this table are retained in 'ca_employees' table, since there are two propagation paths available and only one of them is broken by entity deletion.
<Img src={`/images/twiki/classification-propagation-entity-delete-3.png`}/>
<Img src={`/images/twiki/classification-propagation-entity-delete-4.png`}/>
## Control Propagation
......
...@@ -58,16 +58,23 @@ The following pre-requisites must be met for setting up the High Availability fe
* Ensure that you install Apache Zookeeper on a cluster of machines (a minimum of 3 servers is recommended for production).
* Select 2 or more physical machines to run the Atlas Web Service instances on. These machines define what we refer to as a 'server ensemble' for Atlas.
To set up High Availability in Atlas, a few configuration options must be defined in the `atlas-application.properties`
file. While the complete list of configuration items is defined in the [Configuration Page](#/Configuration), this
section lists a few of the main options.
* High Availability is an optional feature in Atlas. Hence, it must be enabled by setting the configuration option `atlas.server.ha.enabled` to true.
* Next, define a list of identifiers, one for each physical machine you have selected for the Atlas Web Service instance. These identifiers can be simple strings like `id1`, `id2` etc. They should be unique and should not contain a comma.
* Define a comma-separated list of these identifiers as the value of the option `atlas.server.ids`.
* For each physical machine, list the IP address/hostname and port as the value of the configuration `atlas.server.address.id`, where `id` refers to the identifier string for this physical machine.
* For example, if you have selected 2 machines with hostnames `host1.company.com` and `host2.company.com`, you can define the configuration options as below:
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`atlas.server.ids=id1,id2
...@@ -109,7 +116,7 @@ the script, and the others would print *PASSIVE*.
The Atlas Web Service can be accessed in two ways:
* **Using the Atlas Web UI**: This is a browser-based client that can be used to query the metadata stored in Atlas.
* **Using the Atlas REST API**: As Atlas exposes a RESTful API, one can use any standard REST client, including libraries in other applications. In fact, Atlas ships with a client called AtlasClient that can be used as an example to build REST client access.
To take advantage of the High Availability feature in the clients, there are two possible options.
...@@ -151,11 +158,11 @@ the active instance. The response from the Active instance would be of the form
client faces any exceptions in the course of an operation, it should again determine which of the remaining URLs
is active and retry the operation.
The AtlasClient class that ships with Atlas can be used as an example client library that implements the logic
for working with an ensemble and selecting the right Active server instance.
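The same selection can also be scripted outside of AtlasClient. A minimal sketch (assuming the standard `/api/atlas/admin/status` endpoint; host names, port and credentials below are placeholders) that probes each server and picks the one reporting *ACTIVE*:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`# Probe every server in the ensemble; use the one whose status is ACTIVE.
# Host names, port and credentials are placeholders for your own deployment.
for server in host1.company.com:21000 host2.company.com:21000; do
  status=$(curl -s -u admin:<password> http://$server/api/atlas/admin/status)
  echo "$server -> $status"   # e.g. a JSON body containing ACTIVE or PASSIVE
done`}
</SyntaxHighlighter>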
Utilities in Atlas, like `quick_start.py` and `import-hive.sh`, can be configured to run with multiple server
URLs. When launched in this mode, the AtlasClient automatically selects and works with the current active instance.
If a proxy is set up in between, then its address can be used when running quick_start.py or import-hive.sh.
### Implementation Details of Atlas High Availability
...@@ -191,10 +198,10 @@ for the index store, we recommend that Atlas be configured to use Solr or Elasti
### Solr
In order to configure Atlas to use Solr in HA mode, do the following:
* Choose an existing SolrCloud cluster setup in HA mode to configure in Atlas (OR) Set up a new [SolrCloud cluster](https://cwiki.apache.org/confluence/display/solr/SolrCloud).
* Ensure Solr is brought up on at least 2 physical hosts for redundancy, and each host runs a Solr node.
* We recommend the number of replicas to be set to at least 2 for redundancy.
* Create the SolrCloud collections required by Atlas, as described in [Installation Steps](#/Installation).
* Refer to the [Configuration page](#/Configuration) for the options to configure in atlas.properties to set up Atlas with Solr; a sample of these entries is sketched below.
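As a rough sketch, the relevant configuration entries look like the following (property names as documented on the Configuration page; the Zookeeper addresses are placeholders, so verify against your Atlas version):
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`atlas.graph.index.search.backend=solr
atlas.graph.index.search.solr.mode=cloud
atlas.graph.index.search.solr.zookeeper-url=zk1.company.com:2181,zk2.company.com:2181,zk3.company.com:2181`}
</SyntaxHighlighter>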
### Elasticsearch (Tech Preview)
...@@ -213,27 +220,31 @@ persists these messages, the events will not be lost even if the consumers are d
addition, we recommend Kafka is also set up for fault tolerance so that it has higher availability guarantees. In order
to configure Atlas to use Kafka in HA mode, do the following:
* Choose an existing Kafka cluster set up in HA mode to configure in Atlas (OR) Set up a new Kafka cluster.
* We recommend that there is more than one Kafka broker in the cluster, on different physical hosts, using Zookeeper for coordination to provide redundancy and high availability of Kafka.
* Set up at least 2 physical hosts for redundancy, each hosting a Kafka broker.
* Set up Kafka topics for Atlas usage:
* The number of partitions for the ATLAS topics should be set to 1 (numPartitions)
* Decide the number of replicas for the Kafka topics: set this to at least 2 for redundancy.
* Run the following commands:
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper <list of zookeeper host:port entries> --topic ATLAS_HOOK --replication-factor <numReplicas> --partitions 1
$KAFKA_HOME/bin/kafka-topics.sh --create --zookeeper <list of zookeeper host:port entries> --topic ATLAS_ENTITIES --replication-factor <numReplicas> --partitions 1
Here KAFKA_HOME points to the Kafka installation directory.`}
</SyntaxHighlighter>
* In atlas-application.properties, set the following configuration:
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`atlas.notification.embedded=false
atlas.kafka.zookeeper.connect=<comma separated list of servers forming Zookeeper quorum used by Kafka>
atlas.kafka.bootstrap.servers=<comma separated list of Kafka broker endpoints in host:port form> - Give at least 2 for redundancy.`}
</SyntaxHighlighter>
## Known Issues
......
...@@ -18,7 +18,7 @@ The default hive model includes the following types:
* super-types: Infrastructure
* attributes: timestamp, colo, owner, tags
* falcon_feed
* super-types: DataSet
* attributes: timestamp, stored-in, owner, groups, tags
* falcon_feed_creation
* super-types: Process
...@@ -48,7 +48,7 @@ Follow the instructions below to setup Atlas hook in Falcon:
* Copy the entire contents of folder apache-atlas-falcon-hook-${project.version}/hook/falcon to `<atlas-home>`/hook/falcon
* Link Atlas hook jars into the Falcon classpath - 'ln -s `<atlas-home>`/hook/falcon/* `<falcon-home>`/server/webapp/falcon/WEB-INF/lib/'
* In `<falcon_conf>`/falcon-env.sh, set an environment variable as follows:
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`export FALCON_SERVER_OPTS="<atlas_home>/hook/falcon/*:$FALCON_SERVER_OPTS"`}
</SyntaxHighlighter>
......
...@@ -15,13 +15,13 @@ import SyntaxHighlighter from 'react-syntax-highlighter';
HBase model includes the following types:
* Entity types:
* hbase_namespace
* super-types: Asset
* attributes: qualifiedName, name, description, owner, clusterName, parameters, createTime, modifiedTime
* hbase_table
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, namespace, column_families, uri, parameters, createtime, modifiedtime, maxfilesize, isReadOnly, isCompactionEnabled, isNormalizationEnabled, ReplicaPerRegion, Durability
* hbase_column_family
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, columns, createTime, bloomFilterType, compressionType, compactionCompressionType, encryptionType, inMemoryCompactionPolicy, keepDeletedCells, maxversions, minVersions, datablockEncoding, storagePolicy, ttl, blockCachedEnabled, cacheBloomsOnWrite, cacheDataOnWrite, evictBlocksOnClose, prefetchBlocksOnOpen, newVersionsBehavior, isMobEnabled, mobCompactPartitionPolicy
HBase entities are created and de-duped in Atlas using the unique attribute qualifiedName, whose value should be formatted as detailed below. Note that namespaceName, tableName and columnFamilyName should be in lower case.
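A sketch of the expected formats (placeholders in angle brackets; verify against the format block in the full page):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`hbase_namespace.qualifiedName:     <namespaceName>@<clusterName>
hbase_table.qualifiedName:         <namespaceName>:<tableName>@<clusterName>
hbase_column_family.qualifiedName: <namespaceName>:<tableName>.<columnFamilyName>@<clusterName>`}
</SyntaxHighlighter>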
...@@ -37,7 +37,7 @@ hbase_column_family.qualifiedName: <namespaceName>:<tableName>.<columnFamilyNam
Atlas HBase hook registers with HBase master as a co-processor. On detecting changes to HBase namespaces/tables/column-families, Atlas hook updates the metadata in Atlas via Kafka notifications.
Follow the instructions below to set up the Atlas hook in HBase:
* Register the Atlas hook in hbase-site.xml by adding the following:
<SyntaxHighlighter wrapLines={true} language="xml" style={theme.dark}>
{`<property>
<name>hbase.coprocessor.master.classes</name>
......
...@@ -17,13 +17,13 @@ import Img from 'theme/components/shared/Img'
Hive model includes the following types:
* Entity types:
* hive_db
* super-types: Asset
* attributes: qualifiedName, name, description, owner, clusterName, location, parameters, ownerName
* hive_table
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, db, createTime, lastAccessTime, comment, retention, sd, partitionKeys, columns, aliases, parameters, viewOriginalText, viewExpandedText, tableType, temporary
* hive_column
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, type, comment, table
* hive_storagedesc
* super-types: Referenceable
...@@ -34,19 +34,19 @@ Hive model includes the following types:
* hive_column_lineage
* super-types: Process
* attributes: qualifiedName, name, description, owner, inputs, outputs, query, depenendencyType, expression
* Enum types:
* hive_principal_type
* values: USER, ROLE, GROUP
* Struct types:
* hive_order
* attributes: col, order
* hive_serde
* attributes: name, serializationLib, parameters
Hive entities are created and de-duped in Atlas using the unique attribute qualifiedName, whose value should be formatted as detailed below. Note that dbName, tableName and columnName should be in lower case.
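A sketch of the expected formats (placeholders in angle brackets; verify against the format block in the full page):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`hive_db.qualifiedName:     <dbName>@<clusterName>
hive_table.qualifiedName:  <dbName>.<tableName>@<clusterName>
hive_column.qualifiedName: <dbName>.<tableName>.<columnName>@<clusterName>`}
</SyntaxHighlighter>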
...@@ -62,7 +62,7 @@ hive_process.queryString: trimmed query string in lower case`}
Atlas Hive hook registers with Hive to listen for create/update/delete operations and updates the metadata in Atlas, via Kafka notifications, for the changes in Hive.
Follow the instructions below to set up the Atlas hook in Hive:
* Set up the Atlas hook in hive-site.xml by adding the following:
<SyntaxHighlighter wrapLines={true} language="xml" style={theme.dark}>
{`<property>
<name>hive.exec.post.hooks</name>
...@@ -75,7 +75,7 @@ Follow the instructions below to setup Atlas hook in Hive:
* Copy the entire contents of folder apache-atlas-hive-hook-${project.version}/hook/hive to `<atlas package>`/hook/hive
* Add 'export HIVE_AUX_JARS_PATH=`<atlas package>`/hook/hive' in hive-env.sh of your hive configuration
* Copy `<atlas-conf>`/atlas-application.properties to the hive conf directory.
The following properties in atlas-application.properties control the thread pool and notification details:
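A hedged sketch of the kind of entries involved (property names taken from the upstream Hive hook documentation; the values shown are illustrative and should be verified against your Atlas version):
<SyntaxHighlighter wrapLines={true} language="java" style={theme.dark}>
{`atlas.hook.hive.synchronous=false   # whether to run the hook synchronously
atlas.hook.hive.numRetries=3        # number of retries for notification failure
atlas.hook.hive.minThreads=1        # minimum number of threads in the thread pool
atlas.hook.hive.maxThreads=5        # maximum number of threads in the thread pool
atlas.hook.hive.keepAliveTime=10    # keep-alive time of threads in the thread pool
atlas.hook.hive.queueSize=10000     # queue size for the thread pool`}
</SyntaxHighlighter>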
...@@ -97,18 +97,15 @@ Other configurations for Kafka notification producer can be specified by prefixi
Starting from the 0.8-incubating version of Atlas, column-level lineage is captured in Atlas. Below are the details.
### Model
* ColumnLineageProcess type is a subtype of Process
* This relates an output Column to a set of input Columns or the Input Table
* The lineage also captures the kind of dependency, as listed below:
* SIMPLE: output column has the same value as the input
* EXPRESSION: output column is transformed by some expression at runtime (e.g. a Hive SQL expression) on the Input Columns.
* SCRIPT: output column is transformed by a user-provided script.
* In case of EXPRESSION dependency, the expression attribute contains the expression in string form
* Since Process links input and output DataSets, Column is a subtype of DataSet
### Examples
For a simple CTAS below:
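For illustration, a minimal sketch of such a CTAS (hypothetical table and column names):
<SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}>
{`create table t2 as select id, name from T1`}
</SyntaxHighlighter>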
...@@ -124,10 +121,12 @@ The lineage is captured as
### Extracting Lineage from Hive commands
* The HiveHook maps the LineageInfo in the HookContext to Column lineage instances
* The LineageInfo in Hive provides column-level lineage for the final FileSinkOperator, linking them to the input columns in the Hive Query
## NOTES
* Column level lineage works with Hive version 1.2.1 after the patch for <a href="https://issues.apache.org/jira/browse/HIVE-13112">HIVE-13112</a> is applied to Hive source
* Since database name, table name and column names are case insensitive in hive, the corresponding names in entities are lowercase. So, any search APIs should use lowercase while querying on the entity names
* The following hive operations are captured by hive hook currently
......
...@@ -17,7 +17,7 @@ import Img from 'theme/components/shared/Img'
Kafka model includes the following types:
* Entity types:
* kafka_topic
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, topic, uri, partitionCount
Kafka entities are created and de-duped in Atlas using the unique attribute qualifiedName, whose value should be formatted as detailed below.
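A sketch of the expected format (placeholders in angle brackets; verify against the format block in the full page):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`kafka_topic.qualifiedName: <topic>@<clusterName>`}
</SyntaxHighlighter>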
......
...@@ -18,7 +18,7 @@ Sqoop model includes the following types:
* super-types: Process
* attributes: qualifiedName, name, description, owner, inputs, outputs, operation, commandlineOpts, startTime, endTime, userName
* sqoop_dbdatastore
* super-types: DataSet
* attributes: qualifiedName, name, description, owner, dbStoreType, storeUse, storeUri, source
* Enum types:
* sqoop_operation_type
...@@ -34,14 +34,14 @@ sqoop_dbdatastore.qualifiedName: <storeType> --url <storeUri> {[--table <tableNa
</SyntaxHighlighter>
## Sqoop Hook
Sqoop added a SqoopJobDataPublisher that publishes data to Atlas after completion of an import job. Today, only hiveImport is supported in SqoopHook.
This is used to add entities in Atlas using the model detailed above.
Follow the instructions below to set up the Atlas hook in Sqoop:
Add the following properties to enable the Atlas hook in Sqoop:
* Set up the Atlas hook in `<sqoop-conf>`/sqoop-site.xml by adding the following:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`<property>
<name>sqoop.job.data.publish.class</name>
......
...@@ -23,9 +23,9 @@ See [here](#/ExportHDFSAPI) for details on exporting *hdfs_path* entities.
| _URL_ |_api/atlas/admin/export_ |
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_| The class _AtlasExportRequest_ is used to specify the items to export. The list of _AtlasObjectId_(s) allows for specifying multiple items to export in a session. The _AtlasObjectId_ is a tuple of entity type, name of unique attribute, and value of unique attribute. Several items can be specified. See examples below.|
| _Success Response_|File stream as _application/zip_.|
|_Error Response_|Errors that are handled within the system will be returned as _AtlasBaseException_. |
| _Notes_ | Consumers could choose to consume the output of the API programmatically using _java.io.ByteOutputStream_, or manually save the contents of the stream to a file on disk.|
__Method Signature__
...@@ -40,16 +40,24 @@ __Method Signature__
It is possible to specify additional parameters for the _Export_ operation.
The current implementation has 2 options. Both are optional:
* _matchType_ This option configures the approach used for fetching the starting entity. It has the following values:
* _startsWith_ Search for an entity that is prefixed with the specified criteria.
* _endsWith_ Search for an entity that is suffixed with the specified criteria.
* _contains_ Search for an entity that has the specified criteria as a sub-string.
* _matches_ Search for an entity that is a regular expression match with the specified criteria.
* _fetchType_ This option configures the approach used for fetching entities. It has the following values:
* _FULL_: This fetches all the entities that are connected directly and indirectly to the starting entity. E.g. If a starting entity specified is a table, then this option will fetch the table, database and all the other tables within the database.
* _CONNECTED_: This fetches all the entities that are connected directly to the starting entity. E.g. If a starting entity specified is a table, then this option will fetch the table and the database entity only.
* _INCREMENTAL_: See [here](#/IncrementalExport) for details.
If no _matchType_ is specified, an exact match is used, which means that the entire string is used as the search criteria. A sample request using these options is sketched below.
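A sketch of an _AtlasExportRequest_ combining both options (the item values are placeholders):
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@cl1" } }
    ],
    "options": {
        "matchType": "startsWith",
        "fetchType": "FULL"
    }
}`}
</SyntaxHighlighter>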
...@@ -71,7 +79,7 @@ The exported ZIP file has the following entries within it:
* _{guid}.json_: Individual entities are exported with file names that correspond to their id.
### Examples
The _AtlasExportRequest_ below shows filters that attempt to export 2 databases in cluster cl1:
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
...@@ -82,7 +90,7 @@ The _!AtlasExportRequest_ below shows filters that attempt to export 2 databases
}`}
</SyntaxHighlighter>
The _AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. The _matchType_ option will fetch _accounts@cl1_.
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
...@@ -96,7 +104,7 @@ The _!AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. The _matchT
}`}
</SyntaxHighlighter>
The _AtlasExportRequest_ below specifies the _fetchType_ as _connected_. The _matchType_ option will fetch _accountsReceivable_, _accountsPayable_, etc. present in the database.
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
...@@ -110,7 +118,7 @@ The _!AtlasExportRequest_ below specifies the _fetchType_ as _connected_. The _m
}`}
</SyntaxHighlighter>
Below is the _AtlasExportResult_ JSON for the export of the _Sales_ DB present in the _QuickStart_.
The _metrics_ contains the number of types and entities exported as part of the operation.
...@@ -152,7 +160,7 @@ The _metrics_ contains the number of types and entities exported as part of the
</SyntaxHighlighter>
### CURL Calls
Below are sample CURL calls that demonstrate Export of _QuickStart_ database.
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -X POST -u adminuser:password -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
......
...@@ -35,8 +35,16 @@ The new audits for Export and Import operations also have corresponding REST API
|Error Response | Errors Returned as AtlasBaseException |
|Notes | None |
##### CURL
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`
curl -X GET -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache"
http://localhost:port/api/atlas/admin/expimp/audit?sourceClusterName=cl2
`}
</SyntaxHighlighter>
##### RESPONSE
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
......
...@@ -27,8 +27,8 @@ The general approach is:
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_|_None_|
| _Success Response_ | _AtlasImportResult_ is returned as JSON. See details below.|
|_Error Response_|Errors that are handled within the system will be returned as _AtlasBaseException_. |
### Import ZIP File Available on Server
...@@ -40,8 +40,8 @@ The general approach is:
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_|_None_|
| _Success Response_ | _AtlasImportResult_ is returned as JSON. See details below.|
|_Error Response_|Errors that are handled within the system will be returned as _AtlasBaseException_. |
|_Notes_| The file to be imported needs to be present on the server at the location specified by the _FILENAME_ parameter.|
__Method Signature for Import__
...@@ -73,7 +73,7 @@ The API will return the results of the import operation in the format defined by
* _Operation Status_: Overall status of the operation. Values are _SUCCESS_, PARTIAL_SUCCESS, _FAIL_.
### Examples Using CURL Calls
The call below performs Import of _QuickStart_ database using POST.
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -g -X POST -u adminuser:password -H "Content-Type: multipart/form-data"
...@@ -138,4 +138,3 @@ The _metrics_ contain a breakdown of the types and entities imported along with
"operationStatus": "SUCCESS"
}`}
</SyntaxHighlighter>
...@@ -112,9 +112,10 @@ This option allows for optionally importing of type definition. The option is se
The table below enumerates the conditions that get addressed as part of type definition import:
|**Condition**|**Action**|
|-------------|----------|
| Incoming type does not exist in target system | Type is created. |
|Type to be imported and type in target system are the same | No change |
|Type to be imported and type in target system differ by some attributes| Target system type is updated to the attributes present in the source.<br /> It is possible that the target system will have attributes in addition to those present in the source.<br /> In that case, the target system's type attributes will be a union of the attributes.<br /> Attributes in the target system will not be deleted to match the source. <br />If the types of the attributes differ, the import process will be aborted and an exception logged.|
To use the option, set the contents of _importOptions.json_ to:
......
...@@ -47,7 +47,7 @@ ENTITY_ALL | Any/every entity |
ENTITY_TOP_LEVEL | Entity that is the top-level entity. This is also the entity specified in _AtlasExportRequest_.|
EQUALS | Entity attribute equals the one specified in the condition. |
EQUALS_IGNORE_CASE | Entity attribute equals the one specified in the condition, ignoring case. |
STARTS_WITH | Entity attribute starts with. |
STARTS_WITH_IGNORE_CASE | Entity attribute starts with, ignoring case. |
HAS_VALUE | Entity attribute has a value. |
...@@ -64,7 +64,7 @@ CLEAR | Clear value of an attribute |
#### Built-in Transforms
##### Add Classification
During import, hive_db entity whose _qualifiedName_ is _stocks@cl1_ will get the classification _clSrcImported_.
...@@ -106,7 +106,7 @@ To add classification to only the top-level entity (entity that is used as start
}`}
</SyntaxHighlighter>
##### Replace Prefix
This action works on string values. The first parameter is the prefix that is searched for a match; once matched, it is replaced with the provided replacement string.
...@@ -123,11 +123,11 @@ The sample below searches for _/aa/bb/_, once found replaces it with _/xx/yy/_.
}`}
</SyntaxHighlighter>
##### To Lower
An entity whose hdfs_path.clusterName is CL1 will get its path attribute converted to lower case.
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
"conditions": {
"hdfs_path.clusterName": "EQUALS: CL1"
...@@ -138,11 +138,11 @@ Entity whose hdfs_path.clusterName is CL1 will get its path attribute converted
}`}
</SyntaxHighlighter>
##### Clear
An entity whose hdfs_path.clusterName has a value set will get its _replicatedTo_ attribute value cleared.
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{
"conditions": {
"hdfs_path.clusterName": "HAS_VALUE:"
...@@ -156,4 +156,4 @@ Entity whose hdfs_path.clusterName has value set, will get its _replicatedTo_ at
#### Additional Examples
Please look at [these tests](https://github.com/apache/atlas/blob/master/intg/src/test/java/org/apache/atlas/entitytransform/TransformationHandlerTest.java) for examples using Java classes.
\ No newline at end of file
...@@ -11,9 +11,9 @@ import SyntaxHighlighter from 'react-syntax-highlighter';
# Migrating data from Apache Atlas 0.8 to Apache Atlas 1.0
Apache Atlas 1.0 uses the JanusGraph graph database to store its type and entity details. Prior versions of Apache Atlas
use the Titan 0.5.4 graph database. The two databases use different formats for storage. For deployments upgrading from
an earlier version of Apache Atlas, the data in the Titan 0.5.4 graph database should be migrated to the JanusGraph graph database.
In addition to the change to the graph database, Apache Atlas 1.0 introduces a few optimizations that require different internal
representation compared to previous versions. Migration steps detailed below will transform data to be compliant with
...@@ -26,7 +26,7 @@ Migration of data is done in following steps:
* Export Apache Atlas 0.8 data to a directory on the file system.
* Import data from exported files into Apache Atlas 1.0.
#### Planning the migration
The duration of migration of data from Apache Atlas 0.8 to Apache Atlas 1.0 can be significant, depending upon the
amount of data present in Apache Atlas. This section helps you to estimate the time to migrate, so that you can plan the
...@@ -77,7 +77,7 @@ atlas-migration-export: exporting typesDef to file /home/atlas-0.8-data/atlas-mi
atlas-migration-export: exported typesDef to file /home/atlas-0.8-data/atlas-migration-typesdef.json
atlas-migration-export: exporting data to file /home/atlas-0.8-data/atlas-migration-data.json
atlas-migration-export: exported data to file /home/atlas-0.8-data/atlas-migration-data.json
atlas-migration-export: completed migration export`}
</SyntaxHighlighter>
More details on the progress of export can be found in a log file named _atlas-migration-exporter.log_, in the log directory
...@@ -98,18 +98,21 @@ curl 'http://<solrHost:port>/solr/admin/collections?action=DELETE&name=fulltext_
Apache Atlas specific Solr collections can be created using CURL commands shown below:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=vertex_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'
curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=edge_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'
curl 'http://<solrHost:port>/solr/admin/collections?action=CREATE&name=fulltext_index&numShards=1&replicationFactor=1&collection.configName=atlas_configs'`}
</SyntaxHighlighter>
* For Apache Atlas deployments that use HBase as the backend store, please note that the HBase table used by the earlier version can't be used by Apache Atlas 1.0. If you are constrained on disk storage space, the table used by the earlier version can be removed after successful export of data.
* Apache Atlas 0.8 uses HBase table named 'atlas_titan' (by default)
* Apache Atlas 1.0 uses HBase table named 'atlas_janus' (by default)
* Install Apache Atlas 1.0. Do not start it yet!
* Make sure the directory containing exported data is accessible to the Apache Atlas 1.0 instance.
#### Importing Data into Apache Atlas 1.0
...@@ -117,12 +120,14 @@ Please follow the steps below to import the data exported above into Apache Atla
* Specify the location of the directory containing exported data in the following property in _atlas-application.properties_:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`atlas.migration.data.filename=<location of the directory containing exported data>`}
</SyntaxHighlighter>
* Start Apache Atlas 1.0. Apache Atlas will start in migration mode. It will start importing data from the specified directory.
* Monitor the progress of the import process with the following curl command:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -X GET -u admin:<password> -H "Content-Type: application/json" -H "Cache-Control: no-cache" http://<atlasHost>:port/api/atlas/admin/status`}
......
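As an illustration, with the default Apache Atlas port (21000) and the admin user, the status check above might look like the following. The endpoint returns a small JSON document with the server status; while the import is running, the reported status should reflect the migration in progress rather than ACTIVE (exact field names may vary by version):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -X GET -u admin:<password> -H "Content-Type: application/json" -H "Cache-Control: no-cache" http://localhost:21000/api/atlas/admin/status`}
</SyntaxHighlighter>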
...@@ -35,7 +35,7 @@ The _additionalInfo_ attribute property is discussed in detail below. ...@@ -35,7 +35,7 @@ The _additionalInfo_ attribute property is discussed in detail below.
<Img src={`/images/markdown/atlas-server-properties.png`}/> <Img src={`/images/markdown/atlas-server-properties.png`}/>
###### Export/Import Audits #### Export/Import Audits
The table has following columns: The table has following columns:
...@@ -48,7 +48,7 @@ The table has following columns: ...@@ -48,7 +48,7 @@ The table has following columns:
<Img src={'/images/markdown/atlas-server-exp-imp-audits.png'}/> <Img src={'/images/markdown/atlas-server-exp-imp-audits.png'}/>
###### Example #### Example
The following export request creates an _AtlasServer_ entity with _clMain_ as its name. The audit record of this operation will be displayed within the property page of this entity. The following export request creates an _AtlasServer_ entity with _clMain_ as its name. The audit record of this operation will be displayed within the property page of this entity.
...@@ -89,7 +89,7 @@ Data Parameters | None | ...@@ -89,7 +89,7 @@ Data Parameters | None |
Success Response| _AtlasServer_ | Success Response| _AtlasServer_ |
Error Response | Errors Returned as AtlasBaseException | Error Response | Errors Returned as AtlasBaseException |
###### CURL #### CURL
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -X GET -u admin:admin -H "Content-Type: application/json" -H "Cache-Control:no-cache" http://localhost:21000/api/atlas/admin/server/cl2`} {`curl -X GET -u admin:admin -H "Content-Type: application/json" -H "Cache-Control:no-cache" http://localhost:21000/api/atlas/admin/server/cl2`}
......
...@@ -16,8 +16,6 @@ import SyntaxHighlighter from 'react-syntax-highlighter'; ...@@ -16,8 +16,6 @@ import SyntaxHighlighter from 'react-syntax-highlighter';
# Issue Tracking # Issue Tracking
* Issues, bugs, and feature requests should be submitted to the following issue tracking system for this project.
<SyntaxHighlighter wrapLines={true} language="html" style={theme.dark}> Issues, bugs, and feature requests should be submitted to the following issue tracking system for this project.
{`https://issues.apache.org/jira/browse/ATLAS`} [https://issues.apache.org/jira/browse/ATLAS](https://issues.apache.org/jira/browse/ATLAS)
</SyntaxHighlighter>
...@@ -13,6 +13,6 @@ submenu: Mailing Lists ...@@ -13,6 +13,6 @@ submenu: Mailing Lists
| **Name** | **Subscribe** | **Unsubscribe** | **Post** | **Archive** | | **Name** | **Subscribe** | **Unsubscribe** | **Post** | **Archive** |
| : ------------- : | : ------------- : | : ------------- : | : ------------- : |: ------------- :| | : ------------- : | : ------------- : | : ------------- : | : ------------- : |: ------------- :|
|atlas-dev|[Subscribe](mailto:dev-subscribe@atlas.incubator.apache.org) |[Unsubscribe](mailto:dev-unsubscribe@atlas.incubator.apache.org)|[Post](mailto:dev@atlas.incubator.apache.org)|[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-dev/)| |atlas-dev|[Subscribe](mailto:dev-subscribe@atlas.apache.org) |[Unsubscribe](mailto:dev-unsubscribe@atlas.apache.org)|[Post](mailto:dev@atlas.apache.org)|[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-dev/)|
|atlas-user |[Subscribe](mailto:user-subscribe@atlas.apache.org) |[Unsubscribe](mailto:user-unsubscribe@atlas.apache.org) |[Post](mailto:user@atlas.apache.org) |[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-user/)| |atlas-user |[Subscribe](mailto:user-subscribe@atlas.apache.org) |[Unsubscribe](mailto:user-unsubscribe@atlas.apache.org) |[Post](mailto:user@atlas.apache.org) |[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-user/)|
|atlas-commits|[Subscribe](mailto:commits-subscribe@atlas.incubator.apache.org)|[Unsubscribe](mailto:commits-unsubscribe@atlas.incubator.apache.org)|[Post](mailto:commits@atlas.incubator.apache.org)|[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-commits/)| |atlas-commits|[Subscribe](mailto:commits-subscribe@atlas.apache.org)|[Unsubscribe](mailto:commits-unsubscribe@atlas.apache.org)|[Post](mailto:commits@atlas.apache.org)|[mail-archives.apache.org](http://mail-archives.apache.org/mod_mbox/atlas-commits/)|
...@@ -15,8 +15,8 @@ submenu: Project Information ...@@ -15,8 +15,8 @@ submenu: Project Information
|Document|Description| |Document|Description|
|:----|:----| |:----|:----|
|[About](#/)|Apache Atlas Documentation| |[About](#/)|Apache Atlas Documentation|
|[Project Team](#/TeamList)|This document provides information on the members of this project. These are the individuals who have contributed to the project in one form or another.| |[Project Team](#/TeamList)|This document provides information on the members of this project.<br/> These are the individuals who have contributed to the project in one form or another.|
|[Mailing Lists](#/MailingLists)|This document provides subscription and archive information for this project's mailing lists.| |[Mailing Lists](#/MailingLists)|This document provides subscription and archive information for this project's mailing lists.|
|[Issue Tracking](#/IssueTracking)|This is a link to the issue management system for this project. Issues (bugs, features, change requests) can be created and queried using this link.| |[Issue Tracking](#/IssueTracking)|This is a link to the issue management system for this project.<br/> Issues (bugs, features, change requests) can be created and queried using this link.|
|[Project License](#/ProjectLicense)|This is a link to the definitions of project licenses.| |[Project License](#/ProjectLicense)|This is a link to the definitions of project licenses.|
|[Source Repository](#/SourceRepository)|This is a link to the online source repository that can be viewed via a web browser.| |[Source Repository](#/SourceRepository)|This is a link to the online source repository that can be viewed via a web browser.|
...@@ -15,19 +15,14 @@ This project uses a Source Content Management System to manage its source code. ...@@ -15,19 +15,14 @@ This project uses a Source Content Management System to manage its source code.
# Web Access # Web Access
The following is a link to the online source repository. The following is a link to the online source repository.
[https://github.com/apache/atlas.git](https://github.com/apache/atlas.git)
<SyntaxHighlighter wrapLines={true} language="html" style={theme.dark}>
https://github.com/apache/atlas.git
</SyntaxHighlighter>
# Anonymous access # Anonymous access
Refer to the documentation of the SCM used for more information about anonymous checkout. The connection url is: Refer to the documentation of the SCM used for more information about anonymous checkout.<br /> The connection url is:
git://git.apache.org/atlas.git git://git.apache.org/atlas.git
# Developer access # Developer access
Refer to the documentation of the SCM used for more information about developer checkout. The connection url is: Refer to the documentation of the SCM used for more information about developer checkout.<br /> The connection url is: [https://gitbox.apache.org/repos/asf/atlas.git](https://gitbox.apache.org/repos/asf/atlas.git)
<SyntaxHighlighter wrapLines={true} language="html" style={theme.dark}>
https://gitbox.apache.org/repos/asf/atlas.git
</SyntaxHighlighter>
# Access from behind a firewall # Access from behind a firewall
Refer to the documentation of the SCM used for more information about access behind a firewall. Refer to the documentation of the SCM used for more information about access behind a firewall.
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
name: Advance Search name: Advance Search
route: /SearchAdvance route: /SearchAdvance
menu: Documentation menu: Documentation
submenu: Search submenu: Search
--- ---
import themen from 'theme/styles/styled-colors'; import themen from 'theme/styles/styled-colors';
...@@ -138,6 +138,7 @@ Example: To retrieve entity of type Table with a property locationUri. ...@@ -138,6 +138,7 @@ Example: To retrieve entity of type Table with a property locationUri.
{`Table has locationUri {`Table has locationUri
from Table where Table has locationUri`} from Table where Table has locationUri`}
</SyntaxHighlighter> </SyntaxHighlighter>
### Select Clause ### Select Clause
As you may have noticed, the output displayed on the web page is tabular: each row corresponds to an entity and the columns are properties of that entity. The select clause allows you to choose the properties of the entity that are of interest. As you may have noticed, the output displayed on the web page is tabular: each row corresponds to an entity and the columns are properties of that entity. The select clause allows you to choose the properties of the entity that are of interest.
...@@ -147,6 +148,7 @@ Example: To retrieve entity of type _Table_ with few properties: ...@@ -147,6 +148,7 @@ Example: To retrieve entity of type _Table_ with few properties:
<SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}>
{`from Table select owner, name, qualifiedName`} {`from Table select owner, name, qualifiedName`}
</SyntaxHighlighter> </SyntaxHighlighter>
Example: To retrieve entity of type Table for a specific table with some properties. Example: To retrieve entity of type Table for a specific table with some properties.
<SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="sql" style={theme.dark}>
...@@ -443,4 +445,4 @@ The following clauses are no longer supported: ...@@ -443,4 +445,4 @@ The following clauses are no longer supported:
## Resources ## Resources
* Antlr [Book](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference). * Antlr [Book](https://pragprog.com/book/tpantlr2/the-definitive-antlr-4-reference).
* Antlr [Quick Start](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md). * Antlr [Quick Start](https://github.com/antlr/antlr4/blob/master/doc/getting-started.md).
* Atlas DSL Grammar on [Github](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format). * Atlas DSL Grammar on [Github](https://github.com/apache/atlas/blob/master/repository/src/main/java/org/apache/atlas/query/antlr4/AtlasDSLParser.g4) (Antlr G4 format).
\ No newline at end of file
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
name: Basic Search name: Basic Search
route: /SearchBasic route: /SearchBasic
menu: Documentation menu: Documentation
submenu: Search submenu: Search
--- ---
import themen from 'theme/styles/styled-colors'; import themen from 'theme/styles/styled-colors';
...@@ -14,7 +14,7 @@ import Img from 'theme/components/shared/Img' ...@@ -14,7 +14,7 @@ import Img from 'theme/components/shared/Img'
The basic search allows you to query using the typename of an entity and/or an associated classification/tag, and supports filtering on the entity attribute(s) as well as the classification/tag attributes. The basic search allows you to query using the typename of an entity and/or an associated classification/tag, and supports filtering on the entity attribute(s) as well as the classification/tag attributes.
The entire query structure can be represented using the following JSON structure (called !SearchParameters) The entire query structure can be represented using the following JSON structure (called SearchParameters)
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{`{ {`{
...@@ -44,13 +44,13 @@ tagFilters: classification attribute filter(s) ...@@ -44,13 +44,13 @@ tagFilters: classification attribute filter(s)
attributes: attributes to include in the search result`} attributes: attributes to include in the search result`}
</SyntaxHighlighter> </SyntaxHighlighter>
<Img src={`/images/twiki/search-basic-hive_column-PII.png`} height="400" width="600"/> <Img src={`/images/twiki/search-basic-hive_column-PII.png`} height="500" width="840"/>
Attribute based filtering can be done on multiple attributes with AND/OR conditions. Attribute based filtering can be done on multiple attributes with AND/OR conditions.
**Examples of filtering (for hive_table attributes)** **Examples of filtering (for hive_table attributes)**
* Single attribute * Single attribute
<SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="json" style={theme.dark}>
{` { {` {
"typeName": "hive_table", "typeName": "hive_table",
...@@ -66,7 +66,7 @@ attributes: attributes to include in the search result`} ...@@ -66,7 +66,7 @@ attributes: attributes to include in the search result`}
}`} }`}
</SyntaxHighlighter> </SyntaxHighlighter>
<Img src={`/images/twiki/search-basic-hive_table-customers.png`} height="400" width="600"/> <Img src={`/images/twiki/search-basic-hive_table-customers.png`} height="500" width="840"/>
* Multi-attribute with OR * Multi-attribute with OR
...@@ -95,7 +95,7 @@ attributes: attributes to include in the search result`} ...@@ -95,7 +95,7 @@ attributes: attributes to include in the search result`}
}`} }`}
</SyntaxHighlighter> </SyntaxHighlighter>
<Img src={`/images/twiki/search-basic-hive_table-customers-or-provider.png`} height="400" width="600"/> <Img src={`/images/twiki/search-basic-hive_table-customers-or-provider.png`} height="500" width="840"/>
* Multi-attribute with AND * Multi-attribute with AND
...@@ -124,7 +124,7 @@ attributes: attributes to include in the search result`} ...@@ -124,7 +124,7 @@ attributes: attributes to include in the search result`}
}`} }`}
</SyntaxHighlighter> </SyntaxHighlighter>
<Img src={`/images/twiki/search-basic-hive_table-customers-owner_is_hive.png`} height="400" width="600"/> <Img src={`/images/twiki/search-basic-hive_table-customers-owner_is_hive.png`} height="500" width="840"/>
**Supported operators for filtering** **Supported operators for filtering**
...@@ -170,4 +170,4 @@ attributes: attributes to include in the search result`} ...@@ -170,4 +170,4 @@ attributes: attributes to include in the search result`}
"attributes": [ "db", "qualifiedName" ] "attributes": [ "db", "qualifiedName" ]
}' }'
<protocol>://<atlas_host>:<atlas_port>/api/atlas/v2/search/basic`} <protocol>://<atlas_host>:<atlas_port>/api/atlas/v2/search/basic`}
</SyntaxHighlighter> </SyntaxHighlighter>
\ No newline at end of file
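A GET variant of the basic-search endpoint is also convenient for quick checks; the sketch below assumes the typeName, classification and limit query parameters (consult the REST API reference for the full list supported by your release):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -s -u admin:<password> '<protocol>://<atlas_host>:<atlas_port>/api/atlas/v2/search/basic?typeName=hive_table&classification=PII&limit=10'`}
</SyntaxHighlighter>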
...@@ -25,23 +25,26 @@ To configure Apache Atlas to use Apache Ranger authorizer, please follow the ins ...@@ -25,23 +25,26 @@ To configure Apache Atlas to use Apache Ranger authorizer, please follow the ins
* Include the following property in atlas-application.properties config file: * Include the following property in atlas-application.properties config file:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`atlas.authorizer.impl=ranger`} {`atlas.authorizer.impl=ranger`}
</SyntaxHighlighter> </SyntaxHighlighter>
If you use Apache Ambari to deploy Apache Atlas and Apache Ranger, enable the Atlas plugin in the configuration pages for If you use Apache Ambari to deploy Apache Atlas and Apache Ranger, enable the Atlas plugin in the configuration pages for
Apache Ranger. Apache Ranger.
* Include libraries of Apache Ranger plugin in libext directory of Apache Atlas * Include libraries of Apache Ranger plugin in libext directory of Apache Atlas
* `<Atlas installation directory>`/libext/ranger-atlas-plugin-impl/ * `<Atlas installation directory>`/libext/ranger-atlas-plugin-impl/
* `<Atlas installation directory>`/libext/ranger-atlas-plugin-shim-<version/>.jar * `<Atlas installation directory>`/libext/ranger-atlas-plugin-shim-<version/>.jar
* `<Atlas installation directory>`/libext/ranger-plugin-classloader-<version/>.jar * `<Atlas installation directory>`/libext/ranger-plugin-classloader-<version/>.jar
* Include the configuration files for the Apache Ranger plugin in the configuration directory of Apache Atlas - typically /etc/atlas/conf. For more details on the configuration file contents, please refer to the appropriate Apache Ranger documentation (a copy sketch follows this list). * Include the configuration files for the Apache Ranger plugin in the configuration directory of Apache Atlas - typically /etc/atlas/conf. For more details on the configuration file contents, please refer to the appropriate Apache Ranger documentation (a copy sketch follows this list).
* `<Atlas configuration directory>`/ranger-atlas-audit.xml * `<Atlas configuration directory>`/ranger-atlas-audit.xml
* `<Atlas configuration directory>`/ranger-atlas-security.xml * `<Atlas configuration directory>`/ranger-atlas-security.xml
* `<Atlas configuration directory>`/ranger-policymgr-ssl.xml * `<Atlas configuration directory>`/ranger-policymgr-ssl.xml
* `<Atlas configuration directory>`/ranger-security.xml * `<Atlas configuration directory>`/ranger-security.xml
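A minimal sketch of the two copy steps above, assuming the Ranger Atlas plugin has already been built or extracted under /usr/local/ranger-atlas-plugin (a hypothetical location whose internal lib/ and conf/ layout is also an assumption) and that ATLAS_HOME points to the Apache Atlas installation directory:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`# copy Ranger plugin libraries into the Atlas libext directory
cp -r /usr/local/ranger-atlas-plugin/lib/ranger-atlas-plugin-impl $ATLAS_HOME/libext/
cp /usr/local/ranger-atlas-plugin/lib/ranger-atlas-plugin-shim-*.jar $ATLAS_HOME/libext/
cp /usr/local/ranger-atlas-plugin/lib/ranger-plugin-classloader-*.jar $ATLAS_HOME/libext/

# copy Ranger plugin configuration files into the Atlas configuration directory
cp /usr/local/ranger-atlas-plugin/conf/ranger-atlas-audit.xml /etc/atlas/conf/
cp /usr/local/ranger-atlas-plugin/conf/ranger-atlas-security.xml /etc/atlas/conf/
cp /usr/local/ranger-atlas-plugin/conf/ranger-policymgr-ssl.xml /etc/atlas/conf/
cp /usr/local/ranger-atlas-plugin/conf/ranger-security.xml /etc/atlas/conf/`}
</SyntaxHighlighter>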
......
...@@ -36,13 +36,13 @@ In order to prevent the use of clear-text passwords, the Atlas platform makes us ...@@ -36,13 +36,13 @@ In order to prevent the use of clear-text passwords, the Atlas platform makes us
To create the credential provider for Atlas: To create the credential provider for Atlas:
* cd to the `bin` directory * cd to the `bin` directory
* type `./cputil.py` * type `./cputil.py`
* Enter the path for the generated credential provider. The format for the path is: * Enter the path for the generated credential provider. The format for the path is:
* [jceks://file/local/file/path/file.jceks]() or [jceks://hdfs@namenodehost:port/path/in/hdfs/to/file.jceks](). The files generally use the ".jceks" extension (e.g. test.jceks) * jceks://file/local/file/path/file.jceks or jceks://hdfs@namenodehost:port/path/in/hdfs/to/file.jceks. The files generally use the ".jceks" extension (e.g. test.jceks)
* Enter the passwords for the keystore, truststore, and server key (these passwords need to match the ones utilized for actually creating the associated certificate store files). * Enter the passwords for the keystore, truststore, and server key (these passwords need to match the ones utilized for actually creating the associated certificate store files).
The credential provider will be generated and saved to the path provided. The credential provider will be generated and saved to the path provided.
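A minimal terminal sketch of the steps above, assuming Apache Atlas is installed under /opt/atlas and that the credential file should be written to a local path (both values are illustrative only):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`cd /opt/atlas/bin
./cputil.py
# when prompted, enter the provider path, e.g. jceks://file/opt/atlas/conf/atlas.jceks,
# then the keystore, truststore and server key passwords
# (these must match the ones used to create the certificate store files)`}
</SyntaxHighlighter>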
## Service Authentication ## Service Authentication
......
...@@ -33,10 +33,13 @@ bin/atlas_start.py`} ...@@ -33,10 +33,13 @@ bin/atlas_start.py`}
#### Using Apache Atlas #### Using Apache Atlas
* To verify that the Apache Atlas server is up and running, run the curl command shown below: * To verify that the Apache Atlas server is up and running, run the curl command shown below:
<SyntaxHighlighter wrapLines={true} style={theme.dark}>
<SyntaxHighlighter wrapLines={true} style={theme.dark}>
{`curl -u username:password http://localhost:21000/api/atlas/admin/version {`curl -u username:password http://localhost:21000/api/atlas/admin/version
{"Description":"Metadata Management and Data Governance Platform over Hadoop","Version":"1.0.0","Name":"apache-atlas"}`} {"Description":"Metadata Management and Data Governance Platform over Hadoop","Version":"1.0.0","Name":"apache-atlas"}`}
</SyntaxHighlighter> </SyntaxHighlighter>
* Run quick start to load sample model and data * Run quick start to load sample model and data
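The quick start is a script shipped with the distribution; a hedged invocation (the authentication prompts and exact behaviour may differ slightly between releases) looks like:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`bin/quick_start.py
# enter the Atlas username and password when prompted; the script creates sample types,
# entities and lineage that can then be explored from the Atlas web UI`}
</SyntaxHighlighter>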
...@@ -140,17 +143,23 @@ atlas.audit.hbase.tablename=apache_atlas_entity_audit`} ...@@ -140,17 +143,23 @@ atlas.audit.hbase.tablename=apache_atlas_entity_audit`}
By default, Apache Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. For configuring JanusGraph to work with Apache Solr, please follow the instructions below By default, Apache Atlas uses JanusGraph as the graph repository; it is currently the only graph repository implementation available. For configuring JanusGraph to work with Apache Solr, please follow the instructions below
* Install Apache Solr if not already running. The supported version of Apache Solr is 5.5.1. It can be installed from http://archive.apache.org/dist/lucene/solr/5.5.1/solr-5.5.1.tgz * Install Apache Solr if not already running. The supported version of Apache Solr is 5.5.1. It can be installed from http://archive.apache.org/dist/lucene/solr/5.5.1/solr-5.5.1.tgz
* Start Apache Solr in cloud mode.
* Start Apache Solr in cloud mode.
SolrCloud mode uses a ZooKeeper Service as a highly available, central location for cluster management. For a small cluster, running with an existing ZooKeeper quorum should be fine. For larger clusters, you would want to run a separate ZooKeeper quorum with at least 3 servers. SolrCloud mode uses a ZooKeeper Service as a highly available, central location for cluster management. For a small cluster, running with an existing ZooKeeper quorum should be fine. For larger clusters, you would want to run a separate ZooKeeper quorum with at least 3 servers.
Note: Apache Atlas currently supports Apache Solr in "cloud" mode only. "http" mode is not supported. For more information, refer to the Apache Solr documentation - https://cwiki.apache.org/confluence/display/solr/SolrCloud Note: Apache Atlas currently supports Apache Solr in "cloud" mode only. "http" mode is not supported. For more information, refer to the Apache Solr documentation - https://cwiki.apache.org/confluence/display/solr/SolrCloud
* For example, to bring up an Apache Solr node listening on port 8983 on a machine, you can use the command: * For example, to bring up an Apache Solr node listening on port 8983 on a machine, you can use the command:
<SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}>
{`$SOLR_HOME/bin/solr start -c -z <zookeeper_host:port> -p 8983`}
</SyntaxHighlighter> <SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}>
{`$SOLR_HOME/bin/solr start -c -z <zookeeper_host:port> -p 8983`}
</SyntaxHighlighter>
* Run the following commands from the SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Apache Solr corresponding to the indexes that Apache Atlas uses. If the Apache Atlas and Apache Solr instances are on two different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Apache Atlas host to the Apache Solr host. SOLR_CONF in the commands below refers to the directory to which the Apache Solr configuration files have been copied on the Apache Solr host: * Run the following commands from the SOLR_BIN (e.g. $SOLR_HOME/bin) directory to create collections in Apache Solr corresponding to the indexes that Apache Atlas uses. If the Apache Atlas and Apache Solr instances are on two different hosts, first copy the required configuration files from ATLAS_HOME/conf/solr on the Apache Atlas host to the Apache Solr host. SOLR_CONF in the commands below refers to the directory to which the Apache Solr configuration files have been copied on the Apache Solr host:
...@@ -162,7 +171,7 @@ $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replica ...@@ -162,7 +171,7 @@ $SOLR_BIN/solr create -c fulltext_index -d SOLR_CONF -shards #numShards -replica
Note: If numShards and replicationFactor are not specified, they default to 1 which suffices if you are trying out solr with ATLAS on a single node instance. Note: If numShards and replicationFactor are not specified, they default to 1 which suffices if you are trying out solr with ATLAS on a single node instance.
Otherwise specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration. Otherwise specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration.
The number of shards cannot exceed the total number of Solr nodes in your !SolrCloud cluster. The number of shards cannot exceed the total number of Solr nodes in your SolrCloud cluster.
The number of replicas (replicationFactor) can be set according to the redundancy required. The number of replicas (replicationFactor) can be set according to the redundancy required.
...@@ -185,18 +194,23 @@ Pre-requisites for running Apache Solr in cloud mode ...@@ -185,18 +194,23 @@ Pre-requisites for running Apache Solr in cloud mode
* Memory - Apache Solr is both memory and CPU intensive. Make sure the server running Apache Solr has adequate memory, CPU and disk. * Memory - Apache Solr is both memory and CPU intensive. Make sure the server running Apache Solr has adequate memory, CPU and disk.
Apache Solr works well with 32GB RAM. Plan to provide as much memory as possible to Apache Solr process Apache Solr works well with 32GB RAM. Plan to provide as much memory as possible to Apache Solr process
* Disk - If the number of entities that need to be stored are large, plan to have at least 500 GB free space in the volume where Apache Solr is going to store the index data * Disk - If the number of entities that need to be stored are large, plan to have at least 500 GB free space in the volume where Apache Solr is going to store the index data
* !SolrCloud has support for replication and sharding. It is highly recommended to use !SolrCloud with at least two Apache Solr nodes running on different servers with replication enabled. * SolrCloud has support for replication and sharding. It is highly recommended to use SolrCloud with at least two Apache Solr nodes running on different servers with replication enabled.
If using !SolrCloud, then you also need !ZooKeeper installed and configured with 3 or 5 !ZooKeeper nodes If using SolrCloud, then you also need ZooKeeper installed and configured with 3 or 5 ZooKeeper nodes
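A sketch of such a setup, assuming a three-node ZooKeeper ensemble at zk1, zk2 and zk3 (hypothetical hostnames) and reusing the start command shown earlier on two different Solr hosts:
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`# on solr host 1
$SOLR_HOME/bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181 -p 8983

# on solr host 2
$SOLR_HOME/bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181 -p 8983`}
</SyntaxHighlighter>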
*Configuring Elasticsearch as the indexing backend for the Graph Repository (Tech Preview)* *Configuring Elasticsearch as the indexing backend for the Graph Repository (Tech Preview)*
By default, Apache Atlas uses [JanusGraph](https://janusgraph.org/) as the graph repository; it is currently the only graph repository implementation available. For configuring [JanusGraph](https://janusgraph.org/) to work with Elasticsearch, please follow the instructions below By default, Apache Atlas uses [JanusGraph](https://janusgraph.org/) as the graph repository; it is currently the only graph repository implementation available. For configuring [JanusGraph](https://janusgraph.org/) to work with Elasticsearch, please follow the instructions below
* Install an Elasticsearch cluster. The version currently supported is 5.6.4, and can be acquired from: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.4.tar.gz * Install an Elasticsearch cluster. The version currently supported is 5.6.4, and can be acquired from: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.6.4.tar.gz
* For simple testing a single Elasticsearch node can be started by using the 'elasticsearch' command in the bin directory of the Elasticsearch distribution. * For simple testing a single Elasticsearch node can be started by using the 'elasticsearch' command in the bin directory of the Elasticsearch distribution.
* Change Apache Atlas configuration to point to the Elasticsearch instance setup. Please make sure the following configurations are set to the below values in ATLAS_HOME/conf/atlas-application.properties * Change Apache Atlas configuration to point to the Elasticsearch instance setup. Please make sure the following configurations are set to the below values in ATLAS_HOME/conf/atlas-application.properties
<SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}> <SyntaxHighlighter wrapLines={true} language="powershell" style={theme.dark}>
{`atlas.graph.index.search.backend=elasticsearch {`atlas.graph.index.search.backend=elasticsearch
atlas.graph.index.search.hostname=<the hostname(s) of the Elasticsearch master nodes comma separated> atlas.graph.index.search.hostname=<the hostname(s) of the Elasticsearch master nodes comma separated>
......
...@@ -2,7 +2,7 @@ ...@@ -2,7 +2,7 @@
name: Type System name: Type System
route: /TypeSystem route: /TypeSystem
menu: Documentation menu: Documentation
submenu: Features submenu: Features
--- ---
import themen from 'theme/styles/styled-colors'; import themen from 'theme/styles/styled-colors';
...@@ -74,24 +74,24 @@ typeName: "hive_table" ...@@ -74,24 +74,24 @@ typeName: "hive_table"
status: "ACTIVE" status: "ACTIVE"
values: values:
name: “customers” name: “customers”
db: { "guid": "b42c6cfc-c1e7-42fd-a9e6-890e0adf33bc", db: { "guid": "b42c6cfc-c1e7-42fd-a9e6-890e0adf33bc",
"typeName": "hive_db" "typeName": "hive_db"
} }
owner: “admin” owner: “admin”
createTime: 1490761686029 createTime: 1490761686029
updateTime: 1516298102877 updateTime: 1516298102877
comment: null comment: null
retention: 0 retention: 0
sd: { "guid": "ff58025f-6854-4195-9f75-3a3058dd8dcf", sd: { "guid": "ff58025f-6854-4195-9f75-3a3058dd8dcf",
"typeName": "typeName":
"hive_storagedesc" "hive_storagedesc"
} }
partitionKeys: null partitionKeys: null
aliases: null aliases: null
columns: [ { "guid": "65e2204f-6a23-4130-934a-9679af6a211f", columns: [ { "guid": "65e2204f-6a23-4130-934a-9679af6a211f",
"typeName": "hive_column" }, "typeName": "hive_column" },
{ "guid": "d726de70-faca-46fb-9c99-cf04f6b579a6", { "guid": "d726de70-faca-46fb-9c99-cf04f6b579a6",
"typeName": "hive_column" }, "typeName": "hive_column" },
... ...
] ]
parameters: { "transient_lastDdlTime": "1466403208"} parameters: { "transient_lastDdlTime": "1466403208"}
...@@ -194,13 +194,13 @@ make convention based assumptions about what attributes they can expect of types ...@@ -194,13 +194,13 @@ make convention based assumptions about what attributes they can expect of types
metadata objects like clusters, hosts etc. metadata objects like clusters, hosts etc.
**DataSet**: This type extends Referenceable. Conceptually, it can be used to represent a type that stores data. In Atlas, **DataSet**: This type extends Referenceable. Conceptually, it can be used to represent a type that stores data. In Atlas,
hive tables, hbase_tables etc are all types that extend from !DataSet. Types that extend !DataSet can be expected to have hive tables, hbase_tables etc are all types that extend from DataSet. Types that extend DataSet can be expected to have
a Schema in the sense that they would have an attribute that defines attributes of that dataset. For example, the columns a Schema in the sense that they would have an attribute that defines attributes of that dataset. For example, the columns
attribute in a hive_table. Also entities of types that extend !DataSet participate in data transformation and this attribute in a hive_table. Also entities of types that extend DataSet participate in data transformation and this
transformation can be captured by Atlas via lineage (or provenance) graphs. transformation can be captured by Atlas via lineage (or provenance) graphs.
**Process**: This type extends Asset. Conceptually, it can be used to represent any data transformation operation. For **Process**: This type extends Asset. Conceptually, it can be used to represent any data transformation operation. For
example, an ETL process that transforms a hive table with raw data to another hive table that stores some aggregate can example, an ETL process that transforms a hive table with raw data to another hive table that stores some aggregate can
be a specific type that extends the Process type. A Process type has two specific attributes, inputs and outputs. Both be a specific type that extends the Process type. A Process type has two specific attributes, inputs and outputs. Both
inputs and outputs are arrays of !DataSet entities. Thus an instance of a Process type can use these inputs and outputs inputs and outputs are arrays of DataSet entities. Thus an instance of a Process type can use these inputs and outputs
to capture how the lineage of a !DataSet evolves. to capture how the lineage of a DataSet evolves.
\ No newline at end of file
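As an illustration of how inputs and outputs reference DataSet entities, the sketch below creates a Process entity through the v2 REST API. The GUIDs, names and the use of hive_table as the DataSet subtype are placeholders, and a real deployment would normally use a more specific Process subtype (for example, one defined by the Hive hook):
<SyntaxHighlighter wrapLines={true} language="shell" style={theme.dark}>
{`curl -X POST -u admin:<password> -H "Content-Type: application/json" \\
  http://localhost:21000/api/atlas/v2/entity -d '{
  "entity": {
    "typeName": "Process",
    "attributes": {
      "qualifiedName": "etl.copy_customers@cl1",
      "name": "copy_customers",
      "inputs":  [ { "typeName": "hive_table", "guid": "<guid of source table>" } ],
      "outputs": [ { "typeName": "hive_table", "guid": "<guid of target table>" } ]
    }
  }
}'`}
</SyntaxHighlighter>
Once created, the lineage between the two tables can be viewed from the lineage tab of either table in the Atlas web UI.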
...@@ -17,16 +17,19 @@ ...@@ -17,16 +17,19 @@
*/ */
import * as React from "react"; import * as React from "react";
import styled from "styled-components";
import { get } from "../../../utils/theme";
import { mq, breakpoints } from "../../../styles/responsive";
import { useConfig } from "../../../../docz-lib/docz/dist"; import { useConfig } from "../../../../docz-lib/docz/dist";
const Img = props => { const Img = props => {
const { src, width, height } = props; const { src, width, height } = props;
const { baseUrl } = useConfig(); const { baseUrl } = useConfig();
const styles = {
boxShadow: "0 2px 2px 0 rgba(0,0,0,0.14), 0 3px 1px -2px rgba(0,0,0,0.12), 0 1px 5px 0 rgba(0,0,0,0.2)",
WebkitBoxShadow: "0 2px 2px 0 rgba(0,0,0,0.14) 0 3px 1px -2px rgba(0,0,0,0.12), 0 1px 5px 0 rgba(0,0,0,0.2)",
MozBoxShadow: "0 2px 2px 0 rgba(0,0,0,0.14) 0 3px 1px -2px rgba(0,0,0,0.12), 0 1px 5px 0 rgba(0,0,0,0.2)"
}
return ( return (
<div> <div>
<img <img
style={styles}
src={`${baseUrl}${src}`} src={`${baseUrl}${src}`}
height={`${height || "auto"}`} height={`${height || "auto"}`}
width={`${width || "100%"}`} width={`${width || "100%"}`}
...@@ -34,4 +37,4 @@ const Img = props => { ...@@ -34,4 +37,4 @@ const Img = props => {
</div> </div>
); );
}; };
export default Img; export default Img;
\ No newline at end of file
...@@ -22,13 +22,16 @@ import { parseString } from "xml2js"; ...@@ -22,13 +22,16 @@ import { parseString } from "xml2js";
import styled from "styled-components"; import styled from "styled-components";
const TeamListStyle = styled.div` const TeamListStyle = styled.div`
width: 100%;
overflow: auto;
> table { > table {
font-family: "Inconsolata", monospace; font-family: "Inconsolata", monospace;
font-size: 14px; font-size: 14px;
display: table; display: inline-table;
table-layout: auto; table-layout: auto;
color: #13161f; color: #13161f;
width: 100%; width: 98%;
padding: 0; padding: 0;
box-shadow: 0 0 0 1px #529d8b; box-shadow: 0 0 0 1px #529d8b;
background-color: transparent; background-color: transparent;
...@@ -38,6 +41,7 @@ const TeamListStyle = styled.div` ...@@ -38,6 +41,7 @@ const TeamListStyle = styled.div`
border-radius: 2px; border-radius: 2px;
overflow-y: hidden; overflow-y: hidden;
overflow-x: initial; overflow-x: initial;
margin: 5px 10px;
} }
> table tr { > table tr {
display: table-row; display: table-row;
...@@ -45,9 +49,10 @@ const TeamListStyle = styled.div` ...@@ -45,9 +49,10 @@ const TeamListStyle = styled.div`
border-color: inherit; border-color: inherit;
} }
> table tr > td { > table tr > td {
padding: 15px; padding: 10px;
line-height: 2; line-height: 2;
font-weight: 200; font-weight: 200;
white-space: pre;
} }
> table > thead { > table > thead {
color: #7d899c; color: #7d899c;
...@@ -59,7 +64,8 @@ const TeamListStyle = styled.div` ...@@ -59,7 +64,8 @@ const TeamListStyle = styled.div`
} }
> table > thead > tr > th { > table > thead > tr > th {
font-weight: 400; font-weight: 400;
padding: 15px; padding: 10px;
text-align: left;
} }
`; `;
...@@ -137,4 +143,4 @@ export default class TeamList extends Component { ...@@ -137,4 +143,4 @@ export default class TeamList extends Component {
</TeamListStyle> </TeamListStyle>
); );
} }
} }
\ No newline at end of file
...@@ -35,5 +35,9 @@ export const OrderedList = styled.ol` ...@@ -35,5 +35,9 @@ export const OrderedList = styled.ol`
margin-right: 5px; margin-right: 5px;
} }
ol li {
padding-left: 25px;
}
${get("styles.ol")}; ${get("styles.ol")};
`; `;
\ No newline at end of file
...@@ -43,8 +43,8 @@ export const Container = styled.div` ...@@ -43,8 +43,8 @@ export const Container = styled.div`
margin: 0 auto; margin: 0 auto;
${mq({ ${mq({
width: ["100%", "100%", 920], width: ["100%", "100%", "95%"],
padding: ["20px", "0 40px 40px"] padding: ["20px", "0 30px 36px"]
})} })}
${get("styles.container")}; ${get("styles.container")};
...@@ -103,4 +103,4 @@ export const Page = ({ children, doc: { link, fullpage, edit = false } }) => { ...@@ -103,4 +103,4 @@ export const Page = ({ children, doc: { link, fullpage, edit = false } }) => {
<Wrapper>{fullpage ? content : <Container>{content}</Container>}</Wrapper> <Wrapper>{fullpage ? content : <Container>{content}</Container>}</Wrapper>
</Main> </Main>
); );
}; };
\ No newline at end of file
...@@ -57,7 +57,8 @@ const TableStyled = styled.table` ...@@ -57,7 +57,8 @@ const TableStyled = styled.table`
& thead th { & thead th {
font-weight: 400; font-weight: 400;
padding: 20px 20px; padding: 10px;
text-align: left;
&:nth-of-type(1) { &:nth-of-type(1) {
${mq({ ${mq({
...@@ -91,9 +92,10 @@ const TableStyled = styled.table` ...@@ -91,9 +92,10 @@ const TableStyled = styled.table`
} }
& tbody td { & tbody td {
padding: 12px 20px; padding: 10px;
line-height: 2; line-height: 2;
font-weight: 200; font-weight: 200;
text-align: left;
} }
& tbody > tr { & tbody > tr {
......