Commit 19f24d50 by Shwetha GS

ATLAS-451 Doc: Fix few broken links due to Wiki words in Atlas documentation (ssainath via shwethags)
parent a1fb9eda
......@@ -30,7 +30,7 @@ Available bridges are:
---++ Notification
Notification is used for reliable entity registration from hooks and for entity/type change notifications. Atlas provides Kafka integration by default, but it is possible to plug in other implementations as well. The Atlas service starts an embedded Kafka server by default.
Atlas also provides NotificationHookConsumer, which runs in the Atlas service, listens to messages from the hooks and registers the entities in Atlas.
Atlas also provides !NotificationHookConsumer, which runs in the Atlas service, listens to messages from the hooks and registers the entities in Atlas.
<img src="images/twiki/notification.png" height="10" width="20" />
......
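The embedded-Kafka behaviour described above is driven by the notification settings in the Atlas application properties file. A minimal sketch, assuming the defaults shipped with this release (ports, paths and the exact file name may differ in your install):
<verbatim>
# Run the Kafka server embedded in the Atlas service (set to false to use an external cluster)
atlas.notification.embedded=true
# Where the embedded Kafka broker keeps its data
atlas.kafka.data=${sys:atlas.home}/data/kafka
# ZooKeeper and broker endpoints that the hooks and NotificationHookConsumer connect to
atlas.kafka.zookeeper.connect=localhost:9026
atlas.kafka.bootstrap.servers=localhost:9027
atlas.kafka.hook.group.id=atlas
</verbatim>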
......@@ -14,7 +14,7 @@ sqoop_process - attribute name - sqoop-dbStoreType-storeUri-endTime
sqoop_dbdatastore - attribute name - dbStoreType-connectorUrl-source
---++ Sqoop Hook
Sqoop added a SqoopJobDataPublisher that publishes data to Atlas after completion of an import job. Today, only hiveImport is supported in sqoopHook.
Sqoop added a !SqoopJobDataPublisher that publishes data to Atlas after completion of an import job. Today, only hiveImport is supported in sqoopHook.
This is used to add entities in Atlas using the model defined in org.apache.atlas.sqoop.model.SqoopDataModelGenerator.
Follow these instructions in your Sqoop set-up to add the Sqoop hook for Atlas in <sqoop-conf>/sqoop-site.xml:
......
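The exact snippet is elided in the hunk above; as a sketch, the hook is typically wired in by pointing Sqoop's job-data publisher at the Atlas hook class, along these lines (depending on the release, the Atlas client configuration and hook jars must also be made available to Sqoop):
<verbatim>
<property>
  <name>sqoop.job.data.publish.class</name>
  <value>org.apache.atlas.sqoop.hook.SqoopHook</value>
</property>
</verbatim>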
......@@ -88,14 +88,14 @@ HBase on the other hand doesnt provide ACID guarantees but is able to scale for
---+++ Choosing between Indexing Backends
Refer http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html for chossing between ElasticSarch and Solr.
Refer to http://s3.thinkaurelius.com/docs/titan/0.5.4/elasticsearch.html and http://s3.thinkaurelius.com/docs/titan/0.5.4/solr.html for choosing between !ElasticSearch and Solr.
Solr in cloud mode is the recommended setup.
---+++ Switching Persistence Backend
For switching the storage backend from BerkeleyDB to HBase and vice versa, refer to the documentation for "Graph Persistence Engine" described above and restart ATLAS.
The data in the indexing backend needs to be cleared, else there will be discrepancies between the storage and indexing backends, which could result in errors during search.
ElasticSearch runs in embedded mode by default, and its data can easily be cleared by deleting the ATLAS_HOME/data/es directory.
!ElasticSearch runs in embedded mode by default, and its data can easily be cleared by deleting the ATLAS_HOME/data/es directory.
For Solr, the collections created during ATLAS installation - vertex_index, edge_index and fulltext_index - can be deleted to clean up the indexes.
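As a sketch, clearing the indexing backend after a switch could look like this (paths assume a default install; the Solr collection names are the ones listed above, and `bin/solr delete` is the Solr 5.x collection-delete command):
<verbatim>
# Embedded ElasticSearch: remove the index data directory
rm -rf $ATLAS_HOME/data/es

# Solr: delete the collections created during installation
$SOLR_HOME/bin/solr delete -c vertex_index
$SOLR_HOME/bin/solr delete -c edge_index
$SOLR_HOME/bin/solr delete -c fulltext_index
</verbatim>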
---+++ Switching Index Backend
......
......@@ -85,5 +85,5 @@ to configure Atlas to use Kafka in HA mode, do the following:
---++ Known Issues
* [[https://issues.apache.org/jira/browse/ATLAS-338][ATLAS-338]]: Metadata events generated from a Hive CLI (as opposed to Beeline or any client going through HiveServer2) would be lost if the Atlas server is down.
* [[https://issues.apache.org/jira/browse/ATLAS-338][ATLAS-338]]: Metadata events generated from a Hive CLI (as opposed to Beeline or any client going through !HiveServer2) would be lost if the Atlas server is down.
* If the HBase region servers hosting the Atlas ‘titan’ HTable are down, Atlas would not be able to store or retrieve metadata from HBase until they are brought back online.
\ No newline at end of file
......@@ -131,8 +131,8 @@ The HBase versions currently supported are 1.1.x. For configuring ATLAS graph pe
for more details.
Pre-requisites for running HBase as a distributed cluster
* 3 or 5 ZooKeeper nodes
* At least 3 RegionServer nodes. It would be ideal to run the DataNodes on the same hosts as the Region servers for data locality.
* 3 or 5 !ZooKeeper nodes
* At least 3 !RegionServer nodes. It would be ideal to run the !DataNodes on the same hosts as the Region servers for data locality.
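For reference, graph persistence on such a cluster is selected through the atlas.graph.storage.* properties mentioned in the configuration docs; a sketch, with hypothetical ZooKeeper hostnames and an assumed default table name:
<verbatim>
# Use HBase as the Titan storage backend
atlas.graph.storage.backend=hbase
# ZooKeeper quorum of the HBase cluster (hypothetical hosts)
atlas.graph.storage.hostname=zk1.example.com,zk2.example.com,zk3.example.com
# HTable used by Atlas (the 'titan' table mentioned under Known Issues)
atlas.graph.storage.hbase.table=titan
</verbatim>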
*Configuring SOLR as the Indexing Backend for the Graph Repository*
......@@ -142,8 +142,8 @@ For configuring Titan to work with Solr, please follow the instructions below
* Install Solr if not already running. The version of Solr supported is 5.2.1. It can be installed from http://archive.apache.org/dist/lucene/solr/5.2.1/solr-5.2.1.tgz
* Start Solr in cloud mode.
SolrCloud mode uses a ZooKeeper Service as a highly available, central location for cluster management.
For a small cluster, running with an existing ZooKeeper quorum should be fine. For larger clusters, you would want to run a separate ZooKeeper quorum with at least 3 servers.
!SolrCloud mode uses a !ZooKeeper Service as a highly available, central location for cluster management.
For a small cluster, running with an existing !ZooKeeper quorum should be fine. For larger clusters, you would want to run a separate !ZooKeeper quorum with at least 3 servers.
Note: Atlas currently supports Solr in "cloud" mode only. "http" mode is not supported. For more information, refer to the Solr documentation - https://cwiki.apache.org/confluence/display/solr/SolrCloud
* For example, to bring up a Solr node listening on port 8983 on a machine, you can use the command:
......@@ -163,7 +163,7 @@ For configuring Titan to work with Solr, please follow the instructions below
Note: If numShards and replicationFactor are not specified, they default to 1, which suffices if you are trying out Solr with ATLAS on a single-node instance.
Otherwise, specify numShards according to the number of hosts that are in the Solr cluster and the maxShardsPerNode configuration.
The number of shards cannot exceed the total number of Solr nodes in your SolrCloud cluster.
The number of shards cannot exceed the total number of Solr nodes in your !SolrCloud cluster.
The number of replicas (replicationFactor) can be set according to the redundancy required.
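With Solr 5.2.1's scripts, creating the collections with explicit shard and replica counts might look like the following (the -d config directory is an assumption; point it at the Solr config shipped with your Atlas install):
<verbatim>
$SOLR_HOME/bin/solr create -c vertex_index   -d $ATLAS_HOME/conf/solr -shards 2 -replicationFactor 2
$SOLR_HOME/bin/solr create -c edge_index     -d $ATLAS_HOME/conf/solr -shards 2 -replicationFactor 2
$SOLR_HOME/bin/solr create -c fulltext_index -d $ATLAS_HOME/conf/solr -shards 2 -replicationFactor 2
</verbatim>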
......@@ -182,8 +182,8 @@ Pre-requisites for running Solr in cloud mode
* Memory - Solr is both memory and CPU intensive. Make sure the server running Solr has adequate memory, CPU and disk.
Solr works well with 32GB RAM. Plan to provide as much memory as possible to the Solr process.
* Disk - If the number of entities that need to be stored is large, plan to have at least 500 GB of free space in the volume where Solr is going to store the index data.
* SolrCloud has support for replication and sharding. It is highly recommended to use SolrCloud with at least two Solr nodes running on different servers with replication enabled.
If using SolrCloud, then you also need ZooKeeper installed and configured with 3 or 5 ZooKeeper nodes
* !SolrCloud has support for replication and sharding. It is highly recommended to use !SolrCloud with at least two Solr nodes running on different servers with replication enabled.
If using !SolrCloud, then you also need !ZooKeeper installed and configured with 3 or 5 !ZooKeeper nodes
*Starting Atlas Server*
<verbatim>
......
......@@ -69,14 +69,14 @@ rep1sep => one or more, separated by second arg.
{noformat}
Language Notes:
* A *SingleQuery* expression can be used to search for entities of a _Trait_ or _Class_.
* A *!SingleQuery* expression can be used to search for entities of a _Trait_ or _Class_.
Entities can be filtered based on a 'Where Clause' and Entity Attributes can be retrieved based on a 'Select Clause'.
* An Entity Graph can be traversed/joined by combining one or more SingleQueries.
* An Entity Graph can be traversed/joined by combining one or more !SingleQueries.
* An attempt is made to make the expressions look SQL-like by accepting keywords "SELECT",
"FROM", and "WHERE"; but these are optional and users can simply think in terms of Entity Graph Traversals.
* The transitive closure of an Entity relationship can be expressed via the _Loop_ expression. A
_Loop_ expression can be any traversal (recursively a query) that represents a _Path_ that ends in an Entity of the same _Type_ as the starting Entity.
* The _WithPath_ clause can be used with transitive closure queries to retrieve the Path that
* The _!WithPath_ clause can be used with transitive closure queries to retrieve the Path that
connects the two related Entities. (We also provide a higher-level interface for Closure Queries;
see the scaladoc for 'org.apache.atlas.query.ClosureQuery'.)
* There are a couple of Predicate functions different from SQL:
......@@ -90,7 +90,7 @@ Language Notes:
* from DB
* DB where name="Reporting" select name, owner
* DB has name
* DB is JdbcAccess
* DB is !JdbcAccess
* Column where Column isa PII
* Table where name="sales_fact", columns
* Table where name="sales_fact", columns as column select column.name, column.dataType, column.comment
......
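As an illustration of the Loop and WithPath clauses noted above (hypothetical, assuming the Table and LoadProcess types from the Atlas quick-start model):
<verbatim>
Table as src loop (LoadProcess outputTable) withPath
</verbatim>
This walks LoadProcess edges from a starting Table to every downstream entity of the same Type, returning the connecting path alongside the results.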
......@@ -8,7 +8,7 @@
<img src="images/twiki/data-types.png" height="400" width="600" />
---+++ Types Instances Overview
<img src="images/twiki/types-instances.png" height="400" width="600" />
<img src="images/twiki/types-instance.png" height="400" width="600" />
---++ Details
......@@ -23,7 +23,7 @@
- can have inheritance
- can contain structs
- don't necessarily need to use a struct inside the class to define props
- can also define props via AttributeDefinition using the basic data types
- can also define props via !AttributeDefinition using the basic data types
- classes are immutable once created
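A minimal sketch of the last two points, assuming the org.apache.atlas.typesystem API of this codebase (the type and attribute names are hypothetical):
<verbatim>
import com.google.common.collect.ImmutableList;
import org.apache.atlas.typesystem.types.*;
import org.apache.atlas.typesystem.types.utils.TypesUtil;

// A class type whose props are declared directly with AttributeDefinition,
// using basic data types rather than a nested struct.
HierarchicalTypeDefinition<ClassType> tableType = TypesUtil.createClassTypeDef(
    "sample_table",                              // hypothetical type name
    ImmutableList.<String>of(),                  // no super types
    TypesUtil.createRequiredAttrDef("name", DataTypes.STRING_TYPE),
    new AttributeDefinition("retention", DataTypes.LONG_TYPE.getName(),
        Multiplicity.OPTIONAL, false, null));    // optional, non-composite attribute
</verbatim>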
### On search interface:
......@@ -58,17 +58,17 @@
### Other useful information
HierarchicalClassType - base type for ClassType and TraitType
!HierarchicalClassType - base type for !ClassType and !TraitType
Instances created from Definitions
Every instance is referenceable - i.e. something can point to it in the graph db
MetadataService may not be used long-term - it is currently used for bootstrapping the repo & type system
!MetadataService may not be used long-term - it is currently used for bootstrapping the repo & type system
Id class - represents the Id of an instance
When the web service receives an object graph, the ObjectGraphWalker is used to update things
- DiscoverInstances is used to discover the instances in the object graph received by the web service
When the web service receives an object graph, the !ObjectGraphWalker is used to update things
- !DiscoverInstances is used to discover the instances in the object graph received by the web service
MapIds assigns new ids to the discovered instances in the object graph
!MapIds assigns new ids to the discovered instances in the object graph
Anything under the storage package is not part of the public interface
\ No newline at end of file
......@@ -7,6 +7,7 @@ ATLAS-409 Atlas will not import avro tables with schema read from a file (dosset
ATLAS-379 Create sqoop and falcon metadata addons (venkatnrangan,bvellanki,sowmyaramesh via shwethags)
ALL CHANGES:
ATLAS-451 Doc: Fix few broken links due to Wiki words in Atlas documentation (ssainath via shwethags)
ATLAS-439 Investigate apache build failure - EntityJerseyResourceIT.testEntityDeduping (shwethags)
ATLAS-426 atlas_start fails on cygwin (dkantor via shwethags)
ATLAS-448 Hive IllegalArgumentException with Atlas hook enabled on SHOW TRANSACTIONS AND SHOW COMPACTIONS (shwethags)
......