Commit 540129f5 by rmani Committed by Madhan Neethiraj

ATLAS-2317:[Docs] Add HBase Bridge Documents

parent c9924fdc
---+ HBase Atlas Bridge
---++ HBase Model
The default HBase model includes the following types:
* Entity types:
   * hbase_namespace
      * super-types: !Asset
      * attributes: name, owner, description, type, classifications, term, clustername, parameters, createtime, modifiedtime, qualifiedName
   * hbase_table
      * super-types: !DataSet
      * attributes: name, owner, description, type, classifications, term, uri, column_families, namespace, parameters, createtime, modifiedtime, maxfilesize, isReadOnly, isCompactionEnabled, isNormalizationEnabled, ReplicaPerRegion, Durability, qualifiedName
   * hbase_column_family
      * super-types: !DataSet
      * attributes: name, owner, description, type, classifications, term, columns, createtime, bloomFilterType, compressionType, CompactionCompressionType, EncryptionType, inMemoryCompactionPolicy, keepDeletedCells, Maxversions, MinVersions, datablockEncoding, storagePolicy, Ttl, blockCachedEnabled, cacheBloomsOnWrite, cacheDataOnWrite, EvictBlocksOnClose, PrefetchBlocksOnOpen, NewVersionsBehavior, isMobEnabled, MobCompactPartitionPolicy, qualifiedName
Entities are created and de-duplicated using a unique qualified name. The qualified name also acts as a namespace for the entity and can be used for querying:
* hbase_namespace.qualifiedName - <namespace>@<clusterName>
* hbase_table.qualifiedName - <namespace>:<tableName>@<clusterName>
* hbase_column_family.qualifiedName - <namespace>:<tableName>.<columnFamily>@<clusterName>
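The three qualifiedName patterns above can be sketched as simple string builders. This is only an illustration of the formats listed; the function names are hypothetical and not part of Atlas, and any normalization the bridge or hook may apply to the names (for example, case handling) is not shown.

```python
# Illustrative helpers (hypothetical names, not an Atlas API) that follow
# the qualifiedName patterns listed above.

def namespace_qualified_name(namespace: str, cluster_name: str) -> str:
    # <namespace>@<clusterName>
    return f"{namespace}@{cluster_name}"


def table_qualified_name(namespace: str, table: str, cluster_name: str) -> str:
    # <namespace>:<tableName>@<clusterName>
    return f"{namespace}:{table}@{cluster_name}"


def column_family_qualified_name(
    namespace: str, table: str, column_family: str, cluster_name: str
) -> str:
    # <namespace>:<tableName>.<columnFamily>@<clusterName>
    return f"{namespace}:{table}.{column_family}@{cluster_name}"


if __name__ == "__main__":
    # "sales", "orders", "cf1" and "primary" are made-up sample values.
    print(namespace_qualified_name("sales", "primary"))
    print(table_qualified_name("sales", "orders", "primary"))
    print(column_family_qualified_name("sales", "orders", "cf1", "primary"))
```

Because the cluster name is part of every pattern, the same namespace or table name on two clusters yields two distinct qualified names.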
---++ Importing HBase Metadata
org.apache.atlas.hbase.bridge.HBaseBridge imports the HBase metadata into Atlas using the model defined above. The import-hbase.sh script can be used to run this import.
<verbatim>
Usage 1: <atlas package>/hook-bin/import-hbase.sh
Usage 2: <atlas package>/hook-bin/import-hbase.sh [-n <namespace regex> OR --namespace <namespace regex>] [-t <table regex> OR --table <table regex>]
Usage 3: <atlas package>/hook-bin/import-hbase.sh [-f <filename>]
File Format:
namespace1:tbl1
namespace1:tbl2
namespace2:tbl1
</verbatim>
The logs are in <atlas package>/logs/import-hbase.log
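For Usage 3, the input file lists one namespace:table entry per line. The following is a minimal sketch of producing that content and assembling the corresponding command line. The helper names and the /opt/atlas install path are assumptions made for illustration; only the script location under hook-bin and the -f option come from the usage text above.

```python
# Hypothetical helpers for preparing an import-hbase.sh run with -f.

def render_import_file(tables):
    """Render (namespace, table) pairs in the 'namespace:table' file format."""
    return "".join(f"{namespace}:{table}\n" for namespace, table in tables)


def import_command(atlas_package_dir, filename):
    """Build the argument list for Usage 3; the caller decides how to run it."""
    return [f"{atlas_package_dir}/hook-bin/import-hbase.sh", "-f", filename]


if __name__ == "__main__":
    content = render_import_file(
        [("namespace1", "tbl1"), ("namespace1", "tbl2"), ("namespace2", "tbl1")]
    )
    print(content, end="")
    # "/opt/atlas" and "tables.txt" are placeholder values.
    print(" ".join(import_command("/opt/atlas", "tables.txt")))
```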
---++ HBase Hook
The Atlas HBase hook registers with HBase to listen for create/update/delete operations and, via Kafka notifications, updates the metadata in Atlas to reflect the changes in HBase.
Follow the instructions below to set up the Atlas hook in HBase:
* Set up the Atlas hook in hbase-site.xml by adding the following:
<verbatim>
<property>
<name>hbase.coprocessor.master.classes</name>
<value>org.apache.atlas.hbase.hook.HBaseAtlasCoprocessor</value>
</property>
</verbatim>
* Copy <atlas package>/hook/hbase/<All files and folders> to the HBase classpath. The HBase hook binary files are present in apache-atlas-<release-version>-SNAPSHOT-hbase-hook.tar.gz
* Copy <atlas-conf>/atlas-application.properties to the HBase conf directory.
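After editing hbase-site.xml, a quick check that the coprocessor entry is present can help catch typos. The sketch below is one way to do that, not part of Atlas or HBase tooling: the function name is hypothetical, and it assumes the usual layout of a configuration root element containing property elements, with the value treated as a comma-separated list of class names.

```python
import xml.etree.ElementTree as ET

COPROCESSOR_KEY = "hbase.coprocessor.master.classes"
ATLAS_COPROCESSOR = "org.apache.atlas.hbase.hook.HBaseAtlasCoprocessor"


def has_atlas_coprocessor(hbase_site_xml: str) -> bool:
    """Return True if the Atlas coprocessor is listed under the master classes key."""
    root = ET.fromstring(hbase_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == COPROCESSOR_KEY:
            value = prop.findtext("value") or ""
            # Assumption: multiple coprocessor classes are separated by commas.
            classes = [cls.strip() for cls in value.split(",")]
            return ATLAS_COPROCESSOR in classes
    return False


SAMPLE = """<configuration>
  <property>
    <name>hbase.coprocessor.master.classes</name>
    <value>org.apache.atlas.hbase.hook.HBaseAtlasCoprocessor</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(has_atlas_coprocessor(SAMPLE))
```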
The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
* atlas.hook.hbase.synchronous - boolean, true to run the hook synchronously. default false. Keeping it set to false is recommended to avoid delays in HBase operations.
* atlas.hook.hbase.numRetries - number of retries for notification failure. default 3
* atlas.hook.hbase.minThreads - core number of threads. default 1
* atlas.hook.hbase.maxThreads - maximum number of threads. default 5
* atlas.hook.hbase.keepAliveTime - keep-alive time in milliseconds. default 10
* atlas.hook.hbase.queueSize - queue size for the threadpool. default 10000
Refer to [[Configuration][Configuration]] for notification-related configuration.
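The defaults listed above apply whenever a property is absent from atlas-application.properties. The following sketch shows how the effective values could be worked out from the file's contents. It is a simplified reader written for illustration, not how Atlas itself loads configuration: the function name is hypothetical, and it handles only plain key=value lines and # comments.

```python
# Defaults taken from the property list above; values are kept as strings.
HOOK_DEFAULTS = {
    "atlas.hook.hbase.synchronous": "false",
    "atlas.hook.hbase.numRetries": "3",
    "atlas.hook.hbase.minThreads": "1",
    "atlas.hook.hbase.maxThreads": "5",
    "atlas.hook.hbase.keepAliveTime": "10",
    "atlas.hook.hbase.queueSize": "10000",
}


def resolve_hook_settings(properties_text: str) -> dict:
    """Overlay HBase hook properties found in the text onto the documented defaults."""
    settings = dict(HOOK_DEFAULTS)
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key in settings:
            settings[key] = value.strip()
    return settings


if __name__ == "__main__":
    sample = "# tuned for a busy cluster\natlas.hook.hbase.maxThreads=8\n"
    for key, value in resolve_hook_settings(sample).items():
        print(f"{key}={value}")
```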
---++ NOTES
* Only namespace, table and column family create / update / delete operations are captured by the hook. Changes to columns are not captured or propagated.
\ No newline at end of file
......@@ -57,11 +57,13 @@ capabilities around these data assets for data scientists, analysts and the data
* [[Configuration][Configuration]]
* Notification
* [[Notification-Entity][Entity Notification]]
* Bridges
* [[Bridge-Hive][Hive Bridge]]
* [[Bridge-Sqoop][Sqoop Bridge]]
* [[Bridge-Falcon][Falcon Bridge]]
* [[StormAtlasHook][Storm Bridge]]
* Hooks & Bridges
* [[Bridge-HBase][HBase Hook & Bridge]]
* [[Bridge-Hive][Hive Hook & Bridge]]
* [[Bridge-Kafka][Kafka Bridge]]
* [[Bridge-Sqoop][Sqoop Hook]]
* [[StormAtlasHook][Storm Hook]]
* [[Bridge-Falcon][Falcon Hook]]
* [[HighAvailability][Fault Tolerance And High Availability Options]]
---++ API Documentation