---+ HBase Atlas Bridge

---++ HBase Model
The default HBase model includes the following types:
   * Entity types:
      * hbase_namespace
         * super-types: !Asset
         * attributes: name, owner, description, type, classifications, term, clustername, parameters, createtime, modifiedtime, qualifiedName
      * hbase_table
         * super-types: !DataSet
         * attributes: name, owner, description, type, classifications, term, uri, column_families, namespace, parameters, createtime, modifiedtime, maxfilesize, isReadOnly, isCompactionEnabled, isNormalizationEnabled, ReplicaPerRegion, Durability, qualifiedName
      * hbase_column_family
         * super-types: !DataSet
         * attributes: name, owner, description, type, classifications, term, columns, createtime, bloomFilterType, compressionType, CompactionCompressionType, EncryptionType, inMemoryCompactionPolicy, keepDeletedCells, Maxversions, MinVersions, datablockEncoding, storagePolicy, Ttl, blockCachedEnabled, cacheBloomsOnWrite, cacheDataOnWrite, EvictBlocksOnClose, PrefetchBlocksOnOpen, NewVersionsBehavior, isMobEnabled, MobCompactPartitionPolicy, qualifiedName

The entities are created and de-duplicated using a unique qualified name. Qualified names also provide a namespace for the entities and can be used for querying:
   * hbase_namespace.qualifiedName     - <namespace>@<clusterName>
   * hbase_table.qualifiedName         - <namespace>:<tableName>@<clusterName>
   * hbase_column_family.qualifiedName - <namespace>:<tableName>.<columnFamily>@<clusterName>


---++ Importing HBase Metadata
org.apache.atlas.hbase.bridge.HBaseBridge imports the HBase metadata into Atlas using the model defined above. The import-hbase.sh script can be used to run the import:
   <verbatim>
   Usage 1: <atlas package>/hook-bin/import-hbase.sh
   Usage 2: <atlas package>/hook-bin/import-hbase.sh [-n <namespace regex> OR --namespace <namespace regex>] [-t <table regex> OR --table <table regex>]
   Usage 3: <atlas package>/hook-bin/import-hbase.sh [-f <filename>]
      File Format:
         namespace1:tbl1
         namespace1:tbl2
         namespace2:tbl1
   </verbatim>

The logs are in <atlas package>/logs/import-hbase.log


---++ HBase Hook
The Atlas HBase hook registers with HBase to listen for create/update/delete operations and, via Kafka notifications, updates the metadata in Atlas to reflect the changes in HBase.
Follow the instructions below to set up the Atlas hook in HBase:
   * Set up the Atlas hook in hbase-site.xml by adding the following:
   <verbatim>
    <property>
      <name>hbase.coprocessor.master.classes</name>
      <value>org.apache.atlas.hbase.hook.HBaseAtlasCoprocessor</value>
    </property></verbatim>
   * Copy <atlas package>/hook/hbase/<All files and folder> to the HBase class path. The HBase hook binary files are in apache-atlas-<release-version>-SNAPSHOT-hbase-hook.tar.gz
   * Copy <atlas-conf>/atlas-application.properties to the HBase conf directory.

The following properties in <atlas-conf>/atlas-application.properties control the thread pool and notification details:
   * atlas.hook.hbase.synchronous   - boolean, true to run the hook synchronously. Default false. Keeping it false is recommended, to avoid delaying HBase operations.
   * atlas.hook.hbase.numRetries    - number of retries on notification failure. Default 3
   * atlas.hook.hbase.minThreads    - core number of threads. Default 1
   * atlas.hook.hbase.maxThreads    - maximum number of threads. Default 5
   * atlas.hook.hbase.keepAliveTime - keep-alive time in msecs. Default 10
   * atlas.hook.hbase.queueSize     - queue size for the thread pool. Default 10000

Refer to [[Configuration][Configuration]] for notification-related configurations.


---++ NOTES
   * Only namespace, table and column family create / update / delete operations are captured by the hook. Column changes are not captured or propagated.
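
As an illustration of the qualifiedName formats listed in the HBase Model section, the following is a minimal Python sketch that assembles the three kinds of names from their parts. The function names and the sample values (=default=, =orders=, =cf1=, =primary=) are hypothetical and are not part of Atlas; the sketch only mirrors the documented patterns, and the actual bridge or hook may normalise names (for example, letter case) in ways not shown here.

```python
# Sketch only: mirrors the qualifiedName patterns documented above.
# These helpers are hypothetical and are not Atlas APIs.

def namespace_qualified_name(namespace: str, cluster: str) -> str:
    """<namespace>@<clusterName>"""
    return f"{namespace}@{cluster}"


def table_qualified_name(namespace: str, table: str, cluster: str) -> str:
    """<namespace>:<tableName>@<clusterName>"""
    return f"{namespace}:{table}@{cluster}"


def column_family_qualified_name(
    namespace: str, table: str, column_family: str, cluster: str
) -> str:
    """<namespace>:<tableName>.<columnFamily>@<clusterName>"""
    return f"{namespace}:{table}.{column_family}@{cluster}"


if __name__ == "__main__":
    # Hypothetical sample values, chosen only for illustration.
    print(namespace_qualified_name("default", "primary"))
    print(table_qualified_name("default", "orders", "primary"))
    print(column_family_qualified_name("default", "orders", "cf1", "primary"))
```

Names built this way can be compared against the qualifiedName attribute of imported entities when checking that an import or hook notification produced the expected hbase_namespace, hbase_table and hbase_column_family entries.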