Commit ceea868d by ashutoshm Committed by apoorvnaik

ATLAS-1717-IX-Documentation

parent cb554197
---+ Export API
The general approach is:
* The consumer specifies the scope of data to be exported (details below).
* If successful, the API returns a stream in the format specified.
* If the call fails, an error is returned.
See [[Export-HDFS-API][here]] for details on exporting *hdfs_path* entities.
|*Title*|*Export API*|
| _Example_ | See Examples sections below. |
| _URL_ |_api/atlas/admin/export_ |
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_| The class _!AtlasExportRequest_ specifies the items to export. Its list of _!AtlasObjectId_(s) allows multiple items to be exported in a single session. Each _!AtlasObjectId_ is a tuple of entity type, name of unique attribute, and value of unique attribute. See examples below.|
| _Success Response_|File stream as _application/zip_.|
|_Error Response_|Errors that are handled within the system will be returned as _!AtlasBaseException_. |
| _Notes_ | The consumer can consume the output of the API either programmatically, using _java.io.ByteArrayOutputStream_, or manually, by saving the contents of the stream to a file on disk.|
__Method Signature__
<verbatim>
@POST
@Path("/export")
@Consumes("application/json;charset=UTF-8")
</verbatim>
---+++ Additional Options
It is possible to specify additional parameters for the _Export_ operation.
The current implementation has 2 options. Both are optional:
* _matchType_ This option configures the approach used for fetching the starting entity. It has the following values:
   * _startsWith_ Search for an entity that is prefixed with the specified criteria.
   * _endsWith_ Search for an entity that is suffixed with the specified criteria.
   * _contains_ Search for an entity that has the specified criteria as a sub-string.
   * _matches_ Search for an entity that is a regular expression match with the specified criteria.
* _fetchType_ This option configures the approach used for fetching entities. It has the following values:
   * _FULL_: This fetches all the entities that are connected directly and indirectly to the starting entity. E.g. if the starting entity specified is a table, then this option will fetch the table, the database and all the other tables within the database.
   * _CONNECTED_: This fetches all the entities that are connected directly to the starting entity. E.g. if the starting entity specified is a table, then this option will fetch the table and the database entity only.
If no _matchType_ is specified, an exact match is used, which means that the entire string is used in the search criteria.
Searching using _matchType_ applies to all types of entities. It is particularly useful for matching entities of type hdfs_path (see [[Export-HDFS-API][here]]).
The _fetchType_ option defaults to _FULL_.
For a complete example, see the Examples section below.
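The options described above can be assembled into a request body programmatically. The following is a minimal Python sketch, not part of Atlas: the helper name =build_export_request= and its parameters are hypothetical, and accepting _fetchType_ in either case is an assumption based on the examples in this document using both _FULL_ and _full_.

```python
import json

# Option values as listed in this section; the helper itself is hypothetical.
MATCH_TYPES = {"startsWith", "endsWith", "contains", "matches"}
FETCH_TYPES = {"FULL", "CONNECTED"}


def build_export_request(items, fetch_type=None, match_type=None):
    """Build a dict shaped like an AtlasExportRequest.

    items: iterable of (type_name, attribute_name, attribute_value) tuples.
    Leaving match_type as None keeps the default exact match; leaving
    fetch_type as None keeps the default fetch type (FULL).
    """
    options = {}
    if fetch_type is not None:
        # Assumption: the server accepts either case for fetchType.
        if fetch_type.upper() not in FETCH_TYPES:
            raise ValueError(f"unknown fetchType: {fetch_type!r}")
        options["fetchType"] = fetch_type
    if match_type is not None:
        if match_type not in MATCH_TYPES:
            raise ValueError(f"unknown matchType: {match_type!r}")
        options["matchType"] = match_type

    request = {
        "itemsToExport": [
            {"typeName": type_name, "uniqueAttributes": {attr_name: attr_value}}
            for type_name, attr_name, attr_value in items
        ]
    }
    if options:
        request["options"] = options
    return request


if __name__ == "__main__":
    body = build_export_request(
        [("hive_db", "qualifiedName", "accounts@")],
        fetch_type="CONNECTED",
        match_type="startsWith",
    )
    print(json.dumps(body, indent=4))
```

Validating the option values on the client side is a design choice of this sketch; it only catches typos early and does not replace the checks done by the server.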
---+++ Contents of Exported ZIP File
The exported ZIP file has the following entries within it:
* _atlas-export-result.json_:
   * Input filters: The scope of export.
   * File format: The format chosen for the export operation.
   * Metrics: The number of entity definitions, classifications and entities exported.
* _atlas-typesdef.json_: Type definitions for the entities exported.
* _atlas-export-order.json_: Order in which entities should be exported.
* _{guid}.json_: Individual entities are exported with file names that correspond to their id.
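The layout listed above can be inspected with a few lines of Python. This is a sketch under stated assumptions: the function name =summarize_export_zip= is hypothetical, and every entry other than the three named files is assumed to be a per-entity _{guid}.json_ file.

```python
import json
import zipfile

# Entry names as listed above; any other *.json entry is assumed to be
# a per-entity "{guid}.json" file.
METADATA_ENTRIES = {
    "atlas-export-result.json",
    "atlas-typesdef.json",
    "atlas-export-order.json",
}


def summarize_export_zip(source):
    """Return (export_result, entity_guids) for an exported ZIP.

    source: a path or a binary file-like object accepted by zipfile.ZipFile.
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        missing = METADATA_ENTRIES - set(names)
        if missing:
            raise ValueError(f"missing entries: {sorted(missing)}")
        # The export result records the request, metrics and status.
        with archive.open("atlas-export-result.json") as handle:
            export_result = json.load(handle)
        # Strip the ".json" suffix to recover the entity ids.
        guids = [
            name[: -len(".json")]
            for name in names
            if name not in METADATA_ENTRIES and name.endswith(".json")
        ]
    return export_result, guids
```

For example, =summarize_export_zip("quickStartDB.zip")= would return the parsed export result together with the list of entity ids found in the archive.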
---+++ Examples
The _!AtlasExportRequest_ below shows filters that attempt to export 2 databases in cluster cl1:
<verbatim>
{
"itemsToExport": [
{ "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@cl1" } },
{ "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "hr@cl1" } }
]
}
</verbatim>
The _!AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. With _matchType_ set to _startsWith_, the criteria _accounts@_ will fetch _accounts@cl1_.
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@" } }
    ],
    "options": {
        "fetchType": "FULL",
        "matchType": "startsWith"
    }
}
</verbatim>
The _!AtlasExportRequest_ below specifies the _fetchType_ as _CONNECTED_. With _matchType_ set to _startsWith_, the criteria _accounts_ will fetch _accountsReceivable_, _accountsPayable_, etc. present in the database.
<verbatim>
{
    "itemsToExport": [
        { "typeName": "hive_db", "uniqueAttributes": { "name": "accounts" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "startsWith"
    }
}
</verbatim>
Below is the _!AtlasExportResult_ JSON for the export of the _Sales_ DB present in the _!QuickStart_.
The _metrics_ contains the number of types and entities exported as part of the operation.
<verbatim>
{
"clientIpAddress": "10.0.2.15",
"hostName": "10.0.2.2",
"metrics": {
"duration": 1415,
"entitiesWithExtInfo": 12,
"entity:DB_v1": 2,
"entity:LoadProcess_v1": 2,
"entity:Table_v1": 6,
"entity:View_v1": 2,
"typedef:Column_v1": 1,
"typedef:DB_v1": 1,
"typedef:LoadProcess_v1": 1,
"typedef:StorageDesc_v1": 1,
"typedef:Table_v1": 1,
"typedef:View_v1": 1,
"typedef:classification": 6
},
"operationStatus": "SUCCESS",
"request": {
"itemsToExport": [
{
"typeName": "DB_v1",
"uniqueAttributes": {
"name": "Sales"
}
}
],
"options": {
"fetchType": "full"
}
},
"userName": "admin"
}
</verbatim>
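The _metrics_ map above uses keys prefixed with _entity:_ and _typedef:_. A small Python sketch for totalling them is shown below; the function name =summarize_metrics= is hypothetical, and the key layout is taken from the sample above only.

```python
from collections import defaultdict


def summarize_metrics(metrics):
    """Total the 'entity:*' and 'typedef:*' counters of a metrics map."""
    totals = defaultdict(int)
    for key, value in metrics.items():
        prefix, separator, _ = key.partition(":")
        # Keys without a prefix (e.g. "duration") are not counters.
        if separator and prefix in ("entity", "typedef"):
            totals[prefix] += value
    return dict(totals)
```

Applied to the sample above, the _entity:_ counters add up to 12, which in this sample equals the _entitiesWithExtInfo_ value.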
---+++ CURL Calls
Below is a sample CURL call that demonstrates export of the _!QuickStart_ databases.
<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
        { "typeName": "DB", "uniqueAttributes": { "name": "Sales" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Reporting" } },
        { "typeName": "DB", "uniqueAttributes": { "name": "Logging" } }
    ],
    "options": { "fetchType": "full" }
}' "http://localhost:21000/api/atlas/admin/export" > quickStartDB.zip
</verbatim>
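The same export call can be issued from a program. The sketch below uses only the Python standard library; the function names are hypothetical, and the host, port and credentials are the ones from the CURL example above, not defaults to rely on.

```python
import base64
import json
import shutil
import urllib.request


def make_export_http_request(base_url, user, password, export_request):
    """Prepare (but do not send) a POST to the export endpoint."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/atlas/admin/export",
        data=json.dumps(export_request).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Cache-Control": "no-cache",
            "Authorization": "Basic " + token,
        },
    )


def export_to_file(http_request, zip_path):
    """Send the request and stream the ZIP response to zip_path.

    urlopen raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(http_request) as response, open(zip_path, "wb") as out:
        shutil.copyfileobj(response, out)
```

Usage would look like =export_to_file(make_export_http_request("http://localhost:21000", "admin", "admin", body), "quickStartDB.zip")=, where =body= is a dict in the _!AtlasExportRequest_ shape. Streaming with =shutil.copyfileobj= avoids holding a large export in memory.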
---+ Export & Import APIs for HDFS Path
---+++ Introduction
The general approach for using the Import-Export APIs for HDFS paths remains the same. There are minor variations caused by how HDFS paths are handled within Atlas.
Unlike HIVE entities, HDFS entities within Atlas are created manually using the _Create Entity_ link within the Atlas Web UI.
Also, HDFS paths tend to be hierarchical, in the sense that users tend to model the same HDFS storage structure within Atlas.
__Sample HDFS Setup__
<table border="1" cellpadding="pixels" cellspacing="pixels">
<tr>
<th><strong>HDFS Path</strong></th> <th><strong>Atlas Entity</strong></th>
</tr>
<tr>
<td style="padding:0 15px 0 15px;">
<em>/apps/warehouse/finance</em>
</td>
<td style="padding:0 15px 0 15px;">
<strong>Entity type: </strong><em>hdfs_path</em> <br/>
<strong>Name: </strong><em>Finance</em> <br/>
<strong>QualifiedName: </strong><em>FinanceAll</em>
</td>
</tr>
<tr>
<td style="padding:0 15px 0 15px;">
<em>/apps/warehouse/finance/accounts-receivable</em>
</td>
<td style="padding:0 15px 0 15px;">
<strong>Entity type: </strong><em>hdfs_path</em> <br/>
<strong>Name: </strong><em>FinanceReceivable</em> <br/>
<strong>QualifiedName: </strong><em>FinanceReceivable</em> <br/>
<strong>Path: </strong><em>/apps/warehouse/finance/accounts-receivable</em>
</td>
</tr>
<tr>
<td style="padding:0 15px 0 15px;">
<em>/apps/warehouse/finance/accounts-payable</em>
</td>
<td style="padding:0 15px 0 15px;">
<strong>Entity type: </strong><em>hdfs_path</em> <br/>
<strong>Name: </strong><em>Finance-Payable</em> <br/>
<strong>QualifiedName: </strong><em>FinancePayable</em> <br/>
<strong>Path: </strong><em>/apps/warehouse/finance/accounts-payable</em>
</td>
</tr>
<tr>
<td style="padding:0 15px 0 15px;">
<em>/apps/warehouse/finance/billing</em>
</td>
<td style="padding:0 15px 0 15px;">
<strong>Entity type: </strong><em>hdfs_path</em> <br/>
<strong>Name: </strong><em>FinanceBilling</em> <br/>
<strong>QualifiedName: </strong><em>FinanceBilling</em> <br/>
<strong>Path: </strong><em>/apps/warehouse/finance/billing</em>
</td>
</tr>
</table>
---+++ Export API Using matchType
To export entities that represent HDFS paths, use the Export API with the _matchType_ option. Details can be found [[Export-API][here]].
---+++ Example Using CURL Calls
Below is a sample CURL call that performs an export operation on the _Sample HDFS Setup_ shown above.
<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
"itemsToExport": [
{ "typeName": "hdfs_path", "uniqueAttributes": { "name": "FinanceAll" }
}
],
"options": {
"fetchType": "full",
"matchType": "startsWith"
}
}' "http://localhost:21000/api/atlas/admin/export" > financeAll.zip
</verbatim>
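To see which entities of the _Sample HDFS Setup_ a given _matchType_ would select, the semantics can be mimicked locally. This Python sketch only illustrates the option descriptions in the Export API page; it is not how Atlas implements matching. The function name =is_match= is hypothetical, and treating _matches_ as a full-string regular expression match is an assumption.

```python
import re


def is_match(match_type, criteria, value):
    """Illustrate the matchType options against a single attribute value."""
    if match_type is None:
        # No matchType: exact match on the entire string.
        return value == criteria
    if match_type == "startsWith":
        return value.startswith(criteria)
    if match_type == "endsWith":
        return value.endswith(criteria)
    if match_type == "contains":
        return criteria in value
    if match_type == "matches":
        # Assumption: the whole value must match the regular expression.
        return re.fullmatch(criteria, value) is not None
    raise ValueError(f"unknown matchType: {match_type!r}")


# Qualified names from the Sample HDFS Setup table.
QUALIFIED_NAMES = ["FinanceAll", "FinanceReceivable", "FinancePayable", "FinanceBilling"]


def select(match_type, criteria, values=QUALIFIED_NAMES):
    """Return the values that the given matchType and criteria select."""
    return [v for v in values if is_match(match_type, criteria, v)]
```

With these sample names, =select("startsWith", "Finance")= selects all four entities, while =select(None, "FinanceAll")= selects only the first.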
---+ Import API
The general approach is:
* The consumer makes a ZIP file available for the import operation. See details below for the 2 flavors of the API.
* If successful, the API returns the results of the operation.
* If the call fails, an error is returned.
---+++ Import ZIP File Using POST
|*Title*|*Import API*|
| _Example_ | See Examples sections below. |
| _Description_|Provide the contents of the file to be imported in the request body.|
| _URL_ |_api/atlas/admin/import_ |
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_|_None_|
| _Success Response_ | _!AtlasImportResult_ is returned as JSON. See details below.|
|_Error Response_|Errors that are handled within the system will be returned as _!AtlasBaseException_. |
---+++ Import ZIP File Available on Server
|*Title*|*Import API*|
| _Example_ | See Examples sections below. |
| _Description_|Provide the path of the file to be imported.|
| _URL_ |_api/atlas/admin/importfile_ |
| _Method_ |_POST_ |
| _URL Parameters_ |_?FILENAME=<path of file>_ Specify the options as name-value pairs. Use _FILENAME_ to specify the file path. |
| _Data Parameters_|_None_|
| _Success Response_ | _!AtlasImportResult_ is returned as JSON. See details below.|
|_Error Response_|Errors that are handled within the system will be returned as _!AtlasBaseException_. |
|_Notes_| The file to be imported needs to be present on the server at the location specified by the _FILENAME_ parameter.|
__Method Signature for Import__
<verbatim>
@POST
@Path("/import")
@Produces("application/json; charset=UTF-8")
@Consumes("application/octet-stream")
</verbatim>
__Method Signature for Import File__
<verbatim>
@POST
@Path("/importfile")
@Produces("application/json; charset=UTF-8")
@Consumes("application/json")
</verbatim>
__!AtlasImportResult Response__
The API will return the results of the import operation in the format defined by the _!AtlasImportResult_:
* _!AtlasImportParameters_: This contains a collection of name-value pairs for the options that are applied during the import operation.
* _Metrics_: Operation metrics. These include details on the number of types imported, number of entities imported, etc.
* _Processed Entities_: Contains the list of GUIDs for the entities that were processed.
* _Operation Status_: Overall status of the operation. Values are _SUCCESS_, _PARTIAL_SUCCESS_, _FAIL_.
---+++ Examples Using CURL Calls
The call below performs Import of _!QuickStart_ database using POST.
<verbatim>
curl -X POST -u admin:admin -H "Content-Type: application/octet-stream" -H "Cache-Control: no-cache" \
    --data-binary @quickStartDB.zip \
    "http://localhost:21000/api/atlas/admin/import" > quickStartDB-import-result.json
</verbatim>
The call below performs Import of _!QuickStart_ database using a ZIP file available on server.
<verbatim>
curl -X POST -u admin:admin -H "Cache-Control: no-cache" \
    "http://localhost:21000/api/atlas/admin/importfile?FILENAME=/root/quickStartDB.zip" > quickStartDB-import-result.json
</verbatim>
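Both import flavors can also be prepared from a program. The following Python sketch uses only the standard library; the function names are hypothetical, and the host, port and credentials mirror the CURL examples above. The _importfile_ path is written in lower case, as in the URL table and method signature.

```python
import base64
import urllib.parse
import urllib.request


def _basic_auth(user, password):
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return "Basic " + token


def make_import_request(base_url, user, password, zip_bytes):
    """POST the ZIP contents in the request body."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/atlas/admin/import",
        data=zip_bytes,
        method="POST",
        headers={
            "Content-Type": "application/octet-stream",
            "Cache-Control": "no-cache",
            "Authorization": _basic_auth(user, password),
        },
    )


def make_importfile_request(base_url, user, password, server_path):
    """POST with FILENAME pointing at a ZIP already present on the server."""
    query = urllib.parse.urlencode({"FILENAME": server_path})
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/atlas/admin/importfile?" + query,
        method="POST",
        headers={
            "Cache-Control": "no-cache",
            "Authorization": _basic_auth(user, password),
        },
    )
```

Either request can be sent with =urllib.request.urlopen(request)=, and the response body parsed as the _!AtlasImportResult_ JSON. Note that =urlencode= percent-encodes the slashes in the path, which is equivalent to the unencoded form used in the CURL call once the server decodes the query string.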
Below is the _!AtlasImportResult_ JSON for an import that contains _hive_db_.
The _processedEntities_ contains the _guids_ of all the entities imported.
The _metrics_ contain a breakdown of the types and entities imported along with the operation performed on them viz. _created_ or _updated_.
<verbatim>
{
"request": {
"options": {}
},
"userName": "admin",
"clientIpAddress": "10.0.2.2",
"hostName": "10.0.2.15",
"timeStamp": 1491285622823,
"metrics": {
"duration": 9143,
"typedef:enum": 0,
"typedef:struct": 0,
"entity:hive_column:created": 461,
"entity:hive_storagedesc:created": 20,
"entity:hive_process:created": 12,
"entity:hive_db:created": 5,
"entity:hive_table:created": 20,
"entity:hdfs_path:created": 2,
"typedef:entitydef": 0,
"typedef:classification": 3
},
"processedEntities": [
"2c4aa713-030b-4fb3-98b1-1cab23d9ac81",
"e4aa71ed-70fd-4fa7-9dfb-8250a573e293",
...
"ea0f9bdb-1dfc-4e48-9848-a006129929f9",
"b5e2cb41-3e7d-4468-84e1-d87c320e75f9"
],
"operationStatus": "SUCCESS"
}
</verbatim>
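A consumer will typically check _operationStatus_ and total the per-operation counters. The Python sketch below assumes the _entity:&lt;type&gt;:&lt;operation&gt;_ key layout seen in the sample above; the function names are hypothetical.

```python
def import_succeeded(import_result):
    """True only when the overall status is SUCCESS."""
    return import_result.get("operationStatus") == "SUCCESS"


def summarize_import_metrics(metrics):
    """Group 'entity:<type>:<operation>' counters by operation."""
    by_operation = {}
    for key, count in metrics.items():
        parts = key.split(":")
        # Skip "duration" and the two-part "typedef:*" keys.
        if len(parts) == 3 and parts[0] == "entity":
            operation = parts[2]
            by_operation[operation] = by_operation.get(operation, 0) + count
    return by_operation
```

For the sample above, the _created_ counters add up to 520 entities and no _updated_ counters are present.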
---+ Export & Import REST APIs
---+++ Background
The Import-Export APIs for Atlas facilitate transfer of data to and from a cluster that has Atlas provisioned.
When integrated with a backup and/or disaster recovery process, the APIs ensure that Atlas participates in that process.
---+++ Introduction
There are 2 broad categories viz. Export & Import. The details of the APIs are discussed below.
The APIs are available only to the _admin_ user.
Only a single import or export operation can be performed at a given time. These operations can generate a large amount of data and can also put pressure on resources. This restriction tries to alleviate the problem.
Import-Export APIs relating to HDFS paths can be found [[Import-Export-HDFS-Path][here]].
For additional information please refer to the following:
* [[https://issues.apache.org/jira/browse/ATLAS-1503][ATLAS-1503]] Original Import-Export API requirements.
* [[https://issues.apache.org/jira/browse/ATLAS-1618][ATLAS-1618]] Export API Scope Specification.
---+++ Errors
If an import or export operation is initiated while another is in progress, the consumer will receive this error:
<verbatim>
"ATLAS5005E": "Another import or export is in progress. Please try again."
</verbatim>
Unhandled errors will be returned with HTTP status code 500 (Internal Server Error).
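Because only one operation can run at a time, a client may want to wait and retry when it sees the _ATLAS5005E_ error. The Python sketch below is one possible approach, not something the API prescribes: the function names, the retry count and the delay are all hypothetical, and the error is detected simply by looking for the error code in the response text.

```python
import time

BUSY_ERROR_CODE = "ATLAS5005E"


def is_busy_error(response_body):
    """True when the response text carries the 'in progress' error code."""
    return BUSY_ERROR_CODE in response_body


def call_with_retry(operation, attempts=5, delay_seconds=30.0, sleep=time.sleep):
    """Run operation until it succeeds or stops reporting 'busy'.

    operation: zero-argument callable returning (ok, body_text).
    sleep: injectable for testing; defaults to time.sleep.
    """
    for attempt in range(1, attempts + 1):
        ok, body = operation()
        if ok:
            return body
        if not is_busy_error(body):
            # Any other failure is not retried.
            raise RuntimeError(f"request failed: {body}")
        if attempt < attempts:
            sleep(delay_seconds)
    raise TimeoutError(f"another import or export still in progress after {attempts} attempts")
```

Retrying only on the busy error keeps genuine failures, such as a malformed request, from being hidden behind repeated attempts.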
---++ REST API Reference
* __[[Export-API][Export API]]__
* __[[Export-HDFS-API][Export HDFS API]]__
* __[[Import-API][Import API]]__
@@ -56,6 +56,7 @@ allows integration with the whole enterprise data ecosystem.
---++ API Documentation
* <a href="api/v2/index.html">REST API Documentation</a>
* [[Import-Export-API][Export & Import REST API Documentation]]
* <a href="api/rest.html">Legacy API Documentation</a>
---++ Developer Setup Documentation