---+ Export API
The general approach is:
   * Consumer specifies the scope of data to be exported (details below).
   * If successful, the API returns the stream in the specified format.
   * An error is returned if the call fails.

See [[Export-HDFS-API][here]] for details on exporting *hdfs_path* entities.

|*Title*|*Export API*|
| _Example_ | See Examples sections below. |
| _URL_ |_api/atlas/admin/export_ |
| _Method_ |_POST_ |
| _URL Parameters_ |_None_ |
| _Data Parameters_| The class _!AtlasExportRequest_ specifies the items to export. Its list of _!AtlasObjectId_ entries allows multiple items to be exported in one session. Each _!AtlasObjectId_ is a tuple of entity type, name of the unique attribute, and value of the unique attribute. See examples below.|
| _Success Response_|File stream as _application/zip_.|
|_Error Response_|Errors that are handled within the system will be returned as _!AtlasBaseException_. |
| _Notes_ | The consumer can process the output of the API programmatically, for example with _java.io.ByteArrayOutputStream_, or manually save the contents of the stream to a file on disk.|

__Method Signature__
<verbatim>
@POST
@Path("/export")
@Consumes("application/json;charset=UTF-8")
</verbatim>
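
As an illustration of calling this endpoint from Java, the sketch below builds (but does not send) a matching POST request. It assumes Java 11 or later for _java.net.http_ and HTTP basic authentication, as in the CURL example in this document. The class name _!ExportRequestSketch_, the base URL and the credentials are hypothetical placeholders, not part of Atlas.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of Atlas: prepares a POST for api/atlas/admin/export.
final class ExportRequestSketch {
    // Request body with the same shape as the AtlasExportRequest examples in this document.
    static final String BODY =
        "{ \"itemsToExport\": [ { \"typeName\": \"hive_db\", "
      + "\"uniqueAttributes\": { \"qualifiedName\": \"accounts@cl1\" } } ] }";

    // Builds the request without sending it; basic auth is an assumption.
    static HttpRequest build(String baseUrl, String user, String password) {
        String credentials = Base64.getEncoder()
            .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder()
            .uri(URI.create(baseUrl + "/api/atlas/admin/export"))
            .header("Content-Type", "application/json;charset=UTF-8")
            .header("Authorization", "Basic " + credentials)
            .POST(HttpRequest.BodyPublishers.ofString(BODY))
            .build();
    }
}
```

Sending the request with _HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofFile(path))_ would write the returned ZIP stream to _path_.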

---+++ Additional Options
It is possible to specify additional parameters for the _Export_ operation.

The current implementation has two options, both optional:
   * _matchType_ This option configures the approach used for fetching the starting entity. It has the following values:
      * _startsWith_ Search for an entity that is prefixed with the specified criteria.
      * _endsWith_ Search for an entity that is suffixed with the specified criteria.
      * _contains_ Search for an entity that has the specified criteria as a sub-string.
      * _matches_ Search for an entity that is a regular expression match with the specified criteria.

   * _fetchType_ This option configures the approach used for fetching entities. It has following values:
      * _FULL_: This fetches all the entities that are connected directly and indirectly to the starting entity. For example, if the starting entity is a table, this option fetches the table, the database and all the other tables within the database.
      * _CONNECTED_: This fetches all the entities that are connected directly to the starting entity. For example, if the starting entity is a table, this option fetches only the table and the database entity.
      * _INCREMENTAL_: See [[Incremental-Export][here]] for details.

If no _matchType_ is specified, an exact match is used, which means that the entire string is used as the search criteria.

Searching using _matchType_ applies for all types of entities. It is particularly useful for matching entities of type hdfs_path (see [[Export-HDFS-API][here]]).
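
The _matchType_ semantics described above can be sketched as a small predicate. This is only an illustration of the documented behaviour, not the Atlas implementation: the class _!MatchTypeSketch_ is hypothetical, _matches_ is assumed to be a full-string regular expression match, and unknown values are assumed to fall back to exact match.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of matchType; not taken from the Atlas code base.
final class MatchTypeSketch {
    // Returns true when value satisfies criteria under the given matchType.
    // A null matchType means exact match, as the documentation states.
    static boolean isMatch(String matchType, String criteria, String value) {
        if (matchType == null) {
            return value.equals(criteria);
        }
        switch (matchType) {
            case "startsWith": return value.startsWith(criteria);
            case "endsWith":   return value.endsWith(criteria);
            case "contains":   return value.contains(criteria);
            // Assumption: the whole value must match the regular expression.
            case "matches":    return Pattern.matches(criteria, value);
            // Assumption: unrecognized values behave like exact match.
            default:           return value.equals(criteria);
        }
    }
}
```

With this sketch, the criteria _accounts@_ selects _accounts@cl1_ under _startsWith_ but not under the default exact match.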

The _fetchType_ option defaults to _FULL_.
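
To make the difference between _FULL_ and _CONNECTED_ concrete, the sketch below walks a toy relationship graph of one database and three tables, mirroring the table example given for the two values. The graph, the class _!FetchTypeSketch_ and the traversal are assumptions made for illustration; the actual Atlas traversal may apply further rules.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; not the Atlas traversal code.
final class FetchTypeSketch {
    // Toy relationship graph: one database holding three tables.
    static final Map<String, List<String>> LINKS = Map.of(
        "db", List.of("t1", "t2", "t3"),
        "t1", List.of("db"),
        "t2", List.of("db"),
        "t3", List.of("db"));

    static Set<String> fetch(String start, String fetchType) {
        Set<String> seen = new LinkedHashSet<>();
        seen.add(start);
        if ("CONNECTED".equals(fetchType)) {
            // Only the starting entity and its direct neighbours.
            seen.addAll(LINKS.getOrDefault(start, List.of()));
            return seen;
        }
        // FULL (the default): breadth-first walk over everything reachable.
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            for (String next : LINKS.getOrDefault(queue.remove(), List.of())) {
                if (seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return seen;
    }
}
```

Starting from table _t1_, _CONNECTED_ yields the table and its database, while _FULL_ also reaches the other two tables.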

For a complete example, see the Examples section below.

---+++ Contents of Exported ZIP File

The exported ZIP file has the following entries within it:
   * _atlas-export-result.json_:
      * Input filters: The scope of export.
      * File format: The format chosen for the export operation.
      * Metrics: The number of entity definitions, classifications and entities exported.
   * _atlas-typesdef.json_: Type definitions for the entities exported.
   * _atlas-export-order.json_: Order in which entities should be exported.
   * _{guid}.json_: Individual entities are exported with file names that correspond to their id.
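
A quick way to inspect these entries is to read the stream with the standard _java.util.zip_ classes. The sketch below lists entry names in archive order; the class _!ExportZipSketch_ is a hypothetical helper, and the input stream is assumed to be the response body of the export call or a saved ZIP file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Atlas: lists the entries of an exported ZIP.
final class ExportZipSketch {
    // Reads a ZIP stream and returns the entry names in archive order.
    static List<String> entryNames(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }
}
```

For a saved export, _!ExportZipSketch.entryNames(new FileInputStream("quickStartDB.zip"))_ would be expected to include the entries listed above.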

---+++ Examples
The _!AtlasExportRequest_ below shows filters that attempt to export 2 databases in cluster cl1:
<verbatim>
{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@cl1" } },
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "hr@cl1" } }
    ]
}
</verbatim>

The _!AtlasExportRequest_ below specifies the _fetchType_ as _FULL_. The _matchType_ option will fetch _accounts@cl1_.
<verbatim>
{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "qualifiedName": "accounts@" } }
    ],
    "options": {
        "fetchType": "FULL",
        "matchType": "startsWith"
    }
}
</verbatim>

The _!AtlasExportRequest_ below specifies the _fetchType_ as _CONNECTED_. The _matchType_ option will fetch _accountsReceivable_, _accountsPayable_, etc. present in the database.
<verbatim>
{
    "itemsToExport": [
       { "typeName": "hive_db", "uniqueAttributes": { "name": "accounts" } }
    ],
    "options": {
        "fetchType": "CONNECTED",
        "matchType": "startsWith"
    }
}
</verbatim>

Below is the _!AtlasExportResult_ JSON for the export of the _Sales_ DB present in the _!QuickStart_.

The _metrics_ section contains the number of types and entities exported as part of the operation.

<verbatim>
{
    "clientIpAddress": "10.0.2.15",
    "hostName": "10.0.2.2",
    "metrics": {
        "duration": 1415,
        "entitiesWithExtInfo": 12,
        "entity:DB_v1": 2,
        "entity:LoadProcess_v1": 2,
        "entity:Table_v1": 6,
        "entity:View_v1": 2,
        "typedef:Column_v1": 1,
        "typedef:DB_v1": 1,
        "typedef:LoadProcess_v1": 1,
        "typedef:StorageDesc_v1": 1,
        "typedef:Table_v1": 1,
        "typedef:View_v1": 1,
        "typedef:classification": 6
    },
    "operationStatus": "SUCCESS",
    "request": {
        "itemsToExport": [
            {
                "typeName": "DB_v1",
                "uniqueAttributes": {
                    "name": "Sales"
                }
            }
        ],
        "options": {
            "fetchType": "full"
        }
    },
    "userName": "admin"
}
</verbatim>

---+++ CURL Calls
Below is a sample CURL call that demonstrates export of the _!QuickStart_ databases.

<verbatim>
curl -X POST -u adminuser:password -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
            { "typeName": "DB", "uniqueAttributes": { "name": "Sales" } },
            { "typeName": "DB", "uniqueAttributes": { "name": "Reporting" } },
            { "typeName": "DB", "uniqueAttributes": { "name": "Logging" } }
    ],
    "options": { "fetchType": "FULL" }
}' "http://localhost:21000/api/atlas/admin/export" > quickStartDB.zip
</verbatim>