Commit fb28760a by apoorvnaik Committed by nixonrodrigues

ATLAS-2024: Updated Atlas TWiki for basic search functionality (ATLAS-1880)

parent 1298c83d
---+ Search
Atlas exposes search over the metadata in two ways:
* Basic Search
* Advanced Search (DSL or Full-Text)
---++ Basic search
Basic search lets you query by the type name of an entity and by an associated classification/tag,
and supports filtering on entity attributes as well as on classification/tag attributes.
The entire query can be represented by the following JSON structure (called !SearchParameters):
<verbatim>
{
    "typeName": "hive_table",
    "excludeDeletedEntities": true,
    "classification": "",
    "query": "",
    "limit": 25,
    "offset": 0,
    "entityFilters": {
        "attributeName": "name",
        "operator": "contains",
        "attributeValue": "testtable"
    },
    "tagFilters": null,
    "attributes": [""]
}
</verbatim>
__Field description__
* typeName: The type of entity to look for
* excludeDeletedEntities: Whether deleted entities should be excluded from the search results (default: true)
* classification: Only include entities with given Classification/tag
* query: Any free text occurrence that the entity should have (generic/wildcard queries might be slow)
* limit: Max number of results to fetch
* offset: Starting offset of the result set (useful for pagination)
* entityFilters: Entity Attribute filter(s)
* tagFilters: Classification/tag Attribute filter(s)
* attributes: Attributes to include in the search result (default: include any attribute present in the filter)
Attribute-based filtering can be applied to multiple attributes combined with an AND or OR condition.
*NOTE: The tagFilters and entityFilters fields have the same JSON structure.*
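The field list above can be sketched as a small payload builder. This is only an illustration: =search_parameters= and =pages= are hypothetical helper names, not part of Atlas, and stepping =offset= by =limit= is an assumed pagination pattern based on the field descriptions.

```python
import json


def search_parameters(type_name, limit=25, offset=0, entity_filters=None,
                      tag_filters=None, classification="", query="",
                      exclude_deleted=True, attributes=None):
    """Assemble a SearchParameters payload as a plain dict (hypothetical helper)."""
    return {
        "typeName": type_name,
        "excludeDeletedEntities": exclude_deleted,
        "classification": classification,
        "query": query,
        "limit": limit,
        "offset": offset,
        # entityFilters and tagFilters share the same JSON structure.
        "entityFilters": entity_filters,
        "tagFilters": tag_filters,
        # Mirrors the sample payload, which passes [""] when nothing is requested.
        "attributes": attributes if attributes is not None else [""],
    }


def pages(type_name, page_size, page_count, **kwargs):
    """Yield one payload per page, advancing offset in steps of page_size (assumed pattern)."""
    for page in range(page_count):
        yield search_parameters(type_name, limit=page_size,
                                offset=page * page_size, **kwargs)


for payload in pages("hive_table", page_size=25, page_count=3):
    print(json.dumps(payload))
```

Each printed line is a complete request body; only =offset= changes between pages.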
__Examples of filtering (for hive_table attributes)__
* Single attribute
<verbatim>
{
    "typeName": "hive_table",
    "excludeDeletedEntities": true,
    "classification": "",
    "query": "",
    "limit": 50,
    "offset": 0,
    "entityFilters": {
        "attributeName": "name",
        "operator": "contains",
        "attributeValue": "testtable"
    },
    "tagFilters": null,
    "attributes": [""]
}
</verbatim>
* Multi-attribute with OR
<verbatim>
{
    "typeName": "hive_table",
    "excludeDeletedEntities": true,
    "classification": "",
    "query": "",
    "limit": 50,
    "offset": 0,
    "entityFilters": {
        "condition": "OR",
        "criterion": [
            {
                "attributeName": "name",
                "operator": "contains",
                "attributeValue": "testtable"
            },
            {
                "attributeName": "owner",
                "operator": "eq",
                "attributeValue": "admin"
            }
        ]
    },
    "tagFilters": null,
    "attributes": [""]
}
</verbatim>
* Multi-attribute with AND
<verbatim>
{
    "typeName": "hive_table",
    "excludeDeletedEntities": true,
    "classification": "",
    "query": "",
    "limit": 50,
    "offset": 0,
    "entityFilters": {
        "condition": "AND",
        "criterion": [
            {
                "attributeName": "name",
                "operator": "contains",
                "attributeValue": "testtable"
            },
            {
                "attributeName": "owner",
                "operator": "eq",
                "attributeValue": "admin"
            }
        ]
    },
    "tagFilters": null,
    "attributes": [""]
}
</verbatim>
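The single-attribute and multi-attribute filter shapes shown in the examples can be produced programmatically. The sketch below is a minimal illustration; =criterion= and =combine= are hypothetical helper names chosen here, and the code only covers the flat AND/OR shapes that appear in the examples.

```python
def criterion(attribute_name, operator, attribute_value):
    """Build a single-attribute filter in the shape used by entityFilters/tagFilters."""
    return {
        "attributeName": attribute_name,
        "operator": operator,
        "attributeValue": attribute_value,
    }


def combine(condition, *criteria):
    """Group several single-attribute filters under an AND or OR condition."""
    if condition not in ("AND", "OR"):
        raise ValueError("condition must be 'AND' or 'OR'")
    if not criteria:
        raise ValueError("at least one criterion is required")
    return {"condition": condition, "criterion": list(criteria)}


# Reproduces the "Multi-attribute with OR" entityFilters value.
entity_filters = combine(
    "OR",
    criterion("name", "contains", "testtable"),
    criterion("owner", "eq", "admin"),
)
print(entity_filters)
```

Because tagFilters has the same structure, the same helpers can be reused for classification/tag attribute filters.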
__Supported operators for filtering__
* LT (symbols: <, lt) works with Numeric, Date attributes
* GT (symbols: >, gt) works with Numeric, Date attributes
* LTE (symbols: <=, lte) works with Numeric, Date attributes
* GTE (symbols: >=, gte) works with Numeric, Date attributes
* EQ (symbols: eq, =) works with Numeric, Date, String attributes
* NEQ (symbols: neq, !=) works with Numeric, Date, String attributes
* LIKE (symbols: like, LIKE) works with String attributes
* STARTS_WITH (symbols: startsWith, STARTSWITH) works with String attributes
* ENDS_WITH (symbols: endsWith, ENDSWITH) works with String attributes
* CONTAINS (symbols: contains, CONTAINS) works with String attributes
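The operator list above can be captured as a lookup table for client-side validation before a request is sent. This is a sketch only: the table is transcribed from the list, while the names =OPERATORS=, =resolve_operator= and =supports= and the attribute kind labels are assumptions made for illustration.

```python
NUMERIC, DATE, STRING = "numeric", "date", "string"

# Operator name -> (accepted symbols, attribute kinds it works with),
# transcribed from the supported-operator list.
OPERATORS = {
    "LT": (("<", "lt"), {NUMERIC, DATE}),
    "GT": ((">", "gt"), {NUMERIC, DATE}),
    "LTE": (("<=", "lte"), {NUMERIC, DATE}),
    "GTE": ((">=", "gte"), {NUMERIC, DATE}),
    "EQ": (("eq", "="), {NUMERIC, DATE, STRING}),
    "NEQ": (("neq", "!="), {NUMERIC, DATE, STRING}),
    "LIKE": (("like", "LIKE"), {STRING}),
    "STARTS_WITH": (("startsWith", "STARTSWITH"), {STRING}),
    "ENDS_WITH": (("endsWith", "ENDSWITH"), {STRING}),
    "CONTAINS": (("contains", "CONTAINS"), {STRING}),
}

SYMBOL_TO_OPERATOR = {
    symbol: name
    for name, (symbols, _kinds) in OPERATORS.items()
    for symbol in symbols
}


def resolve_operator(symbol):
    """Map a symbol such as '<=' or 'contains' to its operator name."""
    try:
        return SYMBOL_TO_OPERATOR[symbol]
    except KeyError:
        raise ValueError(f"unsupported operator symbol: {symbol!r}") from None


def supports(symbol, attribute_kind):
    """Return True if the operator behind symbol works with the given attribute kind."""
    return attribute_kind in OPERATORS[resolve_operator(symbol)][1]


print(resolve_operator("<="), supports("contains", STRING), supports("contains", NUMERIC))
```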
__CURL Samples__
<verbatim>
# POST the SearchParameters JSON to the basic search endpoint.
# The JSON body is sent with an explicit Content-Type header.
curl -sivk -g \
  -u <user>:<password> \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{
    "typeName": "hive_table",
    "excludeDeletedEntities": true,
    "classification": "",
    "query": "",
    "limit": 50,
    "offset": 0,
    "entityFilters": {
        "condition": "AND",
        "criterion": [
            {
                "attributeName": "name",
                "operator": "contains",
                "attributeValue": "testtable"
            },
            {
                "attributeName": "owner",
                "operator": "eq",
                "attributeValue": "admin"
            }
        ]
    },
    "tagFilters": null,
    "attributes": [""]
  }' \
  <protocol>://<atlas_host>:<atlas_port>/api/atlas/v2/search/basic
</verbatim>
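For clients that prefer a script over curl, the same request can be assembled with the Python standard library. This is a sketch under assumptions: =build_basic_search_request= is a hypothetical helper, the host, port and credentials are placeholders, and the =Content-Type: application/json= header is assumed to be what the endpoint expects for a JSON body. The request is only built here, not sent.

```python
import base64
import json
import urllib.request


def build_basic_search_request(base_url, user, password, payload):
    """Build (but do not send) a POST request for the basic search endpoint."""
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/atlas/v2/search/basic",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",          # assumed request media type
            "Authorization": "Basic " + credentials,     # equivalent of curl -u
        },
        method="POST",
    )


payload = {
    "typeName": "hive_table",
    "excludeDeletedEntities": True,
    "classification": "",
    "query": "",
    "limit": 50,
    "offset": 0,
    "entityFilters": {
        "condition": "AND",
        "criterion": [
            {"attributeName": "name", "operator": "contains", "attributeValue": "testtable"},
            {"attributeName": "owner", "operator": "eq", "attributeValue": "admin"},
        ],
    },
    "tagFilters": None,
    "attributes": [""],
}

# Placeholder endpoint and credentials; substitute real values before sending.
request = build_basic_search_request("http://localhost:21000", "admin", "admin", payload)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```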
---++ Advanced Search
---+++ Search DSL Grammar
The DSL exposes an SQL-like query language for searching the metadata based on the type system.
The grammar for the DSL is below.
@@ -16,9 +181,9 @@ query: querySrc ~ opt(loopExpression) ~ opt(groupByExpr) ~ opt(selectClause) ~ o
querySrc: rep1sep(singleQrySrc, opt(COMMA))
singleQrySrc = FROM ~ fromSrc ~ opt(WHERE) ~ opt(expr ^? notIdExpression) |
        WHERE ~ (expr ^? notIdExpression) |
        expr ^? notIdExpression |
        fromSrc ~ opt(WHERE) ~ opt(expr ^? notIdExpression)
fromSrc: identifier ~ AS ~ alias | identifier
@@ -51,10 +216,10 @@ expr: compE ~ opt(rep(exprRight))
exprRight: (AND | OR) ~ compE
compE:
        arithE ~ (LT | LTE | EQ | NEQ | GT | GTE) ~ arithE |
        arithE ~ (ISA | IS) ~ ident |
        arithE ~ HAS ~ ident |
        arithE | countClause | maxClause | minClause | sumClause
arithE: multiE ~ opt(rep(arithERight))
@@ -71,18 +236,18 @@ identifier: rep1sep(ident, DOT)
alias: ident | stringLit
literal: booleanConstant |
        intConstant |
        longConstant |
        floatConstant |
        doubleConstant |
        stringLit
</verbatim>
Grammar language:
{noformat}
opt(a)    => a is optional
~         => a combinator. 'a ~ b' means a followed by b
rep       => zero or more
rep1sep   => one or more, separated by second arg.
{noformat}
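To make the grammar notation concrete, the four combinators can be sketched over a list of tokens. This is an illustrative Python rendering, not the parser Atlas uses; the function names =token=, =seq= and =alt= and the token-list representation are assumptions, and =seq= stands in for the =~= combinator.

```python
def token(expected):
    """Match exactly one token."""
    def parse(tokens, pos):
        if pos < len(tokens) and tokens[pos] == expected:
            return [expected], pos + 1
        return None
    return parse


def seq(*parsers):
    """'a ~ b': a followed by b."""
    def parse(tokens, pos):
        matched = []
        for parser in parsers:
            result = parser(tokens, pos)
            if result is None:
                return None
            part, pos = result
            matched.extend(part)
        return matched, pos
    return parse


def alt(*parsers):
    """'a | b': the first alternative that matches."""
    def parse(tokens, pos):
        for parser in parsers:
            result = parser(tokens, pos)
            if result is not None:
                return result
        return None
    return parse


def opt(parser):
    """opt(a): a is optional, so a failed match consumes nothing."""
    def parse(tokens, pos):
        result = parser(tokens, pos)
        return result if result is not None else ([], pos)
    return parse


def rep(parser):
    """rep(a): zero or more occurrences of a."""
    def parse(tokens, pos):
        matched = []
        while True:
            result = parser(tokens, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                return matched, pos
            matched.extend(result[0])
            pos = result[1]
    return parse


def rep1sep(parser, separator):
    """rep1sep(a, s): one or more a, separated by s."""
    return seq(parser, rep(seq(separator, parser)))


# expr: compE ~ opt(rep(exprRight)); exprRight: (AND | OR) ~ compE
comp_e = token("x")  # stand-in for a real comparison expression
expr_right = seq(alt(token("and"), token("or")), comp_e)
expr = seq(comp_e, opt(rep(expr_right)))
print(expr(["x", "and", "x", "or", "x"], 0))
```

Each parser returns the matched tokens and the next position, or =None= when it does not match.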
@@ -99,14 +264,14 @@ Language Notes:
see scaladoc for 'org.apache.atlas.query.ClosureQuery')
* GROUPBY is optional. Group by can be specified with aggregate methods such as max, min, sum and count. When group by is specified, aggregated results are returned based on the method specified in the select clause. A select expression is mandatory with a group by expression.
* ORDERBY is optional. When an order by clause is specified, case-insensitive sorting is done on the specified column.
For sorting in descending order, specify 'DESC' after the order by clause. If no order by is specified, no default sorting is applied.
* LIMIT is optional. It limits the maximum number of objects to be fetched, starting from the optional offset. If no offset is specified, the count starts from the beginning.
* There are a couple of predicate functions that differ from SQL:
   * _is_ or _isa_ can be used to filter Entities that have a particular Trait.
   * _has_ can be used to filter Entities that have a value for a particular Attribute.
* Any identifiers or constants with special characters (space, $, ", {, }) should be enclosed within backquotes (`)
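The backquote rule in the last note can be applied mechanically when DSL queries are generated from code. The sketch below is a minimal illustration; =quote_dsl_token= is a hypothetical helper, and it assumes the input does not itself contain a backquote, a case the notes above do not cover.

```python
# Special characters named in the language notes: space, $, ", {, }
SPECIAL_CHARACTERS = set(' $"{}')


def quote_dsl_token(text):
    """Wrap an identifier or constant in backquotes if it contains a special character."""
    if "`" in text:
        # Escaping rules for embedded backquotes are not described, so refuse.
        raise ValueError("backquote inside a token is not handled by this sketch")
    if any(ch in SPECIAL_CHARACTERS for ch in text):
        return "`" + text + "`"
    return text


print(quote_dsl_token("Log Data"))
print(quote_dsl_token("hive_table"))
```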
---++++ DSL Examples
For the model,
Asset - attributes name, owner, description
DB - supertype Asset - attributes clusterName, parameters, comment
@@ -133,6 +298,6 @@ DSL queries:
* from Person select count() as 'count', max(Person.age) as 'max', min(Person.age)
* `Log Data`
---+++ Full-text Search
Atlas also exposes a Lucene-style full-text search capability.
\ No newline at end of file