Virtual Data Source

Models

VirtualDataSourceItem

Sub-model used in the parent Models of Schema, Table, Column and Index.

Attributes:

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object
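
The examples later on this page build keys by joining the data source id with the schema, table, and column names using dots, for example 36.test.Orders.Order_number. A minimal sketch of that convention follows. build_vds_key is a hypothetical helper, not part of allie_sdk, and it assumes that no name part contains a dot.

```python
def build_vds_key(ds_id: int, *names: str) -> str:
    """Join a data source id and object names into a dotted key.

    Hypothetical helper (not part of allie_sdk).
    Assumes that no name part contains a dot.
    """
    if not names:
        raise ValueError("at least a schema name is required")
    for name in names:
        if not name or "." in name:
            raise ValueError(f"invalid name part: {name!r}")
    return ".".join([str(ds_id), *names])


schema_key = build_vds_key(36, "test")
table_key = build_vds_key(36, "test", "Orders")
column_key = build_vds_key(36, "test", "Orders", "Order_number")
print(schema_key, table_key, column_key)
```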

VirtualDataSourceParams

These optional Model items can be used to update title and descriptions or remove VDS objects.

Attributes:

Name Type Description
set_title_descs boolean Specifies whether the title and description of the objects passed should be updated. When set to true, the title and description are added for objects that are new to the database’s metadata and updated for objects that already exist.
remove_not_seen boolean Specifies whether the technical metadata of a data source should be deleted. In the removal example on this page, it is set to true together with an empty vds_objects list to remove all objects.
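
post_metadata describes these items as query parameters of the POST request. As a rough illustration of what that means, the sketch below encodes the two flags into a query string with the standard library only. The lowercase "true"/"false" form is taken from the examples on this page; how allie_sdk serializes the values internally is an assumption, and encode_vds_params is a hypothetical name.

```python
from urllib.parse import urlencode


def encode_vds_params(set_title_descs: bool, remove_not_seen: bool) -> str:
    """Encode the two flags as a URL query string (illustrative only)."""
    flags = {
        "set_title_descs": set_title_descs,
        "remove_not_seen": remove_not_seen,
    }
    # Render Python booleans as the lowercase strings used in the examples.
    return urlencode({name: str(value).lower() for name, value in flags.items()})


query = encode_vds_params(set_title_descs=True, remove_not_seen=False)
print(query)
```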

VirtualDataSourceSchema

Python object used to create a Schema in Alation virtual data source and passed in the parameter vds_objects in the function post_metadata.

Attributes: Key, description, title are inherited from VirtualDataSourceItem. The key is required.

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object

VirtualDataSourceTable

Python object used to create a Table in Alation and passed in the parameter vds_objects as a list in the function post_metadata.

Attributes: Key, description, title are inherited from VirtualDataSourceItem. The key is required.

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object
data_location str A URI or file path to the location of the underlying data, such as an HDFS URL for a Hive table.
db_owner str Name of the database account that owns this table.
definition_sql str CREATE TABLE statement which was used to create the table.
constraint_text str Constraint statements which are enforced by the DB.
ts_created str Timestamp at which the table or view was created. Example: 2018-03-13T22:09:33Z
ts_last_altered str Timestamp of the last ALTER statement executed against this table. Example: 2018-03-13T22:09:33Z
partitioning_attributes list An array of columns which are used to partition the table. Example: ["column1", "column2"]
bucket_attributes list An array of columns which are used to bucket the table (in data sources like Hive, bucketing is an alternative mechanism to partitioning for grouping similar data together: LanguageManualDDL-BucketedTables). Example: ["column1", "column2"]
sort_attributes list An array of columns used to sort the table (in Hive, used with bucketing to store data for faster computation: LanguageManualDDL-BucketedSortedTables). Example: ["column1", "column2"]
synonyms list An array of other names that can be used to refer to this table. Each synonym is a JSON object with a schema_name and a table_name. Example: [{"schema_name": "schema_a", "table_name": "table_a"}, {"schema_name": "schema_b", "table_name": "table_b"}]
skews_info dict A JSON object that maps each skew column name to an array of the values that appear very often in that column. Example: {"column1": ["column1_value1", "column1_value2"], "column2": ["column2_value1", "column2_value2"]}
table_comment str A comment field that stores a description of the table which is ingested from the source system. Example: "This Table is created by ELT"
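
Several of the table attributes expect a specific shape: UTC timestamps in the format shown above, lists of column names, a list of synonym objects, and a mapping from skew columns to values. The sketch below assembles such values as a plain dictionary and serializes them to JSON so that the shapes are visible. It does not use allie_sdk, and the sample names and values are made up for illustration.

```python
import json
from datetime import datetime, timezone

created = datetime(2018, 3, 13, 22, 9, 33, tzinfo=timezone.utc)

table_attributes = {
    "key": "36.test.Orders",
    # Timestamps follow the documented example format: 2018-03-13T22:09:33Z
    "ts_created": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
    "partitioning_attributes": ["column1", "column2"],
    "bucket_attributes": ["column1"],
    "sort_attributes": ["column2"],
    # Each synonym names a schema and a table.
    "synonyms": [{"schema_name": "schema_a", "table_name": "table_a"}],
    # Skew column name -> values that appear very often in that column.
    "skews_info": {"column1": ["column1_value1", "column1_value2"]},
    "table_comment": "This Table is created by ELT",
}

payload = json.dumps(table_attributes, indent=2)
print(payload)
```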

VirtualDataSourceView

Python object used to create a View in Alation and passed in the parameter vds_objects as a list in the function post_metadata.

Attributes: Key, description, title are inherited from VirtualDataSourceItem. The key is required.

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object
db_owner str Name of the database account that owns this view.
view_sql str CREATE VIEW statement which was used to create the view.
view_sql_expanded str CREATE VIEW statement with fully qualified object references.
ts_created str Timestamp at which the table or view was created. Example: 2018-03-13T22:09:33Z
ts_last_altered str Timestamp of the last ALTER statement executed against this view. Example: 2018-03-13T22:09:33Z
partitioning_attributes list An array of columns which are used to partition the table. Example: ["column1", "column2"]
bucket_attributes list An array of columns which are used to bucket the table (in data sources like Hive, bucketing is an alternative mechanism to partitioning for grouping similar data together: LanguageManualDDL-BucketedTables). Example: ["column1", "column2"]
sort_attributes list An array of columns used to sort the table (in Hive, used with bucketing to store data for faster computation: LanguageManualDDL-BucketedSortedTables). Example: ["column1", "column2"]
synonyms list An array of other names that can be used to refer to this view. Each synonym is a JSON object with a schema_name and a table_name. Example: [{"schema_name": "schema_a", "table_name": "table_a"}, {"schema_name": "schema_b", "table_name": "table_b"}]
skews_info dict A JSON object that maps each skew column name to an array of the values that appear very often in that column. Example: {"column1": ["column1_value1", "column1_value2"], "column2": ["column2_value1", "column2_value2"]}
table_comment str A comment field that stores a description of the view which is ingested from the source system. Example: "This Table is created by ELT"

VirtualDataSourceColumn

Python object used to create a Column in Alation and passed in the parameter vds_objects as a list in the function post_metadata.

Attributes: Key, description, title are inherited from VirtualDataSourceItem. The key is required.

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object
column_type str The type of the column. The value can be any column type supported by the underlying database.
position int Position of the column in the table which contains it.
NOTE:
1) This value must be a positive integer.
2) When specifying a column, make sure the table it belongs to is already part of the database’s metadata.
column_comment str A comment field that stores a description of the column which is ingested from the source system.
nullable bool Field to indicate if the column can be nullable. Set this to true if the column is a nullable field, false otherwise.
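
The note on position requires a positive integer. A small sketch of how that rule could be checked before building column objects is shown below. validate_position is a hypothetical helper, not part of allie_sdk, and numbering the columns from 1 with enumerate is an assumption about how positions are usually assigned.

```python
def validate_position(position) -> int:
    """Return position unchanged if it is a positive integer, else raise."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(position, bool) or not isinstance(position, int):
        raise TypeError("position must be an integer")
    if position < 1:
        raise ValueError("position must be a positive integer")
    return position


# Assign 1-based positions in the order the columns appear in the table.
column_names = ["Order_number", "Customer_id", "Order_date"]
positions = {
    name: validate_position(index)
    for index, name in enumerate(column_names, start=1)
}
print(positions)
```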

VirtualDataSourceIndex

Python object used to create an Index in Alation and passed in the parameter vds_objects as a list in the function post_metadata.

Attributes: Key, description, title are inherited from VirtualDataSourceItem. The key, index_type and column_names are required.

Name Type Description
key str Key of the Virtual Data Source object
description str Description of the Virtual Data Source object
title str Title of the Virtual Data Source object
index_type str The type of the index. The value must be one of: 'PRIMARY', 'SECONDARY', 'PARTITIONED_PRIMARY', 'UNIQUE', 'OTHER'. Example: "PRIMARY". The presence of this field distinguishes an index object from a column.
NOTE:
1) When specifying an index, make sure the table it corresponds to is already part of the database’s metadata.
2) This field is required even for an index upsert.
column_names list An array of column names on which the index is defined. Example: ["column1"]. If the index is composite, this array has multiple column names.
NOTE:
1) This cannot be an empty array.
2) When specifying an index, make sure the columns it corresponds to are already part of the database’s metadata.
3) For an index upsert, this field is optional.
4) The order of the column names matters, because it defines the sequence of the columns in a composite index.
data_structure str The underlying data structure used by the index. The value must be one of: 'BTREE', 'HASH', 'BITMAP', 'DENSE', 'SPARSE', 'REVERSE', 'OTHER', 'NONE'. Example: "BTREE". Default: "NONE".
index_type_detail str A string containing custom detailed information about the index. Example: "MULTI_COLUMN_STATISTICS"
is_ascending bool Set to True if the index is created in ascending order, otherwise False. NOTE: This is not valid for a composite index.
filter_condition str Filter condition used when creating an index for a portion of the rows in the table. Example: "([filteredIndexCol]>(0))"
NOTE: This is not valid for a composite index.
is_foreign_key bool Set to True if the index is a foreign key.
NOTE: When this is True, the fields 'foreign_key_table_name' and 'foreign_key_column_names' are required.
foreign_key_table_name str The key of the parent table object which the foreign index refers to. Example: "7.schema_a.table_a"
NOTE: This is required only if 'is_foreign_key' is set to True. Make sure the table it corresponds to is already part of the database’s metadata.
foreign_key_column_names list An array of column names on the parent table object which the foreign index refers to. Example: ["column1"]
NOTE:
1) This is required only if 'is_foreign_key' is set to True.
2) Make sure the columns it corresponds to are already part of the database’s metadata.
3) The number of columns here should match the number of columns in the 'column_names' field.
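
The index rules above can be gathered into a single pre-flight check. The sketch below works on a plain dictionary rather than on the SDK model, and check_index is a hypothetical helper that is not part of allie_sdk. It covers the case of creating an index, so it does not model the relaxed column_names rule for upserts, and it cannot verify that the referenced tables and columns already exist.

```python
INDEX_TYPES = {"PRIMARY", "SECONDARY", "PARTITIONED_PRIMARY", "UNIQUE", "OTHER"}
DATA_STRUCTURES = {
    "BTREE", "HASH", "BITMAP", "DENSE", "SPARSE", "REVERSE", "OTHER", "NONE",
}


def check_index(index: dict) -> list[str]:
    """Return the problems found in an index description (empty if none)."""
    problems = []
    if index.get("index_type") not in INDEX_TYPES:
        problems.append("index_type must be one of the documented values")
    columns = index.get("column_names") or []
    if not columns:
        problems.append("column_names cannot be empty")
    if index.get("data_structure", "NONE") not in DATA_STRUCTURES:
        problems.append("data_structure must be one of the documented values")
    # is_ascending and filter_condition are not valid for composite indexes.
    if len(columns) > 1:
        for field in ("is_ascending", "filter_condition"):
            if index.get(field) is not None:
                problems.append(f"{field} is not valid for a composite index")
    if index.get("is_foreign_key"):
        fk_columns = index.get("foreign_key_column_names") or []
        if not index.get("foreign_key_table_name") or not fk_columns:
            problems.append("foreign key indexes need a table name and column names")
        elif len(fk_columns) != len(columns):
            problems.append("foreign_key_column_names must match column_names in length")
    return problems


fk_index = {
    "index_type": "SECONDARY",
    "column_names": ["customer_id"],
    "is_foreign_key": True,
    "foreign_key_table_name": "7.schema_a.table_a",
    "foreign_key_column_names": ["id"],
}
print(check_index(fk_index))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue with an object in one pass.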

Methods

post_metadata

post_metadata(ds_id: int, vds_objects: list, query_params: VirtualDataSourceParams = None) -> list[JobDetailsVirtualDatasourcePost]

Add/Update/Remove Virtual Data Source Objects

Args:

  • ds_id (int): Virtual data source id.
  • vds_objects (list): Virtual Data Source object list.
  • query_params (VirtualDataSourceParams): Query Params for the POST request.

Returns:

  • List of JobDetailsVirtualDatasourcePost: Status report of the executed background jobs.
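
The model notes ask that a table already exist before its columns and indexes are specified, and that columns exist before an index that uses them. One way to respect this when assembling vds_objects is to put parents first. The sketch below uses plain dictionaries in place of the SDK models; parent_first is a hypothetical helper, and it assumes that names inside a key contain no dots. Whether ordering within a single request is actually required is not stated here, so treat this as a precaution.

```python
def parent_first(objects: list[dict]) -> list[dict]:
    """Sort objects so schemas come before tables, columns, and indexes."""

    def rank(obj: dict) -> tuple[int, int]:
        # Key depth: schema = 1 dot, table = 2 dots, column or index = 3 dots.
        depth = obj["key"].count(".")
        # The presence of index_type distinguishes an index from a column.
        is_index = 1 if "index_type" in obj else 0
        return (depth, is_index)

    return sorted(objects, key=rank)


objects = [
    {"key": "36.test.Orders.index", "index_type": "PRIMARY",
     "column_names": ["Order_number"]},
    {"key": "36.test.Orders.Order_number", "column_type": "int"},
    {"key": "36.test.Orders"},
    {"key": "36.test"},
]
ordered_keys = [obj["key"] for obj in parent_first(objects)]
print(ordered_keys)
```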

Examples

Post Virtual Data Source Objects Add/Update

import allie_sdk as allie

alation = allie.Alation(
    host='<HOST>',
    user_id=<USER_ID>,
    refresh_token='<REFRESH_TOKEN>')

# Add/Update Objects   
ds_id = 36
ds_schema = "test"

# create the schema first and then submit the payload, otherwise the parser throws an error
s1 = allie.VirtualDataSourceSchema()
s1.key = f'{ds_id}.{ds_schema}'
s1.description = "New Schema for API testing"

t1 = allie.VirtualDataSourceTable()
t1.key = f'{ds_id}.{ds_schema}.Orders'
t1.table_type = "TABLE"
t1.table_comment = "This is a sample table created with Allie SDK for Virtual Data Sources"
t1.title = "ORDERS"
t1.description = "Orders Table python description"
t1.data_location = "//hive_table_location_orders"
t1.definition_sql = "create table select from order_header"

v1 = allie.VirtualDataSourceView()
v1.key = f'{ds_id}.{ds_schema}.Orders_View'
v1.table_type = "VIEW"
v1.table_comment = "This is a sample view created with Allie SDK for Virtual Data Sources"
v1.title = "ORDERS VIEW"
v1.description = "Orders View python description"
v1.view_sql = "select * from orders"

c1 = allie.VirtualDataSourceColumn()
c1.key = f'{t1.key}.Order_number'
c1.column_type = "int"
c1.nullable = False
c1.position = 1
c1.column_comment = "This is a sample column created with Allie SDK for Virtual Data Sources"
c1.description = "Order Number for sales orders"
c1.title = "Order Number"

i1 = allie.VirtualDataSourceIndex()
i1.key = f'{t1.key}.index'
i1.index_type = "PRIMARY"
i1.column_names = [c1.key.split('.')[3]]
i1.data_structure = "BTREE"

vds_objects = [s1, t1, v1, c1, i1]

params = allie.VirtualDataSourceParams()
params.set_title_descs = "true"
params.remove_not_seen = "false"

vds_response = alation.virtual_datasource.post_metadata(ds_id=ds_id, vds_objects=vds_objects, query_params=params)

Remove Virtual Data Source Objects

# Remove All Objects   
import allie_sdk as allie

alation = allie.Alation(
    host='<HOST>',
    user_id=<USER_ID>,
    refresh_token='<REFRESH_TOKEN>')

ds_id = 36
params = allie.VirtualDataSourceParams()
params.set_title_descs = "false"
params.remove_not_seen = "true"

vds_response = alation.virtual_datasource.post_metadata(ds_id=ds_id, vds_objects=[], query_params=params)