Posted to dev@atlas.apache.org by "Ramesh Mani (JIRA)" <ji...@apache.org> on 2018/05/07 06:54:00 UTC

[jira] [Created] (ATLAS-2649) Hive Hook should create lineage entities when storage handler mechanism to create hbase tables via hive

Ramesh Mani created ATLAS-2649:
----------------------------------

             Summary: Hive Hook should create lineage entities when storage handler mechanism to create hbase tables via hive
                 Key: ATLAS-2649
                 URL: https://issues.apache.org/jira/browse/ATLAS-2649
             Project: Atlas
          Issue Type: Bug
            Reporter: Ramesh Mani


The Hive hook should create lineage entities when the storage handler mechanism is used to create HBase tables via Hive.

When Hive on HBase is set up through Hive's HBaseStorageHandler mechanism, a corresponding HBase table is created in HBase and the data is stored in it. In this case the Hive hook should create a lineage process with the Hive table as input and the HBase table as output.

e.g.

CREATE TABLE hbase_table_emp(id int, name string, role string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
TBLPROPERTIES ("hbase.table.name" = "emp");

This will create a corresponding HBase table, emp:

hbase(main):003:0> list
TABLE
ATLAS_ENTITY_AUDIT_EVENTS
atlas_janus
emp
3 row(s)
Took 0.0127 seconds
=> ["ATLAS_ENTITY_AUDIT_EVENTS", "atlas_janus", "emp"]
hbase(main):004:0> describe 'emp'
Table emp is ENABLED
emp
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf1', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false',
KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER',
MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false',
IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE',
BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
1 row(s)
Took 0.1961 seconds

 

In this scenario the Hive hook should record the lineage from the Hive table to the HBase table that stores its data.
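
The lineage requested for the example above could be sketched roughly as follows. This is an illustration only, not Atlas hook code: the function build_lineage, the shape of the returned dict, the cluster name "primary", and the qualifiedName formats (db.table@cluster for the Hive table, namespace:table@cluster for the HBase table) are all assumptions made for the sketch.

```python
# Illustrative sketch only; this is NOT the Atlas Hive hook implementation.
# build_lineage, the dict layout and the qualifiedName formats are assumptions.
HBASE_STORAGE_HANDLER = "org.apache.hadoop.hive.hbase.HBaseStorageHandler"


def build_lineage(hive_db, hive_table, storage_handler, tbl_properties, cluster="primary"):
    """Describe the expected lineage for a Hive table backed by HBase.

    Returns None when the table does not use the HBase storage handler.
    """
    if storage_handler != HBASE_STORAGE_HANDLER:
        return None

    # Hive uses the Hive table name when "hbase.table.name" is not set.
    hbase_name = tbl_properties.get("hbase.table.name", hive_table)

    # HBase table names may carry a namespace prefix ("ns:table");
    # tables without one live in the "default" namespace.
    namespace, _, table = hbase_name.rpartition(":")
    namespace = namespace or "default"

    return {
        # Input: the Hive table named in the CREATE TABLE statement.
        "inputs": [
            {"typeName": "hive_table",
             "qualifiedName": f"{hive_db}.{hive_table}@{cluster}"}
        ],
        # Output: the HBase table that actually stores the data.
        "outputs": [
            {"typeName": "hbase_table",
             "qualifiedName": f"{namespace}:{table}@{cluster}"}
        ],
    }


# Values taken from the CREATE TABLE statement in this issue.
lineage = build_lineage(
    "default",
    "hbase_table_emp",
    HBASE_STORAGE_HANDLER,
    {"hbase.table.name": "emp"},
)
print(lineage)
```

For the table in this issue, the sketch pairs the Hive table hbase_table_emp (input) with the HBase table emp (output). A table that does not use the HBase storage handler yields no lineage in this sketch.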

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)