Posted to issues-all@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2023/06/23 09:59:00 UTC

[jira] [Assigned] (IMPALA-12237) Add information about the table type in the lineage log

     [ https://issues.apache.org/jira/browse/IMPALA-12237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoltán Borók-Nagy reassigned IMPALA-12237:
------------------------------------------

    Assignee: Zoltán Borók-Nagy

> Add information about the table type in the lineage log
> -------------------------------------------------------
>
>                 Key: IMPALA-12237
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12237
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>            Reporter: Zoltán Borók-Nagy
>            Assignee: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: impala-iceberg
>
> Atlas needs table type information to build the lineage graph correctly.
> Currently, the lineage log for a CTAS statement contains the following:
> {noformat}
> {
>   "queryText": "create table lineage_ctas as select * from lineage_test",
>   "queryId": "774232610e386de9:8111ae3500000000",
>   "hash": "ed91deffcdc11c442c2420da3b33d3b3",
>   "user": "boroknagyz",
>   "timestamp": 1687351038,
>   "endTime": 1687351038,
>   "edges": [
>     {
>       "sources": [
>         1
>       ],
>       "targets": [
>         0
>       ],
>       "edgeType": "PROJECTION"
>     }
>   ],
>   "vertices": [
>     {
>       "id": 0,
>       "vertexType": "COLUMN",
>       "vertexId": "i",
>       "metadata": {
>         "tableName": "default.lineage_ctas",
>         "tableCreateTime": 1687351038
>       }
>     },
>     {
>       "id": 1,
>       "vertexType": "COLUMN",
>       "vertexId": "default.lineage_test.i",
>       "metadata": {
>         "tableName": "default.lineage_test",
>         "tableCreateTime": 1687351020
>       }
>     }
>   ]
> }
> {noformat}
> Under "vertices", this is what Atlas would like to see:
> {noformat}
> "vertices": [
>     {
>       "id": 0,
>       "vertexType": "COLUMN",
>       "vertexId": "i",
>       "metadata": {
>         "tableName": "default.lineage_ctas",
>         "tableType": "iceberg",
>         "tableCreateTime": 1687351038
>       }
>     },
>     {
>       "id": 1,
>       "vertexType": "COLUMN",
>       "vertexId": "default.lineage_test.i",
>       "metadata": {
>         "tableName": "default.lineage_test",
>         "tableType": "hive",
>         "tableCreateTime": 1687351020
>       }
>     }
>   ]
> {noformat}
> So each vertex's metadata should contain a new field: 'tableType'. For FS-based tables the value should be "hive", except for Iceberg tables, for which it should be "iceberg". For Kudu tables it should be "kudu", and for HBase tables it should be "hbase".
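
The mapping requested above could be sketched roughly as follows. This is only an illustration in Python, not the Impala frontend code: the function names and the storage labels "fs", "kudu" and "hbase" are hypothetical, and only the output strings ("hive", "iceberg", "kudu", "hbase") come from the issue description.

```python
# Hypothetical sketch: function names and storage labels are assumptions made
# for illustration; they are not identifiers from the Impala code base.

def lineage_table_type(storage: str, is_iceberg: bool = False) -> str:
    """Return the 'tableType' value described in the issue.

    storage is one of the hypothetical labels "fs", "kudu" or "hbase".
    Iceberg tables are FS-based but are reported as "iceberg".
    """
    if is_iceberg:
        return "iceberg"
    mapping = {"fs": "hive", "kudu": "kudu", "hbase": "hbase"}
    if storage not in mapping:
        raise ValueError(f"unknown storage kind: {storage!r}")
    return mapping[storage]


def with_table_type(vertex: dict, table_type: str) -> dict:
    """Return a copy of a lineage vertex with 'tableType' added to its metadata."""
    metadata = dict(vertex.get("metadata", {}))
    metadata["tableType"] = table_type
    # The input vertex is left unmodified; a new dict is returned.
    return {**vertex, "metadata": metadata}


# Example vertex taken from the lineage log quoted in the issue.
source_vertex = {
    "id": 1,
    "vertexType": "COLUMN",
    "vertexId": "default.lineage_test.i",
    "metadata": {
        "tableName": "default.lineage_test",
        "tableCreateTime": 1687351020,
    },
}
annotated = with_table_type(source_vertex, lineage_table_type("fs"))
```

Here the new key is appended after "tableCreateTime", whereas the desired output above lists it before; key order carries no meaning in JSON, so a consumer should not rely on it.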



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
