Posted to issues-all@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2023/06/23 09:59:00 UTC

[jira] [Created] (IMPALA-12237) Add information about the table type in the lineage log

Zoltán Borók-Nagy created IMPALA-12237:
------------------------------------------

             Summary: Add information about the table type in the lineage log
                 Key: IMPALA-12237
                 URL: https://issues.apache.org/jira/browse/IMPALA-12237
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
            Reporter: Zoltán Borók-Nagy


Atlas needs table type information to correctly build the lineage graph.

Currently, the lineage log for a CTAS statement looks like this:
{noformat}
{
  "queryText": "create table lineage_ctas as select * from lineage_test",
  "queryId": "774232610e386de9:8111ae3500000000",
  "hash": "ed91deffcdc11c442c2420da3b33d3b3",
  "user": "boroknagyz",
  "timestamp": 1687351038,
  "endTime": 1687351038,
  "edges": [
    {
      "sources": [
        1
      ],
      "targets": [
        0
      ],
      "edgeType": "PROJECTION"
    }
  ],
  "vertices": [
    {
      "id": 0,
      "vertexType": "COLUMN",
      "vertexId": "i",
      "metadata": {
        "tableName": "default.lineage_ctas",
        "tableCreateTime": 1687351038
      }
    },
    {
      "id": 1,
      "vertexType": "COLUMN",
      "vertexId": "default.lineage_test.i",
      "metadata": {
        "tableName": "default.lineage_test",
        "tableCreateTime": 1687351020
      }
    }
  ]
}
{noformat}

Under "vertices", this is what Atlas would like to see:

{noformat}
  "vertices": [
    {
      "id": 0,
      "vertexType": "COLUMN",
      "vertexId": "i",
      "metadata": {
        "tableName": "default.lineage_ctas",
        "tableType": "iceberg",
        "tableCreateTime": 1687351038
      }
    },
    {
      "id": 1,
      "vertexType": "COLUMN",
      "vertexId": "default.lineage_test.i",
      "metadata": {
        "tableName": "default.lineage_test",
        "tableType": "hive",
        "tableCreateTime": 1687351020
      }
    }
  ]
{noformat}

So each vertex's metadata should contain a new field: 'tableType'. For FS-based tables the value should be "hive", except for Iceberg tables, where it should be "iceberg". For Kudu tables it should be "kudu", and for HBase tables it should be "hbase".
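
The mapping described above could be sketched roughly as follows. This is only an illustration in Python, not the Impala frontend code: TableKind, lineage_table_type and vertex_metadata are hypothetical names invented for this example, and only the four string values come from this issue.

```python
from enum import Enum


class TableKind(Enum):
    """Hypothetical stand-in for the kinds of tables named in this issue."""
    FS = "fs"            # file-system-based table that is not Iceberg
    ICEBERG = "iceberg"  # Iceberg table (also FS-based, but reported separately)
    KUDU = "kudu"
    HBASE = "hbase"


# The values on the right are the 'tableType' strings requested in the issue.
_TABLE_TYPE = {
    TableKind.FS: "hive",
    TableKind.ICEBERG: "iceberg",
    TableKind.KUDU: "kudu",
    TableKind.HBASE: "hbase",
}


def lineage_table_type(kind: TableKind) -> str:
    """Return the 'tableType' string for a given (hypothetical) table kind."""
    return _TABLE_TYPE[kind]


def vertex_metadata(table_name: str, kind: TableKind, create_time: int) -> dict:
    """Build a vertex 'metadata' object with the proposed 'tableType' field."""
    return {
        "tableName": table_name,
        "tableType": lineage_table_type(kind),
        "tableCreateTime": create_time,
    }
```

For example, vertex_metadata("default.lineage_ctas", TableKind.ICEBERG, 1687351038) would produce the metadata object shown for vertex 0 in the desired output above. Checking for Iceberg before falling back to "hive" matters in a real implementation, since Iceberg tables are also FS-based.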


