Posted to notifications@kyuubi.apache.org by "iodone (via GitHub)" <gi...@apache.org> on 2023/04/13 09:53:10 UTC

[GitHub] [kyuubi] iodone commented on pull request #4694: [KYUUBI #4693] Enhanced the table lineage for input tables

iodone commented on PR #4694:
URL: https://github.com/apache/kyuubi/pull/4694#issuecomment-1506679998

   > I think we'd better define lineage clearly. The current lineage is easy to follow because we only consider the plan's output: if a column is used only for filtering, sorting, or anything else that does not affect the schema, we ignore it.
   > 
   > So if we want to consider those columns, how about adding a new mode to fully extract column and table lineage? For example:
   > 
   > ```sql
   > INSERT INTO TABLE t
   > SELECT c1 FROM t1 WHERE c2 > 0 ORDER BY c3
   > 
   > -- The lineage should be:
   > 
   > ColumnUsage(to: String, from: String, usage: String)
   > 
   > Lineage(
   >   List("default.t1"),
   >   List("default.t"),
   >   List(
   >      ColumnUsage("c1", "default.t1.c1", "OUTPUT"),
   >      ColumnUsage("N/A", "default.t1.c2", "PREDICATE"),
   >      ColumnUsage("N/A", "default.t1.c3", "ORDERING")
   >   )
   > )
   > ```
   
   Yes, from the perspective of column output, the current lineage relationship is clear. The main purpose of this PR is different: it analyzes lineage at the table level and treats any table involved in the SQL statement as an input table, even if none of its columns reach the output.
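
To make the difference between the two views concrete, here is a minimal sketch. It is not Kyuubi's actual lineage API: `ColumnUsage` and `Lineage` follow the shape proposed in the quoted comment (with `Option[String]` in place of `"N/A"`), while `LineageSketch`, `tableOf`, `outputOnlyInputs`, and `allInputs` are hypothetical names invented for illustration. The sample statement, which reads a second table `t2` only inside a predicate, is also an assumption and does not come from the PR.

```scala
// Hypothetical model for illustration only; these are not Kyuubi's classes.
final case class ColumnUsage(to: Option[String], from: String, usage: String)

final case class Lineage(
    inputTables: List[String],
    outputTables: List[String],
    columnUsages: List[ColumnUsage])

object LineageSketch {
  // The table name is everything before the last dot of a qualified column name.
  def tableOf(qualifiedColumn: String): String =
    qualifiedColumn.substring(0, qualifiedColumn.lastIndexOf('.'))

  // Output-only view: keep only tables that feed an output column.
  def outputOnlyInputs(usages: List[ColumnUsage]): List[String] =
    usages.filter(_.usage == "OUTPUT").map(u => tableOf(u.from)).distinct

  // Table-level view: keep every table referenced anywhere in the statement.
  def allInputs(usages: List[ColumnUsage]): List[String] =
    usages.map(u => tableOf(u.from)).distinct
}

// Assumed example statement:
//   INSERT INTO TABLE t
//   SELECT c1 FROM t1 WHERE c2 IN (SELECT k FROM t2)
val usages = List(
  ColumnUsage(Some("c1"), "default.t1.c1", "OUTPUT"),
  ColumnUsage(None, "default.t1.c2", "PREDICATE"),
  ColumnUsage(None, "default.t2.k", "PREDICATE"))

val lineage = Lineage(
  inputTables = LineageSketch.allInputs(usages),
  outputTables = List("default.t"),
  columnUsages = usages)
```

Under the output-only view, `t2` would not appear as an input because it contributes no output column; under the table-level view described above, it is listed alongside `t1`.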


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

