Posted to dev@atlas.apache.org by "Na Li (JIRA)" <ji...@apache.org> on 2019/06/18 22:43:01 UTC

[jira] [Commented] (ATLAS-3290) Impala Hook should get database name and table name from vertex metadata

    [ https://issues.apache.org/jira/browse/ATLAS-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867103#comment-16867103 ] 

Na Li commented on ATLAS-3290:
------------------------------

pre-commit test at https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1211/console

> Impala Hook should get database name and table name from vertex metadata
> ------------------------------------------------------------------------
>
>                 Key: ATLAS-3290
>                 URL: https://issues.apache.org/jira/browse/ATLAS-3290
>             Project: Atlas
>          Issue Type: New Feature
>          Components:  atlas-core
>    Affects Versions: 2.1.0
>            Reporter: Na Li
>            Assignee: Na Li
>            Priority: Major
>         Attachments: ATLAS-3290.001.patch
>
>
> The column name in an Impala lineage record may not contain its database name and table name.
> To get the database name and table name, we should use the metadata in the vertex rather than assume that the column name contains them.
> When we assume that the column name always contains its database name and table name, we run into the following exception:
> {code}
> I0618 19:16:02.415920 209817 QueryEventHookManager.java:212] Initiating onQueryComplete: org.apache.atlas.impala.hook.ImpalaLineageHook
> E0618 19:16:02.418964 210738 ImpalaLineageHook.java:126] ImpalaLineageHook.process(): failed to process query create table sales_sg as select * from sales_asia
> Java exception follows:
> java.lang.IllegalArgumentException: fullColumnName {} does not contain database name or table name
>         at org.apache.atlas.impala.hook.AtlasImpalaHookContext.getQualifiedNameForColumn(AtlasImpalaHookContext.java:115)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:164)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:134)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getColumnEntities(BaseImpalaEvent.java:495)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:430)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:393)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toAtlasEntity(BaseImpalaEvent.java:315)
>         at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getInputOutputEntity(BaseImpalaEvent.java:297)
>         at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getEntities(CreateImpalaProcess.java:103)
>         at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getNotificationMessages(CreateImpalaProcess.java:54)
>         at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:122)
>         at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:79)
>         at org.apache.atlas.impala.hook.ImpalaHook.onQueryComplete(ImpalaHook.java:36)
>         at org.apache.atlas.impala.hook.ImpalaLineageHook.onQueryComplete(ImpalaLineageHook.java:52)
>         at org.apache.impala.hooks.QueryEventHookManager.lambda$null$1(QueryEventHookManager.java:215)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
> The lineage record from Impala is:
> {code}
> {  
>    "queryText":"create table sales_china as select * from sales_asia",
>    "queryId":"2940d0b242de53ea:e82ba8d300000000",
>    "hash":"a705a9ec851a5440afca0dfb8df86cd5",
>    "user":"root",
>    "timestamp":1560885032,
>    "endTime":1560885040,
>    "edges":[  
>       {  
>          "sources":[  
>             1
>          ],
>          "targets":[  
>             0
>          ],
>          "edgeType":"PROJECTION"
>       },
>       {  
>          "sources":[  
>             3
>          ],
>          "targets":[  
>             2
>          ],
>          "edgeType":"PROJECTION"
>       }
>    ],
>    "vertices":[  
>       {  
>          "id":0,
>          "vertexType":"COLUMN",
>          "vertexId":"id",
>          "metadata":{  
>             "tableName":"sales_db.sales_china",
>             "tableCreateTime":1560885039
>          }
>       },
>       {  
>          "id":1,
>          "vertexType":"COLUMN",
>          "vertexId":"sales_db.sales_asia.id",
>          "metadata":{  
>             "tableName":"sales_db.sales_asia",
>             "tableCreateTime":1560884919
>          }
>       },
>       {  
>          "id":2,
>          "vertexType":"COLUMN",
>          "vertexId":"name",
>          "metadata":{  
>             "tableName":"sales_db.sales_china",
>             "tableCreateTime":1560885039
>          }
>       },
>       {  
>          "id":3,
>          "vertexType":"COLUMN",
>          "vertexId":"sales_db.sales_asia.name",
>          "metadata":{  
>             "tableName":"sales_db.sales_asia",
>             "tableCreateTime":1560884919
>          }
>       }
>    ]
> }
> {code}
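
To illustrate the proposal, here is a minimal sketch of resolving the database and table name from the vertex metadata shown in the record above. It is not the code in ATLAS-3290.001.patch: the names VertexNames, ColumnRef and resolve are hypothetical, and the sketch assumes that "tableName" in the metadata always has the form "<db>.<table>", as it does in this record. The fallback to parsing the vertexId is also an assumption.

```java
import java.util.Map;

// Hypothetical helper, for illustration only; not part of the Atlas code base.
final class VertexNames {

    // Simple holder for the three name parts of a column.
    static final class ColumnRef {
        final String dbName;
        final String tableName;
        final String columnName;

        ColumnRef(String dbName, String tableName, String columnName) {
            this.dbName = dbName;
            this.tableName = tableName;
            this.columnName = columnName;
        }

        @Override
        public String toString() {
            return dbName + "." + tableName + "." + columnName;
        }
    }

    // vertexId may be "id" or "sales_db.sales_asia.id"; metadata is the
    // vertex "metadata" object from the lineage record, parsed into a Map.
    static ColumnRef resolve(String vertexId, Map<String, Object> metadata) {
        int lastDot = vertexId.lastIndexOf('.');
        // The column name is always the last dot-separated segment.
        String columnName = vertexId.substring(lastDot + 1);

        // Prefer metadata.tableName (assumed to be "<db>.<table>").
        Object fromMetadata = (metadata == null) ? null : metadata.get("tableName");
        String qualifiedTable;
        if (fromMetadata instanceof String && !((String) fromMetadata).isEmpty()) {
            qualifiedTable = (String) fromMetadata;
        } else if (lastDot > 0) {
            // Assumed fallback: take everything before the column name.
            qualifiedTable = vertexId.substring(0, lastDot);
        } else {
            throw new IllegalArgumentException(
                "vertex " + vertexId + " has no tableName metadata and no qualified name");
        }

        int dot = qualifiedTable.indexOf('.');
        if (dot <= 0 || dot == qualifiedTable.length() - 1) {
            throw new IllegalArgumentException(
                "tableName " + qualifiedTable + " does not contain a database name");
        }
        return new ColumnRef(qualifiedTable.substring(0, dot),
                             qualifiedTable.substring(dot + 1),
                             columnName);
    }
}
```

With the record above, vertex 0 (vertexId "id", tableName "sales_db.sales_china") resolves to database sales_db, table sales_china and column id, even though the vertexId alone carries neither name.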
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)