Posted to dev@atlas.apache.org by "Na Li (JIRA)" <ji...@apache.org> on 2019/06/18 22:43:01 UTC
[jira] [Commented] (ATLAS-3290) Impala Hook should get database name and table name from vertex metadata
[ https://issues.apache.org/jira/browse/ATLAS-3290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867103#comment-16867103 ]
Na Li commented on ATLAS-3290:
------------------------------
The pre-commit test run is at https://builds.apache.org/view/A/view/Atlas/job/PreCommit-ATLAS-Build-Test/1211/console
> Impala Hook should get database name and table name from vertex metadata
> ------------------------------------------------------------------------
>
> Key: ATLAS-3290
> URL: https://issues.apache.org/jira/browse/ATLAS-3290
> Project: Atlas
> Issue Type: New Feature
> Components: atlas-core
> Affects Versions: 2.1.0
> Reporter: Na Li
> Assignee: Na Li
> Priority: Major
> Attachments: ATLAS-3290.001.patch
>
>
> The column name in an Impala lineage record may not contain its database name and table name.
> To get the database name and table name, we should use the metadata in the vertex rather than assume that the column name contains them.
> When the hook assumes that the column name always contains the database name and table name, we run into the following exception:
> {code}
> I0618 19:16:02.415920 209817 QueryEventHookManager.java:212] Initiating onQueryComplete: org.apache.atlas.impala.hook.ImpalaLineageHook
> E0618 19:16:02.418964 210738 ImpalaLineageHook.java:126] ImpalaLineageHook.process(): failed to process query create table sales_sg as select * from sales_asia
> Java exception follows:
> java.lang.IllegalArgumentException: fullColumnName {} does not contain database name or table name
> at org.apache.atlas.impala.hook.AtlasImpalaHookContext.getQualifiedNameForColumn(AtlasImpalaHookContext.java:115)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:164)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getQualifiedName(BaseImpalaEvent.java:134)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getColumnEntities(BaseImpalaEvent.java:495)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:430)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toTableEntity(BaseImpalaEvent.java:393)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.toAtlasEntity(BaseImpalaEvent.java:315)
> at org.apache.atlas.impala.hook.events.BaseImpalaEvent.getInputOutputEntity(BaseImpalaEvent.java:297)
> at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getEntities(CreateImpalaProcess.java:103)
> at org.apache.atlas.impala.hook.events.CreateImpalaProcess.getNotificationMessages(CreateImpalaProcess.java:54)
> at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:122)
> at org.apache.atlas.impala.hook.ImpalaLineageHook.process(ImpalaLineageHook.java:79)
> at org.apache.atlas.impala.hook.ImpalaHook.onQueryComplete(ImpalaHook.java:36)
> at org.apache.atlas.impala.hook.ImpalaLineageHook.onQueryComplete(ImpalaLineageHook.java:52)
> at org.apache.impala.hooks.QueryEventHookManager.lambda$null$1(QueryEventHookManager.java:215)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> {code}
> The lineage record from Impala is:
> {code}
> {
>   "queryText":"create table sales_china as select * from sales_asia",
>   "queryId":"2940d0b242de53ea:e82ba8d300000000",
>   "hash":"a705a9ec851a5440afca0dfb8df86cd5",
>   "user":"root",
>   "timestamp":1560885032,
>   "endTime":1560885040,
>   "edges":[
>     {
>       "sources":[
>         1
>       ],
>       "targets":[
>         0
>       ],
>       "edgeType":"PROJECTION"
>     },
>     {
>       "sources":[
>         3
>       ],
>       "targets":[
>         2
>       ],
>       "edgeType":"PROJECTION"
>     }
>   ],
>   "vertices":[
>     {
>       "id":0,
>       "vertexType":"COLUMN",
>       "vertexId":"id",
>       "metadata":{
>         "tableName":"sales_db.sales_china",
>         "tableCreateTime":1560885039
>       }
>     },
>     {
>       "id":1,
>       "vertexType":"COLUMN",
>       "vertexId":"sales_db.sales_asia.id",
>       "metadata":{
>         "tableName":"sales_db.sales_asia",
>         "tableCreateTime":1560884919
>       }
>     },
>     {
>       "id":2,
>       "vertexType":"COLUMN",
>       "vertexId":"name",
>       "metadata":{
>         "tableName":"sales_db.sales_china",
>         "tableCreateTime":1560885039
>       }
>     },
>     {
>       "id":3,
>       "vertexType":"COLUMN",
>       "vertexId":"sales_db.sales_asia.name",
>       "metadata":{
>         "tableName":"sales_db.sales_asia",
>         "tableCreateTime":1560884919
>       }
>     }
>   ]
> }
> {code}
>
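The lookup described in the issue can be illustrated with a minimal sketch. It is not the code in ATLAS-3290.001.patch: the class `VertexNameResolver`, its nested `ColumnName` type, and the `resolve` method are hypothetical names made up for this example, and the sketch assumes that `metadata.tableName` always has the form `database.table`, as it does in the record above. It does not try to reproduce the qualified-name format that Atlas actually uses. The idea is to take the column from the last segment of `vertexId`, take the database and table from the vertex metadata, and fall back to parsing `vertexId` only when the metadata is missing.

```java
import java.util.Objects;

/** Hypothetical helper for illustration only; not part of the Atlas code base. */
final class VertexNameResolver {

    /** The three parts of a resolved column name. */
    static final class ColumnName {
        final String database;
        final String table;
        final String column;

        ColumnName(String database, String table, String column) {
            this.database = database;
            this.table = table;
            this.column = column;
        }

        /** Joins the parts as database.table.column. */
        String qualified() {
            return database + "." + table + "." + column;
        }
    }

    /**
     * Resolves database, table, and column for a COLUMN vertex.
     *
     * @param vertexId          the vertex's "vertexId", e.g. "id" or "sales_db.sales_asia.id"
     * @param metadataTableName the vertex's "metadata.tableName", or null if absent
     */
    static ColumnName resolve(String vertexId, String metadataTableName) {
        Objects.requireNonNull(vertexId, "vertexId");

        // The column is the last dot-separated segment, whether or not a prefix is present.
        int lastDot = vertexId.lastIndexOf('.');
        String column = vertexId.substring(lastDot + 1);
        if (column.isEmpty()) {
            throw new IllegalArgumentException("vertexId has no column name: " + vertexId);
        }

        // Prefer the vertex metadata; only fall back to the vertexId prefix if it is missing.
        String dbAndTable = metadataTableName;
        if (dbAndTable == null || dbAndTable.isEmpty()) {
            if (lastDot < 0) {
                throw new IllegalArgumentException(
                        "no table metadata and no prefix in vertexId: " + vertexId);
            }
            dbAndTable = vertexId.substring(0, lastDot);
        }

        // Assumption: the table name is "database.table", split at the first dot.
        int dot = dbAndTable.indexOf('.');
        if (dot <= 0 || dot == dbAndTable.length() - 1) {
            throw new IllegalArgumentException("expected database.table but got: " + dbAndTable);
        }
        return new ColumnName(dbAndTable.substring(0, dot), dbAndTable.substring(dot + 1), column);
    }

    public static void main(String[] args) {
        // Vertex 0 from the record above: the vertexId is a bare column name.
        System.out.println(resolve("id", "sales_db.sales_china").qualified());
        // Vertex 1: the vertexId already carries the database and table.
        System.out.println(resolve("sales_db.sales_asia.id", "sales_db.sales_asia").qualified());
    }
}
```

With this approach, vertices 0 and 2 in the record above resolve to `sales_db.sales_china` even though their `vertexId` values are only `id` and `name`, which is the case that triggers the `IllegalArgumentException` in the stack trace.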