Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2010/04/01 02:48:27 UTC

[jira] Commented: (HIVE-1131) Add column lineage information to the pre execution hooks

    [ https://issues.apache.org/jira/browse/HIVE-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12852200#action_12852200 ] 

Zheng Shao commented on HIVE-1131:
----------------------------------

The following tests failed, mostly because the order of the POSTHOOK Lineage lines in the output differs from the expected results. Can you take a look?
Also, it would be great to get rid of the "null" after EXPRESSION in the following example.


{code}
groupby11.q
groupby7_map_skew.q
input13.q
script_pipe.q
groupby9.q
multi_insert.q
union17.q

example:
    [junit] diff -a -I file: -I /tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I owner -I transient_lastDdlTime\
 -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I [.][.][.] [0-9]* more /data/users/\
zshao/hadoop_hive_trunk/.ptest_1/build/ql/test/logs/clientpositive/groupby9.q.out /data/users/zshao/hadoop_hive_trunk/.ptest_\
1/ql/src/test/results/clientpositive/groupby9.q.out
    [junit] 238,239d237
    [junit] < POSTHOOK: Lineage: dest1.key EXPRESSION null[(src)src.FieldSchema(name:key, type:string, comment:default), ]
    [junit] < POSTHOOK: Lineage: dest1.value EXPRESSION null[(src)src.FieldSchema(name:value, type:string, comment:default), \
]
    [junit] 242a241,242
    [junit] > POSTHOOK: Lineage: dest1.key EXPRESSION null[(src)src.FieldSchema(name:key, type:string, comment:default), ]
    [junit] > POSTHOOK: Lineage: dest1.value EXPRESSION null[(src)src.FieldSchema(name:value, type:string, comment:default), \
]

{code}
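
One possible way to address both points is sketched below. This is only an illustration, not code from any of the attached patches: the class `LineageOutputSketch`, its `Entry` type, and the methods `formatLine` and `render` are hypothetical names, and the real hook and lineage classes may be organized differently. The idea is to sort the lineage entries by destination column before printing so the output order is stable across runs, and to omit the expression text when it is null.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; none of these names come from the HIVE-1131 patches.
final class LineageOutputSketch {

    // One lineage record for a destination column.
    static final class Entry {
        final String column;      // destination column, e.g. "dest1.key"
        final String dependency;  // dependency type, e.g. "EXPRESSION"
        final String expr;        // expression text; may be null
        final String sources;     // source columns, already rendered as text

        Entry(String column, String dependency, String expr, String sources) {
            this.column = column;
            this.dependency = dependency;
            this.expr = expr;
            this.sources = sources;
        }
    }

    // Builds one output line, skipping the expression when it is null
    // instead of printing the literal string "null".
    static String formatLine(Entry e) {
        StringBuilder sb = new StringBuilder("POSTHOOK: Lineage: ");
        sb.append(e.column).append(' ').append(e.dependency).append(' ');
        if (e.expr != null) {
            sb.append(e.expr);
        }
        sb.append('[').append(e.sources).append(']');
        return sb.toString();
    }

    // Sorts a copy of the entries by destination column so the printed
    // order does not depend on the iteration order of the input.
    static List<String> render(List<Entry> entries) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparing((Entry e) -> e.column));
        List<String> lines = new ArrayList<>();
        for (Entry e : sorted) {
            lines.add(formatLine(e));
        }
        return lines;
    }
}
```

With entries for dest1.value and dest1.key supplied in either order, `render` returns the dest1.key line first, which would keep the .q.out files stable.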


> Add column lineage information to the pre execution hooks
> ---------------------------------------------------------
>
>                 Key: HIVE-1131
>                 URL: https://issues.apache.org/jira/browse/HIVE-1131
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-1131.patch, HIVE-1131_2.patch, HIVE-1131_3.patch, HIVE-1131_4.patch, HIVE-1131_5.patch
>
>
> We need a mechanism to pass the lineage information of the various columns of a table to a pre execution hook so that applications can use that for:
> - auditing
> - dependency checking
> and many other applications.
> The proposal is to expose this through a bunch of classes to the pre execution hook interface to the clients and put in the necessary transformation logic in the optimizer to generate this information.
