Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2010/11/10 19:05:13 UTC

[jira] Created: (HIVE-1781) outputs not populated for dynamic partitions at compile time

outputs not populated for dynamic partitions at compile time
------------------------------------------------------------

                 Key: HIVE-1781
                 URL: https://issues.apache.org/jira/browse/HIVE-1781
             Project: Hive
          Issue Type: Bug
            Reporter: Namit Jain


POSTHOOK: query: create table tstsrcpart like srcpart
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: default@tstsrcpart
PREHOOK: query: from srcpart
insert overwrite table tstsrcpart partition (ds, hr) select key, value, ds, hr where ds <= '2008-04-08'
PREHOOK: type: QUERY
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=12
POSTHOOK: query: from srcpart



As is evident from the log above, the outputs are not populated at all at compile time.

This may create a problem for the many components that depend on outputs: locking, authorization, etc.
However, some other components may need the exact set of outputs (for example, the
internal deployment at Facebook has a replication hook that needs the
exact set of outputs). It may be a good idea to extend WriteEntity to include a flag that indicates
whether the output is complete or not; the hook can then look at that flag if needed.
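
The flag proposed above could look roughly like the following sketch. SketchWriteEntity, SketchReplicationHook and their members are hypothetical names chosen for illustration; they are not Hive's actual WriteEntity class or any real hook. The assumption is that an entry is marked incomplete when it only names the target table and the partitions are not yet known.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output entity; NOT Hive's actual WriteEntity.
final class SketchWriteEntity {
    private final String name;
    // False when the exact output is unknown at compile time, e.g. a table
    // whose partitions will only be created dynamically at run time.
    private final boolean complete;

    SketchWriteEntity(String name, boolean complete) {
        this.name = name;
        this.complete = complete;
    }

    String getName() { return name; }
    boolean isComplete() { return complete; }
}

// Hypothetical hook helper: keeps only the outputs that are known exactly.
final class SketchReplicationHook {
    static List<String> exactOutputs(List<SketchWriteEntity> outputs) {
        List<String> exact = new ArrayList<>();
        for (SketchWriteEntity entity : outputs) {
            if (entity.isComplete()) {
                exact.add(entity.getName());
            }
        }
        return exact;
    }
}
```

Components such as locking or authorization could still use the incomplete table-level entry, while a hook that needs exact outputs would skip it.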


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931144#action_12931144 ] 

Namit Jain commented on HIVE-1781:
----------------------------------

The outputs are not empty.
This was a bug I fixed, but I think I forgot to update some of the log files.

When you run the tests, can you update them?




[jira] Updated: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1781:
-----------------------------

    Status: Patch Available  (was: Open)




[jira] Commented: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931162#action_12931162 ] 

He Yongqiang commented on HIVE-1781:
------------------------------------

+1, running tests.




[jira] Updated: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-1781:
-------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed! Thanks Namit!




[jira] Commented: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931091#action_12931091 ] 

Namit Jain commented on HIVE-1781:
----------------------------------

In the patch, I remove the incomplete entries before the post-execution hooks run, so that
the post-execution hooks don't have to change.
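
A minimal sketch of that filtering step, assuming each output carries a completeness flag. SketchOutput and SketchDriverStep are hypothetical names for illustration only; this is not the code from hive.1781.1.patch.

```java
import java.util.Iterator;
import java.util.Set;

// Hypothetical output entry; NOT Hive's actual WriteEntity.
final class SketchOutput {
    final String name;
    final boolean complete;

    SketchOutput(String name, boolean complete) {
        this.name = name;
        this.complete = complete;
    }
}

// Hypothetical step that runs after execution and before the post-execution hooks.
final class SketchDriverStep {
    // Removes incomplete placeholder entries in place, so the hooks that run
    // afterwards only ever see outputs that are known exactly.
    static void dropIncomplete(Set<SketchOutput> outputs) {
        Iterator<SketchOutput> it = outputs.iterator();
        while (it.hasNext()) {
            if (!it.next().complete) {
                it.remove();
            }
        }
    }
}
```

Because the set is filtered before the hooks are called, existing post-execution hooks would keep receiving only exact outputs and would not need to check the flag themselves.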




[jira] Assigned: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain reassigned HIVE-1781:
--------------------------------

    Assignee: Namit Jain




[jira] Commented: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931115#action_12931115 ] 

He Yongqiang commented on HIVE-1781:
------------------------------------

For a query like "select key, value from src where key < 10", the outputs used to be a temp file; now the outputs are empty. Is this good?
Will the outputs be null if we do an "insert overwrite [local] directory select key, value from src"?




[jira] Updated: (HIVE-1781) outputs not populated for dynamic partitions at compile time

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-1781:
-----------------------------

    Attachment: hive.1781.1.patch

