Posted to dev@hive.apache.org by "Namit Jain (JIRA)" <ji...@apache.org> on 2010/11/10 19:05:13 UTC
[jira] Created: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
outputs not populated for dynamic partitions at compile time
------------------------------------------------------------
Key: HIVE-1781
URL: https://issues.apache.org/jira/browse/HIVE-1781
Project: Hive
Issue Type: Bug
Reporter: Namit Jain
POSTHOOK: query: create table tstsrcpart like srcpart
POSTHOOK: type: CREATETABLE
POSTHOOK: Output: default@tstsrcpart
PREHOOK: query: from srcpart
insert overwrite table tstsrcpart partition (ds, hr) select key, value, ds, hr where ds <= '2008-04-08'
PREHOOK: type: QUERY
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=11
PREHOOK: Input: default@srcpart@ds=2008-04-08/hr=12
POSTHOOK: query: from srcpart
As is evident from the log above, the outputs are not populated at all at compile time.
This may create a problem for the many components that depend on outputs: locking, authorization, etc.
However, some other components may need the exact set of outputs (for example, the
internal deployment at Facebook has a replication hook that needs the
exact set of outputs). It may be a good idea to extend WriteEntity with a flag that indicates
whether the output is complete or not, so that a hook can check that flag if needed.
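The flag proposed above could look roughly like the following. This is only a sketch: SketchWriteEntity and OutputFlagSketch are hypothetical stand-ins written for illustration, not Hive's actual WriteEntity or hook interfaces, and the partition entry in main simply represents an output whose exact identity is known.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for WriteEntity; only the proposed flag is modeled.
class SketchWriteEntity {
    private final String name;
    private final boolean complete;

    SketchWriteEntity(String name, boolean complete) {
        this.name = name;
        this.complete = complete;
    }

    String getName() { return name; }

    // True when this entity is an exact output, false when it is only a
    // placeholder (for example, the target table of a dynamic-partition insert).
    boolean isComplete() { return complete; }
}

// Hypothetical helpers showing how different consumers might read the flag.
class OutputFlagSketch {
    // Locking or authorization style consumers can use every output.
    static List<String> allOutputNames(Set<SketchWriteEntity> outputs) {
        List<String> names = new ArrayList<>();
        for (SketchWriteEntity e : outputs) {
            names.add(e.getName());
        }
        return names;
    }

    // A replication style consumer keeps only the outputs marked complete.
    static List<String> completeOutputNames(Set<SketchWriteEntity> outputs) {
        List<String> names = new ArrayList<>();
        for (SketchWriteEntity e : outputs) {
            if (e.isComplete()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Set<SketchWriteEntity> outputs = new LinkedHashSet<>();
        outputs.add(new SketchWriteEntity("default@tstsrcpart", false));
        outputs.add(new SketchWriteEntity("default@tstsrcpart@ds=2008-04-08/hr=11", true));
        System.out.println(allOutputNames(outputs));
        System.out.println(completeOutputNames(outputs));
    }
}
```

With a flag like this, existing consumers that only need to know which table is touched would not have to change, while a hook that needs exact outputs can skip the incomplete entries.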
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931144#action_12931144 ]
Namit Jain commented on HIVE-1781:
----------------------------------
The outputs are not empty.
This was a bug I fixed, but I think I forgot to update some of the log files.
When you run the tests, can you update them?
[jira] Updated: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-1781:
-----------------------------
Status: Patch Available (was: Open)
[jira] Commented: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931162#action_12931162 ]
He Yongqiang commented on HIVE-1781:
------------------------------------
+1 running tests.
[jira] Updated: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
He Yongqiang updated HIVE-1781:
-------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
Committed! Thanks Namit!
[jira] Commented: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931091#action_12931091 ]
Namit Jain commented on HIVE-1781:
----------------------------------
In the patch, I remove the incomplete entries before the post-execution hooks run, so that
the post-execution hooks don't have to change.
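The pruning step described above might be sketched as follows. The class and method names (SketchOutput, PostHookPruneSketch, removeIncomplete) are hypothetical and chosen for illustration; this is not the code from the attached patch.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical output entry carrying a completeness flag.
class SketchOutput {
    private final String name;
    private final boolean complete;

    SketchOutput(String name, boolean complete) {
        this.name = name;
        this.complete = complete;
    }

    String getName() { return name; }
    boolean isComplete() { return complete; }
}

class PostHookPruneSketch {
    // Removes incomplete entries in place, so that whatever runs afterwards
    // (here, the post-execution hooks) only sees exact outputs.
    static void removeIncomplete(Set<SketchOutput> outputs) {
        outputs.removeIf(o -> !o.isComplete());
    }

    public static void main(String[] args) {
        Set<SketchOutput> outputs = new LinkedHashSet<>();
        outputs.add(new SketchOutput("default@tstsrcpart", false));
        outputs.add(new SketchOutput("default@tstsrcpart@ds=2008-04-08/hr=12", true));

        removeIncomplete(outputs);

        for (SketchOutput o : outputs) {
            System.out.println("POSTHOOK: Output: " + o.getName());
        }
    }
}
```

Pruning once in the driver, rather than in each hook, is what lets the existing post-execution hooks stay unchanged.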
[jira] Assigned: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain reassigned HIVE-1781:
--------------------------------
Assignee: Namit Jain
[jira] Commented: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12931115#action_12931115 ]
He Yongqiang commented on HIVE-1781:
------------------------------------
For a query like "select key, value from src where key < 10", the outputs used to be a temp file; now the outputs are empty. Is this OK?
Will the outputs be null if we do an "insert overwrite [local] directory select key, value from src"?
[jira] Updated: (HIVE-1781) outputs not populated for dynamic
partitions at compile time
Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/HIVE-1781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Namit Jain updated HIVE-1781:
-----------------------------
Attachment: hive.1781.1.patch