Posted to dev@hive.apache.org by "Venky Iyer (JIRA)" <ji...@apache.org> on 2008/12/11 03:38:44 UTC

[jira] Created: (HIVE-160) sampling in a subquery is broken

sampling in a subquery is broken
--------------------------------

                 Key: HIVE-160
                 URL: https://issues.apache.org/jira/browse/HIVE-160
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Venky Iyer




-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HIVE-160) sampling in a subquery is broken

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy reassigned HIVE-160:
-------------------------------------

    Assignee: Raghotham Murthy

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>




[jira] Commented: (HIVE-160) sampling in a subquery is broken

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677479#action_12677479 ] 

Zheng Shao commented on HIVE-160:
---------------------------------

So it's resolved, right? Will you close this issue?

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>




[jira] Resolved: (HIVE-160) sampling in a subquery is broken

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain resolved HIVE-160.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0
     Hadoop Flags: [Reviewed]

Committed. Thanks, Raghu.

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>             Fix For: 0.4.0
>
>         Attachments: hive-160.1.patch
>
>




[jira] Commented: (HIVE-160) sampling in a subquery is broken

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677422#action_12677422 ] 

Raghotham Murthy commented on HIVE-160:
---------------------------------------

Sampling within a sub-query does not seem to prune the input. A filter is added and the result seems correct.

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>




[jira] Commented: (HIVE-160) sampling in a subquery is broken

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731218#action_12731218 ] 

Namit Jain commented on HIVE-160:
---------------------------------

+1


The code changes look good. I will commit if the tests look good and pass.

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>         Attachments: hive-160.1.patch
>
>




[jira] Commented: (HIVE-160) sampling in a subquery is broken

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731222#action_12731222 ] 

Raghotham Murthy commented on HIVE-160:
---------------------------------------

Filed HIVE-638 to fix sampling in subqueries properly.

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>         Attachments: hive-160.1.patch
>
>




[jira] Updated: (HIVE-160) sampling in a subquery is broken

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Hammerbacher updated HIVE-160:
-----------------------------------

    Component/s: Query Processor

Adding to "Query Processor" component.

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>




[jira] Updated: (HIVE-160) sampling in a subquery is broken

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-160:
----------------------------------

    Attachment: hive-160.1.patch

No, the problem is that input pruning does not work well when done over the parse structures (QB). We should do it over the operator tree instead. The current patch is a temporary fix for this bug: it always adds a sampling predicate to the WHERE clause, regardless of whether input pruning took place. The final fix will be modeled after the partition pruning code that Ashish is fixing.
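To illustrate why always adding the predicate is a safe stopgap, here is a minimal Python sketch (not Hive code). It assumes, purely for illustration, that rows are stored in bucket files by `key % num_buckets` for non-negative integer keys; the names `write_bucket_files` and `sample_scan` are hypothetical. With the filter always applied, the result is the same whether or not the input was pruned; pruning only changes how much data is read.

```python
def write_bucket_files(keys, num_buckets):
    """Hypothetical layout: one list per bucket file, assigned by key % num_buckets."""
    files = [[] for _ in range(num_buckets)]
    for k in keys:
        files[k % num_buckets].append(k)
    return files


def sample_scan(files, bucket, num_buckets, prune):
    """Return (rows, rows_read) for 'bucket <bucket> out of <num_buckets>'.

    prune=True  reads only the matching bucket file (input pruning).
    prune=False reads every file, as when pruning is skipped in a subquery.
    The sampling predicate is applied in both cases, as in the temporary fix.
    """
    inputs = [files[bucket - 1]] if prune else files
    rows_read = sum(len(f) for f in inputs)
    rows = sorted(k for f in inputs for k in f if k % num_buckets == bucket - 1)
    return rows, rows_read


files = write_bucket_files(range(10), 2)
pruned_rows, pruned_read = sample_scan(files, 1, 2, prune=True)
full_rows, full_read = sample_scan(files, 1, 2, prune=False)
```

In this sketch both scans return the same rows, but the unpruned scan reads every bucket file, which matches the observation in this thread that the result is correct even though the input is not pruned.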

I also modified the tests so that srcbucket has an integer key. This allows for better testing of the case where a predicate is added to the WHERE clause: 'bucket 1 out of 2' will return the even keys and 'bucket 2 out of 2' will return the odd keys.
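The even/odd behavior can be sketched in Python as follows. This is an illustration rather than the patch itself: it assumes that the hash of an integer key is the key's own value and that 'bucket x out of y' keeps rows where the non-negative hash modulo y equals x - 1. The helper names `in_bucket` and `sample` are hypothetical.

```python
INT_MAX = 0x7FFFFFFF  # mask that keeps the hash non-negative (assumed behavior)


def in_bucket(key, bucket, num_buckets):
    """Assumed sampling predicate for 'bucket <bucket> out of <num_buckets>'."""
    if not 1 <= bucket <= num_buckets:
        raise ValueError("bucket must be between 1 and num_buckets")
    # Assumption: an integer key hashes to itself.
    return (key & INT_MAX) % num_buckets == bucket - 1


def sample(keys, bucket, num_buckets):
    """Apply the predicate as a plain filter, with no input pruning."""
    return [k for k in keys if in_bucket(k, bucket, num_buckets)]


evens = sample(range(10), 1, 2)  # bucket 1 out of 2
odds = sample(range(10), 2, 2)   # bucket 2 out of 2
```

With an integer key the expected output of each sample is easy to check by eye, which is the testing benefit described above.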

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Assignee: Raghotham Murthy
>            Priority: Critical
>         Attachments: hive-160.1.patch
>
>




[jira] Commented: (HIVE-160) sampling in a subquery is broken

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661556#action_12661556 ] 

Ashish Thusoo commented on HIVE-160:
------------------------------------

I think this one has been resolved by Raghu or Namit?

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>




[jira] Updated: (HIVE-160) sampling in a subquery is broken

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-160:
-------------------------------

    Priority: Critical  (was: Major)

> sampling in a subquery is broken
> --------------------------------
>
>                 Key: HIVE-160
>                 URL: https://issues.apache.org/jira/browse/HIVE-160
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Venky Iyer
>            Priority: Critical
>

