Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/05 23:33:59 UTC

[jira] Created: (HIVE-276) input3_limit.q fails under 0.17

input3_limit.q fails under 0.17
-------------------------------

                 Key: HIVE-276
                 URL: https://issues.apache.org/jira/browse/HIVE-276
             Project: Hadoop Hive
          Issue Type: Bug
            Reporter: Zheng Shao


The plan ql/src/test/results/clientpositive/input3_limit.q.out shows that there are 2 map-reduce jobs:

The first one is distributed and sorted as specified by the query. Its reducer side applies LIMIT 20.

The second one (single reducer job imposed by LIMIT 20) does not have the same sort order, so the final result is non-deterministic.
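The effect described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Hive's actual execution code: the function names (first_job, second_job), the integer keys, the modulo partitioning, and the choice of two reducers are all hypothetical. It shows that when each first-stage reducer sorts and limits its own partition, and the single second-stage reducer keeps the first 20 rows in whatever order they arrive, the final result depends on arrival order.

```python
from itertools import chain


def first_job(rows, num_reducers, limit):
    """Model of job 1: DISTRIBUTE BY key, SORT BY key, value, LIMIT per reducer.

    Partitioning by `key % num_reducers` is an assumption made for illustration.
    """
    partitions = [[] for _ in range(num_reducers)]
    for key, value in rows:
        partitions[key % num_reducers].append((key, value))
    # Each reducer sorts its own partition and keeps at most `limit` rows.
    return [sorted(p)[:limit] for p in partitions]


def second_job(partition_outputs, arrival_order, limit):
    """Model of job 2: one reducer, no sort, keeps the first `limit` rows seen."""
    arrived = chain.from_iterable(partition_outputs[i] for i in arrival_order)
    return list(arrived)[:limit]


rows = [(k, f"val_{k}") for k in range(100)]
stage1 = first_job(rows, num_reducers=2, limit=20)

# Same input, two different arrival orders at the single reducer.
run_a = second_job(stage1, arrival_order=[0, 1], limit=20)
run_b = second_job(stage1, arrival_order=[1, 0], limit=20)

print(run_a == run_b)  # prints False
```

In this toy setup run_a holds only even keys and run_b only odd keys, so the two runs disagree even though the input never changed.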


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-276:
----------------------------

    Attachment: HIVE-276.1.patch

Modified the query to do another SORT at the end.

I also thought about propagating the sort order to the reduce sink operator of limit, but that does not seem very easy to do.


[jira] Updated: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-276:
----------------------------

    Attachment: HIVE-276.2.patch

Incorporated Ashish's comments.


[jira] Commented: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670925#action_12670925 ] 

Zheng Shao commented on HIVE-276:
---------------------------------

To test, try: ant -Dhadoop.version=0.17.0 -Dtestcase=TestCliDriver -Dqfile=input3_limit.q test


[jira] Updated: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-276:
--------------------------------

    Fix Version/s:     (was: 0.6.0)


[jira] Commented: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673105#action_12673105 ] 

Ashish Thusoo commented on HIVE-276:
------------------------------------

I think it is easier to just put a SORT BY in the final SELECT.

SELECT * FROM T2 SORT BY T2.key, T2.value;


instead of

SELECT * FROM T2;
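Why the extra sort stabilizes the test output can be sketched as follows. This is an illustration under assumptions, not Hive code: read_t2 is a hypothetical helper, the seeded shuffle stands in for whatever order the rows happen to be read in, and SORT BY is modelled as a total sort, which only holds when the final select runs with a single reducer. The sketch addresses row order only; it says nothing about which rows end up in T2.

```python
import random


def read_t2(rows, seed, sort_output):
    """Return T2's rows in a seed-dependent order, optionally sorted.

    The shuffle models an unspecified read order; sorting models
    SORT BY T2.key, T2.value under the single-reducer assumption.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return sorted(shuffled) if sort_output else shuffled


t2 = [(k, f"val_{k}") for k in range(20)]

# Collect the distinct outputs observed across several simulated runs.
unsorted_runs = {tuple(read_t2(t2, seed, sort_output=False)) for seed in range(5)}
sorted_runs = {tuple(read_t2(t2, seed, sort_output=True)) for seed in range(5)}

print(len(sorted_runs))  # prints 1: every sorted run yields the same output
```

With the sort in place every run collapses to one output, which is what a golden-file comparison such as a .q.out file needs.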



[jira] Commented: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674870#action_12674870 ] 

Raghotham Murthy commented on HIVE-276:
---------------------------------------

+1

looks good.


[jira] Commented: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670926#action_12670926 ] 

Zheng Shao commented on HIVE-276:
---------------------------------

The query is: "INSERT OVERWRITE TABLE T2 SELECT * FROM (SELECT * FROM T1 DISTRIBUTE BY key SORT BY key, value) T ;"

Maybe we should propagate the SORT ORDER to the second map-reduce job that LIMIT imposes.


[jira] Resolved: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao resolved HIVE-276.
-----------------------------

       Resolution: Fixed
    Fix Version/s: 0.3.0
                   0.2.0
     Release Note: HIVE-276. Fix input3_limit.q for hadoop 0.17. (zshao)
     Hadoop Flags: [Reviewed]

trunk: Committed revision 745721.
branch 0.2: Committed revision 745723.


[jira] Assigned: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo reassigned HIVE-276:
----------------------------------

    Assignee: Ashish Thusoo


[jira] Assigned: (HIVE-276) input3_limit.q fails under 0.17

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo reassigned HIVE-276:
----------------------------------

    Assignee: Zheng Shao  (was: Ashish Thusoo)

Sorry, I thought you were not working on this.

