Posted to dev@hive.apache.org by "Ning Zhang (JIRA)" <ji...@apache.org> on 2011/04/23 05:19:05 UTC

[jira] [Created] (HIVE-2127) Improve stats gathering reliability by retries on failures

Improve stats gathering reliability by retries on failures
----------------------------------------------------------

                 Key: HIVE-2127
                 URL: https://issues.apache.org/jira/browse/HIVE-2127
             Project: Hive
          Issue Type: Improvement
            Reporter: Ning Zhang
            Assignee: Ning Zhang


Stats publishing and aggregation try only once; if any exception occurs, they fail and return. When many mappers/reducers update stats at the same time, lock timeouts are very common. We should make stats more reliable by retrying when there is an SQLException.
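
The retry behaviour proposed above could look roughly like the following Java sketch. It is an illustration only, not the code in the attached patch: the class StatsRetry, the interface SqlAction, the parameters maxRetries and waitMillis, and the linear backoff are all assumptions made here. Only the idea of retrying when an SQLException is thrown comes from the description.

```java
import java.sql.SQLException;

// Hypothetical helper, not part of Hive: retries a stats operation on SQLException.
final class StatsRetry {

    /** A stats operation (publish or aggregate) that may fail with SQLException. */
    interface SqlAction<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the action once and then up to maxRetries more times if it throws
     * SQLException. With maxRetries == 0 the action is attempted exactly once.
     */
    static <T> T withRetries(SqlAction<T> action, int maxRetries, long waitMillis)
            throws SQLException, InterruptedException {
        if (maxRetries < 0 || waitMillis < 0) {
            throw new IllegalArgumentException("maxRetries and waitMillis must be >= 0");
        }
        for (int failures = 0; ; failures++) {
            try {
                return action.run();
            } catch (SQLException e) {
                if (failures >= maxRetries) {
                    throw e; // retries exhausted, surface the last failure
                }
                // Assumed policy: wait longer after each failure (linear backoff)
                // so that competing tasks are less likely to collide again.
                Thread.sleep(waitMillis * (failures + 1));
            }
        }
    }
}
```

A caller would wrap each publish or aggregate call, for example StatsRetry.withRetries(() -> stmt.executeUpdate(sql), 3, 100L), where stmt and sql are whatever the caller already uses.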

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025610#comment-13025610 ] 

Namit Jain commented on HIVE-2127:
----------------------------------

Also add the names of the new configuration variables to the title of the JIRA.

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Attachment: HIVE-2127.2.patch

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-2127:
---------------------------------

      Component/s: Statistics
                   Query Processor
    Fix Version/s: 0.8.0

[jira] [Commented] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025615#comment-13025615 ] 

Ning Zhang commented on HIVE-2127:
----------------------------------

@Namit, what would the new configuration variable do? Do you mean defining a variable to disable retries? If so, setting hive.stats.retries.max = 0 will do that.
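
To illustrate how a retry limit of 0 disables retries, here is a small Java sketch. It is not taken from the patch: java.util.Properties stands in for the Hive configuration object, the class StatsRetrySettings is hypothetical, and the fallback values and the millisecond unit for hive.stats.retries.wait are assumptions.

```java
import java.util.Properties;

// Hypothetical holder for the two retry settings named in this issue.
final class StatsRetrySettings {
    final int maxRetries;   // value of hive.stats.retries.max
    final long waitMillis;  // value of hive.stats.retries.wait (unit assumed: ms)

    StatsRetrySettings(int maxRetries, long waitMillis) {
        this.maxRetries = maxRetries;
        this.waitMillis = waitMillis;
    }

    /** Reads both settings; the fallbacks used here are assumed placeholders. */
    static StatsRetrySettings from(Properties conf) {
        int max = Integer.parseInt(conf.getProperty("hive.stats.retries.max", "0"));
        long wait = Long.parseLong(conf.getProperty("hive.stats.retries.wait", "3000"));
        return new StatsRetrySettings(max, wait);
    }

    /** Retries are off when the limit is 0: only the first attempt is made. */
    boolean retriesEnabled() {
        return maxRetries > 0;
    }

    /** One initial attempt plus at most maxRetries further attempts. */
    int totalAttempts() {
        return maxRetries + 1;
    }
}
```

Under this reading no separate on/off switch is needed, since a limit of 0 already yields a single attempt.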

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Attachment: HIVE-2127.patch

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Status: Patch Available  (was: Open)

Changed the JIRA subject

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Summary: Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait  (was: Improve stats gathering reliability by retries on failures)

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures with hive.stats.retries.max and hive.stats.retries.wait

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-2127:
-----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

Committed. Thanks, Ning.

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Status: Patch Available  (was: Open)

Updated the review board. 

[jira] [Commented] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025620#comment-13025620 ] 

Namit Jain commented on HIVE-2127:
----------------------------------

What I meant was: change the subject of the JIRA, which is currently

Improve stats gathering reliability by retries on failures

to include the names of the new configuration variables, for better searching.


[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-2127:
-----------------------------

    Status: Open  (was: Patch Available)

Comments are on the review board.

[jira] [Commented] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025621#comment-13025621 ] 

Namit Jain commented on HIVE-2127:
----------------------------------

Looks good otherwise

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Status: Open  (was: Patch Available)

Paul commented offline that the patch cannot handle connection exceptions. I'm working on a new patch and will update it soon.
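
The details of the offline comment and of the new patch are not shown in this thread. As a rough illustration of one way a retry loop might also cover connection failures, the Java sketch below opens the connection inside the retried section and closes it after every attempt. The class ReconnectingRetry and the interfaces Opener and Work are hypothetical names introduced here, and the fixed wait between attempts is an assumption.

```java
import java.sql.SQLException;

// Hypothetical helper, not part of Hive: each attempt opens a fresh connection,
// so a failure while connecting is retried like any other SQLException.
final class ReconnectingRetry {

    /** Opens a new connection-like resource. */
    interface Opener<C extends AutoCloseable> {
        C open() throws SQLException;
    }

    /** The stats work to perform on an open connection. */
    interface Work<C, T> {
        T run(C conn) throws SQLException;
    }

    static <C extends AutoCloseable, T> T run(Opener<C> opener, Work<C, T> work,
                                              int maxRetries, long waitMillis)
            throws SQLException, InterruptedException {
        for (int failures = 0; ; failures++) {
            C conn = null;
            try {
                conn = opener.open();   // connecting is part of the retried section
                return work.run(conn);
            } catch (SQLException e) {
                if (failures >= maxRetries) {
                    throw e;            // retries exhausted
                }
                Thread.sleep(waitMillis);
            } finally {
                closeQuietly(conn);     // never reuse a possibly broken connection
            }
        }
    }

    private static void closeQuietly(AutoCloseable c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (Exception ignored) {
            // A failure while closing should not mask the result of the attempt.
        }
    }
}
```

With JDBC, the opener could be something like () -> DriverManager.getConnection(url), where url is whatever connection string the caller already has.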

[jira] [Updated] (HIVE-2127) Improve stats gathering reliability by retries on failures

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/HIVE-2127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ning Zhang updated HIVE-2127:
-----------------------------

    Status: Patch Available  (was: Open)

Review board: https://reviews.apache.org/r/664/

