Posted to dev@hive.apache.org by "Prasad Mujumdar (JIRA)" <ji...@apache.org> on 2012/12/03 09:57:58 UTC

[jira] [Created] (HIVE-3764) Support metastore version consistency check

Prasad Mujumdar created HIVE-3764:
-------------------------------------

             Summary: Support metastore version consistency check
                 Key: HIVE-3764
                 URL: https://issues.apache.org/jira/browse/HIVE-3764
             Project: Hive
          Issue Type: Improvement
          Components: Metastore
            Reporter: Prasad Mujumdar
            Assignee: Prasad Mujumdar


Today there's no version/compatibility information stored in the Hive metastore. Also, the DataNucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer Hive, or don't run the correct upgrade scripts during migration, the metastore can end up corrupted. The autoCreate schema option is not always sufficient to upgrade the metastore when migrating to a newer release, and it's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc.

Hence it's very useful to have a consistency check to make sure that Hive is using the correct metastore and that, for production systems, the schema is not automatically modified by running Hive.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-3764) Support metastore version consistency check

Posted by "Prasad Mujumdar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508609#comment-13508609 ] 

Prasad Mujumdar commented on HIVE-3764:
---------------------------------------

Code review request on https://reviews.apache.org/r/8314/
                
> Support metastore version consistency check
> -------------------------------------------
>
>                 Key: HIVE-3764
>                 URL: https://issues.apache.org/jira/browse/HIVE-3764
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Prasad Mujumdar
>            Assignee: Prasad Mujumdar
>             Fix For: 0.10.0
>
>         Attachments: HIVE-3764-1.patch
>
>
> Today there's no version/compatibility information stored in the Hive metastore. Also, the DataNucleus configuration property to automatically create missing tables is enabled by default. If you happen to start an older or newer Hive, or don't run the correct upgrade scripts during migration, the metastore can end up corrupted. The autoCreate schema option is not always sufficient to upgrade the metastore when migrating to a newer release, and it's not supported with all databases. Besides, the migration often involves altering existing tables, changing or moving data, etc.
> Hence it's very useful to have a consistency check to make sure that Hive is using the correct metastore and that, for production systems, the schema is not automatically modified by running Hive.


[jira] [Commented] (HIVE-3764) Support metastore version consistency check

Posted by "Shreepadma Venugopalan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509300#comment-13509300 ] 

Shreepadma Venugopalan commented on HIVE-3764:
----------------------------------------------

I think adding the consistency check is a good idea too. I've not looked into all the details of the code, but I noticed that the metastore version number is the Hive release version. While this makes the version numbers easily readable, we would need to provide scripts and perform a metastore upgrade on every Hive release, even if there are no other patches in the release that require a metastore schema upgrade. The other option would be to use version numbers from a monotonically increasing sequence instead, and bump up the version number only if there are changes in a release that require a metastore upgrade. Wondering if you have considered the latter option. Thanks.
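
The second option can be illustrated with a minimal sketch. The release labels and schema numbers below are invented for illustration and do not correspond to real Hive releases; the class and method names are also hypothetical. The point is only that two releases may share one schema version, in which case no upgrade script is needed between them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: schema versions drawn from an increasing sequence. */
final class SchemaVersions {

    // Invented mapping from release label to schema version (assumption, not Hive data).
    private static final Map<String, Integer> SCHEMA_BY_RELEASE = new LinkedHashMap<>();
    static {
        SCHEMA_BY_RELEASE.put("release-A", 1);
        SCHEMA_BY_RELEASE.put("release-B", 1); // no schema change in this release
        SCHEMA_BY_RELEASE.put("release-C", 2); // schema changed, so the number is bumped
    }

    /** Returns true only if the target release uses a higher schema version. */
    static boolean upgradeNeeded(String fromRelease, String toRelease) {
        Integer from = SCHEMA_BY_RELEASE.get(fromRelease);
        Integer to = SCHEMA_BY_RELEASE.get(toRelease);
        if (from == null || to == null) {
            throw new IllegalArgumentException("unknown release label");
        }
        return to.intValue() > from.intValue();
    }
}
```

With release versions used directly as schema versions, every pair of distinct releases would compare as different, which is the extra upgrade work the comment above is concerned about.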
                

[jira] [Commented] (HIVE-3764) Support metastore version consistency check

Posted by "Shreepadma Venugopalan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509305#comment-13509305 ] 

Shreepadma Venugopalan commented on HIVE-3764:
----------------------------------------------

Irrespective of which option we choose to generate version numbers, we should not execute the insert/update version number statement in the schema creation/upgrade script until all other statements in the schema creation/upgrade script have completed without errors. Thanks.
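
The ordering constraint above can be sketched as follows. This is an in-memory stand-in, not the actual schema scripts: the Step interface, the class name, and the recorded-version field are all hypothetical, and a real script would rely on the database (for example, a transaction) rather than a Java loop.

```java
import java.util.List;

/** Hypothetical sketch: record the new version only after every step succeeds. */
final class UpgradeScriptRunner {

    /** Stand-in for one statement of a schema creation/upgrade script. */
    interface Step {
        void execute() throws Exception;
    }

    // Stand-in for the version row in the metastore; null means nothing recorded yet.
    private String recordedVersion;

    /** Runs all steps in order; writes newVersion last, and only if none failed. */
    boolean run(List<Step> steps, String newVersion) {
        for (Step step : steps) {
            try {
                step.execute();
            } catch (Exception e) {
                // A failed step leaves the previously recorded version untouched.
                return false;
            }
        }
        recordedVersion = newVersion;
        return true;
    }

    String recordedVersion() {
        return recordedVersion;
    }
}
```

If a step fails partway through, the recorded version still names the old schema, so a later consistency check would flag the metastore instead of treating a half-applied upgrade as complete.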
                

[jira] [Commented] (HIVE-3764) Support metastore version consistency check

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509134#comment-13509134 ] 

Ashutosh Chauhan commented on HIVE-3764:
----------------------------------------

I like the idea of this consistency check and having autoCreate turned off by default.
I haven't looked at the patch in much detail, but hard-coding the version number in the code is not a great idea. The standard mechanism for this is to determine the current version at compile time via saveVersion scripts and then use that. HIVE-2926 is trying to add that in. Either we should finish that one up, or take the saveVersion bits out of it and include them in this patch.
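
As a rough illustration of using a build-generated version instead of a hard-coded constant, the sketch below reads the version from properties-style text. It assumes the build writes a line such as "version=..." into a generated file; that file format, the key name, and the class name are assumptions for this example, and the saveVersion scripts in HIVE-2926 may work differently.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch: take the version from build output, not from a constant. */
final class BuildVersion {

    /** Parses properties-style text produced at build time and returns its "version" entry. */
    static String fromProperties(String generatedText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(generatedText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String version = props.getProperty("version"); // key name is an assumption
        if (version == null) {
            throw new IllegalStateException("build output has no 'version' entry");
        }
        return version.trim();
    }
}
```

In a real build the text would come from a generated resource on the classpath rather than from a string literal.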
                

[jira] [Updated] (HIVE-3764) Support metastore version consistency check

Posted by "Prasad Mujumdar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-3764:
----------------------------------

    Fix Version/s: 0.10.0
           Status: Patch Available  (was: Open)

The metastore version is set in the code, and a new table is created in the metastore to store it on disk. The schema creation and upgrade scripts add the version string to the metastore.
The version is verified while opening the metastore. A new property hive.metastore.schema.verification (disabled by default) is added to enable this verification. Setting this property also disables autoCreateSchema and enables fixedDataStore for datanucleus.
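
The behaviour described above could look roughly like the following. This is a sketch under stated assumptions, not the code in the attached patch: the class and method names are hypothetical, the two DataNucleus key names are assumed spellings of the settings mentioned above, and a plain Map stands in for the Hive configuration.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the version check performed when the metastore is opened. */
final class SchemaVersionCheck {

    // Property name as given in the comment above.
    static final String VERIFY_KEY = "hive.metastore.schema.verification";
    // Assumed key names for the DataNucleus settings mentioned above.
    static final String AUTO_CREATE_KEY = "datanucleus.autoCreateSchema";
    static final String FIXED_DATASTORE_KEY = "datanucleus.fixedDatastore";

    /** Verification is off unless the property is explicitly set to true. */
    static boolean verificationEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(VERIFY_KEY, "false"));
    }

    /** Returns a copy of conf with the DataNucleus settings implied by verification. */
    static Map<String, String> effectiveConf(Map<String, String> conf) {
        Map<String, String> out = new HashMap<>(conf);
        if (verificationEnabled(conf)) {
            out.put(AUTO_CREATE_KEY, "false");
            out.put(FIXED_DATASTORE_KEY, "true");
        }
        return out;
    }

    /** Throws if verification is enabled and the stored version is missing or different. */
    static void verify(Map<String, String> conf, String codeVersion, String storedVersion) {
        if (!verificationEnabled(conf)) {
            return; // disabled by default: no check is made
        }
        if (storedVersion == null) {
            throw new IllegalStateException("no schema version recorded in metastore");
        }
        if (!storedVersion.equals(codeVersion)) {
            throw new IllegalStateException(
                "metastore schema version " + storedVersion
                    + " does not match expected version " + codeVersion);
        }
    }
}
```

Here storedVersion stands for whatever the schema creation or upgrade script wrote into the new version table, and codeVersion for the version set in the code.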
                

[jira] [Commented] (HIVE-3764) Support metastore version consistency check

Posted by "Prasad Mujumdar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13509144#comment-13509144 ] 

Prasad Mujumdar commented on HIVE-3764:
---------------------------------------

@Ashutosh, thanks for the comments.
The HiveServer2 patch already has a mechanism to extract the version. I didn't want to duplicate that here, so it's using a hardcoded version for the time being. I will change it based on that code.

                

[jira] [Updated] (HIVE-3764) Support metastore version consistency check

Posted by "Prasad Mujumdar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasad Mujumdar updated HIVE-3764:
----------------------------------

    Attachment: HIVE-3764-1.patch
    