Posted to dev@hive.apache.org by "Edward Capriolo (JIRA)" <ji...@apache.org> on 2010/12/08 18:28:01 UTC

[jira] Created: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

 datanucleus.fixedDatastore should be true in hive-default.xml
--------------------------------------------------------------

                 Key: HIVE-1841
                 URL: https://issues.apache.org/jira/browse/HIVE-1841
             Project: Hive
          Issue Type: Improvement
          Components: Configuration
    Affects Versions: 0.6.0
            Reporter: Edward Capriolo
             Fix For: 0.7.0



Two datanucleus variables:
{noformat}
<property>
 <name>datanucleus.autoCreateSchema</name>
 <value>false</value>
</property>

<property>
 <name>datanucleus.fixedDatastore</name>
 <value>true</value>
</property>
{noformat}

are dangerous. We do want the schema to auto-create itself, but we do not want the schema to auto-update itself.

Someone might accidentally point a trunk build at the wrong metastore and unknowingly update it. I believe we should turn automatic updates off by default and possibly trap exceptions stemming from Hive wanting to do any update. This way someone has to actively acknowledge the update, either by turning it on and then starting up Hive, or by leaving it off, removing schema-modify privileges from the database user that Hive uses, and doing the updates by hand.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] [Updated] (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1841:
---------------------------------

    Component/s: Metastore

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969442#action_12969442 ] 

Edward Capriolo commented on HIVE-1841:
---------------------------------------

Correction to my earlier comment:

datanucleus.fixedDatastore is not specified in hive-default.xml, and it defaults to FALSE.


[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969566#action_12969566 ] 

Ning Zhang commented on HIVE-1841:
----------------------------------

Yeah, I think it is good to at least include datanucleus.fixedDatastore in hive-default.xml. I think it should also be OK to set the default value to 'true' if all the current unit tests pass. But as a rule of thumb, a hive-site.xml is necessary for any Hive query running on a production cluster.

Paul, do you see other potential problems with changing the default value of datanucleus.fixedDatastore in hive-default.xml?



[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969440#action_12969440 ] 

Edward Capriolo commented on HIVE-1841:
---------------------------------------

Ning,

I agree with
{noformat}
<property>
 <name>datanucleus.autoCreateSchema</name>
 <value>true</value>
</property>
{noformat}
But datanucleus.fixedDatastore is not specified and it defaults to true. This causes an auto-upgrade, which could be dangerous. I believe we should set datanucleus.fixedDatastore to true, or at the very least document it in hive-default.xml.

I say this because my first assumption was that if I build Hive from trunk, point it at a production metastore, and run some select queries for QA, it could "do no harm". However, since datanucleus.fixedDatastore is true, as soon as a client touches the metastore it will make any changes it feels are appropriate. In the future, imagine an old Hive install lying around: what if an old Hive 0.4.0 instance points at a Hive 0.9.0 metastore? It could try to add something that has since been removed, or worse.
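
As a rough sketch of what documenting the property in hive-default.xml could look like: the description wording below is purely illustrative and is not taken from any attached patch.
{noformat}
<property>
 <name>datanucleus.fixedDatastore</name>
 <value>true</value>
 <!-- Hypothetical description text, written for illustration only. -->
 <description>When true, the metastore schema is treated as fixed, so a client cannot create or alter the tables in the metastore database.</description>
</property>
{noformat}
A description along these lines would at least make the behaviour visible to anyone reading the defaults, whichever value ends up being shipped.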





[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969413#action_12969413 ] 

Ning Zhang commented on HIVE-1841:
----------------------------------

The hive-default.xml only sets the first to true. This is good for first-time users who want to get started quickly (the tables needed by the metastore are created automatically) without changing any configuration. This removes friction for beginners.

For experts or administrators of production clusters, these parameters can be set in hive-site.xml. This is actually what we are doing at Facebook (setting a fixed metastore schema). I think this gives you both flexibility/configurability and security. What do you think?
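
For illustration, a production hive-site.xml override along these lines might look like the following. The values are an assumption based on the "fixed metastore schema" setup described above, not a copy of any real site configuration.
{noformat}
<!-- hive-site.xml: sketch of a locked-down production override (assumed values) -->
<property>
 <name>datanucleus.autoCreateSchema</name>
 <value>false</value>
</property>
<property>
 <name>datanucleus.fixedDatastore</name>
 <value>true</value>
</property>
{noformat}
With settings like these, the metastore tables would presumably have to be created ahead of time by an administrator, while hive-default.xml stays convenient for first-time users.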


[jira] Updated: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach updated HIVE-1841:
---------------------------------

    Attachment: HIVE-1841.1.patch.txt

This patch sets datanucleus.fixedDatastore=true in hive-default.xml.



[jira] [Commented] (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Steven Wong (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13063474#comment-13063474 ] 

Steven Wong commented on HIVE-1841:
-----------------------------------

Unfortunately, if both datanucleus.{autoCreateSchema,fixedDatastore} default to true, the schema is not created automatically, which creates friction for beginners.


[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Paul Yang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12969615#action_12969615 ] 

Paul Yang commented on HIVE-1841:
---------------------------------

Seems pretty safe, but this might break ant tests on a clean checkout.


[jira] Commented: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Carl Steinbach (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983405#action_12983405 ] 

Carl Steinbach commented on HIVE-1841:
--------------------------------------

Note that having datanucleus.autoCreateSchema=true and datanucleus.fixedDatastore=false also results in a significant performance hit for MetaStore operations.

See http://www.jpox.org/servlet/forum/viewthread_thread,4066



[jira] Updated: (HIVE-1841) datanucleus.fixedDatastore should be true in hive-default.xml

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo updated HIVE-1841:
----------------------------------

    Description: 
Two datanucleus variables:
{noformat}
<property>
 <name>datanucleus.autoCreateSchema</name>
 <value>false</value>
</property>

<property>
 <name>datanucleus.fixedDatastore</name>
 <value>true</value>
</property>
{noformat}

are dangerous. We do want the schema to auto-create itself, but we do not want the schema to auto-update itself.

Someone might accidentally point a trunk build at the wrong metastore and unknowingly update it. I believe we should turn automatic updates off by default and possibly trap exceptions stemming from Hive wanting to do any update. This way someone has to actively acknowledge the update, either by turning it on and then starting up Hive, or by leaving it off, removing schema-modify privileges from the database user that Hive uses, and doing the updates by hand.



       Priority: Minor  (was: Major)
