Posted to dev@hive.apache.org by "Ashish Thusoo (JIRA)" <ji...@apache.org> on 2009/08/27 04:02:59 UTC

[jira] Created: (HIVE-805) Session level metastore

Session level metastore
-----------------------

                 Key: HIVE-805
                 URL: https://issues.apache.org/jira/browse/HIVE-805
             Project: Hadoop Hive
          Issue Type: New Feature
          Components: Query Processor
    Affects Versions: 0.5.0
            Reporter: Ashish Thusoo
            Assignee: Ashish Thusoo


Implement a shadow metastore that lives in memory and lasts for a session. It can hold definitions for session-specific views, which can be used to implement data flow variables in Hive. It can also be used for testing scripts. We will support the latter use case first: all the DDL statements in the session create objects in the session metastore, and all the queries are converted to explain internal. Any thoughts on load commands?

This feature is enabled when

set hive.session.test = true

is done in the session.
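
The layering described above can be sketched as follows. This is only an illustration under stated assumptions: the class name ShadowCatalogSketch, its methods, and the use of plain maps in place of a real metastore are all hypothetical and are not taken from the patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: a session-scoped, in-memory catalog layered over a shared one. */
class ShadowCatalogSketch {
    // Stand-in for the persistent metastore (table name -> definition).
    private final Map<String, String> shared;
    // Session-local definitions; discarded when the session ends.
    private final Map<String, String> session = new HashMap<>();
    // Mirrors the hive.session.test setting for this session.
    private final boolean sessionTest;

    ShadowCatalogSketch(Map<String, String> shared, boolean sessionTest) {
        this.shared = shared;
        this.sessionTest = sessionTest;
    }

    /** DDL goes to the session catalog in test mode, otherwise to the shared one. */
    void createTable(String name, String definition) {
        (sessionTest ? session : shared).put(name, definition);
    }

    /** Lookups consult the session catalog first, then fall back to the shared one. */
    Optional<String> getTable(String name) {
        String def = session.get(name);
        return Optional.ofNullable(def != null ? def : shared.get(name));
    }

    /** Ending the session drops every session-local definition. */
    void close() {
        session.clear();
    }
}
```

With the flag on, a table created in the session is visible to that session only, and existing tables in the shared catalog stay readable.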

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748565#action_12748565 ] 

Prasad Chakka commented on HIVE-805:
------------------------------------

We need to do this now: Metastore.createTable() creates the directory for you, so when the session level metastore closes, these directories are left behind unnecessarily.
I haven't looked into the code yet.
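
One way to avoid leftover directories is to record each directory the session creates and remove it when the session closes. The sketch below is an assumption-laden illustration: it uses the local filesystem through java.nio.file as a stand-in for HDFS, the class SessionDirCleanupSketch is hypothetical, and it assumes the directories are still empty at close time (which fits a mode where queries do not run).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: remembers directories created for session tables and removes them on close. */
class SessionDirCleanupSketch implements AutoCloseable {
    private final Path root;
    // Directories created by this session, most recent first.
    private final Deque<Path> created = new ArrayDeque<>();

    SessionDirCleanupSketch(Path root) {
        this.root = root;
    }

    /** Creates the table directory and records it only if this session created it. */
    Path createTableDir(String tableName) throws IOException {
        Path dir = root.resolve(tableName);
        if (Files.notExists(dir)) {
            Files.createDirectories(dir);
            created.push(dir);
        }
        return dir;
    }

    /** Deletes the recorded directories; assumes they are empty. */
    @Override
    public void close() throws IOException {
        while (!created.isEmpty()) {
            Files.deleteIfExists(created.pop());
        }
    }
}
```

Directories that existed before the session are never recorded, so closing the session cannot delete a regular table's location.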

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Updated: (HIVE-805) Session level metastore

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-805:
-------------------------------

    Attachment: HIVE-805.patch

Patch that implements this. Please send in your comments.

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Updated: (HIVE-805) Session level metastore

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-805:
-------------------------------

    Attachment: HIVE-805-1.patch

Incorporated Prasad's review comments. I have not yet disabled this for partition tables though.

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805-1.patch, HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829476#action_12829476 ] 

Zheng Shao commented on HIVE-805:
---------------------------------

As with my last comment, I think we should interpret the option "hive.metastore.dryrun" at compile time instead of at execution time.
It should be compiled into the plan.

The execution code can look at the plan and then decide which metastore to "create" the table in.

Now with "view" in, we should also make it work for views.
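
A minimal sketch of the compile-time approach, under stated assumptions: the class DryRunPlanSketch, the CreateTablePlan node, and the map-based stores are hypothetical stand-ins, and only the option name "hive.metastore.dryrun" comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: the dry-run decision is made at compile time and stored in the plan. */
class DryRunPlanSketch {
    /** Plan node for a CREATE TABLE; the target metastore is fixed when the plan is built. */
    static final class CreateTablePlan {
        final String tableName;
        final boolean useSessionMetastore;

        CreateTablePlan(String tableName, boolean useSessionMetastore) {
            this.tableName = tableName;
            this.useSessionMetastore = useSessionMetastore;
        }
    }

    /** Compile step: reads the option once and records the decision in the plan. */
    static CreateTablePlan compile(String tableName, Map<String, String> conf) {
        boolean dryRun = Boolean.parseBoolean(conf.getOrDefault("hive.metastore.dryrun", "false"));
        return new CreateTablePlan(tableName, dryRun);
    }

    /** Execution step: looks only at the plan, never at the live configuration. */
    static void execute(CreateTablePlan plan,
                        Map<String, String> sessionStore,
                        Map<String, String> regularStore) {
        (plan.useSessionMetastore ? sessionStore : regularStore).put(plan.tableName, "created");
    }
}
```

Because the execution step never reads the configuration, changing the option between compilation and execution does not change where the table is created.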


> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-805-1.patch, HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748510#action_12748510 ] 

Ashish Thusoo commented on HIVE-805:
------------------------------------

It is the same location. However, in the session.test mode the queries and DML statements do not run; only an explain output is generated for them. We can extend this later to maintain a different namespace for this data, or we can do that now as well, and it will not be used in the test mode. Thoughts?


> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748570#action_12748570 ] 

Ashish Thusoo commented on HIVE-805:
------------------------------------

Fair enough. Let me fix that. I presume we can create a separate Hive conf variable that keeps track of the temp or test namespace.

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Updated: (HIVE-805) Session level metastore

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-805:
----------------------------

    Fix Version/s:     (was: 0.5.0)

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-805-1.patch, HIVE-805.patch


[jira] Updated: (HIVE-805) Session level metastore

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-805:
----------------------------

    Status: Open  (was: Patch Available)

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.6.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: HIVE-805-1.patch, HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748582#action_12748582 ] 

Zheng Shao commented on HIVE-805:
---------------------------------

In the mode of "hive.session.test = true", we should translate "create" to "create temporary", and "select" to "explain select".

The Metastore/Hive.java code should only look at whether it's "create" or "create temporary" to decide which metastore to use.
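
The translation suggested above can be illustrated with a small sketch. It works on raw statement text purely for brevity; a real implementation would more likely rewrite the parsed statement. The class SessionTestRewriteSketch and its rewrite method are hypothetical, and "CREATE TEMPORARY" is used only as the spelling proposed in this comment.

```java
import java.util.Locale;

/** Hypothetical sketch: rewrites statements when hive.session.test is on. */
class SessionTestRewriteSketch {
    /** Turns "create" into "create temporary" and "select" into "explain select"; other statements pass through. */
    static String rewrite(String sql, boolean sessionTest) {
        if (!sessionTest) {
            return sql;
        }
        String trimmed = sql.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("create temporary ")) {
            return trimmed; // already targets the session metastore
        }
        if (lower.startsWith("create ")) {
            return "CREATE TEMPORARY " + trimmed.substring("create ".length());
        }
        if (lower.startsWith("select ")) {
            return "EXPLAIN " + trimmed;
        }
        return trimmed;
    }
}
```

After this step the metastore-facing code only has to distinguish "create" from "create temporary" and does not need to know about the session flag.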


> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Updated: (HIVE-805) Session level metastore

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-805:
-------------------------------

        Fix Version/s: 0.5.0
    Affects Version/s:     (was: 0.5.0)
                       0.2.0
               Status: Patch Available  (was: Open)

submitting patch.

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748233#action_12748233 ] 

Prasad Chakka commented on HIVE-805:
------------------------------------

What is the HDFS location for tables in the session level metastore? Is it the same location as for a regular table?

> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch


[jira] Commented: (HIVE-805) Session level metastore

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748631#action_12748631 ] 

Prasad Chakka commented on HIVE-805:
------------------------------------

# Can you rename 'test' mode to 'temporary' mode or something like that? 'test' here should mean either dry-run or temporary.
# This patch tries to allow creation of a partition of a regular table in the temporary store. I am sure that it fails. I don't think there is a good solution at all, since the metastore requires the table to exist before creating a partition. Should we allow this at all? If we need it, then we may have to redesign this.
# Once a session table is created, a table parameter should identify it as such. This can be done by adding that parameter before creating the table in the session metastore. alter_table etc., which take in a table object, should depend on this table instead of trying to alter both metastores.
# If ignoreUnknownTab=true then a NoSuchObjectException will not be thrown, so the code below will be incorrect.
{code}
boolean tableDropped = false;
if (this.conf.getBoolVar(HiveConf.ConfVars.HIVESESSIONTEST)) {
  try {
    getSessionMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
    tableDropped = true;
  }
  catch (NoSuchObjectException e) {
    // Ignore if the table is not found
  }
}

if (!tableDropped)
  getMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
{code}

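This pattern can be rewritten as

{code}
if (this.conf.getBoolVar(HiveConf.ConfVars.HIVESESSIONTEST)) {
  try {
    // Pass ignoreUnknownTab=false here so that a table missing from the
    // session metastore raises NoSuchObjectException.
    getSessionMSC().dropTable(dbName, tableName, deleteData, false);
  }
  catch (NoSuchObjectException e) {
    // Not a session table: fall back to the regular metastore.
    getMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
  }
}
else {
  // Outside session test mode only the regular metastore is involved.
  getMSC().dropTable(dbName, tableName, deleteData, ignoreUnknownTab);
}
{code}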
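
The table-parameter idea from the third point can be sketched like this. Everything here is illustrative: the parameter key "session.table", the TableDef stand-in, and the class SessionTableMarkerSketch are hypothetical, since the thread does not name a key or show the table object.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a table parameter marks session tables so that alters go to one store only. */
class SessionTableMarkerSketch {
    // Assumed parameter key; not taken from the patch.
    static final String SESSION_PARAM = "session.table";

    /** Minimal stand-in for a metastore table object with a parameter map. */
    static final class TableDef {
        final String name;
        final Map<String, String> parameters = new HashMap<>();

        TableDef(String name) {
            this.name = name;
        }
    }

    /** Adds the marker before the table is created in the session metastore. */
    static TableDef markAsSessionTable(TableDef table) {
        table.parameters.put(SESSION_PARAM, "true");
        return table;
    }

    static boolean isSessionTable(TableDef table) {
        return "true".equals(table.parameters.get(SESSION_PARAM));
    }

    /** Picks exactly one store for an alter, based on the marker carried by the table object. */
    static Map<String, TableDef> storeFor(TableDef table,
                                          Map<String, TableDef> sessionStore,
                                          Map<String, TableDef> regularStore) {
        return isSessionTable(table) ? sessionStore : regularStore;
    }
}
```

An alter operation would then call storeFor once and update only the returned store, instead of attempting the change against both metastores.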


> Session level metastore
> -----------------------
>
>                 Key: HIVE-805
>                 URL: https://issues.apache.org/jira/browse/HIVE-805
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.2.0
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>             Fix For: 0.5.0
>
>         Attachments: HIVE-805.patch