Posted to dev@hive.apache.org by "John Sichi (JIRA)" <ji...@apache.org> on 2010/09/22 21:40:34 UTC

[jira] Commented: (HIVE-1496) enhance CREATE INDEX to support immediate index build

    [ https://issues.apache.org/jira/browse/HIVE-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913737#action_12913737 ] 

John Sichi commented on HIVE-1496:
----------------------------------

The implementation will need to chain a task that does the actual index build with a task that does the metastore update.  It should be similar to CREATE TABLE AS SELECT, which both creates the table definition in the metastore and does the equivalent of an INSERT to populate it with the SELECT results.

Use "EXPLAIN CREATE TABLE p AS SELECT * FROM pokes;" to see the combined plan, and see the end of SemanticAnalyzer.genMapRedTasks for where it chains the tasks together.

{noformat}
    if (qb.isCTAS()) {
      // generate a DDL task and make it a dependent task of the leaf
      ...
{noformat}
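
As a rough illustration of the chaining idea only, here is a minimal self-contained sketch. The names SketchTask, addDependentTask, and execute are hypothetical stand-ins invented for this example; they are not Hive's actual task classes or signatures. It assumes a dependent task runs only after its parent has finished, which is how the CTAS case orders the data load before the DDL step.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskChainSketch {
    /** Hypothetical stand-in for a plan task; NOT Hive's real task class. */
    static class SketchTask {
        final String name;
        final List<SketchTask> children = new ArrayList<>();

        SketchTask(String name) {
            this.name = name;
        }

        /** Register a task that may only run after this one has finished. */
        void addDependentTask(SketchTask child) {
            children.add(child);
        }
    }

    /** Runs a task, then its dependents, and returns the execution order. */
    static List<String> execute(SketchTask root) {
        List<String> order = new ArrayList<>();
        run(root, order);
        return order;
    }

    private static void run(SketchTask task, List<String> order) {
        order.add(task.name); // the real work of the task would happen here
        for (SketchTask child : task.children) {
            run(child, order);
        }
    }

    public static void main(String[] args) {
        // CTAS-style ordering: the data load is the leaf, the DDL depends on it.
        SketchTask load = new SketchTask("populate data");
        SketchTask ddl = new SketchTask("metastore update");
        load.addDependentTask(ddl);
        System.out.println(execute(load));
    }
}
```

For immediate index build the same shape would apply, with the two tasks being the index build and the metastore update in whichever order turns out to be workable.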

For immediate index build, we want to combine the existing CREATE INDEX with ALTER INDEX REBUILD.  One hiccup may be that the rebuild already expects the index to be defined in the metastore, whereas for CREATE TABLE AS SELECT we do it in the opposite order (only populating the metastore after the data is successfully loaded).  It may be acceptable to just make the CREATE INDEX non-atomic: populate the metastore first, and if the rebuild fails, leave the index empty.  The user can then retry with ALTER INDEX REBUILD, same as if the build had been deferred in the first place.
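
The non-atomic ordering described above could be sketched roughly as follows. Everything here is an assumption made for illustration: FakeMetastore, IndexState, Rebuild, and createIndex are hypothetical names, and an in-memory map stands in for the metastore. The point is only the control flow: define the index first, attempt the rebuild, and on failure leave the definition in place with no data.

```java
import java.util.HashMap;
import java.util.Map;

public class ImmediateIndexBuildSketch {
    /** Hypothetical index states; not taken from Hive. */
    enum IndexState { EMPTY, BUILT }

    /** Hypothetical in-memory stand-in for the metastore. */
    static final class FakeMetastore {
        final Map<String, IndexState> indexes = new HashMap<>();
    }

    /** Hypothetical stand-in for the work done by ALTER INDEX REBUILD. */
    interface Rebuild {
        void run() throws Exception;
    }

    /**
     * Non-atomic create: returns true if the index was built immediately.
     * On failure the index stays defined but empty, so the caller can
     * retry the rebuild later, as in the deferred case.
     */
    static boolean createIndex(FakeMetastore metastore, String name, Rebuild rebuild) {
        // Step 1: define the index, since the rebuild expects it to exist.
        metastore.indexes.put(name, IndexState.EMPTY);
        try {
            // Step 2: attempt the immediate build.
            rebuild.run();
            metastore.indexes.put(name, IndexState.BUILT);
            return true;
        } catch (Exception e) {
            // Step 3: no rollback; the empty definition is left in place.
            return false;
        }
    }

    public static void main(String[] args) {
        FakeMetastore metastore = new FakeMetastore();
        boolean built = createIndex(metastore, "idx_ok", () -> { });
        boolean failed = createIndex(metastore, "idx_bad",
                () -> { throw new Exception("simulated rebuild failure"); });
        System.out.println(built + " " + metastore.indexes.get("idx_ok"));
        System.out.println(failed + " " + metastore.indexes.get("idx_bad"));
    }
}
```

The trade-off is that a failed statement still leaves a visible side effect (the empty index definition), which is the price of not reordering the rebuild around the metastore update.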

Ning Zhang (nzhang at facebook dot com) did the CREATE TABLE AS SELECT implementation, so he may be able to provide help if you run into trouble with this one.


> enhance CREATE INDEX to support immediate index build
> -----------------------------------------------------
>
>                 Key: HIVE-1496
>                 URL: https://issues.apache.org/jira/browse/HIVE-1496
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.7.0
>            Reporter: John Sichi
>            Assignee: Russell Melick
>             Fix For: 0.7.0
>
>
> Currently we only support WITH DEFERRED REBUILD.
