Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2015/01/27 23:16:35 UTC

[jira] [Commented] (PHOENIX-1609) MR job to populate index tables

    [ https://issues.apache.org/jira/browse/PHOENIX-1609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294298#comment-14294298 ] 

James Taylor commented on PHOENIX-1609:
---------------------------------------

Might be a dup of PHOENIX-413?

Given that functional indexes (PHOENIX-514) are just about ready, we'd need to support expressions instead of just columns (which I think should be fine given our map-reduce integration, right?). There are a few other options to account for as well: local versus global indexes, included columns, and potential properties on the index table; see http://phoenix.apache.org/language/index.html#create_index. It'd also be nice if this way of building an index could optionally be used when a regular CREATE INDEX call is made.

As for the implementation, we'd want to set the index state to Building when the MR job starts and then to Active when it finishes. That way, updates to the data table are propagated to the index table while the build is in progress. See the PIndexState enum and MetaDataService.updateIndexState(). There'd also be some subtleties around coercing the data table type to the correct index table type. See IndexUtil.getIndexColumnDataType().
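
To make the Building-then-Active sequence concrete, here is a rough sketch of the control flow only. It does not use the real Phoenix classes: SketchIndexState, IndexStateUpdater, IndexPopulationJob and IndexBuildSketch are hypothetical stand-ins for PIndexState, the updateIndexState() call and the MR job. Leaving the index in Building when the job fails is my assumption; the right failure handling would still need to be decided.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PIndexState; only the two states discussed here.
enum SketchIndexState { BUILDING, ACTIVE }

// Hypothetical stand-in for the metadata call that changes an index's state.
interface IndexStateUpdater {
    void updateIndexState(String indexTable, SketchIndexState state);
}

// Hypothetical stand-in for the MR job that populates the index table.
interface IndexPopulationJob {
    boolean run(); // true if the job completed successfully
}

final class IndexBuildSketch {
    // Marks the index BUILDING before the job starts, so that concurrent
    // writes to the data table are also applied to the index, and marks it
    // ACTIVE only after the job reports success.
    static boolean buildIndex(String indexTable,
                              IndexStateUpdater updater,
                              IndexPopulationJob job) {
        updater.updateIndexState(indexTable, SketchIndexState.BUILDING);
        boolean succeeded = job.run();
        if (succeeded) {
            updater.updateIndexState(indexTable, SketchIndexState.ACTIVE);
        }
        // Assumption: on failure the index simply stays in BUILDING.
        return succeeded;
    }

    public static void main(String[] args) {
        // Record the state transitions instead of talking to a cluster.
        List<SketchIndexState> transitions = new ArrayList<>();
        boolean ok = buildIndex("INDEX_TABLE",
                (table, state) -> transitions.add(state),
                () -> true);
        System.out.println(ok + " " + transitions);
    }
}
```

In a real patch the updater would go through the existing metadata path rather than a lambda, and the job would be the actual Hadoop job submission.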

> MR job to populate index tables 
> --------------------------------
>
>                 Key: PHOENIX-1609
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1609
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: maghamravikiran
>            Assignee: maghamravikiran
>
> Often, we need to create new indexes on master tables long after the data already exists in them. It would be good to have a simple MR job, provided with the Phoenix code, that users can call to bring indexes in sync with the master table.
> Users can invoke the MR job using the following command:
> hadoop jar org.apache.phoenix.mapreduce.Index -st MASTER_TABLE -tt INDEX_TABLE -columns a,b,c
> Is this ideal? 


