Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2018/01/19 18:18:00 UTC

[jira] [Comment Edited] (PHOENIX-4519) Index rebuild MR jobs not created for "alter index rebuild async" rebuilds

    [ https://issues.apache.org/jira/browse/PHOENIX-4519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16332673#comment-16332673 ] 

James Taylor edited comment on PHOENIX-4519 at 1/19/18 6:17 PM:
----------------------------------------------------------------

Thanks for the info, [~ankit@apache.org]. A couple of more specific questions:
* Where are the unit tests for IndexToolForPartialBuildIT?
* Do they test all the corner cases as is done by PartialIndexRebuilderIT?
** Multiple versions of same row
** Multiple versions of same row with family deletes intermixed
** Null values of columns
** Index write failure while executing raw scan while partially rebuilding (with multiple batches)
** Data table taking writes to same rows being partially rebuilt (see testUpperBoundSetOnRebuild)
** Disable or rebuild of index during partial rebuild

I filed PHOENIX-4543 so that the MR partial index rebuilder handles the case in which the index is left active while the partial rebuild is happening. Some use cases would rather tolerate some drift between the index and the data table than take the read hit of a disabled index. Since this is use-case dependent, we allow it to be set on a per-table basis through two properties on the htable: DISABLE_INDEX_ON_WRITE_FAILURE (true means the index is disabled, false means it's left active) and REBUILD_INDEX_ON_WRITE_FAILURE (true means the index is partially rebuilt, false means it is not).
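
To make the four combinations of the two properties concrete, here is a minimal sketch in plain Java. It only models the semantics described above. The class, enum, and method names are hypothetical and are not Phoenix's actual API, and it assumes both flags are consulted when an index write fails.

```java
// Illustrative sketch only: these types are hypothetical, not Phoenix classes.
final class IndexWriteFailureSketch {

    /** State the index is left in after a failed index write. */
    enum IndexState { DISABLED, ACTIVE }

    /** Result of applying the two per-table properties. */
    static final class Outcome {
        final IndexState stateAfterFailure;
        final boolean partialRebuild;

        Outcome(IndexState stateAfterFailure, boolean partialRebuild) {
            this.stateAfterFailure = stateAfterFailure;
            this.partialRebuild = partialRebuild;
        }

        @Override
        public String toString() {
            return "index=" + stateAfterFailure + ", partialRebuild=" + partialRebuild;
        }
    }

    /**
     * Maps the two flags to an outcome:
     * DISABLE_INDEX_ON_WRITE_FAILURE: true disables the index, false leaves it active.
     * REBUILD_INDEX_ON_WRITE_FAILURE: true partially rebuilds the index, false does not.
     */
    static Outcome onWriteFailure(boolean disableIndexOnWriteFailure,
                                  boolean rebuildIndexOnWriteFailure) {
        IndexState state = disableIndexOnWriteFailure
                ? IndexState.DISABLED
                : IndexState.ACTIVE;
        return new Outcome(state, rebuildIndexOnWriteFailure);
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean disable : values) {
            for (boolean rebuild : values) {
                System.out.println("DISABLE=" + disable + ", REBUILD=" + rebuild
                        + " -> " + onWriteFailure(disable, rebuild));
            }
        }
    }
}
```

The combination PHOENIX-4543 is concerned with is DISABLE=false, REBUILD=true: the index keeps serving reads while it is being partially rebuilt.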


was (Author: jamestaylor):
Thanks for the info, [~ankit@apache.org]. Couple of more specific questions:
* Where are the unit tests for IndexToolForPartialBuildIT?
* Do they test all the corner cases as is done by PartialIndexRebuilderIT?
** Multiple versions of same row
** Multiple versions of same row with family deletes intermixed
** Null values of columns
** Index write failure while executing raw scan while partially rebuilding (with multiple batches)
** Data table taking writes to same rows being partially rebuilt (see testUpperBoundSetOnRebuild)
** Disable or rebuild of index during partial rebuild

I'll file a JIRA for handling the case in which the index is left active while the partial rebuild is happening. Some use cases would rather tolerate some drift between the index and data table than take the read hit of having a disabled index. Since it's use case dependent, we allow this to be set on a per table basis. This is based on the DISABLE_INDEX_ON_WRITE_FAILURE property on the htable (true means it's disabled, false means it's left active) and REBUILD_INDEX_ON_WRITE_FAILURE (true means to partially rebuild the index while false means not to).

> Index rebuild MR jobs not created for "alter index rebuild async" rebuilds
> --------------------------------------------------------------------------
>
>                 Key: PHOENIX-4519
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4519
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Vincent Poon
>            Assignee: Vincent Poon
>            Priority: Major
>
> It seems we have two ASYNC flags for index rebuilds:
> ASYNC_CREATED_DATE - when an index is created async
> ASYNC_REBUILD_TIMESTAMP - created by "alter index ... rebuild async"
> The PhoenixMRJobSubmitter only submits MR jobs for the former.  We should also submit jobs for the latter.
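
The selection change requested in the description can be sketched as follows. This is not the PhoenixMRJobSubmitter code: the `IndexInfo` type and the method names are hypothetical, and representing an unset flag as `null` is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: hypothetical types, not Phoenix's implementation.
final class AsyncRebuildSelectionSketch {

    /** Minimal stand-in for index metadata; a null field means the flag is not set (assumption). */
    static final class IndexInfo {
        final String name;
        final Long asyncCreatedDate;      // set when the index is created async
        final Long asyncRebuildTimestamp; // set by "alter index ... rebuild async"

        IndexInfo(String name, Long asyncCreatedDate, Long asyncRebuildTimestamp) {
            this.name = name;
            this.asyncCreatedDate = asyncCreatedDate;
            this.asyncRebuildTimestamp = asyncRebuildTimestamp;
        }
    }

    /** Behavior reported in the issue: only ASYNC_CREATED_DATE triggers an MR job. */
    static boolean selectedToday(IndexInfo index) {
        return index.asyncCreatedDate != null;
    }

    /** Requested behavior: either flag triggers an MR job. */
    static boolean selectedAfterFix(IndexInfo index) {
        return index.asyncCreatedDate != null || index.asyncRebuildTimestamp != null;
    }

    /** Returns the names of indexes that would get an MR job under the requested behavior. */
    static List<String> namesToSubmit(List<IndexInfo> indexes) {
        List<String> names = new ArrayList<>();
        for (IndexInfo index : indexes) {
            if (selectedAfterFix(index)) {
                names.add(index.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<IndexInfo> indexes = Arrays.asList(
                new IndexInfo("IDX_CREATED_ASYNC", 1516385880000L, null),
                new IndexInfo("IDX_REBUILD_ASYNC", null, 1516385880000L),
                new IndexInfo("IDX_SYNC", null, null));
        System.out.println(namesToSubmit(indexes));
    }
}
```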


