Posted to dev@phoenix.apache.org by "Geoffrey Jacoby (Jira)" <ji...@apache.org> on 2020/02/05 17:51:00 UTC
[jira] [Resolved] (PHOENIX-5027) PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
[ https://issues.apache.org/jira/browse/PHOENIX-5027?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Geoffrey Jacoby resolved PHOENIX-5027.
--------------------------------------
Resolution: Fixed
This was resolved as part of PHOENIX-5694.
> PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
> -------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5027
> URL: https://issues.apache.org/jira/browse/PHOENIX-5027
> Project: Phoenix
> Issue Type: Bug
> Reporter: Geoffrey Jacoby
> Assignee: Kadir OZDEMIR
> Priority: Major
>
> On two recent occasions I've rebuilt a large global immutable index by doing a DROP/CREATE and ended up with missing index data, though it doesn't happen every time. Here's what happened:
> 1. PhoenixMRJobSubmitter correctly detects that the index rebuild is necessary, and invokes IndexTool.
> 2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
> 3. Some mappers fail because of timeouts due to heavy splitting on the new index table.
> 4. Those mappers are retried and succeed. The MR job as a whole completes successfully.
> 5. RowCounter and IndexScrutinyTool show millions of rows are missing from the index, with keys that imply they were part of the failed mappers.
> Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in PhoenixIndexImportDirectMapper _looks_ idempotent on a rerun, so I've been struggling to find the cause of the missing index data.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)