Posted to dev@phoenix.apache.org by "Geoffrey Jacoby (JIRA)" <ji...@apache.org> on 2018/11/16 21:16:00 UTC
[jira] [Created] (PHOENIX-5027) PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
Geoffrey Jacoby created PHOENIX-5027:
----------------------------------------
Summary: PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
Key: PHOENIX-5027
URL: https://issues.apache.org/jira/browse/PHOENIX-5027
Project: Phoenix
Issue Type: Bug
Reporter: Geoffrey Jacoby
On two recent occasions I've rebuilt a large global immutable index by doing a DROP/CREATE and ended up with missing index data, though it doesn't happen every time. Here's what happened:
1. PhoenixMRJobSubmitter correctly detects that the index rebuild is necessary, and invokes IndexTool.
2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
3. Some mappers fail because of timeouts due to heavy splitting on the new index table
4. Those mappers are retried and succeed. The MR job as a whole completes successfully.
5. RowCounter and IndexScrutinyTool show that millions of rows are missing from the index, with keys that imply they were part of the failed mappers.
Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in PhoenixIndexImportDirectMapper _looks_ idempotent on a rerun, so I've been struggling to find the cause of the missing index data.
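
For readers unfamiliar with the term, here is a minimal Java sketch of what "idempotent on a rerun" would mean for a retried mapper. It is an illustration only, not the PhoenixIndexImportDirectMapper code: the class IdempotentRerunSketch, the methods indexKey and runAttempt, and the in-memory map standing in for the index table are all hypothetical. The assumption is that each data row produces one upsert keyed by a deterministic index row key, so replaying a split after a partial failure should leave the same final state as a clean run.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class IdempotentRerunSketch {

    /** Hypothetical: derive a deterministic index row key from a data row key. */
    static String indexKey(String dataRowKey) {
        return "idx_" + dataRowKey;
    }

    /**
     * Simulates one mapper attempt over a split. Issues one upsert per row and
     * stops once maxRows rows have been written, to mimic a timeout.
     * Returns true only if the whole split was written.
     */
    static boolean runAttempt(Map<String, String> index, List<String> split, int maxRows) {
        int written = 0;
        for (String dataRowKey : split) {
            if (written == maxRows) {
                return false; // simulated timeout partway through the split
            }
            // Upsert: writing the same key again overwrites, it never duplicates.
            index.put(indexKey(dataRowKey), dataRowKey);
            written++;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> split = List.of("row1", "row2", "row3", "row4");

        // Reference: a single attempt that never fails.
        Map<String, String> cleanRun = new TreeMap<>();
        runAttempt(cleanRun, split, Integer.MAX_VALUE);

        // A first attempt that fails after two rows, then a full retry.
        Map<String, String> retried = new TreeMap<>();
        boolean first = runAttempt(retried, split, 2);
        boolean second = runAttempt(retried, split, Integer.MAX_VALUE);

        // Prints "false true true": the retry converges to the clean-run state.
        System.out.println(first + " " + second + " " + cleanRun.equals(retried));
    }
}
```

Under these assumptions a successful retry cannot leave rows missing, which is why the behavior described in step 5 is surprising and suggests that something outside this simple model is involved.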
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)