Posted to dev@phoenix.apache.org by "Geoffrey Jacoby (JIRA)" <ji...@apache.org> on 2018/11/16 21:16:00 UTC

[jira] [Created] (PHOENIX-5027) PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data

Geoffrey Jacoby created PHOENIX-5027:
----------------------------------------

             Summary: PhoenixIndexImportDirectMapper retried mappers can succeed without inserting all index data
                 Key: PHOENIX-5027
                 URL: https://issues.apache.org/jira/browse/PHOENIX-5027
             Project: Phoenix
          Issue Type: Bug
            Reporter: Geoffrey Jacoby


On two recent occasions I've rebuilt a large global immutable index by doing a DROP/CREATE and ended up with missing index data, though it doesn't happen every time. Here's what happened:

1. PhoenixMRJobSubmitter correctly detects that the index rebuild is necessary and invokes IndexTool.
2. IndexTool enqueues a MapReduce job using PhoenixIndexImportDirectMapper.
3. Some mappers fail because of timeouts caused by heavy splitting on the new index table.
4. Those mappers are retried and succeed. The MR job as a whole completes successfully.
5. RowCounter and IndexScrutinyTool show that millions of rows are missing from the index, with keys that imply they were part of the failed mappers.
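
To illustrate the reasoning in step 5, here is a minimal sketch of how missing row keys could be attributed to mapper input splits. It assumes each mapper covers a contiguous, sorted row-key range; the split boundaries, failed split indexes, and missing keys are all made-up values for illustration, and the helper name is hypothetical (it is not part of Phoenix or of the scrutiny tooling).

```python
from bisect import bisect_right


def count_missing_per_split(missing_keys, split_starts):
    """Count missing row keys per split.

    split_starts must be sorted; split i is assumed to cover the range
    [split_starts[i], split_starts[i + 1]), and the last split is open-ended.
    """
    counts = {}
    for key in missing_keys:
        # Index of the last split whose start key is <= this row key.
        idx = bisect_right(split_starts, key) - 1
        counts[idx] = counts.get(idx, 0) + 1
    return counts


# Hypothetical inputs: four splits, two of which had a failed first attempt.
split_starts = [b"", b"g", b"n", b"t"]
failed_splits = {1, 3}
missing_keys = [b"h001", b"k042", b"u007"]

counts = count_missing_per_split(missing_keys, split_starts)

# Missing keys that fall outside every failed split would point to a different cause.
unexplained = {i: n for i, n in counts.items() if i not in failed_splits}
```

If `unexplained` comes back empty, every missing key lies in the range of a mapper that failed at least once, which is consistent with the retry path being the source of the gap.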

Aside from the timestamp glitch I pointed out in PHOENIX-5018, the code in PhoenixIndexImportDirectMapper _looks_ idempotent on a rerun, so I've been struggling to find the cause of the missing index data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)