Posted to dev@phoenix.apache.org by "James Taylor (JIRA)" <ji...@apache.org> on 2015/07/29 21:19:04 UTC

[jira] [Commented] (PHOENIX-2154) Failure of one mapper should not affect other mappers in MR index build

    [ https://issues.apache.org/jira/browse/PHOENIX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14646626#comment-14646626 ] 

James Taylor commented on PHOENIX-2154:
---------------------------------------

Right now, the default reducer we use is KeyValueSortReducer. It aggregates the mapper output and writes it as HFiles into a directory the user has passed as an argument to the job. After the MapReduce job completes, the LoadIncrementalHFiles class moves the HFiles onto the HBase table.
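Not the actual Phoenix code, but the flow above can be sketched in miniature. The names mapper, key_value_sort_reduce, and bulk_load are stand-ins for the real MR mapper, KeyValueSortReducer, and LoadIncrementalHFiles:

```python
def mapper(rows):
    # Emit an index KeyValue pair for each data-table row.
    return [("idx:" + row, row) for row in rows]

def key_value_sort_reduce(kvs):
    # Stand-in for KeyValueSortReducer: sort the KeyValues so they can
    # be written out in HFile order.
    return sorted(kvs)

def bulk_load(hfiles, table):
    # Stand-in for LoadIncrementalHFiles: move the completed HFiles
    # into the HBase table.
    table.extend(hfiles)

table = []
bulk_load(key_value_sort_reduce(mapper(["b", "a"])), table)
print(table)  # index rows, sorted, now "in" the table
```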

A couple of possibilities can arise in the above workflow:
a) If all mappers succeed:
        A happy path. HFiles get loaded into the HTable and we then update the state to ACTIVE.

b) If some mappers fail:
     The job is marked as a failure. Even if some mappers have succeeded, the failure of one or more mappers fails the whole job. The output produced by the successful mappers is then discarded as part of the MapReduce cleanup process, so the entire job has to be re-run.
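Case (b) is the costly one. A minimal simulation of that all-or-nothing behavior, with one "mapper" per region (region names and return values here are illustrative, not Phoenix APIs):

```python
def run_job(regions, fail_regions):
    """Simulate an MR index build where each mapper processes one region.

    If any mapper fails, the job fails and the outputs of the mappers
    that succeeded are discarded, mirroring MapReduce cleanup.
    """
    outputs = {}
    for region in regions:
        if region in fail_regions:
            # One failed mapper fails the job; all output is thrown away.
            return None
        outputs[region] = "hfile-for-" + region
    return outputs

# One failing mapper discards everything, even the finished HFiles.
print(run_job(["r1", "r2", "r3"], fail_regions={"r2"}))  # None
# Only a fully clean run produces loadable HFiles.
print(run_job(["r1", "r2", "r3"], fail_regions=set()))
```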

It'd be good to make the mapper task be a kind of independent chunk of work, which once complete, would not need to be run again in case the job is started again.
   Are you referring to the second case above? If so, I'm not sure of an easy way to have the output of mappers persist beyond the job. We could address this by pushing the output KeyValue pairs onto a persistent store like HDFS rather than writing to local filesystem directories. However, I see a challenge when the job is run again: some records (in no particular order in the primary table) have already been processed and written to the persistent store, and if we don't want the mapper to process them again, there has to be some flag or signal to skip them. Mappers usually follow region boundaries, so if the job is re-run with the same region boundaries, we could hack into the framework to run mappers only on those regions where the earlier mappers failed.
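A sketch of that restartable variant, under the assumptions above: a dict stands in for the persistent store (e.g. HDFS), region boundaries are stable across runs, and the helper name run_with_checkpoints is hypothetical:

```python
def run_with_checkpoints(regions, fail_regions, store):
    """Persist each region's mapper output as it completes; on a re-run,
    skip any region whose output already exists in the store and return
    the regions that still failed."""
    failed = []
    for region in regions:
        if region in store:
            continue  # built on an earlier attempt; no need to re-run
        if region in fail_regions:
            failed.append(region)
            continue
        store[region] = "hfile-for-" + region
    return failed

store = {}
# First attempt: r2's mapper fails, but r1 and r3 persist their output.
print(run_with_checkpoints(["r1", "r2", "r3"], {"r2"}, store))  # ['r2']
# Second attempt: only r2 is actually re-processed.
print(run_with_checkpoints(["r1", "r2", "r3"], set(), store))   # []
```

Once all regions are present in the store, the HFiles could be bulk-loaded exactly as in the happy path.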

> Failure of one mapper should not affect other mappers in MR index build
> -----------------------------------------------------------------------
>
>                 Key: PHOENIX-2154
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2154
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> Once a mapper in the MR index job succeeds, it should not need to be re-done in the event of the failure of one of the other mappers. The initial population of an index is based on a snapshot in time, so new rows written *after* the index build has started and/or failed do not impact it.
> Also, there's a 1:1 correspondence between index rows and table rows, so there's really no need to dedup. However, the index rows will have a different row key than the data table, so I'm not sure how the HFiles are split. Will they potentially overlap and is this an issue?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)