Posted to mapreduce-issues@hadoop.apache.org by "Karthik Kambatla (JIRA)" <ji...@apache.org> on 2016/09/07 17:05:20 UTC

[jira] [Commented] (MAPREDUCE-6638) Do not attempt to recover jobs if encrypted spill is enabled

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15471152#comment-15471152 ] 

Karthik Kambatla commented on MAPREDUCE-6638:
---------------------------------------------

bq. MR does not seem to have any safe store to persist the encryption key across job attempts
One could store the encrypted key in KMS. Once stored, we could do one of the following:
# Tasks have a way to fetch this key directly
#  Leave the tasks as is, but augment the AM to recover this key as part of recovery. 
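
A rough sketch of the second option, under loud assumptions: {{SpillKeyStore}} is a hypothetical stand-in for a KMS client (it is not a Hadoop API), the in-memory implementation exists only for illustration, and the first attempt is assumed to have attempt ID 1. It only shows the control flow the AM might follow, not how the key would be wrapped or authorized in KMS.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

/** Sketch only: every type here is hypothetical, not part of Hadoop. */
final class SpillKeyRecoverySketch {

  /** Hypothetical key store; a real one would call out to KMS. */
  interface SpillKeyStore {
    void put(String jobId, byte[] key);

    /** Returns the stored key, or null if none was stored for this job. */
    byte[] get(String jobId);
  }

  /** In-memory implementation, for illustration and testing only. */
  static final class InMemoryKeyStore implements SpillKeyStore {
    private final Map<String, byte[]> keys = new HashMap<>();

    @Override
    public void put(String jobId, byte[] key) {
      keys.put(jobId, key.clone());
    }

    @Override
    public byte[] get(String jobId) {
      byte[] key = keys.get(jobId);
      return key == null ? null : key.clone();
    }
  }

  /**
   * First attempt: generate a fresh key and persist it.
   * Later attempts: fetch the persisted key. A null result means the
   * key is unavailable, so the AM should not attempt recovery.
   */
  static byte[] obtainSpillKey(SpillKeyStore store, String jobId,
      int attemptId) {
    if (attemptId == 1) {
      byte[] key = new byte[16];
      new SecureRandom().nextBytes(key);
      store.put(jobId, key);
      return key;
    }
    return store.get(jobId);
  }
}
```

With this shape, a null key on a later attempt would simply become one more "Not attempting to recover" condition in the AM.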

bq. Not quite sure about when to make something like this inline or a tiny method.
It is okay to pull this into a separate method. That said, I feel this method could do with improved logging. For instance, when we don't recover, the method dumps a bunch of information that is hard for end-users to parse. Instead, I would like something like this:
{code}
    boolean attemptRecovery = true;
    boolean recoveryEnabled = getConfig().getBoolean(
        MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE,
        MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE_DEFAULT);
    if (!recoveryEnabled) {
      LOG.info("Not attempting to recover. Recovery disabled. To enable " +
          "recovery, set " + MRJobConfig.MR_AM_JOB_RECOVERY_ENABLE);
      attemptRecovery = false;
    }

    boolean recoverySupportedByCommitter = isRecoverySupported();
    if (!recoverySupportedByCommitter) {
      LOG.info("Not attempting to recover. Recovery is not supported by " +
          "committer. Use an OutputCommitter that allows recovery.");
      attemptRecovery = false;
    }

    int numReduceTasks = getConfig().getInt(MRJobConfig.NUM_REDUCES, 0);
    boolean shuffleKeyValidForRecovery =
        TokenCache.getShuffleSecretKey(jobCredentials) != null;
    if (numReduceTasks > 0 && !shuffleKeyValidForRecovery) {
      LOG.info("Not attempting to recover. Shuffle key is not valid for " +
          "recovery.");
      attemptRecovery = false;
    }

    boolean recoverySucceeded = true;
    if (attemptRecovery) {
      LOG.info("Attempting to recover.");
      try {
        parsePreviousJobHistory();
      } catch (IOException e) {
        LOG.warn("Unable to parse prior job history, aborting recovery", e);
        recoverySucceeded = false;
      }
    }
    
    if (!attemptRecovery || !recoverySucceeded) {
      // Get the amInfos anyways whether recovery is enabled or not
      am
{code}
Alternatively, processRecovery could return true if recovery succeeded or if it is the first attempt, and false otherwise.
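
A minimal sketch of that alternative, assuming the first attempt has attempt ID 1. {{RecoverySketch}}, {{HistoryParser}}, and the parameter list are hypothetical stand-ins for MRAppMaster state, not the actual patch; the per-condition logging from the snippet above is omitted for brevity.

```java
import java.io.IOException;

/** Sketch only: parameters stand in for MRAppMaster fields and config. */
final class RecoverySketch {

  /** Hypothetical stand-in for parsePreviousJobHistory(). */
  interface HistoryParser {
    void parse() throws IOException;
  }

  /**
   * Returns true if this is the first attempt or if recovery succeeded,
   * and false if recovery was skipped or failed.
   */
  static boolean processRecovery(int attemptId, boolean recoveryEnabled,
      boolean recoverySupportedByCommitter, int numReduceTasks,
      boolean shuffleKeyValidForRecovery, HistoryParser parser) {
    if (attemptId == 1) {
      return true; // first attempt: nothing to recover
    }
    if (!recoveryEnabled) {
      return false; // recovery disabled by configuration
    }
    if (!recoverySupportedByCommitter) {
      return false; // the OutputCommitter does not allow recovery
    }
    if (numReduceTasks > 0 && !shuffleKeyValidForRecovery) {
      return false; // reducers need a shuffle key that outlives the attempt
    }
    try {
      parser.parse();
      return true;
    } catch (IOException e) {
      return false; // prior job history could not be parsed
    }
  }
}
```

The caller could then read just the amInfos whenever the method returns false, which keeps that fallback in one place.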




> Do not attempt to recover jobs if encrypted spill is enabled
> ------------------------------------------------------------
>
>                 Key: MAPREDUCE-6638
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6638
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: applicationmaster
>    Affects Versions: 2.7.2
>            Reporter: Karthik Kambatla
>            Assignee: Haibo Chen
>         Attachments: mapreduce6638.001.patch, mapreduce6638.002.patch
>
>
> Post the fix to CVE-2015-1776, jobs with encrypted spills enabled cannot be recovered if the AM fails. We should store the key some place safe so they can actually be recovered. If there is no "safe" place, at least we should restart the job by re-running all mappers/reducers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
