You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2008/09/16 17:20:44 UTC

[jira] Created: (HADOOP-4191) Add a testcase for jobhistory

Add a testcase for jobhistory
-----------------------------

                 Key: HADOOP-4191
                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
             Project: Hadoop Core
          Issue Type: Improvement
          Components: mapred
            Reporter: Amar Kamat


Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672356#action_12672356 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

Hi Amar,
  FAILED_MAPS, FAILED_REDUCES are logged only from logFinished() --- i.e. for a successful job.
Will be checking FINISHED_MAPS, FINISHED_REDUCES, FAILED_MAPS, FAILED_REDUCES for the successful job for which all other job level keys are validated. 
Is the following code being added to validateJobLevelKJeys() look fine ? (looks like too much dependant on the fact that validateJobLevelKeys() is called only after UtilsForTests.runJobSucceed() and that internally calls UtilsForTests.runJob(). Hmm may be we can send expected number of finished_maps, finished_reduces, failed_maps, failed_reduces as parameters ???)


    if (status.equals("SUCCESS")) {
      // Since UtilsForTests.runJob() sets number of maps to be run as 1 and
      // number of reduces to be run as 0, we expect the following values
      // for the case of runJobSucceed()
      String fin_maps = values.get(Keys.FINISHED_MAPS);
      assertTrue("Unexpected number of finished maps in history file",
                 fin_maps == 1);
      String fin_reduces = values.get(Keys.FINISHED_REDUCES);
      assertTrue("Unexpected number of finished reduces in history file",
                 fin_reduces == 0);
      String failed_maps = values.get(Keys.FAILED_MAPS);
      assertTrue("Unexpected number of failed maps in history file",
                 failed_maps == 0);
      String failed_reduces = values.get(Keys.FAILED_REDUCES);
      assertTrue("Unexpected number of finished reduces in history file",
                 failed_reduces == 0);
    }

-Ravi




> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672502#action_12672502 ] 

Amareshwari Sriramadasu commented on HADOOP-4191:
-------------------------------------------------

some comments:
1. Any reason for using LocalFileSystem for the tests?
bq.       mr = new MiniMRCluster(2, "file:///", 3);
I thought the tests should be done with DFS.

2. should we have a test with history location on dfs?

3. I think the test is still missing validation for some fieds such as SPLITS, because earlier we have seen bugs where history stopped logging splits.

4. I propose we should validate all the fields of atleast one map task and one reduce task.
For map task attempt, we should validate  TASK_ATTEMPT_ID, TASKID, START_TIME, FINISH_TIME, TRACKER_NAME, HTTP_PORT, TASK_STATUS, HOSTNAME, STATE_STRING, COUNTERS, ERROR (in case of failure)
For map task, we should validate TASKID, START_TIME, FINISH_TIME, TASK_STATUS, SPLITS, COUNTERS.
For reduce task attempt, we should validate TASK_ATTEMPT_ID, TASKID, START_TIME, SHUFFLE_FINISHED, SORT_FINISHED, FINISH_TIME, TRACKER_NAME, HTTP_PORT, TASK_STATUS, HOSTNAME, STATE_STRING, COUNTERS, ERROR (in case of failure).

Since RunningJob handle and JobClient (you can create one) handle are available, we could get the expected values for all these and validate the values in history.
Thoughts?

5. If the test case is looking too big, we can refactor the tests into different files. thoughts?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: HADOOP-4191.v2.patch

As per Amar's comments, done some cleanup of code, moved runJob(), runJobKill(), runJobFail()  from TestJobKillAndFail to UtilsForTests. Added runJobSucceed() to UtilsForTests. Modified other dependant tests also. Added validation checks for the task Attempt level Keys TRACKER_NAME, HOSTNAME, HTTP_PORT and COUNTERS.

Attaching patch for review. Please provide your comments.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12671215#action_12671215 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

Discussed offline with Amar and came up with the following approach for testing job history:

Run a job that will be succeeded and validate its history file content
  (1) history file exists and in correct location
  (2) Verify if the history file is parsable
  (3) Validate the contents of history file
      (a) validate job level key, values
      (b) validate task level key, values
      (c) validate attempt level key, values
      (d) check if all the TaskAttempts, Tasks started are finished

Run jobs with for the given values of hadoop.job.history.user.location as (1)null(default case), (2)"none", and (3)some dir like "/tmp".
  Validate user history file location in each case.

Run jobs that will be (1) succeeded (2) failed (3) killed.
  Validate job status read from history file in each case.


Thoughts ?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672864#action_12672864 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

1 & 2.  Local file system seems to be good enough.

3. Will add validation of format of SPLITS and TOTAL_MAPS, TOTAL_REDUCES, FINISHED_MAPS, FINISHED_REDUCES, FAILED_MAPS, FAILED_REDUCES.

4. Will do validation of all keys, values of history file by comparing them with actual values obtained from JT.


> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673155#action_12673155 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

Unit tests passed on my local machine.

ant test-patch gave:

     [exec] +1 overall.
     [exec]
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]
     [exec]     +1 tests included.  The patch appears to include 18 new or modified tests.
     [exec]
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec]
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec]
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]
     [exec]     +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
     [exec]
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-4191:
--------------------------------

    Component/s: test

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670637#action_12670637 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

ok. Will validate the file name and location for both user log file and framework log file.

Currently parser would ignore the rest of the line if the pattern KEY="VALUE" is not found somewhere on the line and will not give any errors. In this testcase we validate if the whole line matches the expected regex.

Missed mentioning in earlier comment. For this testcase, we disable speculative execution and see if all the attemptIDs are finished before TaskID finishes.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Vinod K V (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635648#action_12635648 ] 

Vinod K V commented on HADOOP-4191:
-----------------------------------

That's quite a few things to keep track of. Instead of trying to track them *only* via a test case, we can centralize these requirements in something like a schema definition, which can then be used by jobhistory clients to validate a history file before its usage. Thoughts?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>             Fix For: 0.19.0
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi reassigned HADOOP-4191:
------------------------------------

    Assignee: Ravi Gummadi

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670633#action_12670633 ] 

Amareshwari Sriramadasu commented on HADOOP-4191:
-------------------------------------------------

bq. (1) history file name and location are validated.
This has to done in two locations : mapred.job.hitsory.location and mapred.job.history.user.location

bq. (2) Each line's format is checked against a regex.
I guess this will be done by the parser itself. If anyline doesn't match recType {KEY =" VALUE "}* pattern, the parser will fail.

bq. (6) We maintain a Map<String, List<String>> of TaskIDs to TaskAttemptIDs and see if all the atemptIDs are finished before the TaskID finishes.
This will not be the case always. Incase of speculative tasks, TaskAttemptIDs will be killed after the TaskID finishes.



> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681603#action_12681603 ] 

Devaraj Das commented on HADOOP-4191:
-------------------------------------

I committed this to the 0.20 branch.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: h20_4191.patch, HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: HADOOP-4191.patch

Attaching patch with the above approach. Patch also has few code changes to TestJobHistoryVersion such that more issues related to new versions of history files are caught.

Please review the patch and provide your comments.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635642#action_12635642 ] 

Amar Kamat commented on HADOOP-4191:
------------------------------------

My intention behind this was to safeguard the framework against unexpected changes in jobhistory. Some features heavily use jobhistory to get things done. Any incompatible change to jobhistory might break the framework or render these features useless. Here are few things related to jobhistory that are assumed
1) jobhistory filename
2) jobhistory file location(s)
3) versioning .. handled in HADOOP-4190
4) jobhistory format
5) jobhistory line structure
6) jobhistory line content i.e keys and values
7) Features like job-recovery expect few fields to be present in a line. Any change to do with removal of these fields or changing their content/representation should be detected.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>             Fix For: 0.19.0
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Release Note: 
This patch validates job history files so that issues related to jobHistory are caught when changes to JobHistory  are made.
Validates history file name, file location, parsability. Validates user log location for varoius configuration settings. Validates job status for jobs that are succeeded, failed and killed.
Validates format of history file contents and also validates contents by comparing with actual values obtained from job tracker.
    Hadoop Flags: [Reviewed]
          Status: Patch Available  (was: Open)

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12681753#action_12681753 ] 

Hudson commented on HADOOP-4191:
--------------------------------

Integrated in Hadoop-trunk #778 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/778/])
    . Moving the comment about 4191 to the section under 0.20.


> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: h20_4191.patch, HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: HADOOP-4191.v4.1.patch

Attaching new patch fixing a minor bug.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677021#action_12677021 ] 

Hudson commented on HADOOP-4191:
--------------------------------

Integrated in Hadoop-trunk #766 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/766/])
    

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amareshwari Sriramadasu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673147#action_12673147 ] 

Amareshwari Sriramadasu commented on HADOOP-4191:
-------------------------------------------------

patch looks fine

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: HADOOP-4191.v3.patch

Added code for validation of all keys, values of history file by comparing them with actual values obtained from JT.

Attaching the patch. Please review and provide your comments.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12670443#action_12670443 ] 

Ravi Gummadi commented on HADOOP-4191:
--------------------------------------

Here is a proposal for the new testcase for job history:

Run a job and then parse the job history file and validate the following:
(1) history file name and location are validated.
(2) Each line's format is checked against a regex.
(3) Format of all TIMEs are checked against a regex.
(4) Checking if the VALUE in history file for a partivcular KEY is valid, specifically for KEYs like JOB_STATUS, TASK_TYPE, TASK_STATUS, JOB_PRIORITY.
(5) Check if changes to properties like Job_PRIORITY are logged to the history file.
(6) We maintain a Map<String, List<String>> of TaskIDs to TaskAttemptIDs and see if all the atemptIDs are finished before the TaskID finishes.
        We match finish of each TaskAttemptIDs, TaskIDs against its start to make sure that all TaskAttempts, Tasks and Jobs started are
        indeed finished and the history log lines for them are in proper order.


Thoughts ?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12674191#action_12674191 ] 

Amar Kamat commented on HADOOP-4191:
------------------------------------

bq. Some comments:
5. Also check if the time is sane. START_TIME = 0 doesnt make any sense (see HADOOP-5276). Also make sure that FINISH_TIME > START_TIME for each attempt, task and job.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4191:
--------------------------------

    Fix Version/s:     (was: 0.19.0)

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: h20_4191.patch

Attaching patch for branch 0.20. Needs to be committed after committing patches of 5276 & 5278. Unit tests passed on my machine.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: h20_4191.patch, HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12635649#action_12635649 ] 

Amar Kamat commented on HADOOP-4191:
------------------------------------

I am thinking more in terms of a precautionary measure. Instead of creating a history file and then testing whether it adheres to some schema, it would be nice to catch the incompatible change early and let the developer know the effect of the changes made. Otherwise the only way to figure the incompatibility would be see why some of the tests failed and infer the cause. Someone trying to change the history might know about its effect on job-recovery or history-viewer. 

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>             Fix For: 0.19.0
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Ravi Gummadi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-4191:
---------------------------------

    Attachment: HADOOP-4191.v4.patch

Thanks Devaraj for your comments. Done the validation of history files obtained from TestLostTracker, TestJobTrackerRestart, TestJobTrackerRestartWithLostTracker and identified issues in history files HADOOP-5272, HADOOP-5276, HADOOP-5278, HADOOP-5282. Thanks Amar for fixing these new issues fast. Testcase is also modified.

Attaching patch after incorporating code changes based on Devaraj's & Amar's comments. This patch will work only with the patches of above 4 JIRAs.

Please review the patch and provide your comments.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das updated HADOOP-4191:
--------------------------------

    Fix Version/s: 0.21.0
           Status: Open  (was: Patch Available)

Some comments:
1) The Listener should check the format of the TaskID, TaskAttemptID, JobID strings (just doing id.forName()) is sufficient).
2) We should also catch the case where END statement comes before the START (if this happens in a real jobhistory file, the recovery manager will panic).
3) We should check for duplicates (like multiple START or END lines for the same subject) and catch them.
4) It would be nice to capture checks for the sequence of history lines when a TT running this job's tasks is lost.

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672330#action_12672330 ] 

Amar Kamat commented on HADOOP-4191:
------------------------------------

Looks good. Just one minor comment. Can you also check if count for failed/killed maps/reducers are correct. I think you can merge this check with failed/killed job. Thoughts?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Devaraj Das resolved HADOOP-4191.
---------------------------------

    Resolution: Fixed

I just committed this. Thanks, Ravi!

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-4191.patch, HADOOP-4191.v2.patch, HADOOP-4191.v3.patch, HADOOP-4191.v4.1.patch, HADOOP-4191.v4.patch
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-4191) Add a testcase for jobhistory

Posted by "Nigel Daley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nigel Daley updated HADOOP-4191:
--------------------------------

    Fix Version/s: 0.19.0

Amar, can we get this done for 0.19?

> Add a testcase for jobhistory
> -----------------------------
>
>                 Key: HADOOP-4191
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4191
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred, test
>            Reporter: Amar Kamat
>             Fix For: 0.19.0
>
>
> Changes in job history might break the history parser which in turn might break some features like jobtracker-recovery, history-viewer etc. There should be a testcase that catches these incompatible changes early and informs about the expected change.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.