Posted to dev@nutch.apache.org by GitBox <gi...@apache.org> on 2022/01/18 16:49:05 UTC

[GitHub] [nutch] sebastian-nagel commented on a change in pull request #721: NUTCH-2923: Added JobId in Job Failure logs

sebastian-nagel commented on a change in pull request #721:
URL: https://github.com/apache/nutch/pull/721#discussion_r786958191



##########
File path: src/java/org/apache/nutch/util/SitemapProcessor.java
##########
@@ -402,9 +402,8 @@ public void sitemap(Path crawldb, Path hostdb, Path sitemapUrlDir, boolean stric
     try {
       boolean success = job.waitForCompletion(true);
       if (!success) {
-        String message = "SitemapProcessor_" + crawldb.toString()
-            + " job did not succeed, job status: " + job.getStatus().getState()
-            + ", reason: " + job.getStatus().getFailureInfo();
+        String message = NutchJob.getJobFailureLogMessage(
+            "SitemapProcessor_" + crawldb.toString(), job);

Review comment:
       I know it was already there, but maybe simplify the job name in the message to just "SitemapProcessor" and leave out the path to the CrawlDb?

##########
File path: src/java/org/apache/nutch/util/NutchJob.java
##########
@@ -81,4 +83,26 @@ public static void cleanupAfterFailure(Path tempDir, Path lock, FileSystem fs)
     }
   }
 
+  /**
+   * Method to return job failure log message. To be used across all Jobs
+   * 
+   * @param name
+   *          Name/Type of the job
+   * @param job
+   *          Job Object for Job details
+   * @return job failure log message
+   * @throws IOException
+   *           Can occur during fetching job status
+   * @throws InterruptedException
+   *           Can occur during fetching job status
+   */
+  public static String getJobFailureLogMessage(String name, Job job)
+      throws IOException, InterruptedException {
+    if (job != null) {
+      return String.format(JOB_FAILURE_LOG_FORMAT, name, job.getJobID(),
+          job.getStatus(), job.getStatus().getFailureInfo());

Review comment:
       Really `job.getStatus()` instead of `job.getStatus().getState()`?
   
   The log output doesn't look readable:
   ```
   2022-01-18 17:37:04,892 ERROR o.a.n.c.Injector [main] Injector job did not succeed, job id: job_local138302276_0001, job status: job-id : job_local138302276_0001uber-mode : falsemap-progress : 1.0reduce-progress : 0.0cleanup-progress : 1.0setup-progress : 1.0runstate : FAILEDstart-time : 0user-name : sebpriority : DEFAULTscheduling-info : NAnum-used-slots0num-reserved-slots0used-mem0reserved-mem0needed-mem0, reason: NA
   ```
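
For comparison, here is a minimal, self-contained sketch of what the message could look like when only the run state is formatted rather than the whole status object. It does not use Hadoop: the `State` enum is a hypothetical stand-in for the job's run state, and the value of `JOB_FAILURE_LOG_FORMAT` is an assumption inferred from the log line above, since the real constant is not part of this hunk.

```java
class JobFailureMessageSketch {

  // Assumed format string, inferred from the log output quoted above.
  // The actual JOB_FAILURE_LOG_FORMAT constant is not shown in this diff.
  static final String JOB_FAILURE_LOG_FORMAT =
      "%s job did not succeed, job id: %s, job status: %s, reason: %s";

  // Hypothetical stand-in for the job run state, so the sketch runs
  // without Hadoop on the classpath.
  enum State { RUNNING, SUCCEEDED, FAILED, PREP, KILLED }

  // Formats the failure message from the run state only, instead of the
  // full status object whose toString() produced the unreadable output.
  static String failureMessage(String name, String jobId, State state,
      String failureInfo) {
    return String.format(JOB_FAILURE_LOG_FORMAT, name, jobId, state,
        failureInfo);
  }

  public static void main(String[] args) {
    // Prints: Injector job did not succeed, job id: job_local138302276_0001, job status: FAILED, reason: NA
    System.out.println(failureMessage("Injector", "job_local138302276_0001",
        State.FAILED, "NA"));
  }
}
```

In `NutchJob.getJobFailureLogMessage` itself, the corresponding change would presumably be to pass `job.getStatus().getState()` as the third format argument.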
