You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pig.apache.org by "Santhosh Srinivasan (JIRA)" <ji...@apache.org> on 2008/12/29 23:47:44 UTC

[jira] Created: (PIG-591) Error handling phase four

Error handling phase four
-------------------------

                 Key: PIG-591
                 URL: https://issues.apache.org/jira/browse/PIG-591
             Project: Pig
          Issue Type: Sub-task
          Components: grunt, impl, tools
    Affects Versions: types_branch
            Reporter: Santhosh Srinivasan
            Assignee: Santhosh Srinivasan
             Fix For: types_branch


Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-591) Error handling phase four

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Santhosh Srinivasan updated PIG-591:
------------------------------------

    Patch Info: [Patch Available]

> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-591) Error handling phase four

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Santhosh Srinivasan updated PIG-591:
------------------------------------

    Attachment: Error_handling_phase4_1.patch

Attached patch incorporates most of the review comments. All unit test cases pass.

> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch, Error_handling_phase4_1.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-591) Error handling phase four

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676399#action_12676399 ] 

Pradeep Kamath commented on PIG-591:
------------------------------------

Code review comments 

The patch looks good to go with minor observations below:

- System.err.println() message in PgHadoopLogger.warn() seems like a debug statement
- In EvalFunc.progress() there is:
 log.warn("No reporter object provided to UDF " + this.getClass().getName());  
 Shouldn't this go through the PigLogger?
- If we want warning aggregation in UDF, should the UDF writer create new entries in PigWarning
(If so, the UDF manual should probably outline this)
- Is there a reason why initialized needs to be volatile in PigMapBase? There
should be only one Map thread in the map() function. If there is a reason for it to be 
volatile, does it apply to PigMapReduce, PigCombiner  and POUserFunc as well?
- In POUserFunc.instantiateFunc() should we still set the Reporter and PigLogger if the
assignments don't actually work and we rely on processinput() for these initializations?
- In DefaultAbstractBag warn() should mimic Utf8StorageConvertor
- GruntParser.java has only a whitespace change (the change should be reverted since earlier
there were spaces and now there is a tab).




> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (PIG-591) Error handling phase four

Posted by "Pradeep Kamath (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pradeep Kamath resolved PIG-591.
--------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

Santhosh, thanks for the feature contribution. Patch commited with the following changes in HExecution.java:
{code}
@@ -200,7 +200,7 @@
         }
         catch (IOException e) {
             int errCode = 6009; 
-            String msg = "Failed to create job client";
+            String msg = "Failed to create job client:" + e.getMessage();
             throw new ExecException(msg, errCode, PigException.BUG, e);
         }       
     }
@@ -549,11 +549,20 @@
             //this should return as soon as connection is shutdown
             int rc = p.waitFor();
             if (rc != 0) { 
-                String errMsg = new String();
+                StringBuilder errMsg = new StringBuilder();
                 try {   
-                    BufferedReader br = new BufferedReader(new InputStreamReader(p.getErrorStream()));
-                    errMsg = br.readLine();
+                    BufferedReader br = new BufferedReader(new InputStreamReader(p.getInputStream()));
+                    String line = null;
+                    while((line = br.readLine()) != null) {
+                        errMsg.append(line);
+                    }
                     br.close();
+                    br = new BufferedReader(new InputStreamReader(p.getErrorStream()));
+                    line = null;
+                    while((line = br.readLine()) != null) {
+                        errMsg.append(line);
+                    }
+                    br.close();
                 } catch (IOException ioe) {} 
                 int errCode = 6011; 
                 StringBuilder msg = new StringBuilder("Failed to run command ");
@@ -563,7 +572,7 @@
                 msg.append("; return code: ");
                 msg.append(rc);
                 msg.append("; error: ");
-                msg.append(errMsg);
+                msg.append(errMsg.toString());
                 throw new ExecException(msg.toString(), errCode, PigException.REMOTE_ENVIRONMENT);
             }
         } catch (Exception e){

{code}

These extra changes are need so that the right error message is shown when there is an error while connecting to DFS. Since this is the last error handling related patch it seemed logical to add this with this patch. The above change has been taken from the patch submitted for http://issues.apache.org/jira/browse/PIG-682. So when http://issues.apache.org/jira/browse/PIG-682 is finally committed this portion can be omitted.

> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch, Error_handling_phase4_1.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-591) Error handling phase four

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12676456#action_12676456 ] 

Santhosh Srinivasan commented on PIG-591:
-----------------------------------------

1. The System.err.println has to be removed. It was an oversight.
2. Thats a tricky one. The warning message is used to indicate that the reporter was not set. The reporter uses the Hadoop reporter internally. Its highly likely that the Hadoop reporter which is also used in the PigLogger is not set. I will delegate this to the PigLogger to make it consistent.
3. UDF writers cannot create new warning enums in PigWarning. There are two reasons:
   1. PigWarning is part of Pig sources
   2. Java does not allow Enums to be extended (i.e., inherited)
A possible solution in the future could be to use a registration mechanism where UDFs register their warning enums and Pig will be aware of the warnings that are aggregated. The UDF manual has to be updated to reflect this.
4. The volatile is used only in PigMapBase as mappers will be multi-threaded in future Hadoop versions. Yes, this change can be made in the Combiner and the Reducer too.
5. I am following the convention for the reporter. There is an existing comment which states that the re-initialization in the processInput() was added as a safety. The root cause is not clear.
6. Agreed.
7. Agreed.

> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (PIG-591) Error handling phase four

Posted by "Santhosh Srinivasan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Santhosh Srinivasan updated PIG-591:
------------------------------------

    Attachment: Error_handling_phase4.patch

Attached file implements the warning aggregation feature. Note that no new unit tests have been added. All existing unit tests pass.

> Error handling phase four
> -------------------------
>
>                 Key: PIG-591
>                 URL: https://issues.apache.org/jira/browse/PIG-591
>             Project: Pig
>          Issue Type: Sub-task
>          Components: grunt, impl, tools
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: types_branch
>
>         Attachments: Error_handling_phase4.patch
>
>
> Phase four of the error handling feature will address the warning message cleanup and warning message aggregation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.