Posted to common-dev@hadoop.apache.org by "Amar Kamat (JIRA)" <ji...@apache.org> on 2007/10/25 07:26:50 UTC

[jira] Created: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

File handles for log files are still open in case of jobs with 0 maps
---------------------------------------------------------------------

                 Key: HADOOP-2098
                 URL: https://issues.apache.org/jira/browse/HADOOP-2098
             Project: Hadoop
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.15.0
            Reporter: Amar Kamat
            Assignee: Amar Kamat
             Fix For: 0.16.0


When a job with zero maps is submitted, the handle for that job's log file remains open and can be seen using {{lsof}}. Over time this could lead to a {{Too many open files}} exception.
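
To illustrate the kind of leak described above, here is a minimal sketch. It is not the HADOOP-2098 patch and not Hadoop's job-history code; {{JobLogSketch}} and all of its methods are hypothetical names invented for this example. The assumption is that a log writer is opened at submission and must be closed on every completion path, including the one where there are no maps to run.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for per-job logging; not Hadoop's actual class. */
class JobLogSketch {
    // Open writers keyed by job id; an entry left here is a leaked file handle.
    private final Map<String, PrintWriter> openLogs = new HashMap<>();
    private final Path logDir;

    JobLogSketch(Path logDir) {
        this.logDir = logDir;
    }

    /** Opens the log file when the job is submitted. */
    void jobSubmitted(String jobId) throws IOException {
        PrintWriter w = new PrintWriter(
                Files.newBufferedWriter(logDir.resolve(jobId + ".log")));
        w.println("Job " + jobId + " submitted");
        openLogs.put(jobId, w);
    }

    /** Closes and forgets the writer; a no-op if the job is unknown. */
    void jobFinished(String jobId) {
        PrintWriter w = openLogs.remove(jobId);
        if (w != null) {
            w.println("Job " + jobId + " finished");
            w.close();
        }
    }

    /** Runs a job; the finally block also covers the zero-map path. */
    void runJob(String jobId, int numMaps) throws IOException {
        jobSubmitted(jobId);
        try {
            for (int i = 0; i < numMaps; i++) {
                openLogs.get(jobId).println("map " + i + " done");
            }
        } finally {
            jobFinished(jobId);
        }
    }

    /** Number of log files still held open. */
    int openLogCount() {
        return openLogs.size();
    }
}
```

In this sketch, calling {{runJob("job_0001", 0)}} leaves {{openLogCount()}} at 0 because the close happens in {{finally}} rather than after the last map finishes.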

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537633 ] 

Owen O'Malley commented on HADOOP-2098:
---------------------------------------

No, I don't think throwing an exception is the right answer. It is perfectly reasonable to set up a map/reduce job that reads from a directory every 30 minutes. It should not be an error for the input directory to be empty. It would be like "cat < /dev/null" causing an error...
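
The argument above can be sketched in plain Java, with no Hadoop APIs involved. {{EmptyInputJobSketch}} and {{run}} are hypothetical names; the "job" here simply counts lines across the files in a directory. The point is only that an empty directory produces zero units of work and a normal, empty result.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical job that tolerates an empty input directory. */
class EmptyInputJobSketch {
    /** Counts lines across all regular files in inputDir. */
    static long run(Path inputDir) throws IOException {
        List<Path> splits;
        try (Stream<Path> entries = Files.list(inputDir)) {
            splits = entries.filter(Files::isRegularFile)
                            .collect(Collectors.toList());
        }
        long total = 0;
        // For an empty directory this loop body never runs.
        for (Path split : splits) {
            try (Stream<String> lines = Files.lines(split)) {
                total += lines.count();
            }
        }
        // Empty input yields 0 rather than an error.
        return total;
    }
}
```

A scheduler that calls {{run}} every 30 minutes would then see a zero result on quiet intervals instead of having to catch and ignore an exception.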




[jira] Commented: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12538081 ] 

Hudson commented on HADOOP-2098:
--------------------------------

Integrated in Hadoop-Nightly #283 (See [http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Nightly/283/])



[jira] Issue Comment Edited: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537563 ] 

amar_kamat edited comment on HADOOP-2098 at 10/25/07 3:28 AM:
--------------------------------------------------------------

@Arun: I did as suggested, but then the test {{org.apache.hadoop.mapred.TestEmptyJobWithDFS}} failed, which kind of makes sense. So for now I will submit this patch, and let's discuss it here.



[jira] Updated: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-2098:
-------------------------------

    Attachment: HADOOP-2098.patch



[jira] Updated: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-2098:
-------------------------------

    Status: Patch Available  (was: Open)

@Arun: I did as suggested, but then the test {{org.apache.hadoop.mapred.TestEmptyJobWithDFS}} failed, which kind of makes sense. So for now I will submit this patch, and let's discuss it here.



[jira] Updated: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2098:
----------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I just committed this. Thanks, Amar!



[jira] Commented: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537674 ] 

Arun C Murthy commented on HADOOP-2098:
---------------------------------------

Ok, I withdraw my objection. I'll commit this one as-is.



[jira] Commented: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537575 ] 

Arun C Murthy commented on HADOOP-2098:
---------------------------------------

Good point.

Anyone care to chime in on why a job without input files should not be treated as *invalid*?



[jira] Issue Comment Edited: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537539 ] 

acmurthy edited comment on HADOOP-2098 at 10/25/07 1:26 AM:
-----------------------------------------------------------------

We should also check for zero input files in {{FileInputFormat.validateInput(JobConf)}} and throw an {{InvalidInputException}} in the first place. 

Something along the lines of:
{noformat}
Index: /home/arun/dev/java/hadoop/HADOOP-1881/src/java/org/apache/hadoop/mapred/FileInputFormat.java
===================================================================
--- src/java/org/apache/hadoop/mapred/FileInputFormat.java	(revision 588140)
+++ src/java/org/apache/hadoop/mapred/FileInputFormat.java	(working copy)
@@ -150,6 +150,9 @@
         }
       }
     }
+    if (totalFiles == 0) {
+      result.add(new IOException("Found zero input files"));
+    }
     if (!result.isEmpty()) {
       throw new InvalidInputException(result);
     }
{noformat}
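
For readers without the surrounding source, the check proposed in the diff above can be sketched as a self-contained example. This is not {{FileInputFormat}} itself: {{InputValidationSketch}}, {{InvalidInputSketchException}} and the use of local directories via {{java.nio.file}} are all assumptions made for illustration. Only the collect-then-throw pattern and the "Found zero input files" message come from the diff. Note that elsewhere in this thread the proposal is argued against, on the grounds that empty input is a legitimate case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical illustration of the proposed zero-input check. */
class InputValidationSketch {
    /** Hypothetical aggregate exception carrying every problem found. */
    static class InvalidInputSketchException extends IOException {
        final List<IOException> problems;

        InvalidInputSketchException(List<IOException> problems) {
            super(problems.size() + " input problem(s)");
            this.problems = problems;
        }
    }

    /** Collects problems over all input directories, then fails once. */
    static void validateInput(List<Path> inputDirs) throws IOException {
        List<IOException> result = new ArrayList<>();
        int totalFiles = 0;
        for (Path dir : inputDirs) {
            if (!Files.isDirectory(dir)) {
                result.add(new IOException("Input path does not exist: " + dir));
                continue;
            }
            try (Stream<Path> entries = Files.list(dir)) {
                totalFiles += (int) entries.filter(Files::isRegularFile).count();
            }
        }
        // The proposed addition: treat "no files at all" as invalid input.
        if (totalFiles == 0) {
            result.add(new IOException("Found zero input files"));
        }
        if (!result.isEmpty()) {
            throw new InvalidInputSketchException(result);
        }
    }
}
```

With this check in place, a job pointed at an existing but empty directory fails at validation time, which is the behavior the {{TestEmptyJobWithDFS}} failure mentioned in the thread would be expected to expose.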


      was (Author: acmurthy):
    We should also check for zero input files in {{FileInputFormat.validateInput(JobConf)}} and throw an {{InvalidInputException}} in the first place. Prevention is better than cure...
  


[jira] Updated: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Arun C Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated HADOOP-2098:
----------------------------------

    Status: Open  (was: Patch Available)

We should also check for zero input files in {{FileInputFormat.validateInput(JobConf)}} and throw an {{InvalidInputException}} in the first place. Prevention is better than cure...



[jira] Updated: (HADOOP-2098) File handles for log files are still open in case of jobs with 0 maps

Posted by "Amar Kamat (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-2098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amar Kamat updated HADOOP-2098:
-------------------------------

    Status: Patch Available  (was: Open)
