Posted to common-dev@hadoop.apache.org by "Brian Bockelman (JIRA)" <ji...@apache.org> on 2008/10/30 00:06:44 UTC

[jira] Created: (HADOOP-4541) Infinite loop in error handler for libhdfs

Infinite loop in error handler for libhdfs
------------------------------------------

                 Key: HADOOP-4541
                 URL: https://issues.apache.org/jira/browse/HADOOP-4541
             Project: Hadoop Core
          Issue Type: Bug
          Components: libhdfs
    Affects Versions: 0.18.1
            Reporter: Brian Bockelman
         Attachments: libhdfs_failure.txt

If there is a problem writing out to HDFS, libhdfs gets put in an infinite loop.

Unfortunately, my program reroutes stderr/stdout to /dev/null, so I am attaching the strace output. In it you can see the Java stack traces, which are written over and over.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Bockelman resolved HADOOP-4541.
-------------------------------------

    Resolution: Invalid



[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644057#action_12644057 ] 

Pete Wyckoff commented on HADOOP-4541:
--------------------------------------

Looking at the code, hdfsWrite should return -1 when this happens. There are no loops in hdfsWrite or invokeMethod.




[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Owen O'Malley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644051#action_12644051 ] 

Owen O'Malley commented on HADOOP-4541:
---------------------------------------

Any thoughts, Pete or Arun? Pete has had his fingers in there most recently.



[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644204#action_12644204 ] 

Brian Bockelman commented on HADOOP-4541:
-----------------------------------------

Ah, I think this was caused by my error handlers.  Closing for now.



[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644100#action_12644100 ] 

Brian Bockelman commented on HADOOP-4541:
-----------------------------------------

Dummy me forgot the greatest debugging tool of them all: gdb.

Here's the stack trace of one of these suckers in action:

(gdb) where
#0  0x0000002a975a0a47 in CollectedHeap::allocate_from_tlab_slow () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#1  0x0000002a97529551 in CollectedHeap::common_mem_allocate_noinit () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#2  0x0000002a9793374b in typeArrayKlass::allocate () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#3  0x0000002a9769f5e8 in jni_NewByteArray () from /opt/osg/osg-100/jdk1.5/jre/lib/amd64/server/libjvm.so
#4  0x0000002a971ef363 in hdfsWrite (fs=Variable "fs" is not available.
) at hdfs.c:590
#5  0x0000002a970ea97a in globus_l_gfs_hdfs_dump_buffers (hdfs_handle=0x6de080) at globus_gridftp_server_hdfs.c:507

That dump_buffers call (internal to my application) worries me. Give me a few hours to step through things in gdb; if errors aren't propagated correctly in that function, it could lead to an application-side infinite loop.
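
For readers following along, here is a minimal sketch of the kind of caller-side loop that avoids this problem. It is not the code from globus_gridftp_server_hdfs.c or from libhdfs: write_fn, write_all, failing_writer and short_writer are hypothetical names used only for illustration. The only thing taken from this thread is the contract that the write call returns -1 on failure; the stand-in writers let the sketch compile without libhdfs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a write call with the contract described in this
 * thread for hdfsWrite: bytes written on success, -1 on failure. */
typedef long (*write_fn)(void *ctx, const void *buf, size_t len);

/* Write the whole buffer, stopping at the first error.
 * Returns 0 on success, -1 if the writer reported a failure. */
static int write_all(write_fn writer, void *ctx, const char *buf, size_t len)
{
    size_t done = 0;

    assert(writer != NULL);
    while (done < len) {
        long n = writer(ctx, buf + done, len - done);
        if (n < 0) {
            return -1;  /* propagate the error instead of retrying forever */
        }
        if (n == 0) {
            return -1;  /* no progress: treat as an error rather than spin */
        }
        done += (size_t)n;
    }
    return 0;
}

/* Illustrative writer that always fails and counts how often it is called. */
static long failing_writer(void *ctx, const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    ++*(int *)ctx;
    return -1;
}

/* Illustrative writer that accepts at most 4 bytes per call. */
static long short_writer(void *ctx, const void *buf, size_t len)
{
    size_t n = len < 4 ? len : 4;

    (void)buf;
    *(size_t *)ctx += n;
    return (long)n;
}
```

With failing_writer, write_all returns -1 after a single call rather than looping. A caller that ignores that return value and simply calls the writer again on the same buffer would spin for as long as the underlying error persists, which is the application-side loop suspected above.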



[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644052#action_12644052 ] 

Pete Wyckoff commented on HADOOP-4541:
--------------------------------------

I thought this might be related to removing exit calls in libhdfs, but that change was introduced in 0.19, so it's not the culprit if this is indeed 0.18.1. I will look into it some more.

You wouldn't happen to have code to reproduce this problem?




[jira] Commented: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Pete Wyckoff (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644059#action_12644059 ] 

Pete Wyckoff commented on HADOOP-4541:
--------------------------------------

Dhruba mentioned that HADOOP-4517 could be causing the underlying DFS exception, but not the infinite loop. So maybe you could try 0.18.2, which includes the fix for 4517?




[jira] Updated: (HADOOP-4541) Infinite loop in error handler for libhdfs

Posted by "Brian Bockelman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-4541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Bockelman updated HADOOP-4541:
------------------------------------

    Attachment: libhdfs_failure.txt

Attached file is sample strace output.
