Posted to issues@commons.apache.org by "Nils Martin Sande (JIRA)" <ji...@apache.org> on 2011/04/29 12:37:03 UTC

[jira] [Created] (NET-406) Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.

Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.
----------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: NET-406
                 URL: https://issues.apache.org/jira/browse/NET-406
             Project: Commons Net
          Issue Type: Improvement
          Components: FTP
    Affects Versions: 2.0
         Environment: Linux
            Reporter: Nils Martin Sande


During file download it is possible for the transfer to stop without any timeout being triggered in FtpClient. The result is a "lost" thread (it never returns from retrieveFile). Calling interrupt on the missing thread does not do anything.

When we encountered this error we first tried to monitor the downloaded file and call FtpClient.disconnect if the size on disk remained unchanged for a given period of time. This did not solve the problem, since FtpClient.disconnect does not close the socket used by the transfer. Changing FtpClient so that it keeps track of active sockets and closes all of them in FtpClient.disconnect solved the problem for us. Although this error is not very common, the ability to completely kill the connection, including all transfers, seems like a useful feature. I have included the details of the modifications we made; maybe someone will find them useful. If there is a better way to get around this, please let me know.

{code:title=FtpClient.java_retrieveFile|borderStyle=solid}
//The modified retrieveFile method (__activeSockets is a synchronized List<Socket>)
public boolean retrieveFile(String remote, OutputStream local)
            throws IOException {
        InputStream input;
        Socket socket;

        if ((socket = _openDataConnection_(FTPCommand.RETR, remote)) == null) {
            return false;
        }
        __activeSockets.add(socket);
        try {
            input = new BufferedInputStream(socket.getInputStream(),
                    getBufferSize());
            if (__fileType == ASCII_FILE_TYPE) {
                input = new FromNetASCIIInputStream(input);
            }
            // Treat everything else as binary for now
            try {
                Util.copyStream(input, local, getBufferSize(),
                        CopyStreamEvent.UNKNOWN_STREAM_SIZE, null,
                        false);
            } catch (IOException e) {
                try {
                    socket.close();
                } catch (IOException f) {
                    // ignored: the original exception e is rethrown below
                }
                throw e;
            }
            socket.close();
            return completePendingCommand();
        } finally {
            __activeSockets.remove(socket);
        }
    }

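//The modified disconnect method
public void disconnect() throws IOException {
        super.disconnect();
        __initDefaults();
        IOException exception = null;
        // Iterating a synchronized list is not atomic, so hold its lock for the loop
        synchronized (__activeSockets) {
            for (Socket socket : __activeSockets) {
                try {
                    socket.close();
                } catch (IOException ex) {
                    exception = ex; // remember the last failure, keep closing the rest
                }
            }
            __activeSockets.clear();
        }
        if (exception != null) {
            throw exception;
        }
    }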
{code}
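
For readers who want to try the monitoring approach described above (watch the file on disk and act when its size stops changing), here is a minimal sketch using only the JDK. The class name StallWatchdog, its parameters, and the onStall callback are hypothetical and not part of Commons Net; onStall stands for whatever action terminates the transfer.

```java
import java.io.File;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: runs onStall once if the file stops growing. */
final class StallWatchdog implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "stall-watchdog");
                t.setDaemon(true); // do not keep the JVM alive for the watchdog
                return t;
            });
    // Both fields are only touched by the single scheduler thread.
    private long lastSize = -1;
    private long lastChangeNanos = System.nanoTime();

    StallWatchdog(File file, long stallMillis, long pollMillis, Runnable onStall) {
        scheduler.scheduleWithFixedDelay(() -> {
            long size = file.length(); // 0 if the file does not exist yet
            long now = System.nanoTime();
            if (size != lastSize) {
                // Progress was made: remember the new size and reset the clock.
                lastSize = size;
                lastChangeNanos = now;
            } else if (now - lastChangeNanos >= TimeUnit.MILLISECONDS.toNanos(stallMillis)) {
                // No growth for stallMillis: fire once, then stop polling.
                try {
                    onStall.run();
                } finally {
                    scheduler.shutdown();
                }
            }
        }, pollMillis, pollMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the watchdog, for example after the transfer has finished normally. */
    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

In the setup described in this report, onStall would call the modified disconnect(); with an unmodified client that call would not close the data socket, which is the point of this issue.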



--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (NET-406) Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.

Posted by "Sebb (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NET-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028154#comment-13028154 ] 

Sebb commented on NET-406:
--------------------------

Have you used FTPClient.setDataTimeout() to set a data timeout?

Does that work for some cases or none?

If none, then that needs to be investigated.
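
As far as I understand it, setDataTimeout() ends up as the SO_TIMEOUT of the data socket, so a read that receives nothing for that long should fail with a SocketTimeoutException instead of blocking forever. The sketch below shows only that underlying mechanism with plain JDK sockets; it does not use Commons Net, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

final class ReadTimeoutDemo {
    /** Returns true if a read on a silent connection times out. */
    static boolean readTimesOut(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {    // peer never writes: a "stalled" transfer
            client.setSoTimeout(timeoutMillis);  // the socket-level read timeout
            try {
                client.getInputStream().read();  // blocks until data, EOF, or timeout
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timed out: " + readTimesOut(500));
    }
}
```

If a hang still occurs with a data timeout set, the thread is presumably blocked somewhere other than a socket read, which would be worth knowing.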

Not sure it's necessary to use a List<Socket>; as far as I can tell, there can be at most one active data socket, which could be stored in a volatile field instead.

Also, I'm not sure that extending disconnect is the correct place to do this, as that also terminates the control connection.

It might be better to provide a means to close the data socket directly.
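
To make the suggestion concrete, here is a rough sketch of a single volatile field plus a separate method that closes only the data socket. It is plain JDK code under my own assumptions, not a patch: AbortableTransfer, drain and abortTransfer are hypothetical names, and the real FTPClient would need the field set wherever a data connection is opened.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.Socket;

/** Hypothetical sketch; not part of Commons Net. */
class AbortableTransfer {
    // At most one data connection is active at a time, so one field suffices.
    private volatile Socket dataSocket;

    /** Reads the socket until EOF and returns the number of bytes seen. */
    long drain(Socket socket) throws IOException {
        dataSocket = socket;
        try (InputStream in = socket.getInputStream()) {
            byte[] buf = new byte[8192];
            long total = 0;
            for (int n; (n = in.read(buf)) != -1; ) {
                total += n;
            }
            return total;
        } finally {
            dataSocket = null; // nothing left to abort
        }
    }

    /** May be called from another thread; a blocked read then fails with a SocketException. */
    void abortTransfer() throws IOException {
        Socket s = dataSocket; // read the volatile field once
        if (s != null) {
            s.close();
        }
    }
}
```

The control connection is left alone here, so the caller can still decide whether to send ABOR, log out cleanly, or disconnect.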

> Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.
> ----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: NET-406
>                 URL: https://issues.apache.org/jira/browse/NET-406
>             Project: Commons Net
>          Issue Type: Improvement
>          Components: FTP
>    Affects Versions: 2.0
>         Environment: Linux
>            Reporter: Nils Martin Sande
>              Labels: features

[jira] [Commented] (NET-406) Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.

Posted by "Sebb (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NET-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028199#comment-13028199 ] 

Sebb commented on NET-406:
--------------------------

Sounds like you did not have a data timeout set previously. 
If that is the case, then no wonder the data connection can stall.

By the way, what is the FTP server software and OS?
Is there any pattern to the hangs - e.g. only files larger than a certain size?

Also, if there is a need to be able to interrupt data channel transfers, it should be applicable to all data channel instances, not just the one set up by retrieveFile(), so the current data channel would need to be updated in many other places.
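
One way to avoid touching every transfer method separately might be to route all data-channel work through a single wrapper that records the current socket. The following is only a sketch of that idea in plain JDK code; DataChannelTracker, Transfer, track and abort are hypothetical names and nothing like this exists in Commons Net as far as this thread shows.

```java
import java.io.IOException;
import java.net.Socket;

/** Hypothetical sketch: one place that tracks whichever data socket is current. */
final class DataChannelTracker {
    @FunctionalInterface
    interface Transfer<T> {
        T run(Socket socket) throws IOException;
    }

    private volatile Socket current;

    /** Every transfer (download, upload, listing) would go through here. */
    <T> T track(Socket socket, Transfer<T> transfer) throws IOException {
        current = socket;
        try {
            return transfer.run(socket);
        } finally {
            current = null;
            socket.close(); // closing an already closed socket is harmless
        }
    }

    /** Returns true if there was an active data socket to close. */
    boolean abort() throws IOException {
        Socket s = current;
        if (s == null) {
            return false;
        }
        s.close();
        return true;
    }
}
```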


[jira] [Commented] (NET-406) Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.

Posted by "Nils Martin Sande (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NET-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028183#comment-13028183 ] 

Nils Martin Sande commented on NET-406:
---------------------------------------

We did play around with the various timeout settings, but I am unsure whether that worked for none or some of the cases.

I have updated the code base so that we use "FtpClient.setDataTimeout(timeout-5000)". The timeout variable is set to 30000 and is used by the file size monitoring thread, so the data timeout is 25000 ms and should trigger before the monitor does. This should give us the answer to the "some cases or none" question. The connection can go several weeks between hangs, so it might take some time (we are unable to reproduce this at will).

I agree that a separate "terminate transfer" method would be better than the modified disconnect(). The code example is simply a minimum-effort solution. The problem only appeared after the move to production, so we were a bit short on time.




[jira] [Commented] (NET-406) Improved disconnect handling in FtpClient: retrieveFile never returns under certain conditions and calling FtpClient.disconnect does not terminate the transfer.

Posted by "Joerg Schaible (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NET-406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026962#comment-13026962 ] 

Joerg Schaible commented on NET-406:
------------------------------------

??Calling interrupt on the missing thread does not do anything.??

This is not entirely true, since it is system dependent. On Solaris the connection is interrupted and an InterruptedIOException is thrown, while on Windows and Linux the connection is not affected by the interrupt. I cannot say for other operating systems.
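
Anyone who wants to check their own platform could use a small probe along these lines. It is a plain JDK sketch with hypothetical names: it blocks a thread in a socket read, interrupts it, and reports whether the interrupt alone ended the read. The result depends on the OS and JVM, so no particular output is claimed here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

final class InterruptProbe {
    /** Returns true if Thread.interrupt() alone ended a blocked socket read. */
    static boolean interruptUnblocksRead(long waitMillis) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {   // peer stays silent
            Thread reader = new Thread(() -> {
                try {
                    client.getInputStream().read();
                } catch (IOException e) {
                    // reached on interrupt (some platforms) or when the socket is closed
                }
            });
            reader.start();
            Thread.sleep(waitMillis);   // give the reader time to block in read()
            reader.interrupt();
            reader.join(waitMillis);
            boolean unblocked = !reader.isAlive();
            client.close();             // closing the socket ends the blocked read
            reader.join();
            return unblocked;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("interrupt unblocked read: " + interruptUnblocksRead(500));
    }
}
```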
