Posted to dev@nutch.apache.org by "Sebastian Nagel (JIRA)" <ji...@apache.org> on 2013/04/27 23:08:15 UTC

[jira] [Commented] (NUTCH-969) FTP erro encoding

    [ https://issues.apache.org/jira/browse/NUTCH-969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13643796#comment-13643796 ] 

Sebastian Nagel commented on NUTCH-969:
---------------------------------------

The problem is that the URL-encoded path (percent-encoded UTF-8 as per [[RFC3986|http://tools.ietf.org/html/rfc3986#section-2.5]]) has to be "translated" into FTP commands in the encoding used by the FTP server (or its file system).
We could make the encoding configurable via a property "ftp.encoding" and pass its value to [[FTP.setControlEncoding|http://commons.apache.org/proper/commons-net/apidocs/org/apache/commons/net/ftp/FTP.html#setControlEncoding%28java.lang.String%29]].
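
To illustrate the "translation" step, here is a minimal sketch using only the JDK. It decodes a percent-encoded UTF-8 path and re-encodes it in a configurable server charset. The class and method names (FtpPathEncoding, decodePath, toServerBytes) are hypothetical and not part of Nutch. The serverEncoding argument stands in for the value of the proposed "ftp.encoding" property. In Nutch itself the re-encoding would presumably be left to Commons Net by calling setControlEncoding on the FTP client before connecting, rather than done by hand as below.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.Charset;

// Hypothetical helper, for illustration only (not Nutch code).
public class FtpPathEncoding {

    /** Decode a percent-encoded UTF-8 URL path into a Java string. */
    static String decodePath(String urlPath) {
        try {
            // URLDecoder treats '+' as a space (form encoding), so protect
            // literal plus signs before decoding a URL path.
            return URLDecoder.decode(urlPath.replace("+", "%2B"), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Bytes that would have to go over the FTP control connection for this
     * path, given the charset the server (or its file system) uses.
     * serverEncoding is assumed to come from a property such as "ftp.encoding".
     */
    static byte[] toServerBytes(String urlPath, String serverEncoding) {
        return decodePath(urlPath).getBytes(Charset.forName(serverEncoding));
    }

    /** Render bytes as space-separated hex pairs for inspection. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A path containing two Chinese characters, percent-encoded as UTF-8.
        String urlPath = "/%E4%B8%AD%E6%96%87.txt";
        System.out.println("UTF-8:  " + toHex(toServerBytes(urlPath, "UTF-8")));
        System.out.println("GB2312: " + toHex(toServerBytes(urlPath, "GB2312")));
    }
}
```

The two printed byte sequences differ for the non-ASCII characters, which is why a server expecting GB2312 does not find a file whose name is sent as UTF-8.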
                
> FTP erro encoding
> -----------------
>
>                 Key: NUTCH-969
>                 URL: https://issues.apache.org/jira/browse/NUTCH-969
>             Project: Nutch
>          Issue Type: Bug
>         Environment: Ubuntu 10.10 tomcat7 Nutch 1.2
>            Reporter: Te mule
>              Labels: Encoding, FTP, UTF-8
>             Fix For: 2.3
>
>
> I use Nutch to fetch a Chinese FTP site, but the encoding of the fetched FTP URLs is wrong. The FTP server uses GB2312 encoding, but Nutch uses UTF-8. I want to change the encoding to GB2312. How should I do that? I hope the next version of Nutch can change the FTP encoding. Thank you.
