Posted to dev@nutch.apache.org by "luoleicn@gmail.com" <lu...@gmail.com> on 2011/01/11 01:49:27 UTC
Patch for fetching URLs that contain Chinese characters
Hi guys:
The URLs of some files on the internet may contain Chinese or other
non-ASCII Unicode characters. For example:
http://www.example.com/中文.pdf
But Nutch does not encode such URLs correctly, so I am submitting this patch,
which uses URLEncoder to encode them correctly.
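
To illustrate the idea (this is only a rough sketch, not the code from the patch itself): URLEncoder applied to a whole URL would also escape ":" and "/", so one possible approach is to percent-encode only the non-ASCII characters as UTF-8 and leave everything else alone. The class name UrlPathEncoder and the method encodeNonAscii are hypothetical names made up for this example and are not part of Nutch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UrlPathEncoder {

    // Hypothetical helper (not a Nutch API): percent-encodes every
    // non-ASCII code point as UTF-8 and copies ASCII characters unchanged,
    // so scheme, host separators and slashes are preserved.
    public static String encodeNonAscii(String url)
            throws UnsupportedEncodingException {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < url.length()) {
            int cp = url.codePointAt(i);
            int len = Character.charCount(cp); // 2 for surrogate pairs
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                // URLEncoder emits %XX escapes for the UTF-8 bytes
                out.append(URLEncoder.encode(url.substring(i, i + len), "UTF-8"));
            }
            i += len;
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // "\u4e2d\u6587" is the Chinese file name from the example URL
        System.out.println(
            encodeNonAscii("http://www.example.com/\u4e2d\u6587.pdf"));
        // prints http://www.example.com/%E4%B8%AD%E6%96%87.pdf
    }
}
```

This sketch assumes the server expects UTF-8; sites that serve file names in another charset (for example GBK) would need a different encoding name. It also leaves ASCII characters such as spaces untouched, which a real fix may want to handle as well.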
罗磊