Posted to dev@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/09/01 12:28:40 UTC

Re: JIRA Nutch 968, File Protocol error 404 while fetching files that contains CJK character in the file name

Hi Ye,

On Fri, Aug 31, 2012 at 4:11 PM, Ye T Thet <ye...@gmail.com> wrote:
>
> What is the guideline for adding properties to nutch-default.xml? I am
> thinking of using file.name.encoding.
>

Generally speaking, the property name you suggest looks OK. However,
for consistency, maybe it should mimic the parser property for
encoding, i.e. the property should be named
file.character.encoding.default?

Thanks

Lewis

Re: JIRA Nutch 968, File Protocol error 404 while fetching files that contains CJK character in the file name

Posted by Ye T Thet <ye...@gmail.com>.
Hi Lewis,

Your suggestion sounds good. I suppose the patch I submit will then
change two files:

nutch-default.xml for default encoding setting
FileResponse.java for some code change
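
For illustration only, here is a rough sketch of how the FileResponse.java
side might use the new property. It is not the actual patch. The class name
FileNameDecoder and the method decodePath are hypothetical, a plain
java.util.Properties stands in for the Nutch configuration object, and the
UTF-8 fallback is an assumption. The idea is simply to decode the
percent-encoded file path with the configured encoding instead of a
hard-coded one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Properties;

// Hypothetical helper, not part of Nutch: decodes a percent-encoded
// file path using an encoding taken from configuration.
class FileNameDecoder {
    // Property name proposed in this thread.
    static final String ENCODING_KEY = "file.character.encoding.default";
    // Assumed fallback when the property is not set.
    static final String FALLBACK_ENCODING = "UTF-8";

    static String decodePath(String rawPath, Properties conf)
            throws UnsupportedEncodingException {
        String encoding = conf.getProperty(ENCODING_KEY, FALLBACK_ENCODING);
        // URLDecoder also turns '+' into a space, which a real patch
        // would need to account for in file names.
        return URLDecoder.decode(rawPath, encoding);
    }

    public static void main(String[] args) throws Exception {
        Properties conf = new Properties();
        conf.setProperty(ENCODING_KEY, "UTF-8");
        // A percent-encoded path containing two CJK characters.
        System.out.println(decodePath("/data/%E4%B8%AD%E6%96%87.txt", conf));
    }
}
```

With this shape, a user whose file names are in another encoding would only
need to override the property in nutch-site.xml rather than change code.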

Please advise.

Thanks,

Ye

