Posted to user@nutch.apache.org by Saurabh Suman <sa...@rediff.com> on 2009/07/13 06:46:22 UTC

Nutch Character encoding converter

Hi,
Nutch has an auto-detector for character encoding. Does it automatically convert
characters to a standard encoding after detecting the original one?
-- 
View this message in context: http://www.nabble.com/Nutch-Character-encoding-converter-tp24456144p24456144.html
Sent from the Nutch - User mailing list archive at Nabble.com.


Re: Nutch Character encoding converter

Posted by Saurabh Suman <sa...@rediff.com>.
Hi,
  As Ken said, Nutch converts text to Unicode. Does that mean the parsed text
is always in UTF-8 format?

Ken Krugler wrote:
> 
>>Nutch has an auto-detector for character encoding. Does it automatically convert
>>characters to a standard encoding after detecting the original one?
> 
> Yes - Nutch converts text to Unicode for all subsequent processing.
> 
> -- Ken
> -- 
> Ken Krugler
> +1 530-210-6378
> 
> 



Re: Nutch Character encoding converter

Posted by Ken Krugler <kk...@transpac.com>.
>Nutch has an auto-detector for character encoding. Does it automatically convert
>characters to a standard encoding after detecting the original one?

Yes - Nutch converts text to Unicode for all subsequent processing.

-- Ken
-- 
Ken Krugler
+1 530-210-6378
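
A note on how "converts text to Unicode" relates to the UTF-8 question raised in this thread: Nutch is written in Java, and in Java decoded text lives in a String, which holds UTF-16 code units in memory. UTF-8 only comes into play when a String is encoded back to bytes. Below is a minimal sketch of that distinction using only the Java standard library. The names CharsetSketch, decode and toUtf8 are hypothetical, and this is an illustration under those assumptions, not Nutch's actual code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Nutch.
class CharsetSketch {

    // Decode raw page bytes using the charset that detection reported.
    // The result is a Java String, i.e. Unicode text independent of the
    // source encoding.
    static String decode(byte[] raw, String detectedCharset) {
        return new String(raw, Charset.forName(detectedCharset));
    }

    // Encode a String as UTF-8 bytes, for example when writing it out.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, encoded as ISO-8859-1:
        // the accented letter is the single byte 0xE9.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};

        String text = decode(latin1, "ISO-8859-1");
        byte[] utf8 = toUtf8(text);

        System.out.println(text.length()); // 4 characters in the String
        System.out.println(utf8.length);   // 5 bytes once encoded as UTF-8
    }
}
```

So "Unicode" describes the decoded text, while "UTF-8" is one possible byte encoding chosen at the point where that text is serialized.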