Posted to user@nutch.apache.org by "jet2web@trashmail.net" <je...@trashmail.net> on 2009/04/10 02:41:54 UTC
java.nio.charset.IllegalCharsetNameException
Hello!
I get the following error when I run Nutch 1.0 and 1.1-dev on some sites:
Error parsing: http://www.nasa.gov/centers/goddard/home/index.html:
failed(2,200): java.nio.charset.IllegalCharsetNameException: .utf8
Nutch log:
2009-04-10 02:23:56,902 WARN parse.html - java.nio.charset.IllegalCharsetNameException: .utf8
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.checkName(Charset.java:285)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.lookup2(Charset.java:459)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.lookup(Charset.java:438)
2009-04-10 02:23:56,903 WARN parse.html - at java.nio.charset.Charset.isSupported(Charset.java:480)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.resolveEncodingAlias(EncodingDetector.java:310)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.addClue(EncodingDetector.java:201)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.addClue(EncodingDetector.java:208)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.util.EncodingDetector.autoDetectClues(EncodingDetector.java:193)
2009-04-10 02:23:56,903 WARN parse.html - at org.apache.nutch.parse.html.HtmlParser.getParse(HtmlParser.java:136)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:82)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:766)
2009-04-10 02:23:56,904 WARN parse.html - at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:552)
2009-04-10 02:23:56,911 WARN fetcher.Fetcher - Error parsing: http://www.nasa.gov/centers/goddard/home/index.html: failed(2,200): java.nio.charset.IllegalCharsetNameException: .utf8
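
As far as I can tell from the trace, the exception is thrown inside Charset.isSupported, which validates the name before looking it up. A Java charset name must begin with a letter or digit, so ".utf8" is rejected as illegal rather than reported as unsupported. Below is a minimal standalone sketch that shows the behaviour outside of Nutch. The class CharsetNameCheck and the helper isUsableCharset are hypothetical names of mine, not part of Nutch, and this is only one possible way to guard the call, not the actual EncodingDetector code.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical demo class, not part of Nutch.
class CharsetNameCheck {

    // Hypothetical helper: returns true only when the name is both
    // syntactically legal and supported by this JVM. An illegal name
    // such as ".utf8" yields false instead of an exception.
    static boolean isUsableCharset(String name) {
        if (name == null) {
            return false; // Charset.isSupported(null) would throw
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Name violates the charset naming rules (here: leading '.')
            return false;
        }
    }

    public static void main(String[] args) {
        try {
            Charset.isSupported(".utf8");
        } catch (IllegalCharsetNameException e) {
            // Same exception type as in the log above
            System.out.println("illegal name: " + e.getCharsetName());
        }
        System.out.println(isUsableCharset(".utf8"));
        System.out.println(isUsableCharset("UTF-8"));
    }
}
```

With a guard like this, a page that declares a malformed charset could fall back to another encoding clue instead of failing the whole parse, assuming that is the desired behaviour.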
thanks!