Posted to dev@nutch.apache.org by Massimo Miccoli <mm...@iltrovatore.it> on 2005/11/03 19:20:57 UTC
Error on parser? new parser parse gif jpeg?
Dear Nutch developers,
I am using the mapred branch. My log contains some lines suggesting that the
fetcher is parsing image URLs:
051103 191009 task_m_t0fty7 Parsing
[http://www.hardydiesel.com/images/Trace/useries.jpg] with
[org.apache.nutch.parse.text.TextParser@6025e7]
051103 191009 task_m_t0fty7 fetching
http://www.mothersbliss.co.uk/mothers-images/right_panel_end.gif
051103 191009 task_m_t0fty7 redirectCount=0
051103 191009 task_m_t0fty7 Parsing
[http://www.villacertano.com/immagini/fotocomp.gif] with
[org.apache.nutch.parse.text.TextParser@6025e7]
Also, my regex-urlfilter.txt (like the default version shipped with Nutch)
skips .gif and .jpg URLs, yet the task logs still show URLs like the ones above.
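To make the expectation concrete, here is a minimal sketch of the check I would expect the URL filter to apply. It is written in Python only for brevity and is not Nutch code. The pattern is an assumption modelled on a suffix skip rule for .gif and .jpg, and the helper name is_skipped is hypothetical; the actual rule in regex-urlfilter.txt may differ.

```python
import re

# Assumed skip rule, modelled on a suffix rule for image extensions;
# the exact pattern in regex-urlfilter.txt may differ.
SKIP_SUFFIX = re.compile(r"\.(gif|GIF|jpg|JPG)$")


def is_skipped(url: str) -> bool:
    """Return True if the URL ends with one of the assumed skipped suffixes."""
    return SKIP_SUFFIX.search(url) is not None


# The three image URLs that appear in the task log above.
urls = [
    "http://www.hardydiesel.com/images/Trace/useries.jpg",
    "http://www.mothersbliss.co.uk/mothers-images/right_panel_end.gif",
    "http://www.villacertano.com/immagini/fotocomp.gif",
]

for url in urls:
    print(url, "skipped" if is_skipped(url) else "accepted")
```

Under that assumed rule all three URLs match the skip pattern, so I would not expect them to be fetched or parsed at all.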
Any help?
Thanks