Posted to dev@nutch.apache.org by Massimo Miccoli <mm...@iltrovatore.it> on 2005/11/03 19:20:57 UTC

Parser error? Does the new parser parse GIF and JPEG files?

Dear Nutch developers,

I am using the mapred branch. My log contains some lines suggesting that
the fetcher is fetching and parsing image URLs:

051103 191009 task_m_t0fty7  Parsing 
[http://www.hardydiesel.com/images/Trace/useries.jpg] with 
[org.apache.nutch.parse.text.TextParser@6025e7]
051103 191009 task_m_t0fty7  fetching 
http://www.mothersbliss.co.uk/mothers-images/right_panel_end.gif
051103 191009 task_m_t0fty7  redirectCount=0
051103 191009 task_m_t0fty7  Parsing 
[http://www.villacertano.com/immagini/fotocomp.gif] with 
[org.apache.nutch.parse.text.TextParser@6025e7]

Also, my regex-urlfilter.txt (like the default version shipped with Nutch)
skips .gif and .jpg, but I still see URLs like the ones above in the task logs.
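
To illustrate what I expect the filter to do, here is a minimal sketch in
Python (not Nutch code). The rule list is an assumption modelled on the
suffix-skip line of a default regex-urlfilter.txt, not a copy of my actual
file, and the accepts() helper is hypothetical. It only mimics the
"first matching rule wins" behaviour as I understand it:

```python
import re

# Hypothetical rules, modelled on a default-style regex-urlfilter.txt.
# "-" means skip the URL, "+" means accept it; the first match decides.
RULES = [
    ("-", re.compile(r"\.(gif|GIF|jpg|JPG)$")),  # skip image suffixes
    ("+", re.compile(r".")),                      # accept anything else
]


def accepts(url):
    """Return True if the first rule that matches the URL is a '+' rule."""
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False  # no rule matched: treat the URL as rejected


# The three image URLs taken from the task log above.
URLS = [
    "http://www.hardydiesel.com/images/Trace/useries.jpg",
    "http://www.mothersbliss.co.uk/mothers-images/right_panel_end.gif",
    "http://www.villacertano.com/immagini/fotocomp.gif",
]

for url in URLS:
    print(url, "->", "accept" if accepts(url) else "skip")
```

Under these assumed rules all three URLs are skipped, so I would not expect
them to reach the fetcher or the parser at all.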

Any help?

Thanks