Posted to user@nutch.apache.org by Hannu Väisänen <hv...@joyx.joensuu.fi> on 2009/07/02 07:32:43 UTC
How to tell Nutch that text files are text files?
I am using Nutch to index plain text and LaTeX files.
Nutch thinks that some of the files are of type
application/octet-stream.
I have added these lines to the file parse-plugins.xml:
<mimeType name="application/octet-stream">
  <plugin id="parse-text" />
</mimeType>
Now Nutch parses and indexes the files, but when I look at the search
results in Firefox (served by tomcat6), Nutch says that they are of type
application/octet-stream and does not show them.
How do I tell Nutch that it should show files of type
application/octet-stream as if they were text files?