Posted to dev@nutch.apache.org by Lewis John Mcgibbney <le...@gmail.com> on 2012/12/12 20:18:35 UTC

Message in MoreIndexingFilter

Hi,

In MoreIndexingFilter#addType() there is a message from Jerome which,
in just over 2 years, will be a decade old. The message reads as
follows:

    if (contentType == null) {
      // Note by Jerome Charron on 20050415:
      // Content Type not solved by a previous plugin
      // Or unable to solve it... Trying to find it
      // Should be better to use the doc content too
      // (using MimeTypes.getMimeType(byte[], String), but I don't know
      // which field it is?
      // if (MAGIC) {
      //   contentType = MIME.getMimeType(url, content);
      // } else {
      //   contentType = MIME.getMimeType(url);
      // }

What are we to do here? This partly relates to my earlier mail, where
we found more than one approach to obtaining the specific data we
require for plugins. This is the case in both trunk and 2.x.
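
As a starting point for discussion, here is a minimal sketch of the
fallback that the commented-out block describes: keep a content type
that is already set, otherwise try the document bytes when magic
detection is enabled, and fall back to the URL alone. Everything in it
is hypothetical (the class MimeFallbackSketch, the methods fromUrl,
fromContent and resolve, and the tiny hard-coded type tables); it does
not call the Nutch or Tika MIME classes and only illustrates the
control flow.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical stand-in for illustration only; NOT the Nutch or Tika API.
final class MimeFallbackSketch {

  /** Guess a type from the URL's file extension; null if unknown. */
  static String fromUrl(String url) {
    String lower = url.toLowerCase(Locale.ROOT);
    if (lower.endsWith(".html") || lower.endsWith(".htm")) return "text/html";
    if (lower.endsWith(".pdf")) return "application/pdf";
    if (lower.endsWith(".png")) return "image/png";
    return null;
  }

  /** Guess a type from the leading "magic" bytes; null if unknown. */
  static String fromContent(byte[] content) {
    byte[] pdf = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    byte[] png = new byte[] {(byte) 0x89, 'P', 'N', 'G'};
    if (startsWith(content, pdf)) return "application/pdf";
    if (startsWith(content, png)) return "image/png";
    return null;
  }

  /** True if data begins with every byte of prefix. */
  static boolean startsWith(byte[] data, byte[] prefix) {
    if (data.length < prefix.length) return false;
    for (int i = 0; i < prefix.length; i++) {
      if (data[i] != prefix[i]) return false;
    }
    return true;
  }

  /**
   * Mirrors the shape of the commented-out block: an already resolved
   * type wins, then the content (only when magic is enabled), then the URL.
   */
  static String resolve(String contentType, String url, byte[] content,
                        boolean magic) {
    if (contentType != null) return contentType;
    if (magic && content != null) {
      String byMagic = fromContent(content);
      if (byMagic != null) return byMagic;
    }
    return fromUrl(url);
  }

  public static void main(String[] args) {
    byte[] body = "%PDF-1.4".getBytes(StandardCharsets.US_ASCII);
    // The URL has no useful extension, so only the content can decide.
    System.out.println(resolve(null, "http://example.org/doc", body, true));
    System.out.println(resolve(null, "http://example.org/doc", body, false));
  }
}
```

The point of the sketch is that the URL-only branch stays as the last
resort, so turning magic detection off changes nothing for documents
whose URL already gives the type away.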

Any ideas?

Lewis
