Posted to dev@nutch.apache.org by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2008/06/02 00:17:45 UTC

[jira] Updated: (NUTCH-618) Tika error "Media type alias already exists"

     [ https://issues.apache.org/jira/browse/NUTCH-618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris A. Mattmann updated NUTCH-618:
------------------------------------

    Attachment: NUTCH-618.Mattmann.patch.060108.txt

Hey Guys:

Okey dok: here's a candidate patch. Could someone who already has an environment set up in which these errors were manifesting please try this patch out and see if it makes them go away? I'm thinking the root of the issue is not so much that the MimeTypes object was being re-instantiated many, many times, but rather that it wasn't being cached in the ObjectCache. We'll see.

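To illustrate the caching idea, here is a minimal sketch. It is not the attached patch and it does not use the actual Nutch ObjectCache or Tika MimeTypes APIs: the names MimeCacheSketch, SimpleObjectCache, FakeMimeRegistry and the "mime-registry" key are all hypothetical stand-ins. It only shows the general technique of building an expensive object once and handing back the cached instance afterwards.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

public class MimeCacheSketch {

    /** Hypothetical stand-in for an expensive-to-build MIME type registry. */
    static final class FakeMimeRegistry {
        // Counts how many times the registry has been constructed.
        static final AtomicInteger BUILDS = new AtomicInteger();

        FakeMimeRegistry() {
            BUILDS.incrementAndGet();
        }
    }

    /** Hypothetical cache; an assumption for illustration, not the Nutch ObjectCache API. */
    static final class SimpleObjectCache {
        private final Map<String, Object> objects = new ConcurrentHashMap<>();

        // Returns the cached object for the key, invoking the factory only on a miss.
        @SuppressWarnings("unchecked")
        <T> T getOrCreate(String key, Supplier<T> factory) {
            return (T) objects.computeIfAbsent(key, k -> factory.get());
        }
    }

    public static void main(String[] args) {
        SimpleObjectCache cache = new SimpleObjectCache();

        // The first lookup builds the registry; the second reuses it.
        FakeMimeRegistry first = cache.getOrCreate("mime-registry", FakeMimeRegistry::new);
        FakeMimeRegistry second = cache.getOrCreate("mime-registry", FakeMimeRegistry::new);

        System.out.println("same instance: " + (first == second));
        System.out.println("builds: " + FakeMimeRegistry.BUILDS.get());
    }
}
```

If every caller goes through a shared cache like this, the registry (and whatever XML it reads) is loaded once rather than once per caller.
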
The attached patch passes all unit tests, so please let me know what you think.

Thanks!

Cheers,
 Chris


> Tika error "Media type alias already exists"
> --------------------------------------------
>
>                 Key: NUTCH-618
>                 URL: https://issues.apache.org/jira/browse/NUTCH-618
>             Project: Nutch
>          Issue Type: Bug
>          Components: mime_type_detector
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Chris A. Mattmann
>         Attachments: NUTCH-618.Mattmann.patch.060108.txt
>
>
> After the upgrade to the latest Tika jar we see a lot of errors like this:
> 2008-03-06 08:07:20,659 WARN org.apache.tika.mime.MimeTypesReader: Invalid media type alias: text/xml
> org.apache.tika.mime.MimeTypeException: Media type alias already exists: text/xml
> 	at org.apache.tika.mime.MimeTypes.addAlias(MimeTypes.java:312)
> 	at org.apache.tika.mime.MimeType.addAlias(MimeType.java:238)
> 	at org.apache.tika.mime.MimeTypesReader.readMimeType(MimeTypesReader.java:168)
> 	at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:138)
> 	at org.apache.tika.mime.MimeTypesReader.read(MimeTypesReader.java:121)
> 	at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:56)
> 	at org.apache.nutch.util.MimeUtil.<init>(MimeUtil.java:58)
> 	at org.apache.nutch.protocol.Content.<init>(Content.java:85)
> 	at org.apache.nutch.protocol.http.api.HttpBase.getProtocolOutput(HttpBase.java:226)
> 	at org.apache.nutch.fetcher.Fetcher2$FetcherThread.run(Fetcher2.java:523)
> This is caused most likely by the duplicate tika-mimetypes.xml file - one copy is embedded inside the Tika jar, the other is found in Nutch conf/ directory. The one inside the jar seems to be more recent, so I propose to simply remove the one we have in conf.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.