Posted to commits@oodt.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2014/10/10 19:28:34 UTC

[jira] [Comment Edited] (OODT-630) Upgrade OODT components from using Tika 0.8 to Tika 1.6

    [ https://issues.apache.org/jira/browse/OODT-630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14167155#comment-14167155 ] 

Tyler Palsulich edited comment on OODT-630 at 10/10/14 5:27 PM:
----------------------------------------------------------------

v4 patch. All tests pass. All modules upgraded to Tika 1.6. Can someone ensure their local build is passing, apply this patch (in the root of the trunk, run {{patch -p0 -i [filename]}}), and see if it's still passing? Then I'll commit this. Thanks!


> Upgrade OODT components from using Tika 0.8 to Tika 1.6
> -------------------------------------------------------
>
>                 Key: OODT-630
>                 URL: https://issues.apache.org/jira/browse/OODT-630
>             Project: OODT
>          Issue Type: Improvement
>          Components: file manager, metadata container, product server
>    Affects Versions: 0.6
>            Reporter: Rishi Verma
>            Assignee: Rishi Verma
>             Fix For: 0.8
>
>         Attachments: OODT-630.Palsulich.101014.patch, OODT-630.Palsulich.101014.v3.patch, OODT-630.Palsulich.101014.v4.patch
>
>
> Currently, OODT makes use of Tika v0.8 (tika-core) for mime-detection purposes. This version is quite out-of-date, and is incompatible with the use of a tika-core or tika-app v1.3 JAR.
> Tika v1.3 contains numerous upgrades since 0.8 (see [1]), some of which include improved metadata generation for common files. These improved features are extremely useful for metadata gathering.
> If a project using OODT needs features provided by the v1.3 tika-core or tika-app JAR (e.g. a custom met extractor), it currently cannot use that version when interacting with OODT server-side components such as filemgr and crawler, since v1.3 is incompatible with OODT's use of v0.8.
> One of the incompatibilities is the removal of the 'getMimeType(URL)' method from org.apache.tika.mime.MimeTypes. It has been replaced by Tika.detect(URL.getPath()) and MimeTypes.getRegisteredMimeType(String).
> See the example exception below, thrown when crawler 0.6-SNAPSHOT was invoked while a 'tika-app-1.3.jar' was placed in the crawler's lib directory:
> ---
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.crawl.ProductCrawler ingest
> INFO: ProductCrawler: Ready to ingest product: [/data/staging/IMG_2590.jpg]: ProductType: [GenericFile]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.ingest.StdIngester setFileManager
> INFO: StdIngester: connected to file manager: [http://localhost:9000]
> Jun 18, 2013 3:40:07 PM org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferer setFileManagerUrl
> INFO: In Place Data Transfer to: [http://localhost:9000] enabled
> Exception in thread "main" java.lang.NoSuchMethodError: org.apache.tika.mime.MimeTypes.getMimeType(Ljava/net/URL;)Lorg/apache/tika/mime/MimeType;
> at org.apache.oodt.cas.filemgr.structs.Reference.<init>(Reference.java:115)
> at org.apache.oodt.cas.filemgr.versioning.VersioningUtils.addRefsFromUris(VersioningUtils.java:251)
> at org.apache.oodt.cas.filemgr.ingest.StdIngester.ingest(StdIngester.java:189)
> at org.apache.oodt.cas.crawl.ProductCrawler.ingest(ProductCrawler.java:304)
> at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:188)
> at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:108)
> at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
> at org.apache.oodt.cas.crawl.daemon.CrawlDaemon.startCrawling(CrawlDaemon.java:82)
> at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:55)
> at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
> at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
> at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
> ---
> This JIRA issue seeks to document efforts to upgrade OODT's use of tika from 0.8 to 1.3.
> ---
> [1] http://www.apache.org/dist/tika/CHANGES-1.3.txt
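
The NoSuchMethodError in the trace above is a link-time failure: the calling code was compiled against a method signature that the Tika JAR on the classpath no longer provides. As a rough diagnostic sketch (not part of the patch), reflection can report whether a given public method is present before a component is started. The class name MethodCheck and the helper hasPublicMethod are hypothetical names made up for illustration; the Tika class and method names are taken from the stack trace, and the check returns false if Tika is not on the classpath at all.

```java
import java.net.URL;

public class MethodCheck {

    /**
     * Returns true if the named class can be loaded and exposes a public
     * method with the given name and parameter types; false otherwise.
     * (Hypothetical helper, for illustration only.)
     */
    public static boolean hasPublicMethod(String className, String methodName,
                                          Class<?>... paramTypes) {
        try {
            // Load without running static initializers; we only inspect signatures.
            Class<?> cls = Class.forName(className, false,
                    MethodCheck.class.getClassLoader());
            cls.getMethod(methodName, paramTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, method missing, or class could not be linked.
            return false;
        }
    }

    public static void main(String[] args) {
        // Signature copied from the NoSuchMethodError message in the trace.
        boolean legacy = hasPublicMethod(
                "org.apache.tika.mime.MimeTypes", "getMimeType", URL.class);
        System.out.println("MimeTypes.getMimeType(URL) present: " + legacy);
    }
}
```

Running this with the same lib directory on the classpath as the crawler would indicate whether the Tika JAR found there still offers the legacy method that OODT 0.6 calls.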


