Posted to commits@tika.apache.org by Apache Wiki <wi...@apache.org> on 2015/01/18 06:53:37 UTC

[Tika Wiki] Update of "Tika2_0RoadMap" by ChrisMattmann

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Tika Wiki" for change notification.

The "Tika2_0RoadMap" page has been changed by ChrisMattmann:
https://wiki.apache.org/tika/Tika2_0RoadMap?action=diff&rev1=2&rev2=3

Comment:
- comment on service loading

   * Move from service loading to a config file for parser specification and loading.  [[https://issues.apache.org/jira/browse/TIKA-1445|TIKA-1445]] raised this as an important area for improvement within Tika.  The current strategy in the AutoDetectParser is to load all parsers and then pick the first parser that matches a given mime type.  Tika determines which parser is "first" by sorting on whether or not the class name begins with org.apache.tika and then (effectively) by reverse alphabetical order of the class name.  It would be great if the user could specify the order of parser selection in the config file.  We will be working towards this gradually through Tika 1.8 and 1.9, and we will remove service loading entirely in Tika 2.0.
  
    * ''Not sure this is quite right. We want to allow people full control of parser ordering, combining multiple parsers for fuller metadata, etc., but we also want to continue to support the use case of "drop an extra jar on the classpath and automatically have the parser in it loaded and used", which relies on service loading to find parsers and add them.''
+   * ''Who says this use case *has* to be supported using service loading? It seems we can also support it without service loading, and with more control over the ordering, etc.''
  
   * Allow users to build composite parsers with configurable strategies via the config file ([[https://issues.apache.org/jira/browse/TIKA-1509|TIKA-1509]] and CompositeParserDiscussion).  We will be working towards this gradually through Tika 1.8 and 1.9.  By Tika 2.0, however, this will be the default.
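
For readers unfamiliar with the implicit ordering described in the first roadmap item, the following is a minimal Java sketch of it, not Tika's actual code. The Candidate class, the IMPLICIT_ORDER comparator, the pick method and the com.example class names are hypothetical and exist only for illustration. The roadmap text does not say which way the org.apache.tika check sorts, so the sketch assumes that non-Tika classes come first; flip the first comparator if that assumption is wrong.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

public class ParserOrderSketch {

    /** Hypothetical stand-in for a loaded parser: a class name plus the mime types it handles. */
    static final class Candidate {
        final String className;
        final Set<String> mimeTypes;

        Candidate(String className, Set<String> mimeTypes) {
            this.className = className;
            this.mimeTypes = mimeTypes;
        }
    }

    /**
     * ASSUMPTION: classes outside org.apache.tika sort ahead of Tika's own
     * (false sorts before true); ties are broken in reverse alphabetical
     * order of the class name, as the roadmap item describes.
     */
    static final Comparator<Candidate> IMPLICIT_ORDER =
        Comparator.comparing((Candidate c) -> c.className.startsWith("org.apache.tika."))
                  .thenComparing((Candidate c) -> c.className, Comparator.reverseOrder());

    /** Sorts a copy of the loaded parsers and returns the first one that supports the mime type. */
    static String pick(List<Candidate> loaded, String mimeType) {
        List<Candidate> sorted = new ArrayList<>(loaded);
        sorted.sort(IMPLICIT_ORDER);
        for (Candidate c : sorted) {
            if (c.mimeTypes.contains(mimeType)) {
                return c.className;
            }
        }
        return null; // no loaded parser claims this mime type
    }

    public static void main(String[] args) {
        List<Candidate> loaded = List.of(
            new Candidate("org.apache.tika.parser.pdf.PDFParser", Set.of("application/pdf")),
            new Candidate("com.example.AlphaPdfParser", Set.of("application/pdf")),
            new Candidate("com.example.ZetaPdfParser", Set.of("application/pdf")));
        System.out.println(pick(loaded, "application/pdf"));
    }
}
```

Under these assumptions the winner depends only on class names, so renaming a class can change which parser handles a file. That is the behaviour a user-specified ordering in the config file is meant to replace.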