Posted to commits@tika.apache.org by Apache Wiki <wi...@apache.org> on 2015/12/02 15:26:32 UTC

[Tika Wiki] Update of "Tika2_0RoadMap" by TimothyAllison

The "Tika2_0RoadMap" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/Tika2_0RoadMap?action=diff&rev1=4&rev2=5

  = Background =
  This page is intended for a discussion of changes anticipated in Tika 2.0.
  
- This is only a first draft from one voice.  Please contribute!
+ This is only a first draft initially from one voice.  Please contribute!
  
  = Major Planned Changes =
  
@@ -18, +18 @@

  
   * Allow users to build composite parsers with configurable strategies via the config file ([[https://issues.apache.org/jira/browse/TIKA-1509|TIKA-1509]] and CompositeParserDiscussion).  We will be working towards this gradually through Tika 1.8 and 1.9.  By Tika 2.0, however, this will be the default.
  
+  * Allow for easily configurable parser sub-packages.  The tika-app, tika-server and tika-bundle jars are now approaching or exceeding 50MB.  It would be great if users could easily specify a subset of parsers they care about, either a la carte or by category (image, common office files (MSOffice, PDF, etc.), environmental data), and get only the dependencies required for that subset of parsers.
+ 
-  * Move to Java 1.7 (???)
+  * Move to Java 1.8 (???)
+  
+  * Solve the complex metadata challenge; see [[https://issues.apache.org/jira/browse/TIKA-1607|TIKA-1607]], [[https://issues.apache.org/jira/browse/TIKA-1691|TIKA-1691]] and the [[http://mail-archives.apache.org/mod_mbox/incubator-tika-dev/201510.mbox/%3c561B8B26.30105@geomatys.com%3e|ISO 19115 discussion]].  Or at least come to some accommodation that allows both easy key/value access and more advanced access for those who know what they're doing.
  
  = Minor Planned Changes =
  
  = Wishes =
-  * Allow for easily configurable parser sub-packages.  The tika-app, tika-server and tika-bundle jars are now approaching or exceeding 30MB.  It would be great if users could easily specify a subset of parsers they care about, either a la carte or by category (image, common office files (MSOffice, PDF, etc.), environmental data), and get only the dependencies required for that subset of parsers.
+
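
The "configurable parser sub-packages" item above asks for parser selection either a la carte or by category, pulling in only the dependencies that subset needs. As a rough illustration of that idea only, here is a minimal sketch in Python. Every name in it (the category keys, parser names, dependency names and the resolve_dependencies function) is hypothetical and invented for this example; none of it is Tika's API, configuration format or actual dependency list.

```python
# Hypothetical catalogue: these categories, parser names and dependency
# names are placeholders for illustration, not taken from Tika.
PARSERS_BY_CATEGORY = {
    "image": ["jpeg", "tiff"],
    "office": ["msoffice", "pdf"],
    "environmental": ["netcdf"],
}

DEPENDENCIES_BY_PARSER = {
    "jpeg": {"image-metadata-lib"},
    "tiff": {"image-metadata-lib"},
    "msoffice": {"office-format-lib"},
    "pdf": {"pdf-lib"},
    "netcdf": {"netcdf-lib"},
}


def resolve_dependencies(categories=(), parsers=()):
    """Return (selected parsers, union of their dependencies), both sorted.

    `categories` selects whole groups of parsers; `parsers` adds
    individual ones a la carte. Unknown names raise KeyError.
    """
    selected = set(parsers)
    for category in categories:
        selected.update(PARSERS_BY_CATEGORY[category])

    unknown = selected - set(DEPENDENCIES_BY_PARSER)
    if unknown:
        raise KeyError(f"unknown parsers: {sorted(unknown)}")

    dependencies = set()
    for name in selected:
        dependencies |= DEPENDENCIES_BY_PARSER[name]
    return sorted(selected), sorted(dependencies)
```

Calling resolve_dependencies(categories=["image"], parsers=["pdf"]) would select the two image parsers plus the PDF parser and return only the libraries those three need, leaving the office and environmental dependencies out. Shared dependencies are counted once because the result is a set union.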