- [jira] [Commented] (TIKA-2385) Tesseract OCR rotation.py not run - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/12/01 00:02:00 UTC, 4 replies.
- [jira] [Commented] (TIKA-2497) Unexpected RuntimeException from org.apache.tika.parser.microsoft.ooxml.OOXMLParser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/04 14:06:00 UTC, 1 replies.
- RE: [DISCUSS] Enable specific ContentHandler for tika-server - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/12/04 14:48:32 UTC, 0 replies.
- [jira] [Updated] (TIKA-2516) Upgrade CFX version to > 3.0.13 - posted by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/12/04 18:39:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2516) Upgade CFX version to > 3.0.13 - posted by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/12/04 18:39:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2517) Maven is not building the source - posted by "Sahil Arora (JIRA)" <ji...@apache.org> on 2017/12/04 19:26:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2517) Maven is not building the source - posted by "Sahil Arora (JIRA)" <ji...@apache.org> on 2017/12/04 19:27:00 UTC, 1 replies.
- [jira] [Closed] (TIKA-2517) Maven is not building the source - posted by "Sahil Arora (JIRA)" <ji...@apache.org> on 2017/12/04 19:44:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2516) Upgrade CFX version to > 3.0.13 - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/04 19:47:01 UTC, 0 replies.
- [jira] [Commented] (TIKA-2516) Upgrade CFX version to > 3.0.13 - posted by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/12/04 20:55:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-2503) Try to upgrade httpclient to >=4.5.3 - posted by "Julian Reschke (JIRA)" <ji...@apache.org> on 2017/12/05 13:40:01 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2502) Upgrade OpenNLP to 1.8.3 - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/05 13:46:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2499) Sonatype Nexus Auditor is reporting that Tika 1.13 is using a number of vulnerable Third party components. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/05 14:26:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-2499) Sonatype Nexus Auditor is reporting that Tika 1.13 is using a number of vulnerable Third party components. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/05 14:26:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2499) Sonatype Nexus Auditor is reporting that Tika 1.13 is using a number of vulnerable Third party components. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/05 14:26:00 UTC, 0 replies.
- RE: Tika 1.17? - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/12/05 20:45:45 UTC, 6 replies.
- [jira] [Created] (TIKA-2518) tika app outputs warnings by default - posted by "Ryan Brueske (JIRA)" <ji...@apache.org> on 2017/12/05 21:42:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2518) tika app outputs warnings by default - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/05 21:53:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Eamonn Saunders (JIRA)" <ji...@apache.org> on 2017/12/06 22:08:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/07 01:39:00 UTC, 4 replies.
- [jira] [Updated] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/07 01:39:00 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/07 14:43:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 14:52:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2519) Issue parsing multiple CHM files concurrently - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 14:52:01 UTC, 0 replies.
- [jira] [Created] (TIKA-2520) OptimaizeLangDetector#loadModels() should not be called for every single langdetect HTTP request - posted by "Vincent van Donselaar (JIRA)" <ji...@apache.org> on 2017/12/08 15:09:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2483) Using PackageParser in ForkParser causes NPE - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 16:15:00 UTC, 2 replies.
- [jira] [Comment Edited] (TIKA-2483) Using PackageParser in ForkParser causes NPE - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 16:16:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2483) Using PackageParser in ForkParser causes NPE - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 16:40:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2521) SAX-based docx/pptx should start a new line before second paragraph within a cell - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 17:38:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2521) SAX-based docx/pptx should start a new line before second paragraph within a cell - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 17:40:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2522) Regression in MSWord parser -- not extracting Encite Add in text any more - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 18:04:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2522) Regression in MSWord parser -- not extracting Encite Add in text any more - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 18:05:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2522) Trivial regression in MSWord parser -- not extracting Encite Add in text any more - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 18:05:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2522) Trivial regression in MSWord parser -- not extracting Encite Add in text any more - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 18:08:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2521) SAX-based docx/pptx should start a new line before second paragraph within a cell - posted by "Hudson (JIRA)" <ji...@apache.org> on 2017/12/08 18:56:00 UTC, 0 replies.
- [VOTE] Release Apache Tika 1.17 Candidate #1 - posted by Tim Allison <ta...@apache.org> on 2017/12/08 19:51:14 UTC, 0 replies.
- 1.17 rc1 and two repos in nexus?! - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/12/08 20:05:22 UTC, 5 replies.
- [jira] [Created] (TIKA-2523) Regression in ppt parsing - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 22:01:03 UTC, 0 replies.
- [jira] [Updated] (TIKA-2523) Regression in ppt parsing - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 22:01:03 UTC, 0 replies.
- [jira] [Updated] (TIKA-2523) Regression in ppt parsing -- "typeface can't be null or empty" - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 22:05:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2523) Regression in ppt parsing -- "typeface can't be null or empty" - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/08 22:16:00 UTC, 2 replies.
- [CANCELLED] Re: [VOTE] Release Apache Tika 1.17 Candidate #1 - posted by Tim Allison <ta...@apache.org> on 2017/12/09 00:37:10 UTC, 0 replies.
- Re: [VOTE] Release Apache Tika 1.17 Candidate #2 - posted by Tim Allison <ta...@apache.org> on 2017/12/09 00:43:42 UTC, 3 replies.
- [jira] [Created] (TIKA-2524) Apache Tika returns empty string when parsing text from XPS files - posted by "Peter Davies (JIRA)" <ji...@apache.org> on 2017/12/11 13:33:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2525) Upgrade to POI 3.17.1 when available - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/11 13:34:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2524) Apache Tika returns empty string when parsing text from XPS files - posted by "Peter Davies (JIRA)" <ji...@apache.org> on 2017/12/11 13:35:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-2525) Upgrade to POI 3.17.1 when available - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/11 13:36:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2524) Apache Tika returns empty string when parsing text from XPS files - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/11 13:50:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2524) Create/integrate a parser for XPS - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/11 13:50:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-2524) Create/integrate a parser for XPS - posted by "Peter Davies (JIRA)" <ji...@apache.org> on 2017/12/11 14:07:00 UTC, 4 replies.
- [jira] [Commented] (TIKA-2515) Sending a 3 character text file to tika server takes 1-2 seconds to process - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/11 18:57:00 UTC, 2 replies.
- [jira] [Comment Edited] (TIKA-2524) Create/integrate a parser for XPS - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/11 20:24:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-2513) Tika-Parser does not Extract the content of the .eml file. - posted by "Hardik Trivedi (JIRA)" <ji...@apache.org> on 2017/12/12 07:46:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2513) Tika-Parser does not Extract the content of the .eml file. - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/12 09:27:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2526) com.mchange.v2.cfg.BasicMultiPropertiesConfig conflict - posted by "bocai (JIRA)" <ji...@apache.org> on 2017/12/13 10:14:00 UTC, 0 replies.
- [RESULT] [VOTE] Release Apache Tika 1.17 Candidate #2 - posted by Tim Allison <ta...@apache.org> on 2017/12/13 13:00:22 UTC, 1 replies.
- [jira] [Commented] (TIKA-2526) com.mchange.v2.cfg.BasicMultiPropertiesConfig conflict - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/13 13:48:00 UTC, 0 replies.
- steps for Tika 2.0 - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/12/13 13:51:21 UTC, 3 replies.
- [jira] [Resolved] (TIKA-2524) Create/integrate a parser for XPS - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/13 15:51:00 UTC, 0 replies.
- [ANNOUNCE] Apache Tika 1.17 released - posted by Tim Allison <ta...@apache.org> on 2017/12/14 02:10:34 UTC, 2 replies.
- [jira] [Created] (TIKA-2527) Typos in tika-mimetypes.xml - posted by "Andreas Meier (JIRA)" <ji...@apache.org> on 2017/12/14 08:22:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2527) Typos in tika-mimetypes.xml - posted by "Andreas Meier (JIRA)" <ji...@apache.org> on 2017/12/14 08:23:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-2527) Typos in tika-mimetypes.xml - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/14 10:36:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2528) Fix key location, keys file and download link - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/14 13:33:07 UTC, 0 replies.
- [jira] [Commented] (TIKA-2528) Fix key location, keys file and download link - posted by "Sebb (JIRA)" <ji...@apache.org> on 2017/12/14 14:34:00 UTC, 1 replies.
- [jira] [Updated] (TIKA-2528) Fix key location, keys file and download link - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/15 01:44:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2528) Fix key location, keys file and download link - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/15 01:46:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2529) ArrayIndexOutOfBoundsException when processing certain .doc files - posted by "Advokat (JIRA)" <ji...@apache.org> on 2017/12/15 11:54:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2529) ArrayIndexOutOfBoundsException when processing certain .doc files - posted by "Advokat (JIRA)" <ji...@apache.org> on 2017/12/15 11:55:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2529) ArrayIndexOutOfBoundsException when processing certain .doc files - posted by "Advokat (JIRA)" <ji...@apache.org> on 2017/12/15 11:56:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2530) OutlookExtractor "buffer underrun" when parsing .msg with embedded .msg - posted by "Pascal Essiembre (JIRA)" <ji...@apache.org> on 2017/12/16 20:45:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-2530) OutlookExtractor "buffer underrun" when parsing .msg with embedded .msg - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/17 00:05:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-1141) javascript files that contain " - posted by "Daniel Goltz (JIRA)" <ji...@apache.org> on 2017/12/17 12:41:00 UTC, 5 replies.
- [jira] [Created] (TIKA-2531) RarParser throws RarException instead of EncryptedDocumentException when parsing encrypted file - posted by "TzeKai Lee (JIRA)" <ji...@apache.org> on 2017/12/18 07:32:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-1141) javascript files that contain " - posted by "Daniel Goltz (JIRA)" <ji...@apache.org> on 2017/12/18 10:27:02 UTC, 0 replies.
- [jira] [Commented] (TIKA-2531) RarParser throws RarException instead of EncryptedDocumentException when parsing encrypted file - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/12/18 10:45:02 UTC, 2 replies.
- [jira] [Commented] (TIKA-2245) Standardise logging - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/18 18:53:01 UTC, 0 replies.
- [jira] [Created] (TIKA-2532) Output for PDF file contains X-TIKA:content that is postscript - posted by "Trevor Yann (JIRA)" <ji...@apache.org> on 2017/12/20 01:16:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2532) Output for PDF file contains X-TIKA:content that is postscript - posted by "Trevor Yann (JIRA)" <ji...@apache.org> on 2017/12/20 01:17:00 UTC, 1 replies.
- [jira] [Updated] (TIKA-2532) Output for PDF file contains X-TIKA:content that is a PDF fragment - posted by "Trevor Yann (JIRA)" <ji...@apache.org> on 2017/12/20 01:29:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2496) TIKA crashes / runs out of memory on simple PDF - posted by "chelambarasan (JIRA)" <ji...@apache.org> on 2017/12/20 13:22:00 UTC, 6 replies.
- [jira] [Comment Edited] (TIKA-2496) TIKA crashes / runs out of memory on simple PDF - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/20 13:37:00 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-2496) TIKA crashes / runs out of memory on simple PDF - posted by "chelambarasan (JIRA)" <ji...@apache.org> on 2017/12/20 13:39:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2282) Paragraph auto-numbering is not extracted from DOCX and ODT. - posted by "Pascal Magnard (JIRA)" <ji...@apache.org> on 2017/12/20 14:38:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2532) Output for PDF file contains X-TIKA:content that is a PDF fragment - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/20 15:34:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2496) TIKA crashes / runs out of memory on simple PDF - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/20 15:36:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2531) RarParser throws RarException instead of EncryptedDocumentException when parsing encrypted file - posted by "TzeKai Lee (JIRA)" <ji...@apache.org> on 2017/12/21 03:25:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2533) Improve embedded image extraction in PDFs - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/12/21 12:49:00 UTC, 0 replies.
- Tika Write Builder - posted by Sergey Beryozkin <sb...@gmail.com> on 2017/12/28 13:38:24 UTC, 0 replies.
- Re: Integrating Tika with Apache Beam - posted by Sergey Beryozkin <sb...@gmail.com> on 2017/12/28 13:43:35 UTC, 0 replies.
- [jira] [Commented] (TIKA-2479) Handle empty cells in tables uniformly - posted by "Geoff Baskwill (JIRA)" <ji...@apache.org> on 2017/12/29 20:14:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-2479) Handle empty cells in tables uniformly - posted by "Geoff Baskwill (JIRA)" <ji...@apache.org> on 2017/12/29 20:14:00 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-2479) Handle empty cells in tables uniformly - posted by "Geoff Baskwill (JIRA)" <ji...@apache.org> on 2017/12/29 20:23:00 UTC, 0 replies.
- why Metadata object is not optimized? - posted by Cristian Lorenzetto <cr...@gmail.com> on 2017/12/30 13:26:03 UTC, 0 replies.