- [jira] [Commented] (TIKA-2183) Can't Read file if its name is Arabic - posted by "tsuyoushi (JIRA)" <ji...@apache.org> on 2017/08/01 07:56:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2436) Support for GZIP-compressed EMF files - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/01 16:38:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2434) Language detection slow, cpu intensive, CLI interrupts work - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/01 16:45:01 UTC, 9 replies.
- [jira] [Comment Edited] (TIKA-2434) Language detection slow, cpu intensive, CLI interrupts work - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/01 16:46:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2402) Support all image formats in Object Recognition REST Parser - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/02 06:31:02 UTC, 5 replies.
- [jira] [Comment Edited] (TIKA-2183) Can't Read file if its name is Arabic - posted by "tsuyoushi (JIRA)" <ji...@apache.org> on 2017/08/02 06:38:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2438) Test failure at OOXMLParserTest.testBigIntegersWGeneralFormat:1350->TikaTest.assertContains:102 - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/08/02 08:56:02 UTC, 0 replies.
- [jira] [Created] (TIKA-2439) Avoid NullPointerException in org.apache.tika.langdetect.OptimaizeLangDetector if models haven't been loaded - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/08/02 10:51:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2439) Avoid NullPointerException in org.apache.tika.langdetect.OptimaizeLangDetector if models haven't been loaded - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/08/02 10:51:02 UTC, 1 replies.
- [jira] [Commented] (TIKA-2438) Test failure at OOXMLParserTest.testBigIntegersWGeneralFormat:1350->TikaTest.assertContains:102 - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/02 11:09:02 UTC, 3 replies.
- [jira] [Commented] (TIKA-2439) Avoid NullPointerException in org.apache.tika.langdetect.OptimaizeLangDetector if models haven't been loaded - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/02 11:10:00 UTC, 4 replies.
- [jira] [Updated] (TIKA-2438) Test failure at OOXMLParserTest.testBigIntegersWGeneralFormat:1350->TikaTest.assertContains:102 - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/08/02 11:13:02 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-2439) Avoid NullPointerException in org.apache.tika.langdetect.OptimaizeLangDetector if models haven't been loaded - posted by "Karl Richter (JIRA)" <ji...@apache.org> on 2017/08/02 11:24:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2438) Test failure at OOXMLParserTest.testBigIntegersWGeneralFormat:1350->TikaTest.assertContains:102 - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/02 11:36:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2439) Avoid NullPointerException in org.apache.tika.langdetect.OptimaizeLangDetector if models haven't been loaded - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/02 11:43:02 UTC, 0 replies.
- [jira] [Created] (TIKA-2440) Phonetic strings handling for multilingual environments. - posted by "Takahiro Ochi (JIRA)" <ji...@apache.org> on 2017/08/08 06:28:00 UTC, 0 replies.
- Re: Query related to Apache Tika dependencies - posted by Chris Mattmann <ma...@apache.org> on 2017/08/08 14:04:22 UTC, 0 replies.
- TIKA-2440 Remove Furigana/phonetic as default for xlsx? - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/08/09 12:43:39 UTC, 0 replies.
- [jira] [Created] (TIKA-2441) Unable to extract text present in a table inside a textbox in MS Word - posted by "Amit Humnabadkar (JIRA)" <ji...@apache.org> on 2017/08/09 13:52:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2441) Unable to extract text present in a table inside a textbox in MS Word - posted by "Amit Humnabadkar (JIRA)" <ji...@apache.org> on 2017/08/09 13:54:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies - posted by "Gus Heck (JIRA)" <ji...@apache.org> on 2017/08/09 14:49:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/08/09 15:08:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2402) Support all image formats in Object Recognition REST Parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/09 16:40:01 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2402) Support all image formats in Object Recognition REST Parser - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/09 16:40:01 UTC, 0 replies.
- Release of TREC Dynamic Domain: Polar Dataset - posted by "Mattmann, Chris A (3010)" <ch...@jpl.nasa.gov> on 2017/08/09 16:55:34 UTC, 0 replies.
- [jira] [Commented] (TIKA-2441) Unable to extract text present in a table inside a textbox in MS Word - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/10 12:34:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2268) Improve reports for Profile option in tika-eval - posted by "Hudson (JIRA)" <ji...@apache.org> on 2017/08/10 20:31:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2442) Non-terminal interactive form fields not handled recursively - posted by "Christopher Creutzig (JIRA)" <ji...@apache.org> on 2017/08/14 13:47:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2442) Non-terminal interactive form fields not handled recursively - posted by "Christopher Creutzig (JIRA)" <ji...@apache.org> on 2017/08/14 13:48:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2442) Non-terminal interactive form fields not handled recursively - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/14 14:36:00 UTC, 8 replies.
- [jira] [Commented] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/15 15:09:00 UTC, 4 replies.
- [jira] [Created] (TIKA-2443) Plain text file identified as rfc822 and which can cause StackOverflowError - posted by "Viorica Visan (JIRA)" <ji...@apache.org> on 2017/08/15 15:38:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2443) Plain text file identified as rfc822 and which can cause StackOverflowError - posted by "Viorica Visan (JIRA)" <ji...@apache.org> on 2017/08/15 15:39:00 UTC, 1 replies.
- [jira] [Updated] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/16 04:50:00 UTC, 3 replies.
- [jira] [Resolved] (TIKA-2355) Cache trained mode while running ObjectRecognition server from Docker builds - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/16 04:50:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2265) Problem with footnotes/endnotes in Tika.parseToString with MS Word (.docx) files - posted by "Hudson (JIRA)" <ji...@apache.org> on 2017/08/16 07:32:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2443) Plain text file identified as rfc822 and which can cause StackOverflowError - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/08/17 15:59:00 UTC, 4 replies.
- [jira] [Reopened] (TIKA-2374) Tika App -z should extract PDF inline images by default - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/17 17:23:01 UTC, 0 replies.
- [jira] [Commented] (TIKA-2374) Tika App -z should extract PDF inline images by default - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/17 17:40:01 UTC, 1 replies.
- [jira] [Commented] (TIKA-2440) Phonetic strings handling for multilingual environments. - posted by "Takahiro Ochi (JIRA)" <ji...@apache.org> on 2017/08/22 01:40:02 UTC, 6 replies.
- [jira] [Created] (TIKA-2444) JP2 codestream files not parsed - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/22 15:01:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2444) JP2 codestream files not parsed - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/22 16:04:00 UTC, 0 replies.
- POI 4.0 and Java 8 - posted by Andreas Beeker <ki...@apache.org> on 2017/08/22 22:10:39 UTC, 5 replies.
- [jira] [Comment Edited] (TIKA-2443) Plain text file identified as rfc822 and which can cause StackOverflowError - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/23 14:01:01 UTC, 0 replies.
- [jira] [Created] (TIKA-2445) Windows BAT / CMD detection - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/08/23 22:28:01 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2445) Windows BAT / CMD detection - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/08/23 22:42:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2445) Windows BAT / CMD detection - posted by "Hudson (JIRA)" <ji...@apache.org> on 2017/08/24 00:00:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2446) Tainted Zip file can provoke OOM errors - posted by "Thorsten Schäfer (JIRA)" <ji...@apache.org> on 2017/08/24 07:36:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2446) Tainted Zip file can provoke OOM errors - posted by "Thorsten Schäfer (JIRA)" <ji...@apache.org> on 2017/08/24 07:38:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2447) PSDParser creates unnecessary large byte array and discards it - posted by "Jan Burkhardt (JIRA)" <ji...@apache.org> on 2017/08/24 14:33:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2447) PSDParser creates unnecessary large byte array and discards it - posted by "Jan Burkhardt (JIRA)" <ji...@apache.org> on 2017/08/24 14:34:01 UTC, 5 replies.
- [jira] [Commented] (TIKA-2447) PSDParser creates unnecessary large byte array and discards it - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/24 14:45:01 UTC, 12 replies.
- [jira] [Resolved] (TIKA-2447) PSDParser creates unnecessary large byte array and discards it - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2017/08/24 18:04:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2298) To improve object recognition parser so that it may work without external RESTful service setup - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/24 20:18:00 UTC, 10 replies.
- [jira] [Issue Comment Deleted] (TIKA-2447) PSDParser creates unnecessary large byte array and discards it - posted by "Jan Burkhardt (JIRA)" <ji...@apache.org> on 2017/08/25 08:09:01 UTC, 0 replies.
- Tika 2.0? - posted by "Allison, Timothy B." <ta...@mitre.org> on 2017/08/28 13:32:39 UTC, 5 replies.
- [jira] [Updated] (TIKA-2400) Standardizing current Object Recognition REST parsers - posted by "Thejan Wijesinghe (JIRA)" <ji...@apache.org> on 2017/08/28 17:06:00 UTC, 1 replies.
- [jira] [Comment Edited] (TIKA-2440) Phonetic strings handling for multilingual environments. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/29 15:27:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2440) Phonetic strings handling for multilingual environments. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/29 16:27:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2448) Handle phonetic strings in the SAX docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/29 17:23:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2448) Handle phonetic strings in the SAX docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/29 17:23:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2448) Handle phonetic strings in the SAX docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/29 17:44:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output? - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/08/29 18:03:00 UTC, 1 replies.
- [jira] [Assigned] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output? - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/29 18:04:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output? - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/29 18:04:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2332) Output SNOMED codes for CUIs in CTAKES output? - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2017/08/29 18:04:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2449) Enabling extraction of standard references from text - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2017/08/29 19:23:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2449) Enabling extraction of standard references from text - posted by "Giuseppe Totaro (JIRA)" <ji...@apache.org> on 2017/08/29 19:26:00 UTC, 4 replies.
- [jira] [Created] (TIKA-2450) OfficeParser.parse called for zero-byte file with .doc extension - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/30 12:03:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2450) OfficeParser.parse called for zero-byte file with .doc extension - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 14:21:00 UTC, 9 replies.
- [jira] [Comment Edited] (TIKA-2450) OfficeParser.parse called for zero-byte file with .doc extension - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 14:46:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2451) Detect image frame counts for tiff files - posted by "Mike Cantrell (JIRA)" <ji...@apache.org> on 2017/08/30 16:24:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2451) Detect - posted by "Mike Cantrell (JIRA)" <ji...@apache.org> on 2017/08/30 16:24:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2448) Handle phonetic strings in the SAX docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 16:39:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2452) Handle phonetic strings in the classic DOM docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 16:40:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2452) Handle phonetic strings in the classic DOM docx parser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 16:40:01 UTC, 0 replies.
- [jira] [Commented] (TIKA-2451) Detect image frame counts for tiff files - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 16:41:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2453) Corrupt MBOX file detected as text/plain - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/30 16:53:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2450) OfficeParser.parse called for zero-byte file with .doc extension - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 17:18:01 UTC, 0 replies.
- [jira] [Commented] (TIKA-2444) JP2 codestream files not parsed - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 17:33:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2454) Emails extracted from PSTs detected as unexpected file types - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/30 17:34:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2454) Emails extracted from PSTs detected as unexpected file types - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 18:53:00 UTC, 8 replies.
- [jira] [Resolved] (TIKA-2454) Emails extracted from PSTs detected as unexpected file types - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/30 20:16:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2455) Flag in metadata for alternative email bodies - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/31 16:04:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/31 16:20:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/31 16:22:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/31 16:35:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/31 16:36:00 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Luis Filipe Nassif (JIRA)" <ji...@apache.org> on 2017/08/31 16:36:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2456) Emails extracted from MBOX not detected as rfc822 - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/31 16:55:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2457) Update MboxParser to more recent handling of embedded docs - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2017/08/31 17:11:00 UTC, 0 replies.
- [ANNOUNCE] Welcome Madhav Sharan as Tika Committer and PMC Member - posted by Dave Meikle <dm...@apache.org> on 2017/08/31 19:29:23 UTC, 2 replies.
- [jira] [Commented] (TIKA-2219) CharsetDetector no longer detects windows-1252 charset - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/31 21:56:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2219) CharsetDetector no longer detects windows-1252 charset - posted by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/08/31 21:57:00 UTC, 0 replies.