- [jira] [Updated] (TIKA-3282) OneNote Parser breaks non-ASCII Characters - posted by "Adrian Diemer (Jira)" <ji...@apache.org> on 2021/02/01 08:00:01 UTC, 1 replies.
- [jira] [Commented] (TIKA-3282) OneNote Parser breaks non-ASCII Characters - posted by "Adrian Diemer (Jira)" <ji...@apache.org> on 2021/02/01 08:01:00 UTC, 7 replies.
- [jira] [Commented] (TIKA-3286) Tika does not issue an error when language file doesn't exist; not supporting script files - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/01 13:59:00 UTC, 24 replies.
- [jira] [Created] (TIKA-3289) Allow startup of multiple tika-server processes from TikaServerCli - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/01 20:09:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3289) Allow startup of multiple tika-server processes from TikaServerCli - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/01 20:10:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3289) Allow startup of multiple tika-server processes from TikaServerCli - posted by "Hudson (Jira)" <ji...@apache.org> on 2021/02/01 23:45:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3287) Add http fetcher - posted by "Hudson (Jira)" <ji...@apache.org> on 2021/02/01 23:45:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3290) Extension reading it as eml instead of txt - posted by "Vamsi Molli (Jira)" <ji...@apache.org> on 2021/02/02 05:59:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3290) Extension reading it as eml instead of txt - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2021/02/02 09:58:00 UTC, 19 replies.
- [jira] [Comment Edited] (TIKA-3290) Extension reading it as eml instead of txt - posted by "Vamsi Molli (Jira)" <ji...@apache.org> on 2021/02/02 11:10:00 UTC, 6 replies.
- [jira] [Comment Edited] (TIKA-3286) Tika does not issue an error when language file doesn't exist; not supporting script files - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/02 15:28:00 UTC, 7 replies.
- [jira] [Created] (TIKA-3291) Tika error on parsing MOV files - posted by "Alexey Grigoriev (Jira)" <ji...@apache.org> on 2021/02/02 17:17:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3291) Tika error on parsing MOV files - posted by "Alexey Grigoriev (Jira)" <ji...@apache.org> on 2021/02/02 17:24:00 UTC, 0 replies.
- [GitHub] [tika] tballison opened a new pull request #401: TIKA-3288 - posted by GitBox <gi...@apache.org> on 2021/02/03 17:49:23 UTC, 0 replies.
- [GitHub] [tika] tballison merged pull request #401: TIKA-3288 - posted by GitBox <gi...@apache.org> on 2021/02/03 17:49:56 UTC, 0 replies.
- [jira] [Commented] (TIKA-3288) Allow batching for emitters - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/02/03 17:50:00 UTC, 2 replies.
- [GitHub] [tika] peterkronenberg opened a new pull request #402: Tika 3286 - Check if lanaguage files exists and provide better error msg; Support script lanaguge files in script directory - posted by GitBox <gi...@apache.org> on 2021/02/03 19:09:00 UTC, 0 replies.
- [GitHub] [tika] tballison commented on a change in pull request #402: Tika 3286 - Check if lanaguage files exists and provide better error msg; Support script lanaguge files in script directory - posted by GitBox <gi...@apache.org> on 2021/02/04 15:31:57 UTC, 4 replies.
- [GitHub] [tika] peterkronenberg commented on a change in pull request #402: Tika 3286 - Check if lanaguage files exists and provide better error msg; Support script lanaguge files in script directory - posted by GitBox <gi...@apache.org> on 2021/02/04 16:12:58 UTC, 3 replies.
- [jira] [Updated] (TIKA-3292) Remove GSON where possible in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 16:21:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3292) Remove GSON where possible in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 16:21:00 UTC, 5 replies.
- [jira] [Created] (TIKA-3292) Remove GSON where possible in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 16:21:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3293) Move most commandline options for tika-server into a config file in 2.0.0 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 16:31:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3292) Remove GSON where possible in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 17:14:00 UTC, 0 replies.
- [GitHub] [tika] tballison merged pull request #402: Tika 3286 - Check if lanaguage files exists and provide better error msg; Support script lanaguge files in script directory - posted by GitBox <gi...@apache.org> on 2021/02/04 18:01:39 UTC, 1 replies.
- [GitHub] [tika] tballison commented on pull request #402: Tika 3286 - Check if lanaguage files exists and provide better error msg; Support script lanaguge files in script directory - posted by GitBox <gi...@apache.org> on 2021/02/04 18:01:49 UTC, 1 replies.
- [jira] [Updated] (TIKA-3286) Tika does not issue an error when language file doesn't exist; not supporting script files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 18:15:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3286) Tika does not issue an error when language file doesn't exist; not supporting script files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/04 19:59:00 UTC, 0 replies.
- JDK 16 is now in the Release Candidate Phase - posted by Rory O'Donnell <ro...@oracle.com> on 2021/02/05 10:40:38 UTC, 0 replies.
- [jira] [Created] (TIKA-3294) Usage of "ECB" mode for "AES" is insecure - posted by "Md Mahir Asef Kabir (Jira)" <ji...@apache.org> on 2021/02/05 15:19:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3294) Usage of "ECB" mode for "AES" is insecure - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2021/02/05 15:24:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-3294) Usage of "ECB" mode for "AES" is insecure - posted by "Md Mahir Asef Kabir (Jira)" <ji...@apache.org> on 2021/02/05 16:17:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3277) Apache POI 5.0.0 released - posted by "PJ Fanning (Jira)" <ji...@apache.org> on 2021/02/05 18:30:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-3277) Apache POI 5.0.0 released - posted by "PJ Fanning (Jira)" <ji...@apache.org> on 2021/02/05 18:30:00 UTC, 0 replies.
- [GitHub] [tika] dameikle merged pull request #398: Removed exclusion of tika-server-core from tika-server-classic - posted by GitBox <gi...@apache.org> on 2021/02/06 16:38:48 UTC, 0 replies.
- [jira] [Created] (TIKA-3295) ForkServer timeout due to a bug that could not load a class(not serialized ) - posted by "TWang (Jira)" <ji...@apache.org> on 2021/02/07 08:22:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3295) ForkServer timeout due to a bug that could not load a class(not serialized ) - posted by "TWang (Jira)" <ji...@apache.org> on 2021/02/07 10:09:00 UTC, 4 replies.
- [jira] [Commented] (TIKA-3295) ForkServer timeout due to a bug that could not load a class(not serialized ) - posted by "TWang (Jira)" <ji...@apache.org> on 2021/02/07 10:11:00 UTC, 1 replies.
- [jira] [Created] (TIKA-3296) Allow tesseract/tessdata path to be specified by environment variables - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/08 16:14:00 UTC, 0 replies.
- [GitHub] [tika] peterkronenberg opened a new pull request #403: Allow tesseract/tessdata path to be specified by environment variables - posted by GitBox <gi...@apache.org> on 2021/02/08 16:24:37 UTC, 0 replies.
- [jira] [Commented] (TIKA-3296) Allow tesseract/tessdata path to be specified by environment variables - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/08 16:25:00 UTC, 8 replies.
- [GitHub] [tika] lfcnassif commented on pull request #403: Allow tesseract/tessdata path to be specified by environment variables - posted by GitBox <gi...@apache.org> on 2021/02/08 16:48:01 UTC, 2 replies.
- [GitHub] [tika] peterkronenberg commented on pull request #403: Allow tesseract/tessdata path to be specified by environment variables - posted by GitBox <gi...@apache.org> on 2021/02/08 17:10:44 UTC, 4 replies.
- Discussion on TIKA-3296 - posted by Peter Kronenberg <pe...@torch.ai> on 2021/02/08 19:57:55 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-3296) Allow tesseract/tessdata path to be specified by environment variables - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/08 19:58:00 UTC, 2 replies.
- load error handler in TikaConfig for 2.x? - posted by Tim Allison <ta...@apache.org> on 2021/02/09 02:33:53 UTC, 3 replies.
- [jira] [Created] (TIKA-3297) Simplify parser configuration in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/09 17:11:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3297) Simplify parser configuration in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/09 21:10:00 UTC, 12 replies.
- RE: {EXTERNAL}[jira] [Commented] (TIKA-3297) Simplify parser configuration in 2.x - posted by Peter Kronenberg <pe...@torch.ai> on 2021/02/09 21:21:28 UTC, 5 replies.
- [jira] [Resolved] (TIKA-3297) Simplify parser configuration in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/09 23:03:00 UTC, 0 replies.
- [GitHub] [tika] peterkronenberg closed pull request #403: Allow tesseract/tessdata path to be specified by environment variables - posted by GitBox <gi...@apache.org> on 2021/02/10 03:32:24 UTC, 0 replies.
- [jira] [Reopened] (TIKA-3297) Simplify parser configuration in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/10 15:16:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-3297) Simplify parser configuration in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/10 15:17:00 UTC, 2 replies.
- [jira] [Created] (TIKA-3298) Add a "preloadLangs" parameter to TesseractOCRParser - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/10 22:41:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3298) Add a "preloadLangs" parameter to TesseractOCRParser - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/10 22:42:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3299) StringsParser leaving behind temp file - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/10 23:05:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3298) Add a "preloadLangs" parameter to TesseractOCRParser - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/10 23:39:00 UTC, 15 replies.
- [jira] [Updated] (TIKA-3298) Add a "preloadLangs" parameter to TesseractOCRParser - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/11 00:00:00 UTC, 2 replies.
- [jira] [Comment Edited] (TIKA-3298) Add a "preloadLangs" parameter to TesseractOCRParser - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/11 16:49:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3293) Move most commandline options for tika-server into a config file in 2.0.0 - posted by "Hudson (Jira)" <ji...@apache.org> on 2021/02/11 17:04:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3299) StringsParser leaving behind temp file - posted by "Hudson (Jira)" <ji...@apache.org> on 2021/02/11 17:04:00 UTC, 1 replies.
- [jira] [Created] (TIKA-3300) Figure out if we can improve tesseract parallelization - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/12 16:46:01 UTC, 0 replies.
- [jira] [Commented] (TIKA-3263) WriteLimitReachedException is not public - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/12 17:28:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-94) Speech recognition - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2021/02/12 23:36:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-94) Speech recognition - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2021/02/12 23:37:00 UTC, 14 replies.
- [jira] [Commented] (TIKA-3300) Figure out if we can improve tesseract parallelization - posted by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2021/02/14 14:42:00 UTC, 3 replies.
- [jira] [Created] (TIKA-3301) Simplify forking/monitoring in tika-server for 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/16 16:49:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3301) Simplify forking/monitoring in tika-server for 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/16 17:22:00 UTC, 0 replies.
- 1.26 release? - posted by Tim Allison <ta...@apache.org> on 2021/02/16 17:23:03 UTC, 0 replies.
- [jira] [Updated] (TIKA-3301) Simplify forking/monitoring in tika-server for 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/16 18:06:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3301) Simplify forking/monitoring in tika-server for 2.x - posted by "Hudson (Jira)" <ji...@apache.org> on 2021/02/16 18:56:00 UTC, 1 replies.
- [jira] [Comment Edited] (TIKA-3300) Figure out if we can improve tesseract parallelization - posted by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2021/02/18 13:13:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-94) Speech recognition - posted by "Lewis John McGibbney (Jira)" <ji...@apache.org> on 2021/02/19 03:50:00 UTC, 1 replies.
- [jira] [Created] (TIKA-3302) CHANGES for 1.25 missing - posted by "Julian Reschke (Jira)" <ji...@apache.org> on 2021/02/19 07:55:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3303) Broken link to Getting Started page on https://tika.apache.org/ - posted by "Luis de Vasconcelos (Jira)" <ji...@apache.org> on 2021/02/19 16:34:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3303) Broken link to Getting Started page on https://tika.apache.org/ - posted by "Luis de Vasconcelos (Jira)" <ji...@apache.org> on 2021/02/19 16:36:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-3303) Broken link to Getting Started page on https://tika.apache.org/ - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/02/19 19:40:00 UTC, 4 replies.
- [jira] [Created] (TIKA-3304) Improve robustness of async processing in 2.x - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2021/02/19 22:17:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3290) Extension reading it as eml instead of txt - posted by "Vamsi Molli (Jira)" <ji...@apache.org> on 2021/02/22 04:44:00 UTC, 1 replies.
- [GitHub] [tika] pjfanning opened a new pull request #404: WIP: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/22 22:09:12 UTC, 0 replies.
- [GitHub] [tika] pjfanning commented on pull request #404: WIP: TIKA-3164: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/22 22:27:55 UTC, 6 replies.
- [jira] [Commented] (TIKA-3164) Upgrade to POI 5.0.0 when available - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/02/22 22:28:00 UTC, 22 replies.
- [GitHub] [tika] tballison commented on pull request #404: WIP: TIKA-3164: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/23 10:05:38 UTC, 11 replies.
- [GitHub] [tika] tballison edited a comment on pull request #404: WIP: TIKA-3164: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/23 11:19:31 UTC, 1 replies.
- [GitHub] [tika] pjfanning edited a comment on pull request #404: WIP: TIKA-3164: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/23 12:02:51 UTC, 0 replies.
- [GitHub] [tika] pjfanning closed pull request #404: WIP: TIKA-3164: POI 5.0.0 - posted by GitBox <gi...@apache.org> on 2021/02/23 13:43:18 UTC, 0 replies.
- [jira] [Updated] (TIKA-3305) How do you handle PDFs with custom encoding? - posted by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/02/23 22:49:00 UTC, 1 replies.
- [jira] [Created] (TIKA-3305) How do you handle PDFs with custom encoding? - posted by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/02/23 22:49:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3305) How do you handle PDFs with custom encoding? - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/02/24 03:39:00 UTC, 3 replies.
- [jira] [Comment Edited] (TIKA-3305) How do you handle PDFs with custom encoding? - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/02/24 03:42:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-3305) How do you handle PDFs with custom encoding? - posted by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/02/24 14:23:00 UTC, 0 replies.
- [GitHub] [tika] peterkronenberg opened a new pull request #405: Correct debugging output - posted by GitBox <gi...@apache.org> on 2021/02/24 14:42:02 UTC, 0 replies.
- [GitHub] [tika] tballison merged pull request #405: Correct debugging output - posted by GitBox <gi...@apache.org> on 2021/02/24 17:56:25 UTC, 0 replies.
- [GitHub] [tika] lewismc opened a new pull request #406: WIP: TIKA-94 Speech recognition - posted by GitBox <gi...@apache.org> on 2021/02/26 05:25:51 UTC, 0 replies.
- [GitHub] [tika] lewismc commented on a change in pull request #406: WIP: TIKA-94 Speech recognition - posted by GitBox <gi...@apache.org> on 2021/02/26 05:44:28 UTC, 0 replies.
- [GitHub] [tika] rohan2810 commented on a change in pull request #406: WIP: TIKA-94 Speech recognition - posted by GitBox <gi...@apache.org> on 2021/02/26 06:57:54 UTC, 2 replies.
- [jira] [Commented] (TIKA-3255) Parsing MP3 file with record size > 100000 fails - posted by "Peter Kronenberg (Jira)" <ji...@apache.org> on 2021/02/27 21:37:00 UTC, 0 replies.
- [GitHub] [tika] abehara2 commented on pull request #406: WIP: TIKA-94 Speech recognition - posted by GitBox <gi...@apache.org> on 2021/02/28 06:16:55 UTC, 0 replies.
- [GitHub] [tika] phantuanminh commented on a change in pull request #406: WIP: TIKA-94 Speech recognition - posted by GitBox <gi...@apache.org> on 2021/02/28 09:28:55 UTC, 0 replies.