- [jira] [Commented] (TIKA-2749) OCR on PDFs should "just work" out of the box - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/02 11:09:00 UTC, 8 replies.
- [jira] [Created] (TIKA-2845) Override ProcessPages in PDFTextStripper - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 13:15:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2845) Override ProcessPages in PDFTextStripper - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 13:16:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2845) Override ProcessPages in PDFTextStripper - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 13:17:04 UTC, 3 replies.
- [jira] [Comment Edited] (TIKA-2845) Override ProcessPages in PDFTextStripper - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 13:18:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2845) Override ProcessPages in PDFTextStripper - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 14:16:03 UTC, 0 replies.
- [jira] [Created] (TIKA-2846) Add per page unicode mapping stats to the metadata in the PDFParser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 15:21:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2846) Add per page unicode mapping stats to the metadata in the PDFParser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 15:22:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2846) Add per page unicode mapping stats to the metadata in the PDFParser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 16:34:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2846) Add per page unicode mapping stats to the metadata in the PDFParser - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 16:35:00 UTC, 3 replies.
- [jira] [Created] (TIKA-2847) OutOfMemoryError - tika1.19.1.jar - posted by "Ashish Tiwari (JIRA)" <ji...@apache.org> on 2019/04/03 19:58:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2847) OutOfMemoryError - tika1.19.1.jar - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/03 22:14:00 UTC, 10 replies.
- [jira] [Comment Edited] (TIKA-2749) OCR on PDFs should "just work" out of the box - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/04 13:45:00 UTC, 6 replies.
- [jira] [Commented] (TIKA-2840) windows batch file not detected - posted by "chandra (JIRA)" <ji...@apache.org> on 2019/04/04 15:02:00 UTC, 4 replies.
- [jira] [Assigned] (TIKA-2555) Text with [underline] + [another format] in word document generates overlapping html tags. - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/04 17:56:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2601) Invalid XHTML output for some WORD documents - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/04 21:39:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-2347) Underlined text is not decorated as such when extracting from word documents - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/04 21:40:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2601) Invalid XHTML output for some WORD documents - posted by "Funbit (JIRA)" <ji...@apache.org> on 2019/04/05 09:21:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2601) Invalid XHTML output for some WORD documents - posted by "Funbit (JIRA)" <ji...@apache.org> on 2019/04/05 09:22:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2847) OutOfMemoryError - tika1.19.1.jar - posted by "Ashish Tiwari (JIRA)" <ji...@apache.org> on 2019/04/05 13:05:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2848) This file consumes an inordinate amount of memory when parsed by Tika - posted by "Tim Barrett (JIRA)" <ji...@apache.org> on 2019/04/07 15:14:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2848) This file consumes an inordinate amount of memory when parsed by Tika - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/07 15:56:00 UTC, 7 replies.
- [jira] [Created] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Boris Petrov (JIRA)" <ji...@apache.org> on 2019/04/08 08:49:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2019/04/08 15:09:00 UTC, 39 replies.
- Tika 1.21? - posted by Tim Allison <ta...@apache.org> on 2019/04/08 18:11:42 UTC, 4 replies.
- [jira] [Updated] (TIKA-2848) This file consumes an inordinate amount of memory when parsed by Tika - posted by "Tim Barrett (JIRA)" <ji...@apache.org> on 2019/04/09 14:28:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Boris Petrov (JIRA)" <ji...@apache.org> on 2019/04/09 14:58:00 UTC, 7 replies.
- [jira] [Created] (TIKA-2850) Add more limits to comparison report sql calls - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/09 16:32:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2850) Add more limits to comparison report sql calls - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/09 16:34:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2851) Upgrade to POI 4.1.1 when available - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/10 19:28:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2850) Add more limits to comparison report sql calls - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/10 21:17:00 UTC, 2 replies.
- [jira] [Assigned] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/11 09:55:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2852) Add reports for missing/unaligned files in tika-eval Compare mode - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/11 20:51:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2852) Add reports for missing/unaligned files in tika-eval Compare mode - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/11 20:52:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2852) Add reports for missing/unaligned files in tika-eval Compare mode - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/11 21:10:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2852) Add reports for missing/unaligned files in tika-eval Compare mode - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/11 21:44:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-2835) Upgrade to PDFBox 2.0.15 when available - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/12 22:28:00 UTC, 3 replies.
- [jira] [Created] (TIKA-2853) Consider applying NaiveBayes or similar simple ML to streaming zip detector - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/12 22:32:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2854) upgrade out-of-date dependencies with outstanding CVEs - posted by "Andrew Pavlin (JIRA)" <ji...@apache.org> on 2019/04/16 17:24:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2854) upgrade out-of-date dependencies with outstanding CVEs - posted by "Andrew Pavlin (JIRA)" <ji...@apache.org> on 2019/04/16 19:00:00 UTC, 9 replies.
- [jira] [Commented] (TIKA-2853) Consider applying NaiveBayes or similar simple ML to streaming zip detector - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/17 12:16:00 UTC, 1 replies.
- Re: Wiki migration - posted by Konstantin Gribov <gr...@gmail.com> on 2019/04/17 14:05:42 UTC, 9 replies.
- [jira] [Created] (TIKA-2855) pdfbox version used by both Apache Tika 1.19.1 and 1.20 is vulnerable - posted by "Abhijit Rajwade (JIRA)" <ji...@apache.org> on 2019/04/18 11:19:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES - posted by "Chinh Nguyen (JIRA)" <ji...@apache.org> on 2019/04/18 12:05:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2856) Cannot detect digest PKCS7 file - posted by "Chinh Nguyen (JIRA)" <ji...@apache.org> on 2019/04/18 12:14:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2857) Cannot parse PKCS7 files - posted by "Chinh Nguyen (JIRA)" <ji...@apache.org> on 2019/04/18 12:27:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2857) Cannot parse PKCS7 files - posted by "Chinh Nguyen (JIRA)" <ji...@apache.org> on 2019/04/18 12:31:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-2857) Cannot parse PKCS7 files - posted by "Chinh Nguyen (JIRA)" <ji...@apache.org> on 2019/04/18 12:37:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2856) Cannot detect digest PKCS7 file - posted by "Roberto Benedetti (JIRA)" <ji...@apache.org> on 2019/04/18 15:09:00 UTC, 2 replies.
- [jira] [Comment Edited] (TIKA-2856) Cannot detect digest PKCS7 file - posted by "Roberto Benedetti (JIRA)" <ji...@apache.org> on 2019/04/18 15:10:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-1997) Problem in Tika().detect for xml file signed in CADES - posted by "Roberto Benedetti (JIRA)" <ji...@apache.org> on 2019/04/18 15:11:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2601) Invalid XHTML output (overlapping a and formatting tags) for some WORD documents - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/18 15:13:00 UTC, 0 replies.
- [jira] [Reopened] (TIKA-2601) Invalid XHTML output (overlapping a and formatting tags) for some WORD documents - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/18 15:14:00 UTC, 0 replies.
- JDK 13 - Early Access build 17 is available - posted by Rory O'Donnell <ro...@oracle.com> on 2019/04/19 12:41:49 UTC, 0 replies.
- [jira] [Created] (TIKA-2858) JAXRS server: allow passwords with special chars (MIME encoded words) - posted by "Ross Johnson (JIRA)" <ji...@apache.org> on 2019/04/19 16:30:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2835) Upgrade to PDFBox 2.0.15 when available - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/19 20:23:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2855) pdfbox version used by both Apache Tika 1.19.1 and 1.20 is vulnerable - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/19 20:24:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/19 20:25:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2849) TikaInputStream copies the input stream locally - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/19 20:25:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2854) upgrade out-of-date dependencies with outstanding CVEs - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/19 21:08:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2858) JAXRS server: allow passwords with special chars (MIME encoded words) - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/22 13:01:00 UTC, 12 replies.
- [jira] [Comment Edited] (TIKA-2858) JAXRS server: allow passwords with special chars (MIME encoded words) - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/22 13:02:00 UTC, 1 replies.
- [jira] [Assigned] (TIKA-2566) Move logging in tika-core to log4j via slf4j as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/22 17:00:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2555) Text with [underline] + [another format] in word document generates overlapping html tags. - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/22 17:16:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-2601) Invalid XHTML output (overlapping a and formatting tags) for some WORD documents - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/22 17:16:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2824) General dependency/plugin upgrades for next release - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/22 19:27:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-2804) Blanket dependency upgrades for next release cycle - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/22 19:27:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2801) Tika includes 2 vulnerable components - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/22 19:30:00 UTC, 0 replies.
- [COMPRESS] zip-based entry names/metadata data set available - posted by Tim Allison <ta...@apache.org> on 2019/04/22 20:29:50 UTC, 0 replies.
- [jira] [Commented] (TIKA-2824) General dependency/plugin upgrades for next release - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/22 20:33:00 UTC, 2 replies.
- tika-2.x-windows - Build # 403 - Failure - posted by Apache Jenkins Server <je...@builds.apache.org> on 2019/04/22 21:16:29 UTC, 0 replies.
- [jira] [Commented] (TIKA-2293) Tess4jOCRParser - A simpler Java version of TesseractOCRParser - posted by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/04/23 10:16:00 UTC, 4 replies.
- [jira] [Resolved] (TIKA-2841) Improve robustness of parsers of zip-based files on truncated files - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/23 13:04:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2601) Invalid XHTML output (overlapping a and formatting tags) for some WORD documents - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 13:46:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2555) Text with [underline] + [another format] in word document generates overlapping html tags. - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 13:46:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2566) Move logging in tika-core to log4j via slf4j as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:11:00 UTC, 0 replies.
- [jira] [Reopened] (TIKA-2566) Move logging in tika-core to log4j via slf4j as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:11:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2566) Move logging in tika-core to log4j via slf4j as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:11:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2315) Update logging page at wiki with actual info - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:12:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2314) Migrate logging to slf4j in master (2.x) branch - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:12:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-2315) Update logging page at wiki with actual info - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 15:13:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2314) Migrate logging to slf4j in master (2.x) branch - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/23 23:25:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2566) Move logging in tika-core to log4j via slf4j as we do in the rest of Tika - posted by "Hudson (JIRA)" <ji...@apache.org> on 2019/04/23 23:45:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2566) Move logging in tika-core to slf4j-api (with log4j in test scope) as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/24 00:12:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2566) Move logging in tika-core to slf4j-api (with log4j in test scope) as we do in the rest of Tika - posted by "Konstantin Gribov (JIRA)" <ji...@apache.org> on 2019/04/24 00:12:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2859) Add to html output - posted by "Lior Yaffe (JIRA)" <ji...@apache.org> on 2019/04/24 14:02:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2860) Improve documentation of specifying tika_config.xml in batch mode - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/26 12:46:00 UTC, 0 replies.
- Re: [EXTERNAL] Tika script - posted by Chris Mattmann <ma...@apache.org> on 2019/04/26 20:00:18 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2840) windows batch file not detected - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/30 10:05:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2840) windows batch file not detected - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/30 10:24:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2795) Error starting Tika 2.0 server with -spawnChild on Ubuntu - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2019/04/30 15:52:00 UTC, 0 replies.
- Quarkus integration - posted by Sergey Beryozkin <sb...@gmail.com> on 2019/04/30 17:22:20 UTC, 0 replies.