- [jira] [Commented] (TIKA-3034) Detector always returns text/plain when scanning Mathematica files - posted by "Mihai Glont (Jira)" <ji...@apache.org> on 2020/02/03 12:28:00 UTC, 7 replies.
- [jira] [Comment Edited] (TIKA-3034) Detector always returns text/plain when scanning Mathematica files - posted by "Mihai Glont (Jira)" <ji...@apache.org> on 2020/02/03 12:29:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-3010) Tika needs service installation script - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/02/03 13:30:01 UTC, 5 replies.
- [jira] [Updated] (TIKA-3010) Tika needs service installation script - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/04 23:16:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3010) Tika needs service installation script - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/04 23:16:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3036) broken build: "group id is too large" on a Mac - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/05 00:17:00 UTC, 0 replies.
- 1.24? - posted by Tim Allison <ta...@apache.org> on 2020/02/05 14:41:01 UTC, 2 replies.
- [jira] [Created] (TIKA-3037) Tika Docs should highlight Tika-Server - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 15:25:00 UTC, 0 replies.
- Re: [EXTERNAL] Do we have a community supported approach for deploying Tika Server in production? - posted by Eric Pugh <ep...@opensourceconnections.com> on 2020/02/05 15:34:00 UTC, 3 replies.
- [jira] [Updated] (TIKA-3034) Detector always returns text/plain when scanning Mathematica files - posted by "Tung Nguyen (Jira)" <ji...@apache.org> on 2020/02/05 15:42:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3037) Tika Docs should highlight Tika-Server - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 15:42:00 UTC, 12 replies.
- [jira] [Updated] (TIKA-3037) Tika Docs should highlight Tika-Server - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 18:25:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2253) Obtain new Miredot license key and upgrade plugin version in tika-server - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 18:32:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3038) Miredot license key expired - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 18:36:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3031) NumberFormatException while parsing a certain PDF document - posted by "Trevor Bentley (Jira)" <ji...@apache.org> on 2020/02/05 18:53:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-3038) Miredot license key expired - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/05 19:07:01 UTC, 5 replies.
- [jira] [Commented] (TIKA-3023) Text files starting with MOVI are detected as X-SGI-Movie - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2020/02/06 11:45:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-3023) Text files starting with MOVI are detected as X-SGI-Movie - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2020/02/06 11:46:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server - posted by "David Eric Pugh (Jira)" <ji...@apache.org> on 2020/02/06 23:40:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/02/07 00:15:00 UTC, 7 replies.
- [jira] [Commented] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2020/02/09 14:36:00 UTC, 8 replies.
- JDK 14 is now in the Release Candidate Phase - posted by Rory O'Donnell <ro...@oracle.com> on 2020/02/10 11:03:15 UTC, 0 replies.
- FW: [EXTERNAL] question about Tika - posted by Chris Mattmann <ma...@apache.org> on 2020/02/10 18:18:08 UTC, 1 reply.
- Tika Python not recognizing content. - posted by Max Franklin <ma...@gmail.com> on 2020/02/10 19:04:04 UTC, 1 reply.
- [jira] [Commented] (TIKA-3017) OOM in XSLFSheet.java - posted by "Don (Jira)" <ji...@apache.org> on 2020/02/11 17:45:00 UTC, 1 reply.
- [jira] [Created] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work) - posted by "Markus Mandalka (Jira)" <ji...@apache.org> on 2020/02/11 20:20:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work) - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2020/02/11 20:28:00 UTC, 16 replies.
- [jira] [Comment Edited] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work) - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2020/02/11 20:44:00 UTC, 3 replies.
- [jira] [Created] (TIKA-3041) ExtractInlineImages missing images from PDFBOX-52 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/12 14:03:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3006) Regression in PDF keywords extraction since 1.23 - posted by "David Pilato (Jira)" <ji...@apache.org> on 2020/02/12 15:57:00 UTC, 10 replies.
- [jira] [Commented] (TIKA-3041) ExtractInlineImages missing images from PDFBOX-52 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/12 18:12:00 UTC, 3 replies.
- [jira] [Resolved] (TIKA-3041) ExtractInlineImages missing images from PDFBOX-52 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/12 18:12:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3042) Date format extraction problem in XLS/XLSX - posted by "Zoltan Farago (Jira)" <ji...@apache.org> on 2020/02/13 12:16:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3026) Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/13 17:23:00 UTC, 2 replies.
- [jira] [Created] (TIKA-3043) vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST - posted by "CHARUSHEELA BOPARDIKAR (Jira)" <ji...@apache.org> on 2020/02/13 17:25:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3043) vorbis-java-tika overwrites tika's Parser and Detector in MANIFEST - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2020/02/13 18:48:00 UTC, 1 reply.
- [jira] [Created] (TIKA-3044) add -C/--content cli option using WriteOutContentHandler - posted by "Alexander Klimetschek (Jira)" <ji...@apache.org> on 2020/02/14 06:44:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3044) add -C/--content cli option using WriteOutContentHandler - posted by "Alexander Klimetschek (Jira)" <ji...@apache.org> on 2020/02/14 06:46:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3045) Allow users to run custom parsing of xfa and xmp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/14 12:39:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3046) Add detection of some open office related formats - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/14 16:50:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3046) Add detection of some open office related formats - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/14 16:56:00 UTC, 0 replies.
- [COMPRESS and Tika/PDFBox/POI] files from bug trackers - posted by Tim Allison <ta...@apache.org> on 2020/02/14 21:48:55 UTC, 1 reply.
- [jira] [Created] (TIKA-3047) Upgrade to POI 4.1.2 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/14 22:00:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2650) Soft-hyphen is not extracted properly - posted by "Yauheni Salopiy (Jira)" <ji...@apache.org> on 2020/02/18 21:32:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-2650) Soft-hyphen is not extracted properly - posted by "Yauheni Salopiy (Jira)" <ji...@apache.org> on 2020/02/18 22:27:00 UTC, 5 replies.
- [jira] [Created] (TIKA-3048) Tika unable to parse html files with GB2312 charset - posted by "Akash (Jira)" <ji...@apache.org> on 2020/02/19 14:18:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2650) Soft-hyphen is not extracted properly - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2020/02/19 18:26:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3042) Date format extraction problem in XLS/XLSX - posted by "Zoltan Farago (Jira)" <ji...@apache.org> on 2020/02/20 10:31:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-3048) Tika unable to parse html files with non UTF-8 charset - posted by "Akash (Jira)" <ji...@apache.org> on 2020/02/20 10:59:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-3048) Tika unable to parse html files with non UTF-8 charset - posted by "Akash (Jira)" <ji...@apache.org> on 2020/02/20 11:02:00 UTC, 16 replies.
- [jira] [Created] (TIKA-3049) Improve file detection...varia - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/20 17:27:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-3048) Tika unable to parse html files with non UTF-8 charset - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/20 18:42:00 UTC, 1 reply.
- [jira] [Commented] (TIKA-3049) Improve file detection...varia - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/20 21:33:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3050) Add xmp extraction to psd files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/20 22:04:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3042) Date format extraction problem in XLS/XLSX - posted by "Zoltan Farago (Jira)" <ji...@apache.org> on 2020/02/21 07:21:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3051) Buffer Overflow in com.drewnoakes:metadata-extractor 2.11.0 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:27:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3052) Unsafe Dependancy Resolution in com.beust:jcommander 1.35 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:28:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3053) Denial of Service (DoS) in org.apache.cxf:cxf-core 3.3.2 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:29:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-3053) Denial of Service (DoS) in org.apache.cxf:cxf-core 3.3.2 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:30:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3054) Cross-site Scripting (XSS) in org.apache.cxf:cxf-rt-transports-http 3.3.2 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:32:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3054) [Dependency] Cross-site Scripting (XSS) in org.apache.cxf:cxf-rt-transports-http 3.3.2 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:33:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3051) [Dependency] Buffer Overflow in com.drewnoakes:metadata-extractor 2.11.0 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:34:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3052) [Dependency] Unsafe Dependancy Resolution in com.beust:jcommander 1.35 - posted by "Michael Moritz (Jira)" <ji...@apache.org> on 2020/02/21 10:34:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3050) Add xmp extraction to psd files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/21 11:37:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-3045) Allow users to run custom parsing of xfa and xmp - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/21 16:58:00 UTC, 1 reply.
- [jira] [Commented] (TIKA-2952) Vulnerable "metadata-extractor 2.11.0" is present in tika 1.22. - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/21 19:03:00 UTC, 3 replies.
- [jira] [Resolved] (TIKA-3051) [Dependency] Buffer Overflow in com.drewnoakes:metadata-extractor 2.11.0 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/21 19:08:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3055) Add an optional PreflightPDFParser - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/21 22:01:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3055) Add an optional PreflightPDFParser - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/21 22:01:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2952) Vulnerable "metadata-extractor 2.11.0" is present in tika 1.22. - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 16:29:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2956) Stack Overflow issue reported on metadata-extractor used version by Tika - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 16:33:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3056) General upgrades for 1.24 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 17:26:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3056) General upgrades for 1.24 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:08:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2952) Vulnerable "metadata-extractor 2.11.0" is present in tika 1.22. - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:52:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3054) [Dependency] Cross-site Scripting (XSS) in org.apache.cxf:cxf-rt-transports-http 3.3.2 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:53:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3052) [Dependency] Unsafe Dependancy Resolution in com.beust:jcommander 1.35 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:53:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3050) Add xmp extraction to psd files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:54:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3045) Allow users to run custom parsing of xfa and xmp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:54:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3033) Upgrade to PDFBox 2.0.19 when available - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:55:01 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3047) Upgrade to POI 4.1.2 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:56:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3006) Regression in PDF keywords extraction since 1.23 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:57:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3017) OOM in XSLFSheet.java - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 18:57:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3040) PDF inline OCR: Exception while processing certain image (others in same PDF work) - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 19:01:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3026) Consider extracting structure/tags where possible in PDFs with the PDFMarkedContentExtractor - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 19:03:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2837) Performance/Stability problem in ToHTMLContentHandler - posted by "Cristian Vat (Jira)" <ji...@apache.org> on 2020/02/24 19:06:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3037) Tika Docs should highlight Tika-Server - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 19:43:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3031) NumberFormatException while parsing a certain PDF document - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/24 19:44:00 UTC, 0 replies.
- [jira] [Issue Comment Deleted] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2020/02/24 20:09:00 UTC, 0 replies.
- Re: Miredot License Key for Apache Tika Project - posted by Tyler Bui-Palsulich <tp...@apache.org> on 2020/02/24 21:39:15 UTC, 0 replies.
- [jira] [Commented] (TIKA-3047) Upgrade to POI 4.1.2 - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/25 08:21:00 UTC, 1 reply.
- [jira] [Commented] (TIKA-3033) Upgrade to PDFBox 2.0.19 when available - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/25 08:21:00 UTC, 1 reply.
- [jira] [Comment Edited] (TIKA-3006) Regression in PDF keywords extraction since 1.23 - posted by "David Pilato (Jira)" <ji...@apache.org> on 2020/02/25 11:42:00 UTC, 2 replies.
- pushing branch_1x to Apache snapshots? - posted by Tim Allison <ta...@apache.org> on 2020/02/25 12:22:52 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-3035) Tika-app --extract mode outputs to stderr instead of stdout - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/25 15:30:00 UTC, 1 reply.
- [jira] [Commented] (TIKA-3036) broken build: "group id is too large" on a Mac - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/25 19:22:00 UTC, 1 reply.
- Trouble with Parsing a PDF of a Drawing - posted by Kegan Huntley <Ke...@labstrong.com> on 2020/02/25 23:07:35 UTC, 1 reply.
- [jira] [Resolved] (TIKA-3048) Tika unable to parse html files with non UTF-8 charset - posted by "Akash (Jira)" <ji...@apache.org> on 2020/02/26 12:35:00 UTC, 0 replies.
- [jira] [Closed] (TIKA-3048) Tika unable to parse html files with non UTF-8 charset - posted by "Akash (Jira)" <ji...@apache.org> on 2020/02/26 12:35:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3057) Improve detection of zip-based formats - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/28 16:12:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3039) Remove mvn dockerfile:build goal from tika-server - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/28 16:47:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3057) Improve detection of zip-based formats - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/28 16:47:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3057) Improve detection of zip-based formats - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/28 18:11:00 UTC, 1 reply.
- [jira] [Resolved] (TIKA-3055) Add an optional PreflightPDFParser - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2020/02/28 19:32:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3055) Add an optional PreflightPDFParser - posted by "Hudson (Jira)" <ji...@apache.org> on 2020/02/28 20:08:00 UTC, 3 replies.
- Vm slack channel - posted by Tim Allison <ta...@apache.org> on 2020/02/29 19:16:07 UTC, 0 replies.