- [jira] [Commented] (TIKA-2972) Allow users to specify a list/map of ContentHandlerFactories in tika-config.xml - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/02 08:44:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2925) General dependency/plugin upgrades for 1.23 - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/02 16:23:00 UTC, 11 replies.
- Concurrent builds - posted by Tilman Hausherr <TH...@t-online.de> on 2019/11/03 16:34:54 UTC, 4 replies.
- [jira] [Created] (TIKA-2978) maxMainMemoryBytes should be long on - posted by "Christian Ribeaud (Jira)" <ji...@apache.org> on 2019/11/08 17:18:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2978) maxMainMemoryBytes should be long on - posted by "Christian Ribeaud (Jira)" <ji...@apache.org> on 2019/11/08 17:19:00 UTC, 3 replies.
- [jira] [Updated] (TIKA-2978) maxMainMemoryBytes should be long on - posted by "Christian Ribeaud (Jira)" <ji...@apache.org> on 2019/11/08 17:20:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2979) tika-server shouldn't throw an exception for a non-supported format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/08 18:38:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-2978) maxMainMemoryBytes should be long on - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/08 20:43:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2978) maxMainMemoryBytes should be long on - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/09 01:07:00 UTC, 0 replies.
- Build errors. - posted by Smiles Ip <ip...@gmail.com> on 2019/11/11 00:14:36 UTC, 0 replies.
- [jira] [Created] (TIKA-2980) Clean build of v1.22 failed (4 vulnerable components detected) - posted by "John Pitchko (Jira)" <ji...@apache.org> on 2019/11/11 06:33:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2942) HEIC files are detected as "video/quicktime" media type - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/11 12:41:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2981) Issue with parsing .numbers file - posted by "Szymon (Jira)" <ji...@apache.org> on 2019/11/11 13:19:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Feng Jiao Jiang (Jira)" <ji...@apache.org> on 2019/11/12 06:45:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Feng Jiao Jiang (Jira)" <ji...@apache.org> on 2019/11/12 06:45:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2019/11/12 06:55:00 UTC, 7 replies.
- [jira] [Closed] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Feng Jiao Jiang (Jira)" <ji...@apache.org> on 2019/11/12 08:50:00 UTC, 0 replies.
- JDK 14 - Early Access build 22 is available - posted by Rory O'Donnell <ro...@oracle.com> on 2019/11/12 09:59:28 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2980) Clean build of v1.22 failed (4 vulnerable components detected) - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 15:06:00 UTC, 0 replies.
- [jira] [Reopened] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 15:59:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2981) Issue with parsing .numbers file - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 16:28:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2981) Issue with parsing .numbers file - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 16:29:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2983) tika-server should add the file name to the metadata when a file url is passed in - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 23:05:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-2983) tika-server should add the file name to the metadata when a file url is passed in - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/12 23:10:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2983) tika-server should add the file name to the metadata when a file url is passed in - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/13 00:19:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2983) tika-server should add the file name to the metadata when a file url is passed in - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/13 01:30:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2984) Try to unify unit tests around TikaTest functions - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/13 04:46:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2984) Try to unify unit tests around TikaTest functions - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/13 15:30:00 UTC, 2 replies.
- [jira] [Resolved] (TIKA-2982) Tika misidentifies encrypted xlsx, docx, and pptx files as doc - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/13 17:13:00 UTC, 0 replies.
- Re: [EXTERNAL] How to set the page segmentation for TIKA python - posted by Chris Mattmann <ma...@apache.org> on 2019/11/14 04:32:17 UTC, 1 replies.
- [jira] [Created] (TIKA-2985) averageCharTolerance and spacingTolerance are null by default - posted by "Christian Ribeaud (Jira)" <ji...@apache.org> on 2019/11/14 07:51:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2224) Mime magic for OneNote formats - posted by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2019/11/18 13:03:00 UTC, 3 replies.
- [jira] [Resolved] (TIKA-2942) HEIC files are detected as "video/quicktime" media type - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/18 15:02:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2986) Edge case (?) in file type detection - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/18 15:08:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2986) Edge case (?) in file type detection - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/18 16:14:00 UTC, 9 replies.
- [jira] [Updated] (TIKA-2224) OneNote formats support - Mime Magic and Parser - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/18 16:22:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2224) OneNote formats support - Mime Magic and Parser - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/18 16:23:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2224) OneNote formats support - Mime Magic and Parser - posted by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2019/11/18 16:41:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2987) Extracting Metadata from JPEG Fails with Tika Bundle - posted by "Dan Klco (Jira)" <ji...@apache.org> on 2019/11/18 22:54:00 UTC, 0 replies.
- [jira] [Assigned] (TIKA-2892) ForkParser deadlock when InputStreamResource catches/returns IOException - posted by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2019/11/19 04:01:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2892) ForkParser deadlock when InputStreamResource catches/returns IOException - posted by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2019/11/19 04:04:00 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2986) Edge case (?) in file type detection - posted by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2019/11/19 04:15:00 UTC, 2 replies.
- [jira] [Commented] (TIKA-2892) ForkParser deadlock when InputStreamResource catches/returns IOException - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/19 06:29:00 UTC, 1 replies.
- [jira] [Created] (TIKA-2988) Add mime for alternative fdf format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 13:47:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2988) Add mime for alternative fdf format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 14:02:00 UTC, 6 replies.
- [jira] [Commented] (TIKA-2987) Extracting Metadata from JPEG Fails with Tika Bundle - posted by "Bob Paulin (Jira)" <ji...@apache.org> on 2019/11/19 15:33:00 UTC, 0 replies.
- Tika 1.23? - posted by Tim Allison <ta...@apache.org> on 2019/11/19 16:07:11 UTC, 0 replies.
- [jira] [Commented] (TIKA-2979) tika-server shouldn't throw an exception for a non-supported format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 16:36:00 UTC, 2 replies.
- [jira] [Updated] (TIKA-2986) Edge case (?) in file type detection - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 17:21:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2988) Add mime for alternative fdf format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 17:23:00 UTC, 1 replies.
- [jira] [Closed] (TIKA-2987) Extracting Metadata from JPEG Fails with Tika Bundle - posted by "Dan Klco (Jira)" <ji...@apache.org> on 2019/11/19 18:14:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2989) Add mime detection via xml root for xdp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 19:33:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2989) Add mime detection via xml root for xdp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 19:41:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2989) Add mime detection via xml root for xdp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 19:42:00 UTC, 3 replies.
- [jira] [Created] (TIKA-2990) Add mime detection via xml root for xdfx - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 20:42:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2990) Add mime detection via xml root for xfdf - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 20:48:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2989) Add mime detection via xml root for xdp - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 20:54:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2990) Add mime detection via xml root for xfdf - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 20:54:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2991) Add a parser for XDP files - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/19 20:57:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2992) java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21 - posted by "Arvind Jain (Jira)" <ji...@apache.org> on 2019/11/19 22:33:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2990) Add mime detection via xml root for xfdf - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/19 22:34:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2992) java.lang.UnsupportedOperationException: This feature requires ASM7 in Tika 1.21 - posted by "Nick Burch (Jira)" <ji...@apache.org> on 2019/11/20 09:47:00 UTC, 0 replies.
- Re: [EXTERNAL] Tika 1.23? - posted by Chris Mattmann <ma...@apache.org> on 2019/11/20 17:09:20 UTC, 1 replies.
- [jira] [Created] (TIKA-2993) tika-server's /rmeta endpoint shouldn't throw an exception when stacktrace is turned on - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 19:37:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-2993) tika-server's /rmeta endpoint shouldn't throw an exception for a parse exception - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 19:38:00 UTC, 1 replies.
- Docker image along with 1.23? - posted by Tim Allison <ta...@apache.org> on 2019/11/20 21:20:35 UTC, 1 replies.
- [jira] [Created] (TIKA-2994) ExceptionUtils should let TikaException subclasses through - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 21:44:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2979) tika-server shouldn't throw an exception for a non-supported format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 23:03:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2993) tika-server's /rmeta endpoint shouldn't throw an exception for a parse exception - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/20 23:07:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2994) ExceptionUtils should let TikaException subclasses through - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/20 23:08:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2993) tika-server's /rmeta endpoint shouldn't throw an exception for a parse exception - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 23:11:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2994) ExceptionUtils should let TikaException subclasses through - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/20 23:12:00 UTC, 0 replies.
- Re: [EXTERNAL] Docker image along with 1.23? - posted by "Mattmann, Chris A (US 1760)" <ch...@jpl.nasa.gov.INVALID> on 2019/11/20 23:14:31 UTC, 6 replies.
- Re: [EXTERNAL] Re: Docker image along with 1.23? - posted by Chris Mattmann <ma...@apache.org> on 2019/11/21 00:02:23 UTC, 0 replies.
- [jira] [Created] (TIKA-2995) markLimit too small in org.apache.tika.parser.microsoft.POIFSContainerDetector - posted by "Tim Barrett (Jira)" <ji...@apache.org> on 2019/11/21 13:30:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2995) markLimit too small in org.apache.tika.parser.microsoft.POIFSContainerDetector - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/21 14:03:00 UTC, 0 replies.
- regression tests for 1.23-rc1 - posted by Tim Allison <ta...@apache.org> on 2019/11/22 13:25:50 UTC, 4 replies.
- [jira] [Created] (TIKA-2996) Add dropThreshold to PDFParserConfig - posted by "Felix Sonntag (Jira)" <ji...@apache.org> on 2019/11/22 15:29:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2996) Add dropThreshold to PDFParserConfig - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/22 15:57:00 UTC, 2 replies.
- [jira] [Created] (TIKA-2997) Add embedded depth as a metadata field populated by RecursiveParserWrapperHandler - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/22 18:51:00 UTC, 0 replies.
- [jira] [Created] (TIKA-2998) Allow users to extract font names in PDFs - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/22 19:04:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2966) Create a tika-eval SAXHandler - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/22 20:14:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2998) Allow users to extract font names in PDFs - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/22 21:04:00 UTC, 3 replies.
- Call for Microsoft OneNote experts for help on OneNote parsing in Tika - posted by Nicholas DiPiazza <ni...@gmail.com> on 2019/11/24 17:21:07 UTC, 1 replies.
- [jira] [Created] (TIKA-2999) PDFParser should set, not add digital signature value - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/25 15:15:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3000) Users should be able to configure POI's IOUtils.setByteArrayMaxOverride - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/25 15:16:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3000) Users should be able to configure POI's IOUtils.setByteArrayMaxOverride - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/25 19:15:00 UTC, 1 replies.
- [jira] [Commented] (TIKA-2999) PDFParser should set, not add digital signature value - posted by "Hudson (Jira)" <ji...@apache.org> on 2019/11/25 19:15:00 UTC, 1 replies.
- Sv: [EXTERNAL] Tika Python questions - posted by ha...@avident-it.se on 2019/11/25 20:08:17 UTC, 0 replies.
- [jira] [Created] (TIKA-3001) Throw TaggedIOException when we open the HWP file with the Tika-App GUI - posted by "Kim Ju Young (Jira)" <ji...@apache.org> on 2019/11/26 05:43:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3001) Throw TaggedIOException when we open the HWP file with the Tika-App GUI - posted by "Kim Ju Young (Jira)" <ji...@apache.org> on 2019/11/26 05:44:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-3001) Throw TaggedIOException when we open the HWP file with the Tika-App GUI - posted by "Kim Ju Young (Jira)" <ji...@apache.org> on 2019/11/26 05:54:00 UTC, 5 replies.
- [jira] [Resolved] (TIKA-3000) Users should be able to configure POI's IOUtils.setByteArrayMaxOverride - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/26 09:58:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2999) PDFParser should set, not add digital signature value - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/26 09:58:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-3001) Throw TaggedIOException when we open the HWP file with the Tika-App GUI - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/26 10:56:00 UTC, 0 replies.
- [VOTE] Release Apache Tika 1.23 Candidate #1 - posted by Tim Allison <ta...@apache.org> on 2019/11/26 21:33:58 UTC, 2 replies.
- whether have file size limitation for parsing the file content? - posted by sunwei <su...@163.com> on 2019/11/27 06:14:07 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-2925) General dependency/plugin upgrades for 1.23 - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2019/11/27 08:06:00 UTC, 1 replies.
- [jira] [Resolved] (TIKA-2925) General dependency/plugin upgrades for 1.23 - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/27 19:17:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2996) Add dropThreshold to PDFParserConfig - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/27 19:20:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2988) Add mime for alternative fdf format - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/27 19:24:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-2630) Wrong height and width metadata for JPEG images - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/27 19:29:00 UTC, 0 replies.
- [jira] [Resolved] (TIKA-2984) Try to unify unit tests around TikaTest functions - posted by "Tim Allison (Jira)" <ji...@apache.org> on 2019/11/27 19:31:00 UTC, 0 replies.
- Concern about tika-parsers' dependencies - posted by Mark Hissink Muller <XM...@kombit.dk> on 2019/11/28 09:46:16 UTC, 1 replies.
- [jira] [Created] (TIKA-3002) Possible bug with OCR strategy AUTO - posted by "Patrick Herber (Jira)" <ji...@apache.org> on 2019/11/28 12:50:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3002) Possible bug with OCR strategy AUTO - posted by "Patrick Herber (Jira)" <ji...@apache.org> on 2019/11/28 12:52:00 UTC, 0 replies.
- [jira] [Created] (TIKA-3003) Remove unused dependencies - posted by "César Soto Valero (Jira)" <ji...@apache.org> on 2019/11/28 15:23:00 UTC, 0 replies.
- [jira] [Updated] (TIKA-3003) Remove unused dependencies - posted by "César Soto Valero (Jira)" <ji...@apache.org> on 2019/11/28 15:30:00 UTC, 3 replies.
- [jira] [Commented] (TIKA-3003) Remove unused dependencies - posted by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/11/28 16:47:00 UTC, 0 replies.
- JDK 14 - Early Access build 25 is available - posted by Rory O'Donnell <ro...@oracle.com> on 2019/11/29 09:58:58 UTC, 0 replies.
- [jira] [Commented] (TIKA-3002) Possible bug with OCR strategy AUTO - posted by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2019/11/30 04:43:00 UTC, 1 replies.