- infoQ article Apache Nutch 2 Features and Product Roadmap - posted by Julien Nioche <li...@gmail.com> on 2012/11/01 12:39:35 UTC, 0 replies.
- org.apache.tika.parser.txt.UniversalEncodingListener - posted by Aleksandr Dubinsky <ad...@almson.net> on 2012/11/02 14:38:34 UTC, 1 replies.
- [jira] [Updated] (TIKA-991) Mp3Parser cannot extract the duration of an audio file - posted by "Oliver Heger (JIRA)" <ji...@apache.org> on 2012/11/02 21:58:12 UTC, 1 replies.
- [jira] [Closed] (TIKA-909) ForkParser doesn't return Metadata - posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org> on 2012/11/04 14:34:13 UTC, 0 replies.
- [jira] [Updated] (TIKA-799) ForkParser does not populate metadata object after completing a parse - posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org> on 2012/11/04 14:34:14 UTC, 0 replies.
- [jira] [Commented] (TIKA-799) ForkParser does not populate metadata object after completing a parse - posted by "Jörg Ehrlich (JIRA)" <ji...@apache.org> on 2012/11/04 14:34:15 UTC, 0 replies.
- [jira] [Created] (TIKA-1016) KEYS file not linked from download page - posted by "Sebb (JIRA)" <ji...@apache.org> on 2012/11/05 19:32:13 UTC, 0 replies.
- [jira] [Commented] (TIKA-1016) KEYS file not linked from download page - posted by "Dave Meikle (JIRA)" <ji...@apache.org> on 2012/11/05 22:38:17 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1016) KEYS file not linked from download page - posted by "Dave Meikle (JIRA)" <ji...@apache.org> on 2012/11/05 22:38:18 UTC, 0 replies.
- [jira] [Closed] (TIKA-1016) KEYS file not linked from download page - posted by "Dave Meikle (JIRA)" <ji...@apache.org> on 2012/11/05 22:40:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1017) DefaultHtmlMapper misses some safe elements - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2012/11/06 11:54:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-1017) DefaultHtmlMapper misses some safe elements - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2012/11/06 15:56:12 UTC, 1 replies.
- [jira] [Created] (TIKA-1018) tika-bundle-0.6 missing poi-ooxml-schemas.jar - posted by "Akash Kotadia (JIRA)" <ji...@apache.org> on 2012/11/06 19:48:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-291) Adobe InDesign support - posted by "Tom Harper (JIRA)" <ji...@apache.org> on 2012/11/06 20:08:12 UTC, 0 replies.
- [jira] [Updated] (TIKA-291) Adobe InDesign support - posted by "Tom Harper (JIRA)" <ji...@apache.org> on 2012/11/06 20:14:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-682) Creative Suite formats are not supported - posted by "Tom Harper (JIRA)" <ji...@apache.org> on 2012/11/06 20:16:14 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1018) tika-bundle-0.6 missing poi-ooxml-schemas.jar - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/06 23:08:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-93) OCR support - posted by "Pei Chen (JIRA)" <ji...@apache.org> on 2012/11/07 01:52:13 UTC, 2 replies.
- [jira] [Resolved] (TIKA-799) ForkParser does not populate metadata object after completing a parse - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/07 14:01:12 UTC, 0 replies.
- Build failed in Jenkins: Tika-trunk #938 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/11/07 14:10:10 UTC, 1 replies.
- [jira] [Resolved] (TIKA-1009) Expose TextDocument in BoilerpipeContentHandler - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/07 16:07:13 UTC, 0 replies.
- [jira] [Commented] (TIKA-1012) Add additional fields to MimeType reader - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/07 16:35:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1019) Document links in Word documents don't leave a placeholder - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/07 17:21:12 UTC, 0 replies.
- Jenkins build is back to normal : Tika-trunk #939 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/11/07 17:23:01 UTC, 0 replies.
- [jira] [Assigned] (TIKA-1019) Document links in Word documents don't leave a placeholder - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/07 17:23:13 UTC, 0 replies.
- [jira] [Updated] (TIKA-1019) Document links in Word documents don't leave a placeholder - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/07 17:27:11 UTC, 0 replies.
- [jira] [Updated] (TIKA-953) Tika failed to recognize non-ustar Tar file? - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/07 17:55:14 UTC, 0 replies.
- [jira] [Updated] (TIKA-1012) Add additional fields to MimeType reader - posted by "Ryan McKinley (JIRA)" <ji...@apache.org> on 2012/11/08 07:22:13 UTC, 0 replies.
- [jira] [Created] (TIKA-1020) Excel 2010 parser missing cell values are not reported resulting in missing columns values - posted by "Neil Blue (JIRA)" <ji...@apache.org> on 2012/11/08 12:25:11 UTC, 0 replies.
- [jira] [Commented] (TIKA-953) Tika failed to recognize non-ustar Tar file? - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/08 15:47:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1021) Exception when parsing PSD files - posted by "Emmanuel Hugonnet (JIRA)" <ji...@apache.org> on 2012/11/09 11:46:11 UTC, 0 replies.
- [jira] [Updated] (TIKA-1021) Exception when parsing PSD files - posted by "Emmanuel Hugonnet (JIRA)" <ji...@apache.org> on 2012/11/09 12:06:14 UTC, 2 replies.
- [jira] [Commented] (TIKA-1021) Exception when parsing PSD files - posted by "Emmanuel Hugonnet (JIRA)" <ji...@apache.org> on 2012/11/09 12:12:12 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1019) Document links in Word documents don't leave a placeholder - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/09 14:30:12 UTC, 1 replies.
- [jira] [Reopened] (TIKA-1019) Document links in Word documents don't leave a placeholder - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/09 14:40:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1022) DWG Custom properties not extracted - posted by "Paolo Nacci (JIRA)" <ji...@apache.org> on 2012/11/09 19:06:13 UTC, 0 replies.
- [jira] [Updated] (TIKA-1022) DWG Custom properties not extracted - posted by "Paolo Nacci (JIRA)" <ji...@apache.org> on 2012/11/09 19:08:12 UTC, 3 replies.
- [jira] [Assigned] (TIKA-1022) DWG Custom properties not extracted - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/11/09 23:07:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-1022) DWG Custom properties not extracted - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/11/09 23:15:12 UTC, 2 replies.
- [jira] [Resolved] (TIKA-1022) DWG Custom properties not extracted - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/11/10 00:05:13 UTC, 0 replies.
- [jira] [Commented] (TIKA-1020) Excel 2010 parser missing cell values are not reported resulting in missing columns values - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2012/11/10 23:21:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1023) Weird associated to vorbis-java-core tests in vorbis-java-tika - posted by "Thomas Mortagne (JIRA)" <ji...@apache.org> on 2012/11/12 10:36:12 UTC, 0 replies.
- [jira] [Closed] (TIKA-1023) Weird associated to vorbis-java-core tests in vorbis-java-tika - posted by "Thomas Mortagne (JIRA)" <ji...@apache.org> on 2012/11/12 10:40:13 UTC, 0 replies.
- [jira] [Commented] (TIKA-1023) Weird associated to vorbis-java-core tests in vorbis-java-tika - posted by "Thomas Mortagne (JIRA)" <ji...@apache.org> on 2012/11/12 10:42:12 UTC, 0 replies.
- [jira] [Created] (TIKA-1024) An MP3 with an UTF-16 ID3 tag containing only the BOM should produce empty string value for that tag - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/13 15:42:12 UTC, 0 replies.
- [jira] [Updated] (TIKA-1024) An MP3 with an UTF-16 ID3 tag containing only the BOM should produce empty string value for that tag - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/13 15:46:12 UTC, 1 replies.
- [jira] [Commented] (TIKA-918) iWork Charts not being parsed in all products (Pages, Numbers, Keynote) - posted by "Erik Peterson (JIRA)" <ji...@apache.org> on 2012/11/13 18:34:12 UTC, 1 replies.
- [jira] [Created] (TIKA-1025) Powerpoint (.ppt) parser doesn't leave placeholder where documents are embedded - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/13 20:06:13 UTC, 0 replies.
- [jira] [Updated] (TIKA-1025) Powerpoint (.ppt) parser doesn't leave placeholder where documents are embedded - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/13 20:16:13 UTC, 0 replies.
- Tika OneNote Support - posted by 122jxgcn <yw...@gmail.com> on 2012/11/15 02:02:03 UTC, 1 replies.
- [jira] [Commented] (TIKA-369) Improve accuracy of language detection - posted by "Pander Musubi (JIRA)" <ji...@apache.org> on 2012/11/17 18:58:12 UTC, 2 replies.
- [jira] [Commented] (TIKA-856) Support CJK (Chinese, Japanese and Korean) language detection - posted by "Pander Musubi (JIRA)" <ji...@apache.org> on 2012/11/17 19:00:12 UTC, 0 replies.
- [jira] [Commented] (TIKA-492) Add language identification support for North Sami, Lule Sami and South Sami - posted by "Pander Musubi (JIRA)" <ji...@apache.org> on 2012/11/17 19:02:13 UTC, 0 replies.
- [jira] [Commented] (TIKA-491) Add language identification support for Norwegian Bokmål and Norwegian Nynorsk - posted by "Pander Musubi (JIRA)" <ji...@apache.org> on 2012/11/17 19:02:13 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1024) An MP3 with an UTF-16 ID3 tag containing only the BOM should produce empty string value for that tag - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/18 16:54:57 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1025) Powerpoint (.ppt) parser doesn't leave placeholder where documents are embedded - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/18 17:12:57 UTC, 0 replies.
- Build failed in Jenkins: Tika-trunk #943 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/11/19 01:30:22 UTC, 1 replies.
- Jenkins build is back to normal : Tika-trunk #944 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/11/19 02:32:24 UTC, 0 replies.
- [jira] [Created] (TIKA-1026) ServiceLoader should respect OSGi service ranking - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/19 10:54:57 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1026) ServiceLoader should respect OSGi service ranking - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/19 11:32:59 UTC, 0 replies.
- [jira] [Created] (TIKA-1027) Allow null values when setting metadata - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/19 15:22:59 UTC, 0 replies.
- [jira] [Reopened] (TIKA-775) Embed Capabilities - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/19 16:26:59 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1027) Allow null values when setting metadata - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/19 16:27:00 UTC, 0 replies.
- [jira] [Commented] (TIKA-1027) Allow null values when setting metadata - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/11/19 16:28:58 UTC, 1 replies.
- [jira] [Commented] (TIKA-775) Embed Capabilities - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/11/19 17:37:59 UTC, 0 replies.
- Patching fix for Tika-521 on Tika 0.8 - posted by "Jana, Kumar Raja" <kj...@ptc.com> on 2012/11/21 12:51:57 UTC, 1 replies.
- [jira] [Created] (TIKA-1028) Tika-server quits parsing of rfc-822 document prematurely when it encounters encrypted zip file as attachment. - posted by "Juha Haaga (JIRA)" <ji...@apache.org> on 2012/11/21 13:57:59 UTC, 0 replies.
- [jira] [Commented] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain. - posted by "Vladimir L. (JIRA)" <ji...@apache.org> on 2012/11/21 21:32:00 UTC, 1 replies.
- [jira] [Comment Edited] (TIKA-879) Detection problem: message/rfc822 file is detected as text/plain. - posted by "Vladimir L. (JIRA)" <ji...@apache.org> on 2012/11/21 21:33:58 UTC, 1 replies.
- [jira] [Created] (TIKA-1029) Parser exception with the attached document - posted by "Philippe Dubois (JIRA)" <ji...@apache.org> on 2012/11/23 09:52:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-1029) Parser exception with the attached document - posted by "Philippe Dubois (JIRA)" <ji...@apache.org> on 2012/11/23 09:54:57 UTC, 0 replies.
- [jira] [Commented] (TIKA-1029) Parser exception with the attached document - posted by "Philippe Dubois (JIRA)" <ji...@apache.org> on 2012/11/23 09:54:58 UTC, 2 replies.
- [jira] [Created] (TIKA-1030) Page extraction for Word,Excel Documents - posted by "David vandendriessche (JIRA)" <ji...@apache.org> on 2012/11/23 15:22:58 UTC, 0 replies.
- [jira] [Commented] (TIKA-1030) Page extraction for Word,Excel Documents - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2012/11/23 17:16:58 UTC, 0 replies.
- [jira] [Created] (TIKA-1031) TikaCLI doesn't create sub-dirs when extracting Zip files - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/26 14:28:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-1031) TikaCLI doesn't create sub-dirs when extracting Zip files - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/26 14:30:58 UTC, 0 replies.
- [jira] [Created] (TIKA-1032) Powerpoint (.pptx) can have duplicate embedded ids - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/26 20:12:58 UTC, 0 replies.
- [jira] [Assigned] (TIKA-1032) Powerpoint (.pptx) can have duplicate embedded ids - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/26 20:12:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-1032) Powerpoint (.pptx) can have duplicate embedded ids - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/26 20:14:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-712) Master slide text isn't extracted - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/27 12:53:59 UTC, 0 replies.
- [jira] [Updated] (TIKA-1033) Tika doesn't parse embedded OLE Chart/Graph objects - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/27 12:57:58 UTC, 0 replies.
- [jira] [Created] (TIKA-1033) Tika doesn't parse embedded OLE Chart/Graph objects - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/27 12:57:58 UTC, 0 replies.
- [jira] [Commented] (TIKA-1033) Tika doesn't parse embedded OLE Chart/Graph objects - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2012/11/27 13:03:58 UTC, 9 replies.
- [jira] [Created] (TIKA-1034) MimeTypes seems to be doing unnecessary work in the detect method - posted by "Bice Dibley (JIRA)" <ji...@apache.org> on 2012/11/29 04:32:58 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1034) MimeTypes seems to be doing unnecessary work in the detect method - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2012/11/29 10:18:57 UTC, 0 replies.
- [jira] [Created] (TIKA-1035) PDF bookmark text is not extracted - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/30 13:43:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-1035) PDF bookmark text is not extracted - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/30 13:46:02 UTC, 0 replies.
- [jira] [Created] (TIKA-1036) ZIP parsing doesn't leave placeholders for each package entry - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/30 19:53:58 UTC, 0 replies.
- [jira] [Updated] (TIKA-1036) ZIP parsing doesn't leave placeholders for each package entry - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/11/30 19:55:58 UTC, 0 replies.