dev@tika.apache.org, 2012-09

- [jira] [Resolved] (TIKA-981) Text isn't extracted from PDF pop-up annotations - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 15:00:08 UTC, 0 replies.
- [jira] [Resolved] (TIKA-986) NullPointerException trying to parse detached .pk7s signature - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 16:11:07 UTC, 0 replies.
- [jira] [Resolved] (TIKA-982) RTF document embedded into Word (.doc) document is extracted as .unknown - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 16:11:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 16:23:07 UTC, 1 reply.
- [jira] [Created] (TIKA-987) Embedded drawing (SHAPE MERGEFORMAT) sometimes not extracted - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 16:23:07 UTC, 0 replies.
- [jira] [Created] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 17:00:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-988) We don't extract a placeholder for a Word document embedded in an Excel document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 17:02:07 UTC, 0 replies.
- [jira] [Created] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/02 17:13:07 UTC, 0 replies.
- Build failed in Jenkins: Tika-trunk #919 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/09/02 17:20:14 UTC, 0 replies.
- Tika's Jenkins builds - posted by Michael McCandless <lu...@mikemccandless.com> on 2012/09/03 13:15:23 UTC, 2 replies.
- [jira] [Assigned] (TIKA-920) iWork Numbers sheetnames not being parsed into metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/03 13:29:07 UTC, 0 replies.
- [jira] [Commented] (TIKA-920) iWork Numbers sheetnames not being parsed into metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/03 13:31:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-920) iWork Numbers sheetnames not being parsed into metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/03 13:47:08 UTC, 0 replies.
- [jira] [Assigned] (TIKA-918) iWork Charts not being parsed in all products (Pages, Numbers, Keynote) - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/03 14:13:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-918) iWork Charts not being parsed in all products (Pages, Numbers, Keynote) - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/03 14:17:07 UTC, 0 replies.
- Re: Question about XPath Matcher code & MatchingContentHandler - posted by Jukka Zitting <ju...@gmail.com> on 2012/09/03 17:02:59 UTC, 2 replies.
- Jenkins build is back to normal : Tika-trunk #920 - posted by Apache Jenkins Server <je...@builds.apache.org> on 2012/09/03 21:51:55 UTC, 0 replies.
- [jira] [Created] (TIKA-990) Mp3Parser extracts wrong number of channels - posted by "Oliver Heger (JIRA)" <ji...@apache.org> on 2012/09/03 22:18:07 UTC, 0 replies.
- [jira] [Created] (TIKA-991) Mp3Parser cannot extract the duration of an audio file - posted by "Oliver Heger (JIRA)" <ji...@apache.org> on 2012/09/03 22:33:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-991) Mp3Parser cannot extract the duration of an audio file - posted by "Oliver Heger (JIRA)" <ji...@apache.org> on 2012/09/03 22:43:07 UTC, 0 replies.
- [jira] [Created] (TIKA-992) OpenGraph meta tags to allow multiple values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/09/04 18:30:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-992) OpenGraph meta tags to allow multiple values - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/09/04 18:32:07 UTC, 0 replies.
- [jira] [Assigned] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/04 18:34:08 UTC, 0 replies.
- [jira] [Updated] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/04 18:36:07 UTC, 1 reply.
- [jira] [Created] (TIKA-993) Language Detection Fault - posted by "Iman Reihanian (JIRA)" <ji...@apache.org> on 2012/09/06 12:39:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-993) Language Detection Fault - posted by "Iman Reihanian (JIRA)" <ji...@apache.org> on 2012/09/06 12:41:07 UTC, 0 replies.
- Apache CMS for Website? - posted by Dave Meikle <lo...@gmail.com> on 2012/09/07 15:50:30 UTC, 1 reply.
- [jira] [Commented] (TIKA-960) Duplicate letters in text extracted from PDF files - posted by "Christof Luick (JIRA)" <ji...@apache.org> on 2012/09/08 08:13:07 UTC, 0 replies.
- [jira] [Commented] (TIKA-946) Improve how the PPTX parser uses XLSF from POI - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2012/09/11 14:44:08 UTC, 0 replies.
- [jira] [Commented] (TIKA-918) iWork Charts not being parsed in all products (Pages, Numbers, Keynote) - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/11 16:59:07 UTC, 2 replies.
- [jira] [Resolved] (TIKA-920) iWork Numbers sheetnames not being parsed into metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/11 17:03:07 UTC, 0 replies.
- [jira] [Resolved] (TIKA-989) We don't extract a placeholder for documents embedded in a Word OOXML (.docx) document - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/11 17:21:07 UTC, 0 replies.
- [jira] [Commented] (TIKA-976) Inaccurate XLS detection trough POIFSContainerDetector - posted by "Marco Quaranta (JIRA)" <ji...@apache.org> on 2012/09/12 14:47:07 UTC, 2 replies.
- [jira] [Created] (TIKA-994) Type Detection Fault - posted by "Fangwei Ding (JIRA)" <ji...@apache.org> on 2012/09/13 05:23:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-994) Type Detection Fault - posted by "Fangwei Ding (JIRA)" <ji...@apache.org> on 2012/09/13 05:25:07 UTC, 1 reply.
- [jira] [Created] (TIKA-995) XHTMLContentHandler doesn't pass attributes of body element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/09/21 12:53:56 UTC, 0 replies.
- [jira] [Created] (TIKA-996) port tika-paser to metadata-extractor 2.6.2 - posted by "Christoph Brill (JIRA)" <ji...@apache.org> on 2012/09/21 15:09:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-996) port tika-paser to metadata-extractor 2.6.2 - posted by "Christoph Brill (JIRA)" <ji...@apache.org> on 2012/09/21 15:11:07 UTC, 0 replies.
- [jira] [Resolved] (TIKA-996) port tika-paser to metadata-extractor 2.6.2 - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2012/09/21 15:15:08 UTC, 0 replies.
- [jira] [Commented] (TIKA-820) Locator is unset for HTML parser - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2012/09/21 17:46:08 UTC, 0 replies.
- [jira] [Created] (TIKA-997) Leave a placeholder when documents are embedded in .pptx documents - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/22 20:08:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-995) XHTMLContentHandler doesn't pass attributes of body element - posted by "Markus Jelsma (JIRA)" <ji...@apache.org> on 2012/09/24 13:04:07 UTC, 1 reply.
- [jira] [Created] (TIKA-998) How to handle row span and Colspan in parsing xls or xlsx files - posted by "Rashed Mamun (JIRA)" <ji...@apache.org> on 2012/09/25 14:44:08 UTC, 0 replies.
- [jira] [Updated] (TIKA-997) Leave a placeholder when documents are embedded in .pptx documents - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/26 12:20:07 UTC, 0 replies.
- [jira] [Created] (TIKA-999) RTF Parser doesn't extract page/word/character count metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/26 12:50:07 UTC, 0 replies.
- [jira] [Updated] (TIKA-999) RTF Parser doesn't extract page/word/character count metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/26 12:54:07 UTC, 0 replies.
- [jira] [Resolved] (TIKA-999) RTF Parser doesn't extract page/word/character count metadata - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/26 16:51:07 UTC, 0 replies.
- [jira] [Resolved] (TIKA-997) Leave a placeholder when documents are embedded in .pptx documents - posted by "Michael McCandless (JIRA)" <ji...@apache.org> on 2012/09/28 14:49:07 UTC, 0 replies.