- Re: Parser does not produce proper sentence breaks? - posted by Michael McCandless <lu...@mikemccandless.com> on 2013/06/03 17:53:33 UTC, 0 replies.
- [jira] [Commented] (TIKA-792) NoSuchMethodException "CTMarkupImpl.(org.apache.xmlbeans.SchemaType, boolean)" processing a OOXML document - posted by "Zhuravskiy Vitaliy (JIRA)" <ji...@apache.org> on 2013/06/04 07:39:22 UTC, 0 replies.
- [jira] [Updated] (TIKA-1130) .docx text extract leaves out some portions of text - posted by "Daniel Gibby (JIRA)" <ji...@apache.org> on 2013/06/05 23:57:21 UTC, 4 replies.
- [jira] [Created] (TIKA-1130) Text extract leaves out text - posted by "Daniel Gibby (JIRA)" <ji...@apache.org> on 2013/06/05 23:57:21 UTC, 0 replies.
- [jira] [Commented] (TIKA-1130) .docx text extract leaves out some portions of text - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2013/06/06 01:39:20 UTC, 13 replies.
- [jira] [Created] (TIKA-1131) Output sentence-break "hints" for files such as PPT/X - posted by "Shai Erera (JIRA)" <ji...@apache.org> on 2013/06/06 23:00:21 UTC, 0 replies.
- Re: MP4Parser triggers .... something betwwen an exception and endDocument() from the Contenthandlers point of view? - posted by Christian Reuschling <re...@dfki.uni-kl.de> on 2013/06/07 13:30:33 UTC, 2 replies.
- [jira] [Created] (TIKA-1132) Parsing some XLS documents hangs entire JVM, requires kill -9 - posted by "Ryan Krueger (JIRA)" <ji...@apache.org> on 2013/06/11 00:44:20 UTC, 0 replies.
- [jira] [Updated] (TIKA-1132) Parsing some XLS documents hangs entire JVM, requires kill -9 - posted by "Ryan Krueger (JIRA)" <ji...@apache.org> on 2013/06/11 00:50:21 UTC, 2 replies.
- [jira] [Comment Edited] (TIKA-1132) Parsing some XLS documents hangs entire JVM, requires kill -9 - posted by "Ryan Krueger (JIRA)" <ji...@apache.org> on 2013/06/11 00:52:20 UTC, 0 replies.
- [jira] [Commented] (TIKA-1132) Parsing some XLS documents hangs entire JVM, requires kill -9 - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/11 01:03:20 UTC, 4 replies.
- [jira] [Created] (TIKA-1133) Ability to Allow Empty and Duplicate Tika Values for XML Elements - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2013/06/11 05:00:20 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1133) Ability to Allow Empty and Duplicate Tika Values for XML Elements - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2013/06/11 05:17:20 UTC, 0 replies.
- [jira] [Created] (TIKA-1134) ContentHandler gets ignorable whitespace for tags when parsing HTML - posted by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/06/11 20:48:21 UTC, 0 replies.
- [jira] [Updated] (TIKA-1134) ContentHandler gets ignorable whitespace for tags when parsing HTML - posted by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/06/11 20:50:21 UTC, 3 replies.
- [jira] [Comment Edited] (TIKA-1134) ContentHandler gets ignorable whitespace for tags when parsing HTML - posted by "Hoss Man (JIRA)" <ji...@apache.org> on 2013/06/11 21:00:22 UTC, 0 replies.
- [jira] [Created] (TIKA-1135) Incorrect Cardinality and Case in IPTC Metadata Definition - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2013/06/11 22:09:19 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1135) Incorrect Cardinality and Case in IPTC Metadata Definition - posted by "Ray Gauss II (JIRA)" <ji...@apache.org> on 2013/06/11 22:09:20 UTC, 0 replies.
- [jira] [Commented] (TIKA-1120) Enable direct use of org.apache.tika.mime.MediaType.detect(...) - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/12 18:02:20 UTC, 0 replies.
- [jira] [Updated] (TIKA-1129) Test HTML file has poorly chosen GPL text in it - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2013/06/15 07:19:26 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1129) Test HTML file has poorly chosen GPL text in it - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2013/06/16 02:30:20 UTC, 0 replies.
- [VOTE] Apache TIka 1.4 Release Candidate #1 - posted by Chris Mattmann <ma...@apache.org> on 2013/06/16 05:52:44 UTC, 26 replies.
- [VOTE] Apache TIka 1.4 Release Candidate #2 - posted by Chris Mattmann <ma...@apache.org> on 2013/06/16 20:06:59 UTC, 8 replies.
- [jira] [Assigned] (TIKA-991) Mp3Parser cannot extract the duration of an audio file - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2013/06/16 21:15:20 UTC, 0 replies.
- [jira] [Resolved] (TIKA-991) Mp3Parser cannot extract the duration of an audio file - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2013/06/16 21:15:20 UTC, 0 replies.
- [jira] [Created] (TIKA-1136) Support IPA files in ZipDetector - posted by "Paul Brinich (JIRA)" <ji...@apache.org> on 2013/06/18 17:42:20 UTC, 0 replies.
- [jira] [Updated] (TIKA-1136) Support IPA files in ZipDetector - posted by "Paul Brinich (JIRA)" <ji...@apache.org> on 2013/06/18 17:48:21 UTC, 1 replies.
- [jira] [Commented] (TIKA-1136) Support IPA files in ZipDetector - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/18 18:04:20 UTC, 3 replies.
- [jira] [Resolved] (TIKA-1136) Support IPA files in ZipDetector - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/19 01:12:20 UTC, 0 replies.
- [jira] [Updated] (TIKA-1137) Wasted work in WontBeSerializedError.writeObject() - posted by "Adrian Nistor (JIRA)" <ji...@apache.org> on 2013/06/19 01:52:21 UTC, 0 replies.
- [jira] [Created] (TIKA-1137) Wasted work in WontBeSerializedError.writeObject() - posted by "Adrian Nistor (JIRA)" <ji...@apache.org> on 2013/06/19 01:52:21 UTC, 0 replies.
- [jira] [Commented] (TIKA-623) Add support for Outlook PST - posted by "Gary Gregory (JIRA)" <ji...@apache.org> on 2013/06/19 20:22:20 UTC, 0 replies.
- [jira] [Created] (TIKA-1138) I got empty body and empty title with some documents - posted by "Koutsoulis Philippe (JIRA)" <ji...@apache.org> on 2013/06/24 14:14:20 UTC, 0 replies.
- [jira] [Commented] (TIKA-1138) I got empty body and empty title with some documents - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/24 14:30:20 UTC, 0 replies.
- Your Gump Build(s) - posted by Stefan Bodewig <bo...@apache.org> on 2013/06/24 15:20:47 UTC, 0 replies.
- [jira] [Updated] (TIKA-1138) Empty body and empty title with some XLS and TXT documents - posted by "Koutsoulis Philippe (JIRA)" <ji...@apache.org> on 2013/06/24 16:12:21 UTC, 1 replies.
- [jira] [Commented] (TIKA-1138) Empty body and empty title with some XLS and TXT documents - posted by "Koutsoulis Philippe (JIRA)" <ji...@apache.org> on 2013/06/24 16:14:22 UTC, 1 replies.
- Re: MP4Parser Triggers no ContentHandler.startDocument() and ContentHandler.endDocument() in one case - posted by Nick Burch <ap...@gagravarr.org> on 2013/06/24 16:33:22 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-1130) .docx text extract leaves out some portions of text - posted by "Daniel Gibby (JIRA)" <ji...@apache.org> on 2013/06/24 16:48:20 UTC, 0 replies.
- [jira] [Updated] (TIKA-1138) Empty body and empty title with some TXT documents - posted by "Koutsoulis Philippe (JIRA)" <ji...@apache.org> on 2013/06/25 11:08:21 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-1138) Empty body and empty title with some TXT documents - posted by "Koutsoulis Philippe (JIRA)" <ji...@apache.org> on 2013/06/25 11:10:20 UTC, 1 replies.
- [jira] [Commented] (TIKA-973) PDF form data isn't included in extracted content. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2013/06/25 16:28:20 UTC, 3 replies.
- [jira] [Commented] (TIKA-1109) Metadata not extracted before the context in OOXML (pptx) - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/25 16:56:22 UTC, 2 replies.
- [jira] [Commented] (TIKA-1070) StackOverflow error in org.apache.tika.sax.ToXMLContentHandler$ElementInfo.getPrefix(ToXMLContentHandler.java:58) - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/25 17:02:20 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1070) StackOverflow error in org.apache.tika.sax.ToXMLContentHandler$ElementInfo.getPrefix(ToXMLContentHandler.java:58) - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/25 17:02:20 UTC, 0 replies.
- RFC822Parser build error on gump - posted by Nick Burch <ap...@gagravarr.org> on 2013/06/26 00:25:18 UTC, 1 replies.
- [jira] [Updated] (TIKA-973) PDF form data isn't included in extracted content. - posted by "Tim Allison (JIRA)" <ji...@apache.org> on 2013/06/27 03:30:21 UTC, 1 replies.
- [jira] [Commented] (TIKA-1053) Upgrade Tika Parsers to use ASM 4.x - posted by "Vincent Massol (JIRA)" <ji...@apache.org> on 2013/06/27 13:20:20 UTC, 2 replies.
- [jira] [Updated] (TIKA-1109) Metadata not extracted before the content in OOXML (pptx) - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2013/06/27 13:25:20 UTC, 1 replies.
- [jira] [Commented] (TIKA-1109) Metadata not extracted before the content in OOXML (pptx) - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2013/06/27 14:26:21 UTC, 0 replies.
- [jira] [Comment Edited] (TIKA-1109) Metadata not extracted before the content in OOXML (pptx) - posted by "Daniel Bonniot de Ruisselet (JIRA)" <ji...@apache.org> on 2013/06/27 14:28:20 UTC, 0 replies.
- [jira] [Resolved] (TIKA-1109) Metadata not extracted before the content in OOXML (pptx) - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2013/06/27 14:42:20 UTC, 0 replies.
- need URL openStream() to test Tika-327 in MimeDetectionTest? - posted by "Allison, Timothy B." <ta...@mitre.org> on 2013/06/28 15:19:56 UTC, 2 replies.
- Keynote Thumbnails? - posted by Mike Patterson <pa...@gmail.com> on 2013/06/28 21:58:51 UTC, 1 replies.
- Integration of your API/service (article reader and parser) - posted by Dhiman Saha <ds...@gmail.com> on 2013/06/29 22:37:00 UTC, 0 replies.
- [RESULT] [VOTE] Apache TIka 1.4 Release Candidate #2 - posted by Chris Mattmann <ma...@apache.org> on 2013/06/30 21:53:42 UTC, 0 replies.