- Re: [jira] Commented: (TIKA-396) Parser Attachements from Outlook Messages - posted by Dave Meikle <lo...@gmail.com> on 2010/05/02 18:40:08 UTC, 0 replies.
- [jira] Updated: (TIKA-402) Support for Keynote and Pages documents - posted by "Martijn van Groningen (JIRA)" <ji...@apache.org> on 2010/05/03 01:25:55 UTC, 6 replies.
- [jira] Updated: (TIKA-379) Html elements and attributes not available in XHTML representation - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/05/04 11:19:57 UTC, 0 replies.
- [jira] Created: (TIKA-417) Unable to parse the content for UCS2 Litte Endian encoded file - posted by "Rajiv Kumar (JIRA)" <ji...@apache.org> on 2010/05/04 12:10:55 UTC, 0 replies.
- [jira] Updated: (TIKA-417) Unable to parse the content for UCS2 Litte Endian encoded file - posted by "Rajiv Kumar (JIRA)" <ji...@apache.org> on 2010/05/04 12:12:56 UTC, 0 replies.
- [jira] Created: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types - posted by "Rajiv Kumar (JIRA)" <ji...@apache.org> on 2010/05/04 12:16:56 UTC, 0 replies.
- [jira] Created: (TIKA-419) Allow parser lookup from a custom class loader - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/04 17:56:55 UTC, 0 replies.
- [jira] Resolved: (TIKA-419) Allow parser lookup from a custom class loader - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/04 18:09:03 UTC, 1 reply.
- [jira] Assigned: (TIKA-379) Html elements and attributes not available in XHTML representation - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/05 06:44:04 UTC, 0 replies.
- [jira] Commented: (TIKA-405) Problems handling Hyperlinks and Tables in Word 97 Docs - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2010/05/05 18:37:03 UTC, 0 replies.
- [jira] Created: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages - posted by "Christian Kohlschütter (JIRA)" <ji...@apache.org> on 2010/05/07 22:15:02 UTC, 0 replies.
- [jira] Updated: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages - posted by "Christian Kohlschütter (JIRA)" <ji...@apache.org> on 2010/05/07 22:20:49 UTC, 0 replies.
- [jira] Assigned: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/05/07 22:45:50 UTC, 0 replies.
- [jira] Commented: (TIKA-420) [PATCH] Integration of boilerpipe: Boilerplate Removal and Fulltext Extraction from HTML pages - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/05/07 22:47:49 UTC, 8 replies.
- [jira] Created: (TIKA-421) DOAP file to recognize Tika on projects.a.o - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/08 19:44:48 UTC, 0 replies.
- [jira] Resolved: (TIKA-421) DOAP file to recognize Tika on projects.a.o - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/08 19:56:50 UTC, 0 replies.
- [jira] Commented: (TIKA-421) DOAP file to recognize Tika on projects.a.o - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/08 20:00:50 UTC, 0 replies.
- [jira] Created: (TIKA-422) Wrong charset conversion in some RTF documents. - posted by "Piotr B. (JIRA)" <ji...@apache.org> on 2010/05/10 10:53:48 UTC, 0 replies.
- [jira] Updated: (TIKA-422) Wrong charset conversion in some RTF documents. - posted by "Piotr B. (JIRA)" <ji...@apache.org> on 2010/05/10 11:00:48 UTC, 0 replies.
- Attributes in XHTML output - posted by Ken Krugler <kk...@transpac.com> on 2010/05/11 02:56:52 UTC, 3 replies.
- Hudson build is back to normal : Tika-trunk #312 - posted by Apache Hudson Server <hu...@hudson.zones.apache.org> on 2010/05/11 16:20:40 UTC, 0 replies.
- Mailing lists moved - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/05/12 02:49:54 UTC, 0 replies.
- Tika now listed on projects.a.o - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/05/12 03:24:40 UTC, 1 reply.
- TLP project website moved - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/05/12 07:49:06 UTC, 0 replies.
- [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 15:06:41 UTC, 0 replies.
- [jira] Commented: (TIKA-242) Incremental configuration AutoDetectParser - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 15:41:42 UTC, 0 replies.
- Re: [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents. - posted by Oleg Tikhonov <ol...@gmail.com> on 2010/05/12 15:52:10 UTC, 0 replies.
- Alternative RTF parsers (Was: [jira] Commented: (TIKA-422) Wrong charset conversion in some RTF documents.) - posted by Jukka Zitting <ju...@gmail.com> on 2010/05/12 16:34:20 UTC, 0 replies.
- [jira] Resolved: (TIKA-415) Findbugs: XHTMLDowngradeHandler equals() comparing different types - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 16:38:41 UTC, 0 replies.
- [jira] Resolved: (TIKA-417) Unable to parse the content for UCS2 Litte Endian encoded file - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 17:34:41 UTC, 0 replies.
- [jira] Commented: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 17:46:42 UTC, 3 replies.
- [jira] Commented: (TIKA-402) Support for Keynote and Pages documents - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/12 18:44:40 UTC, 2 replies.
- [jira] Issue Comment Edited: (TIKA-402) Support for Keynote and Pages documents - posted by "Martijn van Groningen (JIRA)" <ji...@apache.org> on 2010/05/14 15:30:42 UTC, 0 replies.
- [jira] Created: (TIKA-423) Parse docx and output to text file missing words - posted by "David Tran (JIRA)" <ji...@apache.org> on 2010/05/17 05:04:42 UTC, 0 replies.
- [jira] Updated: (TIKA-423) Parse docx and output to text file missing words - posted by "David Tran (JIRA)" <ji...@apache.org> on 2010/05/17 05:04:43 UTC, 1 reply.
- FW: [Travel Assistance] - Applications Open for ApacheCon NA 2010 - posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov> on 2010/05/17 19:31:00 UTC, 0 replies.
- [jira] Updated: (TIKA-418) RuntimeException while getting content for ppsx, ppsm, pptm, thmx and xps file types - posted by "Murad Shahid (JIRA)" <ji...@apache.org> on 2010/05/18 03:32:42 UTC, 0 replies.
- [jira] Created: (TIKA-424) Avoid ArrayIndexOutOfBoundsException on some mp3 files - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2010/05/18 12:15:42 UTC, 0 replies.
- [jira] Updated: (TIKA-424) Avoid ArrayIndexOutOfBoundsException on some mp3 files - posted by "Nick Burch (JIRA)" <ji...@apache.org> on 2010/05/18 12:17:42 UTC, 0 replies.
- Html5 parsing spec - posted by Ken Krugler <kk...@transpac.com> on 2010/05/18 21:54:25 UTC, 0 replies.
- [jira] Created: (TIKA-425) Exception parsing mp3 - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 02:10:53 UTC, 0 replies.
- [jira] Created: (TIKA-426) Parsing javascript as XML - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 02:30:55 UTC, 0 replies.
- [jira] Created: (TIKA-427) Parsing CSS as XML - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 02:34:56 UTC, 0 replies.
- [jira] Created: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 03:30:53 UTC, 0 replies.
- [jira] Commented: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 03:30:54 UTC, 0 replies.
- [jira] Commented: (TIKA-425) Exception parsing mp3 - posted by "Gerd Bremer (JIRA)" <ji...@apache.org> on 2010/05/19 13:14:53 UTC, 0 replies.
- [jira] Issue Comment Edited: (TIKA-425) Exception parsing mp3 - posted by "Gerd Bremer (JIRA)" <ji...@apache.org> on 2010/05/19 13:16:53 UTC, 0 replies.
- [jira] Updated: (TIKA-425) Exception parsing mp3 - posted by "Gerd Bremer (JIRA)" <ji...@apache.org> on 2010/05/19 14:33:53 UTC, 0 replies.
- [jira] Created: (TIKA-429) Error parsing DTD - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/19 22:33:54 UTC, 0 replies.
- Boilerpipe issue with Maven central repository - posted by Ken Krugler <kk...@transpac.com> on 2010/05/21 02:58:16 UTC, 3 replies.
- [jira] Created: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/05/21 03:04:27 UTC, 0 replies.
- Improved handling of attributes - posted by Ken Krugler <kk...@transpac.com> on 2010/05/21 03:08:29 UTC, 5 replies.
- [jira] Assigned: (TIKA-391) Intermittent errors detecting xls files - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/21 19:41:18 UTC, 0 replies.
- [jira] Created: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/21 19:59:16 UTC, 0 replies.
- [jira] Commented: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. - posted by "Erik Hetzner (JIRA)" <ji...@apache.org> on 2010/05/21 20:01:33 UTC, 2 replies.
- [jira] Commented: (TIKA-391) Intermittent errors detecting xls files - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/21 21:42:19 UTC, 0 replies.
- [jira] Created: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/21 21:48:18 UTC, 0 replies.
- [jira] Resolved: (TIKA-432) Include NOTICE and LICENSE file updates for NCAR NetCDF parser lib - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/21 21:54:18 UTC, 0 replies.
- [jira] Updated: (TIKA-391) Intermittent errors detecting xls files - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/21 22:02:17 UTC, 0 replies.
- [jira] Created: (TIKA-433) Tika + Hadoop - posted by "Grant Ingersoll (JIRA)" <ji...@apache.org> on 2010/05/25 23:13:31 UTC, 0 replies.
- [jira] Commented: (TIKA-433) Tika + Hadoop - posted by "Julien Nioche (JIRA)" <ji...@apache.org> on 2010/05/26 09:34:33 UTC, 5 replies.
- [jira] Commented: (TIKA-430) Automatically let all valid XHTML 1.0 attributes through from HTML documents - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 10:47:32 UTC, 1 reply.
- [jira] Commented: (TIKA-429) Error parsing DTD - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 10:51:33 UTC, 0 replies.
- [jira] Resolved: (TIKA-425) Exception parsing mp3 - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 11:28:33 UTC, 0 replies.
- [jira] Resolved: (TIKA-428) Unexpected RuntimeException when parsing PPTM (?) file - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 11:34:32 UTC, 0 replies.
- [jira] Commented: (TIKA-427) Parsing CSS as XML - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 12:20:34 UTC, 0 replies.
- [jira] Resolved: (TIKA-424) Avoid ArrayIndexOutOfBoundsException on some mp3 files - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 12:24:33 UTC, 0 replies.
- [jira] Resolved: (TIKA-413) DWG Parser - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/26 14:14:33 UTC, 0 replies.
- [jira] Assigned: (TIKA-431) Tika currently misuses the HTTP Content-Encoding header, and does not seem to use the charset part of the Content-Type header properly. - posted by "Ken Krugler (JIRA)" <ji...@apache.org> on 2010/05/26 19:10:38 UTC, 0 replies.
- Re: confirm unsubscribe from dev@tika.apache.org - posted by Ian Holsman <li...@holsman.net> on 2010/05/27 20:52:52 UTC, 0 replies.
- [jira] Created: (TIKA-434) Bug in TagSoup causes IOException - posted by "Andrew Khoury (JIRA)" <ji...@apache.org> on 2010/05/27 23:09:41 UTC, 0 replies.
- [jira] Commented: (TIKA-434) Bug in TagSoup causes IOException - posted by "Andrew Khoury (JIRA)" <ji...@apache.org> on 2010/05/27 23:47:36 UTC, 0 replies.
- [jira] Updated: (TIKA-434) Bug in TagSoup causes IOException - posted by "Andrew Khoury (JIRA)" <ji...@apache.org> on 2010/05/27 23:47:37 UTC, 3 replies.
- [jira] Commented: (TIKA-416) Out-of-process text extraction - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/27 23:51:39 UTC, 0 replies.
- Please unsubscribe me. - posted by Trond Albinussen <tr...@boostcom.no> on 2010/05/28 16:37:16 UTC, 1 reply.
- [jira] Resolved: (TIKA-379) Html elements and attributes not available in XHTML representation - posted by "Chris A. Mattmann (JIRA)" <ji...@apache.org> on 2010/05/31 01:51:38 UTC, 0 replies.
- Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents - posted by Martijn v Groningen <ma...@gmail.com> on 2010/05/31 11:10:28 UTC, 2 replies.
- Re: [jira] Updated: (TIKA-402) Support for Keynote and Pages documents - posted by Alex Ott <al...@gmail.com> on 2010/05/31 11:13:07 UTC, 1 reply.
- [jira] Created: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. - posted by "Christoph Weidling (JIRA)" <ji...@apache.org> on 2010/05/31 14:18:37 UTC, 0 replies.
- [jira] Updated: (TIKA-402) Support for iWork documents - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/31 17:04:51 UTC, 0 replies.
- [jira] Commented: (TIKA-402) Support for iWork documents - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/31 18:38:40 UTC, 2 replies.
- [jira] Resolved: (TIKA-435) After using the GUI part of the cli sometimes temporary files are not removed. - posted by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/05/31 19:00:40 UTC, 0 replies.