user@tika.apache.org, 2019-03

- tika PDF extraction - ToHTMLContentHandler problems - posted by Cristian Vat <cr...@gmail.com> on 2019/03/02 08:09:02 UTC, 1 reply.
- OCR Strategy ocr_only extracts also text - posted by David Pilato <da...@pilato.fr> on 2019/03/02 14:04:20 UTC, 5 replies.
- Re: OCR and Raw text - posted by David Pilato <da...@pilato.fr> on 2019/03/04 09:52:37 UTC, 0 replies.
- 4 Apache Events in 2019: DC Roadshow soon; next up Chicago, Las Vegas, and Berlin! - posted by Rich Bowen <rb...@apache.org> on 2019/03/06 14:00:23 UTC, 0 replies.
- Zip Bomb false detection with large PDF Outline - posted by Cristian Vat <cr...@gmail.com> on 2019/03/07 20:37:20 UTC, 0 replies.
- Re: Fwd: Very slow PDF parsing. - posted by Konstantin Gribov <gr...@gmail.com> on 2019/03/21 16:56:40 UTC, 1 reply.
- Question about strange characters in the output - posted by Steven Van Ingelgem <st...@vaningelgem.be> on 2019/03/23 14:33:31 UTC, 0 replies.