Posted to commits@tika.apache.org by ta...@apache.org on 2019/07/11 20:15:30 UTC
[tika] branch branch_1x updated (830094e -> 02437c5)
This is an automated email from the ASF dual-hosted git repository.
tallison pushed a change to branch branch_1x
in repository https://gitbox.apache.org/repos/asf/tika.git.
from 830094e TIKA-1568 -- statically cache encoding detector in AutoDetectReader when default initializer is used.
add 02437c5 TIKA-2899 -- prevent non-aligned tags in xhtml output...I am not convinced there's anything wrong with this RTF, and I may have just covered up list processing bugs in our parser, but this will guarantee balanced tags...
No new revisions were added by this update.
Summary of changes:
.../org/apache/tika/parser/rtf/TextExtractor.java |  80 +-
.../org/apache/tika/parser/rtf/RTFParserTest.java |   7 +-
.../resources/test-documents/testRTFTIKA_2899.rtf | 836 +++++++++++++++++++++
3 files changed, 914 insertions(+), 9 deletions(-)
create mode 100644 tika-parsers/src/test/resources/test-documents/testRTFTIKA_2899.rtf