Posted to commits@tika.apache.org by ma...@apache.org on 2016/03/02 06:42:29 UTC
[20/20] tika git commit: Fix merge conflict.
Fix merge conflict.
Project: http://git-wip-us.apache.org/repos/asf/tika/repo
Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/9056894d
Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/9056894d
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/9056894d
Branch: refs/heads/master
Commit: 9056894da580107d1a5a21b29a0b7042ffa15c42
Parents: 3fbc03c 7c245fa
Author: Chris Mattmann <ma...@apache.org>
Authored: Tue Mar 1 21:41:57 2016 -0800
Committer: Chris Mattmann <ma...@apache.org>
Committed: Tue Mar 1 21:41:57 2016 -0800
----------------------------------------------------------------------
CHANGES.txt                                     |   2 +
.../org/apache/tika/parser/pdf/PDF2XHTML.java   |  20 ++
.../org/apache/tika/parser/pdf/PDFParser.java   |  35 +-
.../apache/tika/parser/pdf/PDFParserConfig.java |  36 ++-
.../apache/tika/parser/pdf/XFAExtractor.java    | 318 +++++++++++++++++++
.../apache/tika/parser/pdf/PDFParser.properties |   3 +-
.../apache/tika/parser/pdf/PDFParserTest.java   |  32 +-
.../testPDF_XFA_govdocs1_258578.pdf             | Bin 0 -> 168176 bytes
8 files changed, 442 insertions(+), 4 deletions(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/tika/blob/9056894d/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index d5bebcd,05d6d76..e6603fa
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,8 -1,7 +1,10 @@@
Release 1.13 - ???
+ * Tika now incorporates the Natural Language Toolkit (NLTK) from the
+ Python community as an option for Named Entity Recognition (TIKA-1876).
+
+ * Add support for XFA extraction via Pascal Essiembre (TIKA-1857).
+
* Upgrade to sqlite-jdbc 3.8.11.2 (TIKA-1861). NOTE: this dependency
is still <scope>provided</scope>. You need to include this dependency
in order to parser sqlite files.