Posted to corpora-dev@tika.apache.org by Tim Allison <ta...@apache.org> on 2020/11/13 16:44:27 UTC
Updates/status
All,
Some updates...please see Peter Wyatt's recent article on the refreshing
of the bug tracker corpus:
https://twitter.com/PDFAssociation/status/1327237439732260865?s=20
* successfully upgraded and rebooted the server.
* finished running tika-eval's new FileProfile on the full corpus, and I've
made this available via datasette.
* documented some useful queries in datasette:
https://cwiki.apache.org/confluence/display/TIKA/TikaEvalDatasetteExamples
* reported a bug with datasette
(https://github.com/simonw/datasette/issues/1091). It looks like the
base_url fix didn't work across all buttons, but it did get better.
Cheers,
Tim
Fwd: Updates/status
Posted by Tim Allison <ta...@apache.org>.
All,
I kicked off 1.24.1 on the new data so that we'll have a "before" to
compare with 1.25 as soon as Adobe fixes the license issue.
:fingers-crossed:
Have a great weekend!
Cheers,
Tim