Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/10/13 14:29:00 UTC
[jira] [Created] (TIKA-3878) Improve PipesReporter and PipesIterator to report the total number of files to be processed
Tim Allison created TIKA-3878:
---------------------------------
Summary: Improve PipesReporter and PipesIterator to report the total number of files to be processed
Key: TIKA-3878
URL: https://issues.apache.org/jira/browse/TIKA-3878
Project: Tika
Issue Type: New Feature
Reporter: Tim Allison
For user-facing applications, it would be useful to give users a sense of progress by reporting against a denominator (the total number of files to process).
Some PipesIterators will have a natural shortcut (select count(1)... for JDBC, or similar queries for OpenSearch and/or Solr). Some will have to do twice the work: file system and S3(?). And some simply won't be able to report a total at all.
My initial target is the FileSystemPipesIterator and the FileSystemStatusReporter.
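To make the "twice the work" case concrete for the file system, here is a rough sketch of the idea: walk the tree once only to count, then walk it again to do the processing, reporting progress against that count. The class and method names (FileCountSketch, countRegularFiles, progress) are hypothetical and are not part of Tika's API. The empty OptionalLong is just one assumed way to model an iterator that cannot report a total.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Iterator;
import java.util.OptionalLong;
import java.util.stream.Stream;

// Hypothetical sketch only; none of these names exist in Tika.
final class FileCountSketch {

    // First pass: walk the whole tree just to count regular files.
    static long countRegularFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile).count();
        }
    }

    // Formats a progress line; an empty total models an iterator
    // that cannot report how many files it will emit.
    static String progress(long processed, OptionalLong total) {
        if (total.isPresent()) {
            return processed + " of " + total.getAsLong() + " files processed";
        }
        return processed + " files processed (total unknown)";
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        OptionalLong total = OptionalLong.of(countRegularFiles(root));
        long processed = 0;
        // Second pass: this is where the real per-file work would happen.
        try (Stream<Path> paths = Files.walk(root)) {
            Iterator<Path> it = paths.filter(Files::isRegularFile).iterator();
            while (it.hasNext()) {
                it.next();
                processed++;
                System.out.println(progress(processed, total));
            }
        }
    }
}
```

Note that the directory can change between the two passes, so the total is best treated as an estimate rather than an exact denominator.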
--
This message was sent by Atlassian Jira
(v8.20.10#820010)