Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/10/13 14:29:00 UTC

[jira] [Created] (TIKA-3878) Improve PipesReporter and PipesIterator to report the total number of files to be processed

Tim Allison created TIKA-3878:
---------------------------------

             Summary: Improve PipesReporter and PipesIterator to report the total number of files to be processed
                 Key: TIKA-3878
                 URL: https://issues.apache.org/jira/browse/TIKA-3878
             Project: Tika
          Issue Type: New Feature
            Reporter: Tim Allison


For user-facing applications, it would be useful to give users a sense of progress by reporting a denominator (the total number of files to process).

Some PipesIterators will have a natural shortcut (select count(1)... for jdbc or other queries in OpenSearch and/or Solr).  Some will have to do twice the work, such as file system and s3(?), because the listing has to be traversed once to count and again to process.  And some simply won't be able to report a total number.

My initial target is the FileSystemPipesIterator and the FileSystemStatusReporter.
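
A minimal sketch of the file system case, to illustrate the "twice the work" counting pass and a total that may be unavailable. This is not Tika code: TotalCounter, FileSystemCounter, UnknownCounter and formatProgress are hypothetical names made up for illustration, and the only real API used is java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalLong;
import java.util.stream.Stream;

public class TotalCountSketch {

    /** Hypothetical contract (not a Tika interface): a source that may know its total. */
    interface TotalCounter {
        OptionalLong countTotal() throws IOException;
    }

    /** File system case: the total requires a full extra walk of the tree. */
    static final class FileSystemCounter implements TotalCounter {
        private final Path root;

        FileSystemCounter(Path root) {
            this.root = root;
        }

        @Override
        public OptionalLong countTotal() throws IOException {
            // Files.walk returns a lazy stream that must be closed.
            try (Stream<Path> paths = Files.walk(root)) {
                return OptionalLong.of(paths.filter(Files::isRegularFile).count());
            }
        }
    }

    /** A source that cannot report a total at all. */
    static final class UnknownCounter implements TotalCounter {
        @Override
        public OptionalLong countTotal() {
            return OptionalLong.empty();
        }
    }

    /** Formats progress with a denominator when one is available. */
    static String formatProgress(long processed, OptionalLong total) {
        return total.isPresent()
                ? processed + " of " + total.getAsLong()
                : processed + " processed (total unknown)";
    }

    public static void main(String[] args) throws IOException {
        // Build a small temporary tree: two files, one of them in a subdirectory.
        Path root = Files.createTempDirectory("pipes-count");
        Files.createFile(root.resolve("a.txt"));
        Path sub = Files.createDirectories(root.resolve("sub"));
        Files.createFile(sub.resolve("b.txt"));

        System.out.println(formatProgress(1, new FileSystemCounter(root).countTotal()));
        System.out.println(formatProgress(1, new UnknownCounter().countTotal()));
    }
}
```

Returning an OptionalLong rather than a sentinel such as -1 is just one way to make the "can't report a total" case explicit to the reporter; the counting walk could also run on a separate thread so that processing does not wait for it.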



--
This message was sent by Atlassian Jira
(v8.20.10#820010)