Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/05/11 14:45:00 UTC

[jira] [Resolved] (TIKA-3370) Refactor the AsyncProcessor in 2.x

     [ https://issues.apache.org/jira/browse/TIKA-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3370.
-------------------------------
    Resolution: Fixed

> Refactor the AsyncProcessor in 2.x
> ----------------------------------
>
>                 Key: TIKA-3370
>                 URL: https://issues.apache.org/jira/browse/TIKA-3370
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Major
>
> Yesterday, I finally got back to trying to wire the AsyncProcessor in tika-pipes into the AsyncHandler in tika-server.  I've now convinced myself that the notorious antipattern of using a db as a queue is in fact a really, really bad idea -- there's every chance that I wasn't doing it right or that H2 isn't a great choice... my money is on the former.
> Nevertheless, I think removing H2 from that process and going with a modification of our ForkParser, or a lightweight purpose-built knock-off to handle fetchers and emitters, will be just as robust and a bunch cleaner, have fewer dependencies, and hopefully be more performant than what I had in the AsyncProcessor.
> In the immediate term, I'd like to get this running and wired into tika-server.  In the longer term, we can use this instead of tika-batch in tika-app: more use, fewer bugs.
> This is the last item I'd like to finish before 2.0.0-BETA.


