Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/05/10 21:34:00 UTC
[jira] [Commented] (TIKA-3370) Refactor the AsyncProcessor in 2.x
[ https://issues.apache.org/jira/browse/TIKA-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342156#comment-17342156 ]
Tim Allison commented on TIKA-3370:
-----------------------------------
Pushed a cleanup with heavy refactoring for this. Configuration now works.
> Refactor the AsyncProcessor in 2.x
> ----------------------------------
>
> Key: TIKA-3370
> URL: https://issues.apache.org/jira/browse/TIKA-3370
> Project: Tika
> Issue Type: Task
> Reporter: Tim Allison
> Assignee: Tim Allison
> Priority: Major
>
> Yesterday, I finally got back to trying to wire the AsyncProcessor in tika-pipes into the AsyncHandler in tika-server. I've now convinced myself that the notorious antipattern of using a db as a queue is in fact a really, really bad idea -- there's every chance that I wasn't doing it right or that H2 isn't a great choice...my $ is on the former.
> Nevertheless, I think removing H2 from that process and going with a modification of our ForkParser, or a lightweight purpose-built knock-off that handles fetchers and emitters, will be as robust, a bunch cleaner, have fewer dependencies, and hopefully be more performant than what I had in the AsyncProcessor.
> In the immediate term, I'd like to get this running and wired into tika-server. Longer term, we can use this instead of tika-batch in tika-app...more use, fewer bugs.
> This is the last item I'd like to finish before 2.0.0-BETA.
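For readers unfamiliar with the db-as-queue problem described in the issue, here is a minimal sketch of the general alternative: handing work to a pool of workers through a bounded in-memory queue instead of polling a database table. Everything in it is hypothetical. The class name, the `Fetcher` and `Emitter` interfaces, and the `process` method are illustrative stand-ins, not Tika APIs. It also uses threads inside one JVM, whereas the issue proposes a ForkParser-style approach, so treat it only as an illustration of the queue handoff, not of the actual refactoring.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names are Tika APIs.
class InMemoryPipeSketch {

    // Unique sentinel object telling a worker that no more work is coming.
    private static final String POISON = new String("POISON");

    // Illustrative stand-ins for the fetcher and emitter roles.
    interface Fetcher { String fetch(String key) throws Exception; }
    interface Emitter { void emit(String key, String content) throws Exception; }

    static void process(List<String> keys, int numWorkers,
                        Fetcher fetcher, Emitter emitter) {
        // Bounded queue: the producer blocks when workers fall behind,
        // so no database table is needed to hold pending work.
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);
        ExecutorService pool = Executors.newFixedThreadPool(numWorkers);

        for (int i = 0; i < numWorkers; i++) {
            pool.execute(() -> {
                try {
                    while (true) {
                        String key = queue.take();
                        if (key == POISON) {
                            return; // identity comparison is intentional
                        }
                        try {
                            emitter.emit(key, fetcher.fetch(key));
                        } catch (Exception e) {
                            // One bad item should not stop the worker.
                            System.err.println("failed on " + key + ": " + e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        try {
            for (String key : keys) {
                queue.put(key);
            }
            // One sentinel per worker so every worker exits.
            for (int i = 0; i < numWorkers; i++) {
                queue.put(POISON);
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

A caller would pass its own fetch and emit logic as lambdas, for example `InMemoryPipeSketch.process(keys, 2, k -> load(k), (k, v) -> store(k, v))`, where `load` and `store` are placeholders for whatever the caller provides.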