Posted to dev@tika.apache.org by "Caleb Cushing (Jira)" <ji...@apache.org> on 2021/06/01 14:56:00 UTC
[jira] [Created] (TIKA-3429) Performance problems partially caused
by tika eagerly loading configuration
Caleb Cushing created TIKA-3429:
-----------------------------------
Summary: Performance problems partially caused by tika eagerly loading configuration
Key: TIKA-3429
URL: https://issues.apache.org/jira/browse/TIKA-3429
Project: Tika
Issue Type: New Feature
Reporter: Caleb Cushing
referencing https://github.com/spring-projects/spring-boot/issues/26709#issuecomment-851953515
{quote}
the tika configuration (eagerly loading a 7K lines XML file)
{quote}
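For illustration only, here is a minimal Java sketch of deferring an expensive load until first use, so that a run that errors out on missing arguments never pays for it. It does not use any Tika or Spring API; `Lazy`, `LazyConfigDemo`, and `loadExpensiveConfig` are hypothetical names standing in for whatever actually parses the large XML file.

```java
import java.util.function.Supplier;

/** Memoizing wrapper: the loader runs at most once, on the first get(). */
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private volatile T value;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public T get() {
        T v = value;
        if (v == null) {
            synchronized (this) {
                v = value;
                if (v == null) {
                    // Assumes the loader never returns null; a null result
                    // would simply be loaded again on the next call.
                    v = loader.get();
                    value = v;
                }
            }
        }
        return v;
    }
}

public class LazyConfigDemo {
    static int loads = 0;

    // Hypothetical stand-in for parsing a large XML configuration file.
    static String loadExpensiveConfig() {
        loads++;
        return "parsed-config";
    }

    public static void main(String[] args) {
        Lazy<String> config = new Lazy<>(LazyConfigDemo::loadExpensiveConfig);
        // The early-exit path never touches the configuration.
        if (args.length == 0) {
            System.err.println("Missing required parameters");
            return;
        }
        System.out.println(config.get());
    }
}
```

Whether something like this is practical inside Tika depends on where the configuration is first needed, which this sketch does not attempt to model.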
Here's the text of that issue:
I'm not sure the problem is Spring Boot, but I'm having problems finding it. The jar is currently taking 3 seconds (9 if I leave out tiered) to run on my system, just to error out due to missing options and do nothing.
https://github.com/xenoterracide/brix/tree/8e3d86bcf773e564cc24b51572b0bbd8bb60b73f
{code}
time java -Xverify:none -XX:TieredStopAtLevel=1 -jar modules/app/build/libs/app-0.1.0.jar # brix -> ccushing/copy-5-1
Missing required parameters: '<language>', '<moduleType>', '<project>'
Usage: <main class> [--repo=<repo>] [--workdir=<workdir>] <language>
                    <moduleType> <project> [COMMAND]
      <language>            The programming language you're generating code
                              for. Directory under --dir
      <moduleType>          The type of code you're generating e.g controller,
                              also the name of the config file without the
                              extension.
      <project>             The name of the project you're generating code for.
                              The name of the module to be created within the
                              project.
      --repo=<repo>         Repository path from the current working directory.
                              Templates and configs are looked up relative to
                              here. If the config isn't found here, then we
                              will search ~/.config/brix
      --workdir=<workdir>   The working directory you want your destination
                              paths to be relative to. Defaults to current
                              working directory
                              Default:
Commands:
  run
java -Xverify:none -XX:TieredStopAtLevel=1 -jar 3.15s user 0.26s system 142% cpu 2.386 total
{code}
Since it's a CLI app, lazy init isn't helpful. This is worded like a question (one that really would not be suitable for Stack Overflow; I hate that SO is the support forum for things now. It's terrible because of the attitude that the objective is not to help people, and it's bad at getting answers for harder problems. Spring should get a Discourse or something again), but I also know I had a Tika CLI app in the past that loaded in less than 1s without tiered, so I'm also concerned it's a Spring Boot bug. I'm going to connect a profiler later to see what I can find, but I'm not sure that will do it.
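Short of a full profiler, a rough first pass could be to log elapsed time around suspect initialization steps. The sketch below uses only standard JDK calls (`RuntimeMXBean.getUptime()` and `System.nanoTime()`); the class name `StartupTimer` and the empty placeholder step are hypothetical and are not taken from brix, Tika, or Spring Boot.

```java
import java.lang.management.ManagementFactory;

public class StartupTimer {
    /** Milliseconds since the JVM started, as reported by the runtime MXBean. */
    static long millisSinceJvmStart() {
        return ManagementFactory.getRuntimeMXBean().getUptime();
    }

    /** Runs one step and returns how long it took, in milliseconds. */
    static long timeMillis(Runnable step) {
        long start = System.nanoTime();
        step.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Time already spent before main() was reached (JVM boot, class loading).
        System.err.println("reached main after " + millisSinceJvmStart() + " ms");

        // Hypothetical placeholder: replace the empty lambda with the
        // initialization you suspect, e.g. building a parser or a context.
        long initMillis = timeMillis(() -> { });
        System.err.println("init step took " + initMillis + " ms");
    }
}
```

Numbers from a single run are noisy, so this only helps to narrow down where a profiler should look.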
{code}
Fedora 33
5.11.16-200.fc33.x86_64
14:08:34 up 3 days, 2:04, 1 user, load average: 0.79, 1.10, 1.66
              total        used        free      shared  buff/cache   available
Mem:            15G         11G        1.0G        1.4G        3.0G        2.3G
Swap:           12G        1.5G         10G
{code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)