Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2022/10/07 13:47:00 UTC
[jira] [Created] (TIKA-3872) Improve namespacing in metadata keys
Tim Allison created TIKA-3872:
---------------------------------
Summary: Improve namespacing in metadata keys
Key: TIKA-3872
URL: https://issues.apache.org/jira/browse/TIKA-3872
Project: Tika
Issue Type: Task
Reporter: Tim Allison
I recently ran a group-by on the metadata keys extracted from roughly 1 million files in our regression corpus. The results are available as UTF-8 CSV files here: https://corpora.tika.apache.org/base/share/metadata-keys-1m-20221006.tgz
My gut feeling is that we should namespace everything. I don't think we should make any changes in 2.x, but I'm opening this issue for longer-range planning.
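For illustration only, here is a minimal Python sketch of what a group-by on metadata keys and their namespace prefixes could look like. It is not the query that produced the CSVs linked above. The sample records, the function names, and the rule that a namespace is whatever precedes the first ':' in a key are all assumptions made for this example.

```python
from collections import Counter


def namespace_of(key: str) -> str:
    """Return the text before the first ':' in a key, or '' if there is none.

    Assumption: a key is "namespaced" when it has a 'prefix:name' shape.
    """
    prefix, sep, _ = key.partition(":")
    return prefix if sep else ""


def count_keys(records):
    """Count the number of files (records) in which each metadata key appears."""
    counts = Counter()
    for record in records:
        counts.update(set(record))  # iterating a dict yields its keys
    return counts


def count_namespaces(key_counts):
    """Roll per-key counts up to per-namespace counts."""
    totals = Counter()
    for key, n in key_counts.items():
        totals[namespace_of(key) or "(none)"] += n
    return totals


# Hypothetical per-file metadata, not taken from the regression corpus.
records = [
    {"dc:title": "a", "Content-Type": "application/pdf", "pdf:PDFVersion": "1.7"},
    {"dc:title": "b", "Content-Type": "text/html"},
    {"Content-Type": "text/plain", "Author": "x"},
]

key_counts = count_keys(records)
namespace_counts = count_namespaces(key_counts)

for name, n in namespace_counts.most_common():
    print(f"{name}\t{n}")
```

Keys that land in the "(none)" bucket are the ones that would need a prefix if everything were namespaced.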
--
This message was sent by Atlassian Jira
(v8.20.10#820010)