Posted to user@tika.apache.org by Sudheshna Iyer <iy...@gmail.com> on 2014/02/25 22:43:46 UTC
Extract metadata
Hello,
1. I have a few questions about extracting metadata, so I would like to join
the Tika user mailing list. Could you please provide the email address for
it?
2. How do I extract metadata from a file? For example, I need the author
information, but for different files it comes from different fields, such
as:
Author, meta:author, citation_author
Which one should I take? I also need to extract about 15 predefined metadata
fields, such as publication year and DOI, from Metadata.
What is the best way to extract these fields from the Metadata object?
Metadata.names() contains elements like "citation_doi".
Should I iterate through the metadata names and, for each name, do something
like:
if (name.contains("doi")) { doi = metadata.get(name); }
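To make the question concrete, here is a minimal sketch of an alternative I am considering: instead of substring matching on every name, keep a priority-ordered list of candidate keys per field and take the first one that has a value. This is only an illustration. A plain Map stands in for the Tika Metadata object (with Tika, the lookup would be metadata.get(key)), and the class name, method name, and the priority order of the keys are my own assumptions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class MetadataFieldPicker {

    // Assumed priority order for the author field; the key names are the ones
    // seen in my files, the ordering is a guess.
    static final List<String> AUTHOR_KEYS =
            Arrays.asList("Author", "meta:author", "citation_author");

    // Candidate keys for the DOI field; only the key I have actually seen.
    static final List<String> DOI_KEYS = Arrays.asList("citation_doi");

    // Returns the trimmed value of the first candidate key that is present
    // and non-blank, or an empty Optional if none of them has a value.
    static Optional<String> firstPresent(Map<String, String> metadata,
                                         List<String> candidateKeys) {
        for (String key : candidateKeys) {
            String value = metadata.get(key); // with Tika: metadata.get(key)
            if (value != null && !value.trim().isEmpty()) {
                return Optional.of(value.trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the metadata extracted from one file.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("meta:author", "Jane Doe");
        metadata.put("citation_doi", "10.1000/example");

        System.out.println(firstPresent(metadata, AUTHOR_KEYS).orElse("(none)"));
        System.out.println(firstPresent(metadata, DOI_KEYS).orElse("(none)"));
    }
}
```

With about 15 fields, each field would get its own key list, so the mapping stays in one place and an exact-key lookup avoids accidental matches that contains("doi") might produce.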
Is there any better way to extract the metadata?