Posted to user@tika.apache.org by Kosta Krauth <ko...@gmail.com> on 2010/09/12 17:44:43 UTC

Why is PagedText.N_PAGES not mapped to Metadata.PAGE_COUNT?

I have been parsing some PDF files (using the AutoDetectParser) and noticed
that the returned metadata map contains an XMP field called xmpTPg:NPages,
which holds the number of pages. However, the Metadata.PAGE_COUNT property,
where I would expect to find this sort of information, was null.
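
To make the situation concrete, here is a minimal sketch of the lookup I
would expect to work, written against a plain Map rather than Tika's Metadata
object so that it stands on its own. The "Page-Count" key is only my
assumption about what Metadata.PAGE_COUNT resolves to, and the
PageCountLookup class and pageCount helper are hypothetical names made up for
illustration. Only the xmpTPg:NPages key is what I actually saw in the output.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalInt;

class PageCountLookup {
    // Assumption: the string key behind Metadata.PAGE_COUNT (not verified).
    static final String PAGE_COUNT_KEY = "Page-Count";
    // The XMP field that was actually present in the parsed PDF metadata.
    static final String XMP_N_PAGES_KEY = "xmpTPg:NPages";

    // Try the expected key first, then fall back to the XMP field.
    static OptionalInt pageCount(Map<String, String> metadata) {
        for (String key : new String[] {PAGE_COUNT_KEY, XMP_N_PAGES_KEY}) {
            String value = metadata.get(key);
            if (value == null) {
                continue; // key missing, which is what I see for the first one
            }
            try {
                return OptionalInt.of(Integer.parseInt(value.trim()));
            } catch (NumberFormatException e) {
                // Value is not a number; try the next key.
            }
        }
        return OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Stand-in for the metadata returned by the parser in my case.
        Map<String, String> metadata = new HashMap<>();
        metadata.put(XMP_N_PAGES_KEY, "12");
        System.out.println(pageCount(metadata));
    }
}
```

With real Tika output the map would be replaced by the Metadata instance
passed to the parser; the fallback order is the only point of the sketch.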

I did a bit of googling regarding the xmpTPg:NPages property and stumbled
across the PagedText.N_PAGES constant in the org.apache.tika.metadata
package, which seems to serve no other purpose than to map to that particular
XMP property. To my further confusion, the PagedText class is not even
mentioned in the API docs.

Could someone clear this up for me? :) Thank you!