Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/10/10 23:54:34 UTC

[jira] [Resolved] (PDFBOX-207) Better metadata in conversion to HTML

     [ https://issues.apache.org/jira/browse/PDFBOX-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson resolved PDFBOX-207.
--------------------------------
    Resolution: Fixed

These fixes were made a long time ago. Anything still outstanding can be filed as a new issue.

> Better metadata in conversion to HTML
> -------------------------------------
>
>                 Key: PDFBOX-207
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-207
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Text extraction
>            Priority: Minor
>             Fix For: 1.7.0
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1576966
> Originally submitted by nobody on 2006-10-13 17:18.
> It would be great to have better support for metadata 
> in conversion to HTML.
> - Being able to create an HTML page with the proper
> document title (not one simply guessed from the
> text of the document).
> - Author, keywords, category etc. extracted from the
> document and placed into meta fields in the HTML.
> - Chosen encoding included in the HTML header.
> I am using PDFBox in conjunction with mnoGoSearch to
> index PDFs on a site. This additional metadata would 
> be extremely handy, since it would form a part of the 
> indexed details for the documents.
> Even if a simple tool could be created that would 
> *just* extract the metadata from a document [into 
> some kind of text format], that would be great. 
> External tools could then be built around that, e.g. 
> a templating tool that could create a final format of 
> any form, using the extracted text and the extracted 
> metadata.
> [comment on SourceForge]
> Originally sent by nobody.
> Logged In: NO 
> BTW I've not used Java before, so don't have any code to 
> contribute, but if I do come up with anything, I'll post 
> it here.
> -- Jason
> (sorry - mislaid my login too)
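
For illustration only, the three requests in the description (a real document title, metadata placed in meta fields, and the chosen encoding in the header) could be sketched roughly as below. This is not the change that was made for this issue. The class name HtmlMetadataHead and its methods are hypothetical, and the sample values are placeholders. In PDFBox the values would presumably be read from the document information dictionary (for example PDDocumentInformation.getTitle(), getAuthor() and getKeywords()); the sketch takes them as plain strings so that it runs without the library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: builds an HTML <head> from metadata values.
public class HtmlMetadataHead {

    /** Escapes the characters that are unsafe in HTML text and attribute values. */
    public static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    /**
     * Builds the head element: the encoding declaration first, then the title
     * (only if one is present), then one meta element per non-empty entry.
     */
    public static String buildHead(String title, String encoding, Map<String, String> meta) {
        StringBuilder sb = new StringBuilder();
        sb.append("<head>\n");
        sb.append("<meta http-equiv=\"Content-Type\" content=\"text/html; charset=")
          .append(escape(encoding)).append("\">\n");
        if (title != null && !title.isEmpty()) {
            sb.append("<title>").append(escape(title)).append("</title>\n");
        }
        for (Map.Entry<String, String> e : meta.entrySet()) {
            if (e.getValue() == null || e.getValue().isEmpty()) {
                continue; // skip fields the document does not define
            }
            sb.append("<meta name=\"").append(escape(e.getKey()))
              .append("\" content=\"").append(escape(e.getValue())).append("\">\n");
        }
        sb.append("</head>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values; a real tool would take these from the PDF.
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("author", "Jane Doe");
        meta.put("keywords", "pdf, metadata");
        System.out.print(buildHead("Sample & Title", "UTF-8", meta));
    }
}
```

Keeping the metadata in an ordered map means an external templating tool, as suggested in the description, could reuse the same values for other output formats.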


