Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/06/17 22:01:22 UTC

[jira] [Closed] (PDFBOX-271) Updated PDFText2HTML

     [ https://issues.apache.org/jira/browse/PDFBOX-271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson closed PDFBOX-271.
------------------------------

    Resolution: Won't Fix

This patch is too old to be applied to PDFBox now. If the original author wants to open a new issue with an updated patch, they are free to do so.

> Updated PDFText2HTML
> --------------------
>
>                 Key: PDFBOX-271
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-271
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Text extraction
>            Priority: Minor
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1708294
> Originally submitted by rrufai on 2007-04-26 11:13.
> Hi Ben,
> I was wondering, are you accepting new members to the project?
> I'm using PDFBox for importing PDF documents and need more formatting information than is currently supported by PDFBox. Attached is what I've done so far: it handles line breaks, bold, and italics. I also added some comment delimiters for page boundaries.
> Two things I'll want to handle next are: 
> 1. Underline
> 2. Subscripts and superscripts
> Later on, I'll want to also handle the following:
> 1. Images
> 2. Hyperlinks
> 3. Tables (I know this might be hard)
> I'll need all the help I can get in the form of pointers and clues.
> I look forward to hearing from you soon.
> Many, many thanks for providing us with a great library.
> Regards,
> Raimi Rufai
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552835&aid=1708294&file_id=226946
> TextPosition.java (text/x-java-source), 6743 bytes
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552835&aid=1708294&file_id=226945
> PDFStreamEngine.java (text/x-java-source), 19797 bytes
> [attachment on SourceForge]
> http://sourceforge.net/tracker/download.php?group_id=78314&atid=552835&aid=1708294&file_id=226943
> PDFText2HTML.java (text/x-java-source), 10342 bytes
> [comment on SourceForge]
> Originally sent by rrufai.
> Logged In: YES 
> user_id=1776491
> Originator: YES
> File Added: TextPosition.java
> [comment on SourceForge]
> Originally sent by rrufai.
> Logged In: YES 
> user_id=1776491
> Originator: YES
> File Added: PDFStreamEngine.java
> [comment on SourceForge]
> Originally sent by rrufai.
> Logged In: YES 
> user_id=1776491
> Originator: YES
> File Added: PDFText2HTML.java


