Posted to dev@pdfbox.apache.org by LynX <_L...@bk.ru> on 2011/05/06 21:01:55 UTC
HTML formatting
Hello,
There are several similar issues concerning formatting after PDF to
HTML conversion:
https://issues.apache.org/jira/browse/PDFBOX-6
https://issues.apache.org/jira/browse/PDFBOX-271
I would like to work on them. I see that rrufai has already done some
work, but the PDFBox code has changed since then, so I may need to make
some additional changes. However, the severity of all these issues is
minor, and there have been no comments on them for a long time.
That's why I am not sure whether it makes sense to work on them. If not,
maybe they can be closed?
Thank you,
LX
Re: HTML formatting
Posted by Raimi Rufai <rr...@gmail.com>.
Hi LX,
PDF to HTML conversion is a fascinating set of problems. One of the
hard knots to crack is preserving tables and columns for multi-column
documents. I've not had a look at the code for a long time.
It'll be nice to jump back in.
Regards,
Raimi
--
«To develop software is to build a machine simply by describing it.»
(Michael A. Jackson -- not the singer)