Posted to issues@openoffice.apache.org by bu...@apache.org on 2012/05/05 12:13:03 UTC

DO NOT REPLY [Bug 119312] New: PDF import has very poor layout accuracy

https://issues.apache.org/ooo/show_bug.cgi?id=119312

             Bug #: 119312
        Issue Type: DEFECT
           Summary: PDF import has very poor layout accuracy
    Classification: Extensions
           Product: extensions
           Version: current
          Platform: PC
        OS/Version: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: pdfimport
        AssignedTo: issues@extensions.openoffice.org
        ReportedBy: frank.breitling@gmx.de
                CC: ooo-issues@incubator.apache.org


Created attachment 77514
  --> https://issues.apache.org/ooo/attachment.cgi?id=77514
One of many PDF documents that gets screwed up

I import a PDF document into Draw and the result looks very different from the
original.

Steps to reproduce:
1. Start LibreOffice
2. Open the PDF document attached

Current behavior:
Several lines get stacked on each other.

Expected behavior:
The imported document should look like the original PDF.

Platform (if different from the browser): 
Ubuntu 12.04, LibreOffice 3.5.2.2.
But it also happens in Windows 7, OpenOffice 3.3.0
Browser: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20100101
Firefox/12.0

-- 
Configure bugmail: https://issues.apache.org/ooo/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #11 from fkbreitl <fr...@gmx.de> 2012-05-07 13:16:55 UTC ---
I agree that fixing is much more work, and it's highly appreciated.

However, testing is work too, and there will always be a later version.
The OpenOffice web page advertises 3.3.0 as the latest stable version.
For tests of anything newer, I see the responsibility on the developer side.
Still, I am willing to help out here if there are convincing reasons
for it and people start working with me.

I am reluctant to test for no good reason, though, since from my experience
with OO I know that even well-known, reported, and voted-on bugs are carried
over from release to release.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

Dave Fisher <wa...@apache.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |CONFIRMED
                 CC|                            |wave@apache.org
     Ever Confirmed|0                           |1

--- Comment #13 from Dave Fisher <wa...@apache.org> 2012-05-08 20:43:59 UTC ---
In your case I have verified that AOO 3.4 does render page 2 and page 3
imperfectly. I tested a Mac version.

Looking at the attached PDF, I see that the original is a Word document, that
the file was produced by a Mac OS X 10.5.8 Quartz PDFContext, and that it is a
version 1.6 PDF. There are embedded font subsets of Windows standard fonts.

I extracted the awful page 3 to a separate page in Acrobat, and the import in
AOO 3.4 was just as bad.

It is very much a non-trivial task to re-assemble the text strings from a PDF
into usable text blocks. Remember that the PDF file format was designed as
digital paper.
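
To illustrate why this is hard, here is a minimal Python sketch of one step of
such a reassembly: grouping positioned text fragments into lines by their
baseline. It is only an assumption-laden illustration, not the algorithm the
pdfimport extension uses. The Fragment class, the group_into_lines function,
and the y_tolerance parameter are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A positioned run of text, roughly what a PDF page description provides."""
    x: float   # left edge, in points
    y: float   # baseline, in points (PDF origin is at the bottom left)
    text: str


def group_into_lines(fragments, y_tolerance=2.0):
    """Group fragments whose baselines lie within y_tolerance into lines.

    Returns the lines from top to bottom, each joined from left to right.
    """
    lines = []  # list of (baseline_y, [fragments on that line])
    # Visit fragments from the top of the page downwards.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0] - frag.y) <= y_tolerance:
            lines[-1][1].append(frag)       # close enough: same line
        else:
            lines.append((frag.y, [frag]))  # otherwise start a new line
    return [
        " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        for _, frags in lines
    ]


fragments = [
    Fragment(110, 700.5, "world"),
    Fragment(72, 700.0, "Hello"),
    Fragment(72, 686.0, "second"),
    Fragment(115, 686.0, "line"),
]
print(group_into_lines(fragments))
# A tolerance larger than the line spacing merges both lines into one.
print(group_into_lines(fragments, y_tolerance=20.0))
```

With the default tolerance the sketch yields two separate lines. With a
tolerance larger than the line spacing, both lines collapse into one, which is
one plausible way an importer could end up with lines stacked on each other.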

With your example the next developer who attempts to fix PDF import will have
another example to use.

Meanwhile if you have the original Word document, how does Writer handle that?

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

vandertim@rocketmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vandertim@rocketmail.com

--- Comment #12 from vandertim@rocketmail.com 2012-05-08 19:13:49 UTC ---
See bug #119198 for a related PDF error in Apache OpenOffice 3.4 and a
possible solution:

https://issues.apache.org/ooo/show_bug.cgi?id=119198

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

Andre <af...@a-w-f.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |af@a-w-f.de

--- Comment #1 from Andre <af...@a-w-f.de> 2012-05-07 08:41:32 UTC ---
@fkbreitl: Can you check that with Apache OpenOffice as well?
(This is the Bugzilla of Apache OpenOffice, not of LibreOffice.)

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #7 from fkbreitl <fr...@gmx.de> 2012-05-07 10:18:41 UTC ---
With this attitude, I assume OO will remain in its poor condition for decades.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #9 from fkbreitl <fr...@gmx.de> 2012-05-07 12:35:47 UTC ---
Instead of hoping in vain for the bug to miraculously disappear in the next
release, it should be confirmed and result in immediate action to get it fixed
and closed in future releases.

If there are convincing reasons to believe that this bug is resolved in the
pre-release (i.e. somebody has been working on it or at least on the PDF
extension), I will consider testing it. Otherwise such a test is just a waste
of my time with no benefit for the project.
Moreover, testing pre-releases is the job of the developers, who have those
versions already installed and can do it much more efficiently than the users.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #2 from fkbreitl <fr...@gmx.de> 2012-05-07 08:53:10 UTC ---
(In reply to comment #1)

As I stated above:
> But it also happens in Windows 7, OpenOffice 3.3.0

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #10 from Andre <af...@a-w-f.de> 2012-05-07 12:52:56 UTC ---
I do not want to rely on hopes and assumptions.  Before I (or probably somebody
else) start to fix this bug I have to make sure that it still exists.  Anything
else would be a waste of my time.  There are newer versions of both LibreOffice
(currently 3.5.3) and Apache OpenOffice (3.4 developer builds).

And since we are talking about attitude: if I (or another developer) spend my time on
fixing this issue, is it really such an outrageous request to ask you (or
anybody else who is interested in getting this fixed) to check that this bug
still exists on current versions?  Please keep in mind that fixing this will
probably take much more time than testing it.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #4 from Andre <af...@a-w-f.de> 2012-05-07 09:00:25 UTC ---
Did you check Apache OpenOffice 3.4 (which contains a lot of bug fixes from
OpenOffice 3.4)?

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #5 from fkbreitl <fr...@gmx.de> 2012-05-07 09:13:15 UTC ---
(In reply to comment #4)
> Did you check Apache OpenOffice 3.4 (which contains a lot of bug fixes from
> OpenOffice 3.4)?

I didn't check prereleases, but it's very unlikely that the bug disappeared,
especially since it's still in LibreOffice 3.4.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #8 from Andre <af...@a-w-f.de> 2012-05-07 11:19:40 UTC ---
I beg your pardon, what attitude?

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #6 from Andre <af...@a-w-f.de> 2012-05-07 09:58:44 UTC ---
Let's wait for the upcoming release of AOO 3.4 and then test whether the bug
is still there.

DO NOT REPLY [Bug 119312] PDF import has very poor layout accuracy

--- Comment #3 from fkbreitl <fr...@gmx.de> 2012-05-07 08:56:26 UTC ---
I also reported it here: https://bugs.freedesktop.org/show_bug.cgi?id=49431
