Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2008/12/04 23:34:44 UTC

[jira] Commented: (PDFBOX-391) Remove or replace troublesome test files

    [ https://issues.apache.org/jira/browse/PDFBOX-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653524#action_12653524 ] 

Jukka Zitting commented on PDFBOX-391:
--------------------------------------

I quickly browsed through the test files, and only the following look like something that I'd feel comfortable redistributing within an Apache project:

    test/input/cweb.pdf
    test/input/data-000001.pdf
    test/input/Liste732004001452_001_0.pdf_0_.pdf
    test/input/openoffice-test-document
    test/input/sample_fonts_solidconvertor.pdf
    test/input/sampleForSpec.pdf
    test/input/simple-openoffice.pdf
    test/input/yaddatest.pdf
    test/pdfreader/debug.xml.pdf
    test/pdfreader/excel.pdf
    test/pdfreader/ollix_test_2005-03-11_bin.pdf
    test/pdfreader/pdfbox_webpage.pdf

Note that there is a clear distinction between using something and redistributing it. We could still come up with a way for our Hudson CI build and for individual developers to use the test suite, but we probably can't keep the documents in svn and we definitely can't release them as part of PDFBox.
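
As a rough sketch of how the review split could be scripted (nothing like this exists in the tree; it is only an illustration): assume the accepted paths listed above are saved one per line in a hypothetical accepted.txt. The function names list_unreviewed and move_accepted are made up for this example, and moving everything into one flat target directory is also an assumption.

```shell
#!/bin/sh
# Sketch only: split the test PDFs into "accepted" and "still unreviewed".
# accepted.txt (one accepted path per line) is a hypothetical file.

# Print every PDF under $1 whose path is NOT listed in the file $2.
list_unreviewed() {
    test_dir=$1
    accepted_list=$2
    # -F: fixed strings, -x: whole-line match, -v: invert, -f: patterns from file
    find "$test_dir" -name '*.pdf' | sort | grep -v -x -F -f "$accepted_list"
}

# Move every file listed in $1 into the directory $2 (flat layout, an assumption).
move_accepted() {
    accepted_list=$1
    target_dir=$2
    mkdir -p "$target_dir"
    while IFS= read -r path; do
        # Skip blank lines and paths that no longer exist.
        if [ -n "$path" ] && [ -f "$path" ]; then
            mv "$path" "$target_dir/"
        fi
    done < "$accepted_list"
}
```

Something like "list_unreviewed test accepted.txt" would then print the files that still lack an acceptable license, and "move_accepted accepted.txt src/test/resources" would move the reviewed ones. In an svn working copy the plain mv would presumably be replaced with svn mv so that history is kept.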

> Remove or replace troublesome test files
> ----------------------------------------
>
>                 Key: PDFBOX-391
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-391
>             Project: PDFBox
>          Issue Type: Sub-task
>            Reporter: Jukka Zitting
>            Priority: Blocker
>             Fix For: 0.8.0-incubator
>
>
> One issue raised by the license review (PDFBOX-366) is the status of the various test PDF files included in the test directory. Many of these don't seem to come with a license that would permit redistribution within an Apache project, so our only option seems to be to remove or replace the files before we can make the first Apache release.
> The full list of potentially troublesome test files is below (I haven't looked at all of these in detail, so some might be OK for us to keep):
>     $ find test -name '*.pdf'
>     test/encryption/encrypted_doc_no_id.pdf
>     test/input/10101-AR.pdf
>     test/input/601501018.pdf
>     test/input/Exolab.pdf
>     test/input/FreedomExpressions.pdf
>     test/input/Garcia2003b__Correlative_exploration_of_EEG_Signals.pdf
>     test/input/Garcia2004_thesis.pdf
>     test/input/Hd301212.pdf
>     test/input/JavaMail-1.2.pdf
>     test/input/Liste732004001452_001_0.pdf_0_.pdf
>     test/input/Michel2001__Review_p2_structured.pdf
>     test/input/News-Oct-2001-RUS.pdf
>     test/input/OLS2000-rsync.pdf
>     test/input/OSP_framework.pdf
>     test/input/SphericalHomeomorphism.pdf
>     test/input/T05140.pdf
>     test/input/TEST_SetCharSpacing_Error.pdf
>     test/input/amyuni2_05d__pdf1_3_acro4x.pdf
>     test/input/authentication.pdf
>     test/input/c21-5916 .pdf
>     test/input/citi-tr-00-4.ps.gz.pdf
>     test/input/connection_pool.pdf
>     test/input/cweb.pdf
>     test/input/data-000001.pdf
>     test/input/defensive_driving_class_schedule.pdf
>     test/input/ekb_deutsch.pdf
>     test/input/emsv4a4.pdf
>     test/input/fdeb.pdf
>     test/input/frweb-f-332-18.pdf
>     test/input/hexnumberproblem.pdf
>     test/input/irs tax guide for small businesses.pdf
>     test/input/jose-lugo-test.pdf
>     test/input/jun2003.pdf
>     test/input/null_thread_bead.pdf
>     test/input/ocalc.pdf
>     test/input/openoffice-test-document.pdf
>     test/input/org.eclipse.platform.doc.isv.pdf
>     test/input/pdf_with_lots_of_fields.pdf
>     test/input/rc5.pdf
>     test/input/reservedparkingsalaryreductionauthorization.pdf
>     test/input/ruminations.pdf
>     test/input/sampleForSpec.pdf
>     test/input/sample_fonts_solidconvertor.pdf
>     test/input/sha256.pdf
>     test/input/simple-openoffice.pdf
>     test/input/surface_interpolation.pdf
>     test/input/tech_report.pdf
>     test/input/terms_and_conditions_book.pdf
>     test/input/test_rotate_270.pdf
>     test/input/warp.pdf
>     test/input/welcome.pdf
>     test/input/whats_new.pdf
>     test/input/yaddatest.pdf
>     test/pdfparser/genko_oc_shiryo1.pdf
>     test/pdfreader/debug.xml.pdf
>     test/pdfreader/excel.pdf
>     test/pdfreader/ollix_test_2005-03-11_bin.pdf
>     test/pdfreader/pdfbox_webpage.pdf
> My suggestion is that (in line with PDFBOX-368) we create a new src/test/resources directory and move all reviewed and accepted test cases there. Once all the files have been reviewed, we just drop the remaining ones for which an acceptable license could not be found. It would be nice if replacements could be created for such test cases, but in some cases (special PDF constructs, etc.) that might be a bit troublesome, so I guess we'll just need to live with some reduction in test coverage due to this.
> For more background, see the discussions at http://markmail.org/message/z7meilylwifef7db and http://markmail.org/message/cuyylr6zqs4fwdiz.
