Posted to dev@pdfbox.apache.org by Andreas Lehmkuehler <an...@lehmi.de> on 2018/01/09 21:14:52 UTC

Apache PDFBox January 2018 report due

Hi,

find below a quick draft of the board report we're expected to submit this
month. It's based on the report template, which can be found at [1].


Any further comments, objections or additions?

<draft>

## Description:
  - the Apache PDFBox library is an open source Java tool for working with PDF 
documents.

## Issues:
  - there are no issues requiring board attention at this time.

## Activity:
  - the integration of the JBig2 ImageIO plugin is complete
  - we are planning to release the first Apache-based version of the JBig2 
ImageIO plugin this month
  - we are working on fixing bugs in 2.0.x
  - we have resolved quite a number of 2.0.x-related tickets, so the next
bugfix version, 2.0.9, will most likely be released this month as well

## Board feedback (comment from the last October board meeting)

   mt: Reading the "2.0.7 release" thread on private@ it appears that
       the project is dependent on a single committer for at least a
       sub-set of regression tests. Could you explain this in more
       detail please. If there are tests the community depends on, I'd
       expect to see those tests in an ASF repository where any
       committer can run them.


These tests are not classic regression tests but tests on a large number
(> 500,000) of files. The results are compared to those of a previous version,
and committers then investigate files that show extreme negative differences
or new exceptions. The same is done (on an even larger scale) for Tika, see
[1] and [2].
The Tika tests need 4 TB, and the files can't be hosted on a public ASF repo or
released under the Apache License because they largely derive from Common
Crawl or the internet generally, so copyright/licensing would pose a problem.
There is a special VM hosting the described tests, and access can be granted
to all interested Tika/PDFBox committers. Tilman already got his access
bits in December, so at least one other committer is able to run those
tests if needed. Maybe others will follow.

[1] http://events.linuxfoundation.org/sites/events/files/slides/ApacheConMiami2017_tallison_v2.pdf
[2] http://openpreservation.org/blog/2016/10/04/apache-tikas-regression-corpus-tika-1302/



## Health report:
  - there is a steady stream of contributions, bug reports and questions on the 
mailing lists

## PMC changes:

  - Currently 21 PMC members.
  - New PMC members:
     - Joerg O. Henne was added to the PMC on Mon Oct 09 2017
     - Sebastian Holder was added to the PMC on Wed Oct 11 2017
     - Carolin Köhler was added to the PMC on Wed Oct 11 2017
     - Matthäus Mayer was added to the PMC on Mon Oct 16 2017

## Committer base changes:

  - Currently 21 committers.
  - New committers:
     - Joerg O. Henne was added as a committer on Mon Oct 09 2017
     - Sebastian Holder was added as a committer on Wed Oct 11 2017
     - Carolin Köhler was added as a committer on Wed Oct 11 2017
     - Matthäus Mayer was added as a committer on Mon Oct 16 2017

## Releases:

  - 2.0.8 was released on Thu Nov 02 2017

## JIRA activity:

  - 101 JIRA tickets created in the last 3 months
  - 75 JIRA tickets closed/resolved in the last 3 months


</draft>

Andreas

[1] https://reporter.apache.org/?pdfbox

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


Re: Apache PDFBox January 2018 report due

Posted by Tilman Hausherr <TH...@t-online.de>.
+1

Tilman



Re: Apache PDFBox January 2018 report due

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

thanks for the review. I've posted the report as provided.

Andreas


Re: Apache PDFBox January 2018 report due

Posted by Timo Boehme <ti...@ontochem.com>.
+1

Timo

-- 
Timo Boehme
OntoChem IT Solutions GmbH
Blücherstraße 24
06120 Halle (Saale)
Germany

phone: +49 345 478 047 4        | fax: +49 345 478 047 1
email: timo.boehme@ontochem.com | web: www.ontochem.com
HRB 21962 Amtsgericht Stendal   | USt-IdNr.: DE815563824
managing director : Lutz Weber




Re: Apache PDFBox January 2018 report due

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
+1 Maruan
