Posted to dev@pdfbox.apache.org by Michael McCandless <lu...@mikemccandless.com> on 2012/05/03 21:04:06 UTC
1.7 release?
Any guesstimates for a 1.7.0 release?
It's been a long time (9 months) since 1.6.0... and I count ~203
commits since 1.6.0.
Mike McCandless
http://blog.mikemccandless.com
Re: 1.7 release?
Posted by Michael McCandless <lu...@mikemccandless.com>.
On Fri, May 4, 2012 at 9:46 AM, Timo Boehme <ti...@ontochem.com> wrote:
> On 03.05.2012 21:04, Michael McCandless wrote:
>
>> Any guesstimates for a 1.7.0 release?
>>
>> It's been a long time (9 months) since 1.6.0... and I count ~203
>> commits since 1.6.0.
>
> There was already some discussion about it (see "Re: Next release(s)?"
> dating from 2012-04-10) and it is clear that a new version (probably 1.7.0)
> should be released soon. However I think we will wait until the project lead
> is back online.
Ahh, super, I missed that discussion (but went and read it now). Thanks!
Mike McCandless
http://blog.mikemccandless.com
Re: 1.7 release?
Posted by Timo Boehme <ti...@ontochem.com>.
On 14.05.2012 10:11, Maruan Sahyoun wrote:
> ...
> WRT 1.7 I agree with Timo that the enhancements made so far do
> justify a new release, especially as the new NonSequentialParser Timo
> created has already proven to solve a number of issues raised. Maybe
> this could be the default for the time being?
I wouldn't make it the default, since that would change which documents
can be processed and which throw an exception. While it should be a big
step forward for most documents, there might be some strange/broken
documents for which the standard parser succeeded using workarounds and
for which the new one will fail.
One possibility would be to write a wrapper (as was proposed in
PDFBOX-1199) which first uses the new parser and falls back to the old
one in case of an error.
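A minimal sketch of such a fallback wrapper, assuming both parsers can be expressed behind one common interface. The names FallbackParse, DocParser and parseWithFallback are hypothetical and are not PDFBox API; the real parsers would have to be adapted to this shape.

```java
import java.io.File;
import java.io.IOException;

public class FallbackParse {

    /** Hypothetical parser abstraction; not a PDFBox type. */
    @FunctionalInterface
    public interface DocParser<T> {
        T parse(File file) throws IOException;
    }

    /**
     * Tries the primary (new) parser first and falls back to the
     * secondary (old) parser if the primary one throws.
     */
    public static <T> T parseWithFallback(File file,
                                          DocParser<T> primary,
                                          DocParser<T> fallback) throws IOException {
        try {
            return primary.parse(file);
        } catch (IOException | RuntimeException primaryError) {
            try {
                return fallback.parse(file);
            } catch (IOException | RuntimeException fallbackError) {
                // Keep the first failure visible for diagnosis.
                fallbackError.addSuppressed(primaryError);
                throw fallbackError;
            }
        }
    }
}
```

With this shape a caller only sees an exception if both parsers fail, and the failure of the new parser stays attached as a suppressed exception.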
Another issue is that the new parser needs a file as input for random
access while the old parser also accepts a stream. This could be tackled
by creating a temporary file from the stream and using it as input.
I could add this in the next few days.
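The temporary-file approach could look roughly like the following sketch. It uses only java.nio.file calls from the JDK (Java 7 or later, which is an assumption about the target platform); the class and method names are hypothetical and not part of PDFBox.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class StreamToTempFile {

    /**
     * Copies the whole stream into a newly created temporary file so that
     * a parser needing random access can work on it. The caller is
     * responsible for deleting the file once parsing is done.
     */
    public static Path copyToTempFile(InputStream in) throws IOException {
        Path tmp = Files.createTempFile("pdfbox-", ".pdf");
        try {
            // createTempFile already created the file, so it must be replaced.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException | RuntimeException e) {
            // Do not leave a partial file behind if the copy fails.
            Files.deleteIfExists(tmp);
            throw e;
        }
    }
}
```

The returned path can then be handed to whatever entry point expects a file; deleting it in a finally block after parsing avoids piling up temporary files.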
Two further issues:
- a method/constructor parameter for specifying the password for
  encryption still needs to be added
- signed documents are not tested; I would suppose that the signature
  string will also be decrypted, which is wrong as far as I understand
  the spec; the standard parser's decryption has an implementation to
  prevent this, but it relies on all objects being already loaded, so
  I need another way to detect which strings not to decrypt
Thus, in order to release a stable 1.7 in a short time frame, I would
propose keeping the old parser as the default while recommending the new
parser where possible. Once all issues are resolved we may release a 1.8
with the new parser as the default.
Best regards,
Timo
--
Timo Boehme
OntoChem GmbH
H.-Damerow-Str. 4
06120 Halle/Saale
T: +49 345 4780474
F: +49 345 4780471
timo.boehme@ontochem.com
_____________________________________________________________________
OntoChem GmbH
Geschäftsführer: Dr. Lutz Weber
Sitz: Halle / Saale
Registergericht: Stendal
Registernummer: HRB 215461
_____________________________________________________________________
Re: 1.7 release?
Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
The new parser is, unfortunately, still in its early state and not in any way helpful. This week I wanted to complete the SimpleParser, which takes the tokens from the PDF lexer and creates the COS-level objects. All this is still in preparation for the ConformingParser.
WRT 1.7 I agree with Timo that the enhancements made so far do justify a new release, especially as the new NonSequentialParser Timo created has already proven to solve a number of issues raised. Maybe this could be the default for the time being?
regards
Maruan
On 14.05.2012 at 09:54, Timo Boehme wrote:
> Hi,
>
> On 13.05.2012 10:24, Andreas Lehmkuehler wrote:
>> On 07.05.2012 10:50, Timo Boehme wrote:
> ...
>>> In my opinion there are already a number of improvements in current trunk
>>> compared to 1.6 and there is no reason to not release another 1.8 before
>>> PDFBOX-1000 is really ready. As I see it we should bump the version to
>>> 2.0 if PDFBOX-1000 finally lands.
>> I just thought about a kind of beta version of the new parser, so that
>> one can test it without building one's own version.
>
> As I see it we are currently not there. However, Maruan is the only one who knows the current state of this.
>
> ...
>>> Nevertheless I'd like to have your opinion on a release and expertise
>>> doing it :-)
>> The release process uses the maven release plugin and therefore it is
>> quite easy to perform. If you are interested in acting as release
>> manager you have to provide a key which will be used to sign the
>> release. This key should be signed by at least one member of "The Apache
>> Web of Trust", see [1] and [2].
>
> Thanks for the pointers. Since I'm currently a bit short of time I really appreciate that you volunteer as RM.
>
>> I'll volunteer as RM for the next release. What do you think about
>> cutting the release one week from now, on the 22nd? As I won't be
>> available in the first 2 weeks of June, the next reasonable target date
>> would be June 26th, if we need some more time to include more stuff.
>
> 22nd is perfect for me.
>
>
> Best regards,
>
> Timo
>
Re: 1.7 release?
Posted by Timo Boehme <ti...@ontochem.com>.
Hi,
On 13.05.2012 10:24, Andreas Lehmkuehler wrote:
> On 07.05.2012 10:50, Timo Boehme wrote:
...
>> In my opinion there are already a number of improvements in current trunk
>> compared to 1.6 and there is no reason to not release another 1.8 before
>> PDFBOX-1000 is really ready. As I see it we should bump the version to
>> 2.0 if PDFBOX-1000 finally lands.
> I just thought about a kind of beta version of the new parser, so that
> one can test it without building one's own version.
As I see it we are currently not there. However, Maruan is the only one
who knows the current state of this.
...
>> Nevertheless I'd like to have your opinion on a release and expertise
>> doing it :-)
> The release process uses the maven release plugin and therefore it is
> quite easy to perform. If you are interested in acting as release
> manager you have to provide a key which will be used to sign the
> release. This key should be signed by at least one member of "The Apache
> Web of Trust", see [1] and [2].
Thanks for the pointers. Since I'm currently a bit short of time I
really appreciate that you volunteer as RM.
> I'll volunteer as RM for the next release. What do you think about
> cutting the release one week from now, on the 22nd? As I won't be
> available in the first 2 weeks of June, the next reasonable target date
> would be June 26th, if we need some more time to include more stuff.
22nd is perfect for me.
Best regards,
Timo
Re: 1.7 release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,
On 20.05.2012 18:46, Andreas Lehmkuehler wrote:
> On 13.05.2012 10:24, Andreas Lehmkuehler wrote:
>
>> .....
>> I'll volunteer as RM for the next release. What do you think about cutting the
>> release one week from now, on the 22nd? As I won't be available in the first 2
>> weeks of June, the next reasonable target date would be June 26th, if we need
>> some more time to include more stuff.
> As there weren't any objections, I'll cut the release in 2 days, on Tuesday the
> 22nd, unless something (unexpected) comes up in the meantime.
Due to a recently started discussion about a new implementation of the preflight
module, I am postponing my plan to cut a new release by another 2 days.
BR
Andreas Lehmkühler
Re: 1.7 release?
Posted by Jukka Zitting <ju...@gmail.com>.
Hi,
On Tue, May 22, 2012 at 8:40 AM, Jukka Zitting <ju...@gmail.com> wrote:
> I just realized that there are some related changes in Tika that I
> should port to the parser class we now have in PDFBox. I'll take care
> of that within the next few hours.
I'm done with this, so +1 to proceeding with the release.
It turned out that the PDFParser class in Tika had evolved more than
I'd expected, so the easiest solution was to just revert the
PDFBOX-1132 changes and move any improvements we'd made within PDFBox
back to Tika.
Let's see if I or someone else will later have more time to
resurrect PDFBOX-1132, but until then it's probably best to leave the
PDFParser class in Tika.
BR,
Jukka Zitting
Re: 1.7 release?
Posted by Jukka Zitting <ju...@gmail.com>.
Hi Andreas,
On Sun, May 20, 2012 at 6:46 PM, Andreas Lehmkuehler <an...@lehmi.de> wrote:
> As there weren't any objections, I'll cut the release in 2 days, on Tuesday
> the 22nd, unless something (unexpected) comes up in the meantime.
I just realized that there are some related changes in Tika that I
should port to the parser class we now have in PDFBox. I'll take care
of that within the next few hours.
BR,
Jukka Zitting
Re: 1.7 release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 13.05.2012 10:24, Andreas Lehmkuehler wrote:
> .....
> I'll volunteer as RM for the next release. What do you think about cutting the
> release one week from now, on the 22nd? As I won't be available in the first 2
> weeks of June, the next reasonable target date would be June 26th, if we need
> some more time to include more stuff.
As there weren't any objections, I'll cut the release in 2 days, on Tuesday the
22nd, unless something (unexpected) comes up in the meantime.
BR
Andreas Lehmkühler
Re: 1.7 release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,
On 07.05.2012 10:50, Timo Boehme wrote:
> Hi,
>
> On 06.05.2012 16:46, Andreas Lehmkuehler wrote:
>> On 04.05.2012 15:46, Timo Boehme wrote:
>>> On 03.05.2012 21:04, Michael McCandless wrote:
>>>> Any guesstimates for a 1.7.0 release?
>>>>
>>>> It's been a long time (9 months) since 1.6.0... and I count ~203
>>>> commits since 1.6.0.
>>>
>>> There was already some discussion about it (see "Re: Next
>>> release(s)?" dating from 2012-04-10) and it is clear that a new
>>> version (probably 1.7.0) should be released soon.
>> IMHO there are some things which should be done first: integrate
>> Maruan's latest patch (PDFBOX-1000), improve the TTF parser (PDFBOX-490)
>
> In my opinion there are already a number of improvements in current trunk
> compared to 1.6 and there is no reason to not release another 1.8 before
> PDFBOX-1000 is really ready. As I see it we should bump the version to 2.0 if
> PDFBOX-1000 finally lands.
I just thought about a kind of beta version of the new parser, so that one can
test it without building one's own version.
> Thus I would vote for only adding stuff already in pipeline and bug fixes in
> order to do a release in the next few weeks.
I fully agree.
>>> However I think we will wait until the project lead is back online.
>> I guess you are addressing me as PMC Chair. I'm afraid there is a
>> misunderstanding I'd like to clarify.
>>
>> There is no concept of leadership within the ASF. An apache project is
>> led by the PMC [1]. The PMC Chair [2] is just the speaker of the project
>> and acts as interface to the board of the foundation. All PMC members
>> [3] including the chair are equal and each of them has one vote.
>
> Point taken.
I just wanted to avoid the impression that anyone other than the PMC leads the
project. :-)
> Nevertheless I'd like to have your opinion on a release and expertise doing it :-)
The release process uses the Maven release plugin and is therefore quite easy
to perform. If you are interested in acting as release manager, you have to
provide a key which will be used to sign the release. This key should be signed
by at least one member of "The Apache Web of Trust", see [1] and [2].
I'll volunteer as RM for the next release. What do you think about cutting the
release one week from now, on the 22nd? As I won't be available in the first 2
weeks of June, the next reasonable target date would be June 26th, if we need
some more time to include more stuff.
> Best regards
> Timo
BR
Andreas Lehmkühler
[1] http://www.apache.org/dev/release-signing.html
[2] http://www.apache.org/dev/release-signing.html#apache-wot
Re: 1.7 release?
Posted by Michael McCandless <lu...@mikemccandless.com>.
On Mon, May 7, 2012 at 4:50 AM, Timo Boehme <ti...@ontochem.com> wrote:
> In my opinion there are already a number of improvements in current trunk
> compared to 1.6
+1
> and there is no reason to not release another 1.8 before
> PDFBOX-1000 is really ready. As I see it we should bump the version to 2.0
> if PDFBOX-1000 finally lands.
> Thus I would vote for only adding stuff already in pipeline and bug fixes in
> order to do a release in the next few weeks.
+1
In general releasing should not have to wait for patches to be
committed, and release time isn't the time to suddenly commit a bunch
of last minute patches. It should rather be the reverse: right after
a release is when you should commit the big changes; this way they
have the most time to "bake" (uncovering issues) in trunk.
It's best if what's committed is always kept in a releasable state;
this way on any given morning someone could wake up and cut a release
candidate.
If there are truly blocker bugs then they should be marked that way in Jira...
Mike McCandless
http://blog.mikemccandless.com
Re: 1.7 release?
Posted by Timo Boehme <ti...@ontochem.com>.
Hi,
On 06.05.2012 16:46, Andreas Lehmkuehler wrote:
> On 04.05.2012 15:46, Timo Boehme wrote:
>> On 03.05.2012 21:04, Michael McCandless wrote:
>>> Any guesstimates for a 1.7.0 release?
>>>
>>> It's been a long time (9 months) since 1.6.0... and I count ~203
>>> commits since 1.6.0.
>>
>> There was already some discussion about it (see "Re: Next
>> release(s)?" dating from 2012-04-10) and it is clear that a new
>> version (probably 1.7.0) should be released soon.
> IMHO there are some things which should be done first: integrate
> Maruan's latest patch (PDFBOX-1000), improve the TTF parser (PDFBOX-490)
In my opinion there are already a number of improvements in current
trunk compared to 1.6 and there is no reason to not release another 1.8
before PDFBOX-1000 is really ready. As I see it we should bump the
version to 2.0 if PDFBOX-1000 finally lands.
Thus I would vote for only adding stuff already in pipeline and bug
fixes in order to do a release in the next few weeks.
>> However I think we will wait until the project lead is back online.
> I guess you are addressing me as PMC Chair. I'm afraid there is a
> misunderstanding I'd like to clarify.
>
> There is no concept of leadership within the ASF. An apache project is
> led by the PMC [1]. The PMC Chair [2] is just the speaker of the project
> and acts as interface to the board of the foundation. All PMC members
> [3] including the chair are equal and each of them has one vote.
Point taken. Nevertheless I'd like to have your opinion on a release and
expertise doing it :-)
Best regards
Timo
Re: 1.7 release?
Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Before integrating the current work on PDFBOX-1000 I would prefer to
- make sure the lexer is using the new IO classes
- move some parts to the (new) SimpleParser, since e.g. some keywords are already handled in the lexer, which is more than the lexer should do IMO
regards
Maruan
On 06.05.2012 at 16:46, Andreas Lehmkuehler <an...@lehmi.de> wrote:
> Hi,
>
> On 04.05.2012 15:46, Timo Boehme wrote:
>> On 03.05.2012 21:04, Michael McCandless wrote:
>>> Any guesstimates for a 1.7.0 release?
>>>
>>> It's been a long time (9 months) since 1.6.0... and I count ~203
>>> commits since 1.6.0.
>>
>> There was already some discussion about it (see "Re: Next release(s)?" dating
>> from 2012-04-10) and it is clear that a new version (probably 1.7.0) should be
>> released soon.
> IMHO there are some things which should be done first: integrate Maruan's latest patch (PDFBOX-1000), improve the TTF parser (PDFBOX-490) ....
>
>> However I think we will wait until the project lead is back online.
> I guess you are addressing me as PMC Chair. I'm afraid there is a
> misunderstanding I'd like to clarify.
>
> There is no concept of leadership within the ASF. An apache project is led by the PMC [1]. The PMC Chair [2] is just the speaker of the project and acts as interface to the board of the foundation. All PMC members [3] including the chair are equal and each of them has one vote.
>
>> Kind regards,
>> Timo
>
> BR
> Andreas Lehmkühler
>
> [1] http://www.apache.org/foundation/how-it-works.html#pmc
> [2] http://www.apache.org/foundation/how-it-works.html#pmc-chair
> [3] http://www.apache.org/foundation/how-it-works.html#pmc-members
Re: 1.7 release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,
On 04.05.2012 15:46, Timo Boehme wrote:
> On 03.05.2012 21:04, Michael McCandless wrote:
>> Any guesstimates for a 1.7.0 release?
>>
>> It's been a long time (9 months) since 1.6.0... and I count ~203
>> commits since 1.6.0.
>
> There was already some discussion about it (see "Re: Next release(s)?" dating
> from 2012-04-10) and it is clear that a new version (probably 1.7.0) should be
> released soon.
IMHO there are some things which should be done first: integrate Maruan's latest
patch (PDFBOX-1000), improve the TTF parser (PDFBOX-490) ....
> However I think we will wait until the project lead is back online.
I guess you are addressing me as PMC Chair. I'm afraid there is a
misunderstanding I'd like to clarify.
There is no concept of leadership within the ASF. An apache project is led by
the PMC [1]. The PMC Chair [2] is just the speaker of the project and acts as
interface to the board of the foundation. All PMC members [3] including the
chair are equal and each of them has one vote.
> Kind regards,
> Timo
BR
Andreas Lehmkühler
[1] http://www.apache.org/foundation/how-it-works.html#pmc
[2] http://www.apache.org/foundation/how-it-works.html#pmc-chair
[3] http://www.apache.org/foundation/how-it-works.html#pmc-members
Re: 1.7 release?
Posted by Timo Boehme <ti...@ontochem.com>.
On 03.05.2012 21:04, Michael McCandless wrote:
> Any guesstimates for a 1.7.0 release?
>
> It's been a long time (9 months) since 1.6.0... and I count ~203
> commits since 1.6.0.
There was already some discussion about it (see "Re: Next release(s)?"
dating from 2012-04-10) and it is clear that a new version (probably
1.7.0) should be released soon. However I think we will wait until the
project lead is back online.
Kind regards,
Timo