Posted to dev@tika.apache.org by Dave Meikle <lo...@gmail.com> on 2008/12/04 02:42:19 UTC

[VOTE] New TIKA 0.2 Release Candidate 1

Hi,

Thanks to all who provided feedback on the first attempt at a 0.2 release.
This vote covers a new 0.2 release branch cut from the current trunk,
following the discussions on the development list.

Please find the release artefacts for voting on under the following
location:
http://people.apache.org/~dmeikle/tika-0.2-rc1/

I have included a maven deployment repository here:
http://people.apache.org/~dmeikle/tika-0.2-rc1/repos/

The proposed site can be found here:
http://people.apache.org/~dmeikle/tika-0.2-rc1/site/

The tag can be found here:
http://svn.apache.org/viewvc/lucene/tika/tags/0.2-rc1/

In hindsight I really should have made this 0.2-rc2 given the history but I
had already deleted the original tag when I realised, so apologies for that.

Thanks in advance to all who can take the time to check and vote on this
release.

Regards,
Dave

Re: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Dec 5, 2008 at 11:10 AM, Dave Meikle <dm...@apache.org> wrote:
> Given the time it would take to fix, I would be inclined to fix it and then
> create a new RC - that is as long as people have the time to check the new
> RC.

Sounds good.

> I would expect that. I added the extra bit on the end of the first paragraph
> saying "provided the dependencies are on the classpath". Not sure if anyone
> has any better ideas on how to flag that up.

Yeah, either that or simply drop the entire command line usage section
for now until we are comfortable releasing the standalone jar as well.

PS. There's been good progress lately in resolving the remaining
PDFBox license blockers, and I believe we should be able to have an
Apache PDFBox release in a few weeks. It will probably make sense to
release the next version of Tika once we've upgraded the dependency.

BR,

Jukka Zitting

Re: Fwd: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by Chris Hostetter <ho...@fucit.org>.
: Given the time it would take to fix, I would be inclined to fix it and then
: create a new RC - that is as long as people have the time to check the new
: RC.

Better to be a quality release than a fast release.  That said: if you put
up another RC in the next 5 hours I'll review it tonight, or I can review
it Sunday night.  I'm happy to review simple, clean releases, and these
have been nice and simple and clean.

One other minor nit as long as you're planning to cut another release:
since target/site/gettingstarted.html is back in existence, it *might*
make a better start link to provide from the "Documentation" section of
the README.txt than documentation.html ... but that's somewhat subjective.


-Hoss


Re: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by Dave Meikle <dm...@apache.org>.
Hi,

2008/12/5 Chris Hostetter <ho...@fucit.org>

>
> : Thanks to all who provided feedback in the first attempt of 0.2 releases.
> : This release vote is based on a new 0.2 release branch based on the current
> : trunk following the discussions on the development list.
>
> this looks pretty good, two small things concern me...
>
> 1) the pom.xml file still refers to broken @incubator addresses for
> subscribe, unsubscribe, and post, as well as listing the main url as
> http://incubator.apache.org/tika/
>
> How do these get used by maven?  Will they confuse people?
>
> (FWIW: I'd put a patch for all of these in TIKA-178)


Sorry, I assumed this had already been updated in trunk, but I took my branch
before the update, hence your patch.

Maven uses them to add information to the generated site (see
http://lucene.apache.org/tika/mail-lists.html). It would confuse people if
they viewed this page on a locally generated copy of the site.

Given the time it would take to fix, I would be inclined to fix it and then
create a new RC - that is as long as people have the time to check the new
RC.

>
> 2) I see the updated gettingstarted guide no longer refers to the
> "standalone" jar, but now it says that the main tika jar can be used as a
> command line util.   When i try that i get a classpath issue...
>
> $ java -jar target/tika-0.2.jar --help
> Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Layout
>
> is this expected?  (it makes sense given the dependencies listed, but i
> want to double check since the command line section suggests that it
> should work)


I would expect that. I added the extra bit on the end of the first paragraph
saying "provided the dependencies are on the classpath". Not sure if anyone
has any better ideas on how to flag that up.
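
One way to flag a missing dependency up front is to probe for it before
doing any real work. The following is only a minimal sketch, not something
the Tika jar does: the class name ClasspathCheck and the helper
isOnClasspath are hypothetical, and the only class probed is the one named
in the NoClassDefFoundError above.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's
     * class loader. The class is loaded but not initialized.
     * (Hypothetical helper, for illustration only.)
     */
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        // The class reported missing in the stack trace above.
        String[] required = { "org.apache.log4j.Layout" };
        for (String name : required) {
            System.out.println(name + (isOnClasspath(name) ? ": found" : ": MISSING"));
        }
    }
}
```

Run with the same classpath you intend to use for the command line tool; a
MISSING line means the corresponding jar still needs to be added.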

Cheers,
Dave

Re: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by Chris Hostetter <ho...@fucit.org>.
: Thanks to all who provided feedback in the first attempt of 0.2 releases.
: This release vote is based on a new 0.2 release branch based on the current
: trunk following the discussions on the development list.

this looks pretty good, two small things concern me...

1) the pom.xml file still refers to broken @incubator addresses for
subscribe, unsubscribe, and post, as well as listing the main url as 
http://incubator.apache.org/tika/

How do these get used by maven?  Will they confuse people?

(FWIW: I'd put a patch for all of these in TIKA-178)

2) I see the updated gettingstarted guide no longer refers to the
"standalone" jar, but now it says that the main tika jar can be used as a
command line util.   When i try that i get a classpath issue...

$ java -jar target/tika-0.2.jar --help
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/Layout

is this expected?  (it makes sense given the dependencies listed, but i 
want to double check since the command line section suggests that it 
should work)




-Hoss


Re: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by "Mattmann, Chris A" <ch...@jpl.nasa.gov>.
Hi there,

>
>> In hindsight I really should have made this 0.2-rc2 given the history but I
>> had already deleted the original tag when I realised, so apologies for that.
>
> I'd actually have called the tag already 0.2 to match the actual
> version number in the source and to avoid having to tweak the tags
> once the release vote has ended.

+1, I agree with Jukka on this. I am putting together my notes on the
process that I followed when I did Tika 0.1-incubating. It's more or less
the same process we used in Nutch when I pushed out Nutch 0.9 (which the
rest of the Nutch folks and I vetted), with a couple of changes. I will post
those notes shortly. Since I didn't see an official Tika wiki yet, I went
ahead and asked INFRA to set up a MoinMoin one for us (since it seems that
at least Lucene Java and Nutch use it), see:

https://issues.apache.org/jira/browse/INFRA-1817

Once the Tika wiki is established, I can upload the release process notes
there and then Dave and I can vet what's been done and push out a release
procedure for future releases.

Thanks,
 Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: Chris.Mattmann@jpl.nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.



Re: [VOTE] New TIKA 0.2 Release Candidate 1

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Thu, Dec 4, 2008 at 2:42 AM, Dave Meikle <lo...@gmail.com> wrote:
> Thanks to all who provided feedback in the first attempt of 0.2 releases.
> This release vote is based on a new 0.2 release branch based on the current
> trunk following the discussions on the development list.

+1 to release the package as Tika 0.2. Thanks!

I built and tested the project, verified the checksums and the
signature, and spot-checked the staged Maven repository and pre-built
web site. All OK.

For the record, the SHA1 checksum of the apache-tika-0.2.tar.gz
package I reviewed was 1693ca4005fe427b810b1a210431724a577e4868.
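
For anyone who wants to repeat the checksum comparison, here is a minimal
sketch using only the JDK. The class name Sha1Check and the helper sha1Hex
are hypothetical; the file path and the expected digest are passed as
arguments rather than hard-coded, and this covers the checksum only, not
the signature verification.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha1Check {

    /** Streams the input through SHA-1 and returns the digest as lowercase hex. */
    public static String sha1Hex(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-1");
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Usage (assumed): java Sha1Check <file> <expected-sha1-hex>
        if (args.length != 2) {
            System.err.println("usage: java Sha1Check <file> <expected-sha1-hex>");
            System.exit(2);
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String actual = sha1Hex(in);
            boolean matches = actual.equalsIgnoreCase(args[1]);
            System.out.println((matches ? "OK " : "MISMATCH ") + actual);
        }
    }
}
```

Pass the downloaded package as the first argument and the checksum quoted
above as the second.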

Some minor issues:

* The "Latest news" section of the pre-built web site is a bit
mangled. We can fix that when we deploy the changes through the trunk
to the live web site.

* I'm using a repository proxy that for some reason gave me an earlier
1.0-SNAPSHOT of the retrotranslator-maven-plugin after I cleared my
local repository. This broke the build, but I was able to fix it by
forcing the plugin version to 1.0-alpha-4, the latest in the central
Maven repository. This is mostly a local environment issue, so I
suggest we just work around it in the trunk by explicitly specifying
the version of the plugin.

> In hindsight I really should have made this 0.2-rc2 given the history but I
> had already deleted the original tag when I realised, so apologies for that.

I'd actually have called the tag already 0.2 to match the actual
version number in the source and to avoid having to tweak the tags
once the release vote has ended.

BR,

Jukka Zitting