Posted to dev@tika.apache.org by Jukka Zitting <ju...@gmail.com> on 2008/05/25 15:50:33 UTC

Planning Tika 0.2

Hi,

Tika has already come a long way since the 0.1 release, and I'd like
to push for the next release, 0.2. Any special wishes for features
to include?

My goals for the release would be finishing TIKA-115 (making a
runnable jar instead of using startup scripts), upgrading our parser
dependencies (especially POI), and closing some of the reported bugs.

It would be nice to get the media type registry and configuration
changes that I've been working on finished, but that's IMO not a
requirement before 1.0. A nice extra feature would be some light
integration with Lucene Java. Also, I've been thinking about
potentially splitting Tika into component libraries like tika-core,
tika-parsers, tika-lucene, etc. to better manage external dependencies
and to make it more attractive for parser libraries to directly
implement the Parser interface.

BR,

Jukka Zitting

Re: Planning Tika 0.2

Posted by Dave Meikle <lo...@gmail.com>.
2008/9/28 Sami Siren <ss...@gmail.com>

> Jukka Zitting wrote:
>
>> I think the current trunk is good enough to be released.
>>
>>
> +1
>
> --
> Sami Siren
>
>
If my vote counted I would give it a +1, but since it doesn't I will
just give it a smile :-)

Re: Planning Tika 0.2

Posted by Sami Siren <ss...@gmail.com>.
Jukka Zitting wrote:
> I think the current trunk is good enough to be released.
>   
+1

--
 Sami Siren


Re: Planning Tika 0.2

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

The following issues were remaining on the 0.2 roadmap:

  TIKA-50  Unit tests are incomplete.
  TIKA-61  Add namespaces to our metadata keys
  TIKA-69  ParseUtils methods need to support Metadata
  TIKA-74  Test Resources should be loaded by the class loader ...
  TIKA-79  Mime type detection from file header appears to be failing
  TIKA-80  Utility method in MimeUtils to perform full mime resolution ...
  TIKA-121 MimeType.clean method no longer exists as a capability

None of them looked terribly urgent or blocking, so I just removed
them from the 0.2 roadmap.

I think the current trunk is good enough to be released.

BR,

Jukka Zitting

Re: Planning Tika 0.2

Posted by Robert Burrell Donkin <ro...@gmail.com>.
On Sun, May 25, 2008 at 2:50 PM, Jukka Zitting <ju...@gmail.com> wrote:
> Hi,
>
> Tika has already come a long way since the 0.1 release, and I'd like
> to push for the next release, 0.2. Any special wishes for features
> to include?
>
> My goals for the release would be finishing TIKA-115 (making a
> runnable jar instead of using startup scripts), upgrading our parser
> dependencies (especially POI), and closing some of the reported bugs.
>
> It would be nice to get the media type registry and configuration
> changes that I've been working on finished, but that's IMO not a
> requirement before 1.0. A nice extra feature would be some light
> integration with Lucene Java. Also, I've been thinking about
> potentially splitting Tika into component libraries like tika-core,
> tika-parsers, tika-lucene, etc. to better manage external dependencies
> and to make it more attractive for parser libraries to directly
> implement the Parser interface.

components sound good to me :-)

- robert

Re: Planning Tika 0.2

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Fri, Jun 6, 2008 at 8:45 PM, Chris Mattmann
<ch...@jpl.nasa.gov> wrote:
> TIKA-118 Bouncycastle binaries requires US exports regulation compliance

Done.

> I think separate libraries is a very interesting and cool idea. I'm happy to
> help out with the separation, but I don't think it's a req for 0.2.

Agreed, we can do that later.

> Also, once we're ready to release, I volunteer to be the release manager if
> everyone is +1 for it.

Excellent, +1 from me.

BR,

Jukka Zitting

Re: Planning Tika 0.2

Posted by "Keith R. Bennett" <kb...@bbsinc.biz>.
A *very* belated +1 from me too.

- Keith


Chris Mattmann wrote:
> 
> Hi Jukka,
> 
> 
> Also, once we're ready to release, I volunteer to be the release manager if
> everyone is +1 for it.
> 
> Thanks!
> 
> Cheers,
>  Chris
> 
> 


Re: Planning Tika 0.2

Posted by Rida Benjelloun <ri...@doculibre.com>.
+1
Rida.


2008/6/9 Niall Pemberton <ni...@gmail.com>:

> On Mon, Jun 9, 2008 at 6:27 PM, Sami Siren <ss...@gmail.com> wrote:
> > Chris Mattmann wrote:
> >>
> >> Hi Jukka,
> >>
> >> Also, once we're ready to release, I volunteer to be the release manager if
> >> everyone is +1 for it.
> >>
> >
> > +1
>
> +1 from me, sorry haven't found any time to help with Tika
>
> Niall
>

Re: Planning Tika 0.2

Posted by Niall Pemberton <ni...@gmail.com>.
On Mon, Jun 9, 2008 at 6:27 PM, Sami Siren <ss...@gmail.com> wrote:
> Chris Mattmann wrote:
>>
>> Hi Jukka,
>>
>> Also, once we're ready to release, I volunteer to be the release manager if
>> everyone is +1 for it.
>>
>
> +1

+1 from me, sorry haven't found any time to help with Tika

Niall

Re: Planning Tika 0.2

Posted by Sami Siren <ss...@gmail.com>.
Chris Mattmann wrote:
> Hi Jukka,
>
>   
> Also, once we're ready to release, I volunteer to be the release manager if
> everyone is +1 for it.
>   

+1

--
 Sami Siren

Re: Planning Tika 0.2

Posted by Chris Mattmann <ch...@jpl.nasa.gov>.
Hi Jukka,

> 
> Tika has already come a long way since the 0.1 release, and I'd like
> to push for the next release, 0.2. Any special wishes for features
> to include?

Yes, you are right and I am really looking forward to Tika 0.2. I've got a
couple wishes:

TIKA-80 Utility method in MimeUtils to perform full mime resolution using
all available strategies

TIKA-74 Test Resources should be loaded by the class loader (e.g.
getResourceAsStream()).

TIKA-61 Add namespaces to our metadata keys

TIKA-121 MimeType.clean method no longer exists as a capability

TIKA-79 Mime type detection from file header appears to be failing.

TIKA-118 Bouncycastle binaries requires US exports regulation compliance


As for TIKA-80, TIKA-74, TIKA-61, TIKA-121, and TIKA-79, I assigned them to myself
and will push hard to get them closed out within the next few weeks. I'm not
sure how much I can help with TIKA-118, but we have the same issue now in
Nutch (since Nutch now depends on apache-tika-0.1-incubating official
release), so I will watch how you guys solve that problem and then follow
suit :)

> 
> My goals for the release would be finishing TIKA-115 (making a
> runnable jar instead of using startup scripts), upgrading our parser
> dependencies (especially POI), and closing some of the reported bugs.

+1

> 
> It would be nice to get the media type registry and configuration
> changes that I've been working on finished, but that's IMO not a
> requirement before 1.0. A nice extra feature would be some light
> integration with Lucene Java. Also, I've been thinking about
> potentially splitting Tika into component libraries like tika-core,
> tika-parsers, tika-lucene, etc. to better manage external dependencies
> and to make it more attractive for parser libraries to directly
> implement the Parser interface.

I think separate libraries is a very interesting and cool idea. I'm happy to
help out with the separation, but I don't think it's a req for 0.2.

Also, once we're ready to release, I volunteer to be the release manager if
everyone is +1 for it.

Thanks!

Cheers,
 Chris


> 
> BR,
> 
> Jukka Zitting

______________________________________________
Chris Mattmann, Ph.D.
Chris.Mattmann@jpl.nasa.gov
Cognizant Development Engineer
Early Detection Research Network Project
_________________________________________________
Jet Propulsion Laboratory            Pasadena, CA
Office: 171-266B                     Mailstop:  171-246
_______________________________________________________

Disclaimer:  The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.