Posted to dev@tika.apache.org by "Mattmann, Chris A (3980)" <ch...@jpl.nasa.gov> on 2015/04/01 07:40:51 UTC

Re: [DISCUSS] Tika 1.8 or 1.7.1

+1 to running tika-batch and govdocs. Woot.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Tyler Palsulich <tp...@gmail.com>
Reply-To: "dev@tika.apache.org" <de...@tika.apache.org>
Date: Monday, March 30, 2015 at 3:22 PM
To: "dev@tika.apache.org" <de...@tika.apache.org>
Subject: Re: [DISCUSS] Tika 1.8 or 1.7.1

>I just remembered TIKA-1509 and TIKA-1558 -- testing now for blacklist
>functionality through TIKA-1509. If that works, I'll back out TIKA-1558.
>
>Tim, I think you should run govdocs from the RC, in case something changes
>between your run and the cut.
>
>Tyler
>
>On Mon, Mar 30, 2015 at 10:17 AM, Allison, Timothy B. <ta...@mitre.org>
>wrote:
>
>> All,
>>
>> I've made the changes that I had hoped to.  Grib pdf exclusion remains
>> for any takers.
>>
>> Let me know when I should initiate the run against govdocs1 to see if
>> there are any surprises on that corpus with Tika 1.8.
>>
>> Best,
>>
>>             Tim
>>
>> -----Original Message-----
>> From: Allison, Timothy B. [mailto:tallison@mitre.org]
>> Sent: Monday, March 30, 2015 7:03 AM
>> To: dev@tika.apache.org
>> Subject: RE: [DISCUSS] Tika 1.8 or 1.7.1
>>
>> Unless there are objections, I'd like these to be resolved before 1.8:
>>
>> TIKA-1584 -- I'll fix
>> TIKA-1575 -- Resolved by Konstantin Gribov (thank you!)
>> TIKA-1512 -- I'll put in a temporary fix so that we don't get IOOBEs,
>> but I'll leave this open and do some more digging to see if we need to
>> open a ticket at the POI level
>> TIKA-1511 -- I'll remove "provided" for xerial
>>
>> TIKA-1549 -- We should thank Toke Eskildsen in CHANGES.txt, no?
>>
>> I'll have these fixes completed by noon EDT.  Should I run against
>> govdocs1 before or after the RC?
>>
>> My last build of Tika app (a few days ago) ballooned to ~43MB, and
>> that's before I add ~3MB for xerial.  Tika server is now ~48MB.  As of
>> my last build, we are still including ~4MB of pdfs (README.NLDAS1.pdf
>> and README.NLDAS2.pdf) from the GRIB(?) parser in the tika-app and
>> tika-server jars.
>>
>> Best,
>>
>>               Tim
>>
>>
>>
>> -----Original Message-----
>> From: Tyler Palsulich [mailto:tpalsulich@gmail.com]
>> Sent: Sunday, March 29, 2015 9:13 AM
>> To: dev@tika.apache.org
>> Subject: Re: [DISCUSS] Tika 1.8 or 1.7.1
>>
>> Once TIKA-1584 and TIKA-1575 are resolved, I'll work up an RC (unless
>> something else pops up).
>>
>> Thank you everyone.
>>
>> Tyler
>> On Mar 29, 2015 4:43 AM, "Hong-Thai Nguyen" <th...@gmail.com>
>> wrote:
>>
>> > +1 for 1.8
>> >
>> > Hong-Thai
>> >
>> > > On 28 Mar 2015, at 16:01, Tyler Palsulich <tp...@apache.org>
>> > > wrote:
>> > >
>> > > Hi Folks,
>> > >
>> > > Now that TIKA-1581 (JHighlight licensing issues) is resolved, we
>> > > need to release a new version of Tika. I'll volunteer to be the
>> > > release manager again.
>> > >
>> > > Should we release this as 1.8 or 1.7.1?
>> > >
>> > > Does anyone have any last minute issues they'd like to finish and
>> > > see in Tika 1.X? I'd like to get the example working with CORS
>> > > (TIKA-1585 and TIKA-1586). Any others?
>> > >
>> > > Have a good weekend,
>> > > Tyler
>> >
>>