Posted to user@tika.apache.org by Jonathan Koren <jo...@soe.ucsc.edu> on 2009/06/02 06:08:54 UTC

0.4 build issues

I recently updated svn and was looking to build 0.4. However, after
performing an `mvn clean; mvn install`, the tika-app-0.4-SNAPSHOT.jar
failed to build with an out-of-memory error. Perhaps it's related,
but I also fail to get a tika-0.4-SNAPSHOT.jar like the docs say I
should get.

I was trying to use the application because, after upgrading to 0.4,
my attempts to use the Tika parsers returned nothing. I got no text from
any document, no exceptions, and nothing on stderr. As far as I can tell
it's just failing silently, even though I have both tika-core and
tika-parsers in my CLASSPATH, along with some stuff from Apache Commons
that Tika complained about while getting to this point. I don't have
any of these issues with tika-0.3-standalone.jar.


Error pasted below:

[INFO] ----------------------------------------------------------------------------
[INFO] Building Apache Tika application
[INFO]    task-segment: [install]
[INFO] ----------------------------------------------------------------------------
[INFO] [resources:resources]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 2 resources
[INFO] [compiler:compile]
[INFO] Nothing to compile - all classes are up to date
[INFO] [resources:testResources]
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory /usr/local/src/lucene/tika/tika-app/src/test/resources
[INFO] [compiler:testCompile]
[INFO] No sources to compile
[INFO] [surefire:test]
[INFO] No tests to run.
[INFO] [bundle:bundle]
[INFO] ------------------------------------------------------------------------
[ERROR] FATAL ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Java heap space
[INFO] ------------------------------------------------------------------------
[INFO] Trace
java.lang.OutOfMemoryError: Java heap space
	at sun.util.calendar.Gregorian.newCalendarDate(Gregorian.java:51)
	at java.util.Date.<init>(Date.java:235)
	at java.util.zip.ZipEntry.dosToJavaTime(ZipEntry.java:288)
	at java.util.zip.ZipEntry.getTime(ZipEntry.java:127)
	at aQute.lib.osgi.ZipResource.build(ZipResource.java:48)
	at aQute.lib.osgi.ZipResource.build(ZipResource.java:32)
	at aQute.lib.osgi.Jar.<init>(Jar.java:36)
	at aQute.lib.osgi.Jar.<init>(Jar.java:55)
	at aQute.lib.osgi.Analyzer.getJarFromName(Analyzer.java:864)
	at aQute.lib.osgi.Builder.extractFromJar(Builder.java:767)
	at aQute.lib.osgi.Builder.doIncludeResource(Builder.java:682)
	at aQute.lib.osgi.Builder.doIncludeResources(Builder.java:668)
	at aQute.lib.osgi.Builder.build(Builder.java:73)
	at org.apache.felix.bundleplugin.BundlePlugin.buildOSGiBundle(BundlePlugin.java:391)
	at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:282)
	at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:236)
	at org.apache.felix.bundleplugin.BundlePlugin.execute(BundlePlugin.java:227)
	at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:443)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:539)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:480)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:459)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:311)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:278)
	at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:143)
	at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:334)
	at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:125)
	at org.apache.maven.cli.MavenCli.main(MavenCli.java:272)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:585)
	at org.codehaus.classworlds.Launcher.launchEnhanced(Launcher.java:315)
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 20 seconds
[INFO] Finished at: Mon Jun 01 20:54:53 PDT 2009
[INFO] Final Memory: 27M/63M
[INFO] ------------------------------------------------------------------------


--
Jonathan Koren
jonathan@soe.ucsc.edu
http://www.soe.ucsc.edu/~jonathan/



Re: 0.4 build issues

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Tue, Jun 2, 2009 at 6:08 AM, Jonathan Koren <jo...@soe.ucsc.edu> wrote:
> I recently updated svn and was looking to build 0.4. However, after
> performing an `mvn clean; mvn install`, the tika-app-0.4-SNAPSHOT.jar failed
> to build with an out-of-memory error.

The bundle plugin that we now use to package up the standalone
jar seems to require quite a bit of memory. I haven't seen this problem
myself with Tika, but I've seen it reported in other projects. Setting
an environment variable like MAVEN_OPTS=-Xmx256m should work around
this, though you may want to report it as a bug in Jira so we can
look at better ways to solve it.
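
A rough sketch of that workaround, assuming a Unix-like shell and that
it is run from the top of the Tika checkout (the guard around mvn is
only an illustration so the snippet does nothing elsewhere):

```shell
# Raise the heap limit for the JVM that runs Maven to 256 MB.
export MAVEN_OPTS="-Xmx256m"

# Rebuild only if Maven is installed and a pom.xml is present here
# (assumption: the current directory is the Tika source checkout).
if command -v mvn >/dev/null 2>&1 && [ -f pom.xml ]; then
    mvn clean install
fi
```

If 256 MB still isn't enough for the bundle step, a larger -Xmx value
can be tried the same way.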

> Perhaps it's related, but I also fail to get a tika-0.4-SNAPSHOT.jar like the
> docs say I should get.

The Tika build has recently been split into multiple components (see
CHANGES.txt), but the web site instructions have yet to be updated to
match this. Once you get the tika-app component to build, you'll find
the standalone jar in tika-app/target/tika-app-0.4-SNAPSHOT.jar.

> I was trying to use the application because, after upgrading to 0.4, my
> attempts to use the Tika parsers returned nothing. I got no text from any
> document, no exceptions, and nothing on stderr. As far as I can tell it's just
> failing silently, even though I have both tika-core and tika-parsers in
> my CLASSPATH, along with some stuff from Apache Commons that Tika complained
> about while getting to this point.

Tika currently ignores all parser classes that it can't load due
to missing parser library dependencies, which would explain the silent
failure you're seeing.
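
One cheap first check, as a sketch only: the JVM skips CLASSPATH
entries that don't exist without any warning, so it can help to list
them. The function name below is made up for illustration, and a clean
result does not prove that every parser library dependency is present.

```shell
# List CLASSPATH entries that do not exist on disk.
# Takes the classpath string as its only argument and prints one
# "missing: ..." line per entry that cannot be found.
missing_classpath_entries() {
    old_ifs=$IFS
    IFS=:
    set -f                       # keep wildcard entries such as lib/* literal
    for entry in $1; do
        case $entry in
            ''|*'*') continue ;; # skip empty and wildcard entries
        esac
        [ -e "$entry" ] || printf 'missing: %s\n' "$entry"
    done
    set +f
    IFS=$old_ifs
}

missing_classpath_entries "${CLASSPATH:-}"
```

Any path it prints is worth fixing before looking further into which
parser libraries are absent.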

BR,

Jukka Zitting