Posted to user@mahout.apache.org by Chih-Hsien Wu <ch...@gmail.com> on 2013/11/19 15:47:22 UTC

Fwd: lucene.util.ArrayUtil.oversize(II)I error with mahout 0.8 on AWS EMR

Hi all,

I'm running a custom Mahout jar on AWS EMR and received the following
error:

Error: org.apache.lucene.util.ArrayUtil.oversize(II)I
Error: org.apache.lucene.util.ArrayUtil.oversize(II)I
Error: org.apache.lucene.util.ArrayUtil.oversize(II)I
java.lang.IllegalStateException: Job failed!
	at org.apache.mahout.vectorizer.collocations.llr.CollocDriver.generateCollocations(CollocDriver.java:238)
	at org.apache.mahout.vectorizer.collocations.llr.CollocDriver.generateAllGrams(CollocDriver.java:187)
	at org.apache.mahout.vectorizer.DictionaryVectorizer.createTermFrequencyVectors(DictionaryVectorizer.java:184)
	at clustering.AmazonClusteringDriver.main(AmazonClusteringDriver.java:133)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:187)


The same code runs successfully on a local cluster under Hadoop 1.2.1.
However, I can't get it to work on AWS under the Amazon Hadoop 2.2.0
distribution. I'm not sure exactly what the problem is.
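
One way to narrow this down might be a small classpath probe run in both
environments. This is only a sketch, and it rests on an assumption the log
does not confirm: that the truncated message is a NoSuchMethodError caused
by a different Lucene version being picked up on the cluster. The
descriptor (II)I denotes a method taking two ints and returning an int, so
the probe checks for oversize(int, int) and prints where ArrayUtil was
loaded from. ClasspathProbe and its helper names are hypothetical.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the location a class was loaded from, or a placeholder if unknown. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap/unknown)";
        }
        return src.getLocation().toString();
    }

    /** True if the class exposes a public method with this name and parameter types. */
    public static boolean hasMethod(Class<?> cls, String name, Class<?>... params) {
        try {
            Method m = cls.getMethod(name, params);
            return m != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the error message above.
        String className = "org.apache.lucene.util.ArrayUtil";
        try {
            Class<?> cls = Class.forName(className);
            System.out.println("Loaded from: " + locationOf(cls));
            System.out.println("Has oversize(int, int): "
                    + hasMethod(cls, "oversize", int.class, int.class));
        } catch (ClassNotFoundException e) {
            System.out.println(className + " is not on the classpath");
        }
    }
}
```

If the two environments report different jar locations, or the method is
missing on EMR, that would point to a dependency conflict rather than a
problem in the job code itself.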


Jason