Posted to user@mahout.apache.org by JJJ959 <so...@hotmail.com> on 2011/04/14 09:35:00 UTC

Error "Generating an output file from a Lucene Index"

Getting errors trying to generate the output file. Any help would be greatly
appreciated. Thanks

Index files are as follows:

_0.fdt
_0.fdx
_0.fnm
_0.frq
_0.nrm
_0.prx
_0.tii
_0.tis
segments.gen
segments_1

When I enter this command I get the following errors:
$mahout org.apache.mahout.utils.vectors.lucene.Driver --dir /home/temp/text_output --output /home/temp/text_output --dictOut /home/temp/text_output --max 50 --field contents

Running on hadoop, using HADOOP_HOME=/usr/lib/hadoop
No HADOOP_CONF_DIR set, using /usr/lib/hadoop/conf
11/04/14 16:15:27 WARN driver.MahoutDriver: No org.apache.mahout.utils.vectors.lucene.Driver.props found on classpath, will use command-line arguments only
Exception in thread "main" org.apache.lucene.index.CorruptIndexException: Unknown format version: -11
        at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:249)
        at org.apache.lucene.index.DirectoryReader$1.doBody(DirectoryReader.java:73)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:677)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:202)
        at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:157)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:184)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:186)


--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Generating-an-output-file-from-a-Lucene-Index-tp2819407p2819407.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error "Generating an output file from a Lucene Index"

Posted by Lance Norskog <go...@gmail.com>.
OK, which Lucene version is used in the Mahout libraries? You'll have to
switch either your program or Mahout so that both use the same Lucene version.

On Thu, Apr 14, 2011 at 11:28 PM, JJJ959 <so...@hotmail.com> wrote:
> Currently I'm using Mahout 0.4. The Index was generated with Lucene 3.1.0
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Error-Generating-an-output-file-from-a-Lucene-Index-tp2819407p2823550.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
>



-- 
Lance Norskog
goksron@gmail.com

Re: Error "Generating an output file from a Lucene Index"

Posted by JJJ959 <so...@hotmail.com>.
Thanks.
I downgraded the Lucene version from 3.1.0 to 2.9.4, and it seemed to
generate the dict file. However, I could not generate the part-out.vec file.
What could be the problem?
Thanks


Re: Error "Generating an output file from a Lucene Index"

Posted by JJJ959 <so...@hotmail.com>.
Currently I'm using Mahout 0.4. The Index was generated with Lucene 3.1.0


Re: Error "Generating an output file from a Lucene Index"

Posted by Lance Norskog <go...@gmail.com>.
Did you make this index with the mahout job? Or is this from a
separate app? There are a few different versions of Lucene index files
out there; it's possible you made the files with a newer Lucene than
is distributed with Mahout.
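
One way to check which format an index was written in is to look at the start of the segments_N file. Below is a minimal sketch, assuming (as I understand the Lucene 2.x/3.x file layout) that segments_N begins with a 4-byte big-endian format integer, which would be the same number the "Unknown format version" message reports. The class name SegmentsFormatPeek and the method readFormat are made up for illustration; they are not part of Lucene or Mahout, and the code uses only the JDK.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not a Lucene or Mahout class.
class SegmentsFormatPeek {

    // Reads the first four bytes of the given file as a big-endian int.
    // Assumption: for a Lucene 2.x/3.x segments_N file this is the format version.
    static int readFormat(Path segmentsFile) throws IOException {
        try (InputStream in = Files.newInputStream(segmentsFile);
             DataInputStream data = new DataInputStream(in)) {
            return data.readInt(); // DataInputStream reads ints big-endian
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SegmentsFormatPeek <path to segments_N>");
            return;
        }
        System.out.println("format version: " + readFormat(Paths.get(args[0])));
    }
}
```

For the index in this thread that would be run as "java SegmentsFormatPeek /home/temp/text_output/segments_1". If the number printed matches the one in the exception, that is consistent with the index having been written by a newer Lucene than the one on Mahout's classpath.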


On 4/14/11, Philippe Adjiman <ad...@gmail.com> wrote:
> Check your lucene code (where you are generating the index) and make sure
> your indexed fields have the right properties.
> Step 1 of that post
> <http://philippeadjiman.com/blog/2010/12/30/how-to-easily-build-and-observe-tf-idf-weight-vectors-with-lucene-and-mahout/>
> might help you to sort out the issue.
>
> -Philippe.


-- 
Lance Norskog
goksron@gmail.com

Re: Error "Generating an output file from a Lucene Index"

Posted by Philippe Adjiman <ad...@gmail.com>.
Check your Lucene code (where you are generating the index) and make sure
your indexed fields have the right properties.
Step 1 of that post
<http://philippeadjiman.com/blog/2010/12/30/how-to-easily-build-and-observe-tf-idf-weight-vectors-with-lucene-and-mahout/>
might help you to sort out the issue.

-Philippe.




-- 
Philippe Adjiman | twitter: padjiman | linkedin:
il.linkedin.com/in/philippeadjiman | blog: http://philippeadjiman.com/blog