You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Frank Scholten (JIRA)" <ji...@apache.org> on 2010/09/13 22:52:33 UTC

[jira] Created: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
---------------------------------------------------------------------------------------

                 Key: MAHOUT-501
                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
             Project: Mahout
          Issue Type: Bug
    Affects Versions: 0.3
         Environment: Ubuntu 10.04.1 LTS
            Reporter: Frank Scholten
            Priority: Trivial


Running 'mahout lucene.vector' results in the following stacktrace:

frank@frankthetank:~$ mahout lucene.vector
Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
10/09/13 22:44:43 ERROR lucene.Driver: Exception
org.apache.commons.cli2.OptionException: Missing required option --dir
        at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
        at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
        at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
        at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Usage:                                                                          
 [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
--help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
--outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
--minDF <minDF>]                                                                
Options                                                                         
  --dir (-d) dir                      The Lucene directory                      
  --idField (-i) idField              The field in the index containing the     
                                      index.  If null, then the Lucene internal 
                                      doc id is used which is prone to error if 
                                      the underlying index changes              
  --output (-o) output                The output file                           
  --delimiter (-l) delimiter          The delimiter for outputing the           
                                      dictionary                                
  --help (-h)                         Print out help                            
  --field (-f) field                  The field in the index                    
  --max (-m) max                      The maximum number of vectors to output.  
                                      If not specified, then it will loop over  
                                      all docs                                  
  --dictOut (-t) dictOut              The output of the dictionary              
  --norm (-n) norm                    The norm to use, expressed as either a    
                                      double or "INF" if you want to use the    
                                      Infinite norm.  Must be greater or equal  
                                      to 0.  The default is not to normalize    
  --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
                                      (SequenceFileVectorWriter - default) or   
                                      file (Writes to a File using JSON format) 
  --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
                                      Can be used to remove really high         
                                      frequency terms.  Expressed as an integer 
                                      between 0 and 100. Default is 99.         
  --weight (-w) weight                The kind of weight to use. Currently TF   
                                      or TFIDF                                  
  --minDF (-md) minDF                 The minimum document frequency.  Default  
                                      is 1                                      
10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms

This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.

Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris updated MAHOUT-501:
-------------------------------

        Status: Resolved  (was: Patch Available)
    Resolution: Fixed

Thanks for the patch Frank.

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Assignee: Drew Farris
>            Priority: Trivial
>             Fix For: 0.4
>
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Frank Scholten (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank Scholten updated MAHOUT-501:
----------------------------------

    Fix Version/s: 0.4

Fix version 0.4

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Priority: Trivial
>             Fix For: 0.4
>
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Frank Scholten (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank Scholten updated MAHOUT-501:
----------------------------------

    Attachment: MAHOUT-501.patch

Added patch which removed lucenevector.props and adds empty lucene.vector.props

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Priority: Trivial
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Drew Farris (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Drew Farris reassigned MAHOUT-501:
----------------------------------

    Assignee: Drew Farris

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Assignee: Drew Farris
>            Priority: Trivial
>             Fix For: 0.4
>
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Frank Scholten (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank Scholten updated MAHOUT-501:
----------------------------------

    Status: Patch Available  (was: Open)

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Priority: Trivial
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Frank Scholten (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank Scholten updated MAHOUT-501:
----------------------------------

    Comment: was deleted

(was: Fix version 0.4)

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Priority: Trivial
>             Fix For: 0.4
>
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAHOUT-501) Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913801#action_12913801 ] 

Hudson commented on MAHOUT-501:
-------------------------------

Integrated in Mahout-Quality #311 (See [https://hudson.apache.org/hudson/job/Mahout-Quality/311/])
    

> Property file lucenevector.props doesn't get loaded when running 'mahout lucene.vector'
> ---------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-501
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-501
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.3
>         Environment: Ubuntu 10.04.1 LTS
>            Reporter: Frank Scholten
>            Assignee: Drew Farris
>            Priority: Trivial
>             Fix For: 0.4
>
>         Attachments: MAHOUT-501.patch
>
>   Original Estimate: 0.08h
>  Remaining Estimate: 0.08h
>
> Running 'mahout lucene.vector' results in the following stacktrace:
> frank@frankthetank:~$ mahout lucene.vector
> Running on hadoop, using HADOOP_HOME=/home/frank/software/dist/hadoop
> HADOOP_CONF_DIR=/home/frank/software/dist/hadoop/conf
> 10/09/13 22:44:43 WARN driver.MahoutDriver: No lucene.vector.props found on classpath, will use command-line arguments only
> 10/09/13 22:44:43 ERROR lucene.Driver: Exception
> org.apache.commons.cli2.OptionException: Missing required option --dir
>         at org.apache.commons.cli2.option.DefaultOption.validate(DefaultOption.java:172)
>         at org.apache.commons.cli2.option.GroupImpl.validate(GroupImpl.java:265)
>         at org.apache.commons.cli2.commandline.Parser.parse(Parser.java:104)
>         at org.apache.mahout.utils.vectors.lucene.Driver.main(Driver.java:133)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:175)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Usage:                                                                          
>  [--dir <dir> --idField <idField> --output <output> --delimiter <delimiter>     
> --help --field <field> --max <max> --dictOut <dictOut> --norm <norm>            
> --outputWriter <outputWriter> --maxDFPercent <maxDFPercent> --weight <weight>   
> --minDF <minDF>]                                                                
> Options                                                                         
>   --dir (-d) dir                      The Lucene directory                      
>   --idField (-i) idField              The field in the index containing the     
>                                       index.  If null, then the Lucene internal 
>                                       doc id is used which is prone to error if 
>                                       the underlying index changes              
>   --output (-o) output                The output file                           
>   --delimiter (-l) delimiter          The delimiter for outputing the           
>                                       dictionary                                
>   --help (-h)                         Print out help                            
>   --field (-f) field                  The field in the index                    
>   --max (-m) max                      The maximum number of vectors to output.  
>                                       If not specified, then it will loop over  
>                                       all docs                                  
>   --dictOut (-t) dictOut              The output of the dictionary              
>   --norm (-n) norm                    The norm to use, expressed as either a    
>                                       double or "INF" if you want to use the    
>                                       Infinite norm.  Must be greater or equal  
>                                       to 0.  The default is not to normalize    
>   --outputWriter (-e) outputWriter    The VectorWriter to use, either seq       
>                                       (SequenceFileVectorWriter - default) or   
>                                       file (Writes to a File using JSON format) 
>   --maxDFPercent (-x) maxDFPercent    The max percentage of docs for the DF.    
>                                       Can be used to remove really high         
>                                       frequency terms.  Expressed as an integer 
>                                       between 0 and 100. Default is 99.         
>   --weight (-w) weight                The kind of weight to use. Currently TF   
>                                       or TFIDF                                  
>   --minDF (-md) minDF                 The minimum document frequency.  Default  
>                                       is 1                                      
> 10/09/13 22:44:43 INFO driver.MahoutDriver: Program took 54 ms
> This is because the program shortname lucene.vector from driver.classes.props has a dot, while the props file is called lucenevector.props, which doesn't have a dot between lucene and vector.
> Fix: change lucenevector.props filename to lucene.vector.props.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.