Posted to dev@mahout.apache.org by "Yann Moisan (JIRA)" <ji...@apache.org> on 2012/09/17 19:21:08 UTC

[jira] [Created] (MAHOUT-1068) FileDataModel should ignore directories when reloading data

Yann Moisan created MAHOUT-1068:
-----------------------------------

             Summary: FileDataModel should ignore directories when reloading data
                 Key: MAHOUT-1068
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1068
             Project: Mahout
          Issue Type: Bug
          Components: Collaborative Filtering
    Affects Versions: 0.7
            Reporter: Yann Moisan
            Assignee: Sean Owen


I work with a directory that contains:
- a file test.csv (my recommendation data)
- a directory test (used for other purposes ...)

Surprisingly, I encountered the following error:

java.io.FileNotFoundException: .../test (Is a directory)
	at java.io.FileInputStream.open(Native Method) ~[na:1.7.0_03]
	at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[na:1.7.0_03]
	at org.apache.mahout.common.iterator.FileLineIterator.getFileInputStream(FileLineIterator.java:98) ~[mahout-core-0.7.jar:0.7]
	at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:79) ~[mahout-core-0.7.jar:0.7]
	at org.apache.mahout.common.iterator.FileLineIterator.<init>(FileLineIterator.java:67) ~[mahout-core-0.7.jar:0.7]
	at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.buildModel(FileDataModel.java:238) [mahout-core-0.7.jar:0.7]
	at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.reload(FileDataModel.java:207) [mahout-core-0.7.jar:0.7]
	at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:193) [mahout-core-0.7.jar:0.7]
	at org.apache.mahout.cf.taste.impl.model.file.FileDataModel.<init>(FileDataModel.java:148) [mahout-core-0.7.jar:0.7]
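
The trace bottoms out in java.io.FileInputStream, which throws FileNotFoundException when asked to open a directory. The following is a minimal standalone sketch of that behaviour, independent of Mahout; the class name DirectoryOpenDemo and the helper canOpen are hypothetical and exist only for illustration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class DirectoryOpenDemo {

  /** Returns true if the path can be opened for reading as a stream. */
  static boolean canOpen(File path) {
    try (FileInputStream in = new FileInputStream(path)) {
      return true;
    } catch (FileNotFoundException e) {
      // Thrown when the path does not exist, cannot be read, or is a directory.
      return false;
    } catch (IOException e) {
      // Only close() can raise a plain IOException here.
      return false;
    }
  }

  public static void main(String[] args) {
    // The system temp directory is used only because it is a directory that exists.
    File tmpDir = new File(System.getProperty("java.io.tmpdir"));
    System.out.println(tmpDir + " is a directory: " + tmpDir.isDirectory());
    System.out.println("can be opened as a stream: " + canOpen(tmpDir));
  }
}
```

The exact exception message (for example "Is a directory") is platform dependent; only the exception type is guaranteed.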


After looking at the code, I saw that the method findUpdateFilesAfter doesn't filter out directories, so the directory test is picked up as if it were an update file for test.csv.

I propose adding a check in that method:
    ...
    for (File updateFile : parentDir.listFiles()) {
+     if (!updateFile.isDirectory()) { 
      String updateFileName = updateFile.getName();
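
For readers without the Mahout source at hand, the proposed check can be sketched as a self-contained loop. This is not the body of findUpdateFilesAfter; the class UpdateFileScan and the method regularFilesIn are hypothetical names, and the null check is an addition that covers the case where listFiles() returns null.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class UpdateFileScan {

  /** Collects the entries of parentDir that are not directories. */
  static List<File> regularFilesIn(File parentDir) {
    List<File> result = new ArrayList<>();
    File[] children = parentDir.listFiles();
    if (children == null) {
      // listFiles() returns null if parentDir is not a directory or an I/O error occurs.
      return result;
    }
    for (File child : children) {
      // Same test as in the proposed patch: skip anything that is a directory.
      if (!child.isDirectory()) {
        result.add(child);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    for (File f : regularFilesIn(dir)) {
      System.out.println(f.getName());
    }
  }
}
```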


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (MAHOUT-1068) FileDataModel should ignore directories when reloading data

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-1068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-1068.
-------------------------------

       Resolution: Fixed
    Fix Version/s: 0.8

No problem, added (as a FileFilter).
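
The committed change is not shown in this thread. As a rough sketch of what filtering through a java.io.FileFilter can look like, assuming nothing about the actual Mahout code, the names RegularFileFilterDemo, NOT_A_DIRECTORY and listRegularFiles below are hypothetical:

```java
import java.io.File;
import java.io.FileFilter;

public class RegularFileFilterDemo {

  /** Accepts every entry except directories. Illustrative only. */
  static final FileFilter NOT_A_DIRECTORY = new FileFilter() {
    @Override
    public boolean accept(File candidate) {
      return !candidate.isDirectory();
    }
  };

  /** Lists the non-directory entries of parentDir, or an empty array if it cannot be listed. */
  static File[] listRegularFiles(File parentDir) {
    // File.listFiles(FileFilter) applies the filter while listing and may return null.
    File[] files = parentDir.listFiles(NOT_A_DIRECTORY);
    return files == null ? new File[0] : files;
  }

  public static void main(String[] args) {
    File dir = new File(args.length > 0 ? args[0] : ".");
    for (File f : listRegularFiles(dir)) {
      System.out.println(f.getName());
    }
  }
}
```

Passing the filter to listFiles keeps the directory test out of the loop body, so the loop only ever sees candidate files.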
                
