Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2011/02/05 12:40:30 UTC

[jira] Created: (MAHOUT-608) Collect various data directories in Mahout dir structure

Collect various data directories in Mahout dir structure
--------------------------------------------------------

                 Key: MAHOUT-608
                 URL: https://issues.apache.org/jira/browse/MAHOUT-608
             Project: Mahout
          Issue Type: Improvement
    Affects Versions: 0.4
            Reporter: Sean Owen
            Assignee: Sean Owen
             Fix For: 0.5


The top-level project directory has collected, over time, a number of directories that have a generally similar purpose: to collect various config files, data files, and scripts. We have, at first glance:

bin/
 mahout
conf/
 (various .props files)
etc/
 build.xml (reusable  Ant tasks?)
 findbugs-exclude.xml
 mahout.importorder
mahout/
 conf/
  arff.vector.props (wrong place?)
src/
 main/
  appended-resources/
   META-INF/
    NOTICE
   supplemental-models.xml
 site/
  site.xml

There are a few top-level generated directories:

input/
 ...
output/
 ...
testdata/
 transactions
  test.txt


I'd like to prune whatever isn't needed anymore, and rationalize one directory structure as a start.
Can anyone help by suggesting things to be removed, or a directory structure?
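
For anyone who wants to reproduce the inventory above on their own checkout, here is a minimal sketch. It is not part of the issue or the patch. The function name `config_like_dirs` and the set of "config-like" suffixes are assumptions chosen to match the files named in the listing.

```python
from pathlib import Path

# Assumed suffixes, taken from the files named in the listing above
# (.props files, XML files, mahout.importorder). Adjust as needed.
CONFIG_SUFFIXES = {".props", ".xml", ".importorder"}


def config_like_dirs(root):
    """Map each top-level directory under root to the config-like files in it.

    Directories with no matching files are left out of the result.
    """
    root = Path(root)
    found = {}
    for top in sorted(p for p in root.iterdir() if p.is_dir()):
        hits = sorted(
            f.relative_to(root).as_posix()
            for f in top.rglob("*")
            if f.is_file() and f.suffix in CONFIG_SUFFIXES
        )
        if hits:
            found[top.name] = hits
    return found
```

Calling `config_like_dirs(".")` from the project root returns a dictionary keyed by top-level directory name, which makes it easy to see how many separate "stuff" directories currently hold config files.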

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by Ted Dunning <te...@gmail.com>.
+0

I just glanced through the diff, but didn't check functionality.


On Fri, Feb 11, 2011 at 5:11 AM, Sean Owen (JIRA) <ji...@apache.org> wrote:

>
>    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993480#comment-12993480 ]
>
> Sean Owen commented on MAHOUT-608:
> ----------------------------------
>
> I'd like to commit this. I think it's a good bit of tidying, and as far as
> I know doesn't break anything. I think Ted is at least +0? Would be good to
> hear if anyone has other thoughts on this before proceeding though.

[jira] Updated: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-608:
-----------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

OK. I'm confident enough that this is a good change (or, if anyone doesn't like it after seeing it, that it's easy to modify) that I went ahead. The top level is significantly cleaner now.


[jira] Updated: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-608:
-----------------------------

    Status: Patch Available  (was: Open)

Here's most of what I think should be done. Tests pass, but does it make sense?


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994062#comment-12994062 ] 

Sean Owen commented on MAHOUT-608:
----------------------------------

I'd love to make it more than +0 material. I think the idea is good, right? The directory structure doesn't seem to reflect one idea about organizing these files; it's just where they landed. So good to organize them. But the right-est organization is an open question. If not the current one, piling it into src/, what are some other good ideas?


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12997297#comment-12997297 ] 

Hudson commented on MAHOUT-608:
-------------------------------

Integrated in Mahout-Quality #638 (See [https://hudson.apache.org/hudson/job/Mahout-Quality/638/])
    MAHOUT-608 one more change that somehow didn't commit



[jira] Updated: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-608:
-----------------------------

    Attachment: MAHOUT-608.patch

Here's another go. Same concept, but converging on src/. I think this is actually better organization.


[jira] Updated: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-608:
-----------------------------

    Attachment: MAHOUT-608.patch


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994260#comment-12994260 ] 

Hudson commented on MAHOUT-608:
-------------------------------

Integrated in Mahout-Quality #625 (See [https://hudson.apache.org/hudson/job/Mahout-Quality/625/])
    MAHOUT-608 - Collect top-level config files into Maven-standard location like src/. Push some top-level files down. Remove duplication in NOTICE.txt. Remove some apparently unused files.



[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12993480#comment-12993480 ] 

Sean Owen commented on MAHOUT-608:
----------------------------------

I'd like to commit this. I think it's a good bit of tidying, and as far as I know doesn't break anything. I think Ted is at least +0? Would be good to hear if anyone has other thoughts on this before proceeding though.


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Frank Scholten (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12990989#comment-12990989 ] 

Frank Scholten commented on MAHOUT-608:
---------------------------------------

arff.vector.props is indeed in the wrong place. Should be under conf along with the other props files. It was from MAHOUT-508, apparently applied on the wrong dir in the source tree.


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991251#comment-12991251 ] 

Sean Owen commented on MAHOUT-608:
----------------------------------

OK that's cool to standardize on src/ if that's standard-ish. The point is mostly to not have 3-4 "stuff" directories. Let me try again for src/

It still feels like there are many 'config' directories: (top level + src/), mahout-eclipse-support, mahout-distribution. I don't know how much it can be refactored but it still feels spread out.


[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991241#comment-12991241 ] 

Ted Dunning commented on MAHOUT-608:
------------------------------------

As I see it, this patch did the following:

moved conf under etc
moved a few files into etc
moved the appended resources out of src/main into etc

I don't much see the benefit here.  Moving a few files into etc is an improvement, but if conf is to be moved, it seems it should be moved into a resources directory under src/main rather than into etc.

Similarly, moving resources out of the maven standard sort of place (under src/main) doesn't seem a benefit.

How about an alternative where conf goes into mahout/src/main, the few files go into mahout/etc and the appended-resources sit still?
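
To make the alternative concrete, here is a minimal dry-run sketch of how such a relocation could be planned before touching the tree. It is illustrative only and not taken from any attached patch. The target name `src/main/conf` is an assumption, as is reading "mahout/" as the project root. The "few files" bound for etc are not enumerated in this thread, so they are left out of the mapping, and `conf/example.props` in the usage note is a hypothetical file name.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: old directory -> new directory, relative to the
# project root. Only the conf move is modelled; appended-resources stays put.
MOVES = {
    "conf": "src/main/conf",
}


def plan_moves(paths, moves=MOVES):
    """Return (old_path, new_path) pairs for paths under a remapped directory.

    Paths outside every remapped directory are omitted, meaning they stay put.
    """
    plan = []
    for p in paths:
        parts = PurePosixPath(p).parts
        for old, new in moves.items():
            old_parts = PurePosixPath(old).parts
            # Compare whole path components so "conf" does not match "config".
            if parts[:len(old_parts)] == old_parts:
                rest = parts[len(old_parts):]
                plan.append((p, str(PurePosixPath(new, *rest))))
                break
    return plan
```

For example, `plan_moves(["conf/example.props", "src/main/appended-resources/META-INF/NOTICE"])` yields a single pair moving the props file under `src/main/conf/`, and leaves the NOTICE file where it is.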



[jira] Commented: (MAHOUT-608) Collect various data directories in Mahout dir structure

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12994145#comment-12994145 ] 

Ted Dunning commented on MAHOUT-608:
------------------------------------

My +0 didn't reflect discredit on the intent.  It merely reflected that I only skimmed the patch and couldn't say if the change was done entirely correctly.  You (Sean) have a good history of doing changes correctly so I wouldn't worry much.
