Posted to dev@mahout.apache.org by Pat Ferrel <pa...@occamsmachete.com> on 2015/03/22 17:34:02 UTC

naming

The primary artifacts needed/created for the new Scala stuff are math, math-scala, spark, spark-shell, and h2o.

Might it be better to rename math-scala to core-scala (or just core), since it does now and will increasingly include non-math? This is where engine-neutral stuff goes, and so it is a core dependency.

Also thinking about a name for the whole Mahout Scala environment: DSL, shell, optimized linear algebra, Bayesian ops, stats, yada yada. How about (sticking with the Sanskrit theme) "him tendua" or maybe just "tendua", which is Sanskrit for mountain leopard or panther? A nice word, easily found in searches, that has good connotations. Besides, the kits are seriously cute.


Re: naming

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
For the CLI, I'd leave it wherever its dependencies allow. If it needs Spark
dependencies, it goes in spark; otherwise, math is OK for now. What matters
to me for now is keeping apart the stuff that is "abstract" (that's
math-scala at the moment) and the stuff that is backend-specific (that's
either spark or h2o).

But consider this old tale I have been pushing for some time now:

CLI dependencies are in reality method dependencies, and we already said
that a method could be quasi-algebraic (i.e. both math and backend-specific).
Moreover, methods could be written so that they include support for more
than one backend. So they might have a single entry point (which is
backend-independent) plus backend-specific support via factories or
strategies, if modeled right (just like DistributedEngine is modeled).

That seems to imply that CLI code in particular can always be written
without backend dependencies, even for quasi-algebraic methods, and can
therefore usually fit in the math module along with the method's embedded
entry point.
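
A minimal Scala sketch of that layout, under loud assumptions: Backend,
FoldBackend, TransposeBackend, Backends, ColumnMeans and ColumnMeansCli are
all hypothetical names invented for illustration, not Mahout classes, and the
two "backends" are in-core stand-ins rather than Spark or h2o. It shows a
single backend-independent entry point, backend-specific support behind a
strategy trait, and a CLI that only touches the entry point and a factory
lookup.

```scala
// Illustrative sketch only: every name here is hypothetical, not a Mahout API.

// Backend-specific support lives behind a small strategy trait.
trait Backend {
  def name: String
  // The single engine-specific primitive this method needs.
  def columnSums(rows: Seq[Array[Double]]): Array[Double]
}

// Stand-in "engine" #1: one fold over the rows.
object FoldBackend extends Backend {
  val name = "fold"
  def columnSums(rows: Seq[Array[Double]]): Array[Double] =
    rows.foldLeft(new Array[Double](rows.head.length)) { (acc, row) =>
      acc.zip(row).map { case (a, b) => a + b }
    }
}

// Stand-in "engine" #2: same result, computed column by column.
object TransposeBackend extends Backend {
  val name = "transpose"
  def columnSums(rows: Seq[Array[Double]]): Array[Double] =
    rows.map(_.toSeq).transpose.map(_.sum).toArray
}

// Factory: resolves a backend by name so callers never name a concrete one.
object Backends {
  private val known: Map[String, Backend] =
    Seq[Backend](FoldBackend, TransposeBackend).map(b => b.name -> b).toMap

  def forName(name: String): Backend =
    known.getOrElse(name,
      throw new IllegalArgumentException(s"unknown backend: $name"))
}

// The method's single, backend-independent entry point.
object ColumnMeans {
  def apply(rows: Seq[Array[Double]], backend: Backend): Array[Double] = {
    require(rows.nonEmpty, "need at least one row")
    backend.columnSums(rows).map(_ / rows.size)
  }
}

// CLI code: depends only on the entry point and the factory.
object ColumnMeansCli {
  // args(0) picks the backend by name; the remaining args are rows like "1,2".
  def run(args: Array[String]): String = {
    val backend = Backends.forName(args.head)
    val rows = args.tail.toSeq.map(_.split(',').map(_.toDouble))
    ColumnMeans(rows, backend).mkString(",")
  }
}
```

In a real module split the factory would presumably discover backends at run
time (for example by class name), so that the math module never references
spark or h2o at compile time; the static list here only keeps the sketch
self-contained.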

On Tue, Mar 24, 2015 at 11:12 AM, Dmitriy Lyubimov <dl...@gmail.com>
wrote:

> Data structures supporting math are math (e.g. "openHashMap").

Re: naming

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
Data structures supporting math are math (e.g. "openHashMap").


On Tue, Mar 24, 2015 at 9:52 AM, Andrew Palumbo <ap...@outlook.com> wrote:

> I would consider IndexedDataset to be math, along with its corresponding
> reader/writer traits (similar to drm.dfsWrite(...)). Would it be too
> confusing to keep things as is for 0.10.0 and then move options, option
> parsing, driver base classes, and any other new non-math stuff over to
> something like scala-core in a later release?
>
> If so, I'd say "math-scala" or "math-base" and "scala-core" are good names.

Re: naming

Posted by Andrew Palumbo <ap...@outlook.com>.
On 03/24/2015 11:24 AM, Pat Ferrel wrote:
> Lots of non-math in math-scala: Reader/Writer traits, options, option parsing, driver base classes, IndexedDataset (mini-Dataframes), and there will always be more because there is no other engine-neutral module. So far the rule as I understand it is: if it’s engine-neutral, put it in math-scala. The rule doesn’t match the name, and since this will be the first release to go to the Maven repos as an artifact, it seems like a good time to name it for what it is. Either this, or we create a new module for non-math?
>
> Vote:
> math-scala
> scala-core
> scala-base
> other?

I would consider IndexedDataset to be math, along with its
corresponding reader/writer traits (similar to drm.dfsWrite(...)). Would
it be too confusing to keep things as is for 0.10.0 and then move options,
option parsing, driver base classes, and any other new non-math
stuff over to something like scala-core in a later release?

If so, I'd say "math-scala" or "math-base" and "scala-core" are good names.



Re: naming

Posted by Pat Ferrel <pa...@occamsmachete.com>.
Lots of non-math in math-scala: Reader/Writer traits, options, option parsing, driver base classes, IndexedDataset (mini-Dataframes), and there will always be more because there is no other engine-neutral module. So far the rule as I understand it is: if it’s engine-neutral, put it in math-scala. The rule doesn’t match the name, and since this will be the first release to go to the Maven repos as an artifact, it seems like a good time to name it for what it is. Either this, or we create a new module for non-math?

Vote:
math-scala
scala-core
scala-base
other?


On Mar 23, 2015, at 11:02 AM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

I like math-*. And it is math only there, or was last time I checked. It will be what R calls "R-base", and I would welcome no other scope there. All environment things are math. All ML things are math. Quasi-Newton, Bayesian optimizers, linear search are all math. Stats are math. als, (d)ssvd, (d)spca, (d)als are all math.

Non-math is perhaps an app server like R Shiny; if we ever get there, that definitely deserves a module. But other than that, what else are we talking about here?

Renaming artifacts is confusing to hands-on people (me included). It is the reason why I lost my way in modern Hadoop dependencies.

I am good with all the "oneiric ocelot" and the rest of the fancy animal kingdom names.



Re: naming

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
I like math-*. And it is math only there, or was last time I checked. It
will be what R calls "R-base", and I would welcome no other scope there.
All environment things are math. All ML things are math. Quasi-Newton,
Bayesian optimizers, linear search are all math. Stats are math. als,
(d)ssvd, (d)spca, (d)als are all math.

Non-math is perhaps an app server like R Shiny; if we ever get there, that
definitely deserves a module. But other than that, what else are we talking
about here?

Renaming artifacts is confusing to hands-on people (me included). It is
the reason why I lost my way in modern Hadoop dependencies.

I am good with all the "oneiric ocelot" and the rest of the fancy animal
kingdom names.

On Sun, Mar 22, 2015 at 9:34 AM, Pat Ferrel <pa...@occamsmachete.com> wrote:

> The primary artifacts needed/created for the new Scala stuff are math,
> math-scala, spark, spark-shell, and h2o.
>
> Might it be better to rename math-scala to core-scala (or just core), since
> it does now and will increasingly include non-math? This is where
> engine-neutral stuff goes, and so it is a core dependency.
>
> Also thinking about a name for the whole Mahout Scala environment: DSL,
> shell, optimized linear algebra, Bayesian ops, stats, yada yada. How about
> (sticking with the Sanskrit theme) "him tendua" or maybe just "tendua",
> which is Sanskrit for mountain leopard or panther? A nice word, easily found
> in searches, that has good connotations. Besides, the kits are seriously
> cute.