Posted to user@mahout.apache.org by Jay Vyas <ja...@gmail.com> on 2014/02/21 17:01:42 UTC

Adapters for mahout inputs .... anyone working on this?

Hi mahout.  Was thinking about building adapters so it would be easier to
run algorithms on a broader set of input data structures, unless there is
already an initiative to do so:

Any thoughts on this JIRA?
https://issues.apache.org/jira/browse/MAHOUT-1421

-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Jay Vyas <ja...@gmail.com>.
Yes, it will be tricky.  That said, I think we should be able to simply
change the existing parsers to be more flexible, to accommodate at least
slightly more diverse inputs, for example variable columns in CSV.
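
To make the "variable columns in CSV" idea concrete, here is a minimal
sketch in plain Java. It is not taken from Mahout or from MAHOUT-1421:
the class name FlexibleCsvRows and its methods are hypothetical, and it
assumes purely numeric cells with no quoted fields. Short rows are
padded with NaN so that downstream code always sees a rectangular table.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not a Mahout class): parses CSV rows of varying width. */
final class FlexibleCsvRows {
    private FlexibleCsvRows() {}

    /** Parses one comma-separated line into exactly {@code width} cells; blank or missing cells become NaN. */
    static double[] parseLine(String line, int width) {
        // A limit of -1 keeps trailing empty cells, so "1,2," yields three cells.
        String[] cells = line.split(",", -1);
        double[] row = new double[width];
        for (int i = 0; i < width; i++) {
            String cell = i < cells.length ? cells[i].trim() : "";
            row[i] = cell.isEmpty() ? Double.NaN : Double.parseDouble(cell);
        }
        return row;
    }

    /** Pads every row to the widest row seen in the input. */
    static List<double[]> parse(List<String> lines) {
        int width = 0;
        for (String line : lines) {
            width = Math.max(width, line.split(",", -1).length);
        }
        List<double[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(parseLine(line, width));
        }
        return rows;
    }
}
```

Calling FlexibleCsvRows.parse on the lines "1,2,3" and "4,5" gives two
rows of width three, with NaN in the last cell of the second row.
Whether NaN, zero, or a sparse representation is the right filler would
depend on the algorithm consuming the rows.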

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Ted Dunning <te...@gmail.com>.
Robin,

This is a great example of how the problem is a bit harder for Mahout.
Many of the large datasets of interest will not fit in memory.  That means
that mixed memory and disk formats are important in this flow, which also
makes APIs more complicated.
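
As a rough illustration of why out-of-memory data changes the API
shape, the sketch below computes column means in a single pass over a
character stream, keeping only one row in memory at a time. It is plain
Java, not a Mahout API: the class StreamingColumnMeans is hypothetical,
and it assumes comma-separated numeric rows that each have at least
the requested number of columns.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Hypothetical sketch (not a Mahout class): one-pass column means over a row stream. */
final class StreamingColumnMeans {
    private StreamingColumnMeans() {}

    /** Reads rows one at a time; only the running sums are held in memory. */
    static double[] columnMeans(Reader source, int columns) {
        double[] sums = new double[columns];
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                // Assumption: every non-blank row has at least `columns` numeric cells.
                String[] cells = line.split(",");
                for (int i = 0; i < columns; i++) {
                    sums[i] += Double.parseDouble(cells[i].trim());
                }
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Turn the sums into means; an empty input yields NaN for every column.
        for (int i = 0; i < columns; i++) {
            sums[i] = count == 0 ? Double.NaN : sums[i] / count;
        }
        return sums;
    }
}
```

An algorithm written against a Reader (or an iterator of rows) can be
fed from a file, a socket, or an in-memory string alike, whereas one
written against a double[][] cannot. That is one way the memory/disk
distinction leaks into the interface.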




On Sat, Feb 22, 2014 at 10:12 AM, Robin East <ro...@xense.co.uk> wrote:

> Various existing frameworks like R or python/sklearn already have existing
> interfaces so would be good to follow some existing patterns. I personally
> find sklearn's transform/fit/predict pattern very easy to use and
> understand.

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Robin East <ro...@xense.co.uk>.
Frameworks such as R or Python/sklearn already have their own interfaces, so it would be good to follow some existing patterns. I personally find sklearn's transform/fit/predict pattern very easy to use and understand.
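
For readers who have not used it, the fit/transform/predict pattern can
be sketched in Java roughly as follows. These interfaces and classes
are hypothetical, modeled on the sklearn method names only; they are
not an existing Mahout API, and the two implementations are
deliberately trivial.

```java
/** Hypothetical interface modeled on sklearn's naming: learns from data, then rewrites rows. */
interface Transformer {
    Transformer fit(double[][] rows);
    double[][] transform(double[][] rows);
}

/** Hypothetical interface modeled on sklearn's naming: learns from labeled data, then predicts. */
interface Predictor {
    Predictor fit(double[][] rows, double[] targets);
    double predict(double[] row);
}

/** Subtracts the per-column mean learned in fit(). */
final class MeanCenterer implements Transformer {
    private double[] means;

    @Override
    public MeanCenterer fit(double[][] rows) {
        if (rows.length == 0) {
            throw new IllegalArgumentException("need at least one row");
        }
        means = new double[rows[0].length];
        for (double[] row : rows) {
            for (int c = 0; c < means.length; c++) {
                means[c] += row[c] / rows.length;
            }
        }
        return this;
    }

    @Override
    public double[][] transform(double[][] rows) {
        if (means == null) {
            throw new IllegalStateException("call fit() before transform()");
        }
        double[][] out = new double[rows.length][means.length];
        for (int r = 0; r < rows.length; r++) {
            for (int c = 0; c < means.length; c++) {
                out[r][c] = rows[r][c] - means[c];
            }
        }
        return out;
    }
}

/** Trivial baseline: always predicts the mean of the training targets. */
final class MeanPredictor implements Predictor {
    private double mean;

    @Override
    public MeanPredictor fit(double[][] rows, double[] targets) {
        double sum = 0;
        for (double t : targets) {
            sum += t;
        }
        mean = targets.length == 0 ? Double.NaN : sum / targets.length;
        return this;
    }

    @Override
    public double predict(double[] row) {
        return mean; // ignores the input row by design
    }
}
```

Usage then reads as a chain, for example
new MeanCenterer().fit(data).transform(data), which is much of what
makes the pattern easy to pick up. Note that this sketch takes a full
double[][] in memory, so it does not address datasets too large for
that.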

On 22 Feb 2014, at 09:15, Gokhan Capan <gk...@gmail.com> wrote:

> I'm personally positive on this.
> 
> Could you give an example code snippet that shows how the usage is going to be?

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Gokhan Capan <gk...@gmail.com>.
I'm personally positive on this.

Could you give an example code snippet that shows how the usage is going to be?

> On Feb 22, 2014, at 5:37, Jay Vyas <ja...@gmail.com> wrote:
>
> Hi Ted.  Sure, I will take a look.

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Jay Vyas <ja...@gmail.com>.
Hi Ted.  Sure, I will take a look.


On Fri, Feb 21, 2014 at 7:51 PM, Ted Dunning <te...@gmail.com> wrote:

> Great idea.  Hard to do well.
>
> Would it be possible for you to try to build a picture of all the pieces
> that need to be connected before you start building connectors and
> converters?



-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: Adapters for mahout inputs .... anyone working on this?

Posted by Ted Dunning <te...@gmail.com>.
Great idea.  Hard to do well.

Would it be possible for you to try to build a picture of all the pieces
that need to be connected before you start building connectors and
converters?



On Fri, Feb 21, 2014 at 8:01 AM, Jay Vyas <ja...@gmail.com> wrote:

> Hi mahout.  Was thinking about building adapters so it would be easier to
> run algorithms on a broader set of input data structures, unless there is
> already an initiative to do so:
>
> Any thoughts on this JIRA?
> https://issues.apache.org/jira/browse/MAHOUT-1421
>
> --
> Jay Vyas
> http://jayunit100.blogspot.com
>