Posted to user@mahout.apache.org by Amrhal Lelasm <ar...@hotmail.com> on 2012/04/30 03:36:30 UTC

integrating databases

I had a nice week playing with the Mahout CF library and its MySQLJDBCDataModel to get the input data from a database.
But then this idea occurred to me: I have another server with similar data sets, which runs a different database server, namely Oracle.


I'm wondering how I can combine these two to get the input data for my recommender engine. Do I start by implementing the JDBCDataModel?


I'd appreciate any insight you might have on this.

Re: integrating databases

Posted by Sean Owen <sr...@gmail.com>.
(I think the question is more blending two data sources than two
recommenders.)

On Tue, May 1, 2012 at 9:37 AM, Manuel Blechschmidt <
Manuel.Blechschmidt@gmx.de> wrote:

> Hi Amrhal,
> combining data for a recommender from two data sources is current
> research. Search for ensemble learning or blending learners. Basically the
> combination of multiple learners is called blending.

Re: integrating databases

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hi Amrhal,
combining data for a recommender from two data sources is a current research topic. Search for ensemble learning or blending learners. Basically, the combination of multiple learners is called blending (http://pragmatictheory.blogspot.de/2008/07/blending-101.html). Blending different models is currently one of the most important techniques for getting the high accuracies seen in the Netflix Prize, and for combining offline and online learning methods to do something in semi-real time (Fast Online Learning through Offline Initialization for Time-sensitive Recommendation - http://users.cis.fiu.edu/~lzhen001/activities/KDD_USB_key_2010/docs/p703.pdf).
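
As a rough illustration of the simplest form of blending, the sketch below takes a weighted average of the predictions of several learners. It is not Mahout code: the class BlendSketch, the Predictor interface, and the fixed weights are all hypothetical, and in practice the weights would normally be fitted on held-out data rather than chosen by hand.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: linear blending of predictions from several learners. */
final class BlendSketch {

    /** A learner that predicts a preference value for a (user, item) pair. */
    interface Predictor {
        double predict(long userId, long itemId);
    }

    /**
     * Returns the weighted average of the individual predictions.
     * Assumption: weights are non-negative and their sum is positive.
     */
    static double blend(List<Predictor> predictors, double[] weights, long userId, long itemId) {
        if (predictors.size() != weights.length) {
            throw new IllegalArgumentException("one weight per predictor is required");
        }
        double weightedSum = 0.0;
        double weightTotal = 0.0;
        for (int k = 0; k < weights.length; k++) {
            weightedSum += weights[k] * predictors.get(k).predict(userId, itemId);
            weightTotal += weights[k];
        }
        if (weightTotal <= 0.0) {
            throw new IllegalArgumentException("weights must have a positive sum");
        }
        return weightedSum / weightTotal;
    }

    public static void main(String[] args) {
        // Stand-ins for two trained learners; real ones would look at the data.
        Predictor fromRatings = (u, i) -> 4.0;
        Predictor fromPurchases = (u, i) -> 2.0;

        double score = blend(Arrays.asList(fromRatings, fromPurchases),
                             new double[] {3.0, 1.0}, 1L, 42L);
        System.out.println(score); // (3*4.0 + 1*2.0) / 4 = 3.5
    }
}
```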

I recommend the chapter "Combining Multiple Learners" from Introduction to Machine Learning (http://www.amazon.com/Introduction-Machine-Learning-Adaptive-Computation/dp/026201243X/).

As far as I know, there are currently no ensemble or blending learners in Mahout. Here is a block diagram of what such a data model could look like:
https://source.apaxo.de/svn/semrecsys/trunk/doc/images/SemanticRecommenderDataModel.pdf

In the block diagram you see three data sources: explicit ratings, implicit views, and explicit purchases. This data model should be optimized to sell as many products as possible. A customer can interact with a product in up to three ways: view it, rate it, and buy it. You could call these different contexts and then use the different contexts for learning (see the Workshop on Context-Aware Recommender Systems 2009/2010/2011); for example, you could use tensor factorization.
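
To make the idea of contexts a little more concrete, here is a minimal sketch that tags each interaction with its context and splits a mixed event list by context, so that each group could feed its own learner. The names ContextSketch, Context, and Event are hypothetical and are not taken from Mahout or from the block diagram.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: group user-item interactions by the context they occurred in. */
final class ContextSketch {

    /** The three ways a customer can interact with a product. */
    enum Context { VIEW, RATING, PURCHASE }

    /** One interaction of a user with an item in a given context. */
    static final class Event {
        final long userId;
        final long itemId;
        final Context context;

        Event(long userId, long itemId, Context context) {
            this.userId = userId;
            this.itemId = itemId;
            this.context = context;
        }
    }

    /** Splits the events by context; every context gets a (possibly empty) list. */
    static Map<Context, List<Event>> splitByContext(List<Event> events) {
        Map<Context, List<Event>> byContext = new EnumMap<Context, List<Event>>(Context.class);
        for (Context c : Context.values()) {
            byContext.put(c, new ArrayList<Event>());
        }
        for (Event e : events) {
            byContext.get(e.context).add(e);
        }
        return byContext;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
            new Event(1L, 10L, Context.VIEW),
            new Event(1L, 10L, Context.RATING),
            new Event(1L, 10L, Context.PURCHASE),
            new Event(2L, 10L, Context.VIEW));

        // Prints "VIEW: 2", "RATING: 1", "PURCHASE: 1" (EnumMap iterates in enum order).
        for (Map.Entry<Context, List<Event>> entry : splitByContext(events).entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue().size());
        }
    }
}
```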

Further, you have to differentiate between the data model and the learner (recommender). What I have described is how to make the learner more accurate and how to use data from different sources. If you simply keep the same kind of data for the same context in your Oracle database, then it won't be very complicated to integrate it with the MySQL database.

Have a great week
    Manuel

-- 
Manuel Blechschmidt
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B


Re: integrating databases

Posted by Sean Owen <sr...@gmail.com>.
You would have to write code to manually splice together the data. You'd
also have to figure out what to do with updates -- which DB gets them? Or you
may say you don't have updates.

Querying two DBs and manually combining their results is even slower than
querying one. You're going to have to get all the data into memory anyway. So
instead I'd look at modifying the ReloadFromJDBCDataModel to read and
combine the data in their entirety on a reload. That's not hard.
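
A minimal sketch of the "read everything and combine on reload" step, in plain Java rather than Mahout code. The class MergePreferencesSketch, the Pref row type, and the "second source wins" rule for duplicate (user, item) pairs are all assumptions made for illustration; the two input lists stand in for the rows each JDBC query would return, and the sketch assumes both databases share the same user and item IDs. Converting the merged map into Mahout's own in-memory model is left out.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: merge preference rows read from two databases in memory. */
final class MergePreferencesSketch {

    /** One row of a preference table: (user, item, value). */
    static final class Pref {
        final long userId;
        final long itemId;
        final float value;

        Pref(long userId, long itemId, float value) {
            this.userId = userId;
            this.itemId = itemId;
            this.value = value;
        }
    }

    /**
     * Combines both row sets into userId -> (itemId -> value), sorted by ID.
     * Assumption: if both sources contain the same (user, item) pair,
     * the row from the second source overwrites the first.
     */
    static Map<Long, Map<Long, Float>> merge(List<Pref> first, List<Pref> second) {
        Map<Long, Map<Long, Float>> byUser = new TreeMap<Long, Map<Long, Float>>();
        for (List<Pref> source : Arrays.asList(first, second)) {
            for (Pref p : source) {
                byUser.computeIfAbsent(p.userId, k -> new TreeMap<Long, Float>())
                      .put(p.itemId, p.value);
            }
        }
        return byUser;
    }

    public static void main(String[] args) {
        // Placeholder rows; in practice each list would be filled from one database.
        List<Pref> mysqlRows = Arrays.asList(new Pref(1L, 10L, 4.0f), new Pref(2L, 11L, 3.0f));
        List<Pref> oracleRows = Arrays.asList(new Pref(1L, 10L, 5.0f), new Pref(3L, 12L, 2.0f));

        System.out.println(merge(mysqlRows, oracleRows));
        // prints {1={10=5.0}, 2={11=3.0}, 3={12=2.0}}
    }
}
```

Doing the merge once per reload keeps the recommender working on a single in-memory data set, which matches the point above that all the data has to be in memory anyway. How updates are written back is a separate decision that this sketch does not address.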

On Mon, Apr 30, 2012 at 10:37 PM, Amrhal Lelasm <ar...@hotmail.com> wrote:

>
> Thank you Sean, but what I want is to combine the input data I'm getting
> from two servers before feeding it to the recommender. It's more like
> appending the data of MySQLJDBCDataModel to that of
> MyOracleJDBCDataModel (which I reckon I have to implement), then sorting
> the data from the two database servers, and finally giving it to the
> recommender engine.
>
>

RE: integrating databases

Posted by Amrhal Lelasm <ar...@hotmail.com>.
Thank you Sean, but what I want is to combine the input data I'm getting from two servers before feeding it to the recommender. It's more like appending the data of MySQLJDBCDataModel to that of MyOracleJDBCDataModel (which I reckon I have to implement), then sorting the data from the two database servers, and finally giving it to the recommender engine.


> Date: Mon, 30 Apr 2012 06:38:03 +0100
> Subject: Re: integrating databases
> From: srowen@gmail.com
> To: user@mahout.apache.org
> 
> Can you connect to an Oracle database? Sure, just do so. I think the
> SQL just works, but you'll find out.

Re: integrating databases

Posted by Sean Owen <sr...@gmail.com>.
Can you connect to an Oracle database? Sure, just do so. I think the
SQL just works, but you'll find out.

Re: integrating databases

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Apr 30, 2012 at 1:36 AM, Amrhal Lelasm <ar...@hotmail.com> wrote:

>
> I'm wondering how I can combine these two to get the input data for my
> recommender engine. Do I start by implementing the JDBCDataModel?
>

Yes.


> I'd appreciate any insight you might have on this.


Sounds like you got it.