Posted to user@mahout.apache.org by abhishek kumar <ab...@itbhu.ac.in> on 2014/01/10 09:12:35 UTC

Integrating browser with apache mahout !

Hi,

I'm new to Apache Mahout. I'm working on topic modelling (particularly
LDA), and I have learnt that Mahout has various modules for machine learning
and topic modelling. I want to use its capabilities in browsing to
recommend sites (or pre-download desired web pages based on the topics of URLs).

I have no idea how it can be incorporated into or attached to a
browser. Please help me with this project; any suggestions
would be helpful.


Aks

Re: Integrating browser with apache mahout !

Posted by Piero Giacomelli <pg...@gmail.com>.
On 10/01/2014 09:12, abhishek kumar wrote:
> Hi,
>
> I'm new to Apache Mahout. I'm working on topic modelling (particularly
> LDA), and I have learnt that Mahout has various modules for machine learning
> and topic modelling. I want to use its capabilities in browsing to
> recommend sites (or pre-download desired web pages based on the topics of URLs).
>
> I have no idea how it can be incorporated into or attached to a
> browser. Please help me with this project; any suggestions
> would be helpful.
>
>
> Aks
>

Could you please be more specific?
What is your final purpose? Do you have a use case?



-- 
Piero Giacomelli, Italia
phone:+39 34 71 02 42 95
e-mail: pgiacome@gmail.com
skype: pgiacome
my books
--------------------------------------------------------------------------------------------------
Apache Mahout Cookbook <http://www.packtpub.com/apache-mahout-cookbook/book>
HornetQ Messaging Developer's Guide
--------------------------------------------------------------------------------------------------

Re: Integrating browser with apache mahout !

Posted by Ted Dunning <te...@gmail.com>.
Generally, you should get all the data you can that is reasonably likely to
relate to what people are interested in.

As far as integrating different kinds of information, see my talk from
Buzzwords last year on multi-modal recommendation.

http://www.slideshare.net/tdunning/buzz-wordsdunningmultimodalrecommendation
https://www.youtube.com/watch?v=fWR1T2pY08Y



On Wed, Jan 15, 2014 at 12:27 AM, abhishek kumar <
abhishek.kumar.cse10@itbhu.ac.in> wrote:

> Thanks Ted, it was very helpful. Though I still have some doubts:
>
> The click log would contain URLs and user engagement as you suggested;
> how can we use this data to build a recommender engine in Mahout?
>
> I started working with topic modelling (I know Mahout has LDA); can it be
> used for recommendation?
>
> I only want to recommend web pages rather than build a product-based
> recommender system; do I need to record user engagement for that as well?
>
>

Re: Integrating browser with apache mahout !

Posted by abhishek kumar <ab...@itbhu.ac.in>.
Thanks Ted, it was very helpful. Though I still have some doubts:

The click log would contain URLs and user engagement as you suggested;
how can we use this data to build a recommender engine in Mahout?

I started working with topic modelling (I know Mahout has LDA); can it be
used for recommendation?

I only want to recommend web pages rather than build a product-based
recommender system; do I need to record user engagement for that as well?


On Wed, Jan 15, 2014 at 4:06 AM, Ted Dunning <te...@gmail.com> wrote:

> Manuel is close to what I meant.
>
> Just taking click logs is a good start, but you also have to look at the
> user experience and decide which actions that people take indicate
> engagement.
>
> For example, if you look at Amazon, I would say that if you scroll down the
> page to see the related items and then scroll down to see the reviews,
> these are both signs of engagement with a product.  This engagement is
> distinctly better than just loading a page or clicking on a product link.
>
> Neither of these actions would appear in a click log since neither involves
> a click.  It is quite plausible that these actions would have different
> weights for recommendations as well, and that for one product reviews might
> be important while for another related products might be more important.
> You need both and you need to remember the difference, especially before
> you know if these actions are important or distinct.
>
> You can cause these actions to be logged by using JavaScript that executes
> in the browser.  Focus events and scroll events are both candidate methods
> for finding out more about what the user is doing.
>
>
>
>
>

Re: Integrating browser with apache mahout !

Posted by Ted Dunning <te...@gmail.com>.
Manuel is close to what I meant.

Just taking click logs is a good start, but you also have to look at the
user experience and decide which actions that people take indicate
engagement.

For example, if you look at Amazon, I would say that if you scroll down the
page to see the related items and then scroll down to see the reviews,
these are both signs of engagement with a product.  This engagement is
distinctly better than just loading a page or clicking on a product link.

Neither of these actions would appear in a click log since neither involves
a click.  It is quite plausible that these actions would have different
weights for recommendations as well, and that for one product reviews might
be important while for another related products might be more important.
You need both and you need to remember the difference, especially before
you know if these actions are important or distinct.

You can cause these actions to be logged by using JavaScript that executes
in the browser.  Focus events and scroll events are both candidate methods
for finding out more about what the user is doing.
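
A minimal sketch of how such logged actions might be aggregated on the server
side, assuming each event arrives as a (user, item, action) tuple. The action
names, the weights, and the function names are hypothetical placeholders, not
anything Mahout defines. The per-action counts are kept separate first, so the
difference between actions is not lost before it is known which ones matter.

```python
from collections import Counter, defaultdict

# Hypothetical weights: how much each action type counts as engagement.
# These numbers are placeholders to be tuned, not recommended values.
ACTION_WEIGHTS = {"page_load": 1.0, "scroll_related": 3.0, "scroll_reviews": 3.0}


def count_actions(events):
    """Count events per (user, item), keeping each action type separate."""
    counts = defaultdict(Counter)
    for user, item, action in events:
        counts[(user, item)][action] += 1
    return counts


def engagement_scores(counts, weights=ACTION_WEIGHTS):
    """Collapse per-action counts into one weighted score per (user, item)."""
    return {
        key: sum(weights.get(action, 0.0) * n for action, n in actions.items())
        for key, actions in counts.items()
    }


# Made-up events for illustration only.
events = [
    ("u1", "/product/42", "page_load"),
    ("u1", "/product/42", "scroll_reviews"),
    ("u1", "/product/7", "page_load"),
    ("u2", "/product/42", "page_load"),
    ("u2", "/product/42", "scroll_related"),
    ("u2", "/product/42", "scroll_reviews"),
]
counts = count_actions(events)
scores = engagement_scores(counts)
```

The weighted scores could then be written out as user, item, value rows for a
recommender; the exact input layout a given Mahout job expects should be
checked against its documentation.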





On Tue, Jan 14, 2014 at 7:47 AM, Manuel Blechschmidt <
Manuel.Blechschmidt@gmx.de> wrote:

> Hi Abhishek,
>
> On 14.01.2014, at 16:24, abhishek kumar wrote:
>
> > Hi Ted,
> >
> > I'm new to Mahout, so I don't know how browser plugins can be written to
> > incorporate Mahout. Could you please explain in detail?
>
> Plugins for browsers have nothing to do with recommendations. Ted means
> that you take a click log, such as the access log of the Apache web server,
> and use this as the basis for a recommender system.
>
> Then you have to write a web application that serves the recommendations
> to the users.
>
> Here is a project that gives you all the details:
> https://github.com/ManuelB/facebook-recommender-demo
>
> The project is based on Java EE technology. Java EE and recommendations
> are both very complex topics; you can easily spend 10 years on both
> and still not discover all the details.
>
> I would recommend that you search for Mahout on YouTube and SlideShare to
> get some more details.
>
> > Thanks in advance.
> >
> > Regards
> >
> >

Re: Integrating browser with apache mahout !

Posted by Manuel Blechschmidt <Ma...@gmx.de>.
Hi Abhishek,

On 14.01.2014, at 16:24, abhishek kumar wrote:

> Hi Ted,
> 
> I'm new to Mahout, so I don't know how browser plugins can be written to
> incorporate Mahout. Could you please explain in detail?

Plugins for browsers have nothing to do with recommendations. Ted means that you take a click log, such as the access log of the Apache web server, and use this as the basis for a recommender system.
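
A minimal sketch of that first step, assuming the click log uses the Apache common/combined log format. The function name `parse_access_log`, the use of the client host as a stand-in for a user ID, and the sample lines are assumptions for illustration; none of this comes from Mahout or from the demo project.

```python
import re
from collections import defaultdict

# Matches the start of an Apache common/combined log line:
# host ident user [time] "METHOD path PROTOCOL" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_access_log(lines):
    """Return {client_host: [path, ...]} for successful GET requests.

    Assumption: the client host is used as a crude user identifier;
    a real system would use a login or cookie ID instead.
    """
    visits = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        if match["method"] != "GET" or not match["status"].startswith("2"):
            continue  # keep only successful page loads
        visits[match["host"]].append(match["path"])
    return dict(visits)


# Made-up log lines for illustration only.
sample = [
    '10.0.0.1 - - [14/Jan/2014:16:24:00 +0100] "GET /news/mahout HTTP/1.1" 200 5120',
    '10.0.0.1 - - [14/Jan/2014:16:25:10 +0100] "GET /docs/lda HTTP/1.1" 200 2048',
    '10.0.0.2 - - [14/Jan/2014:16:26:00 +0100] "GET /missing HTTP/1.1" 404 312',
    '10.0.0.2 - - [14/Jan/2014:16:27:30 +0100] "POST /login HTTP/1.1" 200 128',
    'not a log line',
]
visits = parse_access_log(sample)
```

The resulting per-user lists of visited paths are the kind of raw interaction data a recommender can be built on.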

Then you have to write a web application that serves the recommendations to the users.

Here is a project that gives you all the details:
https://github.com/ManuelB/facebook-recommender-demo

The project is based on Java EE technology. Java EE and recommendations are both very complex topics; you can easily spend 10 years on both and still not discover all the details.

I would recommend that you search for Mahout on YouTube and SlideShare to get some more details.

> Thanks in advance.
> 
> Regards
> 
> 

-- 
Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
Twitter: http://twitter.com/Manuel_B


Re: Integrating browser with apache mahout !

Posted by abhishek kumar <ab...@itbhu.ac.in>.
Hi Ted,

I'm new to Mahout, so I don't know how browser plugins can be written to
incorporate Mahout. Could you please explain in detail?
Thanks in advance.

Regards


On Mon, Jan 13, 2014 at 1:17 AM, Ted Dunning <te...@gmail.com> wrote:

> On Sat, Jan 11, 2014 at 11:44 PM, Abhishek Kumar <
> abhishek.kumar.cse10@iitbhu.ac.in> wrote:
>
> > For this I need to somehow integrate Apache Mahout with a browser. I also
> > need to train my model on some server or database and then pipeline it to the
> > client.
> >
> > Please help if you have any suggestions.
> >
>
> Use log files.  You can build a browser plugin that sends recent URLs to
> an analytics server.
>
> If you are just working with friendly users, you might even just use a
> proxy server to get these logs.
>

Re: Integrating browser with apache mahout !

Posted by Ted Dunning <te...@gmail.com>.
On Sat, Jan 11, 2014 at 11:44 PM, Abhishek Kumar <
abhishek.kumar.cse10@iitbhu.ac.in> wrote:

> For this I need to somehow integrate Apache Mahout with a browser. I also
> need to train my model on some server or database and then pipeline it to the
> client.
>
> Please help if you have any suggestions.
>

Use log files.  You can build a browser plugin that sends recent URLs to
an analytics server.

If you are just working with friendly users, you might even just use a
proxy server to get these logs.
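
A minimal sketch of the receiving side of such a setup, using only the Python
standard library. It assumes the plugin POSTs a small JSON body of the form
{"user": ..., "url": ...}; that body shape, the port, and all function names
are assumptions for illustration, and the log is a plain tab-separated file.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def append_visit(log_path, user_id, url):
    """Append one visit as a tab-separated line to the log file."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{user_id}\t{url}\n")


def make_handler(log_path):
    """Build a request handler class that logs POSTed visits to log_path."""

    class VisitHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            try:
                # Assumed body shape: {"user": "...", "url": "..."}
                payload = json.loads(self.rfile.read(length))
                append_visit(log_path, payload["user"], payload["url"])
            except (ValueError, KeyError, TypeError):
                self.send_response(400)  # malformed body
            else:
                self.send_response(204)  # logged, nothing to return
            self.end_headers()

    return VisitHandler


def make_server(log_path, port=8080):
    """Create (but do not start) the server; call serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), make_handler(log_path))
```

Calling `make_server("visits.log").serve_forever()` would start collecting
visits locally; anything beyond a small group of friendly users would need
authentication and consent handling, which this sketch leaves out.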

Re: Integrating browser with apache mahout !

Posted by Abhishek Kumar <ab...@iitbhu.ac.in>.
Hi,


I'm trying to map each URL visited by a user in a particular time period or
location (context) to topics through topic modelling. (I'm planning to use the
text content of web pages for now; later I plan to incorporate metadata and
hyperlinks for better modelling.)

For each particular context I will then get an average topic structure.

I then want to use the same trained model to map all the URLs in the user's
history to their corresponding topic structures. I then want to pre-download
all the web pages that are most similar to the current context.
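
A minimal sketch of this ranking step, assuming each page has already been
mapped to a topic distribution by a trained topic model (for example LDA). The
vectors, URLs, and function names below are made up for illustration, and
cosine similarity is one possible choice of similarity measure, not the only
one.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equally sized topic-weight vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_by_context(context_vectors, history):
    """Sort history URLs by similarity to the average context topic vector."""
    profile = mean_vector(context_vectors)
    return sorted(history, key=lambda url: cosine(profile, history[url]), reverse=True)


# Made-up topic distributions over three topics.
context = [[0.8, 0.1, 0.1], [0.6, 0.3, 0.1]]
history = {
    "http://example.org/sports": [0.1, 0.1, 0.8],
    "http://example.org/ml": [0.7, 0.2, 0.1],
    "http://example.org/cooking": [0.2, 0.7, 0.1],
}
ranked = rank_by_context(context, history)
```

The pages at the front of `ranked` would be the candidates to pre-download for
the current context.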

For this I need to somehow integrate Apache Mahout with a browser. I also need
to train my model on some server or database and then pipeline it to the
client.

Please help if you have any suggestions.

Regards
Aks 





Re: Integrating browser with apache mahout !

Posted by abhishek kumar <ab...@itbhu.ac.in>.
Hi Tharindu Rusira,

Thanks for replying.

Yes, I'm extracting topics from the text content of the page (though I'm also
working on how to incorporate metadata and links for better
modelling).

Actually, I'm trying to pre-download those pages which are most similar to
the URLs visited by users in a time period (context). For this I
want to use only those URLs that are in the user's history (usually
recorded by browsers). I also want to train my model on some server or
large database and pipeline it to the client. That is why I need some web
interface or browser to work with.

Please ask if you still have any doubts.

Aks


On Fri, Jan 10, 2014 at 2:01 PM, Tharindu Rusira
<th...@gmail.com> wrote:

> On Fri, Jan 10, 2014 at 1:42 PM, abhishek kumar <
> abhishek.kumar.cse10@itbhu.ac.in> wrote:
>
> > Hi ,
> >
> Hi Abishek,
>
> >
> > I'm new to apache mahout. I'm working in topic modelling (particularly
> > LDA), I have learnt that Mahout has various modules for machine learning
> > and topic modelling
>
> Yes, Mahout has a topic modelling component that implements LDA
> (org.apache.mahout.clustering.lda.cvb), but I have not worked with it yet.
>
>
>
> > .I want to use it's capabilities in browsing to
> > recommend sites (or predownload desired webpages based on topics of
> url's).
> >
> Just out of curiosity, why do you want to extract topics from URLs and not
> from the content of the page?
>
> >
> > I don't have idea how it can be incorporated or attached to any
> > browser.
>
> You want a web interface, don't you?
>
>
> > Please help me in this project also if you have any suggestions it
> > will be helpful.
> >
> >
> > Aks
> >
> Regards,
>
>
>
> --
> M.P. Tharindu Rusira Kumara
>
> Department of Computer Science and Engineering,
> University of Moratuwa,
> Sri Lanka.
> +94757033733
> www.tharindu-rusira.blogspot.com
>

Re: Integrating browser with apache mahout !

Posted by Константин Слисенко <ks...@gmail.com>.
If you want to visualize your results, you can try Neo4j, an open source graph
database. It has a built-in graph explorer available in a web browser. I use
it for storing Mahout results data and some visualizations.
You can view my example here (database with my data):
http://50.16.193.54:7474/webadmin/#/data/search/9192/ Click the "Switch
view mode" icon to view the visualized graph.



Re: Integrating browser with apache mahout !

Posted by Tharindu Rusira <th...@gmail.com>.
On Fri, Jan 10, 2014 at 1:42 PM, abhishek kumar <
abhishek.kumar.cse10@itbhu.ac.in> wrote:

> Hi ,
>
Hi Abishek,

>
> I'm new to Apache Mahout. I'm working on topic modelling (particularly
> LDA), and I have learnt that Mahout has various modules for machine learning
> and topic modelling.

Yes, Mahout has a topic modelling component that implements LDA
(org.apache.mahout.clustering.lda.cvb), but I have not worked with it yet.



> I want to use its capabilities in browsing to
> recommend sites (or pre-download desired web pages based on the topics of URLs).
>
Just out of curiosity, why do you want to extract topics from URLs and not
from the content of the page?

>
> I have no idea how it can be incorporated into or attached to a
> browser.

You want a web interface, don't you?


> Please help me with this project; any suggestions
> would be helpful.
>
>
> Aks
>
Regards,



-- 
M.P. Tharindu Rusira Kumara

Department of Computer Science and Engineering,
University of Moratuwa,
Sri Lanka.
+94757033733
www.tharindu-rusira.blogspot.com