Posted to user@mahout.apache.org by "Edward J. Yoon" <ed...@apache.org> on 2009/08/03 12:11:26 UTC

Re: FYI X-RIME: Hadoop based large scale social network analysis released

That's really cool. BTW, have you tried these algorithms in a
distributed environment?

On Mon, Aug 3, 2009 at 6:33 PM, Bin Cai<ca...@gmail.com> wrote:
> *X-RIME (http://xrime.sourceforge.net/): Hadoop based large scale social
> network analysis*
>
> *Motivation*
> Today's telecom service providers and Internet-based social network sites
> possess huge user communities. They hold large amounts of data about their
> users and want to build core competencies from that data. A key enabler for
> this is a cost efficient solution for social data management and social
> network analysis (SNA).
>
> Such a solution faces a few challenges. The most important one is that the
> solution should be able to handle massive and heterogeneous data sets.
> Given this challenge, traditional data warehouse based solutions are
> usually not cost efficient enough. On the other hand, existing SNA tools are
> mostly used in single workstation mode and are not scalable enough. To
> address this, low cost and highly scalable data management and processing
> technologies from the cloud computing community should be brought in to help.
>
> However, most existing cloud based data analysis solutions try to
> provide SQL-like general purpose query languages and do not directly
> support social network analysis. This makes them hard to optimize and hard
> to use for SNA users. So, we came up with X-RIME to fill this gap.
>
> So, briefly speaking, X-RIME aims to provide a few value-added layers on
> top of existing cloud infrastructure, to support smart decision loops based
> on massive data sets and SNA. To end users, X-RIME is a library consisting of
> Map-Reduce programs, which are used for raw data pre-processing,
> transformation, calculation of SNA metrics and structures, and graph / network
> visualization. The library could be integrated with other Hadoop based data
> warehouses (e.g., HIVE) to build more comprehensive solutions.
>
> *Currently Supported SNA Metrics and Structures*
> vertex degree statistics
> weakly connected components (WCC)
> strongly connected components (SCC)
> bi-connected components (BCC)
> ego-centric density
> breadth-first search / single source shortest path (BFS/SSSP)
> K-core
> maximal cliques
> PageRank
> hyperlink-induced topic search (HITS)
> minimum spanning tree (MST)
>
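
For readers new to the idea, here is a rough sketch of how the simplest
metric in the list, vertex degree, can be phrased as a map step and a reduce
step. It is plain Python running in one process, not Hadoop and not X-RIME's
actual code or API; the function names (map_edge, reduce_degree,
vertex_degrees) and the sample edge list are made up for illustration, and
the graph is assumed to be undirected.

```python
from collections import defaultdict


def map_edge(edge):
    """Map step (hypothetical): emit (vertex, 1) for each endpoint of an edge."""
    u, v = edge
    yield u, 1
    yield v, 1


def reduce_degree(vertex, counts):
    """Reduce step (hypothetical): sum the emitted counts for one vertex."""
    return vertex, sum(counts)


def vertex_degrees(edges):
    """Simulate the shuffle phase in memory by grouping map output by key."""
    groups = defaultdict(list)
    for edge in edges:
        for vertex, one in map_edge(edge):
            groups[vertex].append(one)
    return dict(reduce_degree(v, c) for v, c in groups.items())


# Made-up sample graph: a triangle a-b-c plus a pendant vertex d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
degrees = vertex_degrees(edges)
print(degrees)
```

On a real cluster the grouping would be done by the framework's shuffle, and
the edge list would be read from distributed storage rather than a Python
list.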



-- 
Best Regards, Edward J. Yoon @ NHN, corp.
edwardyoon@apache.org
http://blog.udanax.org