You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2007/12/29 10:00:24 UTC
[Lucene-hadoop Wiki] Update of "Hbase/Matrix" by udanax
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.
The following page has been changed by udanax:
http://wiki.apache.org/lucene-hadoop/Hbase/Matrix
The comment on the change is:
I'll re-write after some transformation.
------------------------------------------------------------------------------
- [[TableOfContents(4)]]
+ deleted
- * It's a designing stage.
-
- ----
- == LINA, a Framework for Large-scale Sparse Linear Algebra ==
-
- Using Hbase's Row,Column(Qualifier) two dimensional space, we are able to store large sparse matrix.
- [[BR]]The Auto-partitioned sparsity sub-structure will be efficiently managed and serviced by Hbase.
-
- Row or Column operations can be done in linear time and algorithms such as structured Gaussian elimination
- [[BR]]or iterative methods run in '''O(~-the number of non-zero elements in the matrix-~)''' time, Hbase Providing Both Vertical and Horizontal access.
-
- Therefore, using iterative algorithms on a parallel processing platform like !MapReduce on Hadoop,
- [[BR]]we should be able to implement '''High Performance''' and '''World's Largest''' Matrix Computations.
-
- === Applications by LINA ===
-
- It will support a wide variety of applications in the domain of Physics, Linear Algebra, Relational Algebra,
- [[BR]]Statistics, Graphics Rendering, Computational Dynamics and others.
-
- * Scientific simulation and modeling
- * Matrix-vector/matrix-matrix multiply
- * Soving linear systems
- * Collaborative filtering/recommendation systems
- * Information retrieval
- * Sorting
- * Finding eigenvalues and eigenvectors
- * Computer graphics and computational geometry
- * Matrix multiply
- * computing matrix determinate
-
- === Initial Contributors ===
-
- * [:udanax:Edward Yoon] (R&D center, NHN corp.)
- * I have profited by '''Prof. Samuel Kim''', '''Dr. Yongchan Park'''
-
- == Storing and manipulating numeric, sparse matrices on Hadoop + Hbase ==
-
- To store sparse matrix in hbase, I will use an abstracted table with a single column family, and tune existing partitioning conditions.
- The extendable columnfamily feature combine matrices that have the same row dimension.
-
- === Scheduling Algorithm ===
-
- The scheduling algorithm is designed for driving the parallel execution of the factorization on on a MapReduce model.
-
- NOTE : indirect and irregular, the inefficient data access ....
-
- NOTE : blocking factor........ code generator......
-
- * Needs
- * Total Row/Column Key Count in the matrix table header.
- * Big floating point number, Big integer calculator
-
- {{{
-
- +--------------------+ +------+ | +--+
- | . . | |. | | |. |
- | . . | | .| | | .|
- | | +------+ | +--+
- | | ----> ------------+-----------
- | | +-----+ |+-------+
- | . . . . | |. . | ||. .|
- | | +-----+ |+-------+
- +--------------------+ |
-
- }}}
-
-
- === Sums on the MapReduce Tasks of Hadoop ===
-
- == Examples ==
-
- === Collaborative Filtering Via Ensembles of Matrix Factorizations ===
-
- No work, no grub.
-
- === Fast Page-Rank Computation Via Sparse Linear System ===
-
- No work, no grub.
-
- === Latent Semantic Analysis Via Singular Value Decomposition ===
-
- No work, no grub.
-
- == References ==
-
- * High performance numerical libraries in Java, Bjørn-Ove Heimsund
- * Parallel Conjugate Gradients Assignment, a parallel implementation of the conjugate gradient algorithm
- * ScaLAPACK, a library of high-performance linear algebra routines for distributed-memory message-passing MIMD computers
- * [http://bebop.cs.berkeley.edu/oski/ OSKI], optimized sparse kernel interface (OSKI) library
- * [http://icl.cs.utk.edu/iclprojects/pages/files/sans/yelick-bebop.pdf Automatic Performance Tuning of Sparse Matrix Kernels], August 7, 2002
- * Scheduling algorithms for parallel Gaussian elimination with communication costs, Amoura, A.K.; Bampis, E.; Konig, J.-C.
- * Google's Page-Rank and Beyond: The Science of Search Engine Rankigs, Amy N. Langville; Carl D. Meyer
-