Posted to common-user@hadoop.apache.org by Jimmy Lin <ji...@umd.edu> on 2010/02/19 17:23:04 UTC

Data-Intensive Text Processing with MapReduce

Hi everyone,

I'm pleased to present the first complete draft of a forthcoming book:

Data-Intensive Text Processing with MapReduce
by Jimmy Lin and Chris Dyer

The complete text is available at:
http://www.umiacs.umd.edu/~jimmylin/book.html

It's slated for publication by Morgan & Claypool in mid-2010.

This text is currently being used in the MapReduce course at the 
University of Maryland.  The focus of the book is on algorithm design 
and "thinking at scale".  Quite explicitly, the book is *not* about 
Hadoop programming.  Tom White's book already does that quite well... :)

Table of Contents

    1. Introduction
    2. MapReduce Basics
    3. MapReduce Algorithm Design
    4. Inverted Indexing for Text Retrieval
    5. Graph Algorithms
    6. EM Algorithms for Text Processing
    7. Closing Remarks

We hope you find this resource helpful... Comments and feedback are welcome!

Best,
Jimmy


Re: Data-Intensive Text Processing with MapReduce

Posted by "Ankur C. Goel" <ga...@yahoo-inc.com>.
Hi Jimmy,
Congratulations on the good work. In chapter 6, it would be good to supplement the EM examples with more pseudocode, as the chapter is quite mathematical in nature.

Regards
-@nkur
