Posted to java-user@lucene.apache.org by Bob Carpenter <ca...@alias-i.com> on 2008/06/12 23:24:07 UTC

Book: Building Search Applications: Lucene, LingPipe and Gate

Manu Konchady's book on building search applications
is out:

Konchady, Manu. 2008. Building Search Applications: Lucene, LingPipe,
and Gate. Mustru Publishing.

It's available from Amazon:
http://www.amazon.com/Building-Search-Applications-Lucene-Lingpipe/dp/0615204252/

The book's a gentle introduction to enterprise and
web search, focusing on the three tools in the title
(disclaimer: I wrote most of LingPipe).

The target audience is Java programmers who are new to
search and text analytics.  As such, it provides step-by-step
code-based explanations in the mold of a Manning "in Action"
book.

I read the draft, have a copy of the book right here,
and can vouch for its technical accuracy (disclaimer 2:
I didn't actually run the code).

It's based on Lucene 2.3.

Here's a chapter-by-chapter overview
of topics:

After (1) a brief discussion of application issues,
the chapters include (2) tokenization in all three frameworks,
(3) indexing with Lucene, (4) searching with Lucene,
(5) sentence extraction, part-of-speech tagging,
interesting/significant phrase extraction, and
entity extraction with LingPipe and Gate, (6) clustering
with LingPipe, (7) topic and language classification
with LingPipe, (8) enterprise and web search,
page rank/authority calculation, and crawling with
Nutch, (9) tracking news, sentiment analysis with
LingPipe, detecting offensive content and plagiarism,
and finally, (10) future directions including vertical
search, tag-based search and question-answering.

That may sound like a whole lot of ground to cover in
400 pages, but Konchady pulls the reader along by illustrating
everything with working code and not getting bogged down in
technicalities. There are pointers to theory, and a bit of
math where necessary, but the book never loses sight of its
goal of providing a practical introduction. In that way,
it’s like the Manning "in Action" series.

About the author: Manu Konchady has a home page/blog on Amazon:

http://www.amazon.com/gp/blog/A2TWRNMTU6T9TW/ref=cm_blog_dp_artist_blog

- Bob Carpenter
   Alias-i

PS:  If you want a theory book on roughly the same
selection of topics, check this out:

Manning, Raghavan and Schuetze.  2008.
Introduction to Information Retrieval.
Cambridge University Press.

It's due out in July; the content will remain
free online in PDF form:

http://www-csli.stanford.edu/~hinrich/information-retrieval-book.html
