Rating: Summary: Fantastic return on investment Review: There are lots of books (and even more junk email) with titles like "Get Rich Quick". On the surface, this book is the exact opposite: a scholarly, scientific text aimed at comprehensive, accurate description, not at commercial hype. But if someone told me I had to make a million bucks in one year, and I could only refer to one book to do it, I'd grab a copy of this book and start a web text-processing company. Your return on investment might not be $1M, but this book delivers everything it promises. For all the major practical applications of statistical text processing, this book accurately and clearly surveys the major techniques. It often has pretty good advice about which techniques to prefer, but sometimes reads more like a catalog of listings (this reflects not a failing of the authors, but the immaturity of the field). It's worth comparing this book to the other recent NLP text, Jurafsky and Martin's. (Disclaimer: I worked with them on the preparation of their text.) Jurafsky and Martin cover much more ground, including many aspects that are ignored by Manning and Schütze. So if you want a general overview of natural language, if you want to know about the syntax of English, or the intricacies of dialog, then Jurafsky and Martin is for you. But if your needs are more focused on algorithms for lower-level text processing with statistical techniques, then Manning and Schütze is far more comprehensive. If you're a serious student or professional in NLP, you just have to have both.
Rating: Summary: Very definitive, really a must-read Review: This is an important prerequisite to any research or inquiry into this field.
Rating: Summary: Makes a great textbook... Review: This is the best book I've ever read on computational linguistics. It should be ideal both for linguists who want to learn about statistical language processing and for those building language applications who want to learn about linguistics. This book isn't even published yet, and it's already my most highly used reference, joining gems such as Cormen, Leiserson, and Rivest's algorithms book, Quirk et al.'s English grammar, and Andrew Gelman's Bayesian statistics book (three excellent companions to this book, by the way). The book is written more like a computer science or math text in that it starts absolutely from scratch, but it moves quickly and assumes a sophisticated reader. The first hundred or so pages provide background in probability, information theory, and linguistics. The book covers (almost) every current trend in NLP from a statistical perspective: syntactic tagging, sense disambiguation, parsing, lexical subcategorization, Hidden Markov Models, and probabilistic context-free grammars, with machine translation and information retrieval in later chapters. It covers all the statistical techniques used in NLP, from Bayes' law through maximum entropy modeling, clustering, nearest neighbors, decision trees, and much more. What you won't find is information on applications to higher-level discourse and dialogue phenomena like pronoun resolution or speech act classification.