Hidden Markov Models of Bioinformatics

List Price: $41.00
Your Price: $35.85
Reviews


Rating: 5 stars
Summary: Good material, but you really have to want it.
Review: The book gives outstanding coverage of all that goes into building HMMs - one of the most important tools in genome analysis and structure prediction. It covers the field in extreme depth. More depth, in fact, than needed for building useful HMM systems. It not only presents the forward and backward algorithms leading up to Baum-Welch, it presents all the extras - convergence, etc.
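
As a concrete illustration of what the forward algorithm mentioned above computes, here is a minimal sketch (my own, not code from the book) for a toy two-state HMM over the DNA alphabet; all state names and probabilities are made-up placeholders.

```python
import numpy as np

states = ["AT-rich", "GC-rich"]            # hypothetical hidden states
symbols = {"A": 0, "C": 1, "G": 2, "T": 3}

pi = np.array([0.5, 0.5])                  # initial state distribution
A = np.array([[0.9, 0.1],                  # transition probabilities a_ij
              [0.2, 0.8]])
B = np.array([[0.35, 0.15, 0.15, 0.35],    # emission probabilities b_i(x)
              [0.15, 0.35, 0.35, 0.15]])

def forward(seq):
    """Return P(seq) under the toy HMM via the forward recursion."""
    obs = [symbols[c] for c in seq]
    alpha = pi * B[:, obs[0]]              # alpha_1(i) = pi_i * b_i(x_1)
    for x in obs[1:]:
        alpha = (alpha @ A) * B[:, x]      # alpha_t(j) = sum_i alpha_{t-1}(i) a_ij b_j(x_t)
        # Real implementations rescale alpha here (or work in log space)
        # to avoid numerical underflow on long sequences.
    return alpha.sum()

print(forward("GGCACTGAA"))
```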

This additional depth of coverage may go beyond many readers' needs. It is very helpful, though, for people who need more than the usual algorithms. By giving the background in such detail, a persistent reader can follow to a certain point, then create modifications with a clear idea of where the new algorithm actually comes from.

Regarding current practice in HMM usage, I found the coverage a bit thin. Widely known tools based on HMMs are mentioned only occasionally and in passing, and HMM-based alignment is discussed only briefly. Well, this book isn't for the tool user. Perhaps more importantly, I found scant mention of scoring with respect to some background probability model (the "null" model, as it's called here).
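
For readers unfamiliar with the idea, scoring against a null model usually means a log-odds ratio; a small sketch follows, with made-up probabilities purely for illustration.

```python
import math

def log_odds(p_model, p_null):
    """log2 of P(sequence | model) / P(sequence | null), in bits."""
    return math.log2(p_model / p_null)

# e.g. an HMM assigns a candidate sequence probability 1e-12 while a simple
# i.i.d. background ("null") model assigns it 1e-15:
print(log_odds(1e-12, 1e-15))   # about 10 bits in favour of the HMM
```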

My one real complaint, and this is truly minor, is the quality of the illustrations. The line drawings look like Word pictures - not necessarily a bad thing, if done well. These aren't particularly professional-looking, though, and are oddly stretched or squashed in many cases. Still, they're readable enough and make all the needed points.

A lesser point, and not the author's fault, is the editorial implication that this book introduces probabilistic models in general. It does not. It is strictly about HMMs, not Bayesian nets, bootstrap techniques, or any of the dozens of other probabilistic models used in bioinformatics. That is not a flaw of the book, just a flaw in how it's represented.

If you are dedicated to becoming an expert in HMM construction and application, you must have this book. It's a bit much, though, for people who just want the results that HMMs give.

Rating: 4 stars
Summary: Primarily for bio-mathematicians
Review: The field of computational biology has expanded greatly in the last decade, mainly due to the increasing role of bioinformatics in the genome sequencing projects. This book outlines a particular class of models, called hidden Markov models, that are used frequently in genetic sequence search routines. The book is primarily for mathematicians who want to move into bioinformatics, but it could be read by a biologist with a strong mathematical background. The book is detailed in some places, sparse in others, and reads like a literature survey at times, but many references are given, and there are very interesting exercises at the end of each chapter section. In fact, it is really imperative that the reader work some of these exercises, as the author proves some of the results in the main body of the text via the exercises.

Some of the highlights of the book include:

1. An overview of the probability theory used in the book. The material is fairly standard, including a review of continuous and discrete random variables from the measure-theoretic point of view, i.e., the author introduces them via a probability space, which is a set together with a sigma field and a probability measure on that field. The weight matrix, or "profile" as it is sometimes called, is defined; it has many applications in bioinformatics. Bayesian learning is also discussed, and the author introduces what he calls the "missing information principle", which is fundamental to the probabilistic modeling of biological sequences. Applications of probability theory to DNA analysis are discussed, including shotgun assembly and the distribution of fragment lengths from restriction digests. A collection of interesting exercises is included at the end of the chapter, particularly the one on the null model for pairwise alignments.

2. An introduction to information theory and the relative entropy or "Kullback distance", the latter of which is used to learn sequence models from data. The author defines the mutual information between two probability distributions and the entropy, and calculates the latter for random DNA (a small numerical sketch of these quantities follows this list). He also proves some of the Shannon source coding theorems, one being the convergence to the entropy for independent, identically distributed random variables. The Kullback distance is then defined as a distance between probability distributions, with the caution that it is not a metric because it lacks symmetry.

3. The overview of probabilistic learning theory, where 'learning from data' is defined as the process of inferring a general principle from observations of instances.

4. The very detailed treatment of the EM algorithm, including the discussion of a model for fragments with motifs.

5. The discussion of alignment and scoring, especially that of global similarity. Local alignment is treated in the exercises.

6. The discussion of learning Markov chains via Bayesian modeling applied to a training sequence via a family of Markov models. Frame-dependent Markov chains are discussed in the context of Markovian models for DNA sequences.

7. The discussion of influence diagrams and nonstandard hidden Markov models, in particular the excellent diagrams drawn to illustrate the main properties; an excellent discussion is also given of an "HMM with duration" in the context of the functional units of a eukaryotic gene. This is important in the GeneMark.hmm software.

8. The treatment of motif-based HMMs, in particular the discussion of the approximate common substring problem.

9. The discussion of the "quasi-stationary" property of some chains and the connection with the "Yaglom limit".

10. The treatment of Derin's formula for the smoothing posterior probability of a standard HMM. The author shows in detail that the probability of a finite-length emitted sequence conditioned on a state sequence of the HMM depends only on a subsequence of the state sequence.

11. The treatment of the lumping of Markov chains, i.e., the question of whether a function of a Markov chain is itself a Markov chain.

12. The very detailed treatment of the Forward-Backward algorithm and the Viterbi algorithm.

13. The discussion of the learning problem via the quasi-log-likelihood function for HMMs.

14. The discussion of the limit points of the Baum-Welch algorithm.
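
The following is a small numerical sketch (mine, not the book's) of the entropy of random DNA and the relative entropy between two base compositions, as referenced in item 2 above; both distributions are illustrative placeholders.

```python
import math

def entropy(p):
    """Shannon entropy in bits: H(p) = -sum_i p_i log2 p_i."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def relative_entropy(p, q):
    """Kullback distance D(p || q) = sum_i p_i log2(p_i / q_i)."""
    return sum(px * math.log2(px / qx) for px, qx in zip(p, q) if px > 0)

uniform = [0.25, 0.25, 0.25, 0.25]        # random DNA: equal base frequencies
gc_rich = [0.15, 0.35, 0.35, 0.15]        # a hypothetical GC-rich composition

print(entropy(uniform))                    # 2.0 bits per base
print(relative_entropy(gc_rich, uniform))  # > 0
print(relative_entropy(uniform, gc_rich))  # differs: not symmetric, not a metric
```
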
Since the Baum-Welch algorithm deals with iterations of a map, its convergence can be proved by finding the fixed points of this map. These fixed points are in fact the stationary points of the likelihood function and can be related to the convergence of the algorithm via the Zangwill theory of algorithms. Unfortunately, the author does not give the details of the Zangwill theory, but instead relegates it to the references (via an exercise). The Zangwill theory can be discussed in the context of nonlinear programming, with generalizations of it occurring in the field of nonlinear functional analysis. It might be interesting to investigate whether the properties of hidden Markov models, especially their rigorous statistical properties, can all be discussed in the context of nonlinear functional analysis.
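
In practice, convergence of such an iterative re-estimation scheme is usually checked numerically by running the update map until a fixed point is (approximately) reached; the sketch below is schematic, with a toy stand-in update function rather than a real Baum-Welch step.

```python
import numpy as np

def iterate_to_fixed_point(update, theta0, tol=1e-8, max_iter=1000):
    """Apply theta <- update(theta) until the parameters stop moving."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        new_theta = update(theta)
        if np.max(np.abs(new_theta - theta)) < tol:   # approximate fixed point
            return new_theta
        theta = new_theta
    return theta

# Toy stand-in: the map x -> (x + 2/x) / 2 has sqrt(2) as its fixed point.
print(iterate_to_fixed_point(lambda x: (x + 2.0 / x) / 2.0, [1.0]))
```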

Rating: 2 stars
Summary: Written by a mathematician for mathematicians
Review: The intended audience of this book is mathematicians. To understand it, you should have prior coursework in at least several upper-division undergraduate courses in mathematical statistics and probability theory. The structure of this book is also that of a typical math book: full of propositions, corollaries, lemmas, etc., with very limited use of illustrations (e.g., there is not a single figure up to chapter 6).

I wanted a book with mathematical sophistication similar to Durbin's book, but this book goes well beyond that. On the other hand, I showed it to a mathematics graduate student and she said it is perfect for her. So I guess this book is written by a mathematician only for mathematicians.


