This is the best book I've ever read on computational linguistics. It should be ideal both for linguists who want to learn about statistical language processing and for those building language applications who want to learn about linguistics. This book isn't even published yet and it's already my most heavily used reference book, joining gems such as Cormen, Leiserson and Rivest's algorithm book, Quirk et al.'s English Grammar, and Andrew Gelman's Bayesian statistics book (three excellent companions to this book, by the way). The book is written more like a computer science or math book in that it starts absolutely from scratch, but moves quickly and assumes a sophisticated reader. The first one hundred or so pages provide background in probability, information theory and linguistics. This book covers (almost) every current trend in NLP from a statistical perspective: syntactic tagging, sense disambiguation, parsing, lexical subcategorization, Hidden Markov Models, and probabilistic context-free grammars. It also covers machine translation and information retrieval in later chapters. It covers all the statistical techniques used in NLP, from Bayes' law through to maximum entropy modeling, clustering, nearest neighbors, decision trees, and much more. What you won't find is information on applications to higher-level discourse and dialogue phenomena like pronoun resolution or speech act classification.
Every computational linguist should work through this book! It is very easy to follow, even without any background in statistics, and it contains everything you need to know to get started in statistical computational linguistics. The explanations always come with clear examples, diagrams, or drawings. There is simply no getting around this book, nor should you try. An absolute recommendation!
My professor chose this book for an undergraduate course in Statistical Natural Language Processing, and as a student I found it to be a great learning tool. It gives sufficient background in statistics and language that people with little background in these areas can get up to speed quickly. Lots of interesting assignments are proposed at the end of each chapter, and while some of the questions are rather vague (particularly, at times, about the data they are referring to), they can be good starting points for further discussion or projects. As a student, I give this book an A+.