on September 7, 2001
This Data Mining textbook is an excellent resource for teaching and for the practical application of learning algorithms. Students and teachers receive a powerful tool when they use the book together with the corresponding software package WEKA, which is available for free, including the source code. The main benefit of this book and the software lies in the huge variety of learning algorithms that can be applied to your own data. The book sets the context for data mining by looking at the social implications as well as the mathematical aspects. The focus of the book is on symbolic learning methods such as rule sets and decision trees. Neural networks, at least, should have been included as well. The explanations do not go too far into the mathematical details of data mining; the book may therefore be used by less technically oriented students as well as by practitioners who are merely interested in using machine learning.
on February 7, 2000
Witten and Frank have generated a book that is readable without eliminating all technical (yes, even mathematical!) descriptions of the key data mining algorithms. And they are up-to-date, including support vector machines and boosting. There are sufficient examples of the techniques to provide readers with a good feel for what each technique can accomplish. For example, how many books can provide a readable explanation of support vector machines?
There are some quibbles, such as not including any discussion of neural networks (noted in Ch. 1 with another reference)--I believe they deserve some attention because of their widespread use. Additionally, future editions should include at least a brief summary of data preprocessing, input selection, feature creation, etc. But these are quibbles.
The Java portion of the book is not of as much interest to me, but for those wishing to implement the algorithms, it provides a nice blueprint (from the code I looked at).
For what they have undertaken, they have performed admirably, and I would highly recommend this book.
on December 3, 1999
Broad coverage, including hot new topics: SVM, boosting and bagging, modern evaluation methods (ROC and lift curves). Well grounded in practical data mining applications; talks about DM issues outside model building, which are rarely discussed: feature engineering, data cleaning, etc. Clear and well written: illustrative examples help the presentation a lot. Describes in detail decision trees and rule learners, instance-based learning, and numerical prediction. Accompanied by the WEKA system, which implements in Java many of the methods discussed in the book and is available for download for free. An excellent hands-on textbook for an applied Machine Learning/DM class, or recommended reading for anyone who wants to understand DM. Good next step for those who have whetted their appetite with Berry and Linoff's book.
on January 28, 2000
This book is THE best book I have read about data mining. And I have read most of them (see ISBNs: 0070057796, 0471253847, 0262560976, 0201403803, 0471179809, 013743980, 0137564120, 1558605290, 1558604030). It is fresh, clear, well balanced. If your native language is not English, then you should definitely read THIS book first.
The feature that is the most important for me is "just enough statistics". That is, you can understand the processes & descriptions even if you have not wasted your life and youth studying statistics; what you need of it is given briefly and very well. Many other books are too deep or too shallow (like Berry's, which is a good introduction, but nothing more than that).
If the rating scale were 1-6 stars, I'd give this book a 10.
on November 25, 1999
This book is excellent for anyone entering the fields of data mining or machine learning. The material is organised into functions rather than techniques, which promotes a deeper understanding of why different approaches work, when to use them, and how they can be combined to maximise results.
For those already conversant in machine learning, it contains a wealth of practical techniques for improving and analysing results. I expect to use it often in the course of my research.