Mark Tabladillo (MarkTab)
- Published on Amazon.com
This purchase review is based on the Alpha Book (pre-publication electronic version): I will have full access to the production version of the book, at which point I will update this commentary.
Having owned and read many books on data science, I recommend this book for both advanced business analysts and data scientists. In brief, if you plan to use Microsoft Azure Machine Learning for commercial or non-profit purposes, you need to have this authoritative reference available. The cost (either paperback or electronic) is minimal compared to the value you will receive from this book and its recommended resources. This book will guide you in how to make a data science experiment, and the best practices from AzureML, straight from the people who helped spearhead this new cloud-based technology.
This longer review provides details on the book for the purpose of informing a purchase decision. I will review the technology, and end on a description of my background.
This book is valuable to advanced business analysts since it does not assume previous exposure to Data Science. The book provides a foundation of the key ideas of data science, describes how Microsoft Azure Machine Learning exposes data science, provides several business use-case scenarios for using the technology, and has many references for further learning. The book describes how to get started in minutes: matching my first-hand AzureML experience with the book’s exercises, I can verify that you can see results in minutes.
This book is valuable to already-trained data scientists (statisticians and application developers) since it provides a foundational look at how Microsoft is providing Azure-based (cloud) model building and web services. First, this technology works on any browser. You can use the laptop OS (including OS X, Linux, UNIX) or tablet OS (including Android, iOS) of your choice and produce data science experiments with Microsoft Azure Machine Learning. Second, this technology is priced on usage. Third, although this technology has its own native machine learning algorithms, you can alternatively (or in combination) use R code for your experiments. Fourth, this technology allows for creating a web service, and even provides the application connection code (in C#, Python, and R) to your web service. There is a well-done video from Microsoft on this last point on YouTube: contact me if you cannot find it.
Microsoft Azure Machine Learning is cloud-based, managed from a browser, and you can try Azure free for a month (given a certain number of credits) to get started. Microsoft recently announced a free tier which does NOT require a credit card or Azure subscription, excellent for students (whether high school, university, graduate school). Microsoft previously (and still) produces SQL Server Data Mining, which requires the Windows OS and a SQL Server license: I have worked with clients doing serious production projects with that proven and scalable technology. By contrast, Microsoft Azure Machine Learning works in Azure (cloud), requires data to be in the cloud, and provides an easy way to produce a web service (build applications for enterprise through entrepreneurial use).
Table of Contents:
Part 1: Introducing Data Science and Microsoft Azure Machine Learning
1. Introduction to Data Science
2. Introducing Microsoft Azure Machine Learning
3. Integration with R
Part 2: Statistical and Machine Learning Algorithms
4. Introduction to Statistical and Machine Learning Algorithms
Part 3: Practical Applications
5. Customer Propensity Models
6. Building Churn Models
7. Customer Segmentation Models
8. Predictive Maintenance
Chapter 1 provides foundational definitions for data science. Chapter 2 is a specific how-to for building an experiment with Microsoft Azure Machine Learning. At the end of the second chapter you have a fully working experiment in minutes (though you will need to shut off your phone and email, and be in a place where you can concentrate). Chapter 3 discusses what is available with R, and you build an experiment: you can have an R widget which accepts up to 2 dataset inputs and a ZIP file with your R code (uploaded from your operating system, which need not be Windows).
Chapter 4 describes the differences among the machine learning algorithms in the categories of regression (linear regression, neural networks, decision trees, boosted decision trees), classification (support vector machines, Bayes point machines) and clustering (k-means).
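To make the clustering category concrete, here is a minimal sketch of k-means in plain Python. It is not taken from the book or from AzureML: the function name `kmeans`, the sample points, and the fixed starting centroids are all assumptions I chose for illustration, and real tools typically pick starting centroids randomly.

```python
from math import dist


def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) on tuples of floats.

    `centroids` are the starting centers; passing them in explicitly
    keeps this sketch deterministic.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: label each point with its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its points.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's center
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical data: two visibly separate groups of 2-D points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

With these sample points, the first three land in one cluster and the last three in the other, and each centroid settles at the mean of its group.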
Chapters 5 through 8 are especially valuable to advanced business analysts or people new to data science. Each chapter provides a description of a type of common use for machine learning, along with a business problem and specific experiments. This book is written by the people who helped build the technology, and the book’s first author Roger Barga had a key role in leading this technology’s development. The key development team included experienced Microsoft professionals who have contributed to Microsoft Research, Bing, and other areas in Microsoft. All the book authors have been contributors to training their fellow Microsoft professionals and also the general user community in data science, and you can find their contributions by searching for them in Bing.
One area which this book could have stressed more is the strength and depth of the Microsoft community, and the communal use of AzureML. The technology has an active forum hosted at Microsoft’s website supporting your technical Azure questions. Microsoft has a new blog on machine learning, including and beyond AzureML. There is an area called Connect where you can report bugs or provide feedback suggestions. If you create a Microsoft partner relationship, you might be involved in the development of new Microsoft technology. Next, we know from many data science competitions that the largest projects are completed with teams. Microsoft Azure Machine Learning was made with teams in mind: you can create workspaces and invite people to jointly have access. People working across time zones can develop experiments 24/7, if they so choose. That option might be important if you need to ramp up a new application for an entrepreneurial venture. Microsoft is uniformly encouraging professional team collaboration through AzureML and Office 365. Finally, there is a Machine Learning Marketplace already available where you can both purchase and publish experiments. Microsoft has already published some free demonstration experiments, and is hoping developers will use the marketplace to license their data science work.
I will reinforce a consistent message in the book: more training leads to better results. It is true that you can get started in minutes, but you can also do many other things in minutes like learn to drive or cook. To become proficient, and especially in a competitive data science scenario (commercial companies competing for business) your team will need more training. It’s one thing to drive or cook non-competitively, and another thing to compete for million-dollar driving or cooking contracts. The book has many links throughout all chapters encouraging further study.
This first version purchase review is based on the alpha pre-publication edition. I will revise this review once a final book is available. I have some pre-publication errata (none of which devalues the substantial contribution of this book):
* The book talks about the “telecommunication industry”, but that expression is usually “telecommunications” (plural).
* The book claims that Microsoft Azure Machine Learning uses R version 4.1.0, but R is at version 3: they mean 3.1.0. That is the version at present, and AzureML will be continuously updated.
I am a consultant focusing on data science: I help non-profit and for-profit companies establish best-in-practice commercial-grade enterprise technologies for advanced analytics. For years, I have been part of the Microsoft MVP program. Along with dozens of others, I participated in the beta development of Microsoft Azure Machine Learning. Also, I developed a personal relationship with two of the book’s authors in particular, Roger Barga and Val Fontama (though I had already been friends with all three on LinkedIn). I have been to Microsoft Seattle to have training on this technology, and have presented this technology at several PASS user group meetings. I have doctoral training in this area, and have taught undergraduate and graduate level statistics, and contributed to peer-reviewed research. I provide data science training, blog on data science, and have free presentations (including AzureML) posted on Slideshare and YouTube. I have written on machine learning for Gigaom. I have used many technologies professionally including SAS, SPSS, open source, and Microsoft. I connect with people on LinkedIn and Twitter @marktabnet.
I will be posting an even more detailed free series of chapter-by-chapter commentary to my blog, though that longer technical commentary will only be valuable if you have the book and an active Azure account.