Unlike many (most?) books and courses on machine learning, Barber's outstanding text is very suitable for self study. There are many reasons for this, and high among them is the fact that he carefully explains, with commonsense examples and applications, many of the tougher logical, mathematical and processing foundations of pattern recognition.
For relative beginners, Bayesian techniques began in the 1700s to model how a degree of belief should be modified to account for new evidence. The techniques and formulas were largely discounted and ignored until the modern era of computing, pattern recognition and AI, now machine learning. The formula answers how the probabilities of two events are related when represented inversely, and more broadly, gives a precise mathematical model for the inference process itself (under uncertainty), where deductive reasoning and logic becomes a subset (under certainty, or when values can resolve to 0/1 or true/false, yes/no etc. In "odds" terms (useful in many fields including optimal expected utility functions in decision theory), posterior odds = prior odds * the Bayes Factor.
For context, I'm the lead scientist at IABOK dot org-- we design algorithms for huge data mining problems and applications. This text is our "go to" reference for programmers not up to speed in many of the new pattern recognition algorithms, including those writing new versions. All the most recent relevant models, from a probability standpoint, are represented here, with a clarity that is stunning. My only criticism (a mild one) is that, when applying Barber's examples to Bodies of Knowledge and data mining, he skips Prolog, backward chaining, predicate calculus and other techniques that are the foundation of automated inference systems (systems that extend knowledge bases automatically by checking whether new propositions can be inferred from the KB as consistent, relevant, etc.).
In the next 20 years, algorithms will rule this planet. If you either want to see the future of your grandkids, or participate in it if you're young, this is a MUST HAVE exploration of where what we used to call AI is now headed. There IS plenty of calculus in this volume, so don't mistakenly think it is "simple" -- but if you put the time in, you can "get it" even if you're a bright undergrad level thinker. The author's goal of training new algorithm programmers is laudable and right on point for where pattern recognition is headed.
With this amount of math, how can we star it high for self study? Easy: unlike most "recipe" books that just give bushels of codes or techniques, the authors here give the what, where when and why of both code and math, not just the how, as their goal is independent, creative contributors who can write their OWN algorithms. There are a few minor UK vs US differences in terminology also (event space instead of sample space, for example), but they expand the reader's horizon rather than distract or annoy as some others do. There are others like Bishop and many more that have more recipes, and more compact and difficult math, but you have to either be really good (just show me the recipe) or really bad (I don't know what I'm doing, but can follow this recipe) to benefit from them. This is a happy middle ground that does not disappoint.