

This is the first textbook on pattern recognition to present the Bayesian viewpoint. The book presents approximate inference algorithms that permit fast approximate answers in situations where exact answers are not feasible. It uses graphical models to describe probability distributions when no other books apply graphical models to machine learning. No previous knowledge of pattern recognition or machine learning concepts is assumed. Familiarity with multivariate calculus and basic linear algebra is required, and some experience in the use of probabilities would be helpful though not essential as the book includes a self-contained introduction to basic probability theory.
| Best Sellers Rank | #118,227 in Books; #16 in Computer Vision & Pattern Recognition; #82 in Probability & Statistics (Books); #311 in Artificial Intelligence & Semantics |
| Customer Reviews | 4.5 out of 5 stars 793 Reviews |
K**A
Excellent text
First of all, as some other reviewers have pointed out, the subtitle of the book should include the word 'Bayesian' in some form or the other. This is important because the Bayesian approach, although an important one, is not adopted across the board in machine learning, and consequently, an astonishing number of methods presented in the book (Bayesian versions of just about anything) are not mainstream. The recent Duda book gives a better idea of the mainstream in this sense, but because the field has evolved so rapidly, it excludes massive recent developments in kernel methods and graphical models, which Bishop includes. Pedagogically, however, this book is almost uniformly excellent. I didn't like the presentation of some of the material (the first few sections on linear classification are relatively poor), but in general, Bishop does an amazing job. If you want to learn the mathematical base of most machine learning methods in a practical and reasonably rigorous way, this book is for you. Pay attention in particular to the exercises, which are the best I've seen so far in such a text; involved, but not frustrating, and always aiming to further elucidate the concepts. If you want to really learn the material presented, you should, at the very least, solve all the exercises that appear in the sections of the text (about half of the total). I've gone through almost the entire text, and done just that, so I can say that it's not as daunting as it looks. To judge your level regarding this, solve the exercises for the first two chapters (the second, a sort of crash course on probability, is quite formidable). If you can do these, you should be fine. The author has solutions for a lot of them on his website, so you can go there and check if you get stuck on some. 
As far as the Bayesian methods are concerned, they are usually a lot more mathematically involved than their counterparts, so solving the equations representing them can only give you more practice. Seeing the same material in a different light can never hurt you, and I learned some important statistical/mathematical concepts from the book that I'd never heard of, such as the Laplace and Evidence Approximations. Of course, if you're not interested, you can simply skip the method altogether. From the preceding, it should be clear that the book is written with a certain kind of reader in mind. It is not for people who want a quick introduction to some method without the gory details behind its mathematical machinery. There is no pseudocode. The book assumes that once you get the math, the algorithm to implement the method should either become completely clear, or in the case of some more complicated methods (SVMs for example), you know where to head for details on an implementation. Therefore, the people who will benefit most from the book are those who will either be doing research in this area, or will be implementing the methods in detail in lower-level languages (such as C). I know that sounds off-putting, but the good thing is that the level of the math required to understand the methods is quite low: basic probability, linear algebra and multivariable calculus. (Read the appendices in detail as well.) No knowledge is needed, for example, of measure-theoretic probability or function spaces (for kernel methods) etc. Therefore the book is accessible to most people with a decent engineering background who are willing to work through it. If you're one of the people the book is aimed at, you should seriously consider getting it. Edited to Add: I've changed my rating from 4 stars to 5. Even now, 4-5 years later, there is simply no good substitute for this book.
E**E
Still (one of) the best
I recently had to quickly understand some facts about the probabilistic interpretation of PCA. Naturally I picked up this book and it didn't disappoint. Bishop is absolutely clear, and an excellent writer as well. In my opinion, despite the recent publication of Kevin Murphy's very comprehensive ML book, Bishop is still a better read. This is mostly because of his incredible clarity, but the book has other virtues: best-in-class diagrams, judiciously chosen; a lot of material, very well organized; excellent stage setting (the first two chapters). Now, sometimes he's a bit cryptic; for example, the proof that various kinds of loss lead to the conditional median or mode is left as an exercise (ex 1.27). Murphy actually discusses it in some detail. This is true in general: Murphy actually discusses many things that Bishop leaves to the reader. I thought chapters three and four could have been more detailed, but I really have no other complaints. Please note that in order to get the most out of reading this book you should already have a little background in linear algebra, probability, calculus, and preferably some statistics. The first time I approached it was without any background and I found it a bit unfriendly and difficult; this is no fault of the book, however. Still, you don't need that much, just the basics. Update: I should note that there are some puzzling omissions from this book. E.g. f-score & confusion matrices are not mentioned (see Murphy section 5.7.2) - it would have been very natural to mention these concepts in ch 1, along with decision theory. Nor is there much on clustering, except for K-means (see Murphy ch 25). Not a huge deal, it's easy to get these concepts from elsewhere. I recommend using Murphy as and when you need, to fill in gaps. 
One more update: I've been getting into Hastie et al's ESL recently, and I'm really impressed with it so far - I think the practitioner should probably get familiar with both ESL and PRML, as they have complementary strengths and weaknesses. ESL is not very Bayesian at all; PRML is relentlessly so. ESL does not use graphical models or latent variables as a unifying perspective; PRML does. ESL is better on frequentist model selection, including cross-validation (ch 7). I think PRML is better for graphical models, Bayesian methods, and latent variables (which correspond to chs 8-13) and ESL better on linear models and density-based methods (and other stuff besides). Finally, ESL is way better on "local" models, like kernel regression & loess. Your mileage may vary... They are both excellent books. ESL seems a bit more mathematically dense than PRML, and is also better for people who are in industry rather than academia (I was in the latter but am now in the former),
S**S
If only all textbooks were this well-written
I was a big fan of Bishop's earlier "Neural Networks for Pattern Recognition" despite my not being particularly interested in neural networks (as opposed to other aspects of machine learning), and so I was pretty excited when I heard about this book. Reading it has not left me disappointed. Like his earlier book, this text is quite mathematically oriented, and not well-suited for people who aren't comfortable with calculus. However, also like in "NNPR", the writing style here is very clear, and everything past basic calculus and linear algebra is well-explained before it's needed. The appendices alone are a goldmine. (Appendix B is a great "cheat sheet" for commonly used probability distributions; Appendix C has lots of useful matrix properties you may have forgotten or never known; Appendix D quickly explains what you need to know about the calculus of variations; and Appendix E does the same for Lagrange multipliers.) The author also does an excellent job throughout the text of marrying math and intuition without giving either short shrift. However, note that the material covered is inherently pretty complex, so the book can still be intimidating in parts despite the excellent writing. It's more appropriate for, say, Ph.D. students and professional researchers in statistics or machine learning than people who just want to crank out code for a simple classifier. There is very little pseudocode (although copious MATLAB code will supposedly be made available in a companion book due out in 2008), and the book's overall approach to machine learning is basically hard-core Bayesian statistics. If you are not willing to scratch your head for a while over lots and lots of equations, this may not be the book for you. On the flip side, people who are already experts in machine learning may be mildly disappointed with the lack of coverage some of their pet topics get. 
For example, while the chapter on graphical models is excellent as far as it goes, it only mentions the problem of learning graphical model structures (one of my areas of interest) in passing. Reinforcement learning (another personal area of interest) is discussed briefly in the introduction and then written off as beyond the scope of the book. However, the book is already a fabulous resource as it stands; complaining there's not even more of it would be gauche. The cover may look like goat barf, and there are some innocuous missing words here and there (hey, it's a first edition), but if you're serious about machine learning and not afraid of a little math, you should definitely own this book. I can only imagine how much cooler my own thesis research might have been if this book had been around a few years earlier.
A**E
A Definitive Reference
This book is used in an undergraduate statistics course at the University of Chicago. As a textbook for a class, I believe it does its job well. It doesn't concern itself with silly 'plug and chug' examples and presents the mathematical derivations of each technique clearly and concisely. Obviously, you'll need a background in linear algebra and probability theory. To think that the 5 page introduction to probability theory could possibly be helpful is ridiculous. It helps to also have some programming experience so you can practice implementing the techniques directly. The book is written so as to be helpful as a reference. Just the other day, I found myself implementing a parallelized feed-forward neural network and quickly picked up the text to see what variations would be helpful. Each topic is generally presented in its basic form with any alternatives and optimizations presented toward the end of the chapter. This meant it was very easy to find what I wanted in seconds. Although the book covers most of the major techniques (binary classifiers, regressions, support vector machines, neural networks, etc.) and their optimizations, it doesn't concern itself with optimal implementations. This is *not* a machine learning algorithms textbook. This is what you consult if you want to refresh yourself on a particular technique.
D**N
A sound conceptual approach
Usually considered to be a branch of artificial intelligence, especially at the present time, pattern recognition is defined in this book as the automatic discovery of regularities in data by the use of computer algorithms and the use of these regularities for classifying the data in different categories. The first part of this definition is typically referred to as 'unsupervised learning' and the latter 'supervised learning.' Both of these areas have resulted in a gargantuan amount of research due to their importance in areas such as medicine, genomics, network modeling, financial engineering, and voice recognition. This book emphasizes a "conceptual" approach to teaching pattern recognition, and therefore is highly valuable to those who need to learn the subject. Too often this field is taught purely from the formal standpoint, or conversely by the use of many trivial examples that illustrate the algorithms that are used. These approaches make the subject appear to be either a highly-developed mathematical one (which it is) or a cookbook that does not have a sound foundation. This book is one of the few that will allow the reader to gain a more in-depth understanding and appreciation of the subject as preparation for doing research and development in pattern recognition. The author claims that the book is self-contained as far as background in probability theory is concerned, but readers should still be prepared with this background in order to better appreciate the content. The Bayesian paradigm dominates the book, as it should given the current emphasis in research circles.
Some of the highlights of the book include discussions on:
* Relative entropy and mutual information. These two concepts have become very important in recent years, especially in the validation of pattern recognition models, the selection of relevant variables, and in independent component analysis.
* Periodic variables and how they can be used in contexts where Gaussian distributions are problematic.
* Markov chain Monte Carlo sampling, especially the role of the detailed balance condition in obtaining the acceptance probability for the Metropolis-Hastings algorithm.
* Bayesian linear regression and its ability to deal with the over-fitting problem in calculations of maximum likelihood and the determination of model complexity.
* Kernel learning (usually called support vector machines in other books).
Some of the minuses of the book include:
* Needs more in-depth discussion of Bayesian neural networks, over and above what is done in the book. The author does devote a section in the book to this topic, but given its enormous importance, especially in automated learning and economic forecasting, more examples need to be included.
* More real-world test cases need to be included, along with a comparison of the efficacies of different approaches, so as to illustrate the "no free lunch" philosophy.
* More exercises that require more analysis on the part of the reader, instead of derivation-type problems or straightforward numerical exercises.
* Needs more details on independent component analysis. Only a few paragraphs are devoted to this important topic.
N**U
Probably the best book for machine learning
I am a PhD student in machine learning. Bishop is really gifted, and he explains both basic and advanced concepts of machine learning very well. I would say that this book is much more comprehensive than Hastie's The Elements of Statistical Learning. It has very good illustrations and is very complete. I would definitely recommend it for those who want to learn statistical/machine learning on their own. The only thing that I don't like is that it often tries to present theories under a probabilistic framework. My personal opinion is that probability is a nice way to present easy theories in a super complex way. I tend to think in a more linear algebra/optimization framework. After so many years in academia, probabilities still confuse me. Fortunately, the author doesn't use only the probabilistic framework, which is why I like it.
R**J
Arguably the best book ever on Machine Learning
Let me be very honest: If you don't have a strong maths background, as is expected from serious researchers in Electrical Engineering or Computer Science, you may NOT find this book useful. I don't think this book should be tried by undergraduates, except maybe those who are in their final semester. The key aspect of this book is its intuition, which draws a clear picture of what the complex equations mean. Machine Learning is in no way an easy topic by itself, so I don't think there is a book that is both easy to follow and offers an excellent grasp of the topic. But if you really want to learn ML, this book is certainly sufficient to take care of all the requirements. It is SIMPLY AMAZING compared to some other 'classic texts' that I bought.
M**L
Excellent Book on Machine Learning & Pattern Recognition
This survey of ML topics is well written, well organized, and interesting to read. I particularly like his sidebar cross references to exercises that apply to the content discussed in the text there and to sections having related material elsewhere in the book. Bishop also puts some solutions to exercises online. Of the four or five ML texts now available (I mean real text books having a substantial amount of theory and requiring some knowledge of calculus and linear algebra), PRML is by far my favorite. One important addition: the book is strong in mathematical statistics.
S**A
Very good book
I ordered the book; everything worked perfectly.
P**L
Amazingly written, fantastic print quality.
This book is excellently written. It is not simply a reference bible; the author tells a chronological story and takes you along for the ride. The print quality of my copy is excellent: nice waxy paper, crisp text, and nice and colourful. As you've probably read elsewhere online, you will need to have done prior courses in probability and linear algebra, as the introductory chapters here, although technically "self contained", are very dense. Although Kevin Murphy's new 2022 book is also great, it feels like more of a reference on a zillion topics. With PRML, by contrast, Bishop is really trying to get you to understand the fundamentals.
U**E
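A textbook on pattern recognition
It's a wonderful book. I think it is excellent as a textbook on pattern recognition. The principles and characteristics of pattern recognition, along with existing useful methods, are explained in an easy-to-understand way. It draws heavily on statistics, but because it starts from the basics, self-study is also possible. Also, since it is in full colour, the graphs and figures are very clean and easy to read. I think it can be described as a book for beginner to intermediate students of pattern recognition.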
C**N
Clear and interesting
It's a good introduction to the theory of machine learning. The math requirements are low, mostly undergrad algebra, and the calculations are easy to follow; needed probability concepts are explained in the first chapter. The book is quite exhaustive and covers all common ML techniques in detail.
A**X
Superb
There are a huge number of machine learning books now available. I own many of them. But I don't think any have had such an impact as Chris Bishop's effort here - I certainly count it as my favourite. The material covered is not exhaustive (although good for 2006), but it's a good springboard to many other advanced texts. (The moniker of ML 'Bible' has apparently been passed to Kevin Murphy's book.) What *is* covered is explained with exceptional clarity, with an eye for understanding the intuition as well as the theory. If you are after a practitioner's guide, or a first ML book for self-study, this probably isn't ideal. It assumes significant familiarity with multivariate calculus, probability and basic stats (identities, moments, regression, MLE etc.). The pitch is probably early post-graduate level, but with a few stretching parts. If this is your background, I think it's a better first ML book than MacKay (Information Theory ...), Murphy (Machine Learning ...), or Hastie et al. (Elements of Statistical Learning), due to its coherence of topics and consistency of depth. But those books are all excellent in their own ways. However, Barber (Bayesian Reasoning ...) is a good alternative. Most chapters are fairly self-contained, so once you've worked your way through the first couple of chapters, you can skip around as required. A particular highlight for me was the pair of chapters on EM and variational methods (ch 9 & 10); I think you'd be hard pressed to find a better explanation of either of them. Finally, it's worth pointing out that it's unrepentantly Bayesian, and lacking some subtlety, which may be grating for seasoned statisticians. Nevertheless, if the above sounds like what you're looking for, this is probably a good choice.