
Programming Massively Parallel Processors: A Hands-On Approach (English) Paperback – December 20, 2012


Formats and editions:

Format            Amazon Price    New from      Used from
Kindle Edition    -               -             -
Paperback         EUR 51.95       EUR 42.98     EUR 44.99

61 new from EUR 42.98; 4 used from EUR 44.99

Frequently bought together

Programming Massively Parallel Processors: A Hands-On Approach + The Cuda Handbook: A Comprehensive Guide to GPU Programming + CUDA by Example: An Introduction to General-Purpose GPU Programming
Price for all three: EUR 117.85





Product Description

Press Reviews

"For those interested in the GPU path to parallel enlightenment, this new book from David Kirk and Wen-mei Hwu is a godsend, as it introduces CUDA (tm), a C-like data parallel language, and Tesla(tm), the architecture of the current generation of NVIDIA GPUs. In addition to explaining the language and the architecture, they define the nature of data parallel problems that run well on the heterogeneous CPU-GPU hardware ... This book is a valuable addition to the recently reinvigorated parallel computing literature." - David Patterson, Director of The Parallel Computing Research Laboratory and the Pardee Professor of Computer Science, U.C. Berkeley. Co-author of Computer Architecture: A Quantitative Approach "Written by two teaching pioneers, this book is the definitive practical reference on programming massively parallel processors--a true technological gold mine. The hands-on learning included is cutting-edge, yet very readable. This is a most rewarding read for students, engineers, and scientists interested in supercharging computational resources to solve today's and tomorrow's hardest problems." - Nicolas Pinto, MIT, NVIDIA Fellow, 2009 "I have always admired Wen-mei Hwu's and David Kirk's ability to turn complex problems into easy-to-comprehend concepts. They have done it again in this book. This joint venture of a passionate teacher and a GPU evangelizer tackles the trade-off between the simple explanation of the concepts and the in-depth analysis of the programming techniques. This is a great book to learn both massive parallel programming and CUDA." - Mateo Valero, Director, Barcelona Supercomputing Center "The use of GPUs is having a big impact in scientific computing. David Kirk and Wen-mei Hwu's new book is an important contribution towards educating our students on the ideas and techniques of programming for massively parallel processors." - Mike Giles, Professor of Scientific Computing, University of Oxford "This book is the most comprehensive and authoritative introduction to GPU computing yet. David Kirk and Wen-mei Hwu are the pioneers in this increasingly important field, and their insights are invaluable and fascinating. This book will be the standard reference for years to come." - Hanspeter Pfister, Harvard University "This is a vital and much-needed text. GPU programming is growing by leaps and bounds. This new book will be very welcomed and highly useful across inter-disciplinary fields." - Shannon Steinfadt, Kent State University "GPUs have hundreds of cores capable of delivering transformative performance increases across a wide range of computational challenges. The rise of these multi-core architectures has raised the need to teach advanced programmers a new and essential skill: how to program massively parallel processors." - CNNMoney.com

About the Author and Other Contributors

David B. Kirk is well recognized for his contributions to graphics hardware and algorithm research. By the time he began his studies at Caltech, he had already earned B.S. and M.S. degrees in mechanical engineering from MIT and worked as an engineer for Raster Technologies and Hewlett-Packard's Apollo Systems Division. After receiving his doctorate, he joined Crystal Dynamics, a video-game manufacturer, as chief scientist and head of technology. In 1997, he took the position of Chief Scientist at NVIDIA, a leader in visual computing technologies, and he is currently an NVIDIA Fellow. At NVIDIA, Kirk led graphics-technology development for some of today's most popular consumer-entertainment platforms, playing a key role in providing mass-market graphics capabilities previously available only on workstations costing hundreds of thousands of dollars. For his role in bringing high-performance graphics to personal computers, Kirk received the 2002 Computer Graphics Achievement Award from the Association for Computing Machinery and the Special Interest Group on Graphics and Interactive Technology (ACM SIGGRAPH) and, in 2006, was elected to the National Academy of Engineering, one of the highest professional distinctions for engineers. Kirk holds 50 patents and patent applications relating to graphics design and has published more than 50 articles on graphics technology, won several best-paper awards, and edited the book Graphics Gems III. A technological "evangelist" who cares deeply about education, he has supported new curriculum initiatives at Caltech and has been a frequent university lecturer and conference keynote speaker worldwide.

Wen-mei Hwu is CTO of MulticoreWare and a professor at the University of Illinois at Urbana-Champaign, specializing in compiler design, computer architecture, computer microarchitecture, and parallel processing. He currently holds the Walter J. ("Jerry") Sanders III-Advanced Micro Devices Endowed Chair in Electrical and Computer Engineering in the Coordinated Science Laboratory. He is a PI for the petascale Blue Waters system, co-director of the Intel- and Microsoft-funded Universal Parallel Computing Research Center (UPCRC), and PI for the world's first NVIDIA CUDA Center of Excellence. At the Illinois Coordinated Science Lab, Dr. Hwu leads the IMPACT Research Group and is director of the OpenIMPACT project, which has delivered new compiler and computer architecture technologies to the computer industry since 1987. He previously edited GPU Computing Gems, a similar work focusing on NVIDIA CUDA.


Customer Reviews

There are no customer reviews on Amazon.de yet.

The most helpful customer reviews on Amazon.com (beta)

Amazon.com: 15 reviews
13 of 13 customers found the following review helpful
A solid introduction to CUDA programming and more... February 12, 2013
By JASA - Published on Amazon.com
Format: Paperback
This second edition of PMPP extends the table of contents of the first one, almost doubling the number of pages (the 2nd edition runs to ~500 pages; I have the paper version).

The book can be divided roughly into 4 parts: the first, and most important, deals with parallel programming using Nvidia's CUDA technology; this takes about the first 10 chapters and Ch. 20. The second slice shows a couple of important examples (MRI image reconstruction and molecular simulation and visualization, chapters 11 and 12). The 3rd important block of chapters (14 through 19) deals with other parallel programming technologies and CUDA extensions: OpenCL, OpenACC, CUDA Fortran, Thrust, C++ AMP, MPI. Finally, spread throughout the book, there are several "outlier", but nevertheless important, chapters: Ch. 7 discusses floating-point issues and their impact on computational accuracy; Ch. 13, "PP and Computational Thinking", discusses broadly how to think when converting sequential algorithms to parallel ones; and Ch. 21 discusses the future of PP (through CUDA goggles :-).

I've read about half of the book (I attended Coursera's MOOC, "Heterogeneous Parallel Computing", taught by one of the authors, Prof. W. Hwu, and waited until the 2nd edition was out to buy it) and carefully browsed the other half. Here are my...

Comments
----------
(+++) Pluses:
# There are just a few typos here and there, but they are easy to spot (the funniest is in line 5 of ch. 1 (!), where Giga corresponds to 10^12 and Tera to 10^15 according to the authors; of course Giga is 10^9 and Tera is 10^12 - this bug is browsable with Amazon's "look inside" feature...).
# CUDA is described from an application POV; many computation patterns are exemplified in CUDA, from the simplest (vector addition - see the sketch after this list) to more difficult ones (matrix multiplication, image filtering with convolution kernels, scan, ...).
# There is a description of several other PP technologies (OpenCL, MPI, ...), which is a good and useful feature if you are evaluating or selecting a PP technology to use.
# The book is quite comprehensive about current PP technologies. CUDA is the "leading actress", but if you master CUDA you can easily transfer your PP knowledge to other technologies. The "tête-à-tête" of CUDA with those technologies appears in the respective chapters, which show the corresponding templates for basic tasks (e.g. vector addition or matrix multiplication).
# There are many discussions of the tradeoff between memory transfer (from CPU to GPU) and GPU computation speed, as well as of how to optimize that speed.
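
For readers who have not seen the pattern, here is a minimal sketch of the vector-addition kernel mentioned above (illustrative only, not the book's listing; the names and the 256-thread block size are my choices):

    // Each thread computes one element of c = a + b.
    __global__ void vecAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                  // guard threads past the end of the array
            c[i] = a[i] + b[i];
    }

    // Host-side launch: one thread per element, 256 threads per block.
    // vecAdd<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);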

(---) Minuses:
# The figures, pictures and tables use a variety of font sizes and backgrounds (gray, white, etc.); some fonts are very tiny, which makes the code in those figures difficult to read.
# The chapters describing other PP technologies (OpenCL, OpenACC, MPI, ...), often written by "invited" authors (acknowledged by Kirk and Hwu), are in general succinct, and the maturity and availability (free, commercial, open source, ...) of the technologies are not discussed.
# The prose is often excessive (somewhat verbose), and the attempt to explain matters in depth sometimes leads to confusion.
# There is some heterogeneity in the book (that's OK, we are talking about "heterogeneous parallel processors and programming" here ;-), perhaps because there are two main authors with different backgrounds and several "guest authors" in the later chapters.
# It lacks a well-thought-out introductory chapter treating, in a pedagogical and explanatory style, the subset of C commonly seen in PP code and the CUDA extensions/templates. These matters are covered (lightly) in the book, but scattered across many chapters.
# Browsing the CUDA programming manual, one can see that there are many issues not treated (or barely mentioned) in PMPP. An appendix of 20 or 30 pages with a systematic summary of the CUDA API and C extensions would be a welcome addition to the book.

Conclusion
----------
After having browsed other books about PP and CUDA (with Amazon's "look inside" feature and by reading readers' comments), I decided to get this one, and I am not disappointed at all. It has a nice description of CUDA and of many parallel computation patterns and their CUDA implementations, and it gives you a bare-bones sample of other PP technologies. PMPP can easily be read as a "straight-line" text or on a chapter-by-chapter basis (the latter was more useful for me). Recommended for guys and gals with some experience in C programming and a desire to get into PP (or to expand their skills...).
13 of 14 customers found the following review helpful
Good, but not a must-have May 13, 2013
By John M. Hauck - Published on Amazon.com
Format: Kindle Edition
"Programming Massively Parallel Processors (second edition)" by Kirk and Hwu is a very good second book for those interested in getting started with CUDA. A first must-read is "CUDA by Example: An Introduction to General-Purpose GPU Programming" by Jason Sanders. After reading all of Sanders work, feel free to jump right to chapters 8 and 9 of this Kirk and Hwu publication.

In chapter 8, the authors do a nice job of explaining how to write an efficient convolution algorithm that is useful for smoothing and sharpening data sets. Their explanation of how shared memory can play a key role in improving performance is well written. They also handle the issue of "halo" data very well. Benchmark data would have served as a nice conclusion to this chapter.
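
To make the halo idea concrete, here is a minimal 1D convolution sketch in the spirit of that chapter (my own illustrative code, not the book's; MASK_WIDTH, TILE_SIZE, and the constant-memory mask are assumptions, and the kernel must be launched with TILE_SIZE threads per block):

    #define MASK_WIDTH 5
    #define TILE_SIZE  256
    __constant__ float mask[MASK_WIDTH];   // filter coefficients

    __global__ void conv1d(const float *in, float *out, int n)
    {
        // Tile plus left and right halos, loaded once into shared memory.
        __shared__ float tile[TILE_SIZE + MASK_WIDTH - 1];
        int halo = MASK_WIDTH / 2;
        int gid  = blockIdx.x * blockDim.x + threadIdx.x;

        // Each thread loads its element, shifted left by the halo width;
        // out-of-range "ghost" elements become zero.
        int load = gid - halo;
        tile[threadIdx.x] = (load >= 0 && load < n) ? in[load] : 0.0f;
        // The first 2*halo threads also fill the tile's right edge.
        if (threadIdx.x < 2 * halo) {
            int right = load + TILE_SIZE;
            tile[threadIdx.x + TILE_SIZE] =
                (right >= 0 && right < n) ? in[right] : 0.0f;
        }
        __syncthreads();

        if (gid < n) {
            float acc = 0.0f;
            for (int j = 0; j < MASK_WIDTH; ++j)
                acc += tile[threadIdx.x + j] * mask[j];
            out[gid] = acc;
        }
    }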

In chapter 9, the authors provide the best description of the Prefix Sum algorithm I have seen to date. It describes the problem being solved in terms that I can easily relate to - food. They write, "We can illustrate the applications of inclusive scan operations using an example of cutting sausage for a group of people." They first describe a simple algorithm, then a "work-efficient" algorithm, and then an extension for larger data sets. What puzzles me here is that the authors seem fixated on solving the problem with the least number of total operations (across all threads) as opposed to the least number of operations per thread. They do not mention that the "work-efficient" algorithm requires almost twice as many operations on the longest-path thread as the simple algorithm. Actual performance benchmarks showing a net throughput gain would be required for a skeptical reader.
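
For contrast, the "simple" (Hillis-Steele style) inclusive scan referred to above looks roughly like this for a single block (an illustrative sketch, not the book's code; it performs O(n log n) total additions but only about log n steps per thread):

    __global__ void inclusiveScan(float *data, int n)
    {
        extern __shared__ float buf[];      // n floats, sized at launch
        int t = threadIdx.x;
        if (t < n) buf[t] = data[t];
        __syncthreads();

        // Each step, every element adds in its neighbor `stride` to the left.
        for (int stride = 1; stride < n; stride *= 2) {
            float v = 0.0f;
            if (t >= stride && t < n) v = buf[t - stride];
            __syncthreads();                // read everything before writing
            if (t >= stride && t < n) buf[t] += v;
            __syncthreads();
        }
        if (t < n) data[t] = buf[t];
    }

    // Launch for one block of n threads:
    // inclusiveScan<<<1, n, n * sizeof(float)>>>(d_data, n);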

Now before moving forward, let's back up a bit. Even though we have already read CUDA by Example, it is worth reading chapter 6... at least the portion regarding the reduction algorithm starting at the top of page 128. The discussion is rather well written and insightful. Now, onward.
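
The reduction pattern that section builds up to is, in outline, the following (a generic sketch under my own naming, not the book's listing; it assumes blockDim.x is a power of two):

    __global__ void reduceSum(const float *in, float *out, int n)
    {
        extern __shared__ float sdata[];    // blockDim.x floats at launch
        int t   = threadIdx.x;
        int gid = blockIdx.x * blockDim.x + threadIdx.x;
        sdata[t] = (gid < n) ? in[gid] : 0.0f;
        __syncthreads();

        // Halve the active threads each step; keeping them contiguous
        // avoids warp divergence, which is the section's main point.
        for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
            if (t < stride)
                sdata[t] += sdata[t + stride];
            __syncthreads();
        }
        if (t == 0) out[blockIdx.x] = sdata[0];  // one partial sum per block
    }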

In chapter 13, the authors list the three-fold goals of parallel computing: solve a given problem in less time, solve bigger problems in the same amount of time, and achieve better solutions for a given problem in a given amount of time. These all make sense, but have not been the reasons I have witnessed for the transition to parallel computing. I believe the biggest motivation for utilizing CUDA is to solve problems that would otherwise be unsolvable. For example, the rate of data generated by many scientific instruments could simply not be processed without a massively parallel computing solution. In other words, CUDA makes things possible.

Also in Chapter 13 they bring up a very important point. Solving problems with thousands of threads requires that software developers think differently. To think of the resources of a GPU as a means by which you can make a parallel-for-loop run faster completely misses the point - and the opportunity the GPU provides. These three chapters then make the book worthwhile.

The chapters on OpenCL, OpenACC, and AMP seem a bit misplaced in a book like this. The authors' coverage of these topics is a bit too superficial to make them useful for serious developers. On page 402 they list the various data types that AMP supports. It would have made sense for the authors to point out that AMP does not support byte and short. When processing large data sets of these types, AMP introduces serious performance penalties.

This then brings me to my biggest concern about this book. There is very little attention paid to the technique of overlapping data transfer operations with kernel execution. I did happen upon a discussion of streaming in chapter 19, "Programming a Heterogeneous Computing Cluster." However, the context of the material is with respect to MPI, and those not interested in MPI might easily miss it. Because overlapping I/O with kernel operations can easily double throughput, I believe this topic deserves at least one full chapter. Perhaps in the next edition we can insert it between chapters 8 and 9? Oh, and let's add "Overlapped I/O", "Concurrent", and "Streams" as first-class citizens in the index. While we are editing the index, let's just drop the entry for "Apple's iPhone Interfaces". Seriously.
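
For readers hunting for the technique, the overlap in question boils down to chunking the work across CUDA streams, roughly like this (an illustrative sketch, not from the book; `kernel`, `nChunks`, and `chunkElems` are placeholders, and the host buffers must be pinned with cudaMallocHost for the async copies to actually overlap):

    cudaStream_t streams[2];
    for (int s = 0; s < 2; ++s) cudaStreamCreate(&streams[s]);

    for (int chunk = 0; chunk < nChunks; ++chunk) {
        cudaStream_t st = streams[chunk % 2];   // alternate between streams
        size_t off = (size_t)chunk * chunkElems;
        // Queue copy-in, compute, and copy-out of one chunk on one stream,
        // so one chunk's kernel can overlap the next chunk's transfers.
        cudaMemcpyAsync(d_in + off, h_in + off, chunkElems * sizeof(float),
                        cudaMemcpyHostToDevice, st);
        kernel<<<grid, block, 0, st>>>(d_in + off, d_out + off, chunkElems);
        cudaMemcpyAsync(h_out + off, d_out + off, chunkElems * sizeof(float),
                        cudaMemcpyDeviceToHost, st);
    }
    cudaDeviceSynchronize();   // wait for both streams to drain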

In summary, I believe this is a very helpful book and well written. I would consider it a good resource for CUDA developers. It is not, however, a must-have CUDA resource.
8 of 8 customers found the following review helpful
Excellent content marred by typos February 3, 2013
By Eric van Tassell - Published on Amazon.com
Format: Kindle Edition | Verified Purchase
The content of the book is excellent, but the prose and code are marred by numerous typos - a veritable horde of them. The illustrations in the print book are blurry, but the Kindle version is a little better. It is a shame that the efforts of Drs. Kirk & Hwu are marred by poor editing and production.
4 of 4 customers found the following review helpful
Sloppy, sad, and salvageable September 20, 2014
By Marc W. Abel - Published on Amazon.com
Format: Paperback | Verified Purchase
So much effort went into writing this text, but better guidance was needed. True, there are typos as many have said, but not as many as I find in most textbooks and research papers.

Before I get started, I have one strong compliment for the text: exercises at the ends of the chapters, where they appear, are very well thought out and get right to the point.

The worst part of this book is its index, which gets a non-negotiable F grade from me. So much is missing. Try looking for the gridDim built-in variable; it's not there. blockDim is, but not gridDim. Or look up "constant memory"; that's not there under "memory", "constant", or otherwise. In fact, although most of parallel programming is about overcoming memory bottlenecks, the index doesn't say much about memory in general. Indexing books is very hard and is usually left to professional indexers; there really is such a trade. But for this book the authors needed to pay more attention to the end product.
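
For anyone left searching, the two built-ins in question appear side by side in the common grid-stride loop idiom (a generic CUDA example of my own, not a listing from the book):

    __global__ void scale(float *x, int n, float k)
    {
        int stride = gridDim.x * blockDim.x;   // total threads in the launch
        for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
            x[i] *= k;                         // each thread strides the array
    }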

The overall arrangement is illogical, starting from an enormous abstraction and then adding detail. Better to begin with something small and concrete which can be understood, and build up.

At several points a non-programmer was permitted to do the typing, which is like allowing a pilot to fly while waking from general anesthesia. See Figure 10.4 at lines 1 and 5.

Many concepts are never brought together in some kind of summary that one would ever turn to as a reference, such as special variables. One can also find seeming conflicts that need clarification, such as Table 5.1's "local" memory for automatic array variables that are "stored into the global memory" on the next page. Both turn out to be true, but the presentation is not ideal.

References are sloppily missing in certain cases, such as ELLPACK on page 224.

At several points the text completely derails into outrageous metaphors. For instance, page 85, where the authors try to explain standard queries of an implementation's capabilities by informing us that "most American hotel rooms come with hair dryers". The illustration's purpose is to make for a thicker book that looks "more valuable", but the truth is that any programmer who has read to page 85 knows that most libraries have functions for querying their parameters and limits. There's no need to digress into toothpaste in Asia.
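
The capability query behind the hair-dryer metaphor is, in practice, one short call (a standard CUDA runtime sketch, not the book's code):

    #include <cstdio>
    #include <cuda_runtime.h>

    int main()
    {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);   // query device 0
        printf("%s: %d SMs, %d max threads/block, %zu bytes shared mem/block\n",
               prop.name, prop.multiProcessorCount, prop.maxThreadsPerBlock,
               prop.sharedMemPerBlock);
        return 0;
    }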

If you must take a course that is based on this text, things get worse for you: the authors made presentation slides, and you bear the risk that your instructor may use them in the classroom. These are not well thought out from either a clarity or instructional standpoint, and they are very cluttered with confusing details. One example is how tiled matrix multiplication is explained; one slide actually draws three 4x4 matrices with row, column, and matrix name (!!!) labels in every single cell, giving the poor class 3 x (4 x 4) x 3 = 144 extra trivialities to distract them on that slide alone.

Another pedagogical problem with their slide example, which the book also mishandles, is that they've chosen to illustrate the use of submatrices by dividing a 4 x 4 problem into a 2 x 2 problem. Although this is the smallest case they could select, the difficulty is that the human brain comprehends these 2 x 2 divisions in terms of "corners" (top left, etc.) instead of columns and rows. The examples should have been 9 x 9 divided into 3 x 3.
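
For reference, the tiled multiplication under discussion looks roughly like this with a 3 x 3 tile, per the reviewer's suggestion (an illustrative sketch of the standard technique, not the book's code; it assumes the matrix width is a multiple of TILE):

    #define TILE 3
    __global__ void matMulTiled(const float *A, const float *B,
                                float *C, int width)
    {
        __shared__ float As[TILE][TILE], Bs[TILE][TILE];
        int row = blockIdx.y * TILE + threadIdx.y;
        int col = blockIdx.x * TILE + threadIdx.x;
        float acc = 0.0f;

        // March paired tiles across A's row band and down B's column band.
        for (int m = 0; m < width / TILE; ++m) {
            As[threadIdx.y][threadIdx.x] = A[row * width + m * TILE + threadIdx.x];
            Bs[threadIdx.y][threadIdx.x] = B[(m * TILE + threadIdx.y) * width + col];
            __syncthreads();                 // tiles fully loaded
            for (int k = 0; k < TILE; ++k)
                acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
            __syncthreads();                 // done with these tiles
        }
        C[row * width + col] = acc;
    }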

In fairness, Kirk and Hwu did not have the best resources from which to assemble this book. nVIDIA's "CUDA C Programming Guide" has no index whatsoever, and its table of contents doesn't seem to have much to do with what one is trying to find a second time. There's a reason nVIDIA doesn't sell print copies of that text on Amazon; people would review it.
3 of 3 customers found the following review helpful
Many typos, but still useful August 25, 2013
By Wavefunction - Published on Amazon.com
Format: Paperback | Verified Purchase
This book is quite useful in understanding CUDA fundamentals. But, buyer beware. If you are new to CUDA or C programming, you may find this book confusing, as typos run throughout. They are easy to pick out if you know basic CUDA syntax. If you do not, "CUDA by Example" is a better introduction.