- Paperback: 336 pages
- Publisher: Addison-Wesley Professional; 1st edition (March 19, 2014)
- Language: English
- ISBN-10: 0321934504
- ISBN-13: 978-0321934505
- Size and/or weight: 17.8 x 2 x 22.6 cm
- Average customer rating: 1 customer review
- Amazon Bestseller Rank: No. 387,396 in Foreign Language Books
Apache Hadoop YARN: Moving beyond MapReduce and Batch Processing with Apache Hadoop 2 (Addison-Wesley Data & Analytics) (English) Paperback, March 19, 2014
"This book is a desperately needed resource for administrators, developers, and power-users of the Hadoop YARN framework. It does an excellent job of documenting the (often unknown) history that inevitably led up to YARN from previous versions of Hadoop, which provides a valuable canvas against which to present the remaining pragmatically-oriented text. Moving from the history of YARN, it wisely jumps right into getting the reader up and running with their own YARN setup (on a single machine or on a larger cluster) such that the rest of the text is not merely conjecturing, but real guidance for a real instance of YARN. Chapters 7 and 8 were the ones I was most looking forward to in the text from the start, as those 'core' components of YARN are some of the ones which are least understood and yet concurrently most impacting on performance. They did not disappoint." - Ellis H. Wilson III, Storage Scientist
About the Author and Other Contributors
Arun Murthy has contributed to Apache Hadoop full-time since the inception of the project in early 2006. He is a long-term Hadoop committer and a member of the Apache Hadoop Project Management Committee. Previously, he was the architect and lead of the Yahoo Hadoop MapReduce development team and was ultimately responsible, technically, for providing Hadoop MapReduce as a service for all of Yahoo, currently running on nearly 50,000 machines. Arun is the founder and architect of Hortonworks Inc., a software company that is helping to accelerate the development and adoption of Apache Hadoop. Hortonworks was formed by the key architects and core Hadoop committers from the Yahoo! Hadoop software engineering team in June 2011. Funded by Yahoo! and Benchmark Capital, one of the preeminent technology investors, their goal is to ensure that Apache Hadoop becomes the standard platform for storing, processing, managing, and analyzing big data.
Vinod Kumar Vavilapalli has been contributing to the Apache Hadoop project full-time since mid-2007. At the Apache Software Foundation, he is a long-term Hadoop contributor, Hadoop committer, member of the Apache Hadoop Project Management Committee, and a foundation member. Vinod is a MapReduce and YARN go-to guy at Hortonworks Inc. For more than five years, he has been working on Hadoop. He was involved in HadoopOnDemand, Hadoop-0.20, CapacityScheduler, Hadoop security, and MapReduce, and is now a lead developer and the project lead for Apache Hadoop YARN. Before Hortonworks, he was at Yahoo!, working in the Grid team that made Hadoop what it is today, running at large scale on up to tens of thousands of nodes. Vinod loves reading books of all kinds and is passionate about using computers to change the world for the better, bit by bit. He has a bachelor's degree in computer science and engineering from the Indian Institute of Technology Roorkee. He can be reached on Twitter at @tshooter.
Douglas Eadline, Ph.D., began his career as a practitioner and a chronicler of the Linux Cluster HPC revolution and now documents big data analytics. Starting with the first Beowulf How To document, Doug has written hundreds of articles, white papers, and instructional documents covering virtually all aspects of HPC computing. Prior to starting and editing the popular ClusterMonkey.net website in 2005, he served as editor-in-chief for "ClusterWorld" magazine, and was senior HPC editor for "Linux Magazine." Currently, he is a consultant to the HPC industry and writes a monthly column in "HPC Admin" magazine. Both clients and readers have recognized Doug's ability to present a technological value proposition in a clear and accurate style. He has practical, hands-on experience in many aspects of HPC, including hardware and software design, benchmarking, storage, GPU, cloud, and parallel computing. He is the author of "Hadoop Fundamentals LiveLessons" (video) from Addison-Wesley.
Joseph Niemiec is a big data solutions engineer whose focus is on designing Hadoop solutions for many Fortune 1000 companies. In this position, Joseph has worked with customers to build multiple YARN applications, providing a unique perspective on moving customers beyond batch processing, and has worked on YARN development directly. An avid technologist, Joseph has been focused on technology innovations since 2001. His interest in data analytics originally started in game score optimization as a teenager, and has shifted to helping customers take up new technology innovations such as Hadoop and, most recently, building new data applications using YARN.
Jeff Markham is a solution engineer at Hortonworks Inc., the company promoting open source Hadoop. Previously, he was with VMware, Red Hat, and IBM, helping companies build distributed applications with distributed data. He has written articles on Java application development and has spoken at several conferences and to Hadoop User Groups. Jeff is a contributor to Apache Pig and Apache HDFS.
DON'T BUY IT
Most helpful customer reviews on Amazon.com (beta) (May include customer reviews from the "Early Reviewer Rewards" program)
Unfortunately, it is not. It is more of a "beginner's guide", geared for those who just want to get started with Hadoop and aren't yet using it as a core part of their business. I reviewed another Hadoop 2.x book and perhaps was a bit more "kind" about it on this point but that was because I expected it would be more of a tutorial and it was.
This book is well-written, well-organized, and its layout is slick and professional. That's no surprise as it's from a big-name publisher. It's comprehensive in its description of YARN's history, its design and architecture, its features, how it interacts with HDFS and MapReduce, how it's supposed to be used with other tools and frameworks, and how to get a *basic* Hadoop 2.x cluster up and running (by several means, including Ambari).
But you can learn almost all of this from the Hadoop project's website, and that info will also be more up-to-date.
The big disappointment is that there's pretty much no useful info on, for example, how to configure Hadoop & YARN memory limits and tuning parameters, how to plan for and configure YARN scheduler queues, best practices for deployment of more than a handful of nodes, or recommended JVM, OS, disk or network configuration. And forget about learning how to do anything at scale. And forget about hearing the hard truth about running and using Hadoop in the real world.
If you're a beginner to Hadoop and/or just want a single tome from which you can learn about Hadoop 2.x and YARN at a somewhat high level, this is a great book for you. But if you want to learn more than you can find online in an afternoon, keep moving. I'm sure something more useful will come out sooner or later.
For example, the book only mentions that the Capacity Scheduler uses queues, without giving examples of how to use it or how it compares with other Hadoop schedulers.
There is also too much emphasis on HW-sponsored projects, such as the many screenshots of Ambari =(
This book is good for finding out what is available in YARN, but for more details and examples on how to work with Hadoop YARN in production I would recommend Hadoop in Practice, 2nd edition, by Alex Holmes.