Data Analysis with Open Source Tools [Kindle Edition]

Philipp K. Janert
4.0 out of 5 stars (2 customer reviews)

Kindle price: EUR 21.21, incl. VAT and free wireless delivery via Amazon Whispernet


Other editions

Kindle Edition: EUR 21.21
Paperback: EUR 24.95

Collecting data is relatively easy, but turning raw information into something useful requires that you know how to extract precisely what you need. With this insightful book, intermediate to experienced programmers interested in data analysis will learn techniques for working with data in a business environment. You'll learn how to look at data to discover what it contains, how to capture those ideas in conceptual models, and then feed your understanding back into the organization through business plans, metrics dashboards, and other applications.

Along the way, you'll experiment with concepts through hands-on workshops at the end of each chapter. Above all, you'll learn how to think about the results you want to achieve -- rather than rely on tools to think for you.

  • Use graphics to describe data with one, two, or dozens of variables
  • Develop conceptual models using back-of-the-envelope calculations, as well as scaling and probability arguments
  • Mine data with computationally intensive methods such as simulation and clustering
  • Make your conclusions understandable through reports, dashboards, and other metrics programs
  • Understand financial calculations, including the time-value of money
  • Use dimensionality reduction techniques or predictive analytics to conquer challenging data analysis situations
  • Become familiar with different open source programming environments for data analysis

"Finally, a concise reference for understanding how to conquer piles of data."--Austin King, Senior Web Developer, Mozilla

"An indispensable text for aspiring data scientists."--Michael E. Driscoll, CEO/Founder, Dataspora

About the Author and Other Contributors

Philipp K. Janert is Chief Consultant at Principal Value, LLC. He has worked for small start-ups and in large corporate environments, both in the US and overseas, including several years at, where he initiated and led several projects to improve Amazon's order fulfillment processes. Philipp K. Janert has written about software and software development for the O'Reilly Network, IBM developerWorks, IEEE Software, and Linux Magazine. He holds a Ph.D. in Theoretical Physics from the University of Washington. Visit his website at


Most Helpful Customer Reviews
3.0 out of 5 stars | Light and Shadow | 5 May 2015
By Kestler
Format: Paperback | Verified Purchase
The book has some good ideas, but it also has didactic weaknesses, hence only three stars. It would have been didactically smarter to give a bird's-eye tour first and then build out the individual topics. The approach of the "WEKA book" by Ian H. Witten, which supplies tangible example data sets, would have been a good idea here.

The subtitle "with OpenSource Tools" is too ambitious: examples in Python, R, and so on are touched on, but they do not get you far.

Recommended with reservations: it takes quite a long time to read, and yet it is not really a compendium. Its value lies more in the small pointers, such as the chapter on the history of classical statistics, why it is the way it is, and why we work under different conditions today (computers, large volumes of measurements).
1 of 3 customers found the following review helpful
5.0 out of 5 stars | Extraordinary | 24 November 2012
Format: Paperback | Verified Purchase
This is the book you want if you are trying to get into scientific programming and visualization with Python and R quickly! I strongly recommend this book!
Most helpful customer reviews on (beta): 4.2 out of 5 stars, 42 reviews
41 of 41 customers found the following review helpful
3.0 out of 5 stars | Full of insight, light on details | 17 April 2011
By Code Monkey - Published on
This book covers such a wide range of topics that it necessarily skims over all of them, but it always hits the major points that an introductory survey should. Each chapter has a straightforward tone, strikes the right balance between developing mathematical rigor and developing an intuitive understanding of data, and undeniably passes on the lessons of hard-earned, real-world experience. But a reader who is actually working on a real data problem will almost certainly come to the realization that the understanding gained is somewhat superficial - that it's going to take a lot more heavy reading (probably of books, papers, and software tools recommended in this book) to get any real work done!

The single biggest problem with this book is its misleading title. This book is not going to teach you how to use open source software to analyze data. There is only minimal information about how one would actually use the software tools being discussed. What you get is a brief commentary about what the author thinks each software package is good for. It's the same story as with the mathematical details: you will not find them here, but this book will give you an excellent idea of what to look for. So in the end it does leave you feeling just a little bit cheated, even though all the advice you got seems extremely well informed.

What this book does astonishingly well is communicate an attitude to data analysis that most textbooks (and nearly all the college courses I took) seem to miss. Nearly every chapter is a stream of stunningly insightful observations on how to approach data, without the mathematical detail that overwhelms most practicing programmers. I would recommend it to any reader who understands that truly useful insights are hard to come by, but detailed algorithms and formulae are easily found in the Internet Age. I wish the book were a few hundred pages shorter, that it corrected a few sloppy mistakes (like confusing revenue and profit), but I'm certainly glad I read it.
209 of 231 customers found the following review helpful
2.0 out of 5 stars | It falls short of initial expectations | 7 February 2011
By J. Felipe Ortega Soto - Published on
This book is aimed at offering a practical, hands-on introduction to data analysis for pragmatic readers without strong scientific or statistical background. Some basic programming experience is required. The author provides many personal (and sometimes useful) comments about different tools and procedures in data analysis.

However, a careful reading reveals many problems, especially an obscure presentation of key concepts. In my opinion, the target audience for this book would be people without previous contact with data analysis. Hence the importance of presenting its core elements correctly. Otherwise, it's useless for them.

In particular:

- Few pages are actually dedicated to presenting open source tools that support the different graphs and techniques included in the book. From the title, I expected a more complete tour through available open source tools for data analysis.

- No clues about how to obtain most of the graphs and results presented in the book. No related data sets are available for download, either. A book like this is useless if we cannot learn how to replicate all the examples.

- The formula of the variance for a sample is just wrong. One must divide by n-1 and not n; see "Applied Statistics and Probability for Engineers" (Montgomery and Runger 2006).

- The author presents one of the most obscure explanations for the median I've ever come across. Resorting to an RFC (RFC 2330) to explain such a simple concept is really awkward.

- In chapter 3 and Appendix B, natural logarithms (base e) are presented in the text, while graphs plot powers of 10. Definitely, not the right way to transmit correct concepts and methods.

- I concur with a previous review in that "Workshop" sections just present an ultra-short overview of some open source tools. A quick search in your favourite engine will display much more informative introductions (even quick start guides).

- Today, effective data analysis heavily depends on using the best possible implementation. While I might find it educational to learn some of these implementations, in a real situation it is much better to rely on precise implementations of algorithms already available (e.g. libraries in GNU R).
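
The reviewer's point about the sample variance (dividing by n-1 rather than n) can be illustrated with a short sketch. The data below are made up for illustration and are not taken from the book; the function names `variance_n` and `variance_n_minus_1` are hypothetical. Python's standard `statistics` module provides both conventions as `pvariance` (divide by n) and `variance` (divide by n-1), so the hand-written versions can be checked against it.

```python
import statistics


def variance_n(xs):
    """Divide the sum of squared deviations by n (population convention)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def variance_n_minus_1(xs):
    """Divide by n - 1 (Bessel's correction, the usual sample variance)."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


# Made-up illustrative data, not from the book.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("divide by n:    ", variance_n(sample))
print("divide by n - 1:", variance_n_minus_1(sample))
print("statistics.pvariance:", statistics.pvariance(sample))
print("statistics.variance: ", statistics.variance(sample))
```

The two results differ by a factor of n/(n-1), which matters for small samples and becomes negligible as n grows.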

All in all, I still recommend "R in a Nutshell" for a gentle introduction to data analysis with an open source tool (GNU R). It also has some inaccuracies and typos, but at least it's much more informative and clear. Besides, it does include an R package with all datasets and examples, ready to be installed and explored.
48 of 51 customers found the following review helpful
3.0 out of 5 stars | Good, not great. Prerequisites and chapter organization issues. | 27 January 2011
By Jack Sparrow - Published on
The book is very good for intermediate-to-advanced data analysts. Beginners beware: there are some important prerequisites that are not obvious before you buy it, and there are some organization problems.

First, the prerequisites. "I strongly recommend that you make it a habit to avoid all statistical language"..."Once we start talking about standard deviations, the clarity is gone." These are two sentences in the same passage from the Preface. The rest of that passage is similar. However, even the first chapters make heavy use of statistical language. Moreover, they assume that you already know statistics to the level of density estimation, noise, splines, and regression. Page 21 even features a footnote about the Fourier transform and the Fourier convolution theorem. Clearly this book is not for the statistically shy or for the mathematically shy in general, no matter what the Preface suggests. You also need to know Python and R.

Second, the chapter organization problems. There's a mismatch between the first part of each chapter, which introduces concepts and techniques, and the Workshop part of the same chapter, which uses software. I was expecting the Workshop to illustrate the implementation of the same concepts and techniques. It's not really so. The Workshop introduces Python and R facilities at a different (lower) speed than the rest of the chapter. One could even wonder why the Workshop is in the same chapter. I'd rather that each chapter consisted of a few detailed case studies that first introduce concepts and techniques and then illustrate them with software libraries.
14 of 15 customers found the following review helpful
2.0 out of 5 stars | Wrong enough to hurt | 9 August 2012
By T. Carroll - Published on
Format: Paperback | Verified Purchase
While I'm not an expert in all the areas covered in this book, I am in a few. In those areas, this book is really wrong -- actually doing damage wrong.

For instance, when talking about regressions, the author claims that:

1) "Regression only makes sense when you want to use it as a prediction." This is very wrong. Any decent econometrics book will be almost entirely about counterexamples.
2) "Linear regression is appropriate only if the data can be described as a straight line." The "linear" in linear regression doesn't mean that at all. It just means that the form of the function must be linear in the coefficients to be calculated. In particular, y = a + b*x + c*x^2 will fit a parabola to the data.
3) "Historically, one of the attractions of linear regression has been that it is easy to calculate." It's easy to calculate for a single independent variable, but multivariate regressions are devilishly difficult to calculate because of numerical issues.

This is where I really have problems, because multiple regression models are one of the most useful techniques available for understanding the effect of different factors, and the author just dismisses them out of hand.
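
The reviewer's second point, that "linear" refers to the coefficients rather than to the shape of the curve, can be sketched as follows. This is a minimal illustration under stated assumptions: the data are an exact parabola with made-up coefficients, the helper names `solve_linear_system` and `fit_quadratic` are hypothetical, and the fit uses the textbook normal equations for brevity. Library routines typically rely on more stable decompositions (QR or SVD), which relates to the numerical issues mentioned in point 3.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x^2.

    The model is nonlinear in x but linear in (a, b, c), so ordinary
    linear least squares applies: solve (X^T X) beta = X^T y.
    """
    design = [[1.0, x, x * x] for x in xs]  # columns: 1, x, x^2
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(3)]
           for i in range(3)]
    xty = [sum(row[i] * y for row, y in zip(design, ys)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Made-up data lying exactly on y = 1 + 2x + 3x^2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]

a, b, c = fit_quadratic(xs, ys)
print(a, b, c)
```

Because the data contain no noise, the fitted coefficients recover the generating values up to floating-point rounding.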

There are other problems:

When talking about the CDF, he defines it as the integral of the histogram. The histogram is not the probability density function. The PDF is defined to integrate to 1 where the histogram integrates to something else.
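
The distinction the reviewer draws can be made concrete with a small sketch: raw histogram counts integrate to (number of points) x (bin width), and only after dividing by that quantity does the running integral end at 1, as a CDF must. The data and all function names here are made up for illustration, and the sketch assumes every value lies in the interval [lo, hi].

```python
def histogram(data, lo, hi, bins):
    """Count values in `bins` equal-width bins on [lo, hi].

    Assumes every value in `data` lies within [lo, hi].
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # Clamp so that x == hi falls into the last bin.
        idx = min(int((x - lo) / width), bins - 1)
        counts[idx] += 1
    return counts, width


def density_from_counts(counts, width):
    """Rescale counts so that sum(density * width) equals 1."""
    total = sum(counts)
    return [c / (total * width) for c in counts]


def running_integral(values, width):
    """Cumulative sum of value * width, bin by bin."""
    out, running = [], 0.0
    for v in values:
        running += v * width
        out.append(running)
    return out


# Made-up illustrative data.
data = [0.5, 1.2, 1.7, 2.1, 2.4, 2.8, 3.3, 3.9, 4.6, 5.0]
counts, width = histogram(data, 0.0, 5.0, 10)

raw_integral = running_integral(counts, width)
cdf = running_integral(density_from_counts(counts, width), width)

print("integral of raw counts ends at:", raw_integral[-1])
print("integral of density ends at:   ", cdf[-1])
```

The normalized version is only a binned approximation of the empirical CDF; its resolution is limited by the bin width.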

The formula for standard deviation is wrong, the formula for exponential moving average is wrong (a typesetting problem).

So, my problem is that I find a lot of problems with the portions of the book I know. Can I trust the remainder of the book or should I be wary? In this case, I'm wary.
43 of 54 customers found the following review helpful
5.0 out of 5 stars | Wow! | 22 November 2010
By Jeffrey K. Tyzzer - Published on
Format: Paperback | Verified Purchase
Lucid, learned, and full of insights--a great book on a difficult subject. When I pre-ordered this title, I expected it to be more cookbook-oriented. There are certainly cookbook aspects to it, but it goes way beyond that. For one, it's deep: Janert gives you solutions, sure, but you also get considerable background to go with them. I particularly like chapter 9's sagacious treatment of probability models, especially the section on power law distributions. For another, it's comprehensive--there is a lot of material here, and it's delivered with discipline and care. You can tell that Janert really pushed himself (maybe with a bit of help from his editor) when writing this book. Finally, this book has heart. Data analysis is a means to an end (albeit a wonderful, fascinating one), and the author does his best to ensure that we the reader keep the objective in mind--to inform and enlighten--all the while ensuring that we know enough to pick the right tool for the job. Chapter 16 is another stand-out, and I especially appreciated Janert's distinction here between operational and representative reports and his point about the former: good design emphasizes the content. That's a bit of Tufte-esque advice that we would all do well to remember.