I think this book is genuinely trying to be helpful, by giving an extended tutorial on the pandas library; but the tutorial covers only selected topics, and needs to be supplemented with a comprehensive function reference. The narrative also needs to be cut with the help of a strict editor.
If you are trying to decide whether to learn to use the pandas library, this book is for you. It starts with an example of how Python and the pandas library can make it easy to do some basic analyses of data, and then develops more specialized chapters: summary statistics, data storage, data transformation (merging and joining), plotting, aggregation, time series, special considerations for financial or economic data, and advanced special topics.
Once I decided to use the pandas library, the book suddenly became less useful. The author has a verbose pedagogical style, and the book never departs from its tutorial perspective. Functions are introduced with examples but no definitions, and it's hard to find the rare summaries of functions, function arguments, or discussion suggesting when to use one method instead of another.
If you want to do something very close to what's done in an example, it's easy to follow along. Once you want to do something not emphasized or covered by an example, there is no guidance, no reference or dictionary section to give any hint about where you might search next --- Google will probably direct you to stackoverflow.com, or the official pandas documentation site.
For example, suppose you have loaded your data into a DataFrame, and you want to use one of its columns as the index. The book has several pages on the useful reindex() method, but that method is for conforming the data to a new set of index labels (a kind of resampling), not for choosing which column serves as the index. Instead, you want set_index() --- but the book only mentions set_index() in passing, without saying what it does, far from the section where the DataFrame index is covered.
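To make the distinction concrete, here is a minimal sketch of my own (not taken from the book; the column names and values are invented for illustration). It assumes a reasonably recent pandas installation:

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC"],
    "price": [10.0, 20.0, 30.0],
})

# set_index() promotes an existing column to the index.
# The rows and their values are unchanged; "ticker" stops being a column.
by_ticker = df.set_index("ticker")

# reindex() conforms the data to a new list of index labels.
# Existing labels are reordered; labels not present ("ZZZ") get NaN rows.
reindexed = by_ticker.reindex(["CCC", "AAA", "ZZZ"])

print(by_ticker)
print(reindexed)
```

In short, set_index() changes which values serve as the labels, while reindex() keeps the labels' meaning and changes which of them appear in the result.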
There have been some attempts to remedy this, with "quick reference cards" for pandas --- but they are in general also not comprehensive.
Finally, there is little guidance on the kinds of problems where you would be better served using numpy or some other tool instead of pandas. (There are a few paragraphs on areas where you might not want to use Python.)
[Update: by mid 2013, the API reference at the official pandas documentation has the comprehensive listings that I was looking for --- see http://pandas.pydata.org/pandas-docs/stable/api.html . By version 0.12.0, all of the various function arguments seem to have been described with examples of acceptable settings. Also, the data analytical work (as opposed to cleaning and organization) has moved to the related statsmodels project, which requires pandas. So, to use that, it's important to be familiar with pandas.]
To the editor:
On many pages, there is some comment, phrasing, or trivial fact that I would have eliminated. Examples:
"In some cases, a table might not have a fixed delimiter, using whitespace or some other pattern to separate fields. In these cases, ..."
"In part for legacy reasons (much earlier versions of pandas), DataFrame's join method ..."
"In my experience, having to align data by hand (and worse, having to verify that data is aligned) is a far too rigid and tedious way to work. It is also rife with potential for bugs due to combining misaligned data."
This is a technical publication, not a narrative!
Many of the code examples break across page boundaries, in both the printed and PDF editions, which creates small interruptions when reading. This may be hard to avoid when about half the text space is occupied by worked examples.
last line on page 129: a b c d a b c d e
first line on page 130: 0 0 1 2 3 0 0 1 2 3 4