The thirteen columns in this book appeared in the Communications of the ACM between 1983 and 1985. There can't be more than a couple of technical books on computing from that era that are still worth reading. Kernighan & Ritchie's book, "The C Programming Language", is one that springs to mind; this book is definitely another, and will probably outlast K&R as it has almost no ties to existing or past hardware or languages.
What Bentley does in each of these columns is take some part of the field of programming--something that every one of us will have run into at some point in our work--and dig underneath it to reveal the part of the problem that is permanent; that doesn't change from language to language. The first two parts cover problem definition, algorithms, data structures, program verification, and efficiency (performance, code tuning, space tuning); the third part applies the lessons to example pseudocode, looking at sorting, searching, heaps, and an example spellchecker.
Bentley writes clearly and enthusiastically, and the columns are a pleasure to read. But the reason so many people love this book is not for the style, it's for the substance--you can't read this book and not come away a better programmer. Inefficiency, clumsiness, inelegance and obscurity will offend you just a little more after you've read it.
It's hard to pick a favourite piece, but here's one nice example from the algorithm design column that shows how little the speed of your Pentium matters if you don't know what you're doing. Bentley presents a particular problem (the details don't matter) and multiple different ways to solve it, calculating the relationship between problem size and run time for each algorithm. He gives, among others, a cubic algorithm (run time equal to a constant, C, times the cube of the problem size, N--i.e. t ~ CN^3), and a linear algorithm with constant K (t ~ KN). He then implemented them both: the former in fine-tuned FORTRAN on a Cray-1 supercomputer; the latter in BASIC on a Radio Shack TRS-80. The constant factors were as different as they could be, but with increasing problem size the TRS-80 eventually has to catch up--and it does. He gives a table showing the results: for a problem size of 1000, the Cray takes three seconds to the TRS-80's 20 seconds; but for a problem size of 1,000,000, the TRS-80 takes five and a half hours, whereas the Cray would take 95 years.
The book is informative, entertaining, and will painlessly make you a better programmer. What more can you ask?