on 22 February 2014
This book attracted my interest because I deal with big data sets in my day-to-day job, using Oracle, SQL and similar tools. As a risk consultant focusing mainly on credit risk, my task is usually to gather, cleanse and prepare a bank's loan details with a view to developing a quantitative risk model.
As data sets are getting bigger and bigger, I thought it might be useful to learn more about industry practices for designing a proper data structure. Colleagues of mine mentioned the Dimensional Model, and I thought this book would not only help me understand the basic concept but also give advice on lessons learned, best practices and tips on how to deal with common stumbling blocks.
And this is why I say it is my own fault for spending so much money on this book. The basic concept of dimensional data models can be explained in 10 sentences, 8 of which are common sense that most people will already have considered if they have ever touched a big data set. I should have asked Google first...
Besides, the book clearly states that "This book is intended for data warehouse and business intelligence designers, implementers, and managers." So it is nothing for risk consultants like me, as the focus is more on data governance in general and on explaining to the reader that you can add "facts [...] by creating new columns".
Apart from my choosing the wrong book, one intended for another audience, I really can't get used to the way the authors present their ideas. Let me give you some examples of how proud the authors are of this technique (which to me is still common sense).
"Focus on delivering business value. This has been the Kimball mantra for decades."
"The Kimball techniques have been accepted as industry best practices."
You will read the word Kimball about 30 times in the first chapter (35 pages) and find sentences like those above on almost every other page. Let me also give you some examples of why I think this book is full of common sense and shares no new knowledge with the reader:
"Dimensional models should not be designed in isolation by folks who don't fully understand the business and their needs; collaboration is critical!"
"For example, in a retail sales transaction, the quantity of a product sold and its extended price are good facts, whereas the store manager's salary is disallowed."
"Dimensional modeling is widely accepted as the preferred technique for presenting analytic data because it addresses two simultaneous requirements:
- Deliver data that's understandable to the business users.
- Deliver fast query performance."
(Really? That's the 'secret' behind a 500-page description of the model?)
To summarize, if you really want to learn about the dimensional model, you should ask Google first. And if you have never seen a data warehouse or worked with data outside Excel, this might not be the best book to start with either.