2 of 2 customers found the following review helpful
- Published on Amazon.com
I was quite surprised how light the book is in terms of content. Out of 106 pages, 30 are for opening/closing details (cover, copyright, credits, reviewers, table of contents, index). This leaves some 76 pages to cover the following topics:
- Chapter 1 – Installing Vertica
- Chapter 2 – Cluster Management
- Chapter 3 – Monitoring Vertica
- Chapter 4 – Backup and Restore
- Chapter 5 – Performance Improvement
- Chapter 6 – Bulk Loading
The target audience for this book is “Vertica users and DBAs who want to perform basic administration and fine tuning.” Prior knowledge of Vertica is said not to be mandatory, although in my opinion a user without it will most likely be lost in this book.
Chapter 1 – Installing Vertica
The author begins by outlining how Vertica differs from other MPP databases and mentions that data is stored in a columnar fashion, but misses encoding on top of compression, as well as other critical features of Vertica such as its high availability. I feel the author should have also listed the features that come with each version (Community vs. Enterprise) of Vertica when mentioning the Management Console. It would have been helpful to mention that the logical design that’s typically performed at the database level can be taken to the schema level.
The book tries to cover the pre-installation requirements in 4 short paragraphs. There are many other critical pre-installation steps, such as OS-level configuration and hardware planning, that should have been touched on. The author suggests keeping 20-30 percent of disk space free on each node; however, the official recommendation is 40%. There is also no mention that Vertica can be run on AWS or that it can be run locally using a VM image from the Marketplace.
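To make the headroom figures above concrete, here is a minimal Python sketch of a free-space check. It is only an illustration: the function names and the `/` path are hypothetical, and the 0.40 default simply mirrors the 40% recommendation cited in this review rather than anything taken from the book.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return the fraction of a volume that is still free (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def meets_headroom(total_bytes: int, free_bytes: int, min_free: float = 0.40) -> bool:
    """True when at least `min_free` of the volume is free.

    The 0.40 default is the figure cited in the review; pass 0.20 or 0.30
    to check against the range the book suggests instead.
    """
    return free_fraction(total_bytes, free_bytes) >= min_free


if __name__ == "__main__":
    # "/" is a placeholder; point this at the data directory of a node.
    usage = shutil.disk_usage("/")
    print(f"free: {free_fraction(usage.total, usage.free):.0%}")
    print("meets 40% headroom:", meets_headroom(usage.total, usage.free))
```

A node with 30% free space would pass the book's 20-30 percent guideline but fail the stricter 40% check, which is the gap the paragraph above points out.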
The rest of the chapter steps through the software installation process, and mentions it aims at covering a two-node cluster installation. I can’t really come up with any good reason to demonstrate a two-node installation, as the most common installation has three nodes. The output from the installation script seems completely unnecessary.
Chapter 2 – Cluster Management
Most of the material in this chapter seems to imply that projections are strictly segmented, when they can obviously also be replicated (especially in the case of smaller dimension tables in a star schema). This isn’t hinted at until Chapter 5. However, the author does a decent job of explaining how skew plays a factor in segmentation.
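The effect of skew on segmentation can be sketched with a toy model. The Python below is an assumption-laden illustration, not Vertica's implementation: it uses `zlib.crc32` as a stand-in for the database's hash function, and the function names and sample keys are hypothetical.

```python
import zlib
from collections import Counter


def node_for(value: str, node_count: int) -> int:
    """Map a segmentation-key value to a node index (toy stand-in for a hash segmenter)."""
    return zlib.crc32(value.encode("utf-8")) % node_count


def rows_per_node(values, node_count: int) -> Counter:
    """Count how many rows land on each node, including nodes that get none."""
    counts = Counter({node: 0 for node in range(node_count)})
    for value in values:
        counts[node_for(value, node_count)] += 1
    return counts


def skew_ratio(counts: Counter) -> float:
    """Busiest node's row count divided by the ideal even share (1.0 means perfectly even)."""
    ideal = sum(counts.values()) / len(counts)
    return max(counts.values()) / ideal


if __name__ == "__main__":
    nodes = 3
    # Many distinct key values: rows can spread across the nodes.
    distinct_keys = [f"order-{i}" for i in range(10_000)]
    # A single repeated key value: every row hashes to the same node.
    repeated_key = ["US"] * 10_000
    print("distinct keys:", dict(rows_per_node(distinct_keys, nodes)))
    print("repeated key: ", dict(rows_per_node(repeated_key, nodes)))
    print("skew with repeated key:", skew_ratio(rows_per_node(repeated_key, nodes)))
```

With a single repeated key value on three nodes, one node holds every row, so the ratio is 3.0. That is the worst case for a segmented projection, and it is one reason small, low-cardinality dimension tables are better replicated than segmented.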
There appears to be confusion between adding hosts to a cluster and adding nodes to a database. The distinction should be more explicit and each process individually called out as it is with removing the node from the database and removing the host from the cluster.
There is also no mention that a K-safety higher than 2 isn’t really recommended or that a minimum K-safety of 1 is required for production clusters.
The rest of the chapter does a fairly decent job of describing node/host management. However, the section on spread isn’t really applicable after version 6.1 since spread is integrated into the OS.
Chapter 3 – Monitoring Vertica
This chapter fell short on a critical part of Vertica’s workload management. At the very least, there should have been a mention of resource pools, query requests, background processes, monitoring for potential problems, and the data collector and its role in aggregating system data.
Chapter 4 – Backup and Restore
The author provides a good basic overview of the backup and restore process. Listing the miscellaneous settings, or parameters, for vbr.py seems unnecessary; a reference to the documentation would have sufficed. When mentioning copycluster, the author could have added more detail about how a dormant node can be used in a production environment for failover. I feel that this chapter could have highlighted more of Vertica’s high availability strategy, as some customers don’t even use backup.
Chapter 5 – Performance Improvement
The author does a poor job of comparing Vertica’s columnar architecture to a traditional row-store. While it is true that Vertica can read only the columns involved in the query, this is also true of a traditional row-store under certain conditions with proper indexes. I feel a better approach would be to show an example of how the physical data is stored in each architecture.
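The kind of side-by-side example suggested above could look like the following toy Python sketch. It is not how Vertica or any particular row-store lays out bytes on disk; the table, column names, and helper functions are all hypothetical and only show which values end up next to each other in each layout.

```python
# A tiny hypothetical table: (id, name, country, amount).
COLUMNS = ("id", "name", "country", "amount")
ROWS = [
    (1, "alice", "DE", 120.0),
    (2, "bob", "US", 75.5),
    (3, "carol", "US", 42.0),
]


def to_row_store(rows):
    """Row layout: all values of one record are stored contiguously."""
    return [value for row in rows for value in row]


def to_column_store(rows, columns):
    """Column layout: all values of one column are stored contiguously."""
    return {name: [row[i] for row in rows] for i, name in enumerate(columns)}


def sum_amount_row_store(flat, width, amount_index):
    """Summing one column means striding across every full record."""
    return sum(flat[i] for i in range(amount_index, len(flat), width))


def sum_amount_column_store(store):
    """Summing one column touches only that column's values."""
    return sum(store["amount"])


if __name__ == "__main__":
    flat = to_row_store(ROWS)
    store = to_column_store(ROWS, COLUMNS)
    print("row layout:   ", flat)
    print("column layout:", store)
    print(sum_amount_row_store(flat, len(COLUMNS), 3), sum_amount_column_store(store))
```

Both layouts give the same answer, but the row layout has to step over `id`, `name`, and `country` to reach each `amount`, while the column layout reads a single contiguous list. Runs of repeated values in a column, such as the two adjacent `"US"` entries, are also what make per-column encoding effective.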
The first section also incorrectly states that a superprojection gets created when the table is created (it actually occurs at the first data load). The remainder of the section does a reasonable job of introducing the concept. With regard to high availability and recovery, I feel it’s important to mention how Vertica uses checkpoint epochs with projections to recover data.
The material on the Database Designer seemed to completely miss the performance design priority (Balanced/Query/Load).
The remainder of the chapter briefly discusses the concept of ROS/WOS and Tuple Mover operations.
Chapter 6 – Bulk Loading
The COPY command is covered in extreme brevity. There should have been some details about monitoring loads.
I feel that the book falls short on the discussed topics. Critical concepts such as the architecture, resource management, and monitoring/troubleshooting are not adequately covered. I couldn’t find anything that isn’t more thoroughly covered in the official documentation. There is too much space used on script outputs and screenshots.
It also seemed that the book tried to be version agnostic; however, many features, such as the installation script, Database Designer, and Management Console, have been dramatically improved and overhauled. The author should have explicitly mentioned that this book focuses on version 6.1. The book comes late to the game, as version 7.0 was released late last year.
Covering the essentials of the platform properly would probably require at least three books (with 500+ pages). A proper anthology on the platform would probably look like:
- Performance Tuning
I rate the book 2 out of 5 stars.
1 of 1 customers found the following review helpful
- Published on Amazon.com
"Look on my works, ye mighty, and despair". Starting tinkering with databases in 2010 and mastering 20 databases by 2014 - while finding time for poetry, dancing and badminton - is an impressive accomplishment in anyone's book. Apart from Vertica (last touched in 2012; I remember the atrocious projections crafted by DBAs using Vertica's designer tool, and the database choking on "select ... where x in (select ...)" queries), SSAS and QlikView might be the only new technologies which I myself picked up over these years - and in each case, there was a good Packt-published book to help me along the way.
Sadly, those books were exceptions, and the typical Packt book is a low-quality quickie that looks like a digest of a better book, written by a first-time author hailing from the Indian subcontinent. "HP Vertica Essentials" is just one more example of this quantity-over-quality approach. (And so is one of the other reviews). The book is transparently inadequate, and unnecessary, given that Vertica documentation - helpfully split into thematic volumes - is only a Google search away. ("Concepts Guide" is the best place to start if you are a beginner).
- Published on Amazon.com
I worked with HP Vertica for a while at my previous company. I did some setup operations, querying and aggregation. I had read only the official docs, so when I saw this book about HP Vertica, I decided to read it.
First of all, I want to note that this book is mostly for DBAs, but some chapters will be interesting for developers too. If you're a DBA, or you would like to know more about Vertica and at least want to know how to set up a Vertica server to play with, then you can read it.
About the author
Rishabh Agrawal is a senior database research engineer and consultant at Impetus India. He has been working with different databases for the last four years, including relational databases, MPP databases and NoSQL databases. Here I want to note that I like to read books written by real developers and database administrators, I mean people with practical experience, not theorists. In my opinion only such people have enough knowledge to teach others. That's why I was full of motivation and wanted to read this book.
Let's quickly go through the chapters, but first I want to say a few words. The first thing I noticed is that this book is really short: all the chapters together are only ~80 pages. HP Vertica's documentation is quite big (MySQL's documentation is also really big, nearly 3500 pages), so it's hard to understand how the author can describe and explain so many aspects of Vertica in only 80 pages.
Chapter 1: Installing Vertica
This chapter is interesting for both developers and DBAs. Here we have a step-by-step guide on how to install a Vertica server and create a first database. It's easy to follow and a quick way to set up a Vertica server. After that, depending on your needs, you'll be able to play with it and explore queries, or, if you're a DBA, read about more specific and complex admin tasks.
In this chapter Rishabh gives a few pieces of advice about host configuration to make things go smoothly while installing Vertica, such as swap space, CPU frequency scaling and so on. I would recommend removing all the unnecessary install_vertica script output, which is specific to each installation, doesn't make much sense to include, and takes up so much space in a really short book.
You'll probably have some issues installing Vertica on your own OS, so I would recommend that the author add a link to Vertica's Community, which can save you time and help you solve your issues and continue on your way with Vertica.
Chapter 2: Cluster Management
I like how Rishabh Agrawal describes the configuration of an elastic cluster, when and why we need it, what it can do for us and so on, in an easy to read and understand manner. He explains all the operations that we may need when working with a Vertica cluster: adding, removing and replacing nodes, changing K-safety and local segmentation of nodes. He shows us how he does it and what we should do before performing these operations, for example backing up our data, checking that the hosts can reach each other and so on. Rishabh also mentions Vertica's Management Console, which is available in the enterprise edition only. In my opinion you won't buy the enterprise edition if you're doing research and, for example, want to figure out whether your company needs this technology or not. Of course you want to set it up quickly, play with it and then make a decision, so I would remove everything related to Vertica's Management Console from this book.
If you follow Rishabh's instructions step by step, you will probably need to read chapter 4 first, about backup/restore operations, and then come back to this chapter again.
Chapter 3: Monitoring Vertica
Here we become familiar with two ways to monitor Vertica. The first one is using system tables and the second one is monitoring through log files. As in the previous chapter, I would remove the last approach, the one related to the Management Console. I would also like the author to describe system tables in more detail; I mean it would be great to hear from him which system tables are more important on a day-to-day basis. Finally, the author gives a good description of Vertica's events and how we can check them via system tables.
Chapter 4: Backup and restore
In this chapter the author describes how to do full and incremental backups and restore data using vbr.py. It's an important chapter for both DBAs and developers, because it's critical to avoid data loss.
Chapter 5: Performance Improvement
Here Rishabh Agrawal explains projections, both segmented and unsegmented, and in which cases you should use one or the other. Then the author describes how to create projections for some tables using your query set and the Database Designer, a tool for creating projections. Plus we have an example of how to create a projection manually. In the second part of the chapter, the author describes the tuple mover, its operations and how we can optimize its work. Here I would like to see at least some research which shows the performance improvements, in percentages or something like that, achieved by the author after applying all the modifications. I would like to see whether all these improvements are worth my time.
Chapter 6: Bulk Loading
This chapter is important for developers and DBAs too. At my previous job I used bulk loading in a lot of cases. The author explains different load methods like auto, direct and trickle, plus he shows examples of loading data from different places, including copying from a local location and so on.
I would add more examples to this chapter, because right now I don't see the more complex cases where you need to skip or transform some columns from your original data file before copying it into a table.
This book won't solve all your problems, but it's a good point to start from. This book will probably make your start with HP Vertica a little more comfortable and straightforward, but you won't get a complete understanding of Vertica. I think that my time wasn't wasted and now I know a little bit more about Vertica's administration, but you definitely need to read the official documentation after this book anyway. Chapter 1 (installing Vertica) and chapters 4 to 6 (backup and restore, projections and bulk loading) are interesting not only for DBAs, but for developers too. This book is helpful for beginners, but I think people who already have some experience with HP Vertica will find interesting parts or chapters in it too.
P.S. reposted from my blog http://4devs.io
- Published on Amazon.com
Format: Kindle Edition
The book is way too short for a database book. It is not possible to explain a database in so few pages.
The installation chapter doesn't mention using the deadline I/O scheduler, which is important for read performance. There are other prerequisites that should be mentioned, if not explained in detail.
It would have been useful to explain vsql and frequently used commands to get started.
Resource pools, which are a very important part of the Vertica architecture, were not covered.
Security was not covered.
In my opinion, the purpose of writing a book on a database should be to ensure a better understanding of concepts and to share insights gained from experience and testing. Most of the book feels like reading the official documentation.
- Published on Amazon.com
This is a book for newbies (it is very short). It explains basic topics clearly, with examples that use a database, but for more in-depth details of the platform you will need to look at other sources, for example the HP Vertica documentation. The book covers all the essential topics, from installation to loading data. The topics are:
- Backup and Restore
Another point is that the book is not based on the current version of HP Vertica, but the concepts are similar.
If you have no experience with HP Vertica and need the basics quickly, this is a book for you.