Why Write The Data Access Handbook? The Authors Explain

March 24, 2009 Data Platform

In this podcast, Rob Steward explains why he and his coauthor, John Goodson, wrote The Data Access Handbook. The podcast runs for 4:58.

To listen to the podcast, please click on the following link: http://dataaccesshandbook.com/media/Rob3.mp3

Podcast text:

Why did you write The Data Access Handbook?

Rob Steward: We wrote the book because we have been told that the world is flat. What I mean by that is that, by the fourth century BC, Greek philosophers had started to offer compelling evidence that the world was in fact round. But it still took several hundred more years before anybody started to believe them. I see the same thing happening in the database application performance area. What we’re told, and what all the experts tell you, is that if you roll out your application and you’ve got performance and scalability problems, all the problems exist in the database. What they’ll tell you is: ‘You need to tune your database’; ‘you need to do the right configuration options’; ‘you need to create the right schemas, with the right tables, and the right relationships, and the right indexes.’ And that will solve all of your problems. Essentially what they’re telling you is that the world is flat, because that’s not all that there is to it.

Even once you’ve done that and you’ve tuned your database, oftentimes we find that the performance problems are not solved. And the reason for that is that in today’s environment we see that 75-95% of the time spent accessing your data is actually in the middleware or on the network.

I’ve seen, literally hundreds or thousands of times in my career, people doing exactly what the experts say: Let’s tune those things; let’s get ’em right. And it still doesn’t resolve the problem, or it only solves a very small part of their problem. I’ve spent a large percentage of my time over the last ten years going around talking to people at conferences like JavaOne, TechEd and many others about the performance implications of the middleware. I’ve written magazine articles and white papers; my coauthor, John, has been doing the same thing for 10 or 15 years now. Between the two of us we have a combined 35 years of building database middleware. So we’ve built up a lot of knowledge, we’ve spent a lot of time sharing that knowledge, and we’ve helped a lot of people.

I like to tell this particular story, because I think it shows you the kind of impact I’m talking about. A couple of years ago I was giving a talk on performance, and there is a particular tip that I give out – which I’ve been giving out for a number of years and it’s also in the book – about how to change one little thing, along with why that makes a difference: it reduces lots of network round trips and the amount of I/O that you cause to happen on the database server. So I was giving the talk and a guy came up to me and he said, ‘Rob, I heard you give this particular tip about a year ago at another place you were speaking. I went back to work and implemented that particular tip. We were doing an operation where we were inserting 5 million rows into a database, and it was taking 8 hours to do. So we made this one change that you suggested, and all of a sudden it took 10 minutes instead of 8 hours.’ Now that is an extreme example, but it is a real-world example that I like to use to explain what it is we are talking about. That one little change – they didn’t change their database, they didn’t tune it, they didn’t make any changes on that side – all they did was change a couple of lines of code, and that’s the kind of benefit that they saw: an operation that took 8 hours now took 10 minutes.

Now the reason that they didn’t know about that before is that you couldn’t find information like what you can find in this book. You can walk into any bookstore and you’ll find book after book after book on how you tune those database servers. But what you won’t find, at least until now, is a single book that says: well, how do I write the best data access code? And how do I tune that middleware? And how does it actually work? And how does it influence things? There has never been a book like this before. It’s covering new ground that the experts you are used to don’t talk about.

Two years ago we finally decided that we needed to take all of this information that we’ve been sharing with people for years – through articles, white papers and at conferences – and put it into a form in which we can share it with a lot more people. That’s ultimately why we wrote the book.

Rob Steward