Though not its primary focus, The Data Access Handbook does talk some about tuning the database. In this podcast Rob Steward explains why tuning the database in relation to middleware will greatly improve database application performance? The podcast runs for 5:36
Click on the following link to listen to the podcast: http://dataaccesshandbook.com/media/RobSteward_Jun3_Tuning.mp3
The tuning that we talk about in The Data Access Handbook is mainly focused on the drivers. And I would say in answer to your question, that’s the thing that most people overlook.
Now you can go buy any of those books on how to tune my Oracle database, or how to tune my DB2 database, or how to tune my SQL Server, and every one of those books out there is going to talk about some configuration and tuning parameters of your database server. So what we say in The Data Access Handbook is, ‘do you need to do that?’ Absolutely! You absolutely need to tune your database. You need to follow those guidelines that are setout in all of those hundreds of books that exist on that subject, make sure those configurations parameters are right so you’re getting the best throughput, and the best scalability, and the best performance out of your database. But the thing that most people overlook is the ability to tune your database middleware.
So I’ve spent the last 16 years writing ODBC drivers, and JDBC drivers, and ADO.NET data providers, and OLE DB providers and just database middleware in general, and one of the things that I’ve sort of always focused on was giving the programmer the ability to tune what happens in those drivers.
Let me give you an example. A driver may ask the database for a certain amount of data; so let’s say you execute ‘SELECT * FROM tableA.’ Well that may be 100 rows, it might be 100,000 rows; it doesn’t really matter. But the key is the driver itself in every database has the ability to ask for some segment of that result set in one network roundtrip. So I may say give me 64K worth of data, or I may say give me 16K worth of data, or I may say give me 100 rows, there is some way with every database that I’ve talked about with which to block fetch – I’ll call it block fetching. I don’t want to just fetch just one row at a time, because that’s typically not very good performance. But here’s the thing, the driver doesn’t know if your table’s got 100 rows or if it has 100,000 rows? When it executes a statement and asks for the data it doesn’t necessarily know what the size of the row is. Is it four bytes? Or is it 4,000 bytes?
There’s a very vast range in that case where the amount of data that needs to be returned. Now in order to do it most efficiently, there is a way to break it up, and that is exactly what those drivers, or that middleware is attempting to do. It is attempting to say, ‘okay fetch me 64K, because I know that this is going to be a larger result set. Or send me 4K because I know it is going to be a small result set. And the bigger that package size, or the bigger that buffer that we get on each roundtrip, the less roundtrips we have to make, and therefore reducing network latency, reducing the amount of data – the requests that go back and forth across the database. Now the reason this needs to be exposed in tuning options is that the driver itself doesn’t know. It doesn’t know how many rows are in your table. It only knows, ‘here’s the SQL statement that you gave us, execute it, and begin to fetch the data back.’ So what the tuning parameters allow you to do is tell that driver, ‘hey, you know what, this is going to be a large result set.’
There are other things that the drivers can tune for, particularly with numbers and network packets. It also could involve buffers that it may pre-allocate to hold that data when it comes back. There are other tuning parameters related to doing inserts or updates where you may want to issue 100 inserts. And there are tuning parameters where you tell the driver, ‘hey, I’m probably going to issue 100 inserts, optimize for that.’ In terms of the interim buffers that you use, in terms of network roundtrips, and in some cases it is even things that get passed up to the database that make the database more intelligent, and to be able to more efficiently return your data to scale better, returning your data.
Really what I would say is the most overlooked thing the most often is the ability to tune those drivers. The drivers that I’ve written throughout my career all have multiple tuning options, because there are always choices you can make. Some of those options, or most of those options, can have a dramatic impact on your performance.
One of the tips we talk about in The Data Access Handbook is make sure that you buy/use ODBC, JDBC, ADO.NET drivers that have these tuning parameters. I’ve seen applications from customers that will send me – and they may be doing an operation that takes 10 seconds – and I can look at that trace and I can say, ‘you know, based on what you’re doing here, if you’ll accept this one tuning parameter…’ They may rerun it, and all of a sudden that 10 second operation takes 1 second. Many of those tuning options can have huge impacts on the overall performance of your data. So that would be my number 1 tip about tuning: Make sure you look at that middleware.
View all posts from Rob Steward on the Progress blog. Connect with us about all things application development and deployment, data integration and digital business.
Copyright © 2018 Progress Software Corporation and/or its subsidiaries or affiliates.
All Rights Reserved.
Progress, Telerik, and certain product names used herein are trademarks or registered trademarks of Progress Software Corporation and/or one of its subsidiaries or affiliates in the U.S. and/or other countries. See Trademarks for appropriate markings.