The Data Access Handbook covers numerous causes for database bottlenecks, including hardware constraints, particularly with regards to memory issues. In this podcast Rob addresses issues that cause memory bottlenecks and explains how these can be overcome. The podcast lasts for 4:32.
You may listen to the podcast here: http://blogs.datadirect.com/media/RobSteward_MemoryDataAccessBottlenecks.mp3
So there's a specific class of issues that will happen that will cause you to use a lot more memory than you may need or you may want. In the book, we talk about a number of different tips that are sort of related, but we don't -- it's a great question because we don't have it organized necessarily all in one spot saying "Here's a thing that can affect memory." So the biggest thing I would say that can affect memory is the data that you retrieve, if it contains long data. So if the data that you're retrieving or trying to insert or update contains pictures or papers, very long books, very long string fields or character fields, typically they're going to be in a clob or a blob column or a long or a text or an image or any of those data types across the different databases that can store, say, a 2GB piece of data in one column in one row. So for example, this week I had a customer who was getting an out of memory error, and when we started to look at the code, what we found was they were trying to insert some pictures, and actually their pictures were getting up close to 2GB, but what would happen was within -- and this particular case was ADO.NET -- what was happening was they were loading that entire picture into memory, into an insert in the database, but there are a lot of ways within ADO.NET and within .NET to stream that picture in not all at once, and again, the exact error that they were getting was that they were getting an out of memory exception, so this is exactly, you know, the case that we're talking about. Once they change their code to use the streaming interfaces that exist in .NET, everything looks fine. So again, there was that long data, you know -- the same types of interfaces existed in JDBC. And in ODBC, you can also stream data in in chunks when you're dealing with very large data.
Now, that's when you're doing like an insert or an update. Same thing applies when you're fetching the data. Most drivers have options that control how much of that data's buffered. So again, we talk extensively in the book about getting database middleware, drivers, providers, or whatever that is tunable, one of the tuning parameters that you typically have is when I fetch that picture data, or that long data, how much of it should I get at a time, how much of it should I buffer? So look there as well. That's one of the big things that we see as well as those tuning parameters affect how the drivers do that. Also, some of them -- or all the APIs have some way to retrieve that data in chunks, so don't just allocate a 1GB buffer. If you think about it, if you have a picture column and you retrieve 100 rows, and even if, you know, you do half a GB, you know, that's 50GB -- you're going to get an out of memory error. So you have to be careful about the way you stream long data. That's the most common way that I see that people hit memory limitations.
The other way that we see is when you pull very large result sets -- so for example, you might have some data warehouse query that returns a million rows. Now, when those million rows come back, you know, they may be a couple thousand bytes a piece, or they may -- you may get into where you're using a lot of memory to store those results coming back. Again, there are tuning parameters on most good database middleware, most drivers, that allow you to affect how much it will hash in memory, and some of them contain options that say, "Well, when I hit this certain memory limit let's start caching the disk," but probably the biggest thing you're going to see in terms of memory is watch out for that long data.
View all posts from Rob Steward on the Progress blog. Connect with us about all things application development and deployment, data integration and digital business.
Copyright © 2018 Progress Software Corporation and/or its subsidiaries or affiliates.
All Rights Reserved.
Progress, Telerik, and certain product names used herein are trademarks or registered trademarks of Progress Software Corporation and/or one of its subsidiaries or affiliates in the U.S. and/or other countries. See Trademarks for appropriate markings.