Three Keys to SQL Server Performance
Everyone wants good performance from their database systems; some even expect, and need, Ferrari-level performance all the time. How is this achieved, though? What do you need to understand about SQL Server specifically in order to make your company’s applications hum along like a well-tuned car? We will look at three keys to SQL Server performance.
SQL Server Memory
1. SQL Server must be able to cache sufficient data in memory
SQL Server does not use memory like most applications. People unfamiliar with SQL Server’s use of memory are surprised to see it using 80% or more of a server’s RAM. “I have 128 GB of RAM on this machine; why is SQL Server set to consume 115 GB of it?!” Then they decide to lower Max Server Memory to 64 GB or less. Suddenly they might find that disk IO shoots up. You will also likely see execution plans rapidly aging out of the plan cache, which burns CPU on new query plan compiles and can lead to parameter sniffing issues. When all of this comes into play, people start complaining about applications not working as well. They start saying the magical phrase, “It’s slow.”
When a query is issued and the data requested is not in memory, SQL Server must use CPU and disk IO to bring that data into memory first. While it does this, a PAGEIOLATCH_* wait is registered because the current query is waiting on the data to be retrieved, and the query is put on hold. Reading data off the disk is always, always going to be slower than working with data that’s already in memory.
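To see this happening live, here is a minimal sketch that checks what currently executing requests are waiting on, using the standard sys.dm_exec_requests and sys.dm_exec_sql_text DMVs:

-- Requests currently waiting on data pages being read from disk.
SELECT r.session_id,
       r.wait_type,
       r.wait_time AS wait_time_ms,
       t.text AS query_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.wait_type LIKE N'PAGEIOLATCH%';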
So, how do you determine what is a sufficient amount of RAM for your SQL Server? The short answer is that I would encourage you to look at my post here and use the dbatools cmdlet Set-DbaMaxMemory. The post will help you understand more about what the Max Server Memory setting does, and the PowerShell cmdlet will recommend a good starting place for setting Max Server Memory. Here are some examples from the dbatools documentation and from the help available in PowerShell.
<# If you have a Central Management Server for your SQL environment, consider using this
   command to loop through all the SQL Servers and set Max Server Memory wherever it is
   currently set to something larger than the total amount of RAM assigned to the server. #>
Get-DbaRegServer -SqlInstance sqlserver |
    Test-DbaMaxMemory |
    Where-Object { $_.MaxValue -gt $_.Total } |
    Set-DbaMaxMemory

<# If you have a Central Management Server for your SQL environment, consider using this
   command to loop through all the SQL Servers and set Max Server Memory to the accepted
   formula created by a SQL Server expert. #>
Get-DbaRegServer -SqlInstance sqlserver |
    Test-DbaMaxMemory |
    Set-DbaMaxMemory

<# If you don't have a registered server, then just use the command below. #>
Test-DbaMaxMemory -SqlInstance SQLServerInstanceNameHere | Set-DbaMaxMemory
The longer answer is this: take a look at three metrics in the SQLServer:Buffer Manager category in Performance Monitor.
- Page Life Expectancy
First, if you observe Page Life Expectancy over time and the value has long stretches where it plummets and stays low without recovering, then you have some heavy-hitting queries running that could likely benefit from having more RAM available. This will likely be accompanied by SQL Server wait stats showing a high average wait for PAGEIOLATCH_* type waits, which is SQL Server reading pages from disk because the data pages it needed were not in memory. These queries may also improve, and use less RAM, if you can build an appropriate index to assist the query causing the plummet in Page Life Expectancy. Keep in mind that you need to do appropriate index consolidation when considering adding new indexes.
- Lazy Writes/sec
Second, review the Perfmon counter Lazy Writes/sec. If SQL Server is having to free up memory in between checkpoints, this value will be above zero. If it is regularly above zero, then SQL Server is regularly under memory pressure and is having to write data pages in memory back to disk so that it can load different data pages to satisfy queries.
- Free List Stalls/sec
Third, look at Free List Stalls/sec. The value of this Performance Monitor counter indicates the number of requests per second that had to wait for a free data page in memory. If Page Life Expectancy is often low for long stretches and both Lazy Writes/sec and Free List Stalls/sec are greater than zero, then you need to either adjust Max Server Memory up (as long as you don’t go too high based on the above information), add memory, or take a hard look at the indexes and queries involved when these PerfMon metrics are out of balance. A quick way to check all three counters from T-SQL is sketched below.
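Rather than opening Performance Monitor, you can read the same counters through a DMV. This is a minimal sketch; note that the /sec counters in sys.dm_os_performance_counters are cumulative since the instance started, so you need to sample twice and take the difference to get an actual per-second rate.

-- Buffer Manager counters exposed through T-SQL.
-- 'Page life expectancy' is a point-in-time value; the /sec counters are
-- cumulative since startup, so diff two samples to compute a rate.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%Buffer Manager%'
  AND counter_name IN (N'Page life expectancy',
                       N'Lazy writes/sec',
                       N'Free list stalls/sec');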
SQL Server Indexing and Disk Performance
2. SQL Server Must Be Able to Retrieve Data From Disk Efficiently
Look at your SQL Server wait stats information. If PAGEIOLATCH_* waits are very prominent, there is a good chance that the indexes in your database need attention, or those long IO waits could mean a problem with the disk subsystem.
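For a quick look at these waits, here is a minimal sketch against the standard sys.dm_os_wait_stats DMV. The numbers are cumulative since the instance last started.

-- Cumulative PAGEIOLATCH_* waits since startup. A high average wait per task
-- suggests that individual reads from disk are slow.
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
FROM sys.dm_os_wait_stats
WHERE wait_type LIKE N'PAGEIOLATCH%'
ORDER BY wait_time_ms DESC;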
PAGEIOLATCH_* waits are all related to reading 8KB data pages from the disk into memory. Inefficient indexes could be making these reads take longer because SQL Server can’t quickly get the data it needs from the existing indexes. This can happen over time as query and data patterns change.
For example, your company might introduce a new category of products. Suddenly people are querying that category and its associated products far more and older, mainstay products less. The distribution of that data may affect the execution plan generated for those queries.
The company may have re-factored an existing application into new tables and missed indexing the Category column. Now when people are searching for things like “Bike Accessories” there is no supporting index. This results in long table scans of millions of rows.
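In that situation, a supporting index on the Category column turns those scans into seeks. A minimal sketch, using hypothetical table and column names for illustration:

-- Hypothetical table and column names; adjust to your schema.
CREATE NONCLUSTERED INDEX IX_Products_Category
    ON dbo.Products (Category)
    INCLUDE (ProductName, ListPrice);

-- A search like this can now seek on the index instead of scanning millions of rows.
SELECT ProductName, ListPrice
FROM dbo.Products
WHERE Category = N'Bike Accessories';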
As a start to determining whether there is inefficient indexing, run and save the output of sp_BlitzIndex to examine your tables and indexes. Review its recommendations and make adjustments. Some time after a restart of SQL Server, re-run sp_BlitzIndex and compare the output to the previously saved run, looking to see whether SQL Server is using the adjusted indexes.
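For example, assuming the First Responder Kit procedures are installed on the instance:

-- Examine the indexes in one database...
EXEC dbo.sp_BlitzIndex @DatabaseName = N'YourDatabaseName';

-- ...or sweep every database on the instance.
EXEC dbo.sp_BlitzIndex @GetAllDatabases = 1;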
To review whether you have a disk IO subsystem issue, look at the DMV sys.dm_io_virtual_file_stats. You can also review the SQL Server Error Log, looking for messages indicating IO requests that took longer than 15 seconds to complete. These can be an indication of an IO subsystem issue. Review this article and this article on these topics.
Both of these articles provide information on understanding the DMV and the error message. There is also information about Performance Monitor counters to use to measure potential problems.
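As a starting point, here is a minimal sketch that derives average read and write latency per database file from sys.dm_io_virtual_file_stats. The figures are cumulative since the instance started.

-- Average IO latency per file since startup; sustained high values point at
-- the disk subsystem (or at queries reading far more than they should).
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON mf.database_id = vfs.database_id
   AND mf.file_id = vfs.file_id
ORDER BY avg_read_latency_ms DESC;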
If you are in the cloud, such as AWS, be sure to review the settings for EBS and FSx storage to ensure that the IOPS and throughput are set up appropriately. Also, be sure to take into consideration how the AWS EC2 instance type might be throttling your IOPS and throughput capabilities. Just because your storage is provisioned with a certain IOPS and throughput doesn’t mean that the EC2 instance can support those storage settings.
T-SQL Anti-Patterns to Avoid
3. You Must Write “Good” T-SQL
Poorly written T-SQL can cause poor application and SQL Server performance. Here is a brief list of things to avoid.
A. Writing non-SARGable T-SQL. This will cause SQL Server to scan entire tables instead of using an existing index. For details and examples see this, this, and this; it has been written about quite frequently, so there is no need to explain it in depth here. A short sketch appears after this list.
B. Overuse of cursors. SQL is a set-based language, so making it do operations row by row is far less efficient than using set-based logic. Take a look at this post and this post.
C. Overuse of SELECT *. Some tables are very wide and have millions or even billions of rows. If you don’t truly need every column, then do SQL Server and your application a favor and only return the columns that are actually needed!
D. Be careful with scalar user-defined functions. This goes back to the idea of SQL being a set-based language. A scalar UDF returns a single value for each value passed to it, so when you pass large numbers of rows through that sort of logic, each row is processed one at a time inside the function. SQL Server also doesn’t do a good job of “seeing inside” scalar UDFs, which means it can be difficult to tell a UDF is involved when reviewing a query plan. Additionally, scalar UDFs kill SQL Server’s ability to go parallel. For more, take a look at this, this, and this. A sketch of a set-based rewrite appears after this list.
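Here is a minimal sketch of item A, assuming a hypothetical dbo.Orders table with an index on OrderDate. Wrapping the column in a function hides it from the index, while a plain range predicate allows an index seek:

-- Non-SARGable: the function wrapped around the column forces a scan of every row.
SELECT OrderID, OrderTotal
FROM dbo.Orders
WHERE YEAR(OrderDate) = 2023;

-- SARGable: the bare column compared against a range can seek on an OrderDate index.
SELECT OrderID, OrderTotal
FROM dbo.Orders
WHERE OrderDate >= '20230101'
  AND OrderDate <  '20240101';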
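And a sketch of item D, again with hypothetical names. The first query calls a made-up scalar UDF once per row; the second expresses the same calculation inline so SQL Server can optimize it across the whole set and go parallel:

-- Row by row: this hypothetical scalar UDF runs once per row and blocks parallelism.
SELECT OrderID, dbo.fn_CalculateTax(OrderTotal) AS Tax
FROM dbo.Orders;

-- Set-based: the same math written inline can be optimized across the whole set.
SELECT OrderID, OrderTotal * 0.08 AS Tax
FROM dbo.Orders;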
This list could be much, much longer. The point here is that how you write your T-SQL often has a direct impact on performance. If you write T-SQL for applications, I strongly encourage you to look at the blogs of folks like Erik Darling and Kendra Little as well as sites that have a large number of entries on T-SQL like this one.
Here’s the short list of performance keys.
1. SQL Server must be able to cache sufficient data in memory
2. SQL Server must be able to retrieve data from disk efficiently
3. You must write “good” T-SQL
Next Steps To Take
If you would like help with anything in this post, or with something else related to SQL Server, reach out to me here, on Twitter, or by email at [email protected] and I’ll be glad to offer assistance.