One of the foundational responsibilities of a DBA is making sure that backups are available. SQL Server has several foundational types of backups that can be made. Before I dive into that though, I want to ask a question that seems to have a fairly obvious answer.
Why take backups?
- You want to be able to recover from data corruption or mistakes made when people make changes directly to the data – like when they write and run a script in SSMS, outside of an application that’s using a set of tested stored procedures.
- You want to be able to refresh you non-prod environments so developers have data that looks like prod to develop against.
- You want to have backups so you can test whether you can restore and how long it will take
- You want backups as they are often an essential part of the migration process from one SQL Server to another – usually newer SQL Server, or to Azure.
SQL Server Full Backups
What does a full backup accomplish? A full backup, creates a backup of the entire database. It is all-inclusive, containing the entire database structure and all the data. When a full backup starts, the Log Sequence Number is recorded and when the full backup completes, the Log Sequence Number is recorded. This Log Sequence Number is the mechanism used by SQL Server to know what order INSERTS, UPDATES or DELETES occurred in. As such, having the beginning and ending Log Sequence number recorded as part of the full backup allows for a transactionally consistent backup because the full backup is aware of the changes that took place while the backup was being made. This allows for recovery of those transactions during the restore process. This full backup serves as the base backup, or starting point for all differential and log backups after it. Each time a new full backup is taken, a new base backup point is created.
SQL Server Differential Backups
What does a differential backup do for you? A differential backup works off the concept of a differential bitmap. SQL Server is composed of eight kilobyte pages. These pages are collected in groups of eight called “Extents.” There is a bitmap that is used to mark any Extents that have experienced data changes since the last full backup. When a differential backup is executed, it is this bitmap that determines what data is backed up.
Based on the size and rate of change in a database, differential backups are often going to complete faster than a full backup. Additionally, the differential backups will be smaller in size since only changed extents are backed up, rather than the entire database.
When the full recovery model is used, the power of the differential can be very important. The differential backup reduces the number of log backups that have to be restored. For example, if there is a full backup taken on Sunday night, and transaction logs every 15 minutes after that, when a DBA needs to restore the database to Thursday at 5:53 AM, she needs to construct a restore process based on the Full backup and every transaction log since Sunday evening! That’s a lot of transaction logs to have to include. With a daily differential occurring at 5 AM, the DBA now only needs to restore the full backup from Sunday evening, the one differential that was created at 5 AM on Thursday, and only the transaction log backups that occurred after 5 AM Thursday. Being able to skip over using all those log backups is powerful.
This makes for much simpler script creation and readability, and can save a lot of time if the DBA is trying to write this script manually. You should have an automated process to create the backup script, but some DBAs don’t have that set up yet. We’ll talk about automating the backups script creation later in the series.
SQL Server File and FileGroup Backups
What are File and FileGroup backups? Well to start with, this is a more advanced topic than these other backup types. As such, it is less likely that you will need this type of backup. However, knowing what your options are is always a good idea.
File and FileGroup backups allow a more granular backup, and by extension, a more granular restore process. It does what it sounds like. A file or filegroup backup backs up an individual file or a collection of files contained in a filegroup. This allows for restoring a small part of a database to fix a problem rather than having to restore the entire database. This option, however, does introduce complexity and, in my opinion, is really only useful for large databases, particularly in the range of 500 GB or more.
These file and filegroup backup types can be used when a larger database has been divided into multiple files. For example, a large database containing sales information might be divided into various Sales files, perhaps by year, or even by month if there are a high volume of sales recorded in the database. These files are then aggregated into FileGroups. See below for an example of creating files and filegroups in a database. This example was taken from the MS docs page here.
USE master; GO -- Create the database with the default data -- filegroup, filestream filegroup and a log file. Specify the -- growth increment and the max size for the -- primary data file. CREATE DATABASE MyDB ON PRIMARY ( NAME='MyDB_Primary', FILENAME= 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\data\MyDB_Prm.mdf', SIZE=4MB, MAXSIZE=10MB, FILEGROWTH=1MB), --Here is where a separate file is created and assigned to a FileGroup, which is little more than a container. FILEGROUP MyDB_FG1 ( NAME = 'MyDB_FG1_Dat1', FILENAME = 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\data\MyDB_FG1_1.ndf', SIZE = 1MB, MAXSIZE=10MB, FILEGROWTH=1MB), --Here is where another separate file is created and assigned to a FileGroup. ( NAME = 'MyDB_FG1_Dat2', FILENAME = 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\data\MyDB_FG1_2.ndf', SIZE = 1MB, MAXSIZE=10MB, FILEGROWTH=1MB) LOG ON ( NAME='MyDB_log', FILENAME = 'c:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\data\MyDB.ldf', SIZE=1MB, MAXSIZE=10MB, FILEGROWTH=1MB); GO ALTER DATABASE MyDB MODIFY FILEGROUP MyDB_FG1 DEFAULT; GO -- Create a table in the user-defined filegroup. USE MyDB; CREATE TABLE MyTable ( cola int PRIMARY KEY, colb char(8) ) ON MyDB_FG1; GO
SQL Server Transaction Log Backups
What is a transaction log backup? A transaction log backup will take a backup of SQL Server’s transaction log so that data changes stored there can be recovered during the restore process. Again the Log Sequence Number concept mentioned earlier is key. Because transactions occur in a certain order, the log backups contain that order of transactions. Transaction log backups must be restored in order. If you try to restore a sequence of log backups in the wrong order, SQL Server will generate an error.
Having these log backups allows you to do what is called a point in time restore. Having the option for a point in time restore can be critical for a business. Let’s suppose someone is making changes directly to one or more tables. If a mistake is made and an unexpected data change occurs, point in time restore will allow you as the DBA to recover the data up to the moment just before the erroneous change was made.
SQL Server Copy Only Backups
What are Copy-only backups? Copy only backups are backups that do not affect or track the Log Sequence Number information. Why would you use this? Well, let’s suppose that you are already taking full, differential and log backups. You are asked to provide a backup file of a database to another department or perhaps just a single developer who is working on a project. That Developer, hopefully, doesn’t have access to the backup directory for your backup processes. Rather than you trying to copy a set of backups to a share or other location that the Developer does have access to, you could simply create a Copy Only full backup that is written to a shared location. The Developer can then use that backup to restore locally to her machine, and you receive the benefit of not changing the base backup for your backup chain.
What to Do Next
- Examine your environment to see if you are leveraging the “Power of the Diff” by taking differential backups to help simplify your recovery process.
- Consider the size of your databases and determine if you have a situation that might benefit from having a database with multiple files and filegroups.