Lessons Learned Updating to SQL Server 2017 AlwaysOn Availability Groups

In late 2019, a project that had been in progress for well over a year finally came to a conclusion.  I had collaborated with a number of people to migrate a standalone SQL Server 2008 instance and two standalone SQL Server 2008 R2 instances.  Each standalone SQL Server was migrated to a three-node Availability Group. Here are a few things I learned along the way.

Automatic Seeding

First, consider using automatic seeding if you are migrating to AlwaysOn Availability Groups.  I hadn’t previously worked with anything newer than SQL Server 2014, so I was unaware of this feature, which is new as of SQL Server 2016. In this case, the old SQL Servers were migrating up to SQL Server 2017, so this option was available to us.  For a couple of months leading up to the migration event there had been various tests involving the setup of the AGs.  One of those tests involved restoring a full backup followed by one or more log backups on the secondary nodes.  As the migration date approached, we started experimenting with automatic seeding.

The automatic seeding feature streams the database from the primary to one or more secondaries behind the scenes.  This is accomplished using the endpoints created during the Availability Group creation process. The feature can eliminate the need to write scripts that restore databases to the secondary nodes in the AG.
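If you go this route, seeding mode is set per replica, either when the AG is created or afterward. Here is a minimal sketch, assuming a made-up AG named MyAG and a secondary named SQLNODE2:

-- Run on the primary: switch the replica to automatic seeding.
ALTER AVAILABILITY GROUP [MyAG]
    MODIFY REPLICA ON N'SQLNODE2'
    WITH (SEEDING_MODE = AUTOMATIC);

-- Run on each secondary so the AG may create the seeded databases there.
ALTER AVAILABILITY GROUP [MyAG] GRANT CREATE ANY DATABASE;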

There are four ways to monitor the streaming of databases to the secondary nodes. First, the SQL Server Error Log records entries about the process. Second, various DMVs contain information about the status and progress of automatic seeding.  Third, the backup history tables in msdb can be queried. Fourth, Extended Events can be used to monitor progress.  We chose to query the DMVs during the migration.
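Here is a minimal sketch of the kind of DMV query we ran, joining the automatic seeding DMV back to the AG and database names; adjust the column list to taste:

-- One row per seeding attempt, with its state and timing.
SELECT ag.name AS availability_group,
       adc.database_name,
       s.current_state,
       s.failure_state_desc,
       s.start_time,
       s.completion_time
FROM sys.dm_hadr_automatic_seeding AS s
INNER JOIN sys.availability_groups AS ag ON s.ag_id = ag.group_id
INNER JOIN sys.availability_databases_cluster AS adc ON s.ag_db_id = adc.group_database_id;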

What’s In a Name?

Second, ponder how to handle the scenario where connection names for applications change. As it turns out, a lot of meaning and importance is attached to names.  The computer and SQL Server instance names we were dealing with were deeply ingrained in the business. The names had been around for years and were used everywhere in connection strings.  If we migrated to new SQL Server instances, how would we handle that?

One option is to create an alias in DNS that points to another location. Several months before the actual migration, we changed the computer names for each of the standalone SQL Servers. The Infrastructure team used the previous computer names as aliases that pointed to the newly named computers. Be aware that this also requires a change inside SQL Server: the instance stores the name of the computer it is installed on, so after a rename you have to update that metadata as well.

Let me explain the concept by way of example. Suppose you have a computer called MyServerABC with a default instance of SQL Server on it.  This means that database connections also use the name MyServerABC to talk to the databases, and that connection name might be used in a lot of applications.  So that tens, or maybe even hundreds, of connection strings don’t need to change when the SQL Server is migrated, here is what you can do. During a maintenance window, change the name of the computer and update SQL Server so it knows the new computer name. Microsoft’s documentation describes how to do that in SQL Server, along with other considerations when doing this sort of thing.

Let’s suppose the new computer name is MyServerXYZ. After the previous actions are taken, a new alias (CNAME) record called “MyServerABC” is created in DNS and pointed at the computer name “MyServerXYZ”.  Now, applications that use MyServerABC as a connection name can continue using it.  When an application makes a call to MyServerABC, that call is simply re-routed to MyServerXYZ and the application is none the wiser.
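Concretely, the SQL Server side of that rename is a metadata update followed by a service restart. A minimal sketch using the example names:

-- Run after the Windows computer rename, then restart the SQL Server service.
EXEC sp_dropserver 'MyServerABC';
GO
EXEC sp_addserver 'MyServerXYZ', local;
GO
-- After the restart, confirm SQL Server reports the new name.
SELECT @@SERVERNAME;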

This exact change process worked for us early on.  Later, once we migrated the first standalone SQL Server to an Availability Group, a person on the Infrastructure team changed the alias so that it pointed to the AG Listener instead of the old server we had just migrated from.  Connections were then instantly routed to the new SQL Server 2017 AG.

Data Migration Assistant

Third, the Data Migration Assistant is software from Microsoft that evaluates a SQL Server instance and its databases for readiness to be migrated to a new platform. The tool can help migrate databases to Azure, but we were executing an on-prem to on-prem migration.  It provides an assessment of scenarios in use in your environment that might block a successful migration, and it offers information on new features to leverage in the new SQL Server.

Using this software, we found several things in our environment that had to be changed prior to the migration.  Without the assessment, we would have migrated only to experience failures of various kinds in the new environment.

Planning Software

Fourth, use planning software of some kind so the migration steps are known and progress can be tracked.  One of the things I chose to do was plan out each step and create an electronic project board for everything that needed to happen the day of the migration.  I had been using Office 365’s Planner at the time, so I created my steps there.  One of the developers converted that to Trello because that is what the dev teams were using, and the devs were at the office with us to help test applications post-migration.  The Trello board was made visible on a couple of large, wall-mounted TVs for everyone to see.  This let everyone know what had been done, what was being worked on, and what was next.

Communication

Fifth, think carefully about how the migration team will communicate during the migration.  (All in the same physical room? Company conference bridge while working remotely? Conference call with a support vendor? Remote work platform like Teams or Slack? Some combination of these things?)

If you have a big room that will hold everyone, that might be a great option. Remember, this happened back in early December 2019, pre-COVID, so having a dozen or so of your favorite co-workers all in a room made sense.

Meeting together allows for easy, fast communication: simply turn to the person in the room to discuss something or ask a question. Collaboration is smooth and convenient in that scenario.  As I mentioned earlier, we leveraged a Trello board displayed on a couple of large TVs so everyone could see what had been done, what was currently being worked on, and what was next.

Be mindful of the human tendency to talk when idle, though.  If you get a dozen people in a room and only a few of them are busy at the moment, there can be a significant amount of chatter for the busy people to block out.

These days, now that the U.S. is experiencing an explosion of work from home due to the pandemic, you would probably set up something like Teams or Slack as your collaboration platform so people can work together from home on migration day.  Platforms that allow for phone calls, screen sharing, and quick collaboration among large groups of people are essential in a large-scale, high-impact IT project. In those scenarios, the people collaborating still need to think carefully about communication so they aren’t talking over each other while addressing the migration work.

Well, that’s it for now.  I hope these points help you consider some things that may need to be included or discussed as part of your next IT project.

Querying An AS400 Using SQL Server 2019 Data Virtualization

For those who have taken an interest in the latest SQL Server release, SQL Server 2019, you may have noticed a new feature called Data Virtualization. Perhaps you’ve seen a presentation about it, or maybe you’ve read about it in Bob Ward’s book entitled “SQL Server 2019 Revealed.”

I myself was curious about this feature and so I set out to see if I could use it to connect to an AS400 iSeries machine. Tom Willwerth, my friend at StraightPathSQL, and I worked on this process together. This post is primarily about walking you through the steps we recently took to get this configured.

Before I dive into that, why was I interested in this feature? What did I hope to gain? Well, first of all, there was definitely the motivation of wondering, “Can we get this to work?” Secondly, and more practically, the promise of SQL Server Data Virtualization is to make other data sources available without using a Linked Server and without the time it takes to develop an ETL process to move the data. On a related note, you can cut out the time it takes for an ETL job to actually move the data somewhere like a data warehouse or flattened tables for reporting. Third, the PolyBase feature has a built-in engine that can provide query performance not available via a Linked Server. Fourth, I wanted to provide a way for developers to query data in the AS400 without having to learn the different syntax required by the AS400 iSeries. Fifth, query writers can also join the external table to local SQL Server data.

Now let’s move forward with the how of this process. I started by installing SQL Server 2019; during setup there are steps for selecting the PolyBase feature. If you’re not familiar with PolyBase, it came out in SQL Server 2016 and is the foundation of data virtualization. During the install process, SQL Server suggested a port range that it would use as part of configuring PolyBase. I wasn’t sure whether SQL Server had verified that the suggested port range was open, so I stopped to download the Sysinternals tool TCPView and ran it on the server to confirm that the suggested port range was free.

After the SQL Server install completed, I ran the code below in SSMS to enable the PolyBase feature.

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'polybase enabled', 1;
RECONFIGURE;
-- Verify the change. Note that a restart of the SQL Server service is
-- required before PolyBase is fully enabled.
EXEC sp_configure 'polybase enabled';

The next thing I did was create a database master key. This has to exist in order to encrypt the credential that will be stored in SQL Server later on in the process.

CREATE MASTER KEY WITH ENCRYPTION BY PASSWORD = 'S0meStr0ngP@55w0rd1234xYz!'

I then set my test database to compatibility level 150 for SQL Server 2019 and made my database scoped credential, like so.

ALTER DATABASE MyTestSQLDatabase SET COMPATIBILITY_LEVEL = 150

CREATE DATABASE SCOPED CREDENTIAL MySQLCred WITH IDENTITY = 'MYAS400CRED', SECRET = 'S0m3Str0ngP@55w0rd';

The value for IDENTITY corresponds to a user in the source database, in this case, the AS400. SECRET, of course, is the password for the user name stored in IDENTITY.

I believe it was after this that Tom joined the process, and we began together by installing version 1.8 of the 64-bit iSeries Client Access driver for Windows onto the SQL Server instance I was testing from. Going into the details of that is a bit beyond the scope of this post, so I won’t describe the process here.

Creating an external data source was the next part of the journey. Syntax for that is here.

After a lot of wrangling, searching, and looking here for an AS400 connection string example, Tom and I arrived at this code sample that we ran successfully. Obviously I’ve changed some details to anonymize this, but you get the idea.

CREATE EXTERNAL DATA SOURCE MYAS400DEV WITH (
    LOCATION = 'odbc://MYAS400DEV.LeeMarkum.com',
    CONNECTION_OPTIONS = 'Driver={iSeries Access ODBC Driver}; System=MYAS400DEV; ServerNode=MYAS400DEV.LeeMarkum.com',
    PUSHDOWN = ON,
    CREDENTIAL = MySQLCred);

Next up we took the optional step of creating a schema called EXT for holding our external tables. Afterward, we moved on to the business of actually making our first external table. This was very much a multi-step, trial and error process.

Here is the process we eventually found that worked:

  • Get the SQL definition from the AS400 via System i Navigator, minus trigger definitions, and paste it into SSMS
  • Remove any index or key definitions. Without this change SQL Server throws an error
  • Remove any defined defaults. Without this change SQL Server throws an error
  • Change the CHAR data types to NCHAR. Without this change SQL Server throws an error
  • Replace the DB2 schema name with EXT, our previously created SQL Server schema
  • Change CREATE TABLE to CREATE EXTERNAL TABLE
  • Add this at the bottom: WITH (LOCATION='MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE', DATA_SOURCE = MYAS400DEV)

This left us with code like the below:

CREATE EXTERNAL TABLE EXT.Customer
(CUST# NUMERIC(7,0) NOT NULL,
CUSTNAME NCHAR(30) NOT NULL,
CUSTADDRS NCHAR(50) NOT NULL,
CUSTCITY NCHAR(30) NOT NULL,
CUSTSTATE NCHAR(2) NOT NULL,
CUSTZIP NCHAR(2) NOT NULL
)
WITH (LOCATION = 'MYAS400DEV.SomeSchemaName.DB2TABLENAMEHERE', DATA_SOURCE = MYAS400DEV);

As part of this process, we had to keep the same column names as what existed in the AS400. Otherwise, an error was thrown, which makes sense given that we’re pointing at the AS400 table in the WITH (LOCATION = ..., DATA_SOURCE = ...) part of the syntax.

After this step there is the optional step of creating statistics on columns of the external table. You might want to do this, for instance, on columns involved in joins as a performance enhancement. For the purposes of this work, I skipped this step for the time being because Tom and I just wanted to see some query results from all this work! We could return to this item later to try to tweak performance.
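Had we taken that step, it would have looked something like this sketch; the column choice is purely illustrative:

-- Statistics on a column likely to show up in joins or filters.
CREATE STATISTICS stat_custname ON EXT.Customer (CUSTNAME) WITH FULLSCAN;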

The moment of truth had finally arrived. We loaded up a query against our EXT.Customer table in SQL Server 2019 and, to our delight, we received the expected result set back using standard T-SQL.
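To give a flavor of that moment, queries like the sketch below ran as ordinary T-SQL; the dbo.Orders table in the second query is hypothetical, standing in for any local table you might join to:

-- Filter the external (AS400) table like any local table.
SELECT TOP (10) c.CUSTNAME, c.CUSTCITY
FROM EXT.Customer AS c
WHERE c.CUSTSTATE = 'MO';

-- Or join it to local SQL Server data (dbo.Orders is hypothetical).
SELECT c.CUSTNAME, COUNT(*) AS order_count
FROM EXT.Customer AS c
INNER JOIN dbo.Orders AS o ON o.CustNumber = c.[CUST#]
GROUP BY c.CUSTNAME;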

I’m not gonna lie, this was a fair amount of work and took some time to set up. However, I’m hoping this blog post, and others like it, will save you some pain and time.

SQL Server Install Guide Part 3

I’m in part 3 of a series on installing SQL Server.  In the previous post I began discussing steps to take after the install is complete.  Because there are quite a number of things to consider post-install, this part 3 will continue discussing the post-install configuration items.

One of the things I always do after installing SQL Server is set up SQL Server Agent Alerts.  This is a free, easy way to find out about problems on the SQL instance, like read-retry errors or other potentially nasty issues.  There are probably a number of places to find scripts useful for this task.  The one I recommend is from Glenn Berry at SQLskills.com.  You can find his script at this link.  One of my favorite things about Glenn’s script is that it uses well-defined alert names that let the DBA easily know which server is involved when an alert is sent.
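As a minimal sketch of what one such alert looks like (the alert name and the 'DBA Team' operator are made up; Glenn’s script creates a fuller set with similar calls):

-- Alert on error 825 (read-retry required), one of the nasty I/O errors.
EXEC msdb.dbo.sp_add_alert
    @name = N'MyServer - Error 825: Read-Retry Required',
    @message_id = 825,
    @severity = 0,
    @include_event_description_in = 1;

-- Notify an existing operator by email when the alert fires.
EXEC msdb.dbo.sp_add_notification
    @alert_name = N'MyServer - Error 825: Read-Retry Required',
    @operator_name = N'DBA Team',
    @notification_method = 1;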

Next up is setting MAXDOP.  The foundational advice from Microsoft concerning max degree of parallelism is in KB2806535, found here.  In summary: if you have a single NUMA node and fewer than 8 logical processors, keep MAXDOP at or below the number of logical processors.  If you have multiple NUMA nodes and fewer than 8 logical processors per NUMA node, keep MAXDOP at or below the number of logical processors per NUMA node.  If there are more than 8 logical processors per NUMA node, keep MAXDOP at 8.
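Applying that guidance is a single sp_configure change. For example, on a box with more than 8 logical processors per NUMA node:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Per KB2806535: cap MAXDOP at 8 when each NUMA node has more than 8 logical processors.
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;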

So, how do you tell how many NUMA nodes there are on the machine?  I’m glad you asked!  There are several ways to determine this, and I will refer the reader to a post by Denny Cherry on the various methods.
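One quick method is to count the online nodes the engine reports; this sketch filters out the node reserved for the dedicated admin connection:

SELECT COUNT(*) AS numa_node_count
FROM sys.dm_os_nodes
WHERE node_state_desc = N'ONLINE'; -- excludes the ONLINE DAC node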

MIN/MAX server memory should be reviewed as part of your post-install configuration.  The min server memory setting is the minimum memory the server will allocate to SQL Server.  The trick to understanding it is that SQL Server does not automatically grab that minimum amount of memory on startup.  The memory used by SQL Server increases gradually after startup, assuming a steady workload on the instance.  Once usage crosses the min server memory value, SQL Server does not give that memory back to the OS.

Setting max server memory prevents SQL Server from taking so much memory that it starves the OS.  I typically leave 4-6 GB of RAM free for the OS and have not had problems with memory pressure using that guideline.

I will say, though, that the machines I work with are dedicated to SQL Server and almost always have just the database engine installed.  If you have other SQL Server components on the same box, or application components from one or more applications, then 4-6 GB left to non-database-engine components is probably not enough.  Definitely monitor the Memory\Available MBytes Perfmon counter over time so you can see how much free memory the machine has, and adjust max server memory as needed.
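Both settings are plain sp_configure values. This sketch assumes a dedicated 64 GB server where roughly 6 GB is left to the OS:

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
-- Values are in MB: 58 GB for SQL Server on an assumed 64 GB host.
EXEC sp_configure 'max server memory (MB)', 59392;
RECONFIGURE;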

To keep this series to a reasonable length, I’m simply going to list the next few items I have on my own personal Install Guide that I use.  The series wasn’t meant to be an exhaustive explanation of everything one would put on an install guide anyway.

The guide I use is in Excel format and I go through each item on it as I work through the process.  I mark off what I’ve done and then I save the guide.  This helps me to ensure that I’ve covered all the steps and it gives me and other people a record of what I did during the process.  Some of these steps can be entire blog posts on their own, or even an entire series written on that one step, so summarizing them here and expanding on them in later posts will be my approach.  Ok, so here’s the remaining list.

  1. Ensure the Windows power plan is set to High Performance
  2. Configure tempdb appropriately. (Google this and do research.  Here are a couple of starting points: see this and this.)
  3. Set up a DBA management database. (This is a database on each instance that holds key procedures and objects, like sp_WhoIsActive, sp_Blitz, and any other custom scripts you like to use for troubleshooting or diagnostics.)
  4. Enable Query Store (SQL Server 2016 and higher; see the sketch after this list)
  5. Update the compatibility level of databases to the newest level. (This assumes the new cardinality estimator was tested in a dev and test environment prior to the migration or new install.)
  6. Make sure database backups are occurring as expected.
  7. Consider enabling automatic plan correction (SQL Server 2017; see the sketch after this list)
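For items 4 and 7, here is a minimal sketch, assuming a database named MyDatabase:

-- Item 4: enable Query Store (SQL Server 2016 and higher).
ALTER DATABASE MyDatabase SET QUERY_STORE = ON;

-- Item 7: enable automatic plan correction (SQL Server 2017 and higher; requires Query Store).
ALTER DATABASE MyDatabase SET AUTOMATIC_TUNING (FORCE_LAST_GOOD_PLAN = ON);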


SQL Server Install Guide Part 2

In part 1 of this series, I discussed pre-installation steps.  There are many guides available online for the actual installation process so I’m not going to give a step by step installation guide here.  Rather, I’m going to focus on what to think about after the install is complete.

Recent versions of SQL Server will download additional content for you before the install actually happens, so you’re not left on the base RTM release of the product.  However, one thing to consider post-install is still: are you on the latest patch?  How do you know?  Where can you go to compare the version number of the SQL Server you installed against what is currently available?  You can look back at this post to answer that question.

Once you’ve determined whether additional patching is needed and completed it, then what?  Well, there is still a lot of post-install configuration to do.  For example, do you need to migrate linked servers, SQL Agent jobs, or SQL operators?  What about setting up database mail or moving logins?  Each of these can take up a lot of time.  What if there were a quick way to handle those things?  Enter DBATools.

DBATools is a fantastic collection of hundreds of PowerShell commands that have been developed and tested by users worldwide.  With a simple command, linked servers can be migrated with their password information intact; the same goes for SQL Server logins with their passwords and permissions.  Database mail? No problem!  In fact, if you call Start-DbaMigration and pass a source and target SQL instance, you can migrate the entire instance in a single command.  DBATools is also great for ongoing maintenance tasks and a bunch of other things, so be sure to check out all that it can do for you as a DBA.

The next item on my install guide is trace flags.  Trace flags control the behavior of SQL Server in a variety of ways.  As one example, trace flag 3226 controls whether successful backups are logged in the SQL Server error log.  Setting this flag prevents those “backup successful” messages from cluttering up your error log.
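For example, to turn trace flag 3226 on globally until the next restart (adding -T3226 as a startup parameter makes it stick across restarts):

DBCC TRACEON (3226, -1); -- the -1 applies the flag globally rather than to the current session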

The important thing to remember if you’re doing a migration is that you most likely want the new SQL Server to behave the same way as the old one.  To find out which trace flags are globally enabled on a SQL Server, run the following code.

DBCC TRACESTATUS(-1)

If you want a list of trace flags and what they do, those lists do exist but they are sometimes hard to find.  Here is one such list that tells you whether you should be concerned about a particular trace flag being enabled or not.

If you’re starting from scratch with the database and application you’re installing, be sure to find out if the vendor has any recommendations and why they recommend those particular trace flags.  Also, do your own research about trace flag functionality and see why you personally may or may not want to use a particular flag.


SQL Server Install Guide Part 1

In part 1 of this series on installing SQL Server, I’m going to discuss things to do prior to running the installation program.  There are a number of install guides for SQL Server on the internet and plenty of recommendations around installation best practices and how-tos.  This is what I’ve arrived at over the course of time.

This first step assumes the install is for one or more databases that you’re planning to migrate from an existing instance.  My first step is to run the Data Migration Assistant from Microsoft.  This product is free and can be installed away from your SQL Servers; I run my install from my laptop.  After opening the UI and starting an assessment, the tool scans the selected source databases and determines whether there are any breaking changes, behavior changes, or deprecated features between the source and target for your migration.

The Data Migration Assistant will also do two additional things regarding migrations.  First, it will make suggestions on new features that may be advisable to use in the new instance.  Second, the software can actually perform the database migration for you.  That includes migration to a recent SQL Server version on premises and migration to Azure.

The next item on my list is relatively easy to handle, but it’s still worth at least some forethought: the database instance needs a name.  For a default instance, the computer name will be the instance name, which means that in some cases the name is decided for you by a system administrator.  If you have input into the name, try to use a logical naming convention that will have some longevity.  In some environments you might have a convention like “Application” + “Purpose” + “Location” + “some number”, such that a machine name might be CitrixWebSTL01 for a Citrix web server in Saint Louis.  I recommend something similar so that when the machine name is viewed, the person knows exactly what it is for.  Going generic, like SQL01, SQL02, SQL03, etc., just doesn’t say much about what the SQL instance is for.

The third item on my install guide is to identify or create accounts for the SQL Server modules you plan to install.  For instance, if you’re installing the database engine, utilizing SQL Agent and maybe installing SSIS on the box, then you really should use three different accounts with different passwords, one for each service that will be installed.  Doing this will help keep the machine and the environment more secure.  If you use one account for every service and that one account is used on every SQL install, you’ve now made your entire environment vulnerable if that username and password is ever compromised.

Fourth, identify and agree upon drive letters for files.  If the machine is physical and you can have some influence over what is used, be sure to separate mdf files from ldf files and to put TempDB on its own drive.  Doing these things will help overall performance, as each of these file types has its own usage pattern; mixing them on the same drive is going to hurt performance.

If the environment is virtual and a SAN is involved, I would still encourage you to have a scheme for where the files go, because it makes administration easier. Having a consistent design across all servers also helps when it comes time to restore databases from one place to another: no more pesky WITH MOVE clauses in your RESTORE DATABASE scripts if drives and files are laid out consistently across the environment.
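As a small illustration of the payoff, with an identical layout everywhere a restore stays this simple (the database name and path are made up):

-- No WITH MOVE needed when source and target share the same drive layout.
RESTORE DATABASE MyTestSQLDatabase
    FROM DISK = N'X:\Backups\MyTestSQLDatabase.bak'
    WITH RECOVERY;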

Enabling instant file initialization is the fifth item on my SQL Server install guide.  Instant file initialization is a performance-enhancing option that really helps with data file growth times and restore times.  It allows SQL Server to claim space on a drive without zeroing out the data that might be on that part of the drive.  This means the old data is still retrievable by a skilled hacker until SQL Server writes over it; however, there are many layers of security to bypass before that would be possible, and there are substantial benefits to instant file initialization. Check out a post by Kimberly Tripp on the subject.  Instant file initialization was a multi-step process prior to SQL Server 2016, but now we have a handy check box in the installer.
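Newer builds (SQL Server 2012 SP4, 2016 SP1, and later) also expose whether the setting took effect:

-- 'Y' means the engine's service account can use instant file initialization.
SELECT servicename, instant_file_initialization_enabled
FROM sys.dm_server_services
WHERE servicename LIKE N'SQL Server (%';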

Correlating IO Wait with Queries

There are a lot of great tools out there to help DBAs monitor SQL Server performance.  Windows Perfmon counters and the Windows Relog command, for example, can be used to build a database of performance metrics.  Write a few queries against that database and a DBA can learn a lot about what’s going on inside the SQL Server.  Profiler and Extended Events also provide windows into performance.

I recently ran across a question on SQLServerCentral.com about performance that intrigued me.  The poster wanted to know if there was a way to correlate pending I/O requests with the queries that were experiencing the waits using the DMV sys.dm_io_pending_io_requests. This was a well-timed question that got my attention because I was working on my own scenario with a SQL Server and I/O issues.  I had set up a custom data collector set in Perfmon and analyzed the data.   The data showed very high Avg. Disk Queue Length, low Page Life Expectancy, and high Page Reads/sec for sustained periods of time.  The question gave me an opportunity to do some additional investigation, and I arrived at the query below.

SELECT ipir.io_type,
       ipir.io_pending,
       ipir.scheduler_address,
       ipir.io_handle,
       os.scheduler_id,
       os.cpu_id,
       os.pending_disk_io_count,
       ER.session_id,
       ER.command,
       ER.cpu_time,
       ST.text
FROM sys.dm_io_pending_io_requests AS ipir
INNER JOIN sys.dm_os_schedulers AS os ON ipir.scheduler_address = os.scheduler_address
INNER JOIN sys.dm_exec_requests AS ER ON os.scheduler_id = ER.scheduler_id
CROSS APPLY sys.dm_exec_sql_text(ER.sql_handle) AS ST;

I began by looking at sys.dm_io_pending_io_requests in BOL.  I noticed that this DMV has scheduler_address as one of its outputs.  I quickly realized I could join to sys.dm_os_schedulers on scheduler_address and then make an additional join to get to sys.dm_exec_requests.  From there I was only a CROSS APPLY away from getting the SQL statement associated with the I/O waits.

Later I realized that sys.dm_os_schedulers has its own column that tracks pending I/O, called pending_disk_io_count.  As a result, you could eliminate sys.dm_io_pending_io_requests from the above query if you were truly interested only in the pending I/O count, as in the sketch below.  Using both DMVs, however, provides a more detailed picture.
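Here is a sketch of that slimmer version:

-- Schedulers with outstanding disk I/O, tied back to the requests running on them.
SELECT os.scheduler_id,
       os.cpu_id,
       os.pending_disk_io_count,
       er.session_id,
       st.text
FROM sys.dm_os_schedulers AS os
INNER JOIN sys.dm_exec_requests AS er ON os.scheduler_id = er.scheduler_id
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) AS st
WHERE os.pending_disk_io_count > 0;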

Also keep in mind that, like many things in life, observed issues may only be symptoms.  In my case the high I/O wasn’t due to poorly performing storage, but to queries performing millions of physical reads.  I examined the query plan and could see that an index was missing.  I applied the index and the physical reads dropped to almost zero.  I then used the above query to bolster my case to the application vendor that the index was needed, because extremely high physical I/O was killing the server.

Try out these DMVs and see what else you can get from them.

Happy querying!

Consolidate Multiple Email Operators

Suppose you’ve been working on cleaning up SQL Agent jobs and their email operators and find that you have more than one operator with the same email address. How would you tidy this up to ease administration?

Let’s do a little setup to demonstrate the scenario.  The T-SQL below creates three operators: two with different names but the same email address, plus a third that uses a distribution group address.  I then updated two of my existing test jobs to use those email operators when the job fails, and I am also logging failures to the Windows Event Log.

USE [msdb];
GO

EXEC msdb.dbo.sp_add_operator @name=N'DBA', 
 @enabled=1, 
 @pager_days=0, 
 @email_address=N'SuperHumanPerson@MyDomain.com'
GO

EXEC msdb.dbo.sp_add_operator @name=N'Super Human Person', 
 @enabled=1, 
 @pager_days=0, 
 @email_address=N'SuperHumanPerson@MyDomain.com'
GO

EXEC msdb.dbo.sp_add_operator @name=N'SQL Server Team', 
 @enabled=1, 
 @pager_days=0, 
 @email_address=N'TheDBADistroGroup@MyDomain.com'

Now I will run this simple query to find out what my email operators and email addresses are.

USE msdb;
GO

SELECT id
, name
, email_address
FROM sysoperators

Here is the result set: three operators, two of which share the same email address.

How does this sort of thing happen, you ask?  Well, it doesn’t just happen in contrived blog posts!  Suppose there is a fair amount of turnover in a DBA environment and the DBAs are not always checking what the last person did.  Each one goes in and does her own thing.  Voila!  You end up with duplicate email addresses for email operators.

Some jobs have “DBA” as the email operator.  Some have “Super Human Person” as the email operator.  You might even have a couple of “test” operators in there with “SuperHumanPerson@MyDomain.com”.

Maybe the DBA didn’t consider that he might not always be the DBA, and that it might be better to set up an email distribution group controlled by the Exchange person.  That way, as DBAs came and went, the same email operators could be used in SQL Server. Who knows! But I have seen this.  So what now?

sp_delete_operator to the rescue!

Here is our example.

EXEC sp_delete_operator @name = 'DBA', @reassign_to_operator = 'SQL Server Team';
GO
EXEC sp_delete_operator @name = 'Super Human Person', @reassign_to_operator = 'SQL Server Team';

This deletes the email operator named in the @name parameter and reassigns all jobs that used that operator to the operator named in the @reassign_to_operator parameter.  Now when you run the earlier query against the sysoperators table in msdb, you get back just the operator “SQL Server Team.”


Finding Failed Jobs

How would you go about finding jobs that have failed, and why would you want to look for them?  You might want to do this to ensure that you are being notified appropriately of job failures. For instance, after reviewing failed jobs, you may realize that there are job failures no one knew about. Maybe this is occurring because the jobs are not set to send email. If that’s the case, see my post here so you can fix that.

But how do you find these job failures in the first place?

1. Manually look at the job activity monitor for the SQL instance.

The job activity monitor is found under the SQL Server Agent node in the Object Explorer of SSMS.


Just double-click it to open and you will see something like this. In this case I set up a job with an incomplete SELECT statement so that it would create a failure.


2. Query the msdb database.

USE msdb;
GO

SELECT J.name AS JobName
, J.description AS JobDescription
, H.step_id
, H.step_name
, msdb.dbo.agent_datetime(H.run_date, H.run_time) AS RunDateTime
, H.sql_severity
, H.message
FROM dbo.sysjobs AS J
INNER JOIN dbo.sysjobhistory AS H ON J.job_id = H.job_id
INNER JOIN dbo.sysjobsteps AS JS ON J.job_id = JS.job_id AND H.step_id = JS.step_id
WHERE H.run_status = 0; --Failed job steps


Fixing Jobs With No Email Notification

In the last post we found jobs that had no email notification. Once you find them, how do you fix this problem?

I’m going to assume that database mail is already enabled and configured on the SQL Server instance.
1. Create an operator, assuming one does not exist.

This can be done either through T-SQL or through the GUI.  In the GUI, right-click the Operators folder under SQL Server Agent and choose “New Operator.”  Then fill in the details of the operator’s “Name” and “Email name.”  See below for an example.


Then you click OK in the bottom right of the GUI to finish the process.


Alternatively you can use T-SQL to create the Operator:

USE [msdb]
GO
EXEC msdb.dbo.sp_add_operator @name=N'DBA Team', 
    @enabled=1, 
    @pager_days=0, 
    @email_address=N'MyEmail@MyDomain.com'
GO

2. Assign that operator to multiple jobs via T-SQL.

Once there is an operator to assign, the next step is to add the new operator to the applicable SQL Agent jobs.  The script below generates T-SQL that can be copied and run in a new SSMS window, allowing you to review the statements before running them.

Keep in mind that in this example I am only updating jobs that don’t already have an operator assigned to them.  This makes the WHERE clause very important.  If you leave it off and then copy and run the generated statements, all jobs will be updated to whatever was specified in the variable.  This may not be what you want for your environment, so please be sure to use the WHERE clause and filter on notify_level_email = 0 if you only want to update jobs that have no operator assigned.


USE msdb;
GO

DECLARE @operator varchar(50);
SET @operator = 'DBA Team';

-- DISTINCT avoids emitting duplicate statements for jobs with multiple enabled schedules.
SELECT DISTINCT 'EXEC msdb.dbo.sp_update_job @job_id = ''' + CONVERT(varchar(50), J.job_id) + ''', @notify_level_eventlog = 2, @notify_level_email = 2, @notify_email_operator_name = ''' + @operator + ''''
FROM dbo.sysjobs AS J
 INNER JOIN dbo.sysjobschedules AS JS ON J.job_id = JS.job_id
 INNER JOIN dbo.sysschedules AS SS ON SS.schedule_id = JS.schedule_id

WHERE J.enabled = 1 --Job is enabled
 AND J.notify_level_email = 0 --never notify
 AND SS.enabled = 1; --it has a schedule that is enabled

Notice that I am using sp_update_job not only to add the email operator, but also to set the job to log to the Windows Event Log on failure.  Why?  Well, what happens if database mail is not configured properly?  Maybe it worked when the job was created, or the last time you touched database mail, but what if someone else made changes to it?  In that case, you might not get the expected email when the job fails.

If you have someone, like a junior DBA, who looks through the Windows Event Logs for issues, they might find it.  Or maybe you have one of those slick third-party monitoring tools that scans for errors and notifies you when it finds one.  Then you have another way to be notified that the job failed.

And here is your BOL link to all the goodies involved with sp_update_job.

Finding Jobs With No Email Notifications

When you take over a new SQL Server instance, one of the things you might consider doing early on is finding SQL Agent jobs that have no email notification set up.  I am of the opinion that if a job is worth creating, then it’s worth telling someone when it doesn’t do what it is supposed to do.

A quick, easy way to determine jobs that have no email operator is to connect to your SQL Server instance and run the code below.


USE msdb;
GO

SELECT J.name,
 J.notify_level_email,
 J.notify_email_operator_id

FROM sysjobs AS J
 INNER JOIN sysjobschedules AS JS ON J.job_id = JS.job_id
 INNER JOIN sysschedules AS SS ON SS.schedule_id = JS.schedule_id
 
WHERE J.enabled = 1
 AND J.notify_level_email = 0 --never notify
 AND SS.enabled = 1; --it has a schedule that is enabled

For jobs that are not going to email anyone if they fail, the last two columns will return a zero.  These are the jobs you will want to focus on so that when they fail, someone is notified and can fix the problem.

Here is a link on all the goods available to you from dbo.sysjobs:

https://msdn.microsoft.com/en-us/library/ms189817.aspx