Editorials

Finding SQL Agent Job Failures

Finding SQL Agent Job Failures

SQL Agent jobs have the ability to notify an operator should a failure occur when a job executes. The problem is that many times a job may have multiple, autonomous steps, requiring a failing step to continue execution with the next step. When steps within a job are configured in this fashion, the job does not consider itself to have failed, and does not send a notification to specified operators.

I did find that the table msdb.dbo.sysjobhistory contains the results of job step execution. So, I put together a query allowing me to identify the failed rows. I identify a failed row by the sql_severity column in the sysjobhistory table. A row with sql_severity > 0 has some sort of error associated with it. You could also filter on the sql_message_id column having a value > 0.

Here is a query providing information about failed jobs. You can modify it to filter based upon a specific point in time, drop the contents into a table, send Email alerts with new records found, build a report, or many other useful possibilities.

I configured a scheduled job to run looking for any new failures and send an email to the system operators with any failing jobs. With this in place, I don’t have to generate any additional error trapping when creating SQL Agent jobs. They are all covered for error notification by default, having this process in place.

Sadly, this is a SQL Server specific implementation. But, since SQL Agent is a SQL Server specific service, I don’t suppose that is too much of an issue.

SELECT
j.name AS jobName
,h.step_id
,h.step_name
,s.command
,s.database_name
,h.sql_message_id as ErrorID
,h.sql_severity as Severity
,ISNULL(msg.description, '') as Description
,h.message
,h.run_date as runDate
,h.run_time as runTime
,h.server
FROM MSDB.dbo.sysjobhistory h WITH (NOLOCK)
JOIN msdb.dbo.sysJobs j WITH (NOLOCK) ON h.job_id = j.job_id
JOIN msdb.dbo.sysjobsteps s WITH (NOLOCK) ON h.step_id = s.step_id
AND h.job_id = s.job_id
LEFT JOIN master.sys.sysmessages msg WITH (NOLOCK) ON h.sql_message_id = msg.error
AND h.sql_severity = msg.severity
AND msg.msglangid = 1033
WHERE sql_severity > 0

The table Master.sys.sysmessages contains multiple entries for most error values, a different one for each language. I selected msglangid = 1033 to restrict the messages to a single row in English.

Perhaps tomorrow we can look at how I convert this output into an Email and send it to the operators from the Database server.

Feel free to make the code better by leaving your comments below, or send suggestions to btaylor@sswug.org.

Cheers,

Ben

$$SWYNK$$

Featured Article(s)
DAX and PowerPivot Essentials: Introduction to PowerPivot, DAX and the Related() and Distinct() Functions, Pt. 1
Join BI Architect and SSAS Maestro Bill Pearson as he introduces his DAX and PowerPivot Essentials series, a new SSWUG exclusive. In this three-part article, we will cover getting started with PowerPivot for Excel, and then begin our first look at the Data Analysis eXpressions (“DAX”) language with hands-on exposure to the Related() and Distinct() functions.

Featured White Paper(s)
Encryption Key Management Simplified
Managing encryption keys is a fundamental step to securing encrypted data, meeting compliance requirements, and preventing da… (read more)

Featured Script
list_userTablePermissions
Q&D select how to look up permissions on all usertables for the current user… (read more)