Problem I have an internally-developed metadata repository for the SQL Server instances I administer. One of the metrics I track is based upon Job History success and failure. This information comes directly from the msdb..sysjobhistory table that resides upon each SQL Server instance. The issue is that there are idiosyncrasies with this table when it comes to storing dates and times of job executions. I need to be able to use this information in reports and queries, but these columns are stored in a non-standard date format, which considering the fact that we're discussing system tables is baffling to me.
Solution Before we begin let's take a look at what we're dealing with in regards to format of the run_date and run_time fields. Though this tip pertains simply to converting the run_date to a useful format, we will be dealing with converting the run_time date in a subsequent tip.
SELECT SJ.[name], SJH.[run_date], SJH.[run_time] FROM msdb.dbo.[sysjobhistory] SJH INNER JOIN [msdb].dbo.[sysjobs] SJ ON SJH.[job_id] = SJ.[job_id] WHERE SJH.[step_id] = 0 ORDER BY SJ.[name]
The above query yields the following results:
As you can see, run_date is stored in the format of YYYYMMDD. It is stored as an integer format, not as a string as is the root format underlying the datetime data type within Microsoft SQL Server. I suspect it will only be a matter of time before the msdb database receives the same treatment the master database did in regards to the logical migration of system tables into system catalog views, which back the Dynamic Management Views. Ultimately, Dynamic Management Views addressing the job and backup structures along with their associated history and configurations stored in the msdb database make a perfect target for eventual expansion of the DMVs; however I digress. Back to the matter at hand.
What is to happen if we want to use this information in any of the many system functions within SQL Server that rely on a date parameter. I am referring to such functions as DATEADD(), DATEDIFF(), DATENAME(), or DATEPART() for instance. Why don't we pick one and give it a shot:
SELECT SJ.[name], MAX(SJH.[run_date]) AS [last_run_date], MAX(DATEDIFF(dd, SJH.[run_date], GETDATE())) AS [Days Since Last Run] FROM msdb.dbo.[sysjobhistory] SJH INNER JOIN [msdb].dbo.[sysjobs] SJ ON SJH.[job_id] = SJ.[job_id] WHERE SJH.[step_id] = 0 GROUP BY SJ.[name] ORDER BY SJ.[name]
The above query yields the following error::
Msg 8115, LEVEL 16, State 2, Line 1 Arithmetic overflow error converting expression TO data type datetime.
You encounter an error of course, because you're attempting to pass an integer data typed parameter where a datetime parameter is expected; a classic square peg, round hole situation. How do we address this? A multiple-step CAST and CONVERT should do the trick.
SELECT SJ.[name], MAX(SJH.[run_date]) AS [last_run_date], MAX(DATEDIFF(dd, CONVERT(datetime, CAST(SJH.[run_date] AS CHAR(8)), 101), GETDATE())) AS [Days Since Last Run] FROM msdb.dbo.[sysjobhistory] SJH INNER JOIN [msdb].dbo.[sysjobs] SJ ON SJH.[job_id] = SJ.[job_id] WHERE SJH.[step_id] = 0 GROUP BY SJ.[name] ORDER BY SJ.[name]
Now that we see how to get this to work; lets take a moment and briefly break down the components of this conversion:
CAST(SJH.[run_date] AS CHAR(8)) - This gets around the issue we would continue to encounter if we attempted to pass the raw integer value for the [run_date] field into the CONVERT() function.
CONVERT(datetime, <string parameter>, 101) - The CONVERT() function can be used to modify how the standard datetime format is presented to end users in a query or report. The function expects an ending format; in this case we wish to use datetime; an initial value; and a format identifier. The format in our example converts input values to the format of MM/DD/YYYY. Please refer to Microsoft SQL Server Books Online for all available output formats.
This process can be made even easier by wrapping this code into a user-defined function. By doing so, you'll be able to (a) minimize the amount of code necessary to accomplish the task at hand and (b) create a reusable process:
CREATE FUNCTION udf_convert_int_date (@date_in INT) RETURNS datetime AS BEGIN DECLARE @date_out datetime SET @date_out = CONVERT(datetime, CAST(@date_in AS CHAR(8)), 101)
RETURN @date_out END
You would then proceed to use this function as you would any other system (built-in) SQL Server function. Below are two examples of such calls. The first illustrates a simple call of this newly created function. It returns the distinct results of the raw run_date field from msdb.dbo.sysjobhistory and its corresponding converted value:
SELECT DISTINCT SJH.[run_date], master.dbo.udf_convert_int_date(SJH.[run_date]) AS [converted_run_date] FROM msdb.dbo.[sysjobhistory] SJH
The second example is used in place of the code provided earlier in this tip. It is used to calculate the amount of days since a SQL Server Agent Job was last run. It utilizes the function within another function to accomplish the task at hand:
SELECT SJ.[name], MAX(SJH.[run_date]) AS [last_run_date], MAX(DATEDIFF(dd, master.dbo.udf_convert_int_date(SJH.[run_date]), GETDATE())) AS [Days Since Last Run] FROM msdb.dbo.[sysjobhistory] SJH INNER JOIN [msdb].dbo.[sysjobs] SJ ON SJH.[job_id] = SJ.[job_id] WHERE SJH.[step_id] = 0 GROUP BY SJ.[name] ORDER BY SJ.[name]
The next tip in this series will discuss conversion of the run_date column format.
I did however notice that when I ran the T-SQL to list the jobs, their last run date, and count of days, the count of days was incorrect. Below is a revised version, using CTE to eliminate the MAX() around your calculation of days. This calculates correctly for me.
JobHistory ( hName, hLastRunDate ) AS ( SELECT SJ.[name]
, MAX(SJH.[run_date]) AS [last_run_date] FROM msdb.dbo.[sysjobhistory] SJH
INNER JOIN [msdb].dbo.[sysjobs] SJ ON SJH.[job_id] = SJ.[job_id]
WHERE SJH.[step_id] = 0GROUP BY SJ.[name]
)SELECT hName , hLastRunDate
, DATEDIFF(dd, master.dbo.udf_convert_int_date([hLastRunDate]), GETDATE()) AS [Days Since Last Run]FROM JobHistory ORDER BY [hName]
I'm sorry, but I don't know why this article was published.
1) this function, while it "works", does nothing but add the overhead of wrapping native functions in a UDF. I'm sure it merely adds cost to the queries that use it.
2) while the nature of abstract data types (e.g. datetime) is that the interal workings are none of anyones' business, I don't think datetimes are stored as strings. I think this conception is some sort of COBOL hangover -- again, not that it is any of my business how Microsoft does things.
3) datetime has NO format. All formatting is done at the presentation layer.
4) the CONVERT function when used as it is here to take a string representation and make a datetime DOES NOT impart any format; the format code of 101 represents the INPUT format of the second argument.
5) what do you mean non-standard date format? YYYYMMDD string representation of a date is ISO standard and will be interpreted implicitly by Microsoft magic!
Though I think magic is bad form:
select datediff(day, CAST(20090401 AS CHAR(8)), CAST(20090402 AS CHAR(8))) from_int
, datediff(day, '20090401', '20090402') straight_magic
Functions & procedures are two different things. Functions should return one thing. Table functions should return one table. Procedures are a mechanism where one can perform all sorts of tasks AND return values. It's mostly just a matter of good form, but for good form you shouldn't think of functions and procedures as interchangible. When doing maintenance, one should feel comfortable upon seeing a function call that nothing happened behind the scenes orther than "calculations" to map the arguments to a return value.
He SHOULD have just done convert(cast()) or convert(convert()) -- the function gained nothing, added overhead and made maintenance more difficult.
The article was witten to outline a situation where users expect a datetime value provided from a system-supplied source to behave in a manner that is consistent with datetime datatype values, and then to realize that is not possible without conversion/casting. While your solution is completely appropriate (and it is) I would prefer to simply the process for the end user that may not understand the various formatting opportunities available and code numbers associated with the CONVERT() function. The solution outlined will require the end users/developers/DBAs to only deal with supplying a single parameter value for the integer datetyped value; no need to remember which code parameter for the CONVERT() function corresponds to what they need. By wrapping it in a UDF it makes it reusable.
My mantra is to make it simple for the end user, even if that means more overhead for me; it's an inverse relationship. I also find it important that DBAs understand the inner workings of the RDBMS they support. It makes them a stronger DBA.
Again, thank you for your interest and we appreciate your time and comments.
Fantastic! I saw this when you posted it oinliralgy but didn't have the time to check then. Glad you did a recap of it as I now had the time. Records going back years is right! I had some going back to January of 2001! I had over 200k records in one of my backupset tables. It's a good thing we only use simple and daily full backups at this point. I set a limit on how far back we want to save, created a job to run the sp_delete_backuphistory. I also added a step to reindex the tables it touched as well. You might not need the backupfilegroup table, as it doesn't exist on all my servers I'm nto sure why. Probably only if you have ever performed a filegroup backup I'm not sure. These could always be werapped in an exists statement too.dbcc dbreindex ( backupfilegroup', ) ;dbcc dbreindex ( backupfile', ) ;dbcc dbreindex ( backupmediafamily', );dbcc dbreindex ( backupmediaset', );dbcc dbreindex ( backupset', );dbcc dbreindex ( restorefile', );dbcc dbreindex ( restorefilegroup', );dbcc dbreindex ( restorehistory', );Thanks Brent!