By: Aaron Bertrand
In this video tutorial we will look at choosing the correct SQL Server date format and how SQL Server could interpret dates incorrectly depending on format and region used to submit dates. We will look at different examples and make a recommendation on how all dates should be formatted when working with SQL Server to avoid any confusion or errors.
Referenced Links in Video
Recommended Reading
- Date and Time Conversions Using SQL Server
- Format SQL Server Dates with FORMAT Function
- Add and Subtract Dates using DATEADD in SQL Server
- SQL Server Date and Time Functions with Examples
T-SQL Script
The below script will allow you to run the tests discussed in the video. You can also download the script.
-- How most date/time formats are unsafe -- Let's check the language of my session: SET LANGUAGE us_english; PRINT @@LANGUAGE; -- Then we can see how [`us_english`] interprets various settings. First, April 12th / 20th: SELECT [April_12_1] = CONVERT(datetime, '04/12/2020'), -- mm/dd/yyyy [April_12_2] = CONVERT(datetime, '2020-04-12'), -- yyyy-mm-dd (ISO 8601) [April_12_3] = CONVERT(datetime, '20200412'), -- yyyymmdd [April_20_1] = CONVERT(datetime, '04/20/2020'), [April_20_2] = CONVERT(datetime, '2020-04-20'), [April_20_3] = CONVERT(datetime, '20200420'); -- And formats that include time: SELECT [April_12_a] = CONVERT(datetime, '2020-04-12 04:00:00'), -- yyyy-mm-dd hh:mm:ss[.nnn] [April_12_b] = CONVERT(datetime, '2020-04-12T04:00:00'), -- yyyy-mm-ddThh:mm:ss[.nnn] [April_12_c] = CONVERT(datetime, '20200412 04:00:00'), -- yyyymmdd hh:mm:ss[.nnn] [April_20_a] = CONVERT(datetime, '2020-04-20 04:00:00'), [April_20_b] = CONVERT(datetime, '2020-04-20T04:00:00'), [April_20_c] = CONVERT(datetime, '20200420 04:00:00'); -- Then what happens if a user has different language settings? SET LANGUAGE français; SELECT [April_12_1] = CONVERT(datetime, '04/12/2020'), -- December 4th! [April_12_2] = CONVERT(datetime, '2020-04-12'), -- December 4th! [April_12_3] = CONVERT(datetime, '20200412'); -- April 12th -- That's just wrong data, and you have no idea. -- And silently wrong data is arguably worse than errors, but neither is good: SET LANGUAGE français; SELECT [April_20_1] = CONVERT(datetime, '04/20/2020'); -- Error! SELECT [April_20_2] = CONVERT(datetime, '2020-04-20'); -- Error! SELECT [April_20_3] = CONVERT(datetime, '20200420'); -- April 20th -- Or NULLs, if you use [`TRY_CONVERT`]: SET LANGUAGE français; SELECT [April_20_1] = TRY_CONVERT(datetime, '04/20/2020'), -- NULL [April_20_2] = TRY_CONVERT(datetime, '2020-04-20'), -- NULL [April_20_3] = TRY_CONVERT(datetime, '20200420'); -- April 20th -- And wrong data again, sometimes, if including time and the dates transpose correctly: SET LANGUAGE français; SELECT [April_12_a] = TRY_CONVERT(datetime, '2020-04-12 04:00:00'), -- December 4th! [April_12_b] = TRY_CONVERT(datetime, '2020-04-12T04:00:00'), -- April 12th [April_12_c] = TRY_CONVERT(datetime, '20200412 04:00:00'); -- April 12th -- Or errors, if the days can't transpose correctly: SET LANGUAGE français; SELECT [April_20_a] = CONVERT(datetime, '2020-04-20 04:00:00'); -- Error! GO SELECT [April_20_b] = CONVERT(datetime, '2020-04-20T04:00:00'), -- April 20th [April_20_c] = CONVERT(datetime, '20200420 04:00:00'); -- April 20th
Video Transcript
Hi, I'm Aaron Bertrand with a series of short videos for mssqltips.com. Today I wanted to talk about regional date time formats. This is a problem that I come across quite a bit with users who will accept, user input from end users that have dates in a literal format, like MM/DD/YYYY, and I want to talk about why that could be a bad idea.
So not everyone writes dates the same, right? We have people in the United States and a couple of other countries that use this MM/DD/YYYY format, that reflects how they say the date. We want to talk about May 1st, it's May the 1st, 2020. In other countries, they say it the other way. They put the smallest unit first, day slash month slash year. And that reflects the way that most people in other countries say the date. They say the 1st of May, 2020. But it also reflects the mathematical sequence of granularity from smallest day to medium to month to large is the year.
I think the United States is wrong on this and there are these memes that say, "Here's how the United States does it." "Here's how the rest of the world does it," This isn't quite true, it's actually more like this. There are a lot of standards, there are a lot of different formats that other countries use and these aren't all the different formats, these are just how the countries rank their formats in preference. So some countries, the United States is the only ones that ranks month slash day slash year as first. Other countries have different sequences of rankings, so that's from a Wikipedia article, and it's interesting, that's all.
There are a lot of different formats so you can't rely on users that are using your application to pass in a format that you expect. And you can't possibly know what they meant when they type a date a certain way. So which formats are unsafe? The ones that are, the ones aren't safe, that can be misinterpreted are almost all of them and most of the ones that you most commonly see.
When I see questions on Stack Overflow, they almost always have, they call a stored procedure and they're passing a date time variable in and they pass it in as a string that says M/D/Y. These could all be vulnerable to misinterpretation. So M/D/Y, it can be vulnerable, D/M/Y can be vulnerable, those two can switch. If I have a user in England and a user in the United States and they type in July 9th of 2020, or I'm sorry, if they type in 7/9/2020, I have no idea, I have no way to know if they meant July 9th or September 7th, no there's no way for me to know that. I could tie it to the regional settings but what if I have an employee from the United Kingdom who's in the United States and they have their laptop here and they're using a website that uses regional settings based on their IP address, not their regional settings from their machine. So there can be all kinds of different ways where you might be able to infer what they probably meant but you can't know for sure.
And then this one, this surprises a lot of people. YYYY-MM-DD, this is ISO 8601. This is the standard, this is how it's defined. However, in SQL Server this can be misinterpreted in more than half of the languages that SQL Server supports, all right?
So let's take a look at a quick demo. So I have a notebook here. I'm using a notebook in Azure Data Studio and I just set up a few queries that control how these date time literal strings can be misinterpreted. So first we'll see what's the language. So let's make sure that we start from the right spot. So we'll set the language to US English and then we'll print the language US English, okay, great. So now, how can, how does US English interpret these strings? So we've got three different formats for April 12th. These are the most common ones that I see. MM/DD/YYYY, then the ISO 8601 format, and then YYYYMMDD, so this is basically ISO 8601 without any separators without any dashes between the dates.
And then we'll also take a look April 20th. So using the same formats, MM/DD/YY, the ISO 8601 and then ISO 8601 without any separators. And if we run those, we can see that they all return the date we expected, right? April 12th, April 12th, April 12th, April 20th, April 20th, April 20th. Now if we look at some formats that include time, not going to be any surprises here, these are all gonna return the same thing. April 12th, April 12th, April 12th, the 20th, the 20th, the 20th.
Now what if the user has a different language? You can see that the, you already got a clue about this at the beginning. So if I set my language to French and then I run the same query again, I end up getting the wrong dates. So, here my settings, and my settings might have already been set to French if I live in France or if I live in Quebec, I'm probably already using French settings. And now this format, which is the American format, is interpreted as D/M/Y based on the language, not based on where I am or my regional settings or anything like that, this is based on the language. So this bit comes out as December 4th and then this one too, YYYY-MM-DD, this comes out as December 4th. That surprises a lot of people and this gives you a clue here that this format is just not safe. You can't use this format and assume that everyone who uses this format in a query is going to get the date that they expected, all right?
And then this one, YYYYMMDD, this one works, no problem, April 12th. So that's just wrong data. So now imagine that you are allowing your users to call this stored procedure, passing this date string and you were storing this data in a table, how do you go back and find this error? You can't, there's no way you can find this error because the data was stored as December 4th, and how do you know that that was stored incorrectly according to what the user typed and what the user expected? The insert into the table had no, there's no saving of where that regional format, how the regional format was interpreted and translated. You have no idea that the date that was passed in was actually some other format that was translated before it was inserted, that the information is all gone.
Silently wrong data is arguably worse than errors but neither is good, obviously. So if I set my language to French again and I try to convert some of these formats, April 20th, where there obviously isn't a 20th month. So if that 20 is ever interpreted as a month, I'm going to get an error. So if we run these and we see, in fact, so April_20_1 is an error, April_20_2 is an error. And then finally, the safe format, 20200420 with no dashes, that came out correctly. And if instead of convert you use TRY_CONVERT, that just means you'll end up with NULL. So it'll try to convert April 20th or I'm sorry 04/20/2020, it will try to convert that to the 20th month, of the 4th day of the 20th month in 2020. And since it can't convert that, you end up with NULL values here.
All right, so you've not only, instead of getting wrong data, now you're getting empty data, which is not any better. And again, this 20200420 format always works. This could also happen if you have the times included in these formats.
So we'll try it with the French language again. You have, you're converting the, trying to convert to date time, 2020-04-12, at some, at some point in time in that day. And then also these two formats the, the date, the ISO 8601 format, with the time but there's a t separator here and that's a big, that's an important difference. And then finally, date time with the unseparated version of ISO 8601 and a time. And we can see that this one was also interpreted incorrectly, this is December 4th and the other two are correct, so if with the T, all the, the only difference between these two is I put a t in here in between, that's just a separator between the date and the time. And that tells SQL Server ignore anything about language or regional settings, this is an actual ISO 8601 format.
Without that T, it has to obey. SQL server is programmed to obey the regional settings of the user or the language of the user. And again if you're not using TRY_CONVERT, you will get errors for some of these, so if you. With 0412, that just happens to transpose nicely, right? If April 12th just becomes December 4th then there's no error because that's also a valid date time even though it's wrong according to the user, what the user wanted. But if you try to translate this and the date doesn't transpose so you obviously can't have a 20th month, you get an error in this case. So you get an error on this first one April_20_a which is the ISO 8601 without the T separator, but these other two work just fine these are both April 20th.
So which formats are always safe? Well, as I showed in that demo these are safe regardless of language and regional settings. YYYYMMDD with no separators, all right? Just an eight character string. The eight character string with a space and a time as long as it includes seconds that would be fine. And then the ISO 8601 format as long as you have a T in between the date and the time that's the only time that the dashes the format, the ISO 8601 format with dashes is safe. How do you make sure that this happens? Always send strongly-typed parameters from your application, you shouldn't be sending strings, you shouldn't be, never let an end user input a date by hand, so nobody should ever be typing in, because then you have to interpret what they meant and you have to assume that when I type 7/9/2020, that I must have meant July. You can't assume that, you don't, there's no way for you to know unless you're application is so local and centric to your language and region, which I don't know that you can write an application that would assume that today.
Use drop-downs, a date picker, there are all kinds of calender controls out there. You can control the format that you pass from your application into SQL Server, don't let a user take that into their hands. So thank you very much for watching this video, if you look down below, there are some other resources and then I also have a couple of links here, bit.ly/mssql-dates those go to all the date and time related tip categories at mssqltips.com and then bit.ly/sql-bad-habits that is just an index page of all of the bad habits and best practices posts I've written over the years, a lot of those relate to date time topics. So thank you very much, and I'll see you in the next video.