SQL Server Database Data Types
By: Armando Prato
Once the logical design has been determined, you can begin the physical design of your database, where tables and their columns are created. When you begin defining the columns of a table, it is very important to choose data types and their sizes wisely.
SQL Server data pages hold row data and are capped at roughly 8060 bytes (8KB) overall, though a row overflow mechanism can kick in. Generally speaking, you should keep your data type sizes as compact as possible. It stands to reason that the more compact your rows, the more rows fit on a data page, resulting in fewer I/Os (page reads) to retrieve the data required by a query. In addition, SQL Server's data cache will have more room to store data. I usually take a minimalist approach to data type sizing and use the smallest data types that reasonably define a column.
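As a small illustration of how much type choice alone affects row size, the built-in DATALENGTH function reports the bytes a value occupies; the integer family spans 1 to 8 bytes per value:

```sql
-- Bytes of storage used per value across the integer family
select
    datalength(cast(1 as tinyint)) as tinyint_bytes,  -- 1 byte
    datalength(cast(1 as int))     as int_bytes,      -- 4 bytes
    datalength(cast(1 as bigint))  as bigint_bytes;   -- 8 bytes
```

If a column will only ever hold values from 0 to 255, choosing tinyint over int saves 3 bytes on every row, which adds up quickly across millions of rows.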
Data types themselves also require careful consideration. Is your application multi-national, or is there a chance it may become multi-national? If so, you may be best served by the Unicode types (i.e., nchar and nvarchar) over the char and varchar types. Unicode types take twice the storage (they use two bytes per character) but allow for storing international characters such as Japanese Kanji. That said, the extra storage does count against the 8KB page size limit.
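The doubling is easy to demonstrate with DATALENGTH; an N-prefixed literal is Unicode, so the same three characters occupy twice the bytes:

```sql
-- Unicode (N'...') values store two bytes per character
select
    datalength('abc')  as varchar_bytes,   -- 3 bytes
    datalength(N'abc') as nvarchar_bytes;  -- 6 bytes
```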
One item to consider when defining character-based columns: I usually avoid char/nchar unless the data in the column has a known, fixed-size format (such as a U.S. check routing number). The reason is that once a column is defined as char/nchar, the entire column size is allocated and counts against the 8KB page size limit. For instance, if you define a column called FirstName as char(50), the full 50 characters are allocated even if the first name entered is shorter than 50 characters. This becomes wasted space in your data pages, causing fewer rows to fit per page and more I/Os to satisfy queries.
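A quick sketch makes the padding visible (PadDemo is a hypothetical table used only for this illustration):

```sql
-- Hypothetical table contrasting fixed- vs. variable-length storage
create table PadDemo
(
    fixed_name    char(50),
    variable_name varchar(50)
);

insert into PadDemo values ('Armando', 'Armando');

-- char(50) is blank-padded to its full declared size;
-- varchar stores only the bytes actually entered
select
    datalength(fixed_name)    as fixed_bytes,     -- 50 bytes
    datalength(variable_name) as variable_bytes   -- 7 bytes
from PadDemo;

drop table PadDemo;
```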
Consider the following example, where a table is created with a column defined as char(2000) and seeded with 1,000 rows:
use tempdb
go
if object_id('SizeTest') is not null
   drop table SizeTest
go
create table SizeTest
(
   id int identity primary key clustered,
   value char(2000) default 'x'
)
go
insert into SizeTest default values
go 1000
Examining the page count, we see that a total of 250 pages were allocated. This makes sense: each row requires roughly 2011 bytes (a 4-byte int, 2000 bytes of char data, plus per-row overhead), so only 4 rows fit within a page's 8060-byte limit, and 1,000 rows at 4 rows per page works out to 250 pages.
use tempdb
go
select page_count
from sys.dm_db_index_physical_stats(DB_ID('tempdb'), OBJECT_ID('SizeTest'), NULL, NULL, NULL)
go
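If you want to confirm the row size behind that page count, the same DMV can also report the average record size when called in DETAILED mode (a sketch; in the default LIMITED mode used above, this column comes back NULL):

```sql
-- DETAILED mode populates avg_record_size_in_bytes at the cost of a deeper scan
select page_count, avg_record_size_in_bytes
from sys.dm_db_index_physical_stats(DB_ID('tempdb'), OBJECT_ID('SizeTest'), NULL, NULL, 'DETAILED');
```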
In contrast, let's rebuild the table with the value column defined as varchar(2000):
use tempdb
go
if object_id('SizeTest') is not null
   drop table SizeTest
go
create table SizeTest
(
   id int identity primary key clustered,
   value varchar(2000) default 'x'
)
go
insert into SizeTest default values
go 1000
Re-running our query against sys.dm_db_index_physical_stats shows a great reduction in the required row storage: since varchar stores only the bytes actually entered, the single-character default allows many more rows to fit on each page.
Also look to avoid the deprecated text, ntext, and image data types, as they will eventually become unsupported. Instead, look to convert to and start using varchar(max), nvarchar(max), and varbinary(max), respectively.
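A sketch of the conversion, assuming a hypothetical table named Documents with a legacy text column called body; ALTER TABLE can change the column to its supported replacement in place:

```sql
-- Hypothetical table still using a deprecated LOB type
create table Documents
(
    id   int identity primary key,
    body text
);

-- Convert the deprecated text column to varchar(max)
alter table Documents alter column body varchar(max);
```

The same pattern applies for ntext to nvarchar(max) and image to varbinary(max); after converting, any remaining READTEXT/WRITETEXT-style code should be moved to ordinary string operations.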
Last Update: 9/10/2011