SQL Server Table Design Guidelines - Data Types, Designing for Deletes and Primary Keys
Ask any database expert, and they'll agree tables are the lifeblood of SQL Server. It's easy to create them with two lines of code. Since they are easy to create, it's easy to create them poorly. I doubt anyone starts their day saying I'll make a table today and do a lousy job. For starters, I'll set all my columns to NVARCHAR(MAX); forget about primary keys and indexes. Unless you guide developers down a good path, how can they break free from creating terrible tables?
Building guidelines around table design saves headaches down the road. Sometimes you hear these referred to as best practices, but I like the terms guidelines or rules. In this article, we'll start by looking at the importance of building tables from a set of guidelines. I'll then present three guidelines you can put in place today. I hope you start creating guidelines and sharing them with others—especially people you work with.
The Importance of Guidelines
When you were a kid, did an adult tell you to never run out into the street? Maybe you've told your kids this—I know I have. I exaggerate the likelihood of getting struck by a moving car. But the advice sticks with you. To this day, I look both ways before crossing the road. Guidelines are everywhere around us. The U.S. Department of Health advises how many daily servings of fruits and veggies you should eat. You don't have to follow the advice, but you may live a longer and happier life—like not running out into the street.
Search online for best practices for creating tables, and Google returns hundreds of results. Often, this advice applies regardless of your environment. Here's one: don't make a table to hold a single column. Maybe for a number or tally table, but as a rule, I can't think of a situation where this advice wouldn't apply. If you can, tell me in the comments below.
Why is it important to follow a set of guidelines when creating tables? Tables tend to stick around for a long time. How often do you hear people say, "Let's get something out in production, and we'll return later and fix it." This advice is not good. Do not fall into this trap.
Create Table Design Guidelines
When most developers are .net or front-end focused, they likely don't have the highest regard for building great tables. I don't believe anyone sets out to create a terrible table, but like a recipe you don't follow, that's what you get. It's your job as a DBA or Developer —to guide them onto the path of good table design. Before giving advice, write it down and store it in a commonplace, like a wiki. Take down that ERD wrapped around your cubicle and replace it with your guidelines.
Only write a few rules to start. No one wants to read a list of 100 complex rules. Start with three or five of the most important ones. If you can get developers to follow these rules, you're one step closer to never shaking your head in anger again after inspecting tables in production. Wouldn't it be wonderful if all your tables followed a specific design pattern? You can make that dream a reality.
To get you started, here are three guidelines I preach almost every day.
- Pick the Right Data Types
- Design for Deleting
- Add a Primary Key
Table Design Guideline 1: Pick the Right Data Types
Mom bought all my shirts in the XL size, even though a medium would fit like a glove. She said you'll grow into it; I never did. In the 90's, baggy clothes were trendy, so it wasn't all bad. Some developers approach data types as my mom did with buying shirts. Let's say you need a column for state abbreviation; you slap an NVARCHAR(MAX) on it because you never know. Better to be safe than sorry. What's wrong with the table definition below?
CREATE TABLE dbo.Employee ( Id BIGINT NOT NULL, Firstname NVARCHAR(MAX) NOT NULL, LastName NVARCHAR(MAX) NOT NULL, Notes NVARCHAR(MAX) NULL );
Let's focus on the ID column. Do you think the employee table holds more than two billion rows? Walmart, one of the largest companies in the world, only staffs 2.3 million. I often use an integer for surrogate keys like this. If you want to experiment with small or tiny integers for fixed-depth tables that hold countries or states, go for it.
When two columns are first and last names, go look up the average size of each and add 10% to it. The person with the 1,000-character-long first name is out of luck. Be mindful of the data types so you don't end up with everything as a BIGINT or NVARCHAR(MAX). Choosing the right type and size saves space and allows SQL Server to predict an accurate memory grant.
Before adding a notes column, ask yourself, do you want people pasting entire emails into it? If you need to de-identify data in a lower environment, it's harder to do on these types of columns. There are better places to store emails and documents than in SQL Server. Below is a better version of the employee table from above.
CREATE TABLE dbo.Employee ( Id INT NOT NULL, Firstname VARCHAR(10) NOT NULL, LastName VARCHAR(12) NOT NULL, Notes VARCHAR(250) NULL );
Table Design Guideline 2: Design for Deleting
Nothing is more satisfying than deleting data from a table—on purpose. Most people don't create tables with deleting rows in mind. Deleting is a problem for someone in the future. Your compliance team may have rules saying you can only keep client data for five or 10 years. How will you know how long it's been there unless a column captures the create date?
The first step towards deleting data is adding a create date column. Think of it as the birthday of the row. You can make it datetimeoffset, whatever you like. Please set it to be NOT NULL. Unless you work with the most diligent developers, they tend to pass NULL when allowed—don't allow it. Unless you have specific requirements for a modified date, I wouldn't add one for the aesthetics. Remember to set the modified date to allow NULLs if you need one.
CREATE TABLE dbo.Client ( Id INT NOT NULL, [Name] NVARCHAR(30) NOT NULL, [Description] NVARCHAR(120) NOT NULL, Notes NVARCHAR(250) NULL, CreateDate DATE NOT NULL );
Companies sometimes use a design pattern of soft deletes, where a bit or date column indicates when you hide a row from the end user. I think of these as deletes for people who can't commit—I kid. It's prudent to investigate if you need to include such columns.
If most of your existing tables don't follow soft deletes, I wouldn't include it on one lone table. Here's why: if you have soft delete enabled on table B and it has a foreign key reference to table A and table A doesn't use soft deletes, you'll need to NULL out the reference in table B. But if you're starting from scratch and implementing soft deletes across the board, go for it.
CREATE TABLE dbo.Client ( Id INT NOT NULL, [Name] VARCHAR(30) NOT NULL, [Description] VARCHAR(120) NOT NULL, Notes VARCHAR(250) NULL, CreateDate DATE NOT NULL, DeleteDate DATE NULL );
Don't leave deleting data as a problem for your future self.
Table Design Guideline 3: Add a Primary Key
Growing up, I kept all my valuables in a metal safe in my closet including social security card. This card contained a number identifying me as a unique person. A primary key is like the social security number for each row. I'm still surprised to see transactional tables without primary keys. The code below illustrates how easy it is to add one.
CREATE TABLE dbo.Client ( Id INT NOT NULL, [Name] VARCHAR(30) NOT NULL, [Description] VARCHAR(120) NOT NULL, Notes VARCHAR(250) NULL, CreateDate DATE NOT NULL, DeleteDate DATE NULL, CONSTRAINT PK_Client_Id PRIMARY KEY CLUSTERED (Id) );
There are instances where you don't place a primary key on a table. For example, if you load data into a staging area, you might skip a primary key. But you'll want to include a primary key for most tables. Likely, it will be the clustered index. Let's go one step further and say it's unique.
Without a primary key, it's hard to know which row needs to be updated or deleted. If your table contains three rows, duplicates for all intents and purposes, how would you know which one to update without a key? Yes, you can even use a GUID if you prefer. Expect fragmentation to be high. I like an ever-increasing integer starting at one.
You can even use a natural key if you prefer. For example, if you have an employee table and the company has an internal number like 000001, then use it. Please don't reuse the numbers. When most of your queries have a predicate of the employee number, it makes sense to use it.
What are your favorite table design guidelines? Please share them in the comments below.
Table Design Key Points
- Creating a well-structured table is one of the most important things you can do for your database and application.
- Assemble a collection of rules or guidelines developers can follow when adding a new table or column. You can't blame your developers when they don't follow guidelines if they are in your head—share them with your team.
- Please start with a manageable list. Nothing discourages people from following the rules when they need to scroll through them. Start with only three or five and add to them as time passes. Make it a community project and ask others to add to your list.
- Are you trying to decide if you need a VARCHAR(MAX)? Check out Daniel Farina's article, Comparison of the VARCHAR(max) and VARCHAR(n) SQL Server Data Types.
- For a quick overview of primary keys in SQL Server, please check out a tutorial created by Armando Prato.
- Brent Ozar wrote the article 9 Tips for Faster SQL Server Applications outlining five table guidelines you should follow.
About the author
View all my tips
Article Last Updated: 2023-11-01