SQL Server Table Design Guidelines - Data Types, Designing for Deletes and Primary Keys

By:   |   Updated: 2023-11-01   |   Comments (3)   |   Related: More > Database Design


Problem

Ask any database expert, and they'll agree tables are the lifeblood of SQL Server. It's easy to create them with two lines of code. Since they are easy to create, it's easy to create them poorly. I doubt anyone starts their day saying I'll make a table today and do a lousy job. For starters, I'll set all my columns to NVARCHAR(MAX); forget about primary keys and indexes. Unless you guide developers down a good path, how can they break free from creating terrible tables?

Solution

Building guidelines around table design saves headaches down the road. Sometimes you hear these referred to as best practices, but I like the terms guidelines or rules. In this article, we'll start by looking at the importance of building tables from a set of guidelines. I'll then present three guidelines you can put in place today. I hope you start creating guidelines and sharing them with others—especially people you work with.

The Importance of Guidelines

When you were a kid, did an adult tell you to never run out into the street? Maybe you've told your kids this—I know I have. I exaggerate the likelihood of getting struck by a moving car. But the advice sticks with you. To this day, I look both ways before crossing the road. Guidelines are everywhere around us. The U.S. Department of Health advises how many daily servings of fruits and veggies you should eat. You don't have to follow the advice, but you may live a longer and happier life—like not running out into the street.

Search online for best practices for creating tables, and Google returns hundreds of results. Often, this advice applies regardless of your environment. Here's one: don't make a table to hold a single column. Maybe for a number or tally table, but as a rule, I can't think of a situation where this advice wouldn't apply. If you can, tell me in the comments below.

Why is it important to follow a set of guidelines when creating tables? Tables tend to stick around for a long time. How often do you hear people say, "Let's get something out in production, and we'll return later and fix it." This advice is not good. Do not fall into this trap.

Create Table Design Guidelines

When most developers are .net or front-end focused, they likely don't have the highest regard for building great tables. I don't believe anyone sets out to create a terrible table, but like a recipe you don't follow, that's what you get. It's your job as a DBA or Developer —to guide them onto the path of good table design. Before giving advice, write it down and store it in a commonplace, like a wiki. Take down that ERD wrapped around your cubicle and replace it with your guidelines.

Only write a few rules to start. No one wants to read a list of 100 complex rules. Start with three or five of the most important ones. If you can get developers to follow these rules, you're one step closer to never shaking your head in anger again after inspecting tables in production. Wouldn't it be wonderful if all your tables followed a specific design pattern? You can make that dream a reality.

To get you started, here are three guidelines I preach almost every day.

  • Pick the Right Data Types
  • Design for Deleting
  • Add a Primary Key

Table Design Guideline 1: Pick the Right Data Types

Mom bought all my shirts in the XL size, even though a medium would fit like a glove. She said you'll grow into it; I never did. In the 90's, baggy clothes were trendy, so it wasn't all bad. Some developers approach data types as my mom did with buying shirts. Let's say you need a column for state abbreviation; you slap an NVARCHAR(MAX) on it because you never know. Better to be safe than sorry. What's wrong with the table definition below?

CREATE TABLE dbo.Employee
(
    Id BIGINT NOT NULL,
    Firstname NVARCHAR(MAX) NOT NULL,
    LastName NVARCHAR(MAX) NOT NULL,
    Notes NVARCHAR(MAX) NULL
);

Let's focus on the ID column. Do you think the employee table holds more than two billion rows? Walmart, one of the largest companies in the world, only staffs 2.3 million. I often use an integer for surrogate keys like this. If you want to experiment with small or tiny integers for fixed-depth tables that hold countries or states, go for it.

When two columns are first and last names, go look up the average size of each and add 10% to it. The person with the 1,000-character-long first name is out of luck. Be mindful of the data types so you don't end up with everything as a BIGINT or NVARCHAR(MAX). Choosing the right type and size saves space and allows SQL Server to predict an accurate memory grant.

Before adding a notes column, ask yourself, do you want people pasting entire emails into it? If you need to de-identify data in a lower environment, it's harder to do on these types of columns. There are better places to store emails and documents than in SQL Server. Below is a better version of the employee table from above.

CREATE TABLE dbo.Employee
(
    Id INT NOT NULL,
    Firstname VARCHAR(10) NOT NULL,
    LastName VARCHAR(12) NOT NULL,
    Notes VARCHAR(250) NULL
);

Table Design Guideline 2: Design for Deleting

Nothing is more satisfying than deleting data from a table—on purpose. Most people don't create tables with deleting rows in mind. Deleting is a problem for someone in the future. Your compliance team may have rules saying you can only keep client data for five or 10 years. How will you know how long it's been there unless a column captures the create date?

The first step towards deleting data is adding a create date column. Think of it as the birthday of the row. You can make it datetimeoffset, whatever you like. Please set it to be NOT NULL. Unless you work with the most diligent developers, they tend to pass NULL when allowed—don't allow it. Unless you have specific requirements for a modified date, I wouldn't add one for the aesthetics. Remember to set the modified date to allow NULLs if you need one.

CREATE TABLE dbo.Client
(
    Id INT NOT NULL,
    [Name] NVARCHAR(30) NOT NULL,
    [Description] NVARCHAR(120) NOT NULL,
    Notes NVARCHAR(250) NULL,
    CreateDate DATE NOT NULL
);

Companies sometimes use a design pattern of soft deletes, where a bit or date column indicates when you hide a row from the end user. I think of these as deletes for people who can't commit—I kid. It's prudent to investigate if you need to include such columns.

If most of your existing tables don't follow soft deletes, I wouldn't include it on one lone table. Here's why: if you have soft delete enabled on table B and it has a foreign key reference to table A and table A doesn't use soft deletes, you'll need to NULL out the reference in table B. But if you're starting from scratch and implementing soft deletes across the board, go for it.

CREATE TABLE dbo.Client
(
    Id INT NOT NULL,
    [Name] VARCHAR(30) NOT NULL,
    [Description] VARCHAR(120) NOT NULL,
    Notes VARCHAR(250) NULL,
    CreateDate DATE NOT NULL,
    DeleteDate DATE NULL
);

Don't leave deleting data as a problem for your future self.

Table Design Guideline 3: Add a Primary Key

Growing up, I kept all my valuables in a metal safe in my closet including social security card. This card contained a number identifying me as a unique person. A primary key is like the social security number for each row. I'm still surprised to see transactional tables without primary keys. The code below illustrates how easy it is to add one.

CREATE TABLE dbo.Client
(
    Id INT NOT NULL,
    [Name] VARCHAR(30) NOT NULL,
    [Description] VARCHAR(120) NOT NULL,
    Notes VARCHAR(250) NULL,
    CreateDate DATE NOT NULL,
    DeleteDate DATE NULL,
    CONSTRAINT PK_Client_Id
        PRIMARY KEY CLUSTERED (Id)
);

There are instances where you don't place a primary key on a table. For example, if you load data into a staging area, you might skip a primary key. But you'll want to include a primary key for most tables. Likely, it will be the clustered index. Let's go one step further and say it's unique.

Without a primary key, it's hard to know which row needs to be updated or deleted. If your table contains three rows, duplicates for all intents and purposes, how would you know which one to update without a key? Yes, you can even use a GUID if you prefer. Expect fragmentation to be high. I like an ever-increasing integer starting at one.

You can even use a natural key if you prefer. For example, if you have an employee table and the company has an internal number like 000001, then use it. Please don't reuse the numbers. When most of your queries have a predicate of the employee number, it makes sense to use it.

What are your favorite table design guidelines? Please share them in the comments below.

Table Design Key Points

  • Creating a well-structured table is one of the most important things you can do for your database and application.
  • Assemble a collection of rules or guidelines developers can follow when adding a new table or column. You can't blame your developers when they don't follow guidelines if they are in your head—share them with your team.
  • Please start with a manageable list. Nothing discourages people from following the rules when they need to scroll through them. Start with only three or five and add to them as time passes. Make it a community project and ask others to add to your list.
Next Steps


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Jared Westover Jared Westover is a passionate technology specialist at Crowe, helping to build data solutions. For the past two decades, he has focused on SQL Server, Azure, Power BI, and Oracle. Besides writing for MSSQLTips.com, he has published 12 Pluralsight courses about SQL Server and Reporting Services. Jared enjoys reading and spending time with his wife and three sons when he's not trying to make queries go faster.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-11-01

Comments For This Article




Wednesday, November 15, 2023 - 2:05:37 PM - ScottPletcher Back To Top (91759)
Start at the beginning. Don't assume you are designing a single table. Instead, gather the data requirements -- what data is needed and what (non-database) data type it has. Use business names at this stage, not "computerized" names. Then go thru the standard normalization process.
You could easily end up more than one table, i.e. with a child table(s). Be sure to key the child table with parent table's key *first*, then the child key. Table names are typically plural.

Wednesday, November 1, 2023 - 5:11:03 PM - Malcolm Back To Top (91727)
I like Boolean bit flags for denoting logical things like "this record is soft-deleted". Unless the operation needs to be scheduled, in which case a date or datetimeoffset from the create date is sensible. Bit flags are quick and easy to check, can save a ton of space if you have many of them, and if a bit column name is always something like "Is_<DescribeWhatItDoes>" it is completely obvious just by looking at the table or a query what it's doing, and with what data type. Associated with a datetime of when the bit was changed, and maybe some other useful not Null indicator data (like who changed it, and perhaps a comment about why) it becomes self-auditing.

Wednesday, November 1, 2023 - 9:08:53 AM - Richard Back To Top (91723)
If you're going to implement or already have replication in your environments include a column to track InsertedDatetime and LastUpdatedDatetime. You'll be glad you did when users start to question whether the data is up to date.