Why You Should Avoid SELECT * in SQL Server T-SQL Code
I have read and been told that using SELECT * is a no-no in my SQL Server T-SQL code. From my perspective, it's easy and it means I don't have to type all the column names for a query. If it is a problem, I can always rewrite the query in question. Is there something I'm missing?
Given the scenario you've described, it does seem like folks are making SELECT * to be an issue when it isn't. However, there are some things to consider:
- If you have to return to the code after some length of time, say a year, will you remember that you used SELECT *?
- Will you remember everywhere you used SELECT *?
- If someone else has to edit your code, will they catch all of your uses of SELECT *?
- What if SELECT * was used for other objects, such as a view definition? How will an underlying table change affect things?
The last example, when there is dependent code using SELECT *, which other systems could be using as an interface, is easy to see. Let's take a simple example, say a table which stores info on domain logins:
CREATE TABLE dbo.DomainUser_Live ( UserID NVARCHAR(50) NOT NULL, FirstName NVARCHAR(30) NOT NULL, Surname NVARCHAR(30) NOT NULL, CONSTRAINT PK_DomainUser PRIMARY KEY (UserID) ); GO
We'll put in some dummy data:
INSERT INTO dbo.DomainUser_Live ( UserID, FirstName, Surname ) VALUES ( N'John.Doe', N'John', N'Doe'); INSERT INTO dbo.DomainUser_Live( UserID, FirstName, Surname ) VALUES ( N'Jane.Doe', N'Jane', N'Doe'); GO
Then we'll create a view using SELECT * against the table.
CREATE VIEW dbo.DomainUser AS SELECT * FROM dbo.DomainUser_Live; GO
If you're wondering about the name with the suffix _Live, think about a set of tables for Domain Users. One is live and active for applications and people to use. The other can be loaded without interruption. A couple of renames and the data is "refreshed." Meanwhile, the view allows an unchanging interface for anyone or any system that needs the data, without having to understand what's going on behind the scenes. This is just one example of where you'd have a dependent object like a view referring to a table. So let's query against the view dbo.DomainUser:
Scenario: A New Requirement
Now let's say there's a new requirement. An application needs the Distinguished Name of the user, which is the path in Active Directory to find where the user account is located. Therefore, the following change is made:
ALTER TABLE dbo.DomainUser_Live ADD DN NVARCHAR(255) NULL; GO
And after data is loaded for every user object, then we would flip the column to NOT NULL like so. First, the data:
UPDATE dbo.DomainUser_Live SET DN = 'CN=' + UserID + ', OU=Users, DC=MyDomain, DC=local'; GO
Then the ALTER COLUMN to set the column to NOT NULL:
ALTER TABLE dbo.DomainUser_Live ALTER COLUMN DN NVARCHAR(255) NOT NULL; GO
Now let's query the view and see what we get:
The view, though it's defined as SELECT *, doesn't return the column DN! Let's check the table to make sure we've made the changes:
SO the SELECT * isn't actually returning all columns! Imagine trying to troubleshoot this issue at 3 AM in the morning! Let's issue a query against sys.columns to figure out what's going on:
SELECT OBJECT_NAME(c.object_id) AS 'View', c.name AS 'Column' FROM sys.columns AS c JOIN sys.views AS v ON c.object_id = v.object_id WHERE v.name = 'DomainUser'; GO
The results of the query reveals the problem:
The table's original columns are specified for the view, but not the new one, DN. With this example, it's not hard to do the rewrite. However, consider the case where you have a lot of tables and a lot of views. If you have code relying upon those views, rewriting from SELECT * to add the columns needed can become a significant amount of work.
How did this happen? The problem is that the SELECT * for the view definition was automatically expanded by SQL Server at the time the view was created. So even though specific column names weren't specified, SQL Server figured out what they were and assigned them to the view. Therefore, the view only returns the original columns. Now take the scenario a step further and imaging that someone is only looking at the view definition from what was checked into source control. The view definition looks right, only we know from this exercise that it's not. Therefore, a person reviewing code in source control will miss the problem.
- Learn how to use sp_refreshview to update the view without using ALTER VIEW.
- Find out how to determine object dependencies in SQL Server.
- Check out these tips on SQL Server Views.
About the author
View all my tips